My 10 UNIX Command Line Mistakes

June 21, 2009 · 638 comments · Last updated June 23, 2009


Anyone who has never made a mistake has never tried anything new. -- Albert Einstein.

Here are a few mistakes that I made while working at the UNIX prompt. Some of them caused a good amount of downtime. Most are from my early days as a UNIX admin.

userdel Command

The file /etc/deluser.conf was configured to remove the home directory and mail spool of the user being removed (this was set up by the previous sysadmin, and it was my first day at work). I just wanted to remove the user account, but I ended up deleting everything (note that -r was activated via deluser.conf):
userdel foo

Rebooted Solaris Box

On Linux, the killall command kills processes by name (killall httpd). On Solaris, it kills all active processes. As root I killed every process on what was our main Oracle DB box:
killall process-name

Destroyed named.conf

I wanted to append a new zone to the /var/named/chroot/etc/named.conf file, but ended up running:
./mkzone example.com > /var/named/chroot/etc/named.conf
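The difference between overwriting and appending is a single character. The following is a minimal sketch, using a throwaway file in a temporary directory as a stand-in for named.conf (the file contents are made up for illustration). It assumes a POSIX shell, where the noclobber option makes > refuse to overwrite an existing file:

```shell
# Work in a scratch directory; named.conf here is only a stand-in file.
workdir=$(mktemp -d)
conf="$workdir/named.conf"
printf 'zone "old.example" {};\n' > "$conf"

# With noclobber set, '>' refuses to overwrite an existing file.
set -o noclobber
if ! (echo 'zone "example.com" {};' > "$conf") 2>/dev/null; then
    echo "refused to overwrite $conf"
fi

# '>>' appends, which is what was intended in the first place.
echo 'zone "example.com" {};' >> "$conf"
set +o noclobber

# Count the lines: the old zone and the appended one should both be there.
line_count=$(($(wc -l < "$conf")))
echo "lines in file: $line_count"
```

Setting noclobber in a root shell's startup file is one possible safety net; it does not help when a script or another tool truncates the file.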

Destroyed Working Backups with Tar and Rsync (personal backups)

I had only one backup copy of my QT project and I just wanted to extract a directory called functions. I ended up destroying the entire backup (note the -c switch instead of -x):
cd /mnt/bacupusbharddisk
tar -zcvf project.tar.gz functions

I had no backup. Similarly, I ended up running rsync the wrong way round and deleted all new files by overwriting them from the backup set (I have now switched to rsnapshot):
rsync -av --delete /dest /src
Again, I had no backup.
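One way to lower the risk with tar is to list the archive first and to keep the backup file read-only. This is a small sketch under assumed names (the scratch directory, sample file and restore directory are all illustrative, not the original project):

```shell
# Build a small sample archive in a scratch directory (all names are illustrative).
workdir=$(mktemp -d)
mkdir -p "$workdir/src/functions"
echo 'int f(void);' > "$workdir/src/functions/f.h"
tar -zcf "$workdir/project.tar.gz" -C "$workdir/src" functions

# Read-only permissions make a stray -c fail for non-root users.
chmod a-w "$workdir/project.tar.gz"

# List (-t) first: this only reads the archive.
listing=$(tar -ztf "$workdir/project.tar.gz")
echo "$listing"

# Extract (-x) only the directory that is needed, into a separate location.
mkdir "$workdir/restore"
tar -zxf "$workdir/project.tar.gz" -C "$workdir/restore" functions
```

For rsync, the --dry-run (-n) option reports what would be transferred or deleted without changing anything, which is worth running before any command that includes --delete.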

Deleted Apache DocumentRoot

I had symlinks for my web server docroot (/home/httpd/http was symlinked to /www). I forgot about the symlink. To save disk space, I ran rm -rf on the http directory. Luckily, I had a full working backup set.

Accidentally Changed Hostname and Triggered False Alarm

I accidentally changed the hostname of one of our cluster nodes (I only wanted to see the current hostname setting). Within minutes I received an alert message on both mobile and email.
hostname foo.example.com

Public Network Interface Shutdown

I wanted to shut down the VPN interface eth0, but ended up shutting down eth1 while I was logged in via SSH:
ifconfig eth1 down

Firewall Lockdown

I made changes to sshd_config and changed the SSH port number from 22 to 1022, but failed to update the firewall rules. After a quick kernel upgrade, I rebooted the box and was locked out. I had to call the remote data center tech to reset the firewall settings (I now use a firewall reset script to avoid lockouts).

Typing UNIX Commands on Wrong Box

I wanted to shut down my local Fedora desktop system, but I issued halt on a remote server (I was logged into the remote box via SSH):
halt
service httpd stop
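A simple guard against this kind of slip is to make destructive commands ask for the machine's name first. The sketch below is only an illustration: confirm_host is a hypothetical helper name, and it relies on uname -n to report the current host.

```shell
# confirm_host is a hypothetical helper: it runs the given command only if
# the name typed on stdin matches this machine's node name.
confirm_host() {
    this_host=$(uname -n)
    printf 'You are on %s. Type the hostname to continue: ' "$this_host" >&2
    read -r answer || answer=""
    if [ "$answer" = "$this_host" ]; then
        "$@"
    else
        echo "Hostname mismatch, not running: $*" >&2
        return 1
    fi
}

# Example (echo stands in for a destructive command such as halt):
# confirm_host echo "would halt now"
```

Having to type the hostname forces a look at which box the terminal is actually connected to, which a plain yes/no prompt does not.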

Wrong CNAME DNS Entry

I created a wrong DNS CNAME entry in the example.com zone file. The end result: a few visitors went to /dev/null:
echo 'foo 86400 IN CNAME lb0.example.com' >> example.com && rndc reload

Failed To Update Postfix RBL Configuration

In 2006 ORDB went out of operation, but I failed to update my Postfix RBL settings. One day ORDB was re-activated and started returning every IP address queried as being on its blacklist. The end result was a disaster.

Conclusion

All men make mistakes, but only wise men learn from their mistakes -- Winston Churchill.

From all those mistakes I’ve learnt that:

  1. Backup = ( Full + Removable tapes (or media) + Offline + Offsite + Tested )
  2. The clear choice for preserving all data of UNIX file systems is dump, which is the only tool that guarantees recovery under all conditions. (see Torture-testing Backup and Archive Programs paper).
  3. Never use rsync with a single backup directory. Create snapshots using rsync or rsnapshot.
  4. Use CVS to store configuration files.
  5. Wait and read the command line again before hitting the damn [Enter] key.
  6. Use your well-tested perl / shell scripts and open source configuration management software such as puppet, Cfengine or Chef to configure all servers. This also applies to day-to-day jobs such as creating users and so on.

Mistakes are inevitable, so have you made any mistakes that caused some sort of downtime? Please add them in the comments below.


{ 638 comments… read them below or add one }

1 Jon June 21, 2009 at 2:42 am

My all time favorite mistake was a simple extra space:

cd /usr/lib
ls /tmp/foo/bar

I typed
rm -rf /tmp/foo/bar/ *
instead of
rm -rf /tmp/foo/bar/*
The system doesn’t run very well without all of its libraries……

Reply

2 Vinicius August 21, 2010 at 5:42 pm

I did something similar on a remote server.
I was going to type ‘chmod -R 755 ./’ but I typed ‘chmod -R 755 /’ |:

Reply

3 Daniel December 30, 2013 at 9:40 pm

I typed ‘chmod -R 777’, to allow all files rwx permissions from all users (RPi).
Doesn’t work that well without sudo!

Reply

4 James January 6, 2011 at 11:01 am

Yes – I think I’ve made almost every possible linux mistake over the years – when I was a young sys admin I did exactly what you did – and put a space in the middle of a rm -rf /stuff/to/delete/ *
I think now that the best thing is to use virtual machines, and backup those VM’s locally and remotely. It’s easy to restart a VM & roll back changes if needed.

Reply

5 Michael Shigorin January 9, 2011 at 5:03 pm

* VM is no silver bullet (and definitely no substitute for /dev/head and /proc/care);
* zsh can warn on this type of error, and its rm has additional -s option to handle “buried symlink” case;
* I’ve got a habit of hitting (at least when zsh/bash3 is handy) and examining the static list instead of removing by pattern.

Reply

6 Simon January 9, 2011 at 10:54 pm

Michael, I don’t mean to be judgemental or start a discussion, but the idea behind this comments section (at least to my understanding) is to share experience and non-obvious mistakes in order to keep others from making them, not to discuss general ideas on how to do things like backup, etc. So please share more experience rather than correcting mistakes others have made.
Cheers, Simon

Reply

7 Michael Shigorin January 10, 2011 at 6:27 pm

Yeah, see below — sorta slow mood with a “feel that” attitude… dropping/hiding extra comments by me might benefit the page.

Thanks Simon!

Reply

8 robert wlaschin May 1, 2012 at 9:57 pm

Hm… I was trying to format a USB flash

dd if=big_null_file of=/dev/sdb

unfortunately /dev/sdb was my local secondary drive, sdc was the usb … shucks.

I discovered this after I rebooted.

Reply

9 Alex February 8, 2013 at 2:04 pm

Did the same thing, Found out after 20 minutes or so, by then most of the important data was gone…

Reply

10 Ken Kam February 27, 2011 at 3:44 pm

I did this as well just the other day. I thought it took a while to delete a folder, until I realised there was a space in between =/.

Reply

11 Jeff April 21, 2011 at 10:46 pm

I did something similar on my first day as a junior admin. As root, I copied my buddy’s dot files (.profile, etc.) from his home directory to mine because he had some cool customizations. He also had some scripts in a directory called .scripts/ that he wanted me to copy. I gave myself ownership of the dot files and the contents of the .scripts directory with this command:

cd ~jeff; chown -R jeff .*

It was only later that I realized that “.*” matched “.” and “..”, so my userid owned the entire machine… which happened to be our production Oracle database.

That was 15 years ago and we’ve both changed jobs a few times, but that friend reminds me of that mistake every time I see him.

Reply

12 Alex July 8, 2011 at 9:40 am

Or yeah :) This mistake is definitely in TOP10. I’ve done it too.

Reply

13 Sam Watkins July 12, 2011 at 3:30 pm

For most of these errors above that occurred in the workplace, perhaps the biggest mistake was that a senior admin or manager allowed some junior who does not know the difference between / and \ to type at a # root prompt on a valuable production server. I would not so much blame the junior, but I would suggest that the (ir)responsible senior should be fired! If my 3 year old son strangles the cat on my watch, I’m responsible!

Reply

14 Tom April 8, 2012 at 5:43 pm

Recent issue.
We switched all servers from PSU1 to backup power on PSU2 (all servers have redundant power units) so that we could replace the main UPS with a higher model. However, the SAN switches do not have redundant PSUs. So we observed in the VMware vSphere Client how the LUN paths switched over, and they came back up after we switched power one by one.
However, storage for the main Oracle DB box didn’t come back because of a Windows driver failure.
Lesson learned: ALWAYS check that all LUNs are back online, for Windows and Linux separately.

Reply

15 Garry April 11, 2014 at 8:02 pm

I once had a bunch of dot files I wanted to remove. So I did:

rm -r .*

This, of course, includes “..” – recursively.

I had taken over SysAdmin of a server. The server had a cron job that ran, as root, that cd’ed into a directory and did a find, removing any files older than 3 days. It was to clean up the log files of some program they had. They quit using the program. About a year later, someone removed the directory. The cron job ran. The cd into the log file directory didn’t work, but the cron job kept going. It was still in / – removing any files older than 3 days old! I restored the filesystems and went home to get some sleep, thinking I would investigate root cause after I had some rest. As soon as my head hit the pillow, the phone rang. “It did it again”. The cron job had run again.
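The failure mode in that cron job is a cd whose result was never checked. A minimal sketch of a safer cleanup follows; clean_old_logs is a hypothetical helper name, and the directory and age are whatever the caller passes in. It hands the directory to find explicitly and refuses to run when the directory is missing:

```shell
# clean_old_logs is a hypothetical helper: it deletes regular files older
# than a number of days, but only inside a directory that actually exists.
clean_old_logs() {
    log_dir=$1
    max_age_days=$2
    if [ ! -d "$log_dir" ]; then
        echo "clean_old_logs: $log_dir is missing, doing nothing" >&2
        return 1
    fi
    # Pass the directory to find explicitly instead of relying on a prior cd.
    find "$log_dir" -type f -mtime +"$max_age_days" -exec rm -f {} +
}
```

In a script that does need to change directory, writing the step as cd /some/dir || exit 1 stops the script before any later command can run in the wrong place.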

Lastly, I once had an accidental copy & paste, which renamed (mv) /usr/lib. Did you know the “mv” command uses libraries in /usr/lib? I found that out the hard way when I discovered I could not move it back to its original pathname. Nor could I copy it (cp uses /usr/lib).

An “Ohnosecond” is defined as the period of time between when you hit enter and you realize what you just did.

Reply

16 Michael Shigorin April 12, 2014 at 8:14 am

That’s why set -e or #!/bin/sh -e (in this particular case I’d just tell find that_dir … though).

My .. incident has taught me to hit tab just in case to see what actually gets removed; BTW zsh is very helpful in that regard, it has some safety net means for the usual * ~ cases — but then again touching nothing with destructive tools when tired, especially as root, is a bitter but prudent decision.

Regarding /usr/lib: ALT Linux coreutils are built properly ;-) (although there are some leftovers as we’ve found when looking with some Gentoo guys at LVEE conference)

Reply

17 nels June 21, 2009 at 7:32 am

rm -rf /tmp
I didn’t think it would hose me too much, but on a F10 box it was important……

Reply

18 Ed May 28, 2011 at 3:30 pm

how about rm -rf * whilst in /etc?

Thought I was in a subdirectory of my home, forgot the cd I’d issued a few minutes before. Fortunately an ancient error.

Reply

19 Crash July 19, 2011 at 3:39 pm

Been there, done that. Pain in the butt!

Reply

20 Jose ruiz August 22, 2011 at 9:36 pm

rm -rf is the most common mistake that ends up dooming Linux beginners. I ran into a problem where I needed to make sure the date on the servers was at least 30 seconds apart from each server in an Oracle database network. I forgot to put a “.” at the end to represent the seconds, so the next day my servers had a date where the year was 2048. Even now my co-workers still call me lightspeed.

Reply

21 Cody January 16, 2012 at 9:15 pm

Sounds like my mom’s office (telecommunication). They’re relentless. That’s a pretty funny story (and thing to call you; sure it’s at your expense but it’s not really that mean, and its more clever).

I think my favourite story is something that your (Jose’s) story reminds me of. My mom fixes databases and other problems. One time there was a database issue she was debugging. She was talking to the person on the phone. The person kept reading the error he was getting to my mom (over phone). Finally, my mom realized that he had actually printed out the file right – the file is what contained an error – not the printer itself! In other words, he printed out a file that contained the error and he thought that it was the printer having an issue. But hey, he learned from it and we all make mistakes at times. Those who say you should just fire them don’t realize that if that were the case they’d have no employees (and potentially law suits – yes, I’m serious. Some firings over things that may seem harmless can lead to lawsuits, whether its not a legit/unfounded case or not).

Reply

22 firmit June 21, 2009 at 9:00 am

Some classics here :)

Reply

23 georgesdev June 21, 2009 at 9:15 am

never type anything such as:
rm -rf /usr/tmp/whatever
maybe you are going to type enter by mistake before the end of the line. You would then for example erase all your disk starting on /.

if you want to use -rf option, add it at the end on the line:
rm /usr/tmp/whatever -rf
and even this way, read your line twice before adding -rf

Reply

24 Leon July 29, 2010 at 2:43 pm

That was a good one!! :)

Reply

25 Denis November 23, 2010 at 9:27 am

I think it is good practice to add the -i parameter to -rf:
rm -rfi /usr/tmp/whatever
-i will ask whether you are sure you want to delete all that stuff.

Reply

26 John February 25, 2011 at 11:11 am

I worked with a guy who always used “rm -rf” to delete anything. And he always logged in as root. Another worker set the stage for him by creating a file called “~” in a visible location (that would be a file entered as “\~”, so as not to expand to the user’s home directory). User one then dealt with that file with “rm -rf ~”. This was when the root home directory was / and not something like /root. You got it.

Reply

27 S February 28, 2011 at 5:32 am

ouch !!

Reply

28 Cody March 22, 2011 at 1:33 pm

(Note to mod: put this in wrong place initially; sorry about that. here is the correct place).

This reminds me of when I told a friend a way to auto-log out on login (many ways but this would be more obscure). He then told someone who was “annoying” him to try it on his shell. End result was this person was furious. Quite so. And although I don’t find it so funny now (keyword not as – I still think it’s amusing), I found it hilarious then (hey, was young and obnoxious as can be!).

The command, for what its worth :

echo “PS1=`kill -9 0`” >> ~/.bash_profile

Yes, that’s setting the prompt to run the command : kill -9 0 upon sourcing of ~/.bash_profile which means kill that shell. Bad idea!

I don’t even remember what inspired me to think of that command as this was years and years ago. However, it does bring up an important point :

Word to the wise: if you do not know what a command does, don’t run it! Amazing how many fail that one…

Reply

29 mike March 29, 2011 at 11:35 am

I altered /boot/grub/rm wrong .iso
reboot

Reply

30 mike March 29, 2011 at 11:47 am

and if you look at that comment I even managed to mess up
/boot/rm wrong.iso
not /boot/grub……..

Reply

31 Peter Odding January 7, 2012 at 6:40 pm

I once read a nasty trick that’s also fun in a very sadistic kind of way:

echo ‘echo sleep 1 >> ~/.profile’ >> /home/unhappy-user/.profile

The idea is that every time the user logs in it will take a second longer than the previous time… This stacks up quickly and gets reallllly annoying :-)

Reply

32 Cody March 4, 2014 at 5:22 pm

Peter, for whatever reason I didn’t see your response to my prank with regards to user profiles. I love the idea you mention too! I would never use this (or indeed what I told a friend ‘in case’ years ago) on anyone now but its still a fun thought/idea to read (it also brings to mind the number of ways that you can screw with users heads or screw your own head – too many to count without getting bored. Ultimately that is why we all – even those who are extremely cynical like me – should always keep in mind that trust is dangerous and given too often). Thanks for sharing that. Yes, it would be very annoying but probably less sadistic (especially since any user with a clue would pick up on what is happening and know the obvious locations to check) than what I ultimately caused: the person that ran the command did it on a remote shell, thereby disabling their shell account (lucky for them they were not the system’s administrator as if they would be willing to run a command without knowing what it does then they would likely be logged in as root ‘just in case’ and then he’d have a real problem – since it was remote he would not be able to rescue it by himself).

Incidentally, this whole topic (running commands without knowing what it does) while dangerous, can be good as I describe below. Take for instance users meaning to do:
# last | grep reboot
but instead do
# last | reboot
when they should have just done (and note the prompt change!):
$ last reboot
Had they not been root, or had they just run ‘last reboot’ (better is not being root, and better still is not being root and running last reboot), they would not have had the problem. Still, as long as they learn from it (the earlier the better, and preferably before someone takes advantage of them with malicious intent) then it is, in my opinion, not a mistake but a learning opportunity!

Reply

33 Daniel Hoherd May 4, 2012 at 4:58 pm

Another good test is to first do “echo rm -rf /dir/whatever/*” to see the expansion of the glob and what will be deleted. I especially do this when writing loops, then just pipe to bash when I know I’ve got it right.
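A minimal sketch of that echo preview, run in a scratch directory with made-up file names. The second command shows what the stray-space variant of the same line would expand to:

```shell
# Scratch directory with a few sample files (names are illustrative).
workdir=$(mktemp -d)
mkdir "$workdir/bar"
touch "$workdir/bar/a.tmp" "$workdir/bar/b.tmp" "$workdir/keep.txt"

# Preview: echo prints the expanded argument list without deleting anything.
preview=$(cd "$workdir" && echo rm -rf bar/*)
echo "$preview"

# The stray-space variant expands to 'bar/' plus everything in the current directory.
oops=$(cd "$workdir" && echo rm -rf bar/ *)
echo "$oops"
```

Seeing keep.txt in the second preview is the warning sign: the glob has escaped the directory that was meant to be cleaned.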

Reply

34 3ToKoJ June 21, 2009 at 9:26 am

public network interface shutdown … done
typing unix command on wrong box … done
Delete apache DocumentRoot … done
Firewall lockdone … done with a NAT rule redirecting the configuration interface of the firewall to another box, serial connection saved me

I can add, being trapped by aptitude keeping tracks of previously planned — but not executed — actions, like “remove slapd from the master directory server”

Reply

35 UnixEagle June 21, 2009 at 11:03 am

Rebooted the wrong box
While adding alias to main network interface I ended up changing the main IP address, the system froze right away and I had to call for a reboot
Instead of appending text to the Apache config file, I overwrote its contents
Firewall lockdown while changing the ssh port
Wrongfully ran a script containing a recursive chmod and chown as root on /, which caused me about 12 hours of downtime and a complete re-install

Some mistakes are really silly, and when they happen, you can’t believe you did that, but every mistake, regardless of its silliness, should be a lesson learned.
If you make a trivial mistake, you should not just overlook it; you have to think about the reasons that made you do it, like: you didn’t have much sleep or your mind was preoccupied with personal life or …..etc.

I like Einstein’s quote, you really have to make mistakes to learn.

Reply

36 smaramba June 21, 2009 at 11:31 am

typing unix command on wrong box and firewall lockdown are all time classics: been there, done that.
but for me the absolute worst, on linux, was checking a mounted filesystem on a production server…

fsck /dev/sda2

the root filesystem was rendered unreadable. system down. dead. users really pissed off.
fortunately there was a full backup and the machine rebooted within an hour.

Reply

37 Don May 10, 2011 at 4:14 pm

I know this thread is a couple of years old but …

Using lpr from the command line, forgetting that I was logged in to a remote machine in another state. My print job contained sensitive information which was now on a printer several hundred miles away! Fortunately, a friend intercepted the message and emailed me while I was trying to figure out what was wrong with my printer :-)

Reply

38 od June 21, 2009 at 12:50 pm

“Typing UNIX Commands on Wrong Box”

Yea, I did that one too. Wanted to shut down my own vm but I issued init 0 on a remote server which I accessed via ssh. And oh yes, it was the production server.

Reply

39 Adi June 21, 2009 at 10:24 pm

tar -czvf /path/to/file file_archive.tgz
instead of
tar -czvf file_archive.tgz /path/to/file
I ended up destroying that file and had no backup, as this command was intended to provide the first backup. It was on the DHCP Linux production server and the file was dhcpd.conf!

Reply

40 ikke July 15, 2010 at 3:33 pm

same with
tar -czvf ${pattern}*
instead of
tar -czvf ${target} ${pattern}*
PLUS
no backup
EQUALS
:(

Reply

41 Mel August 5, 2010 at 4:54 pm

yeah i did this too, luckily only to the maillog but still! lots of (*&$(£*$&£(!!!

Reply

42 Addy August 2, 2011 at 10:43 am

me too, but with the basic cp
cp destination source
instead of
cp source destination

Reply

43 sims June 22, 2009 at 2:23 am

Funny thing, I don’t remember typing in the wrong console. I think that’s because I usually have the hostname right there. Fortunately, I don’t do the same things over and over again very much. Which means I don’t remember command syntax for all but the most used commands.

Locking myself out while configuring the firewall – done – more than once. It wasn’t really a CLI mistake though. Just being a n00b.

georgesdev, good one. I usually:

ls -a /path/to/files
to double check the contents
then up arrowkey homekey hit del a few times and type rm. I always get nervous with rm sitting at the prompt. I’ll have to remember that -rf at the end of the line.

I always make mistakes making links. I can never remember the syntax. :/

Here’s to less CLI mistakes… (beer)

Reply

44 Cody March 4, 2014 at 5:38 pm

I suspect you don’t remember the syntax to ‘ln’ because it actually has four invocations. As a tip: In all invocations except the last (which is specifying option -t which is to specify the target directory) the target of the link comes first. But see the man page as there’s more specifics (like the actual invocations themselves). And remember, if it is a symbolic link, the link you’re creating should not exist and if it does you either remove it first or specify -f (ln won’t overwrite it otherwise).
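As a quick illustration of the most common form, here is a sketch in a scratch directory (file names are made up): the existing file comes first and the new link name second, the same order as cp, and -f replaces a link that already exists.

```shell
# Scratch directory with one real file (names are illustrative).
workdir=$(mktemp -d)
echo 'hello' > "$workdir/target.txt"

# Same argument order as cp: existing file first, new link name second.
ln -s "$workdir/target.txt" "$workdir/link.txt"

# -f replaces an existing link instead of failing.
ln -sf "$workdir/target.txt" "$workdir/link.txt"

# Reading through the link returns the target's content.
content=$(cat "$workdir/link.txt")
echo "$content"
```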

I think though that not remembering every thing is just human nature and that’s the great thing about having such details in the man pages which is easy to access (or if you have info installed, even more details in the documentation). Some would argue that if you can look it up, then its not a problem to not remember it; if you do remember it (e.g., from use over time), great, but if not there’s no harm in looking at the documentation.

Reply

45 Grant D. Vallance June 22, 2009 at 7:56 am

A couple of days ago I typed and executed (as root): rm -rf /* on my home development server. Not good. Thankfully, the server at the time had nothing important on it, which is why I had no backups …

I am still not sure *why* I did when I have read about all the warnings about using this command. (A dyslexic moment with the syntax?)

Ah well, a good lesson learned. At least it was not the disaster it could have been. I shall be *very* paranoid about this command in the future.

Reply

46 wtf dude April 8, 2013 at 6:01 pm

wtf dude, this command just doesn’t make ANY sense …
why did u execute it …

Reply

47 Joren June 22, 2009 at 9:30 am

I wanted to remove the subfolder etc from the /usr/local/matlab/ directory. So I accidentally added the ‘/’ symbol in a force of habit when going to the /etc folder and I typed from the /usr/local/matlab directory:

sudo rm /etc

instead of

sudo rm etc

Without the entire /etc folder the computer didn't work anymore (which was to be expected of course) and I ended up reinstalling my computer.

Reply

48 Michael Shigorin January 9, 2011 at 5:20 pm

A colleague got into the habit of explicitly spelling out relative paths, even when he’s not executing an off-$PATH executable, like:
cp ./a ./b
rm ./c

(btw it’s not worth getting used to cp/mv/rm -i — if you like it, put it in explicitly as well)

Reply

49 Ramaswamy June 22, 2009 at 10:47 am

Deleted the files
I used to place some files in /tmp/rama and some conf files in /home//httpd/conf
I used to swap between these two directories with “cd -”
Executed the command rm -fr ./*
It was supposed to remove the files in /tmp/rama/*, but ended up removing the files in /home//httpd/conf/*, without any backup

Conclusion: check which directory you are in before running rm

Reply

50 Robsteranium June 22, 2009 at 11:05 am

Aza Raskin explains how habituation can lead to stupid errors – confirming “yes I’m sure/ overwrite file etc” automatically without realising it. Perhaps rm and the > command need an undo/ built-in backup…

Reply

51 Ulver June 22, 2009 at 2:06 pm

rm -rf /dev/{null,zero} $HOME/dev/

instead of

cp -rf /dev/{null,zero} $HOME/dev/{null,zero}

…a live CD and mknod saved me

echo > somefileWithManyCodeLines.pl

instead of

echo >> somefileWithManyCodeLines.pl

The xterm history saved me! (Previously I had dumped the content of the file using cat.)

So whatever terminal you use, set the history to the maximum.

And… my favourite (it happened to me many years ago; it could be an excellent way to really delete and hide information):

dd if=/dev/urandom of=/dev/withManyData count=1024 bs=1024

instead of

dd if=/dev/zero of=/dev/withManyData count=1 bs=512

the last one I call “dd unleashed” ….

Reply

52 Greywolf February 9, 2012 at 9:18 pm

On a SunOS/Solaris box, or anything with dynamically linked critical programs [cp, cat, tar, sh], if /dev/zero or /usr/lib/crt0.so vanishes, you’re screwed. /dev/zero is especially insidious, because, in their infinite wisdom, they decided to make mknod a dynamically linked executable, so you can’t even mknod it.

Reply

53 Ulver June 22, 2009 at 2:08 pm

I forgot to say… for the last command I had no way to roll back, only reinstalling

Reply

54 Andrea Ratto June 22, 2009 at 5:36 pm

I did a rm -r /usr on my two-years-old gentoo installation. Ctrl-C after one second, most of /usr/bin removed. Running ubuntu since then.

Reply

55 RudyD June 22, 2009 at 11:32 pm

Sure there were.
The first thing is to take an occasional backup before any modifications.
And be careful with tnf testing of scripts with redirections to important commands with absolute path to the binary. In case. somehow I have found the mail binary in a same version server.

Another nice thing to remember: do wait for the password prompt, especially when someone is around you.
Sometimes it takes a few seconds to appear, and you do not want to type cleanly readable characters into the console.

Reply

56 Nate June 23, 2009 at 12:16 am

My worst mistake was when I started using Ubuntu and changing abruptly (but willingly) from Windows to Linux. I accidentally deleted the entire filesystem with a command. No backups but it was a clean install.

Reply

57 Yonitg June 23, 2009 at 8:06 am

Great post !
I did my share of system mishaps,
killing servers in production, etc.
the most embarrassing one was sending 70K users the wrong message.
or better yet, telling the CEO we have a major crisis, gathering up many people to solve it, and finding that it is nothing at all while all the management is standing in my cube.

Reply

58 romal August 7, 2013 at 7:12 am

i can imagine the situation…

Reply

59 Jerry November 6, 2013 at 9:14 pm

Sounds familiar…

Reply

60 Serge June 23, 2009 at 3:45 pm

or chown all the hidden files in a website directory with "chown user.group .*" and end up changing the owner on all the websites .. (one directory up)

one should use :
chown user.group .??*

good job ;-)
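Note that .??* skips two-character names such as .a. A commonly used alternative pattern is .[!.]*, which requires the second character to be something other than a dot, so it cannot expand to "." or ".." (it does skip names that begin with two dots, such as ..foo). A small sketch with made-up file names, using echo to preview the match rather than running chown:

```shell
# Scratch directory with sample dotfiles (names are illustrative).
workdir=$(mktemp -d)
touch "$workdir/.a" "$workdir/.profile" "$workdir/visible"

# .[!.]* requires a second character that is not a dot,
# so it can never expand to "." or "..".
matched=$(cd "$workdir" && echo .[!.]*)
echo "$matched"
```

Previewing the pattern with echo first, then substituting the real command, keeps the parent directory out of any recursive operation.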

Reply

61 rafael March 2, 2012 at 7:45 pm

I did it too, but to /home/user/.. At the end, I’ve changed all ownership of /home subdirectories

:ó(

Reply

62 Derrick June 23, 2009 at 4:10 pm

While adding a user, I attempted to specify the home directory. The directory did not exist yet, so I used the “m” option to create the directory as the user was being created.

But, I left off the name of the home directory that I wanted to create when I typed:

# useradd -s /bin/false -d /export/home -m new_user

I designated the home directory of the new user as /export/home instead of /export/home/new_user.

I ended up wiping out all the other users’ home directories on that server that were also located in /export/home!

I broke out in a cold sweat.

Thank goodness I was able to replace those directories from a backup.

*Whew*

Reply

63 Mat Enders April 17, 2011 at 10:33 am

I did something very similar. I was creating all of the user accounts on the new Samba domain controller that was going into production the next day. Everything was done and configured except creating the users. When creating the last user, instead of specifying her home directory as /home/staff/9807mr I designated it as /home/staff. The only way to fix it was to reinstall. I had all of the configuration files backed up, but I was still up all night recreating all the user accounts.

Reply

64 Solaris June 23, 2009 at 8:37 pm

Firewall lock out: done.
Command on wrong server: done.

And the worst: update and upgrade while some important applications were running, of
course on a production server.. as someone mentioned the system doesn’t run very well
without all of its original libraries :)

Reply

65 niskotink0 June 24, 2009 at 11:14 am

hi.. )
My biggest error, with IPFW on FreeBSD system..
I turned it on in the kernel:
options IPFIREWALL
and forgot ..
options IPFIREWALL_DEFAULT_TO_ACCEPT
reboot.. and ..ups )))

Reply

66 Valqk June 26, 2009 at 3:51 pm

This is how I protect myself from halting the wrong box:

in short I use molly-guard

Reply

67 Valqk June 26, 2009 at 4:03 pm

Rebooted/halted wrong servers – done. (posted a link on how to protect against this on Linux)
stopped wrong interface, firewall lockout – done.
some fun examples.
wanted to delete all hidden files in a user’s home.
rm -fr .* :-D guess if that matches . and ..? :)
and the nastiest thing I did recently was to run reiserfsck on an LVM device! ROFL! Thank God it was on a testing setup server… neither the reiser check came out OK nor did LVM work after that. :)

Reply

68 ddn June 26, 2009 at 10:18 pm

Have done /sbin/init 1 instead of /sbin/init q multiple times
after switching to an MS Natural Keyboard.

Reply

69 Alex July 8, 2011 at 10:10 am

I’ve got this “init 1″ too :) It was core router of our home network.

Reply

70 nixCraft June 27, 2009 at 2:47 am

@everyone:

Thanks for sharing your experience with us.

Reply

71 Peko June 30, 2009 at 8:46 am

I invented a new one today.

Just assuming that a [-v] option stands for --verbose

Yep, most of the time. But not on a [pkill] command.
[pkill -v myprocess] will kill _any_ process you can kill — except those whose name contains “myprocess”. Ooooops. :-!
(I just wanted pkill to display “verbose” information when killing processes)

Yes, I know. Pretty dumb thing. Lesson learned ?

I would suggest adding another critical rule to your list:
” Read The Fantastic Manual — First” ;-)

Reply

72 Maroo July 1, 2009 at 4:46 pm

I issued the following command on a BackOffice Trading Box in an attempt to clean out a user’s directory. But issued it in the /local. The command ended up taking out the Application mounted SAN directory and the /local directory.

find . -name "foo.log*" -exec ls -l {} \;|cut -f2 -d "/"|while read NAME; do gzip -c $NAME > $NAME.gz; rm -r $NAME; done

Took out the server for an entire day.

Reply

73 Jai Prakash July 3, 2009 at 1:43 pm

Mistake 1:

My friend tried to see the last reboot time and mistakenly executed the command “last | reboot” instead of “last | grep reboot”.

It caused an outage on a production DB server.

Mistake 2:

Another guy wanted to see the FQDN on a Solaris box and executed “hostname -f”.
It changed the hostname to “-f” and clients faced a lot of connectivity issues due to this mistake.
[ hostname -f is used on Linux to see the FQDN, but on Solaris its usage is different ]

Reply

74 Name July 4, 2009 at 5:20 pm

Worst thing I’ve done so far: I accidentally dropped a MySQL database containing 13k accounts for a gameserver :D

Luckily i had backups but took a little while to restore,

Reply

75 Scott September 20, 2011 at 8:04 pm

I’ve done something very similar –

mysql> update tablename set field=value;

OOPS! Should have been:

mysql> update tablename set field=value where anotherfield=anothervalue;

Reply

76 Vince Stevenson July 6, 2009 at 6:23 pm

I was dragged into a meeting one day and forgot to secure my Solaris session. A colleague and former friend did this: alias ll='/usr/sbin/shutdown -g5 -i5 "Bye bye Vince"' He must have thought that I was logged into my personal host machine, not the company’s cashcow server. That is what happens when it all goes wrong. Secure your session… Rgds Vince

Reply

77 Bjarne Rasmussen July 7, 2009 at 7:56 pm

well, tried many times, the crontab fast typing failure…

crontab -r instead of -e
e for edit
r for remove..

now i always use -l for list before editing…
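
One possible guard, sketched here as an assumption rather than a standard setup: wrap crontab in a shell function that refuses -r outright, so a slip from e to r does nothing. Some crontab implementations also accept -i to prompt before removal; check the local man page before relying on it.

```shell
# Put this in your shell startup file. The wrapper shadows the real crontab
# and rejects -r; everything else is passed through unchanged.
crontab() {
    for arg in "$@"; do
        if [ "$arg" = "-r" ]; then
            echo "crontab -r blocked; run 'command crontab -r' if you really mean it" >&2
            return 1
        fi
    done
    command crontab "$@"
}
```

Keeping a copy with crontab -l > ~/crontab.bak before editing covers the cases the wrapper does not.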

Reply

78 Eob October 6, 2011 at 9:08 pm

Yes, done that one. R and E are rather close to each other on the keyboard. In fact my name is really Rob!

Reply

79 Ian July 8, 2009 at 4:15 am

Made a script that automatically removes all files from a directory. Now, rather than making it logically (this was early on) I did it stupidly.

cd /tmp/files
rm ./*

Of course, eventually someone removed /tmp/files..
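
The usual fix is to make the rm depend on the cd succeeding. A minimal sketch, with clean_dir as a hypothetical helper name:

```shell
# clean_dir is a hypothetical helper: empty a directory only if we can enter it.
clean_dir() {
    # An empty argument must be rejected: some shells treat `cd ""` as a no-op.
    [ -n "$1" ] || { echo "usage: clean_dir DIR" >&2; return 1; }
    # && makes rm conditional on cd succeeding; the subshell keeps the
    # caller's working directory unchanged.
    ( cd -- "$1" && rm -f -- ./* )
}
```

With this, clean_dir /tmp/files simply fails when /tmp/files has gone away, instead of emptying whatever directory the script happened to start in.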

Reply

80 Cody August 20, 2014 at 1:10 pm

tmpwatch

and/or

find (but well, best test it thoroughly if you’re wanting to delete files automatically. and mind the newline risk!)

Reply

81 shlomi July 12, 2009 at 9:21 am

Hi

On my RHEL 5 server I created a /tmp mount point on my storage, and the tmpwatch script that runs under cron.daily removed files which had not been accessed for 12 hours!!!

Reply

82 Ville July 14, 2009 at 12:17 am

I run a periodic (daily) script on a BSD system to clean out a temp directory for joe (the editor). Anything older than a day gets wiped out. For some historical reason the temp directory sits in /usr/joe-cache rather than in, for instance, /usr/local/joe-cache or /var/joe-cache or /tmp/joe-cache. The first version of the line in the script that does the deleting looked like this:


find /usr/joe-cache/.* -maxdepth 1 -mtime +1 -exec rm {} \;

Good thing the only files in /usr were two symlinks that were neither mission critical nor difficult to recreate, as the above also matches “/usr/joe-cache/..”. In the above, the rather extraneous (joe doesn’t save backup files in sub-directories) “-maxdepth 1” saved the entire /usr from being wiped out!

The revised version:

find -E /usr/joe-cache/ -regex '/usr/joe-cache/\.?.+$' -maxdepth 1 -mtime +1 -exec rm {} \;

.. which matches files beginning with a dot within “/usr/joe-cache”, but won’t match “/usr/joe-cache/..”

Lesson learned: always test find statements with “-print” before adding “-exec rm {} \;”.
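
A sketch of that dry run, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do; they are not in POSIX). Letting find walk the directory itself, rather than expanding a .* glob in the shell, means ".." is never handed to it. The name list_stale is hypothetical.

```shell
# Dry run: list what would be deleted; nothing is removed here.
# -mindepth 1 keeps the starting directory itself out of the results.
list_stale() {
    find "$1" -mindepth 1 -maxdepth 1 -type f -mtime +1 -print
}

# Only after reviewing the list, swap -print for the delete step, e.g.:
#   find /usr/joe-cache -mindepth 1 -maxdepth 1 -type f -mtime +1 -exec rm -- {} +
```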

Reply

83 brownis July 18, 2009 at 6:47 am

sudo chmod 777 * -R
sudo chown user * -R
This is fine, but not if you are not in ~/

I had to reinstall Linux because of the permission mistake!

Reply

84 Shrikant July 22, 2009 at 2:42 pm

mv /* /tmp (or some where else).

Reply

85 Daniel December 30, 2013 at 9:52 pm

mv /home/foo /usr
No .profile!

Reply

86 Shantanu Oak July 24, 2009 at 6:31 am

I remember losing the .tar files, and now I know why :)
I have never lost important files while using rm -rf
But I do remember copying / creating files at the root when I forgot to add the starting /

Reply

87 Shantanu Oak July 24, 2009 at 7:12 am

One more mistake that I do remember is copying a directory to another server but without using the recursive option. That copied the files found at the root but the files stored in the sub-folders were not copied.

Reply

88 M.S. Babaei August 1, 2009 at 3:39 am

Once upon a time mkfs killed the wrong ext3 partition for me.
Instead of
mkfs.ext3 /dev/sda1
I did this
mkfs.ext3 /dev/sdb1

I will never forget what I lost.

Reply

89 Braindead August 1, 2009 at 5:57 am

I’ve done most of these too :D

My favorite mistake was sitting around waiting for 250 or so gigs of stuff to copy to an nfs share, only to find I forgot to mount the remote share and it all just copied into the directory in /mnt instead. Good thing I had a huge root partition…
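
A guard for this one, as a rough sketch: check that the destination really is a mount point before copying. It assumes the mountpoint(1) utility from util-linux is available (typical on Linux; other UNIXes need a different check), and copy_to_share is a hypothetical wrapper name.

```shell
# copy_to_share is a hypothetical wrapper: copy only if DEST is a real mount.
copy_to_share() {
    src=$1 dest=$2
    if ! mountpoint -q "$dest"; then
        echo "$dest is not mounted, refusing to copy" >&2
        return 1
    fi
    cp -R -- "$src" "$dest"/
}
```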

Reply

90 veeresh August 7, 2009 at 6:03 am

I have information about UNIX commands

Reply

91 Simon B August 7, 2009 at 2:47 pm

Whilst a colleague was away from their keyboard I entered :


rm -rf *

… but did not press enter on the last line (as a joke). I expected them to come back and see it as a joke and rofl….back space… The unthinkable happened, the screen went to sleep and they banged the enter key to wake it up a couple of times. We lost 3 days worth of business and some new clients. estimated cost $50,000+

Reply

92 Steve March 18, 2011 at 7:02 am

That seems a bit like cutting someone’s brake lines as a joke.
The least you could have done is cd to their music directory or something. :P

Reply

93 Susheel Jalali. September 5, 2011 at 7:38 am

Fear of running an unintended command or unexpected behaviour, has always inculcated in me a habit to this day: I never wake up the screen with [ENTER], but always with the most harmless [SHIFT] key.

Reply

94 RudyD November 2, 2011 at 10:52 pm

Hi!

Something similar here. I use the [Num Lock] or the [Ctrl] twice for this reason first…

I was wondering whether there is any harm in these variants, other than that a double [Ctrl] is mapped to console switching on some KVMs?

Plus one: there were a few times when I routinely sent [Ctrl+Alt+Del] on a virtual host console with plenty of Windows servers and a few Linux ones. You can bet on this. Very good trick to draw attention…

Reply

95 Julian November 30, 2011 at 8:23 pm

During my first job on AIX, while saving a file with vim, I would sometimes press another key after pressing w, so the file got saved under a new name. Usually I simply deleted these files and nothing more happened. But this is a task I had automated in my mind (rm -rf file).

I don’t know how my fingers reached the star key, but once I saved the file as *.

Imagine what happened after I finished working on the script and went back to the shell to remove the file, and my automated ‘rm -rf file’ reflex kicked in… my whole user directory was deleted…

Reply

96 ginzero August 17, 2009 at 5:10 am

tar cvf /dev/sda1 blah blah…

Reply

97 Kevin August 25, 2009 at 10:50 am

tar cvf my_dir/* dir.tar
and you write your archive over the first file in the directory …
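
For reference, the argument right after -f is always the archive, and the things to pack come after it. A small sketch with an extra check that refuses to overwrite an existing file (make_archive is a hypothetical name):

```shell
# With -f, the very next argument is the archive; the inputs come after it.
make_archive() {
    archive=$1; shift
    # Refuse to clobber anything that already exists under the archive name.
    [ ! -e "$archive" ] || { echo "$archive already exists" >&2; return 1; }
    tar -cf "$archive" "$@"
}

# Usage: make_archive dir.tar my_dir
```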

Reply

98 ST September 17, 2009 at 10:14 am

I’ve done the wrong server thing. SSH’d into the mailserver to archive some old messages and clear up space.
Mistake #1: I didn’t logoff when I was done, but simply minimized the terminal and kept working
Mistake#2: At the end of the day I opened what I thought was a local terminal and typed:
/sbin/shutdown -h now
thinking I was bringing down my laptop. The angry phone calls started less than a minute later. Thankfully, I just had to run to the server room and press power.
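
One way to catch the wrong-terminal case, sketched under the assumption that you are willing to type one extra word: make shutdown conditional on naming the machine. confirm_host is a hypothetical helper; Debian-style systems have a package called molly-guard built around the same idea.

```shell
# confirm_host is a hypothetical guard: it succeeds only when the operator
# types the name of the machine the shell is actually running on.
confirm_host() {
    this_host=$(uname -n)
    printf 'Type the hostname of the machine to shut down: ' >&2
    IFS= read -r answer || return 1
    [ "$answer" = "$this_host" ]
}

# Usage: confirm_host && /sbin/shutdown -h now
```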

I never thought about using CVS to backup config files. After doing some really dumb things to files in /etc (deleting, stupid edits, etc), I started creating a directory to hold original config files, and renaming those files things like httpd.conf.orig or httpd.conf.091709

As always, the best way to learn this operating system is to break it…however unintentionally.

Reply

99 amanda January 7, 2011 at 5:58 pm

Ohhh, I did this once in an LPIC certification class. I had my laptop running Ubuntu, but we all had an account on SUSE box the instructor wanted us to go through the class on, so I was logged into that as well via ssh. Two identical-looking terminal windows up… you can guess what I did. The worst part was that we had been working for nearly an hour and some people hadn’t saved their files…

Reply

100 Wolf Halton September 21, 2009 at 3:16 pm

Attempting to update a Fedora box over the wire from Fedora8 to Fedora9
I updated the repositories to the Fedora9 repos, and ran
# yum -y upgrade
I have now tested this on a couple of boxes and without exception the upgrades failed with many loose older-version packages and dozens of missing dependencies, as well as some fun circular dependencies which cannot be resolved. By the time it is done, eth0 is disabled and a reboot will not get to the kernel-choice stage.

Oddly, this kind of update works great in Ubuntu.

Reply

101 Ruben September 24, 2009 at 8:23 pm

While cleaning the backup HDD late at night, a ‘/’ can change everything…

“rm -fr /home” instead of “rm -fr home/”

It was a sleepless night, but thanks to Carlo Wood and his ext3grep I rescued about 95% of data ;-)

Reply

102 foo September 25, 2009 at 9:36 pm

# svn add foo
----> Added 5 extra files that were not meant to be committed, so I decided to undo the change, delete the files and add to svn again…
# svn rm foo --force

and it deleted the foo directory from disk :(… I lost all my code just before the deadline :(

Reply

103 Michael Shigorin January 9, 2011 at 6:06 pm

> …lost all my code just before the deadline :(
Learned from a borked
git rebase --interactive
early on in a personal project repo: tar the workdir up (or copy it, be it with rsync, scp, cp, mc, whatever), and only then continue with whatever you have the tiniest bit of doubt about.

Reply

104 Peter Odding January 7, 2012 at 9:47 pm

I do the same thing when I’m afraid of breaking a source code repository. Depending on context, sometimes “cp -al important-files important-backup” is enough and it’s a lot faster than creating a tarball (that command creates a tree of hard links to the files in the original tree, so if you edit a file the change is visible in both directories, but if you accidentally delete some files from the original directory you can still reach them from the backup directory).
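
A minimal sketch of that hard-link snapshot, assuming GNU cp (the -l flag is not in POSIX) and that both paths sit on the same filesystem. The function name snapshot is hypothetical.

```shell
# Hard-link snapshot of a directory tree (GNU cp: -a archive, -l link).
# No file data is copied, so it is fast, but it only protects against
# deletion, not against in-place edits.
snapshot() {
    cp -al -- "$1" "$2"
}

# Usage: snapshot important-files important-backup
```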

Reply

105 foo September 25, 2009 at 9:41 pm

I wanted to kill all the instances of a service on HP-UX (no pkill-like utility available)…

# ps -aef | grep -v foo | awk '{print $2}' | xargs kill -9

I typed “grep -v” instead of “grep -i” and you can guess what happened :(
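
A sketch of a two-step version that keeps "select" and "kill" apart, so the PID list can be read before anything dies. select_pids is a hypothetical filter; it assumes the PID is the second column, as in ps -ef output.

```shell
# select_pids is a hypothetical filter: read `ps -ef` output on stdin and
# print the PID column (field 2) of lines matching the given regex.
select_pids() {
    awk -v pat="$1" 'NR > 1 && $0 ~ pat { print $2 }'
}

# Step 1, look:   ps -ef | select_pids '[f]oo'
# Step 2, act:    ps -ef | select_pids '[f]oo' | xargs kill
# Writing the pattern as '[f]oo' stops it from matching the awk process
# itself, whose command line contains the literal text "[f]oo", not "foo".
```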

Reply

106 LinAdmin September 29, 2009 at 2:38 pm

Typing rm -Rf /var/* in the wrong box. Recovered in few minutes by doing scp root@healty_box:/var . – the ssh session on the just broken box was still open . This saved my life :-P

Reply

107 Deltaray October 3, 2009 at 4:37 am

Like Peko above, I too once ran pkill with the -v option and ended up killing everything else. This was on a very important enterprise production machine and I reminded myself the hard lesson of making sure you check man pages before trying some new option.

I understand where pkill gets its -v functionality from (pgrep and thus from grep), but honestly I don’t see what use of -v would be for pkill. When do you really need to say something like kill all processes except this one? Seems reckless. Maybe 1 in a million times you’d use it properly, but probably most of the time people just get burned by it. I wrote to the author of pkill about this but never heard anything back. Oh well.

Reply

108 Guntram October 5, 2009 at 7:51 pm

This is why I never use pkill; I always use something like "ps ....| grep ..." and, when the output looks ok, type " | awk '{print $2}' | xargs kill" after it. But, as a normal user, something like "pkill -v bash" might make perfect sense if you’re sitting at the console (so you can’t just switch to a different window or something) and have a background program rapidly filling your screen.

Worst thing that ever happened to me:
Our oracle database runs some rdbms jobs at midnight to clean out very old rows from various tables, along the line of “delete from XXXX where last_access < sysdate-3650". One sunday i installed ntp to all machines, made a start script that does an ntpdate first, then runs ntpd. Tested it:
$ date 010100002030; /etc/init.d/ntpd start; date
Worked great, current time was ok.
$ date 010100002030; reboot
After the machine was back up i noticed i had forgotten the /etc/rc*.d symlinks. But i never thought of the database until a lot of people were very angry monday morning. Fortunately, there's an automated backup every saturday.

Reply

109 sqn October 7, 2009 at 6:05 pm

As a beginner I tried to lock down a folder by removing its permissions (chmod 000) and, wanting to impress myself, did:

# cd /folder
# chmod 000 .. -R
I used two dots instead of one, and of course the system took the parent folder, which is /, and modified its permissions.
I ended up leaving home and going to the server to reset the permissions back to normal. I got lucky because I had just used dd to move the system from one HDD to another and hadn’t wiped the old one yet :)
And of course the classic: configuring the wrong box, firewall lockout :)

Reply

110 dev October 15, 2009 at 10:15 am

While I was working in many ssh windows:

rm -rf *

I intended to remove all files under a site, after changing the current working
directory, and then replace them with the stable version.

Wrong window, wrong server, and I did it on the production server xx((
I only noticed the mistake about 1.5 seconds after pressing [ENTER].
No backup. Luckily, perhaps, the site kept running smoothly.

It seems the deleted files were images or other media content;
1-2 seconds of accidental removal on a heavy machine cost me approx. 20 MB.

Reply

111 LMatt October 17, 2009 at 3:36 pm

In a hurry to get a db back up for a user, I had to parse through a several-terabyte .tar.gz for the correct SQL dumpfile. So, being the good sysadmin, I located it within an hour, and in my hurry to get the db up for the client, who was on the phone the entire time:
mysql > dbdump.sql
Fortunately I didn’t sit and wait all that long before checking to make sure that the database size was increasing, and the client was on hold when I realized my error.
mysql > dbdump.sql — SHOULD be —
mysql < dbdump.sql
I had just sent stdout of the mysql CLI interface to a file named dbdump.sql. I had to re-retrieve the damn sqldump file and start over!
BAH! FOILED AGAIN!
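
The shell can guard against this class of slip. With the noclobber option set, a plain > refuses to truncate a file that already exists. A small demonstration on a throwaway file (the file and its contents are made up):

```shell
# Demonstration in a throwaway file: with noclobber set, `>` will not
# truncate a file that already exists.
dump=$(mktemp)
printf 'precious dump\n' > "$dump"   # noclobber not yet enabled

set -C   # POSIX spelling of: set -o noclobber
if ! (echo "oops" > "$dump") 2>/dev/null; then
    echo "refused to overwrite $dump"
fi
set +C   # restore the default for the rest of the script

# When overwriting is intended, `>|` bypasses the check explicitly.
```

Putting set -o noclobber in the shell startup file makes it the default for interactive work.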

Reply

112 Mr Z October 18, 2009 at 5:13 am

After 10+ years I’ve made a lot of mistakes. Early on I got myself in the habit of testing commands before using them. For instance:

ls ~usr/tar/foo/bar then rm -f ~usr/tar/foo/bar – make sure you know what you will delete

When working with SSH, always make sure what system you are on. Modifying system prompts generally eliminates all confusion there.

It’s all just creating a habit of doing things safely… at least for me.

Reply

113 Michael Shigorin January 9, 2011 at 6:14 pm

> ls ~usr/tar/foo/bar then rm -f ~usr/tar/foo/bar
Alt-. (or less repeatable, Esc .) might help as it’s “last argument history” at least in zsh/bash/ksh:
ls -lh somethingreallylong
rm -f
[PAUSE]

> When working with SSH, always make sure what system you are on.
My favourite ~/.screenrc part:
caption always "%{+b rk}%H%{gk} |%c %{yk}%d.%m.%Y | %72=Load: %l %{wk}"

Reply

114 Michael Shigorin January 9, 2011 at 6:23 pm

erm, the example should read (<brackets> didn’t make it):

ls -lh somethingreallylong
rm -f [Alt-.]
[PAUSE]
[Enter]

Reply

115 chris October 22, 2009 at 11:15 pm

cd /var/opt/sysadmin/etc
rm -f /etc

note the /etc. It was supposed to be rm -rf etc

Reply

116 Jonix October 23, 2009 at 11:18 am

The deadline was coming too close for comfort, and I’d worked far too long hours for months.

We were developing a website, and I was in charge of developing the CGI scripts, which generated a lot of temporary files. So on pure routine I worked in “/var/www/web/” and entered “rm temp/*”, which at some point I mistyped as “rm tmp/ *”. I kind of wondered, in my overtired brain, what took so long for the delete to finish; it should only have been 20 small files to delete.

The very next morning the paying client was to fly in and pay us a visit, and get a demonstration of the project.

P.S Thanks to Subversion and opened files in Emacs buffers I managed to get almost all files back, and I had rewritten the missing files before the morning.

Reply

117 Cougar October 29, 2009 at 3:00 pm

rm * in one of my project directory (no backup). I planned to do rm *~ to delete backup files but used international keyboard where space was required after ~ (dead key for letters like õ)..

Reply

118 BattleHardened October 30, 2009 at 1:33 am

Some of my more choice moments:

postsuper -d ALL (instead of -r ALL, adjacent keys – 80k spooled mails gone). No recovery possible – ramfs :/

Had a .pl script to delete mails in .Spam directories older than X days, didn’t put in enough error checking, some helpdesk guy provisioned a domain with a leading space in it and script rm’d (rm -rf /mailstore/ domain.com/.Spam/*) the whole mailstore. (250k users – 500GB used) – Hooray for 1 day old backup
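
The missing error check could look roughly like this. The original was a Perl script; this is only a shell sketch of the idea, with spam_dir_for as a hypothetical helper: validate the domain before it is ever pasted into a path, and quote the result when it is used.

```shell
# spam_dir_for is a hypothetical helper: print the .Spam path for a domain,
# or fail if the name is empty or contains anything but letters, digits,
# dots and hyphens (so a leading space is rejected).
spam_dir_for() {
    domain=$1
    case $domain in
        ''|*[!A-Za-z0-9.-]*)
            echo "refusing suspicious domain: '$domain'" >&2
            return 1 ;;
    esac
    printf '%s\n' "/mailstore/$domain/.Spam"
}

# Usage: dir=$(spam_dir_for "$domain") && find "$dir" -type f -mtime +30 -print
```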

chown -R named:named /var/named when there was a proc filesystem under /var/named/proc. Every running process on system got chown.. /bin/bash, /usr/sbin/sshd and so on. Took hours of manual find’s to fix.

.. and pretty much all the ones everyone else listed :)

You break it, you fix it.

Reply

119 PowerPeeCee November 2, 2009 at 1:01 am

As an Ubuntu user for a while, Y’all are giving me nightmares, I will make extra discs and keep them handy. Eek! I am sure that I will break it somehow rather spectacularly at some point.

Reply

120 mahelious November 2, 2009 at 10:44 pm

second day on the job i rebooted apache on the live web server, forgetting to first check the cert password. i was finally able to find it in an obscure doc file after about 30 minutes. the resulting firestorm of angry clients would have made Nero proud. I was very, very surprised to find out I still had a job after that debacle.

lesson learned: keep your passwords secure, but handy

Reply

121 Shantanu Oak November 3, 2009 at 11:20 am

scp overwrites an existing file of the same name on the destination server without asking. I used the following command and soon realised that it had replaced the “somefile” on that server!!
scp somefile root@192.168.0.1:/root/

Reply

122 thatguy November 4, 2009 at 3:37 pm

Hmm, most of these mistakes I have done – but my personal favourite.

# cd /usr/local/bin
# ls -l -> that displayed some binaries that I didn’t need / want.
# cd ..
# rm -Rf /bin
— Yeah, you guessed it – smoked the bin folder ! The system wasn’t happy after that. This is what happens when you are root and do something without reading the command before hitting [enter] late at night. First and last time …

Reply

123 Gurudatt November 6, 2009 at 12:05 am

chmod 777 /

Never try this; if you do, even root will not be able to log in.

Reply

124 Alex July 8, 2011 at 10:35 am

A few days ago, to give myself root’s permissions, I asked a colleague to run “sudo chmod 640 /etc/sudoers” on his Ubuntu box. Result: sudo stopped working completely, and root’s password was unknown :/ Booting from a LiveCD saved our day. But I consider this sudo behaviour in Ubuntu rather stupid.

Reply

125 Michael Shigorin July 8, 2011 at 11:57 am

I’ve done (just as quite a few other folks) chmod -R .* back then, and still consider not reading manuals, not experimenting small scale beforehand, and blaming a tool instead of myself when I’m at fault rather stupid. (however dumbed down ubuntu might already be, suggesting breaking a controversial* but still security related tool even further isn’t gonna win someone IQ awards eh?)

* security pros argue that PermitRootLogin without-password and several
r_* accounts at UID 0 are better at avoiding password leak/privilege escalation:
http://www.openwall.com/lists/owl-users/2004/10/20/6

Reply

126 Cody January 16, 2012 at 9:31 pm

As Michael pointed out, the stupid thing is blaming a program for its functionality when the person is at fault (please note I’m not calling you stupid, or even trying to; see the next point). There’s a reason permissions are the way they are. Something may seem stupid, but does that make it so?

For instance : You know why you need to be root to chown a file even if it belongs to you ? Does breach of security mean anything to you ? Because as I recall, that would (could) be the end result. And yes this is very much related to permission issues.

Reply

127 richard November 9, 2009 at 6:59 pm

So, in recovering a binary backup of a large MySQL database, produced by copying and tarballing ‘/var/lib/mysql’, I untarred it in /tmp and did the recovery without incident (at 2am, when it went down). Feeling rather pleased with myself for such a quick and successful recovery, I went to delete the ‘var’ directory in ‘/tmp’. I wanted to type:
rm -rf var/

instead I typed :
rm -rf /var

Unfortunately I didn’t spot it for a while, and only afterwards did I realize that my on-site backups were stored in /var/backups …
It was a truly miserable few days that followed while I pieced together the box from SVN and various other sources …

Reply

128 Henry November 10, 2009 at 6:00 pm

Nice post and familiar with the classic mistakes.

My all time classic:
– rm -rf /foo/bar/ * [space between / and *]

Be careful with clamscan’s:
--detect-pua=yes --detect-structured=yes --remove=no --move=DIRECTORY

I chose to scan / instead of /home/user and I ended up with a screwed apt, libs, and files missing from all over the place :D Luckily I had --log=/home/user/scan.log and not console output, so I could restore the moved files one by one.
Next time I’ll use --copy instead of --move and never start from /

These 2 happened at home; at work I learned a long time ago (SCO Unix times) to back up files before rm :D

Reply

129 Derek November 12, 2009 at 10:26 pm

Heh,
These were great.
I have many above.. my first was
reboot
….Connection reset by peer. Unfortunately, I thought I was rebooting my desktop. Luckily, the performance test server I was on hadn’t been running tests(normally they can take 24-72 hours to run)..

symlinks… ack! I was cleaning up space and thought weird.. I don’t remember having a bunch of databases in this location.. rm -f * unfortunately, it was a symlink to my /db slice, that DID have my databases, friday afternoon fun.

I did a similar with being in the wrong directory… deleted all my mysql binaries.

This was also after we had acquired a company and the same had happened on one of their servers months before.. we never realized that, and the server had an issue one day… so we rebooted. MySQL had been running in memory for months, and upon reboot there was no more MySQL. Took us a while to figure that out because no one had thought that the mysql binaries were GONE! Luckily I wasn’t the one who had deleted the binaries, just got to witness the aftermath.

Reply

130 Ahmad Abubakr November 13, 2009 at 2:23 pm

My favourite :)

sudo chmod 777 /

Reply

131 Paulraj May 20, 2011 at 6:19 pm

@ Ahmad,

As a beginner using Ubuntu, I too ran into the same problem with this chmod command. One should be careful when using commands like this.

@ Vivek,
thanks for this nice post. Bookmarked it !

Reply

132 jason November 18, 2009 at 4:19 pm

The best ones are when you f*ck up and take down the production server and are then asked to investigate what happened and report on it to management….

Reply

133 Mr Z November 19, 2009 at 3:02 pm
134 John November 20, 2009 at 2:29 am

Clearing up space used by no-longer-needed archive files:

# du -sh /home/myuser/oldserver/var
32G /home/myuser/oldserver/var
# cd /home/myuser/oldserver
# rm -rf /var

The box ran for 6 months after doing this, by the way, until I had to shut it down to upgrade the RAM…although of course all the mail, Web content, and cron jobs were gone. *sigh*

Reply

135 Erick Mendes November 24, 2009 at 7:55 pm

Yesterday I locked myself out of a switch I was setting up. lol
I was setting up a VLAN on it and my PC was directly connected to it thru one of the ports I messed up.

Had to get thru serial to undo vlan config.

Oh, the funny thing is that some hours later my boss just made the same mistake lol

Reply

136 John Kennedy November 25, 2009 at 2:09 pm

Remotely logged into a (Solaris) box at 3am. Made some changes that required a reboot. Being too lazy to even try and remember the difference between Solaris and Linux shutdown commands I decided to use init. I typed init 0…No one at work to hit the power switch for me so I had to make the 30 minute drive into work.
This one I chalked up to being a noob…I was on an XTerminal which was connected to a Solaris machine. I wanted to reboot the terminal due to display problems…Instead of just powering off the terminal I typed reboot on the commandline. I was logged in as root…

Reply

137 bram November 27, 2009 at 8:45 pm

on a remote freebsd box:

[root@localhost ~]# pkg_delete bash

The next time i tried to log in, it kept on telling me access denied… hmmmm… ow sh#t

(since my default shell in /etc/passwd was still pointing to a non-existent /usr/local/bin/bash, i would never be able to log in)

Reply

138 Li Tai Fang November 29, 2009 at 8:02 am

On a number of occasions, I typed “rm” when I wanted to type “mv,” i.e., I wanted to rename a file, but instead I deleted it.

Reply

139 vmware November 30, 2009 at 4:59 am

last | reboot
instead of
last | grep reboot

Reply

140 um yea June 14, 2011 at 7:39 pm

I have done that before.

Reply

141 ColtonCat December 2, 2009 at 4:21 am

I have a habit of renaming config files I work on to the same file with a “~” at the end for a backup, so that I can roll back if I make a mistake, and then once all is well I just do an rm *~. Trouble struck when I accidentally typed rm * ~ and, as Murphy would have it, it was on a production Asterisk telephony server.

Reply

142 bye bye box December 2, 2009 at 7:54 pm

Slicked the wrong box in a production data center at my old job.

In all fairness it was labeled wrong on the box and kvm ID.

Now I’ve learned to check hostname before decom’ing anything.

Reply

143 Murphy's Red December 2, 2009 at 9:11 pm

Running out of diskspace while updating a kernel on FreeBSD.

Not fully inserting a memory module on my home machine which shortcircuited my motherboard.

On several occasions I had to use an rdesktop session to a Windows machine and use putty to connect to a machine (yep.. I know it sounds weird ;-) ). Anyway.. text copied in Windows is stored differently than text copied in the shell. While changing a root passwd on a box (password copied using putty), I just control-v-ed it and logged off. I had to go to the datacenter and boot into single user mode to access the box again.

Using the same crappy setup, I copied some text in Windows and accidentally hit control-v in the putty screen of the box I was logged into as root; the first word was halt, the last character an enter.

Configuring nat on the wrong interface while connected through ssh

Adding a new interface on a machine, filled in the details of a home network in kudzu which changed the default gateway to 192.168.1.1 on the main interface. Only checking the output of ifconfig but not the traffic or gateway and dns settings.

fsck -y on filesystem without unmounting it

Reply

144 Michael Shigorin January 9, 2011 at 7:55 pm

> While changing a root passwd on a box
…do check that you can still access it while having a root shell open.

This applies to sudo reconfiguration, groups, uids/gids, upgrades(!), and to some extent to network interface configuration and firewalls.

I’d usually reconfigure iptables like this — under screen(1) of course:

[apply changes]; sleep 30; [rollback changes]

where “apply” might be iptables -A/-I (then rollback might be iptables -D or -F), or “service iptables restart” (with “service iptables stop” to let me back in). Sure the particular solution depends on existence of e.g. NAT rules to still access the system but that’s rather nasty a habit of itself.

If I press Enter after considering the changes being done and suddenly the screen stops to respond, then I’ll wait half a minute and hopefully get the console back to reconsider.
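
The apply / sleep / rollback pattern above can be wrapped in a small function. This is a sketch only: try_change is a hypothetical name, and the iptables rule in the usage comment is made up for illustration.

```shell
# try_change is a hypothetical helper for the apply/sleep/rollback pattern.
# Run it inside screen(1) so a dropped connection does not kill it.
try_change() {
    apply=$1 rollback=$2 delay=${3:-30}
    sh -c "$apply" || return 1
    # If the change locked you out, just wait: the rollback still runs.
    sleep "$delay"
    sh -c "$rollback"
}

# Illustrative use (the rule itself is made up):
#   try_change 'iptables -I INPUT -p tcp --dport 8080 -j DROP' \
#              'iptables -D INPUT -p tcp --dport 8080 -j DROP'
```

The rollback is unconditional on purpose: if the session survived the delay, the change can then be applied for good.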

Reply

145 ehrichweiss December 3, 2009 at 6:55 pm

I’ve definitely rebooted the wrong box, locked myself out with firewall rules, rm -rf’ed a huge portion of my system. I had my infant son bang on the keyboard for my SGI Indigo2 and somehow hit the right key combo to undo a couple of symlinks I had created for /usr(I had to delete them a couple of times in the process of creating them) AND cleared the terminal/history so I had no idea what was going on when I started getting errors. I had created the symlink a week prior so it took me a while to figure out what I had to do to get the system operational again.

My best and most recent FUBAR was when I was backing up my system(I have horrible, HORRIBLE luck with backups to the point I don’t bother doing them any more for the most part); I was using mondorescue and backing the files up to an NTFS partition I had mounted under /mondo and had done a backup that wouldn’t restore anything because of an apostrophe or single quote in one of the file names was backing up, so I had to remove the files causing the problem which wasn’t really a biggie and did the backup, then formatted the drive as I had been planning………..only to discover that I hadn’t remounted the NTFS partition under /mondo as I had thought and all 30+ GB of data was gone. I attempted recovery several times but it was just gone.

Reply

146 fly December 4, 2009 at 3:55 pm

My personal favorite: a script somehow created a few dozen files in the /etc dir … all named ??somestrings, so I promptly did rm -rf ??* … (at the point when I hit [enter] I remembered that ? is a wildcard … Too late :)) Luckily that was my home box … but a reinstall was imminent :)

Reply

147 bips December 6, 2009 at 9:56 am

I once happened to run:
crontab -r

instead of:
crontab -e

which wiped the crontab list…

Reply

148 bips December 6, 2009 at 9:59 am

Also, I’ve done

shutdown -n
(I thought -n meant “now”)

which resulted in the server rebooting without networking…

Reply

149 Deltaray December 6, 2009 at 4:51 pm

bips: What does shutdown -n do? It’s not in the shutdown man page.

Reply

150 miss December 14, 2009 at 8:42 am

crontab -e vs crontab -r is the best :)

Reply

151 marty December 18, 2009 at 12:21 am

the extra space before a * is one I’ve done before only the root cause was tab completion.

#rm /some/directory/FilesToBeDele[TAB]*

Thinking there were multiple files that began with FilesToBeDele. Instead, there was only one and pressing tab put in the extra space. Luckily I was in my home dir, and there was a file with write only permission so rm paused to ask if I was sure. I hit ^C and wiped my brow. Of course the [TAB] is totally unnecessary in this instance, but my pinky is faster than my brain.

Reply

152 Janne K. October 13, 2011 at 2:18 pm

Tab completion is *so* handy, I love it. Back in the days, my zsh didn’t ask too many questions.

# rm -rf /etc/(something)[TAB][CR]

Note that ‘#’. Well somehow, the (something) part there got lost and my fingers, of course, were faster than my optical nerves and brain. Lady luck was smiling on me that day, this happened to my own workstation. Try running without /etc, it’s quite hilarious.

(Never, ever rm -rf anything beginning with ‘/’. cd, cd, cd, cd, cd…)

Reply

153 Sam Watkins October 14, 2011 at 6:50 am

with GNU rm, I like to put the -rf at the end, like this:

rm /etc/foo/bar -rf

so I can be careful!

Better still to use mv, i.e.

mv -i /etc/foo/bar/ ~/trash/

or a ‘move to trash’ script aliased to ‘rm’.
Then you can empty it later if you really did want to destroy it, like those desktop users do.
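
A minimal sketch of such a ‘move to trash’ function. The name trash and the TRASH_DIR variable are assumptions, not an existing tool.

```shell
# trash is a hypothetical stand-in for rm: it moves its arguments into a
# timestamped folder under TRASH_DIR (default ~/.trash) instead of deleting.
trash() {
    dest="${TRASH_DIR:-$HOME/.trash}/$(date +%Y%m%d-%H%M%S)"
    mkdir -p -- "$dest" && mv -i -- "$@" "$dest"/
}

# Usage: trash /etc/foo/bar
```

The timestamped folder keeps two files with the same name from colliding unless they are trashed within the same second, in which case mv -i asks first.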

Reply

154 fastgsx December 19, 2009 at 2:54 am

Worst mistake ever: ran fsck -y on an encrypted volume and trashed the whole partition. You can only check the filesystem after it’s unlocked and mounted as a logical volume.

Reply

155 CB December 23, 2009 at 10:40 pm

Worst.
Sysadmin.
Ever.

…nah, just kidding. If anything, you’ve proved your worth by helping newbies like me who don’t know what they are doing.

I’m taking your lessons to heart. Thanks for taking the time to put them out there.

Reply

156 Evilnick December 29, 2009 at 4:51 pm

heheh. yeah, i think we have all been there.

I always change the default prompt on server boxes, preferably to some flashing, garishly coloured text just to remind me i am not at home any more.

Reply

157 Solaris Admin January 4, 2010 at 3:57 am

Solaris 10 – Hot drive replacement – Lab system

prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

…… however c1t1d0 was my good disk… not the one I had just replaced :(

Reply

158 Danny January 6, 2010 at 7:34 pm


chown -R www-data:www-data /etc/

Later on….

reboot

SSH never came back up. Wouldn't start with those permissions

Reply

159 LongShot January 8, 2010 at 8:58 am

My own favorite was “chown -R userx:userx ../*” when I meant ./* in /opt/somdir – it recursed nicely, I cursed not so nicely. In my defence, I was trying to make sure I got the . files. It took many hours to straighten that mess out.

Another favorite was on day one of a new job. The local alpha-geek was hotdogging and he ran a script that pushed a new user into /etc/passwd on all production servers. But the script had no error checking and he ended up zeroing out /etc/passwd on every single one (30+ HPUX). It was like watching a slow-motion trainwreck. I felt much less intimidated after that ;)

In terms of that sinking feeling, I was telnet’d in to multiple production servers at multiple call centers (pre SSH – yes, I’m that old). One server started circling the drain (known database ipc problems) and the only solution was a quick reboot before it locked up. I grabbed a window and ran shutdown. Of course it was the wrong window so I took down 250+ people at a remote site and let the server lock up at my own site for another 250+ people. The remote site was bad enough, but after the hard power-off at my site I had to repair around 20 large ISAM database files which took about two hours. Now I try to use a different background color for each server I connect to.

Reply

160 i-ghost January 9, 2010 at 5:55 pm

You can always modify /etc/bash.bashrc and add:
alias rm="rm -i"
Mine reads:
alias rm="
echo -------;
echo Think before you delete...;
echo Use yes on stdin or -f if you must bypass the prompt;
echo You have been warned!;
echo -------;
rm -i"

It’s not the best fix, and there are better ones out there, but it works fine for home desktops.

Reply

161 mh166 January 11, 2010 at 9:41 am

# svn rm foo --force

and it deleted the foo directory from disk :(…lost all my code just before the deadline :(
Oh boy … I did the very same thing leading to everlasting loss of dozens of TEX-files …

Another one happened to me when I was going to create a cronjob that deletes all files older than X days. So I was at the shell in the correct folder and tested it:

find -mtime +23 -print -exec rm -f \{\} \;

Worked like a charm … Therefore I put it just like that into the crontab and went to sleep. On the next day I got an 8+ Meg Logwatch-Mail … thousands of lines telling me some libs weren’t found.

The bad thing: I didn’t give a starting directory for find, since for testing I did cd to it … Therefore the cron job started deleting from where it was: right away at / …. Lucky me I had a second box that was almost identical … *pheew*
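
A hedged sketch of a cron-safe variant: the starting directory is passed explicitly and checked before find runs. The function name `purge_old_files` and its argument order are assumptions for illustration:

```shell
# Hypothetical cleanup helper for cron: the target directory is always explicit.
purge_old_files() {
    dir=$1
    days=$2
    # Refuse the root directory and anything that is not an absolute path.
    case "$dir" in
        /)  echo "refusing to purge /" >&2; return 1 ;;
        /*) ;;
        *)  echo "need an absolute path: $dir" >&2; return 1 ;;
    esac
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Only regular files older than $days days are printed and removed.
    find "$dir" -type f -mtime +"$days" -print -exec rm -f -- {} +
}
```

Because nothing depends on the current working directory, the command behaves the same from an interactive shell and from cron.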

But of course, the classics have also been done: wrong box, firewall lockout, halt instead of reboot, deleting the just-restored files from the production-directory instead of the temporary location under /tmp … there was lots of fun. And i bet there will be much more waiting for me to happen. ;)

Regards,
mh166

Reply

162 Not Available January 13, 2010 at 6:31 pm

Copying many gigs of stuff with the dd command. If you mix up the source and destination, you very precisely make a copy of the blank disk over whatever it was you thought you were copying.

Reply

163 KING SABRI January 14, 2010 at 2:25 pm

I made a fatal mistake too:

yum -y remove *python*

It damaged most of the programs :(

thank you Vivek

Reply

164 sidhabo January 15, 2010 at 8:07 am

Doing a lot of copy/paste from installation manuals is always risky. Like copying with this not too unusual prompt:

hostname :> /bin/sh

If you mistakenly copy more than just /bin/sh you end up wiping out whatever executable you were trying to run (in this case /bin/sh) by doing

# :> /bin/sh

Reply

165 Luke Seelenbinder January 15, 2010 at 8:28 pm

Worst mistake:

rm -rf /*
I meant this:
rm -rf ./*

Yes… production server. Yes… I had to totally reinstall the system. Yes… I didn’t have a current backup. And you guys thought that you did stupid stuff. :-)

Reply

166 Justin Anthony January 18, 2010 at 2:27 am

i dont consider myself a noob but i did the most noobish thing in the world today

ls -a > ls

Nothing wrong with it, except i was in the /bin directory testing some new shell scripts

I couldnt figure out why my ls command would return a blank on every folder.

Reply

167 vicky January 20, 2010 at 3:12 pm

It was really great of u all to write these mistakes.I have just started on, and would always remember to check and check again, the commands I use !! :)

Reply

168 Sam Watkins January 20, 2010 at 7:59 pm

One mistake I made was to run “slay” as an unprivileged user. This damn program by default (mean mode) will kill all your own processes in that case. It shouldn’t be shipped like that in a serious distro like Debian or Ubuntu, but they ignored my complaint.

Reply

169 Yogesh January 21, 2010 at 12:23 pm

I used
rm -rf *
instead of
rm -rf xxx.*
And that in the Tomcat server home directory. As a result I needed to install the Tomcat server one more time. :)

Reply

170 matt January 22, 2010 at 4:48 am

instead of sudo sh -c 'cat /etc/passwd > /etc/passwd.core'

i did:

sudo sh -c 'cat > /etc/passwd /etc/passwd.core'

needless to say this made the system useless. i’m not sure why i was even doing this when i could have done sudo cp /etc/passwd /etc/passwd.core. i think it’s because i sometimes do sudo sh -c 'head -25 /etc/passwd > /etc/passwd.core'. the lucky part was that this server was not in production yet.

Reply

171 heinzharald January 26, 2010 at 12:05 am

Nice. Thank you for openly sharing. For all the “rm” errors – I’ve learned the hard way and have replaced “rm” via alias in bash to “rm -i” that way I get a wakeup call as soon as I want to delete big-time. I have to type “/bin/rm” to bypass it.

Reply

172 Eddie January 26, 2010 at 1:08 am

One time I typed g++ -o program.cpp program.cpp instead of
g++ -o program program.cpp!

It was a final project program, so I had to write it all over again in a couple of hours.

One of my coworkers did: apt-get remove perl
At first sight it does not seem too dangerous, but it removed many programs, including the X environment and even the apt applications, according to him.

I used that command on my old computer when I got a new one, just to see what happened. :D

Reply

173 Zachary January 26, 2010 at 4:24 am

The amount of advertising on this site is disgusting.

Reply

174 Paul January 27, 2010 at 12:26 pm

# cat /dev/null >/etc/motd
came out for some reason as
cat /dev/null >/etc/passwd
GOK why. A senior moment near going home time. Couldn’t find any old passwd files around so had to invent one with root in so I could log in at the console and extract the original from the backup tape. ps showed the users as numbers. No one noticed but my boss went into a sweat when I told him the next day :) SCO box. Last millennium!

Reply

175 heinzharald January 27, 2010 at 3:24 pm

@Zachary
What ads? I use Adblock Plus for Firefox – not one f**kin ad left :-)

Reply

176 DirtySaint January 27, 2010 at 7:04 pm

hah! Hah!…majority seem to hv tortured “rm” to death. Surprisingly “rm” is still around…
anyways my personal goof-up, about twenty years back, when there was no way of knowing where u r in the filesystem – except using “pwd”…no bash, no helpful PS1 configured to show ur current location, etc. (Actually after this episode I created a small script to show the current location).

So here I was on a Xenix 286 system (apparently a unix “developed”/supplied by Microsoft), in single user mode and thinking I was in “/tmp”, issued “rm -r *” (during those days /tmp was just a simple directory)…

Well! Instead of “/tmp” I was in “/” …rest u can imagine…also in those days “rm” was much more powerful…I don’t recall whether “-f” switch was yet invented or for that matter “-i”. Or maybe I was unaware!…”rm” was pretty raw and did what it was asked to do – no questions asked!

But from that day I treat “rm” as my extra-marital wife…be real careful…

Reply

177 Ben January 28, 2010 at 12:30 pm

I had a mail system with an IMAP Maildir structure. For some reason a single Maildir had been created under the root directory, and the user’s name contained a German umlaut, so the directory was not /.foo but /.foaer& or something similar. (Maildirs had a dot in front of the name to mark them as Maildirs.) I copied the whole directory to the home directory and wanted to delete the one under root, and typed rm -rf . foaer& (you see that little space…). Then I had other work to do on the same machine: I had to download a 300MB package, so I started wget and waited for it to finish. But when half of the package had arrived, a line appeared behind the progress bar saying: already deleted. Now I got scared: because of the “&” as the last character of the directory name, and therefore of the command, rm -rf was erasing the whole disk in the background…. My last backup was about 2 weeks old, so the whole company lost 2 weeks of mail, and I had a 24-hour job to completely set up the mail server again. This happened on a Sunday night, and my company started work at 06:00 in the morning. What a shitty night… But after this there was a budget for a new mail backup system ^^

So always be careful with mutated vowels and rm -rf!

Reply

178 glee January 28, 2010 at 3:36 pm

route -f xxx

I was trying to add a backup route to a primary sybase server and didn’t include the “hop count”. Since it didn’t work, I tried to “force” it and ended up deleting all routes from the routing table. :/

Reply

179 Phyrstorm January 29, 2010 at 1:53 am

Done just about all of the above. The worst I have done is dropped the public interface on a production server at a conference center in the middle of DefCon 17 at the Riviera. Luckily there was IT on site to get the system rebooted within 5 minutes.

Reply

180 r4pP157 January 29, 2010 at 5:28 pm

On a production Oracle RAC server, while connected to the console of one of the nodes through the system controller I typed:

~.

Which sent the server to OBP, halting the OS
instead of typing:

~~#

To exit console and go back to system controller.
Luckily that one time, the other node was up.

Reply

181 CuriousJoe January 31, 2010 at 11:41 pm

I had a usb drive with many folders of photos of my home in a main folder called “home”. I accidentally copied these as root and the folder was thus owned by root. Later on another computer went to this directory noticed the permission problem, switched to root, copied the photos and mysteriously typed:
rm -rf /home
in that situation that one “/” cost me a whole day’s work. Daft mistake, ah well, live and learn.

Reply

182 Wolf Halton September 5, 2010 at 5:12 am

I had a similar experience.
I had an old machine with several users. I wanted to add another harddrive make the /home partition there and move everything across. The problem came when I wanted to delete the old home directory to get more space back on the original disk.
rm -rf /home deleted the new home directory contents, including all the old home directory’s contents. at the end I had 100% of nothing in the most important user’s directory. No back-up and no excuses at all. This happened 6 months ago, and I could have told me (if I had bothered asking) that this would happen.

Reply

183 GuyK February 2, 2010 at 12:15 pm

Is there a smart way to chmod an entire path, like mkdir -p ?
As I typed, I forgot to remove my usual -R:

cd /any/deep/path/to/allow/
chmod -R a+rx ../../../../..

oops. the whole server was chmoded ….
luckily it was on a virtual machine. reload and reinstall.
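
One possible answer to the question above, as a sketch only: walk up from the deepest directory with dirname and chmod each level individually, without -R. The function name `chmod_path` and the optional "stop" argument are assumptions for illustration:

```shell
# Hypothetical helper: apply a mode to a directory and each of its parents,
# stopping before a given top directory. No -R, so nothing below is touched.
chmod_path() {
    mode=$1
    dir=$2
    stop=${3:-/}
    while [ "$dir" != "$stop" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        chmod "$mode" "$dir" || return 1
        # Step one level up towards the root.
        dir=$(dirname -- "$dir")
    done
}
```

Called as `chmod_path a+rx /any/deep/path/to/allow /any`, this would touch only deep, path, to and allow, and leave /any and everything else alone.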

Reply

184 Gilles Detillieux February 4, 2010 at 5:25 pm

After accidentally rebooting the wrong server a couple times, forgetting what I had ssh’ed to, I decided on a handy fix, this /usr/local/bin/confirm script:

#!/bin/sh
echo "You are about to run the command '$*' on `hostname`." >&2
echo -n "Are you sure this is what you want to do? (Y/N): [N] " >&2
read x || exit
case "$x" in
[Yy]*)  exec "$@" ;;
*)      echo "Command '$*' aborted." >&2  ; exit 1 ;;
esac

Now, on my servers I add these aliases to root’s .bashrc file:

alias halt='confirm halt'; alias reboot='confirm reboot'

That’s saved me a few times since then. The .bashrc file also already had the “alias rm='rm -i'” in there, which at first I hated, but learned to like it after it too saved my skin on more than one occasion.

Reply

185 nixCraft February 4, 2010 at 6:06 pm

@Gilles

Nice hack. I wish I had some sort of mod points here :)

Reply

186 iKay May 19, 2010 at 11:02 pm

Nice! That’s a very clever idea indeed. Noted; and will be using this in future :)

Reply

187 amanda January 7, 2011 at 8:21 pm

That is absolutely brilliant. I’m snagging this if you don’t mind.

Reply

188 Mike February 9, 2010 at 7:36 pm

I accidentally deleted /etc. I had restored /etc to another location and just went in and didn’t put the correct path in the command. That was a painful mistake.

Reply

189 Salim February 10, 2010 at 10:01 am

Accidentally rebooted servers…best is uname -a before you hit the reboot command.
Wrongly plugged a Sun M-series with 110V on one input and 220V on the other. The server wouldn’t start whatever you do…
Best backup is dump or ufsdump (Solaris); most of the time tar and cpio may lead you to lose your job.

Reply

190 Otto February 11, 2010 at 2:00 pm

Adding a user to a group:
# usermod -G groupname username

If you forget the -a (append) option, all of the user’s current supplementary groups are replaced by the one you thought you were adding.

Very tricky.

Reply

191 nathan February 13, 2010 at 12:37 am

@Otto — I did basically that to myself with sudo,

# sudo usermod -G groupname myusername

I was immediately no longer a member of ‘wheel’ and was no longer allowed to sudo.

Reply

192 Sarper February 14, 2010 at 10:07 am

I have screwed myself most ways mentioned.
My shining moment was building a firewall for a local government site.
They said port 1049 was experiencing lag, so to debug I decided to clear the firewall.
INPUT and OUTPUT were set to reject and I typed:
iptables -X
really bad part? It was a no-access VM, had to call the host to chroot and release.

ALWAYS DO THIS:

iptables -A INPUT -j ACCEPT && iptables -A OUTPUT -j ACCEPT && iptables -X

Reply

193 Anonymous coward February 15, 2010 at 12:45 am

tar -czf dirname1 &
tar -czf dirname2 &

tar -czf dirname13 &

#Seems like all of them have been tarred and gzipped

rm -rf dirname1
rm -rf dirname2

rm -rf dirname13

#Oh, some of the larger directories weren’t.

Reply

194 Michael Shigorin January 9, 2011 at 9:08 pm

Too bad, you should be running more or less serious hardware (for that mistake’s scale) to handle a dozen of active I/O+CPU jobs efficiently. Otherwise (e.g. on a older dual socket or even modern quad core) it would rather increase seek contention and scheduling overhead thus total execution time.

…or wait(1) for them to finish at least.
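
A sketch of the wait approach, assuming relative directory names and a hypothetical function name `archive_then_remove`: every tar runs in the background, and a directory is removed only after its own tar job has exited successfully.

```shell
# Hypothetical helper: archive each directory in the background, then remove
# a directory only if its own tar job reported success.
archive_then_remove() {
    pids=""
    status=0
    for d in "$@"; do
        tar -czf "$d.tar.gz" "$d" &
        pids="$pids $!"
    done
    for d in "$@"; do
        pids=${pids# }          # drop the leading space
        pid=${pids%% *}         # first PID belongs to the current directory
        pids=${pids#"$pid"}     # remove it from the list
        if wait "$pid"; then
            rm -rf -- "$d"
        else
            echo "tar failed for $d, keeping it" >&2
            status=1
        fi
    done
    return "$status"
}
```

As noted above, a dozen parallel tar jobs may well be slower than running them one after another; the point of the sketch is only that rm waits for tar.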

Reply

195 Andreas February 15, 2010 at 7:03 am

I wanted to format my sd-card but I typed:

dd if=/dev/zero of=/dev/sda instead of dd if=/dev/zero of=/dev/sdc

and watched my system being formatted in front of me.

Reply

196 Steve March 18, 2011 at 6:47 am

I did almost exactly the same thing, except I was copying a bootable image to the card.
I ended up overwriting my primary hard disk’s partition table.

It would have been easy to recover the data, had I not partitioned as XFS!!!

Reply

197 Fool February 17, 2010 at 10:56 am

On a Personal PC, not so much a linux mistake though…

This was in the age of laptops moving to having NO more floppy drives, I needed some way of removing grub as it was no longer needed. Since there were no floppy drives I was searching for an alternative to the win98 bootdisk so I could run a “fdisk /mbr”.

Found a nice little program that I burnt to a disc, that would allow me to modify the boot sector.

… accidentally ended up wiping the entire partition table!

Reply

198 mordovorot February 18, 2010 at 5:49 pm

Here is what I did:

– Backed up /etc directory on a Sun Solaris server:
oldbox> tar cf etc_oldbox.tar /etc

– Transferred the tar to my home directory on the new server and extracted it to get some config files:
newbox> tar xf etc_oldbox.tar

The newbox was killed. Why? The native Sun tar doesn’t remove the leading “/” by default, as GNU tar does. As a result, the whole /etc content was completely overwritten with files from another server. Linux people, be careful with Sun Solaris!
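
A hedged sketch of one way to avoid absolute member names regardless of which tar is in use, assuming a tar that supports -C (GNU tar does). The function name `tar_relative` is an assumption for illustration:

```shell
# Hypothetical helper: archive a directory with member names relative to its
# parent, so extraction lands under the current directory, not under /.
tar_relative() {
    archive=$1
    src=$2
    tar -cf "$archive" -C "$(dirname -- "$src")" "$(basename -- "$src")"
}
```

Running `tar tf etc_oldbox.tar` before extracting shows whether any member name starts with "/".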

Reply

199 Mike Lama February 19, 2011 at 9:53 pm

It’s not a Solaris fault. Even on Linux, tar will assume the base path of the argument. ALWAYS ALWAYS use a relative path starting with . as the argument

Reply

200 Cody August 20, 2014 at 1:13 pm

Set up a staging environment, backup, …

No, as Mike put it: not the fault of Solaris. I’m afraid it was a user error.

Reply

201 Daniel February 19, 2010 at 12:56 pm

I saw a fresh installed centos system broken by command:
yum remove python :)

Reply

202 Daniel May 9, 2014 at 11:55 am

I did that. apt didn’t work so I couldn’t install it. I used the iso and dpkg.

Reply

203 Cody May 9, 2014 at 4:50 pm

I’m calling nonsense on this. Either it was not a true CentOS install (which is discussed in their FAQ or some such… web and virtual private server where the host – company – claim they have CentOS but in reality it is a mock up at best) or it was very early version without the way it is now (and I somehow doubt that). Because the truth is yum will remove dependencies and guess what yum relies on? Exactly – python. So you’ll get something like this (just tried it even… though I knew the result already):

Error: Trying to remove “yum”, which is protected
You could try running: rpm -Va --nofiles --nodigest

Of course, that is not to say you couldn’t remove python but that command as you give is not going to cause that in any way, certainly not for many years (2010 included, which is when Daniel posted the message unless my eyes are tricking me).

Python is not the only package that this protection will occur with, either. Further, if it is a plugin (or setting) it is most certainly _not_ set that way by default. That would be beyond stupid. Of course, fixing it wouldn’t be too difficult, assuming you have physical access, but that’s another issue entirely.

Reply

204 Michael Shigorin May 9, 2014 at 6:19 pm

Guys, let’s not turn a valuable discussion into yet another user support thread.

Reply

205 Cody May 9, 2014 at 6:35 pm

Define ‘support’ ?

I was stating that what was claimed as something that happened is a fishy description. So in other words: if there is a mistake to be described and it isn’t a mistake, then it is off topic (following your suggestion that it should only be about the topic: an actual mistake made at the command line, lest it becomes a “support thread” albeit with a very strange definition of support). But I don’t think it is off topic because the point of their statement was that it broke the system by a mistake (at the command line) and that is regardless to what truly happened (indeed, I could claim I did the firewall lock out one but truthfully I have not… still, if I claimed it, would that be off topic since I actually didn’t do that? You know, kind of like I wondered about how he managed to pull of what is blocked by default…).
Shortly: My response was nothing of support – not even close to – and neither was it anything off topic (it was in response to something on topic). Therefore, I wasn’t turning the discussion anywhere at all (unless your idea of turning the discussion is in fact participating in it…). Matter of fact, I was in the prompt thread at one point trying to explain something to someone BUT I realised that it was beginning to turn into helping rather than discussing, and I _personally_ stopped it and mentioned the reason for it. The only thing off topic is THIS (which I shouldn’t have to explain)! And if this sounds as me being grouchy, well, I cannot help that fact and it doesn’t change the fact I did nothing wrong (no matter how you interpret it). If I misinterpreted your response, then I am sorry for that (but just) – it was a really bad night last night. Bottom line: I was on topic.

Reply

206 Katzider February 19, 2010 at 5:59 pm

#init 0
On remote clustering server…

Reply

207 Mike Davies February 21, 2010 at 9:05 pm

This was years ago and now I think back about what an idiot I was.
I created a user, mike of course in a box running DEC 4.0E and I decided I wanted more permissions so I decided to make it user 0, just like root. The system actually asked if I was sure I wanted to do this. I said yes….It took me 2 hours to get the system back up and running because it changed every file in the system that was owned by root to mike.
dumb, real dumb.
Being superuser can be a disaster!
I use dd a lot now and make many backups.

Reply

208 Kang February 24, 2010 at 3:09 pm

Stupid french AZERTY keyboards have the * key located just on the left of the ENTER key.

while typing some :
mv /path/to/file /usr/local/
I pressed the ENTER key… pressing the * key at the same time :)
It ended up executing :
mv /path/to/file /usr/local/*
which moved the file and every other entry in /usr/local into the last directory in /usr/local

Since then, I’m using only QWERTY keyboards ^^

Reply

209 james February 24, 2010 at 3:57 pm

On at least three entirely separate occasions I’ve been on the command line within a C project I’ve been writing and gone to remove backup files:

rm *~

but been distracted after typing the asterisk and then switched back to the terminal and immediately pressed enter, forgetting about the tilde sign and immediately deleting weeks’ worth of *.c files.

Luckily I found ext3undelete, and while I did recover my files, it painfully insisted upon recovering the entire partition. Which of course happened to be the largest partition. So I then spent the rest of the evening not only waiting for it to restore the files I actually wanted, but manually deleting everything it restored which I had *not* deleted in the first place so the partition I was restoring to, did not run out of space before the files I actually deleted were restored!

Reply

210 zeve March 4, 2010 at 8:45 am

I did:
# route -f
on my early first days, ended up with no routes.

Reply

211 philbo March 4, 2010 at 2:49 pm

A nice old one – not by me but by a less-than-worthy colleague, who liked to do everything as root on the server box:

WHAT HE MEANT : last | grep reboot

WHAT HE TYPED : last | reboot

Reply

212 Cody May 9, 2014 at 4:56 pm

I already mentioned this here but it is worth mentioning again. Whether he meant last | grep reboot or not, what he _really_ meant (read: should have done) is:

$ last reboot
(Of course, yes, insisting on being root “in case” is a problem as you point out, which is indeed why I used $. Of course, I could also have used % but I hate csh – it’s a joke especially for C programmers like myself).

Reply

213 Micha March 4, 2010 at 7:45 pm

The “best thing” happened to me. I tried to clone a server system after 12 hours of configuration.
But typed: dd if=/dev/sad of=/dev/sda bs=1024
The work was done for nothing. The same thing even happened with the Windows Server 2008 Backup Tool.
What have I learned? Check your syntax and read warnings when they appear … they are useful for something at least …

Reply

214 Nick March 8, 2010 at 6:15 am

I work as a student worker at a university and as a web developer there have some limited sudo access on the production web server. They upgraded to a new server and didn’t warn me that they hadn’t made bash the primary shell. So, naturally I typed the same commands I always have to change ownership of a folder to myself so I could make some changes to files and inevitably changed ownership of the whole server to myself! Oops!

sudo chown -R user folder [tab] / didn’t execute the same as sudo chown -R user folder/

Reply

215 Barius March 8, 2010 at 2:22 pm

We had a case of rm -rf / at work recently, but thankfully I was not the culprit. After that, they renamed the OPS team to OOPS.

My fav was a laptop which dual-booted windows 2000 / linux. I installed VMware under w2k with raw disk access, which allowed me to boot the linux partition as a VM while running windows. This worked great for about a year, until I gave the wrong partition number to mke2fs and formatted the NTFS as EXT2 while w2k was running. W2k didn’t even notice until its fs cache was cleaned out by a reboot. After that it was–how shall I say–completely fscked.

Reply

216 Kenneth Heal March 9, 2010 at 9:19 pm

Once I installed a Linux server and forgot to log out of the local console. I noticed this and foolishly decided to pkill all the user’s processes. The user happened to be root and one of its processes was sshd, which meant I locked myself out of the box.

Reply

217 matt March 12, 2010 at 3:50 pm

The best one I have seen (and had to rebuild the server afterwards) was someone’s Q&D tidy up script to clean old files out of /tmp…

cd /temp; find . -mtime +30 -exec rm -f {} \;

Practically everything on that line was fine… except it failed to complete the change directory and then went its happy way from / deleting anything that hadn’t been changed in the last month until it hit something important, at which point the server went down. It was run by cron as root in the wee small hours so no-one spotted the effect until the next morning.

The best thing about this apart from it not being _me_ that executed the command… the management believed the line that it was a virus… on AIX a decade ago.
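
A minimal sketch of the same tidy-up with the cd and the find tied together by &&, so a failed cd stops everything. The function name `clean_dir` and its arguments are assumptions; the body runs in a subshell so the caller’s working directory is untouched:

```shell
# Hypothetical tidy-up: find only runs if the cd succeeded.
# The ( ... ) body is a subshell, so the caller's directory never changes.
clean_dir() (
    cd "$1" && find . -type f -mtime +"$2" -exec rm -f -- {} +
)
```

With the misspelled /temp, cd fails, the && short-circuits, and nothing is deleted.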

Reply

218 matt March 12, 2010 at 4:02 pm

Oh, I forgot the other really funny error… customer who wouldn’t use vi… copy /etc/passwd to a PC using Windows built-in ftp, add a user in to it in notepad and then use ftp to put it back… and added a ^M to all of the entries!

A call in late afternoon… ‘I can’t log on. Help!’

Reply

219 George March 17, 2010 at 7:59 pm

First week with my new Ubuntu…
– Program: Libc6 needs to be updated to run
– User:
– User: sudo apt-get remove libc6
– System: WARNING: DELETING THIS LIBRARY MAY AFFECT YOUR SYSTEM. TYPE ‘I’M AWARE OF THIS AND WANT TO CONTINUE’ IF YOU WANT TO PROCEED.
– User: types ‘I’M AWARE OF THIS AND WANT TO CONTINUE’ …
– System: *sigh*… ok… deleting everything that depends on C (the whole system)
– User: open eyes, stomach twists, push “Power” button in panic
– User: restart computer, nothing works. Grub had problems, can’t access windows installation. Doesn’t have Live CD. Has to ask to friends for a Live CD, and explain the whole story every time.

Reply

220 Tom June 9, 2011 at 10:35 pm

Sounds familiar.
A few years ago, as a newbie Linux admin, I tried to upgrade a very old Slackware router with a single IDE drive. I discovered that a bug had been reported in the openssl library and wanted to upgrade it.
But the system said: first upgrade libc6 (or libc5, whatever). So I did. It FAILED.
I rebooted the box (it had a high uptime; nobody had rebooted it for a long time). The system didn’t come up. I also saw on screen: IDE TIMEOUT.. blah blah. The drive had died.
I had a backup machine with a clean Debian install on mirrored RAID, but it was still missing some things.. Learned how iptables works within 1.5 hrs :)))
It happens.. That was almost exactly 28001 hours ago, because this server is still up and that is the hard drive’s “Power on hours” attribute.

Reply

221 Daniel May 9, 2014 at 11:58 am

Python needed downgrading, so sudo apt-get remove python. Just as bad as libc6.

Reply

222 I know this guy... March 17, 2010 at 11:49 pm

Can’t top the keystrokes, but think I’ve got a top 10 spot for consequences:

One of those ‘start rebuilding the wrong box in a failed redundant pair’ scenarios.
Sadly it was the dispatch note / purchase order processing system for the largest European lights-off central distribution center in a very big industry sector, and the config had been ‘evolved’ over time.
Result: they couldn’t dispatch anything and had to turn away every delivery for a day and a half. We are talking about a hundred cross-continental trucks sent back to their depots empty, or still full.
I could tell you that it wasn’t me, but I don’t expect anyone to believe that.

Moral: Never, ever, ever trust a sticky label – look at the prompt, and if the prompt doesn’t tell you: change it so it does. SVN’ed config would have massively reduced down time too.

Reply

223 Clark March 22, 2010 at 2:58 am

Was on a server, one of a dozen for TRAINING airmen and soldiers on a military base, not to track weapons or shoot anything. The same training scenarios have been run for months. Logs (multiple hundreds of megabytes were collected per day) are kept for at least a year. I wanted to copy a day’s log collection and reset the log queue so the next daily logs don’t get too terribly big for analysis (search for unauthorized logons.) I entered the command:
# copy {logfile} {logfile.bak}; rm {logfile}; touch {newlogfile}; {start proven analysis script on logfile.bak};

Copy is a DOS command. Rm worked. I had no more logfile for one day of training. I immediately and voluntarily told the information assurance team lead that I accidentally deleted this one log file.

I was escorted off base within 15 minutes, no longer employed.
This is a true story. I wish it weren’t. There is no humor in this event.

Lesson for my next job: the rm command is the enemy when combined with idiot information assurance staff members. Avoid it. Use the mv command instead.

Reply

224 Michael Shigorin January 9, 2011 at 9:29 pm

I’d say no, rather “use && instead of ; (or don’t chain up stuff at all)”.

mv can leave you longing for rsync if it bails out half-way through a large tree to move… of course, for a single file it’s rather about “use mv but don’t chain”.

That lead probably wasn’t an idiot, just an army man. Still, take my condolences.
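
A sketch of the && advice applied to a copy-then-reset step. The function name `rotate_log` and its two arguments are assumptions for illustration:

```shell
# Hypothetical log rotation step: each command runs only if the previous one
# succeeded, so a failed copy leaves the original log in place.
rotate_log() {
    log=$1
    backup=$2
    cp -p -- "$log" "$backup" && rm -- "$log" && touch -- "$log"
}
```

Had the first command been a mistyped or missing program, the chain would have stopped there and the rm would never have run.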

Reply

225 Gerard March 22, 2010 at 10:31 pm

My worst ever … in “/root” doing

chown -R root:root .*

.* also includes ..
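
A common workaround, sketched here with a hypothetical function name `list_dotfiles`: the patterns `.[!.]*` and `..?*` together match hidden names but can never match "." or "..". Listing what a pattern expands to before handing it to chown -R is a cheap check:

```shell
# Hypothetical helper: print the hidden entries of a directory,
# never "." or "..".
list_dotfiles() (
    cd "$1" || exit 1
    for f in .[!.]* ..?*; do
        # An unmatched pattern stays literal, so keep only real entries.
        [ -e "$f" ] || [ -L "$f" ] || continue
        printf '%s\n' "$f"
    done
)
```

Whether a bare `.*` matches ".." depends on the shell and its options, so the explicit patterns are the safer habit.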

:)

Reply

226 naomi March 25, 2010 at 5:14 pm

update Customers set Surname = 'Smith'

Not strictly linux CLI but … Cold sweat is a very apt description!

Reply

227 Anonymous March 26, 2010 at 9:22 am

nice mistakes

Reply

228 Pradipta March 30, 2010 at 10:47 am

I ran the command “#last|reboot” to get the last login information on a production box and ended up rebooting a server with 18 VPS.

Reply

229 Rules to live by March 30, 2010 at 11:02 pm

Yes, I have done some excellent fatfingers, IDtenTs, etc…more importantly here is what I do now to help combat human error…my personal favorite is /sbin/service network down and forgetting you are not on console…Doh!

1. Measure twice cut once – Look carefully at the commands prior to running them, don’t ever hit Enter unless you really know what the command will do.
2. Build a practice – Being a SysAdmin is like being a doctor, think about how you do what you do, do it consistently…document it in a wiki ideally.
3. Take the wiki documentation you create and automate what you do in scripts so life is easier, and there are fewer mistakes, as you test your scripts on a lab machine.
4. NEVER TEST ON PRODUCTION
5. Create build plans for what you plan to do if it is complex…refer to your docs when you do it, ideally test your docs on a lab server or a DEV server prior to doing it on PROD.
6. Play nicely with others.
7. Hire junior admins to do the junior stuff to keep the senior admins to mentor, and do senior stuff…


230 ju March 31, 2010 at 8:04 am

Created a Perl script to fetch files from a remote server but forgot to add a check that the directory existed. All of the remote files were lost because my scp command was overwriting them on the destination path.


231 dexter April 2, 2010 at 12:04 pm

I typed the three-finger salute to log in to a Windows server through a KVM switch, but unfortunately I was on the ERP box running RHEL, where the Ctrl+Alt+Del trap wasn’t deactivated… The reboot took 30 minutes. It was during a work experience placement and I was alone in the I.T. department.


232 NightDragon April 3, 2010 at 7:33 am

In my company the default set up for desktop systems was a mv and rm without -i … so i wanted to delete some files … the usual mistake (a space between dir and the wildcard)… and thats it


233 MEH April 7, 2010 at 6:06 am

I was administering a CentOS box remotely via ssh and trying to resolve an RPM hell of one sort. Finally got things down to an ssh version conflict and decided to uninstall it.

# yum -y remove openssl

I was a bit puzzled to notice that my ssh session was terminated, and I had to find someone to access the box via console to install ssh back so I could restore all the damage done so easily.


234 Chris de Vidal April 8, 2010 at 1:20 am

Dump rocks, but I found out the hard way of its Linux limitations.


235 Chris de Vidal April 8, 2010 at 1:23 am

Clark: SAD story. Lesson I have to keep learning is not to chain commands with ; but && (you are not alone).


236 Dave April 14, 2010 at 6:43 pm

I loved this. I wish I could remember some of my specifics, but I’ll just have to add a general comment. It applies to anyone sitting at a console – and I learned them the hard way a long time ago.

Never test a script or program on a production server. It WILL burn you eventually. We’re human, and we make mistakes.

Test every file manipulation command by first listing the files that will be affected without actually modifying anything.
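One way to follow that rule, sketched with find in a scratch directory (the file names are invented): run the selection with a harmless action first, then rerun exactly the same expression with the destructive one.

```shell
scratch=$(mktemp -d)
touch "$scratch/report.log" "$scratch/debug.log" "$scratch/notes.txt"

# Step 1: look at what the expression selects. Nothing is modified.
find "$scratch" -type f -name '*.log' -print

# Step 2: only after checking the list, rerun the SAME expression with the action.
find "$scratch" -type f -name '*.log' -exec rm -- {} +

remaining=$(ls "$scratch")
echo "$remaining"    # notes.txt
```

The same habit works with plain globs: run `ls` or `echo` with the pattern, read the output, then recall the line and change the command name.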

Let your users know you’re taking the production system down well in advance, so they have plenty of time to prepare. Emergencies do happen, but you’d better be able to explain to the CEO why his administrative assistant’s work was suddenly interrupted, or why the presentation they were giving to a customer that just flew 12,000 miles to see it was interrupted.

Never run through the hallway like you’re racing to put out a fire – it scares people. Calmly act as though you have it all under control and use the extra time to THINK about what you can do to resolve the crisis. Never go to your boss with an ill-formed, by-the-seat-of-your-pants analysis. If you’re wrong, you’ll look like a reactionary or a chicken-little. Take the time to think, analyze, and consult w/ peers. You’ll look more professional.


237 Mr Z April 14, 2010 at 7:21 pm

Dave, I agree completely, even the part about not scaring them LOL, but you did forget one thing:

Have a backout plan and TEST that plan on the test system prior to even thinking about touching the production system. When you are certain that plan B works it’s really easy to look calm while you stop to get a cup of coffee on the way to the data center!! Trust me on that one.


238 Mark Johnson April 16, 2010 at 7:23 am

Setting up a testing server I wanted to copy /etc from another machine. On the machine I was copying to I ran
mv /etc /etc.old
instead of
cp -a /etc /etc.old
Reinstall, then…


239 Mike April 20, 2010 at 8:38 am

Why didn’t you just ‘reverse’ it then:

mv /etc.old /etc ?


240 Hans Henrik May 7, 2010 at 7:16 pm

ah, or
cp -a /etc.old /etc


241 Marco April 16, 2010 at 4:20 pm

While cleaning up someones mistake of running this on a live production box:

rm -rf / home/[…]/file
(note the space)

All that was left was an SSH connection and existing services – the box was maybe half operational with no one onsite at the DC, restoring files to the box from another similar production machine with RSYNC. Went to restart a daemon with:

ps -aef | grep (service) | awk '{print $2}' | xargs kill -9

Ran the angry loopus but forgot the grep

ps -aef | awk '{print $2}' | xargs kill -9

Within moments init was killed and the box was lost.
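Without the grep, that pipeline hands kill the PID column of every process, PID 1 included. A hedged sketch of a guard follows. `kill_by_name` is a hypothetical helper, not a standard command, and it assumes pgrep from procps is installed:

```shell
# kill_by_name is a hypothetical helper, not a standard command.
kill_by_name() {
    name=$1
    # An empty pattern is exactly the "forgot the grep" case: stop here.
    if [ -z "$name" ]; then
        echo "kill_by_name: refusing to run without a process name" >&2
        return 1
    fi
    # pgrep -x matches the process name exactly and prints only PIDs,
    # so there is no header line and no grep process to filter out.
    pids=$(pgrep -x "$name") || {
        echo "kill_by_name: no process named '$name'" >&2
        return 1
    }
    echo "sending TERM to: $pids"
    # Unquoted on purpose: one PID per word. TERM first; -9 is a last resort.
    kill -TERM $pids
}
```

Usage would be along the lines of `kill_by_name httpd`. Running it with no argument fails instead of signalling everything.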


242 Joey Adams April 16, 2010 at 4:29 pm

This these aren’t Unix mistakes, but I was backing up files from my old Macintosh (Mac OS 7.6), so I sent them via FTP, but the program I was using defaulted to text mode. The real mistake was not knowing about md5sum .

Also, I used TI-Connect to back up 2 years of BASIC programs from my TI-89. After backing up, I wiped my calculator. I wanted to move the backup archive to another directory, so I issued Cut from TI-Connect so I could Paste it into another. Paste didn’t work.


243 Joey Adams April 16, 2010 at 4:30 pm

Oops, should have proofread. s/This these/These/


244 Shehzad April 17, 2010 at 10:50 am

With great power comes great responsibility.

Once I run,
rm -rf *
and then later found few directories were required. (luckily already had backup of them all)


245 me April 20, 2010 at 6:22 am

made a script to delete files older than and put script file in that folder
when script was old enough…


246 Christian-Manuel Butzke April 20, 2010 at 8:33 am

A perfect evening. At 3 AM I finished revising some last-minute changes on my local Ubuntu virtual box, pushed them to git, updated the dev server’s AWS instance, rechecked, and then, ready to sleep and shut down my Mac, I entered “init 0” to shut down my local Ubuntu box… good night…

unfortunately it was not the ubuntu box i shut down, but the AWS instance.

As AWS instances will be deleted on termination and shutting down an instance results in termination of the instance, the complete dev server + all the data was gone…
No backup ( the instance had a ebs connected to contain the db data, but a year ago we switched from postgres back to mysql. for some pleasant reason, only postgres was using ebs…)

I would have had much more fun in executing “sudo rm -rf /” manually…

This enter key is quite a dangerous weapon…..
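A small guard that might have helped here, sketched under the assumption that each machine has a distinct node name. `on_host` is a hypothetical helper and the host name in the usage line is a placeholder:

```shell
# on_host is a hypothetical helper: succeed only if this machine's
# node name matches the one the caller expects.
on_host() {
    expected=$1
    actual=$(uname -n)
    if [ "$actual" != "$expected" ]; then
        echo "on_host: this is '$actual', not '$expected'; aborting" >&2
        return 1
    fi
}

# Usage (name is a placeholder): the shutdown only runs on the intended box.
#   on_host my-local-vm && sudo shutdown -h now
```

Putting the host name in the shell prompt serves the same purpose with less typing, provided you actually read the prompt at 3 AM.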


247 Yehosef April 20, 2010 at 9:13 am

I was testing some changes to the database schema locally for a live remote database. After doing one of the tests, I was going to drop the local database and restore a previous version from a backup file. Unfortunately, I had both the local and remote database admin panels open in separate tabs in my browser and I dropped the wrong (live) database.

I had made backups of the live database shortly before this, but there was an error in the dump and only half the database was there.

And the datacenter didn’t realize that we had wanted them to back up the database dir also…


248 yehosef April 20, 2010 at 9:14 am

I realized it’s not really a unix command line mistake – nevermind.


249 slayedbylucifer April 24, 2010 at 5:45 am

I guess this is the dumbest one.

yesterday I installed RHEL-6 Beta and was logged into it from home via ssh. I had an X session active on the server which I left running yesterday when I left the office.

/home is an LV and I wanted to reduce it, but I could not as it said the volume was in use. So I thought running killall5 would kill my active X session (as lsof showed only my X session was using /home) and leave my ssh session from home running.

haha, I ran killall5, and my putty got disconnected; I can no longer connect to my server. I will be going to the office on Monday and will start the ssh service.

Fortunately, this was a test system I had configured to playaround with RHEL-6.


250 matt April 26, 2010 at 6:44 pm

when i was new to linux i used
rm /home with no -i by default…. what was meant was ls /home
read before you hit that enter key!!!!!


251 Sepiraph April 30, 2010 at 5:55 am

Mine wasn’t a Unix mistake but an IOS mistake …

The date was Christmas Eve, 2009. I was doing some firewall changes on a company router for a friend I knew back in HS. I put:

deny x.x.x.x 255.255.255.0
instead of
deny x.x.x.x 0.0.0.255

since I didn’t use the reload timer command … he had to drive to the data center…
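The two lines differ because the access list expects a wildcard mask, which is the bitwise inverse of the subnet mask. A tiny sketch of the arithmetic, using `wildcard_mask` as a made-up helper name:

```shell
# wildcard_mask: print the inverse of a dotted-quad netmask.
# The body runs in a subshell, so changing IFS does not leak out.
wildcard_mask() (
    IFS=.
    set -- $1                     # split the mask into its four octets
    echo "$((255 - $1)).$((255 - $2)).$((255 - $3)).$((255 - $4))"
)

wildcard_mask 255.255.255.0       # prints 0.0.0.255
```

So a /24 network is written with 0.0.0.255 in the ACL, and typing the netmask instead matches a very different set of addresses.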


252 Win May 5, 2010 at 2:59 am

> Conclusion:
> 3. Never use rsync with single backup directory. Create a snapshots using rsync or rsnapshots.

Can you elaborate a bit on this? Not too sure what you mean.


253 Michael Shigorin January 9, 2011 at 9:48 pm

Maybe “you can ruin a single backup directory with e.g. rsyncing an empty dir onto it”. With snapshots (think cp -al), one has a bit more trouble destroying that backup — still replacing a heavily hardlinked file’s contents will ruin its contents in all the hardlinked snapshots.
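The hard-link caveat can be seen in a scratch directory. This is only a sketch with invented file names, and it assumes a cp that supports -l (GNU cp does):

```shell
scratch=$(mktemp -d)
mkdir "$scratch/live"
echo "version 1" > "$scratch/live/config"

# Snapshot as hard links: no file data is copied.
cp -al "$scratch/live" "$scratch/snap1"

# Overwriting in place writes through the shared inode: the snapshot changes too.
echo "version 2" > "$scratch/live/config"
after_overwrite=$(cat "$scratch/snap1/config")

# Second snapshot, then REPLACE the file: write a new file and rename it over
# the old name. The snapshot keeps the old inode and the old contents.
cp -al "$scratch/live" "$scratch/snap2"
echo "version 3" > "$scratch/live/config.new"
mv "$scratch/live/config.new" "$scratch/live/config"
after_replace=$(cat "$scratch/snap2/config")

echo "snap1 after in-place overwrite: $after_overwrite"   # version 2
echo "snap2 after rename-replace:     $after_replace"     # version 2
```

snap1 was taken at "version 1" and silently became "version 2", while snap2 survived the later change. Tools that write a temporary file and rename it are safe for such snapshots; anything that edits in place is not.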

So disk based backup is complemented really nice by tape one (modern tapes can take around 800Gb uncompressed data at ~100Mb/s, and tape changers cost somewhat less than a fortune by now). Bacula is highly recommended and worth the time.


254 Valeri Galtsev January 16, 2012 at 8:40 pm

Hm… somehow rsync works differently for me (on FreeBSD, Linux – CentOS 6, Mac OS 10.6). In particular, rsyncing an empty directory onto a non-empty directory doesn’t change anything. I.e., rsync works as documented: it copies over to the destination _only_ files that have an older timestamp on the destination, leaving everything else intact. If one also wants to keep an older copy of a file being updated, one can add the -b option (which will rename the older “file” to “file~” before copying the newer version).

[valeri@point ~]$ mkdir test1
[valeri@point ~]$ mkdir -p test2/test1
[valeri@point ~]$ touch test2/test1/file
[valeri@point ~]$ rsync -avu test1 test2
sending incremental file list
test1/

sent 57 bytes received 16 bytes 146.00 bytes/sec
total size is 0 speedup is 0.00
[valeri@point ~]$ ls test2/test1/file
test2/test1/file

– you see, the “file” still exists! I use rsync for years, I never had anything trashed by it. But I _did_ my share of other flops described here!


255 dani May 6, 2010 at 9:38 am

last |reboot

instead of

last |grep reboot

while logged as root on critical production system … downtime were about 40 minutes :S


256 Simon May 10, 2010 at 1:21 am

In my home directory, I usually have a couple of .torrent files named “[isohunt] foo.torrent”, “[isohunt] bar.torrent” and alike.
Even if you’re sure you have many files starting with the same letters, don’t type
$ rm \[iso*

In one case, there was only one such file, so what I did was essentially
$ rm \[isohunt\] foo.torrent *
instead of
$ rm \[isohunt\]\ *

erasing my entire home directory (at least without subdirectories). What a shame
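A sketch of a safer way to delete such files, run in a scratch directory with invented names: quote the fixed part of the name, leave only the trailing `*` as a wildcard, and preview the match first.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch "[isohunt] foo.torrent" keep-me.txt

# Preview: print one matched name per line instead of deleting.
printf '%s\n' "[isohunt] "*

# Quoting keeps the brackets and the space literal; only the trailing * is a
# wildcard. '--' stops option parsing in case a name starts with a dash.
rm -- "[isohunt] "*

left=$(ls)
echo "$left"    # keep-me.txt
```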


257 Alex May 13, 2010 at 9:34 am

I had two similar PCs where I had to install Ubuntu and similar software. I am a lazy boy ;), so I’ve installed Ubuntu on one of them (let’s say BoxA) then I inserted HDD from second one (let’s say BoxB) to BoxA and run:

dd if=/dev/sda of=/dev/sdb bs=1M

to clone the HDDs. For some reason the HDD with the working system was sdb after the reboot. So as a result, instead of two working PCs I got two clean HDDs.

That was really epic fail :)


258 Valeri Galtsev January 16, 2012 at 8:56 pm

Yes, linux has quite a few weird (well, illogical) things. You had one drive, and it was assigned the sda device name. You added a second drive to clone the first onto, but when you booted the _linux_ live CD, the former sda device was assigned the sdb name and the second drive sda. Expect dd cloning to fail unless you know linux is weird here, or better, test which device is which. The same is (or at least was for quite some time) true of network interfaces. Linux names devices in the order they are discovered, only it reverses it, as if it pushes devices onto a stack and assigns names later when popping them off. BSD in this respect is much better: it assigns device names according to their physical place in the hardware.


259 Michael Shigorin January 16, 2012 at 9:22 pm

Valery, please note (as was noted to me before) that this page is explicitly for recalling troubles and not trying to market one’s crap over someone else’s crap, especially when one blames Linux instead of his own krivye ruki.

With FreeBSD in particular, you could end up with a kernel panic *just* by going over IIRC 7 vinum’s software RAID volumes back in the day, or by plugging a USB storage device into a router not so long ago.

Just in case, I did mess with SATA drives too by plugging “the next one” in a socket with the lower number (that’s pretty easy when one got 6 or 8 of them but quite feasible with even 4 or 2 when there’s no decent light source, you see). *BUT* I didn’t reverse the dd if/of so far due to a habit of “fdisk -l” and other double-checking before things like that.

Please stop spreading that lame misinformation on device numbering as the bus scan order can change with BIOS, kernel, or kernel option *BUT* you’re wrong again, uchite matchast’ (e.g. udev’s KERNEL variable description for a start).

One of the fundamental mistakes a *NIX admin can do is listen carelessly for some local “authority” who would “back” their words by being loud and proud, and not by being actually knowledgeable and reasonable (the test is “why?”). That’s pretty wide-spread in Russian-speaking *BSD circles, unfortunately.


260 Cody January 16, 2012 at 9:43 pm

Funny you’d say that Michael. I actually came to this thread for this same reason, but got caught in some other messages I wanted to reply to.

Indeed, there’s such a thing as labels.. uuid’s and so on. Using a different OS for not knowing how to fix a problem when its readily available knowledge seems silly to me. All OS’s have their own flaws and strengths. So does however each human.

But that said – I used to love bsd and scoff at linux. I was even called an elitist by some and frankly I do not blame them. My attitude combined with really extreme sarcasm, I did seem like one! I think one of the biggest things I learned (after years of blindness) is to use what works for you! As it happens I much prefer gnu libc to bsd’s much less useful libc (maybe improved by now?). And this is coming from someone who used to think, say, C++ is bloated (more specifically OOP). My reasoning or even my “defence” was that even Linus says it is. A silly and stupid logic there (more like no logic). I knew that much (it was more of well ‘if he sees it..’). However, as a friend said to me: you mean the person who created a very bloated OS? It’s true, Linux has a lot of stuff that a lot of people will never use. That doesn’t make it bad or useless.

Point was well taken and I now love OOP (it’s very beneficial, its more type safe [in C++ versus say C], and really it gave me something new to learn!).

In other words: Linux has flaws. BSD has flaws. Windows has flaws. MacOS has flaws.. NOTHING is perfect. Use what works for you.


261 Michael Shigorin January 16, 2012 at 10:11 pm

Yup, nothing is perfect under the moon.

Many of my mistakes could be less harmful to me — and others — if I learned it earlier that knowing the *weak* sides of what is available and avoiding them is way better than knowing the strong sides and just relying on them…

Thanks Vivek, it was a decent idea for a blog post to share the bumps earned and prompt us colleagues to do the same ;-)


262 Valeri Galtsev January 18, 2012 at 6:41 am

Wow, i didn’t mean to cause an explosion, sorry everybody… And compared to you, Michael, kernel expert, I’m just a humble sysadmin. I only put my dirty hands into kernel once, when we had 32 bit box with 8GB of ram: to get more than 1 GB for user data. I do use Linux a lot, as it’s something that just works for me. Way back I remember 3 year uptime of some of our Linux boxes. A couple of years back I started to seriously look at better alternatives, when Linux became more like windows: every 1.5 Months on average: kernel security update (==reboot)… Respectfully, – Humble Sysadmin.


263 Valeri Galtsev January 18, 2012 at 7:12 am

I don’t mind if moderator deletes my posts: I agree with Michael, my posts are just junk compared to elegant and instructive shell errors found here.

One thing I couldn’t buy as ultimately devastating though:
rm -rf /
– if I ever manage to do it as root on my *nix box I expect /bin, /boot, and part of /dev gone (and whatever else could be in / alphabetically before /dev on that box). Then the device hosting root filesystem will be deleted, and this will be end of my trouble. The rest: /home, /lib, /lib64, /sbin, /tmp, /usr, /var will stay intact. Other opinions?


264 Michael Shigorin February 7, 2012 at 5:33 pm

Just try that (not exactly that command but you’ll figure it out) in a virtual machine, then think of mmap’ed files, open handles, cached filesystem metadata.

On the bright side, on Linux at least one can salvage a (wrongly) deleted file at times by knowing it is still open by a running process, SIGSTOPping that process just in case, examining /proc/THAT_PID/fd/ symlinks and cat(1) the contents of the needed one, conveniently marked as “(deleted)”, into a safe place.
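The salvage trick can be tried safely on a scratch file. This is a Linux-only sketch that assumes /proc is mounted; here the shell itself stands in for the long-running process that still has the file open.

```shell
scratch=$(mktemp -d)
echo "precious data" > "$scratch/data.txt"

# Hold the file open on descriptor 3 of this shell (standing in for the
# process that still has the file open).
exec 3< "$scratch/data.txt"

rm "$scratch/data.txt"          # the name is gone, the inode is not

# The fd symlink still resolves. /proc/self works here because child commands
# inherit descriptor 3; for another process use /proc/<its PID>/fd/<N>.
link_target=$(readlink /proc/self/fd/3)
echo "$link_target"             # the old path, normally marked as deleted

# Copy the contents somewhere safe.
cat /proc/self/fd/3 > "$scratch/recovered.txt"
recovered=$(cat "$scratch/recovered.txt")
echo "$recovered"               # precious data

exec 3<&-                       # closing the last descriptor really frees the data
```

The copy has to happen before the holding process exits or closes the descriptor, which is why stopping that process first is a sensible precaution.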

I’m not a kernel guy either, only fixed iso9660 perms back in 1999 or so for localhost :) but I know freebsd FUD when I see it, and if you failed to find a distro that works for you (like I did in 2001), don’t blame “linux” for it — it’s just not professional in the first place. Ну, не стоит уши под лапшу подставлять и дальше её тиражировать.

Back to topic: on Linux it might be safer to check /proc/mounts and not `mount` in case one has to doublecheck: I once had a trouble with a recently cloned hard drive after a reboot, blasting the wrong one with dd(1) after having missed the *real* mounts state (don’t remember the details but LABEL and UUID won’t help to differentiate between bit-per-bit copies, obviously). There’s a tendency to have /etc/mtab just symlinked to /proc/mounts though.
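A rough sketch of that double-check as a shell function. `device_in_use` is a hypothetical helper and the device names in the usage lines are placeholders. It only does a prefix match on the source column of /proc/mounts, so it is deliberately coarse and knows nothing about swap, LVM, or RAID members.

```shell
# device_in_use is a hypothetical helper. It succeeds if the given device, or
# any mount source whose name starts with it (e.g. a partition), is listed
# in /proc/mounts.
device_in_use() {
    awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }' /proc/mounts
}

# Usage sketch (placeholder device names; the dd line is left commented out):
#   device_in_use /dev/sdb && { echo "/dev/sdb is mounted, not overwriting" >&2; exit 1; }
#   dd if=/dev/sda of=/dev/sdb bs=1M
```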


265 Cody May 8, 2014 at 1:18 pm

Re: “On the bright side, on Linux at least one can salvage a (wrongly) deleted file at times by knowing it is still open by a running process, SIGSTOPping that process just in case, examining /proc/THAT_PID/fd/ symlinks and cat(1) the contents of the needed one, conveniently marked as “(deleted)”, into a safe place.”

Actually, it isn’t necessarily Linux itself, at least the part about the files being “deleted” but still “existing”. That’s an inode thing. Indeed, more than one process can have a reference to the same file. That’s why deleting a file (eg with unlink(2), C system call) is not necessarily a complete deletion. See that man page for more details. Similar, moving (as in mv) a file will keep its inode while cp will use a new inode. This is handy when (example) you have in a Makefile (right before linking the objects into the binary) mv -f outputfile outputfile.bak or some such; if the outputfile is running, you won’t cause problems because you only changed the name (or put another way, with Linux’s procfs /proc/PID where PID is of course the pid of the program that is running, will contain the same information as before the mv).
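The inode point is easy to check in a scratch directory. In this sketch `inode_of` is a throwaway helper that reads the first column of `ls -i`, and the file names follow the Makefile example above:

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo "build output" > outputfile

# Print the inode number of a file (first column of 'ls -i').
inode_of() { ls -i "$1" | awk '{ print $1 }'; }

original=$(inode_of outputfile)

cp outputfile outputfile.copy        # new file, new inode
mv -f outputfile outputfile.bak      # same file, new name

copied=$(inode_of outputfile.copy)
moved=$(inode_of outputfile.bak)

echo "original=$original moved=$moved copied=$copied"
```

The moved file reports the original inode number while the copy gets a new one, provided the mv stays within one filesystem.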

As for salvaging files, again, see what I wrote about checking the man page for unlink (but again section 2!). Further, another way of checking (e.g., under Linux) for a file or any file in fact, that is open by a process but is deleted:

$ lsof | grep deleted
(of course as non root you’ll likely get permission errors but if so desired you can do it as root).
You’ll see files that are deleted ( and indeed it’ll show (deleted) ) with that command. With Linux’s procfs you’ll notice it under /proc/PID like you refer to. Haven’t used BSD or any other Unix in far too long to really remark on it except that inode is nothing specific to Linux (that or I’ve really forgotten some things…).

Yes, this was some what off topic but I think that since it was mentioned (deleted files and salvaging them) I would elaborate on why and _how_ that is possible and what is truly happening.


266 monkeyslayer56 May 13, 2010 at 5:47 pm

i accidently blocked windows from squid proxy… 99% of the computers on the net are windows…


267 Cody May 8, 2014 at 1:21 pm

Isn’t that a good thing? I mean not having to deal with Windows? I would think it good. Humour aside: I’m not sure 99% is correct, now or in 2010. But certainly – not counting servers – it the majority, and depending on your setup I could see that being a problem indeed.


268 Ruban May 14, 2010 at 9:58 pm

I usually work night shifts, was doing healthcheck as usual on my office servers. There was one time where i press command top to check the memory usage.

Saw that mysql process Cpu usage at 99%, i thought by killing the mysql process it will free up the resources and we can start back later.

Once i kill the pid, i start receiving alerts the server is down and all the transactions was pending at GUI.

Panicked and called my senior at 4am; luckily he taught me how to start the mysql process back up. ^_^

Learned from mistakes. ^_^


269 Michael Shigorin January 9, 2011 at 9:59 pm

Heh, fixing what ain’t broken (admin-wise), and then postponing the “how *exactly* can I start what I try and stop now”. It’s like speeding the crossroads, might just make it 90% of the time but die the tenth time…


270 iKay May 17, 2010 at 4:03 pm

Where to start, where to start!

Having a lot of servers from one provider, all with similar hostnames, and reinstalling the OS of the wrong server.

Rushed setting up a server; normally I setup alias for cp so that it runs cp -R. So I need to back up an SQL database. I go in to the MySQL data directory and run cp database_name /root/db_backup/database_name. I proceed to run killall mysqld and then rm -rf database_name.
Reboot the server and SQL comes up all is fine so killall mysqld and cp /home/db_backup/database_name /var/lib/mysql/. Bring back up MySQL and then try to hit the website. Realise that I have only copied the files and no directories, the database is incomplete and site is destroyed. No other backups.

Set up a script to automatically ban IP addresses after 5 failed SSH login attempts, with no timeout to remove banned addresses. I also blocked my server provider from getting into the box via SSH key. I changed the root password, then logged out of the box. Went to sleep and forgot what I had changed the password to, then after 5 attempts I was locked out. Gave my provider what I thought was my root password and after 5 attempts they were locked out at the NOC.


271 Michael Shigorin January 9, 2011 at 10:02 pm

sshutout at least has timeouts and a whitelist, albeit I managed to help a colleague lock himself out for something like 6 attempts at his password :)


272 Navneet May 17, 2010 at 4:43 pm

i want to run unix commands with multiple options…like
snmpwalk -Cpublic 192.168.1.1 and get the result on a webpage….how do i give the options..


273 JP May 18, 2010 at 6:58 pm

crontab -r instead of crontab -e
doh!!


274 crontab doh'er May 20, 2010 at 2:45 am

crontab -l before any crontab -e is a friend to all :)
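Beyond keeping a copy with crontab -l, one possible belt-and-braces measure is a shell function that shadows crontab and refuses a bare -r. This is just a sketch for a login profile, not a feature of cron itself, and it does not catch combined flags.

```shell
# Wrapper that shadows the real crontab in interactive shells.
crontab() {
    for arg in "$@"; do
        if [ "$arg" = "-r" ]; then
            echo "crontab -r blocked; run 'command crontab -r' if you really mean it" >&2
            return 1
        fi
    done
    command crontab "$@"
}

# Keeping a dated copy before editing (file name is only a suggestion):
#   crontab -l > "$HOME/crontab.$(date +%Y%m%d).bak"
```

Some cron implementations also offer an -i option that asks for confirmation before removing; the crontab(1) man page on your system will say whether yours does.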


275 Mike -r May 25, 2010 at 1:26 am

I now have this nickname at work. On deployment did a crontab -r on a critical system with no backup of the crontab easily accessible. The worst!!


276 Jeez Man May 20, 2010 at 2:54 pm

Jeez Man! Going about advertising your mistakes, especially where you rebooted the Oracle DB Box! I wouldnt employ you!


277 Exolon May 22, 2010 at 12:04 am

Then you wouldn’t find anybody to hire (or you’d hire a liar who claimed never to make booboos). Everyone makes mistakes. _Everyone_.

Probably the best one I made in the last while was removing a program that didn’t work on my Mac at home… I typed in “rm -rf /Applications/”, thought that I’d typed the initials of the program and hit tab to complete it, then hit enter. It didn’t match anything of course, so about 3 seconds passed while it deleted programs before I noticed and hit ctrl-c. It got as far as C*, so I was able to re-install the stuff I’d nuked :/


278 Juraj May 21, 2010 at 1:53 pm

Why anyone insists on using > and >> operators instead of proper text editor, is beyond me…. even if it would save some time (I doubt), it is not worth it.


279 iDale May 26, 2010 at 5:54 pm

Very much worth it from within a well tested script. I agree with you in spirit, though: using them manually is just asking for it!


280 riffraff June 17, 2010 at 8:48 pm

You must not have very many servers to look after.


281 Cody June 3, 2011 at 2:07 pm

Well you should learn it rather than fear it.

It’s not hard. It’s not elusive. It’s not difficult to learn. Editors have their uses. But if you’re also afraid of > or >> then I guess you’d be afraid of sed and awk or pipelines or any command that edits a file. Think of sed -i. One screwup and you could wipe or make useless, an important file or many files. Yet, I guarantee it’s not only FAR faster, it is worth it and among the best solution for many problems (multi file search and replace and similar). And as for > and >>, here’s another example of how knowing how it works, saved my hide.

Imagine this: mistake in or missing entirely /etc/fstab. In my case, I think the root filesystem entry was gone but this was years ago so I could be remembering wrong. I may have even made a mistake. Whatever, the important point is, I knew how to fix it. It was on a FreeBSD box is all I remember. Regardless: you don’t always have editors. One such reason: think of linked in libraries. Another: a not mounted file system (possibly this is the most relevant reason in the case I fixed).

So how did I solve this issue without having to reinstall (remember: no editor!) ? Simple:

cat > /etc/fstab <> for multiple lines.

Yes, I actually reconstructed /etc/fstab by knowing how to use the shell.

So that’s one reason people insist on using it: it is VERY useful and not really hard to keep straight. It reminds me of the quote that goes along the lines of ‘unix is user friendly, but it’s very particular about who it is friendly with’.


282 Cody June 3, 2011 at 2:10 pm

Of course, that was misinterpreted by the site.

Bleh. it should be two left arrows and then “EOF”.

E.g.:

cat > /etc/fstab << "EOF"

EOF

Hopefully that comes out properly this time.
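For completeness, a filled-in sketch of that here-document. The fstab lines are made-up examples, and it writes to a scratch file rather than the real /etc/fstab.

```shell
scratch=$(mktemp -d)

# Everything between the two EOF markers is written to the file. Quoting the
# first "EOF" turns off $variable and `command` expansion inside the body.
cat > "$scratch/fstab" << "EOF"
/dev/sda1  /      ext3  defaults  1 1
/dev/sda2  /home  ext3  defaults  1 2
EOF

cat "$scratch/fstab"
```

Using `>>` instead of `>` appends to an existing file rather than replacing it, which is the safer choice when only one entry is missing.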


283 AmbientAngels May 22, 2010 at 5:13 am

A few years ago when I was still relatively new to linux, I had my windows drive mounted as /windows/c on the system. I wanted to delete my wine windows directory. So I moved to the .wine directory and ran rm -r /windows. I never had the opportunity to abort as I had left the house after running that command.

To this day I’m still grateful I had gotten into the habit of backing up data on a regular basis.


284 me May 28, 2010 at 6:19 pm

I liked this one:

rm *#*

The intention was to remove all files containing ‘#’ sign – these were some temporary ones. Unfortunately, my shell interpreted ‘#’ as beginning of comment. Guess what happened :)


285 Bodsda June 2, 2010 at 6:29 am

Wow, thats about as bad as my

rm -rf * ~/junk

Never put that asterisk on the wrong side :)


286 iMadalin June 9, 2010 at 2:34 pm

can’t stop laughing :)) you guys are great :)


287 TheEye June 2, 2010 at 5:23 pm

instead of ‘crontab -e’ I mistakenly hit the ‘r’ instead of ‘e’, and removed the root cron completely

Whoever wrote that to allow a remove without verification, bad, very bad…


288 Dave July 22, 2010 at 10:10 am

Yes, I’ve done this too many times. R and E are so close on the keyboard and no are you sure??


289 Anonymous June 3, 2010 at 2:16 pm

rm -r /var/* instead of rm -r ./var/*


290 dudhead June 5, 2010 at 2:01 pm

When a s/w installation went wrong, I decided to delete the incomplete installation, which included a /bin directory, then merrily type y for all files when asked if I wanted to delete it….lost system /bin, couldn’t do anything. Got Ubuntu live, copied /bin from a sister machine, then manually created all symbolic links. Found I couldn’t use su anymore, then gave chmod u+s su (and other similar files). Recovered all!


291 Anonymous June 5, 2010 at 5:55 pm

You got way lucky on that one. Reminds me of the time I ssh’d into the mail server to work on an Inbox and didn’t exit out. At closing time I found an open terminal on my laptop and typed sudo /sbin/shutdown -h now….it wasn’t my laptop I was shutting down.


292 Chris M June 8, 2010 at 6:48 pm

On FreeBSD and new to rsync, set up rsyncd.conf with a new module section named svntrac to allow syncing from path /usr/local and including two subfolders (one for svn repositories and one for trac repositories). On client machine for backup as root ran rsync -avzr --delete server::svntrac /usr/local.

It backed up my svn and trac repos fine, but deleted everything else from /usr/local ! All installed programs gone just like that (including rsync!)


293 Mark Scholten June 9, 2010 at 9:43 am

I once used rm progname * instead of rm progname*.
Just one additional space in between the progname and the asterisk.
Unfortunately the system was used by two departments. Each department thought that the other one made backups of the system. So nobody did. Took me two weeks to rewrite the code. Since that day I always use rm -i progname* and check and double-check.

Years later, I accidentally put the wrong permissions on /etc/passwd. So no one could login into the system. Not even root. Luckily some IBM guy came along on the same day to repair a harddisk. He knew a back-door into the system and saved my day.

So far only 2 mistakes in 20 years of Unix. Not too bad.


294 Simon June 9, 2010 at 3:22 pm

I was once working on a mounted samba share when I discovered two directories which seemed to have identical content. Let’s call them foo and bar. To make sure I did a diff foo bar on the directories, returning no differences.

In order to free those 5GB of disk space, I continued to rm -rf bar when, after about 10 seconds, it struck me that I was working on a samba share and aborted the operation, luckily.

Next, I logged in using SSH and discovered that bar was only a symbolic link to foo, a fact that is hidden from the user when working on a mounted share :-/. Well… the 1.5GB that were already deleted could be recovered for the most part, but I certainly learned a lesson. Don’t trust samba shares ;-)


295 Russell June 9, 2010 at 10:01 pm

LOL!! I hope I never do one of those mistakes!!


296 atlgnt June 11, 2010 at 6:45 pm

One time I was copying some neat Unix commands, along with the process steps I had documented, from a journal I had in Lotus Notes. I was going to paste them into a text file as I moved to a new mail system… well, I pasted the info into the wrong window, as I had several windows open at the time. One of the commands was a shutdown command from some processing steps to prepare for work on a Unix server… so I shut down a production server. Amazing how fast it shut down… not so amazing how long it took to come back up. Lucky for alerts: I instantly got an email saying the system was down. I said, “what idiot shut down that box?”… oops, that idiot was me. Luckily most folks were gone for the day, so it was not as bad as it could have been.


297 Jason Barnett June 15, 2010 at 5:51 pm

chown -R jbarnett /

yeah… it didn’t end so well =/


298 oops June 17, 2010 at 4:56 pm

chmod -x /bin/chmod


299 Andy T June 17, 2010 at 7:42 pm

I love this one most of all because it made me think about the fastest way of recovering the situation: fire up the filesystem tools and toggle some bits? su and then cat the contents of /bin/chmod over another less important system executable? something much simpler I’ve missed? some reason a rebuild was unavoidable – please expand because I know it’ll be useful…


300 Sam Watkins June 21, 2010 at 12:33 am

to fix that, you could do something like this:

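# Copy any executable so the new file starts out with the execute bit set
cp -p /bin/sh ~/chmod
# Overwrite its contents with chmod; the existing file keeps its mode
cp /bin/chmod ~/chmod
# The working copy can now restore the bit on the original
~/chmod +x /bin/chmod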

It would be more difficult if you had done this: chmod -R -x /bin
In that case you might be able to rescue it with the current shell and something in /usr/bin, or else use a rescue disk such as knoppix, not sure…

Reply

301 Mike June 17, 2010 at 8:17 pm

My personal favorite that I did fairly recently was changing my login password, and forgetting to change my ecryptfs password before restarting… When I restarted, I could login with my new password, however when it would try to pass the new password on, it wouldn’t match the ecryptfs password, and it could not mount my /home directory… sucked!

I then made things worse by logging on to a different account, su-ing to my main account, and trying to change the ecryptfs passphrase to match the new password I picked…

I ended up getting caught in a loop of changing passwords and su-ing into parallel accounts… I eventually gave up on it and formatted/reinstalled…. SUCKED!

Reply

302 riffraff June 17, 2010 at 8:52 pm

Did the killall thing on an AIX box. Entered ‘killall’ just to get the syntax and switches. Next thing I know, my ssh connection is gone and people are calling about the server being down. Fun times.

Reply

303 kajienk June 18, 2010 at 11:56 am

My best error was following.
I reinstalled Linux and copied my home folder backup from an NTFS partition. While I was removing the annoying executable flag from all files with something like
chmod -x * -R
I ended up removing the executable flag in the whole of /bin :(

Especially /bin/bash is very troublesome :)) I couldn’t log back in, I couldn’t do anything. Installation again….

Reply

304 Eric June 21, 2010 at 11:24 am

I have done the “userdel” one. But I am wondering how to delete user home directory only while keeping their mail spool..

Reply

305 8bit July 6, 2010 at 3:15 pm

Done this a few times: thought I was logged onto a Red Hat box and ran “init 5”. Surprised to see the server start to shut down instead of transitioning to multi-user with graphical login. Realised it’s a Solaris box; runlevel 5 on Solaris is shutdown and power-off…. :|

Reply

306 Karin July 11, 2010 at 7:19 pm

I was moving some files and accidentally typed this;

mv / /home/….

Everything under the root directory was moved to the home directory. Yeah!

Reply

307 edward July 12, 2010 at 10:26 pm

I typed
chown -R USER:USER /* instead of chown -R USER:USERS ./*

changed the whole dang webserver to a single owner, that wasnt root.
All 87 domains. :(

Reply

308 dhx July 16, 2010 at 10:06 pm

I have done a lot of “the classics” but my best was :

being in /backup
meaning to write rm -rf www/
i wrote rm -rf /www
guess what, it was nice..
but the worst part was a second later I saw my mistake, and wrote the right command deleting the backup _ALSO_
just then I realized what have I done…

Reply

309 Dave July 21, 2010 at 12:11 am

“chown -R root:root /” instead of “chown -R root:root .” == reinstall of whole system and a night without sleep :-)

Reply

310 Dave July 21, 2010 at 12:22 am

Also did “cat db_backup.sql > mysql” instead of “cat db_backup.sql | mysql” as root and destroyed the server binary requiring a complete rebuild from source

Reply

311 giany July 22, 2010 at 7:37 am

I told a client to do a : chmod 600 /root/.ssh/authorized_keys2 and he did :

chmod 600 / root / .ssh / authorized_keys2

Reply

312 Douglas July 22, 2010 at 8:20 pm

I had a years worth of MySQL backups using the XML format, unfortunately i had failed to read the FULL manual and therefore did not know that while MySQL would in fact write to an XML file, it could not read or import an XML file for the version on the server.
Several panicked hours later, I had a working setup of the latest version (from the website, not pretty), and had managed to import the XML files just to re-export them in a format readable by the version we had on the production server.
Lesson learned: always read the full manual before trying new or “better” features. We all thought the XML format was great, more portable, etc…
Oh, and the newest version (as of May 2010) could not import the files directly, they were too large and in the wrong format, I had to write a Perl script to do everything in chunks.

Reply

313 Elton July 29, 2010 at 4:28 pm

My keyboard is under my desk, in a sort of a drawer, and the monitor is fixed on the wall, so I have free space to work on my desk.

I was trying to solve some problems in my laptop with help from people at freenode. When I had to do
[code]
su

[/code]
on the laptop, I typed on the wrong keyboard and my root password (which was the same for the desktop and the laptop, for the last time ever) went directly to the channel.

Reply

314 Jared July 30, 2010 at 4:54 pm

Long ago I learned to always lift my finger completely from the shift key immediately after typing the ‘*’ character.

Here’s what I meant to type: rm -f *.txt

Here’s what happens if you type the ‘.’ without ensuring you have completely lifted your other finger from the shift key:

rm -f *>txt

You get a directory with only one file in it, called ‘txt’, with 0 bytes.

That was a painful (never to be repeated) lesson.

Reply

315 Andy July 30, 2010 at 10:46 pm

Thanks god, the earth is still running…

Reply

316 Bruce K. August 4, 2010 at 5:52 pm

This is really pitiful … such terrible work habits and carelessness. Certainly no one is perfect, but all one has to do is be deliberate and look carefully at what they type instead of getting in a mad rush that will set you back hours or days.

One thing I have done is to create a super-prompt on root account:

>>> [515] user@machine 2010-08-03 13:09:30 [515]
>>> /home/user
>>> $

The “>>>” is not part of the prompt of course. This tells you exactly where you are, what command you are in, what time you did your last command.

Next … when you want to know what files you are going to operate on such as “*”. instead of doing the command “rm *” … do “echo *” to see the file list. Then backtrack in history and edit the command.
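A minimal sketch of that "look before you delete" habit, wrapped in a function. The name confirm_rm is made up for this example and is not a standard command; it only shows what the shell expanded the pattern to and waits for an explicit "y" before calling rm.

```shell
# Sketch: list what a pattern matched, then delete only after confirmation.
# confirm_rm is a hypothetical helper name, not a standard command.
confirm_rm() {
    # The shell has already expanded the glob; "$@" holds the matched names.
    printf 'Would remove %d item(s):\n' "$#"
    printf '  %s\n' "$@"
    printf 'Proceed? [y/N] '
    read -r answer
    case "$answer" in
        y|Y) rm -f -- "$@" ;;
        *)   echo "Aborted, nothing removed." ;;
    esac
}
```

Typical use would be confirm_rm *.txt in place of rm -f *.txt. Anything other than "y" leaves the files alone.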

One thing that has bitten me before is getting too fast on history editing … especially in multiple windows. If you use the same history file for every login you run the risk of thinking you are repeating a command when you may be repeating a command that was typed in another window and added to the history file. The way around that is to create separate history files for every login in your init scripts. You must manually delete the old history files at some point, but sometimes it is useful to be able to search for a command you used in the past at some point.

All the rest of it is question of being here now … not going so fast you do not have time to think and see what you are doing. Computers are unforgiving.

Reply

317 Marcus August 6, 2010 at 3:57 pm

While root I typed:
kill 1
instead of:
kill %1

The screen went very blank, and then there was a lot of yelling from users on other terminals…

Reply

318 Jack August 12, 2010 at 5:11 pm

Not a Unix mistake, but I’ve deleted the wrong LUN from the storage side, the one holding the Oracle redo log. It scared the crap out of me. Luckily the DBA managed to rebuild the redo logs.

Reply

319 Evan Richardson August 14, 2010 at 7:10 am

Not a unix command, but thought I’d share mine. I was RDC’d into a remote box, and after having updated some info on its host dhcp server, I wanted the machine to pull down new changes. Instead of typing ipconfig /renew, I typed “ipconfig /release”

needless to say, I had to go to the remote location and type ipconfig /renew myself =(

Reply

320 Jimboooo August 15, 2010 at 9:29 pm

Removing some core files from a host…
# find / -local -type f -name 'core*' -exec rm -f {} \;
#

Phone rings – DBA complaining about his database not working. “Which database?”, I ask. “Core”, he replies.

Reply

321 Mr Z August 20, 2010 at 2:12 pm

This one has had me laughing for days now. I’ve been pretty careful for many years now. Worst I’ve managed was somewhat trivial:

Pulled the cover from a running Sun E-250 to see why a fan alarm was going. The interlock switch helped me to get down time to change the fans.

When I first started using Ubuntu, playing with it a bit I decided that I didn’t want to learn python, so removed it. If you’re not sure why that’s bad, try it :-)

Reply

322 Cody May 9, 2014 at 5:01 pm

I’m not sure why I never responded to this, but I have to admit this is the funniest of them all, and I have thought this the first time I saw it. It is absolutely hysterical. Of course for me it is more so as being a programmer and also being really good at debugging (with or without debugger – indeed both) I know why you would want core files removed (besides the fact it holds the memory and call stack of the program at time of creation, which includes potential security risks, there is that issue of size …).

Thanks for sharing that one. It is a pure classic.

Reply

323 Anonymous for safety August 19, 2010 at 4:23 pm

I managed to get physically assaulted (kidding, it was just a pretty hard slap on the back of the head), joked about for 6 months, and given a to-this-day epic dressing down from the boss in front of all my coworkers for a very simple and idiotic mistake.

Had an SSH terminal for our DNS / DHCP server, which also doubled as a mailserver.
I opened another tab inside the terminal window, to another server I was gonna install (it was supposed to be the new mail server). As a joke, I called a coworker nearby, and said “watch this!”. To his horror, I proceeded to rm -rf / in front of him, apparently on the production server.
(you can see where this is going)
He went white. I laughed and said “heheh gotcha. that was the old server. its now all ready to reinstall!”. He goes, “nooo, it was the production server, are you MAD?” – while turning from white to red. After a couple of “no it wasnt; yes it was”, I looked back at the screen.

Sure enough.. I typed the rm -rf into the wrong tab.

Icing on the cake? The work order was to install a backup system on the dns/dhcp server, and migrate the mailserver to a new machine. Obviously there were no backups.
Punishment: having all my colleagues leave early for the day, with instructions to me of “you will get out of here when everything is back up!”. Was a loooong weekend.

According to the boss, I wasn’t fired on the spot only because “well, at least we know YOU will never ever do another rm -rf without thinking twice or thrice…”.

Reply

324 brux August 19, 2010 at 9:53 pm

231 Anonymous for safety ….

Sorry to pick on you, and don’t mean to, but this is one really big source of mistakes on computers, people getting emotionally involved. Like playing around … you were thinking more about other people and making a joke than what you were doing, so you were and will be bound to always make mistakes like this. Like playing around and not realizing what window the focus is in, or that your ssh has time out and you are now back on the original machine, or whatever.

With computers you have to think very carefully about what you are doing, and then look at it, and even test it if you can before you run a command that does something complicated, or any kind of “write” operation. You also have to think about the ease of recovering the data should something go wrong. Before we did upgrades on users’ machines we used to just naturally assume users were lying about having local data on their machines, so we would do a backup image to an admin server just in case. That saved a ton of data from people who sometimes did not really understand the difference, or were not totally thinking.

I used to get kidded about my seriousness and the fact that when we did common operations, upgrades, etc, I would look at a command line and ask everyone there if it was OK. I got neverending shit about that, until a few times when people did not do that we lost customer data.

We were doing an upgrade once and one hot shot admin was ready to hit the return key to start it. I asked him if there were backups and he said he did not know, but nothing was going to go wrong. I told him to make sure and back up the machine, and of course, you know what happened, because he got his emotions, his arrogance involved in it and could not stand to be questioned.

When I am hiring I try to look as best I can for this trait, because it is the number one problem with a good admin … that and just plain crookedness or dishonesty. Why are there so many anti-social sys-admins? ;-)

Reply

325 Marcus August 20, 2010 at 2:01 pm

No offense, Brux, but isn’t this for people sharing their own mistakes? He clearly knows he screwed up and how, or he wouldn’t be here. We already know you’re good because you’re on a site that’s about improving your craft, but do you have a mistake of your own you can share for our education and entertainment? Mine is at #228, fyi.

Reply

326 brux August 19, 2010 at 9:57 pm

by the way … i really love the individual icons that this site creates for its commentors … can someone please, please, please, email me and tell me what that is … it’s really cool and I’d like to use something like that myself …. please!!!! very very cool!

Reply

327 Shehzad August 20, 2010 at 6:14 am

BIG mount blunder:
Very old existing mount was:
/mnt/mountpoint1
/mnt/mountpoint2

After long days, once I was experimenting with SSDs and mistakenly mounted it on /mnt
A script scheduled in cron using mountpoint1 and mountpoint2 broke and took me many hours to fix!!!

Reply

328 Sam Watkins August 23, 2010 at 1:21 am

One time while I was somewhat mentally ill (no excuse really!) I was using my father’s computer with Mac OS X. I had created some files at various places on the system, and I was thinking I want to remove all the files I created. So I thought, well I can just run rm -rf /, that should delete just the files which I have permission to delete, and fail for others.

Not such a good idea, since he had an old filesystem mounted 777 without proper owner or permissions! Fortunately I did stop the rm process before it trashed everything. Luckily he did have backups of those items early in the alphabet, and didn’t lose anything important.

Reply

329 Meteor Jenkins August 24, 2010 at 7:03 am

I had booted a fully working Windows XP box with my rescue USB stick, to show off to my friend, showing him various command-line stuff.

Then – as a regular user – I typed:
$ dd if=/dev/urandom of=/dev/sda

Telling him that regular users don’t have permission to alter the hard drive directly, I hit return.

Uh-oh. No permission denied message. Hitting Ctrl+C, I realise my friend has just lost his partition table and Windows installation. Luckily, the data partition was separate and recoverable by TestDisk.

I had added my user to the ‘disk’ groups months ago without realising it.

Reply

330 glued August 24, 2010 at 7:08 am

Not as awesome as your stories, but paraphrasing my own coding mistake:

[i_am_doomed]
$ while read LINES
> epic_fail_command_omgwtfbbq_why_did_you_put_me_here_stoopid_glued
> do
> …….
> alerting_script ${LINES}
> done < ${file}

*goes to lunch*

Hilarity, 100,000 alert emails, tech calls, and a low yearly appraisal came after.

Reply

331 Sam Watkins August 24, 2010 at 7:16 am

I love this one :)

Reply

332 Wayne August 26, 2010 at 5:06 am

Edited /etc/hosts.deny
ALL: ALL
Edited /etc/hosts.qllow (made a typo)
ALL: 10.1.
Logged off system to test, couldn’t get back on.

Reply

333 digitalsushi August 28, 2010 at 1:30 am

function halt (){
echo “really?”
read ans
if [ “ans” = “yes” ]
then /sbin/halt;
fi
}

Reply

334 CP September 4, 2010 at 9:53 pm

Poignant post, very sincere, lots of classics here!

To add, I wanted to deliberately delete an old 486-based debian install by typing rm -rf /etc (and so on) while the system was running, and I was surprised by how resilient it was and kept itself alive.

Maybe it’s better to avoid the mistake than committing and having to learn from it.

Reply

335 Wolf Halton September 5, 2010 at 5:18 am

I think this thread should be required reading for all students taking a linux administration class. These are classic blunders. Thanks for the OP and thanks for all the comments.

Reply

336 Elton Carvalho September 5, 2010 at 3:26 pm

Just made this one. Not exactly a command-line mistake, but will do.

Got a new hard drive and moved the contents of the / and /home partitions from the old one to the new one. In order to boot from it, I should clone the MBR, so I wouldn’t need to go through the hassle of setting up grub. And there I went:

dd if=/dev/sdb of=/dev/sda bs=512 count=1

Note the “512” instead of “446”. Those 66 extra bytes held the old disk’s partition table, and it was written over the new disk’s. Of course I hadn’t backed up what I was about to rewrite. Now I’m doing all the copying again and adding “get a LiveCD which supports ext4” to my to-do list.
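A rough sketch of the safer variant, assuming the goal is to copy only the boot code and to keep a copy of the sector about to be overwritten. copy_bootcode is a hypothetical helper name; on a real system the arguments would be whole-disk devices and the function would run as root, so it is worth trying on disk image files first.

```shell
# Sketch: copy only the boot code (first 446 bytes) and keep a backup of the
# sector being overwritten. copy_bootcode is a hypothetical helper name.
copy_bootcode() {
    src=$1 dst=$2 backup=$3
    # Save the destination's full first sector (boot code + partition table).
    dd if="$dst" of="$backup" bs=512 count=1 2>/dev/null || return 1
    # Overwrite bytes 0-445 only. conv=notrunc matters when dst is a regular
    # file (for example a disk image), which dd would otherwise truncate.
    dd if="$src" of="$dst" bs=446 count=1 conv=notrunc 2>/dev/null
}
```

With the backup in hand, a wrong sector can be put back with dd if=backup-file of=disk bs=512 count=1.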

Reply

337 bart September 6, 2010 at 8:09 pm

This one was fun:
sudo chmod -R 777 /
instead of
sudo chmod -R 777 ./

..

Reply

338 bob September 8, 2010 at 6:26 pm

First n00b error, which ended up uncovering a weird bug in RHEL:
Typed reboot on the wrong server
As soon as I realized the error, I typed
– init 5
Runlevel showed
6 5
This locked the computer totally. Couldn’t reboot it anymore, couldn’t use it etc. Deadlock. I had to physically reboot the machine.

Second error was a better one. I was working offsite and decided to do some modifications on /etc/passwd using sed. So I made a backup:
mv /etc/passwd{,.old}
Bad idea: nobody could log in anymore, root included, on the machine. We had to boot single user and restore the file for it to work.

Reply

339 nikin September 14, 2010 at 9:37 am

Yes. I have done some of these too :D

Nowadays I have backups of the more important things: offsite every week, and offline every now and then. For work material, like source files and databases, on a daily basis.
It did save my ass a couple of times :)

Reply

340 Dab September 14, 2010 at 2:21 pm

Once I copied some text from a technical deployment website to paste into a Word document. Then I worked on something else and (I thought) I copied something new. When I pasted into my unix session, everything from the deployment page was pasted there… every line was an invalid UNIX command except one….. \rm ~

Reply

341 Ameth September 15, 2010 at 10:31 am

The only system-crashing thing I have done was to install busybox in what was supposed to become a initramfs, but forgetting to chroot. Nothing of value was lost, though.

Also, I once took a backup of an MBR on the same disk it belonged to, before wiping the old one (I don’t clearly remember why, but I think it had something to do with dual booting windows). Not being one to panic, I opened my own MBR in a hex-editor, found some common patterns, booted up the wasted computer with a live CD and made a small C program that searched through the disk for something matching a boot sector, finding a single one some 80 gigabytes out. The whole room applauded (… I wish).

Reply

342 Nihal September 18, 2010 at 7:14 am

instead of umount /foo/bar i did rm -fr /foo/bar to “remove” the drive :D

Reply

343 Dallas September 21, 2010 at 3:05 am

Hi, I’m adding my horror stories too.
in konqueror, delayed mouse activity.. managed to move /usr/lib somewhere (was logged on as root for some particular reason). reinstall:(

manual rpm upgrade of glibc rpm. In the days prior to yum/smart etc. issue was The update was i386 file, not i686 as installed. caused the system to fuad! used recovery disk :)

Reply

344 erik September 23, 2010 at 8:28 am

AIX root login by default has / as its home directory. so:
1. logged as root
2. cp -pr /tmp/root_home /root
3. cd
4. rm -rf *
duh! now first thing i do when doing a fresh AIX install is to create /root and make it root’s home directory (via post install script)
lots of classic above. fun but painful when it hits you. thanks!

Reply

345 jfc September 23, 2010 at 11:53 am

i had setup a chroot to test something and mounted proc under this directory. after testing, i did a rm -rf testdir. wondering, why the rm took so long, i saw the mounted proc :( i had to restore the /home directory from backups

Reply

346 naikta September 29, 2010 at 3:50 am

Once upon time in my earlier day’s in unix I was working on HP-UX box .

I wanted to reboot the sap application server but halted it.

Fired–> #reboot -hy 0 instead of #reboot -ry 0

No console, no datacenter person available in weekend to hard-boot.
Extended 1 hrs activity to 10 hrs.

Reply

347 Sparcrypt September 29, 2010 at 11:36 pm

Cleared all routes from the machine while accessing it from another subnet. Oops.

Good lesson in NEVER typing any command in live unless you know exactly what it does.

Reply

348 Linus October 8, 2010 at 11:44 am

I wanted to delete ~/etc
I was in the Homedir and Typed
rm -rf /etc
Instead of
rm -rf etc
Luckily it was just a Test-Server

Reply

349 Ashish October 11, 2010 at 7:48 am

Guys,

Kindly be very careful when you work as a root user. My recent mistake on linux box was as below. I wanted to check the reboot history for the system, and the blunder that i made while issuing the command is:

The correct command is:

last reboot

The command i entered:

last | reboot

Hence, the system got rebooted. It was a production system, and was a big escalation. Be cautious while issuing commands when working as a root user.

Reply

350 Taiko October 15, 2010 at 10:42 am

Wow, this thread seems to demonstrate admirably that Unix interfaces are indeed cryptic and dangerous.

Reply

351 Michael Shigorin January 9, 2011 at 11:30 pm

It’s habits, lack of experience, ignorance, showing off that hurt. Interfaces are important but second to all of that (reboot could check if stdin is a tty, but then again an ancient one wouldn’t and one might be less careful expecting it to be somewhat fingerproof).
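A small sketch of that idea, assuming a wrapper function rather than a patched reboot binary. tty_only is an invented name, and the alias in the comment is only one possible way to wire it in.

```shell
# Sketch: run a dangerous command only when stdin is a terminal.
# tty_only is a hypothetical wrapper name, not part of any distribution.
tty_only() {
    if [ -t 0 ]; then
        "$@"
    else
        echo "refusing to run '$1': stdin is not a terminal" >&2
        return 1
    fi
}
# Possible use in root's interactive shell startup file (an assumption):
#   alias reboot='tty_only /sbin/reboot'
```

With such an alias in place, "last | reboot" would hand reboot a pipe on stdin and the wrapper would refuse, while a plain "reboot" typed at the prompt would still go through.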

Reply

352 Marcel S Henselin October 16, 2010 at 9:38 am

Hi folks

I set up a linux machine without monitor to react on ctrl-alt-del to poweroff
/etc/inittab:
ca::ctrlaltdel:/sbin/shutdown -h -t 4 now

unfortunately I had TWO keybords for TWO servers and no KVM switch – guess what –
I wanted to log on to a Windows 2000 Server – pressed ctrl-alt-del and heard the beep from the linux machine – just while compiling a newly set up kernel.

since then I use KVM switches most of the time :)

Marcel

Reply

353 Christos October 20, 2010 at 11:23 pm

Locked out due to firewall reconfiguration. Done that. Quite some times.
I never did like firewalls and firewalls don’t like me either.

rm -fr to the wrong path. Not really, but there was a case on an old Netra machine where the filesystem was corrupted, and when I issued ls in /tmp I could see the whole / filesystem underneath it. “Just some crazy inode mix-up resulting in ghost entries,” I thought. Luckily there was a recent backup around :)

And the most embarrassing mistake thus far: I once was in a telco just before it went online, and one of the Sun clusters there had some strange network issues.
“The problem lies in the ARP cache,” one network guy suggests. After clearing some ARP entries to no avail, I decide to clear everything from the ARP cache!
There goes the cluster interconnect, and I’m experiencing my first cluster split brain. One node immediately panics, while the other one is... well, not at its best.
I rebooted the whole cluster. Fortunately it only took about 30 minutes and resolved that network issue, which to this day remains a mystery.

Reply

354 JohnK October 21, 2010 at 7:30 pm

At 2am typed init 0 instead of init 6 – doing scheduled upgrade including kernel. No one in the building had access to the server room. Had to drive 90 minutes into work to simply push the power button…

In my first sys admin job I did not quite grasp the concept of a dumb terminal…I wanted to reset the terminal so I typed reboot while logged in as root…The terminal did not reset but the server did…

Reply

355 khh October 22, 2010 at 10:04 pm

My worst one was while I was working on the build/compile system of a program. The ./bin/ directory was filled with files, but I wanted to see what files the build system itself copied there. But instead of
rm -rf ./bin/*
I typed
rm -rf /bin/*
And as my luck would have it, I’d been doing some work requiring the root earlier and forgotten to exit. Had to reinstall the operating system, but at least I was able to backup all files and settings.

Another good one was su -c “passwd” and pasting the password. I don’t know what I pasted, but it sure wasn’t the password I had in mind.
Other than that I’ve done the “sudo ifconfig eth0 down” and the reboot on the wrong box. Once I wanted to play a prank on my friend and sent him an “:(){ :|:& };:”. I shouldn’t have done that; it ended up causing him a lot of problems.

Reply

356 danO October 28, 2010 at 3:46 am

thanx for jinxing me. did a rm -rf tragedy today on project files 2 days after reading this post.

Reply

357 gormux November 2, 2010 at 10:49 am

Once, I wanted to reset my FreeBSD system to a clean state.
Usually you just have to
rm -rf /usr/local

Unfortunately, this time, I did a
rm -rf /usr/lib

Damn tab-completion…

Reply

358 Pawel November 9, 2010 at 10:44 pm

Few years ago:
UPDATE email_accounts SET vacation = 'some text';
instead of:
UPDATE email_accounts SET vacation = 'some text' WHERE id = 'some_id';

About 3000 accounts were changed. Fortunately, we had backups.

Reply

359 Andy T November 14, 2010 at 7:31 pm

Aaaaaaaaaaaaaaaaaaarrrrgh!
Just hosed my main dev instance’s /usr/include dir

There’s a very good reason this was not a stupid mistake, but when I told it to my dog, he got up and went into another room so it clearly is not very convincing…

OK, it could have been worse – but since it should be recoverable, I thought I’d post the fix:

I remember thinking it was odd that with such a long list of mistakes, no-one had ever posted the quickest fix for their problem – I may be about to find out…

OK, so I’ve never put anything in /usr/include manually (I’d use /usr/local/include for that) so rpm or yum should be able to rescue me (dpkg or apt for Debian).

I’m assuming that only packages with ‘-devel’ in the name will deposit files in /usr/include. (-dev for debian)

I’m in a rush so I:
rpm -qa | grep devel | cut -d '-' -f 1,2,3

then mess around until I have a list of names ending in -devel which yum will accept. I then create a new xen instance of the same linux flavor. Making sure my hosed xen dev instance and the new xen instance are updated to today’s release, I issue ‘yum install (package)’ for each of the identified packages on the new ‘box’.

I then cross my fingers and scp everything from the new instance’s /usr/include to the hosed box.

I’m going to do this tomorrow, not now.

If you see another post, that means it didn’t work…

Reply

360 Hollygirl November 19, 2010 at 2:07 am

On a Solaris Box running production cluster Oracle databases – trying to get date, time and other misc output

(after typing many misc commands in a row and not paying attention anymore…)
typed:
> hostname -a

opps – changed hostname to ‘-a’ followed shortly by P1 Oracle database crash.

Reply

361 Hollygirl November 19, 2010 at 2:11 am

A very common mishap *(one of my power-users did this – took about 3 hours of my time to fix):

ftp the /etc/passwd file from a Unix box to a Windows box so it can be edited.
Edit the /etc/passwd file in Notepad or WordPad (which like to add line breaks and other formatting). Save it and ftp the file back to the Unix box.
Voilà: no one can log into the box and no one can su to root.

There is no way to fix this but bring down the box, go into maintenance mode and edit the /etc/passwd file manually (remove all the blank end of line characters from each line of the file).
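For the clean-up step, a minimal sketch that strips the carriage returns with tr instead of editing by hand, assuming the damage is limited to DOS line endings. strip_cr and the file names in the comments are examples only; the point is to work on a copy and inspect the result before putting it in place.

```shell
# Sketch: strip DOS carriage returns into a new file, never in place.
# strip_cr is a hypothetical helper name; the paths below are examples only.
strip_cr() {
    # tr deletes every CR byte; the input file is left untouched.
    tr -d '\r' < "$1" > "$2"
}
# Example, from maintenance mode:
#   cp -p /etc/passwd /etc/passwd.broken
#   strip_cr /etc/passwd.broken /etc/passwd.fixed
#   then review /etc/passwd.fixed before moving it into place
```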

Reply

362 pepoluan November 24, 2010 at 11:20 am

Hahaha… wonderfully instructing “OMG” moments there, thanks for sharing :)

Hmmm… let’s see, what mistakes have I made…
Locking myself out from an SSH session … check
rm -rf * while on /etc … check
> instead of >> … check
rm /etc/sv1.* . instead of cp /etc/sv1.* . … check

(The last line was me copying some custom config scripts; I usually prefix them with the server name so I can find them quickly. The last line has the added bonus of removing files I’m still working on in my ~ . *sigh*)

Nowadays, I always install my Ubuntu inside XenServer, and religiously create snapshots before thinking of doing drastic things :)
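For the "> instead of >>" slip in the list above, one possible guard is the shell's noclobber option. The sketch below uses a temporary file as a stand-in for a real config file; the variable name conf is just for illustration.

```shell
# Sketch: make '>' refuse to overwrite existing files in this shell.
set -o noclobber                     # same as 'set -C'

conf=$(mktemp)                       # stand-in for a real config file
echo "original zone" >| "$conf"      # '>|' deliberately overrides noclobber

# A plain '>' onto an existing file now fails instead of truncating it.
if { echo "new zone" > "$conf"; } 2>/dev/null; then
    echo "overwritten"
else
    echo "overwrite refused, file intact"
fi

echo "new zone" >> "$conf"           # appending still works
```

The option only affects the shell it is set in, so it would have to go into the startup file of whichever account does the risky editing.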

Reply

363 Derrick November 29, 2010 at 4:55 am

My first major n00b blunder:

Wanting to move the files and folders of a blog up one directory, I typed:

mv /* ../

Notice the missing period, should have been:

mv ./* ../

VPS backup on production server: $5/month.
Lesson learned: priceless

Reply

364 DB Man December 1, 2010 at 3:52 pm

I was working at a company which had a home made ticket solution, the database used mysql4 (MyISAM, AUTOCOMMIT=1) and a flaw in the design, a column which just included integers was created as varchar(10) and I was to update a phone number for a customer in the database:

UPDATE table SET phone='1234567' WHERE ticketid = 123;

This resulted in that all customers got the same phone number, it wasn’t trivial to get the phone numbers back, we managed to get most of them from a backup and the rest from Apache logs.

Reply

365 Ali December 11, 2010 at 6:29 pm

I was working on a linux box, and besides other things, just wanted to review my scheduled tasks; typed crontab -r (instead of -e) and all my tasks wiped out.

Reply

366 Derek December 11, 2010 at 11:42 pm

crontab -r is the worst :( no warning. I’ve made it a habit to always type crontab -l before ever typing crontab anything else :)
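Going one step further than just listing it, a sketch that saves the output of crontab -l to a dated file first. backup_crontab and the CRONTAB_CMD variable are names invented for this example (the variable only exists so the function can be tried against a stand-in command).

```shell
# Sketch: save the current crontab to a dated file before touching it.
# backup_crontab and CRONTAB_CMD are hypothetical names for this example.
backup_crontab() {
    dest=${1:-$HOME/crontab.$(date +%Y%m%d-%H%M%S).bak}
    # 'crontab -l' prints the current table; keep the copy only if it worked.
    if "${CRONTAB_CMD:-crontab}" -l > "$dest"; then
        echo "saved to $dest"
    else
        rm -f -- "$dest"
        echo "no crontab to save" >&2
        return 1
    fi
}
```

If a stray crontab -r does happen afterwards, running crontab on the saved file installs it again.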

Reply

367 Karl December 14, 2010 at 3:23 pm

I issued a “shutdown -h” on a remote system when I meant to use “shutdown -r”
It’s a bit embarrassing to have to phone your hosting provider to turn it back on :P

Reply

368 Ian G December 16, 2010 at 12:15 am

On my first day as an admin, I deleted emacs. Fortunately my co-worker was able to restore it.

Another time, I mistyped a sql command, and reset all user’s scores on a production machine. Luckily, we had a backup… but that didn’t prevent users from complaining that they had lost points!

Reply

369 Brian December 16, 2010 at 3:55 pm

I went to backup passwd group and shadow because we were setting up a new server and needed to copy over the users and when i ran the tar command i did this
tar -cf /etc/passwd /etc/group /etc/shadow user_backup.tar

luckily passwd had been backed up into passwd-, I phoned the datacenter and had to get somebody to boot to single user mode and replace the file….

Reply

370 aussierob December 21, 2010 at 2:20 am

I keep several clients’ data on my testing server, testing data in /aaadata.
I keep customer1 data in /cu1data and customer2 in /cu2data etc.
so its easy to mv /aaadata /cu1data and mv /cu2data /aaadata
when I want to swap test sets of data.

One day I confused being remotely logged onto customer1 live server one day…
did the mv /aaadata /cu1data
and then the mv /cu2data /aaadata which (thankfully) did not work.

Luckily I quickly figured out where I was,
did a mv /cu1data /aaadata to put it all back,
kicked everyone off, rebooted and got away with it !
Close one !
(I still df -v before these moves – a little paranoia is good for you )

Reply

371 Qasim December 25, 2010 at 5:56 pm

… Scary!!!

Reply

372 Alex J Avriette January 7, 2011 at 9:20 pm

A lot of your mistakes are pretty tame compared to some I’ve seen in the wild and some I’ve committed myself.

When I was really young (this was my first Unix machine, a Sparc 2; I was maybe fifteen or sixteen…) I didn’t understand Unix permissions and was very frustrated by them. I figured that the easiest way to make sure I could access everything I needed to was to say

chmod -R a+rwx /

I was root, on what had to be Solaris 2.4 or 2.5, and let the command run for a while before thinking, hm, maybe this is a bad idea. The original “logic” was, “well, since I’m the only person using this machine, why shouldn’t I have permissions to read and write everything?” — I completely failed to understand what the execute bit did, for starters.

Permissions were so hosed there was no option left but to reinstall the machine. Learned that lesson but good.

Reply

373 Michael Shigorin January 9, 2011 at 5:16 pm

Using *the* simplest backup command one knows *is* key, especially when already tired (tiredness is the antagonist of being responsible).

I’ve hosed a production backup server’s disks (what an irony) quite recently due to:
– we installed on a single HDD of two, as they were still awaiting a hardware RAID controller
– I did mess half a year ago trying to “at least make a backup” (dd’ing sda to sdb)
– I did pay attention to replacing UUID-based mounts by device-based
– I did look into /etc/fstab and `mount` before proceeding with dd (all clear)
– I did *not* look into /proc/mounts (where it was a rootfs UUID-mounted from sdb)
– I did *not* perform a most basic off-host backup I easily could, with rsync

On the positive side, the tapes weren’t harmed by that dd (20 gigs in when it finally struck me), there was a somewhat older snapshot of /etc/bacula, and while ls would already segfault (I had to use echo *), some other tools still worked off the damaged rootfs, so some more parts were salvaged. But it was a reinstall and a waste of time, even if it didn’t hit users heavily.

And yes, I did off-host and off-site backups after reinstalling on software RAID. A working/commuted iKVM helped immensely too.

A habit of running a primitive local snapshot after considerable changes still holds:
# tar zcf /root/BAK/etc-`hostname -s`-`date +%Y%m%d`.tar.gz /etc

Reply

374 Michael Shigorin January 9, 2011 at 5:32 pm

> dd if=/dev/urandom of=/dev/withManyData count=1024 bs=1024
> i forget it….i the last command i haven’t way to rollback only reinstalling
Not exactly: DO NOT REBOOT while you still have the kernel which mounted the filesystems while all information to do that was available.

STEP AWAY from console, have some tea, sit down and try to calm yourself.

Then estimate the consequences of losing the info and if it’s valuable, consider the possibilities to save it.

Second HDD is OK if it’s already mounted, otherwise consider networked backup (rsync/scp). USB HDD/flash might not work already if you had the luck of damaging kernel module file contents which would be needed to use them.

Then back up /etc, /home, /var or whatever might still be readable.

Then have some more tea — unless the downtime is really pressing on you or you have 99% solid backups. Maybe you’ll remember some more stuff.

Only then say goodbye to that filesystem and reboot, afterwards only salvaging tools might be able to help (testdisk is easy-to-use but limited, gpart is a PITA but did help me to recover partitions after installing a distro on workstation disk not the test one; photorec reportedly helps with salvaging files from damaged filesystem).

Reply

375 Michael Shigorin January 9, 2011 at 10:53 pm

Aside from firewall/sudo lockouts, chown -R smth .* (back when it did go from /home/smth to /home, which was a minor disaster), and aforementioned “dd backup server down” case (BTW I did ask a colleague to advise/witness me then took a cup of tea then mailed those concerned in half an hour when downtime and reinstall was considered inevitable), I could recall these…

One of my first Linux systems (a RHL5.x, libc5) was hosed by very much wanting to upgrade xmms-0.7 to xmms-0.9 from a “pirate” “Red Hat Linux 6.01” (actually 5.9, a 6.0 beta) CD where it was linked against glibc2. Well, rpm did try to stop me. And I thought I could cheat my way around it. After rpm -Uvh --force glibc*rpm finished what it was ordered to do, even ls would blow up. It would be only years later that I would know how to recover from that situation (like, boot the 5.x installer (there were no livecds yet, it seems), mount that filesystem, copy glibc rpms there, rpm2cpio | cpio -id at least the /lib/libc.so and VIP friends, and then try booting off that root to rpm -Uvh --oldpackage that glibc*rpm; or keep rpm-static at hand in the first place). Well, at least glibc2 is actually nice regarding backwards compatibility.

Several years later there was a Very Important Dump sitting in /tmp with stmpclean set up to shoot down month-old crap there; the next day was a moment to remember. My fault for placing it there in the first place but had also to talk with folks who set it up so in packages for the distribution installed.

On the same job, we once had to take development environment into production real fast — and chose to continue running backend on an office server where it was already deployed. Then we had tough time moving it to production servers at colocation, and then summer came with increased power consumption for downtown’s conditioning systems. One day we were running on all the office UPSes with two-minute downtimes to change them (no second PSU to juggle cables, and we didn’t chain up UPSes for the reason I don’t remember already)… the decision to move production services to production environment became a bit more obvious ;-)

On a community-built LUG server I was relying on a donated DAC960 controller to handle a separate SCSI HDD holding the root filesystem for an FTP server for some time — until it began to sneeze on a virtually new 18G drive and fire a single physical volume from its single logical volume… since virtual environments (linux-vserver back in 2004, openvz by now) lived on ATA/SATA drives (there was a mix back then), I would end up with perfectly working but unmanageable server until that half-a-meter board was retired to a stand. I thought it’d be cool to run IDE+SATA+SCSI, in fact it wasn’t — and “coolness” is a bad factor to account for.

Didn’t reboot/halt the wrong system so far — a friend of mine did, and he told us the short story to remember (thanks Nick!). Didn’t get caught with crontab -e/-r so far either, as well as prepositioning / when it wasn’t meant to. Well, lots of wisdom read, thanks.

Bonus tracks:

A colleague was experimenting with FreeBSD 4.x softraid (vinum) on a production server with some 4 HDDs and found it the hard way that after hitting some pretty low limit of volumes the kernel just froze up. He didn’t expect that in all honesty but we were down for that night.

Another anecdote was of a junior who decided to test how hardware RAID5 works and pulled a drive from a production server, then pushed it back and pulled another one. The very same moment the array was ruined: the poor guy apparently didn’t understand that it takes time to rebuild an already degraded array after pushing the first drive back, and that he was unnecessarily risking a double fault even if he hadn’t pulled the second drive (if one of the remaining spindles were close to failing, it might not bear the added load of the rebuild and go down, bringing the whole array with it, again).

PS: yes, I did notice the difference in stories where there was a backup (“phew!”) and when there wasn’t. Taking care to review and test backups, *especially* after reworking the storage scheme (e.g. Bacula won’t descend into mounted filesystems by default, and moving data to a separate filesystem might also prevent it from being backed up), is really worth the trouble. Backups are sysadmin’s children: he tends for them, then they tend for him…;

Reply

376 James Alvarez January 12, 2011 at 5:50 pm

When managing multiple SSH consoles, my recurring mistake was to type into the wrong one, which is completely insane. :)

Reply

377 /activate brainless actions January 17, 2011 at 11:03 pm

Well, I have used Ubuntu for a few years, and mine (among many) was that I was trying to add a repository for my anti-virus and instead added the repository for the whole Debian volatile project. It was a huge update, and then nothing when I restarted.

Reply

378 W8Hosting January 19, 2011 at 1:13 pm

Command on wrong server: done

I wanted to rebuild my backup server, which at that time wasn’t fully working. I didn’t realize that I was logged in on the main cPanel server. Result: Angry customers and some downtime.

Reply

379 Ulric Eriksson January 24, 2011 at 10:19 pm

My most remarkable command line mistake happened sometime in the eighties. I didn’t know there was this thing called “Usenet”. Trying to delete some file, I accidentally typed rn instead of rm. That mistake cost me countless hours over the next ten years or so.

Reply

380 AlexUnix March 4, 2011 at 2:31 pm

Nice way of praising the Internet! :))
rm is not even so bad after all! :P

Reply

381 Aura January 26, 2011 at 4:37 pm

I was going to wipe a little USB-stick and did dd if=/dev/urandom of=/dev/sde on the production server. Oops!

Lesson learned: Use very obvious, colorful, different prompts for each system. At the time I had the same .bashrc on multiple systems.

Another mistake I made was to add “exit 0” in some function in my .bashrc, when I really wanted a “return 0”. It took me some time to realize why I was instantly kicked off every time I logged in via SSH. Haha!
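
The difference matters because .bashrc is sourced into the login shell rather than run as a separate process, so an exit inside it ends the session itself. A minimal sketch of the safe form, assuming a hypothetical helper called check_tool (the name and its purpose are made up for illustration):

```shell
# Hypothetical helper of the kind one might keep in ~/.bashrc.
check_tool() {
    # 'return' leaves only this function. 'exit' here would end the whole
    # shell, because .bashrc is sourced, not executed as a child process.
    command -v "$1" >/dev/null 2>&1 || return 1
    return 0
}
```

Called as `check_tool rsync || echo "rsync missing"`, it reports the problem and leaves the session alive.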

Reply

382 Aaron Samuel February 9, 2011 at 8:41 am

On Debian

rainofkayos@karma ~ [4398/255] % hostname -f
karma.lan

On Solaris:

0@solaris-rain:~[root@solaris-rain /]# hostname -f
0@solaris-rain:~[root@solaris-rain /]# hostname
-f

Result?

It wasn’t bad service-wise: it was a weblogic instance running in a massive cluster of about 10 instances. The members just dropped it from the cluster when the hostname changed to -f. I noticed immediately, lol, naturally because hostname returned nothing. It was a quick puzzled feeling, then one of the “oh crap” feelings =). Luckily that client (no names) had the shell prompt configured with the hostname, and I could easily see what I needed to change it back to by looking at the previous commands I had run.

Embarrassment-wise, super. About 20 minutes after I had “cleaned up the issue”, I thought it just looked like a blip and no one had noticed. Which in general was true: the client hadn’t noticed it at all. However, a senior of mine and somewhat of a mentor walked over to me, rolled over an office chair, sat down, looked left, looked right, ensured we weren’t on attention lane, and said in a strangely nice yet taunting way, “so… (long pause) don’t always run commands as root… (another long pause) … and don’t do hostname -f on solaris.” He then got up and walked off back to his office.

LMAO,,, _fail_…

What did i learn?

Of course one would think the point is: don’t run hostname -f on solaris. But what I actually learned from this, and over the years, is: don’t abuse root or super user access. Take advantage of proper permission usage.

Cheers~!
Aaron.

Reply

383 jimmy jam February 9, 2011 at 4:21 pm

My best mistake while learning Linux at work was accidentally deleting a config file in sites-enabled using winscp! Straight away the client’s website was unavailable and I had to put my thinking cap to good use. So I managed to copy the contents of another file and reconfigure. It was a tough learning stage; the power of root should never be underestimated!

Reply

384 Tof February 9, 2011 at 6:15 pm

Hi all,
So funny ! My vote goes to:
– typing in the wrong console (I rebooted the main oracle DB production server once … all the operators were coming to tell me there was a problem with the DB … oops).

My second favorite: the rm -fr /stuff * (with the space before *) …
Even being careful after the first time I did it about 8 years ago … I managed to do it 3 more times … in a production environment! Sure, you’re happy to have some backups then!

Reply

385 Dan Sichel February 10, 2011 at 12:01 am

Keep config files in CVS. Nice. Simple and useful as H*LL now and then. Thank you. Chuckled at the DNS zone file overwrite. What a difference a greater-than sign makes!

Reply

386 Nimda February 11, 2011 at 12:04 pm

The reboot command on solaris has no time option; any text that follows is interpreted as the kernel to load after reboot. So without knowing it…

My mistake on a Solaris box, don’t ask me why, but I typed…:
#reboot now

Error message on bootloader:
“Cannot boot now”

oh well …. :)

Reply

387 Muppet February 17, 2011 at 3:33 pm

Shut down the ssh daemon instead of restarting it to apply a config change… remotely.

Reply

388 giggity February 18, 2011 at 3:37 pm

Got into work one morning..
Logged into the main production server (at another site), and found a new directory:

/aaaaaaaaaaaarghhh_dont_delete_me

Called the sysadmin there asking about it, and had the reply:

“Oh, you found it then!”

Apparently, he’d been removing a user from the system the previous evening and did a:

rm -rf / home/username (yes, with a space between / and home)
Fortunately, since we were using amanda, there was a /amanda directory full of backup related files, which gave him a few seconds of Ctrl-C time :-)

Hence the new /aaaaaaaaaaaarghhh_dont_delete_me directory, full of small, random files, just to provide a few more seconds of Ctrl-C time should anyone repeat the command!

Oh, I also did a ‘rm -rf /’ (knowingly!) on an old SunOS4 server that was being shipped offsite. We were rather disappointed to find that it thrashed its disks for half an hour before just returning to the ‘#’ prompt. I think the only useful thing we could do with it afterwards was ‘echo *’ :-)

Reply

389 Anders February 23, 2011 at 10:57 am

I once did rm -rf .*/ instead of rm -rf ./*

Luckily I was doing this as a non-privileged user, and only lost some of my (backed up)
personal files.

Reply

390 Dennis February 24, 2011 at 1:49 pm

Played with some commands a long time ago on one of my test systems.

# cd / ; chattr +R *

forgot all about it and then finally rebooted the system a few weeks (months?) later…
…BIG MISTAKE…….

Reply

391 Deltaray February 24, 2011 at 7:01 pm

Dennis, what does chattr +R do? R is not one of the attributes listed in the man page.

Reply

392 cybernijntje February 26, 2011 at 1:46 pm

My bad!

Correct syntax is: chattr +i -R *

*sigh*

Reply

393 BBHoss February 27, 2011 at 5:57 pm

CVS is horrible for configuration. Use git instead. Also for your QT PROJECT!!!

Reply

394 George February 27, 2011 at 9:34 pm

AFAIR, the “Unix Haters Handbook” has one from alt.risks where the user had typed:
rm -rf *>o (fat fingering the SHIFT key and getting > instead of .)

Instead of deleting *.o, everything is deleted and you get a zero length file called “o”.

Reply

395 Graham February 28, 2011 at 8:19 am

My main mistake was being super user and issuing rm -r * from the root directory, instead of a temp directory I wanted to get rid of. Unix does not run very well with no files at all.

Reply

396 pepoluan February 28, 2011 at 8:56 am

That just happened to me last week! Granted, it was not really my fault ;-)

At the staging area (where we prepare boxes) we have no KVM switch. Due to a stack of new hard disks, the keyboards need to be placed one in front of another (instead of right in front of each monitor). I had earlier moved the Windows box’s keyboard to the front. Without my knowledge, while I was out my colleague switched the keyboards to do some Linux stuff. I came back and tried to log on to Windows. A yell from behind me confirmed that I’d done something terrible…

… luckily it was still being installed, so nothing important was lost. The inconvenience of having to reinstall was promptly forgotten when I treated her to a nice dinner ;-)

Reply

397 Tom May 2, 2011 at 10:59 pm

A KVM switch is a dangerous thing. I had connected 2 x Linux and 2 x Windows to one KVM. The screen was blank.
I just hit CTRL+ALT+DEL because I wanted to log in to a Windows machine…
… which caused an immediate reboot of the main Linux router. Fortunately, it took only 2 minutes to boot up.

Reply

398 Rick February 28, 2011 at 1:20 pm

If there is any reason to worry about what I’m doing I test commands like rm by substituting ls first.

Reason to worry? You’re root, that’s reason enough to be extra careful.
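
The ls-first habit can be sketched roughly as below, assuming a scratch directory with made-up file names so that nothing real is touched:

```shell
# Sketch: preview what a glob expands to before handing it to rm.
scratch=$(mktemp -d)
touch "$scratch/a.o" "$scratch/b.o" "$scratch/keep.c"

# Step 1: same arguments, harmless command. Read the list it prints.
ls -d "$scratch"/*.o

# Step 2: only after the list looks right, rerun with rm.
# The -- stops rm from treating an odd file name as an option.
rm -- "$scratch"/*.o
```

Recalling the previous command from history and swapping only the command name keeps the arguments identical between the preview and the real run.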

Reply

399 Mike March 7, 2011 at 11:36 pm

Forgetting sudo on the first run has kept me from doing some really stupid stuff, like removing my /etc folder or similar. Damn was I glad to see that “Permission denied” after rereading the command and noticing my (almost) fatal mistake.

Reply

400 z March 9, 2011 at 7:14 pm

# crle -l /appdir/lib

This effectively replaces the system library path configuration with the application library path, rather than appending to it. (Solaris)

Care to guess how many system utilities this mucks up? This sent me looking for a miniroot boot image as the errant change survived even after hitting the big red switch.

Reply

401 shola benjamin March 14, 2011 at 11:37 pm

How can I get a Unix operating system for my PC? Please try to email me.

i really appreciate this forum

regards

Reply

402 Christos March 15, 2011 at 6:43 pm

I’d recommend Solaris 10 for x86 (or x64)
Get it here: http://download.oracle.com/otn/solaris/10/sol-10-u9-ga-x86-dvd-iso.zip

Reply

403 Cody March 22, 2011 at 1:05 pm

Excellent!

This reminds me of when I told a friend a way to auto-log out on login (many ways but this would be more obscure). He then told someone who was “annoying” him to try it on his shell. End result was this person was furious. Quite so. And although I don’t find it so funny now (keyword not as – I still think it’s amusing), I found it hilarious then (hey, was young and obnoxious as can be!).

The command, for what it’s worth :

echo “PS1=`kill -9 0`” >> ~/.bash_profile

Yes, that’s setting the prompt to run the command : kill -9 0 upon sourcing of ~/.bash_profile which means kill that shell. Bad idea!

I don’t even remember what inspired me to think of that command as this was years and years ago. However, it does bring up an important point :

Word to the wise : if you do not know what a command does, don’t run it! Amazing how many fail that one…

Reply

404 Cody March 22, 2011 at 1:13 pm

Hmm. Note to self : be careful which one you reply to (noscript might have had a play in this). To the op – if you would delete this, I’d appreciate it as i put it in the right section now (and delete this too, obviously).

Reply

405 PrometheeFeu March 24, 2011 at 3:54 am

Here is my favorite:

I was tasked with compiling a list of certain functions in our source code. I was slowly refining the list removing stuff. Then, I typed:

$ grep "pattern" list_file > list_file

You might know it means it was all gone.
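
The reason is that the shell opens and truncates list_file for the redirect before grep ever starts reading it. A minimal sketch of one way around that, assuming a hypothetical helper named filter_in_place that goes through a temporary file:

```shell
# Sketch: filter a file "in place" without truncating it first.
# filter_in_place is a hypothetical helper name, not a standard tool.
filter_in_place() {
    pattern=$1
    file=$2
    tmp=$(mktemp) || return 1
    # grep exits 1 when nothing matches; only codes above 1 are real errors.
    grep -- "$pattern" "$file" > "$tmp"
    status=$?
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return "$status"
    fi
    # Overwrite the original only after grep has finished reading it.
    # cat-into-file keeps the original's owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

Usage would be `filter_in_place "pattern" list_file`. The copy-back step is not atomic, so a version-controlled or backed-up list is still the safer starting point.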

Reply

406 Cody May 8, 2014 at 1:32 pm

I love that one too. But I wonder, if you’re looking for a certain list, would it not be OK to use something like (I don’t know what language you were using but point is the same) cscope or some such ? Of course, I actually use grep a lot when programming so I could see that being why you did it this way (though for me it isn’t so much compiling a list of, but where. In that case cscope might be more useful but only if I have one shell open and don’t want to rely on, say, multi-tabbed vim or screen. Still, it’s similar to compiling a list and so I’m not suggesting anything and it is besides the point).
On this mistake: it of course can happen with other commands. E.g., try sed without (-i) on a file and directing to the file, to (try) to update the file in place. Same with other commands. It makes sense though. Either way, the result is you might not be too happy, indeed, unless you have a backup. Of course, if it is source tree, one might have more luck at least if it’s a project of much worth (and they haven’t made many working copy revisions), since revision control. But yes, agreed: this mistake is quite fun. I think I’ve done it but not on a file that was important and never one that I couldn’t restore. Still, those who are afraid of > and >> (Hopefully that came out OK) – using them or otherwise – are basically afraid of learning and in the end will not be as efficient as they could be.

Reply

407 Nate March 24, 2011 at 9:32 pm

my developers love doing this as root:

cd (non-existent path)
chown -R root:root
chmod -R 777

instead of

chown -R root:root (non-existent path) [so it fails]
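
A rough sketch of how the cd form can be made to fail just as loudly, assuming the recursive command is meant to act on the current directory (.) and using a hypothetical helper named fix_perms with a deliberately mild chmod for illustration:

```shell
# Sketch: stop if cd fails, so later commands never run in the wrong directory.
# fix_perms is a hypothetical helper; the path passed in is a placeholder.
fix_perms() {
    # The subshell keeps the caller's working directory unchanged, and &&
    # makes sure chmod runs only when cd actually succeeded.
    ( cd -- "$1" && chmod -R u+rwX . )
}
```

With a non-existent path the cd fails, the chmod is skipped, and the function returns a non-zero status instead of touching whatever directory the shell happened to be in.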

Reply

408 Cody May 8, 2014 at 1:39 pm

(Yes, I’m going through some posts today out of boredom and I’m going to offer some mistakes I recently made, after that, so for those subscribed to this post, I am sorry for the several messages).

I don’t see how the commands would do any harm, unless there are some versions of chown and chmod that don’t require a file when specifying recursion (as -R). chmod -R 400 or whatever permissions (and same without -R) by itself, without any file (or files), will just give an error. Is there actually a version that is that naive, as to assume you mean the current working directory (or was this a typo on your part, perhaps? Maybe you meant ‘.’ at the end of the command)? (One can hope that if there is, it is very careful about the parent directory…)

Reply

409 kalyani March 29, 2011 at 12:26 am

Sir, I am Kalyan.
I have a problem: I am creating folders in the home directory, but they are not visible in the Unix terminal when I enter
the commands cd /home
ls
No files are shown. What’s the problem….

Reply

410 Aaron March 29, 2011 at 4:57 pm

who owns the folders and who is trying to view them?

Don’t create everything as root, and if you do, understand that you may need to change ownership or change permissions on the files.

/home on linux is the location of $USER home directory. It should be owned by the user that needs to use/view/modify it, actually IT should be the home directory of the user.

I.E.

user name aaron

/home/aaron

ownerships by default, at least on a debian system, would be

aaron:aaron

however lets say you make aaron under home and the perms are

root:root

aaron will not be able to do anything with this folder when he is logged in, furthermore, he will get errors on login saying he has no home.

Reply

411 Michael Shigorin March 29, 2011 at 6:03 pm

Hey, now that was a tip to get into this list with yet another story ;-)

/home holds homedirs, but usually isn’t any user’s homedir itself. Tampering with its permissions might feel very wrong with other users.

2 kalyani: do “cd; ls” — plain “cd” is equivalent to “cd ~” or “cd $HOME”, and this isn’t a place to ask (or give) advice as I was already kindly pointed at — head over to linuxquestions.org or whatever forum suits you better. See also e.g. http://www.tuxfiles.org/linuxhelp/cli.html (googled up as “linux command prompt howto”).

PS: anti-offtopic: my latest colocation visit (after IP-KVM session but no remote boot media available) was due to hda->sda issue (2.6.18 would still use legacy IDE driver for an IDE CF root device, while 2.6.32 went libata); I did it “the smart way” and replaced “boot=/dev/hda” value with whatever UUID “blkid /dev/sda1” returned earlier while modifying /etc/fstab for old-or-new kernel setup. The thing is, one doesn’t really want a bootloader in _partition_ when MBR isn’t prepared to boot off that partition. And I managed to get that MBR LILO to L 99 99 state when the backout entry wouldn’t be of any use…

Reply

412 cayey April 3, 2011 at 7:01 am

Is it possible that “shutdown immediate” in oracle runs automatically?

Reply

413 aaron April 4, 2011 at 8:46 pm

Don’t believe so, other than via some script external to oracle. You can also check the logs, as oracle would log shutdowns.

Reply

414 Paul-Willy Jean April 6, 2011 at 8:22 pm

Best one yet: I had set up a firewall with ipfire at the computer club of my CEGEP and found out that my dhcp server would end its leases every 2 hours. I then decided to set the max lease time to 0, thinking that it would remove any limit. It didn’t take long to find out that it was handing out new IP addresses to the computers on the local network every second, rendering it inaccessible. I had to go to the machine directly to change it.

Reply

415 m47730 April 7, 2011 at 4:20 pm

One of my worst mistakes:

I wanted to delete all hidden directories…

rm -fr .*

but “..” and “.” are counted in the glob!

argh!
m47730
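
A sketch of glob patterns that pick up hidden entries without ever matching "." or "..", assuming a POSIX-style shell and a scratch directory with made-up names. It only echoes what it would remove:

```shell
# Sketch: match hidden entries without ever matching "." or "..".
scratch=$(mktemp -d)
mkdir "$scratch/.cache" "$scratch/..odd" "$scratch/visible"
cd "$scratch" || exit 1

# .[!.]*  = a dot, then one character that is not a dot, then anything
# ..?*    = two dots followed by at least one more character
# Neither pattern can expand to "." or "..".
for entry in .[!.]* ..?*; do
    # An unmatched pattern stays literal, so skip names that do not exist.
    [ -e "$entry" ] || continue
    echo "would remove: $entry"
done
```

Swapping the echo for rm -r is the last step, and only after the printed list looks right.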

Reply

416 Tom April 13, 2011 at 10:44 pm

The worst thing I ever did caused a disaster.
There was one single SATA drive in the server for a few users, which I thought was a professional SCSI drive (same name: sda, 1st mistake). It held Samba with roaming profiles on a reiserfs partition.
One day a user asked whether it was possible to recover a deleted file. The answer was: maybe.
I found a command to rebuild the reiserfs file structure, which can help recover data, something like this:
reiserfsck --rebuild-tree -S -l /root/recovery.log /dev/hda3
where /dev/hda3 was /home. I unmounted /home and ran it. Unfortunately, there was a BAD SECTOR on the drive (I didn’t know that), which interrupted the command.
All the data was gone, including all the roaming profile data from the workstations the next day. No recovery possible. No backup.

Remember: just tell them it’s not possible… and pay attention.

Reply

417 Sam April 14, 2011 at 7:01 am

The only thing you did wrong was NOT HAVING A BACKUP!!!!!!!!! There are hundreds of ways your data can be lost, and only one way to prevent it: backup.

Reply

418 Michael Shigorin April 15, 2011 at 3:48 am

Backup and RAID are orthogonal: one will lose data added after the last backup if the only disk (or degraded RAID) goes down or a filesystem gets seriously corrupted.

BTW another similar fault of mine was trying to mess with a damaged filesystem on its native drive (which developed read troubles) and not on a _copy of_ a copy of that block device… that sort of extra duplication very much pays off when you badly wish to get two minutes back in time when data was still relatively close at hand.

Reply

419 Michael Shigorin April 14, 2011 at 7:41 am

Heh. I stumbled onto a somewhat similar problem last year: we had the “master” git directory on mirrored “enterprise class” SATA drives holding reiserfs (one colleague insisted on that) and backed up nightly with Bacula; I moved it onto mirrored SAS drives for performance reasons (and ext4 for sanity reasons). Of course that means a separate filesystem.

And it wasn’t until another colleague asked me to restore some file from backup that I’ve realised that git backups effectively stopped the day we did the move (something like a month ago back then).

Bacula honestly warned in reports that it didn’t descend in a mounted filesystem, and docs stated clearly that one should explicitly either specify such filesystems, or tell it to cross mount points (and care for /proc and friends on his own).

Mind you, it was proper hardware, mirrored disks, scheduled downtime, an extra copy taken just in case — but yet another gotcha and only myself to blame.

Lesson learned: don’t only back up, do verify what is extractable. Especially after storage-related changes.
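
As a generic illustration of "verify what is extractable" (plain tar on scratch directories, not Bacula, and all paths here are made up), a minimal sketch might look like this:

```shell
# Sketch: make an archive, then prove it can be listed and extracted.
src=$(mktemp -d)
restore=$(mktemp -d)
archive=$(mktemp)
printf 'important\n' > "$src/data.txt"

tar -czf "$archive" -C "$src" .

# Step 1: the archive must at least list cleanly.
tar -tzf "$archive" > /dev/null || exit 1

# Step 2: restore into a scratch directory and compare with the source.
tar -xzf "$archive" -C "$restore"
diff -r "$src" "$restore" && echo "backup verified"
```

The comparison step is the part that catches a backup job which runs happily every night while silently skipping the data that matters.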

Reply

420 Matou April 14, 2011 at 3:51 pm

Quite a long time ago, I discovered the magic of /etc/inittab and its “default” line. I changed it to ‘s’ for “single user mode by default”, did what I had to do, then reverted to the “normal” mode by typing “m” for “multi-user mode” and rebooted the server. Since no line matched this strange code ‘m’, it simply started NO tty. Fortunately, I had a similar server 10 miles away where I could make a bootable tape…

Reply

421 Ken April 21, 2011 at 4:43 pm

I once ran a Windows server box, didn’t make any mistakes, but Windows took care of corrupting all data anyways. Restored from backup and running websites on a Redhat box now.

Reply

422 subhajit April 22, 2011 at 6:01 am

How can I configure a local DNS server on my home PC, where I use Ubuntu 10.10?
Can anyone give a brief explanation of naming a server like (127.0.0.3)?
Why do we use this kind of naming system? What are the benefits, and why?
Tell me how I can configure an IP address, and how I can create a LAN connection, with technical terms and explanation.
One more question: is an understanding of the .NET language essential for any of the above? Also, which books did you learn this from, and how did you learn it? I want to learn the whole procedure.

Reply

423 Michael Shigorin April 22, 2011 at 1:11 pm

As you see, this blog post is about mistakes made; yours is not searching for your own question *before* asking it of anyone who might have already answered it many times :-) http://lmgtfy.com/?q=how+can+i+configure+a+local+dns+server

I make this one more often than I’d consider ok, so bonus hint is “DNS HOWTO”.

Re .NET, I’m happy to _not_ have made a mistake starting with something but LISP. One might enjoy SICP book, see http://mitpress.mit.edu/sicp/ (it also happens to be one of top google results for “sicp”, incidentally).

Reply

424 Skydev April 28, 2011 at 7:22 pm

After going from Windows to Linux, I typed

pkill /?

on a Solaris box trying to get help.
Ended up with a system reboot.

Reply

425 Lord_Pinhead May 2, 2011 at 10:10 am

Hehe, nice ones :D

My best kill of a server was:
./reset_postfix >> /dev/null

After destroying /dev/null, I had to reboot the server into a rescue system and create a new /dev/null with mknod:

mknod /dev/null c 1 3
chmod 666 /dev/null

Reply

426 Iain Kay May 2, 2011 at 4:54 pm

Rofl at this one that’s an absolute classic! Destroying /dev/null :P

Reply

427 Cody June 3, 2011 at 1:53 pm

As is already noted or referred to: writing to /dev/null writes nothing to nothing (essentially that’s the end result). But even then, not only would > imply the file would exist after (even if empty – which /dev/null is anyway), >> would just append instead of overwrite. But this means the file exists. And hey, look at this:

# mknod -m666 /dev/null c 1 3
mknod: `/dev/null': File exists

Now, if you were to actually rm the device, that’s another story.
And in that case, you could just use mknod to set mode. Also, if you have selinux enabled you’ll need to do more or else you’ll get denials.

Nice try Pinhead….

Reply

428 Tom May 2, 2011 at 10:35 pm

A few years ago, a friend of mine accidentally did another funny thing.
He connected one port of a Netgear 24-port industry-entry switch to another port on the same switch.
As a result, this caused a switching loop, with packets going round and round in the same switch and rendering all network devices inoperable within minutes (!)
This can also happen when a “smart” switch uses LACP and 2 or more cables to connect to another switch, it has no function like “IEEE 802.1D Spanning Tree Protocol (STP): provides redundant links while preventing network loops”, you forgot to save the settings, and the switch has been restarted… so always remember to click the damn SAVE before you leave work!

Reply

429 Kosta Cavic May 27, 2011 at 7:29 am

Tom,

The same thing happened to me a few years ago, when a tech guy from a telco connected both ends of a cable to the same switch within my client’s VDSL network, causing a so-called “broadcast storm”. His explanation was that he intended to make the VDSL modems work consistently well by forcing them into broadcast to speed up the VDSL network based on analog telephone lines :). When we put a firewall device in front of the VDSL network, the firewall was rendered unusable within a minute, detecting some kind of DoS attack and bringing down its ethernet ports, both LAN and WAN.
It took us a day of investigation to find the segment (switch) that was causing the problems.

I know that this post is about Unix/Linux mistakes, my sincere apologies, but it is useful to know what kind of weird problems can prevent a network from operating normally.

Kosta

Reply

430 Kosta Cavic May 26, 2011 at 1:14 pm

After an entire night spent at a client’s site migrating an Oracle DB from a W2K3 box to a RHEL 5.5 box (with Jboss on it), around 06:00 I just wanted to delete some symlink and ended up doing something like this: “rm -rf /etc/* /some_sym_link” (notice the blank space). I wasn’t sure what I wanted to do with this command because I was too tired.
It was a production Jboss server with no backup, and it was supposed to run Oracle too. Luckily I still had the W2K3 box to continue working with no downtime. Within an hour everything was OK on the W2K3 box, but I still needed to reinstall the Linux box with Oracle and Jboss.
It happened 3 months ago.

Regards,

Kosta

Reply

431 Kosta Cavic May 27, 2011 at 7:46 am

Guys, do not force a Unix or Linux box to do S.M.A.R.T. checks unless you are 100% sure what you are doing. A few years ago I forced a Unix or perhaps Linux box, I can’t remember which, to do that and it rendered the system disk unusable within two weeks. There is a conf file which can be used to configure this option, but the option should also be enabled in the BIOS.
I forgot to do that and ended up with a problem :).

Regards,

Kosta

Reply

432 Josh May 30, 2011 at 8:13 am

damn redirects have gotten me more than once

# echo 'ServerName 127.0.0.1' > /etc/apache2/apache.conf
# echo 'UUID=xxxxxxxx /data/disk1 ntfs-3g defaults 0 0' > /etc/fstab

I learned my lesson and stopped being lazy, just open the file and edit…
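
One possible guard against this kind of slip is the shell's noclobber option, which makes a plain > refuse to overwrite an existing file. The sketch below works in a scratch directory; the file name and contents are placeholders, not the real configuration files from the comment above.

```shell
# Sketch: make '>' refuse to truncate an existing file.
# Everything happens in a scratch directory, so nothing real is touched.
demo_dir=$(mktemp -d)
conf="$demo_dir/apache.conf"          # placeholder file name
printf 'original line\n' > "$conf"

set -o noclobber                      # same as: set -C

# With noclobber on, this redirect fails instead of emptying the file.
if ( echo 'ServerName 127.0.0.1' > "$conf" ) 2>/dev/null; then
    clobbered=yes
else
    clobbered=no
fi

# Appending with '>>' is still allowed.
echo 'ServerName 127.0.0.1' >> "$conf"

set +o noclobber                      # restore the default behaviour
cat "$conf"
```

With noclobber set, >| still forces an overwrite when that is really what is wanted.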

Reply

433 Tom May 30, 2011 at 11:16 pm

I’m here again. Just learned something new.
Be careful with the UUIDs of your volumes and LVM2 snapshots.
It recently happened to me that both the original partition and its snapshot were mounted at the same location!!
The /etc/fstab entry was:
UUID=123456….. /var/lib/libvirt ext4 defaults ….
but there was also an entry in /etc/rc.local:
mount -t ext4 UUID=123456….. /var/lib/libvirt
This should just mount the same partition twice. But that is not what happened.
The snapshot was mounted first, I think. Luckily, the LAST mounted partition was the original /var/lib/libvirt, and then libvirtd started. For a long time I wondered whether I was going to lose data after deleting the snapshot… there was no file size increase, because the RAW-format images were already fully allocated… but I checked the modification dates and times of the files after remounting both to separate temporary folders.
This happened on Ubuntu Server 10.x in a test env.
By the way, does anyone know why reading/writing a snapshotted LVM volume is REALLY slow?

Reply

434 Shahina Shaik June 3, 2011 at 11:32 am

I was checking whether it is possible to recover files deleted with the “rm -rf *” command.
I was in my home folder, created a test folder, and meant to create two test files inside it. (Actually, after creating the test folder I forgot to change directory, so I was still in my home folder.) Then I issued the command “rm -rf *”. Silly mistake… :)

Reply

435 markus June 5, 2011 at 6:08 pm

I’ve accidentally run commands on a remote shell instead of local machine more times than I care to admit.

I’ve learned the hard way to be VERY careful when using chmod -R and rm -rf. Thankfully it’s never happened on a production server, although I have annoyed a friend or two who asked for help setting up servers.

Reply

436 hpavc June 10, 2011 at 6:10 pm

./script.here | tee *

A coworker did this to me; it overwrites every file in the cwd with the output of ./script.here.

This can also be pretty intensive if you have, for example, a crazy number of logfiles and lots of output.

Reply

437 D.plomat June 14, 2011 at 5:03 pm

stop the wrong server… done.
firewall lockdown… done.

But the most outstanding i had was initiated by a buggy KVM switch:
We had 3 identical Debian Lenny servers in production (redundant + load-balancing DB servers) that we wanted to dist-upgrade to Squeeze…
We did it the obvious way, choosing a low-load hour and upgrading the first one while keeping the other two in production… Well, in fact that *** KVM switch actually broadcast the keystrokes to all 3 servers. We noticed at the kernel update that all of them were rebooting, then we got phone calls about the service not being available.

Even more outstanding, the dist-upgrade went equally well on the other two, and as soon as they’d rebooted they were fully functional… a great testimony to the reliability and ease of upgrade you can expect from this distro, and the first time I’ve seen such a potentially disastrous mishap turn out to have negligible impact, unattended.

(lol, after the adrenaline rush while you’re waiting for them to boot, you can enjoy the ineffable feeling of having pulled off a flawless, unwitting upgrade of a full pool of production servers…)

Reply

438 Spinifex June 17, 2011 at 10:08 pm

This is wonderful. Absolutely wonderful. I don’t feel like such a fool after all.
I have done a few of these.

Reply

439 wolfric June 19, 2011 at 9:41 am

instead of ifconfig eth0:1 bla, doing eth0 bla…. yeah that wasn’t pretty.
did > instead of >> once as well

Reading these comments, it sounds like this might help
alias rm='sleep 2&&rm'

Reply

440 James June 26, 2011 at 2:53 pm

has anyone else noticed that very few of the comments on this page use full English???? it’s scary…. and one made no logical sense until i imagined a few commas.
also, i’ve done the remove everything …..

Reply

441 fooboo June 26, 2011 at 7:39 pm

Capital ‘h’ for “Has”. One question mark is enough to denote a question.
Capital ‘i’ for “It’s” since it is the beginning of a sentence.
Three full stops ‘…’ denote an ellipsis not four. Capital ‘i’ for “I imagined…”.
Capital ‘a’ for “also” as it’s the start of a sentence albeit a poor one. Capital ‘i’ for “I’ve” and another ellipsis with too many full stops/periods.

Yes, I suppose it is fun commenting on other people’s poor grammar.

Reply

442 Sumner June 27, 2011 at 4:57 pm

It’s crucial to stay aware of context! My coworker wanted to delete the contents of the current working directory, so he typed “rm *”, but he didn’t have the necessary permissions to remove those files as himself, so he moved on to:

su -
[root password]
rm *

There was just a little problem with doing that. The “su -” command moved him to the root directory, and shortly thereafter, he realized the system was no longer bootable.

Reply

443 Amit Chaudhary July 1, 2011 at 5:12 am

Hi,

I did

rm -rf /abc/* /

I was just in a hurry, and my machine was dead. Luckily it was a local machine. Another day I rebooted a server thinking I was working on my local machine, and bang, I had rebooted the Nagios monitoring server :)

Reply

444 dagon July 1, 2011 at 1:39 pm

I removed a lot of files by accident. This taught me to create a wrapper for rm which copies the files into a “trash” directory instead of physically deleting them.
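
A minimal sketch of such a wrapper, assuming a trash directory under $HOME. The function name trash, the TRASH_DIR variable and the naming scheme are illustrative choices, not the commenter's actual script, and this version moves files rather than copying them.

```shell
# Hypothetical "trash" wrapper: moves files aside instead of unlinking them.
# TRASH_DIR and the function name are illustrative, not a standard tool.
TRASH_DIR="${TRASH_DIR:-$HOME/.trash}"

trash() {
    mkdir -p "$TRASH_DIR" || return 1
    for f in "$@"; do
        # Report names that do not exist (for example a stray ".bak" argument).
        if [ ! -e "$f" ] && [ ! -L "$f" ]; then
            echo "trash: $f: not found" >&2
            continue
        fi
        # A timestamp and PID suffix makes name collisions unlikely,
        # though not impossible within the same second.
        mv -- "$f" "$TRASH_DIR/$(basename -- "$f").$(date +%Y%m%d%H%M%S).$$"
    done
}
```

Using a distinct name such as trash, rather than aliasing rm itself, means the command simply fails on a machine where the wrapper is not installed.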

Reply

445 Cody July 1, 2011 at 4:34 pm

Hm, why that when you have these options to rm (assuming you do; else ignore) ?
-i prompt before every removal
-I prompt once before removing more than three files, or when removing recursively. Less intrusive than -i, while still giving protection against most mistakes

(E.g., just alias rm to 'rm -i', like: alias rm='rm -i' in your login script (.bashrc or equivalent))

The problem with such approaches, of course, is if you change accounts, move to a different machine that doesn’t have the option or an alias set up, you could end up in trouble if you rely on them.

Reply

446 Oisín July 1, 2011 at 6:54 pm

“The problem with such approaches, of course, is if you change accounts, move to a different machine that doesn’t have the option or an alias set up, you could end up in trouble if you rely on them.”

True. Best to have a unique alias that will fail with an error on another machine.
For example, alias rim='rm -i'.
Similarly you could have a “rim” for killing processes (usage: rim [job])…

Reply

447 Cody May 8, 2014 at 1:50 pm

Better is to explicitly type the option (e.g., get into habit) if you’re wanting to rely on -i for rm (or others). This way, if you do find a version of rm with no -i option, you’ll get an error (or should). And keep in mind that depending on where you put the alias definition, you might not always have it available (though in that case unless a command existed by the name ‘rim’ you’d indeed get an error). I think my point initially was this though: you should always explicitly specify options of this type, rather than relying on aliases as it can bite you at some point, in ways you might not foresee (including someone compromising your machine – or simply your account to e.g., to then go after root, gather information, whatever) and doing [whatever] to your alias [or whatever else] but at same time not doing anything else… or another possibility is if you are at work and leave yourself logged in they could do their deed that way, as has been done to many). Also, when specifying -i, keep in mind that rm -fi will prompt you, rm -i will prompt you but rm -if will not (and yes I am skipping the file as I’m referring to command and option only).

Reply

448 Bill Timmins July 8, 2011 at 10:01 am

My favourite experience was the error message when I typed

rm -rf * .bak
which was
.bak not found

I realised slowly what it meant :-)
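
The effect of that stray space can be previewed without deleting anything, because the shell expands the patterns before rm ever runs. A small sketch in a scratch directory (the file names are made up for the demonstration):

```shell
# Scratch directory with three files; only one ends in .bak.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch notes.txt main.c old.bak

# What "rm -rf * .bak" would receive: every file, plus a literal ".bak".
with_space=$(printf '%s\n' * .bak)

# What "rm -rf *.bak" would receive: only the backup file.
without_space=$(printf '%s\n' *.bak)

printf 'with space:\n%s\n\nwithout space:\n%s\n' "$with_space" "$without_space"
```

Swapping rm for printf or echo on the same argument list is a cheap way to see the expansion first.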

Reply

449 edueloco July 11, 2011 at 7:00 pm

Guys/Gals, i had a great time reading all you!!!

We all have stumbled for God’s sake!!! i remember my, already mentioned, `last|reboot’ in a payroll server…. at the end of the month….

LOL’s for all, may we all learn from our mistakes!!!

Farewell!!

Reply

450 Greywolf July 11, 2011 at 7:36 pm

I was watching an admin deal with an interesting problem. We had a file named ” ” (space) in the root directory which he wanted to delete…

…you’re way ahead of me.

Sure enough, he typed:

rm(space)-rf(space)/(space)(return)

This system was the last running of its kind in the company, we had no alternate software for it and no way to recover or boot it up again. This was in 1988 at a shop that specialised in doing its own ports of UNIX to just about any piece of hardware you gave it (“give us 7 days and we’ll have a single-user prompt on a crowbar.”). For this one, though, it was so ancient there was just no helping it…

Reply

451 Tom July 19, 2011 at 11:22 pm

But it’s so easy..
touch " "
rm " "
and that’s all..

Reply

452 Sam Watkins July 12, 2011 at 2:57 am

One thing I stuffed up at work was dumping a database to text format, then reinserting the records. I think it was with postgresql a few years back. You would expect it could read its own SQL, but we were using a human-readable date format and the timezone was printed as ‘EST’, for Eastern Standard Time in Australia. However on re-importing the date, the DBMS interpreted it as EST in America, apparently. So, I learned always to use ISO-style date and time format, e.g. 2011-12-31 23:59:59. It sorts nicely too.

This ‘EST’ confusion continues today:
> date
Tue Jul 12 12:54:29 EST 2011
> TZ=Australia/Melbourne date
Tue Jul 12 12:54:30 EST 2011
> TZ=EST date
Mon Jul 11 21:54:35 EST 2011
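
One way to sidestep the abbreviation entirely is to emit timestamps either in UTC or with a numeric offset. A short sketch using standard date format specifiers (the variable names are only for illustration):

```shell
# Unambiguous timestamps: UTC with a "Z" suffix,
# or local time with a numeric offset such as +1000 or -0500.
utc_stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
local_stamp=$(date +%Y-%m-%dT%H:%M:%S%z)

echo "$utc_stamp"
echo "$local_stamp"
```

A numeric offset cannot be mistaken for another region's zone the way “EST” can.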

Reply

453 Sam Watkins July 12, 2011 at 3:23 pm

I’d like to propose a new direction for this thread… What, if anything, have we learned from all these many and varied UNIX disasters? Here’s my list so far:

1. Keep multiple remote versioned backups of anything you care about.
2. Keep multiple failover systems ready to replace any critical system in a moment.
3. If a process is complex or time consuming or error-prone, try to automate it.
4. Hire competent netops and coders, not the numbskulls who posted here ;)
5. Do not entrust any shell coding ‘toddler’ with root, DBA powers… or anything else.

Please tell me what you’ve learned too… or any other good advice for netops…

Reply

454 Mr Z July 12, 2011 at 3:52 pm

Do a search on the page for ‘learn’ or ‘lesson’ and you’ll quickly compile a list.

One of the simpler things is to make a backup of any file you edit before you edit, delete it when the edits test as good.

Test with ‘ls’ command before running the ‘rm’ command – the extra seconds is priceless

If you can’t replace it or rebuild it – think long and hard about why you can’t, why there is no backup, and just what in the fuck do you think you are doing by editing/changing anything on it. That goes double if it is a production box.

Seriously, why don’t you have a test system?

Why are you running a command which modifies or deletes anything if you don’t have a backout plan? Sorry, say that again? Are you sure you want to do that? NO, go to lunch and think about it. If it’s still a good idea, well then ok. Go ahead, run that command with no backout plan. It’s on your head.

As for your points, 1=yes, 2=yes, 3=YES, 4=ha ha, 5=hey, wait a minute. Even script gods have bad days, so don’t trust anyone to get it right. Severely restrict root access, DBA powers etc. to only those who must have them. There are some things which should not be at root access levels – if that is inconvenient, reassess permissions for it. If it should be root-access-only reassess why anyone needs access to it. Even if you are a script god/admin don’t do your daily work as root. Use sudo. Restrict root login to ONLY the console, no remote root access. Stuff the reboot command inside an alias script which prompts you with all the right information that you should know before rebooting before it runs the reboot command, such as server name, uptime, number of connected users etc. You might even restrict that command (and others) to a special user that you must sudo to first. On production systems, 15 seconds to access it before you run the reboot command can save your job for you.
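
The reboot wrapper idea could look roughly like this. It is only a sketch: confirm_host is a hypothetical helper name, and it merely reports the host and asks for the host name to be typed back; it does not run reboot itself.

```shell
# Hypothetical helper: show where you are and require the host name to be
# typed back before a destructive command is allowed to proceed.
confirm_host() {
    this_host=$(uname -n)
    echo "Host:   $this_host"
    echo "Uptime: $(uptime 2>/dev/null)"
    echo "Users:  $(who | wc -l | tr -d ' ') logged in"
    printf 'Type the host name to continue: '
    read -r answer || return 1
    # Succeeds only if the typed name matches this machine.
    [ "$answer" = "$this_host" ]
}
```

It could then gate the real command, for example confirm_host && sudo reboot, so that a reboot typed into the wrong window stops at the prompt.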

Now, write a script to implement these changes you have made on any and all production servers. Implement them on all production servers. Prepare your USB controlled NERF missile launcher to destroy anyone who complains.

Reply

455 Elvis July 23, 2011 at 12:26 pm

Wait… are you *THE* Mr. Z ???
If so, thank you for everything, with thirty years delay :)

Reply

456 Mr Z July 24, 2011 at 4:01 am

I LIKE my moniker. Apparently 4000+ others like it enough to use it also. If I managed to do something kind 30 years ago, I don’t remember it so I’ll not try taking credit for it. Of course, I can’t remember much of last week, so 40 years ago is going to be a stretch no matter how much red bull I drink. I guess I’m “an” Mr Z, probably not “THE” Mr Z, but he has a good name ;-)

Reply

457 Elvis July 24, 2011 at 3:36 pm

Anyway, thank you! Being admin for years, I still find usefull things on this site, at least thank you for that.
I understand what you’re saying. I can’t remember what I ate three days ago.
Mr. Z I was looking for is related to Triad, Google knows, Google knows.
I learned from one of his friends that he’s “working some Linux boxes with his father”, that’s why my neural network rang a bell.
Best!

Reply

458 Mr Z July 24, 2011 at 4:16 am

BTW, my comments to you are how I treat system administration. I always assume my assumptions are wrong… ba dump ba. Seriously I always check twice especially when I think I know what I’m doing. This applies to many things in life. If you are building some cabinets or such at home, measure 4 times, cut once. You only get to cut it too short one time. You can cut it too long several times. There is no substitute for a second opinion. Google et al are a wonderful source of second opinions, and like this site, those opinions are based on how to fuck it up. So read these and others and do not make those mistakes. Of course you will, we all will, but if this reduces the volume and regularity of your mistakes then it was well worth putting this blog online.

My own personal mistakes:

Sun hardware often has interlock switches on the covers. Don’t remove them to look inside of a production box that is on line.

TEST your backups. No, I really mean TEST your backups the hard way. This is what test servers are for. If you don’t know your backup works it WILL fail. While you are at it, try a little bit of disk-2-disk-2-tape backup. Multiple copies never hurts and available on network attached disk is a speedy recovery method. Did I mention that you should always have backups – plural. Disaster recovery is a life style, not a SLA requirement. Repeat that to yourself every morning. Whether you care for your customer’s data or not, you SHOULD care about your nights and weekends. Many of which you might spend in a blind panic if you lose your backups, or don’t have any to start with.

Oh, on the topic of backups – install a version control system. Use it for all your scripts. Trust me, you don’t want to rewrite them from scratch – ever. Build all your boxen so that they can be replaced in a heartbeat or as few heartbeats as you can manage. Seriously, remember the sanctity of free evenings and weekends at the bar with your friends. These will be violated regularly if you are not ready for disaster(s).

Another? High availability systems are worth less than the single router they are connected to. Tut tut tut. Yep, I’ve done that one, or rather suffered it. The OSS group or Network group are NOT your friends. Make sure that disaster recovery is THEIR lifestyle too. Trust no one!

Reply

459 Beus July 12, 2011 at 3:26 pm

As I’m reading these mistakes, I remember making a funny one a few years ago. At university, we had our own server in school’s computer center with some “home-made” utilities, one of which being called “kick”. It served to kill (-9) all the processes of a specified user. This was achieved simply by setting a root suid attribute to the binary, calling setuid() syscall to the specified UID and executing killall -9 -1 to kill everything what kernel allowed it to. It had already been a long time since we created these utilities, partly forgetting its inner functioning, and we freshly introduced our “home-made” kernel security patch that was configured to make the setuid() and seteuid() syscalls ineffective (the syscalls returned no error, but process’s (E)UID remained unchanged) so as to disable any kind of root access via an exploited setuid program – to become root, one had to log in properly via ssh from one of the defined IP addresses. In addition, to make it more convenient for the roots to administer the machine, certain UIDs have had certain capabilities added, e.g. to delete some non-owned files from defined directories (not the system ones) or to kill not-owned processes. I was, of course, one of such users.
That day, it was Friday evening, I was somewhere else, working remotely, and when finishing, I was too lazy to exit all my shells and other processes properly and wanted to show my friends a way how to log out both impressively and easily. So I issued
kick beus
and watched with surprise how all my friends’ screens (logged in on the same machine) went blank suddenly.
Then I realized the truth of the matter: kick command did a setuid() syscall to a specified user UID which failed, and, as still having my capabilities of killing non-owned (including root’s) processes, the executed killall did its work properly and we ended up with the machine being dead for the whole weekend and Monday morning. Just the kernel and init survived so we could only watch it responding to ping ;-)
Since then, we moved sshd into inittab and later introduced a remote-sysrq target into iptables :-)

Reply

460 SW July 13, 2011 at 5:39 am

Faulty muscle memory. I once wanted to type
vi xxxxxx.c
but ended up with
rm xxxxxx.c

The Return keys are really “Point of No Return” keys with commands like this.

Reply

461 alldf July 26, 2011 at 11:31 am

When I was young and stupid I accidentally ran the poweroff command on a remote Linux box. I wanted to shut down my local machine, but forgot I was logged in to the remote server. This was in the evening, just before I left for the pub. The next morning I had to investigate why the server was not running, and I noticed the shutdown sequence in the log…
Since then I DON’T use the poweroff command. I use “reboot” in every case and then shut the machine down during the reboot by pressing the power button. If I have to use poweroff (e.g. I really want to shut down a remote machine), I double-check that I’m logged in to the right box.

Reply

462 Christian Straube August 1, 2011 at 2:58 pm

Hi,
my last mistake was an “iptables -F” while connected via SSH :-) After that, the local server support had to perform a reboot to restore the iptables rules… I locked myself out ^^

Reply

463 Philip550c August 4, 2011 at 7:22 am

Not to be a dick but you made another mistake, in your article you said “day today jobs” but its “day to day jobs”. Sorry I couldn’t resist.

Reply

464 zappafan August 12, 2011 at 6:23 am

My first, great, unforgettable mistake: first days in linux, run gparted in a have-absolutely-no-idea-of-what-I’m-doing mood. Just answered “Yes” without reading the warning text: that was the last time I saw a running Ubuntu, on reboot computer said “no partition table found”.

Second great mistake: first hours in zsh, totally amazed by its great tab completion system! its damned tab completion made me type “rm /bin/mount”.

Reply

465 zappafan August 12, 2011 at 6:41 am

I nearly forgot my third great mistake!
I was having so much fun on the University Gentoo server, and those were my first very fun days with Apache. I was editing httpd.conf and didn’t realize I had pressed the return key, uncommenting a help line… The server restarted and within minutes ALL WEB APPLICATIONS USED BY THOUSANDS OF STUDENTS returned a very lolling SOCKET ERROR! ahahah

Reply

466 phord August 24, 2011 at 11:23 am

I’m on the phone with a lab tech at a large cable TV company. He’s got a runaway process on their video-on-demand server chewing up a CPU core.

me: It’s a non-critical process, so you just kill it and restart it.
tech: How do I do that?
me: Type “ps -ax | grep processname”. Find the PID. Then do a “kill -9 ” followed by that pid.
tech: “kay eye ell ell dash nine” ok. Now what.
me: It should restar…
tech: Whoa. Uh. Hang on. Something else is going on here. I, uh, I gotta go.
me: Ok. call me later when you have time.

An hour later…

tech: Don’t know what that was, but it’s fixed now. Ok, so I killed that process.
me: Yeah? So check if it started back up now.
tech: Ok. I remember the pid was “1”…
me: hahaha. Wait. you are kidding, right?
tech:
me: O_o

Reply

467 Doomsday1337 March 26, 2013 at 10:48 pm

He killed Init, didn’t he? Wouldn’t that auto-restart the system? So what’s the problem?

Reply

468 steve August 31, 2011 at 12:56 pm

…login…
/usr/local/bin/sudo su -
cd /
ls -al …looking for a txt file…
unix2dos *
………ctrl-c……..ctrl-d……….ctrl-z…….go get the installation media…

Reply

469 Aaron August 31, 2011 at 11:53 pm

One time, when compiling some of my code, I accidentally did
gcc -o file1.c file2.c
instead of
gcc -o file.o file1.c file2.c
:( Obviously, I was not happy after that.

I have also done the accidental “type command into wrong window” before, except I did it the other way around. I meant to shut down the remote computer (It was another workstation, not a production server), but I accidentally did the shutdown in a window for the local computer instead. So I was the only one that was bothered, but I had a LOT of stuff up that I was working on that wasn’t all saved. End users spared, but I was furious.

Somewhat similar, but not necessarily my own mistyped commands:
I really like making programs for automation, so I was all too happy to make something when some of the staff here said “We’re tired of shutting down computers in this area manually every night after people leave.” I made them something that would shut down all the computers in their area, only to be used at the end of the day. It was called “SHUTDOWN LEARNING COMMONS” To this day, they still get a new person from time to time which runs it, wondering what it is for. They have shut their area down a few times when they had a lot of people doing work over there.

As an added bonus, I made the program initially so that it would run in the context of my personal admin account so that it had privilege to shut down the remote computers. The remote computers were windows xp, and the xp window that pops up warning users of the remote shutdown displays front and center what network account initiated the remote shutdown. I was told one day that someone had accidentally run it again, and that one of the more “important” users was searching for me to “give me a piece of his mind” because he saw my name attached to it, even though I was not even at work when it happened. Since then, I made an account called “shutdownLC” whose only purpose is for that one program.

Another time, network switch related rather than unix:
I accidentally locked myself out of our core switch when I was changing vlan settings on it. This rendered our entire network basically useless. I know this is one of the more common mistakes, but here’s the worst part: It was an older switch that doesn’t have an RJ45 console maintenance port, so all of our serial cables with RJ45 connectors on the other end were useless for this switch. We had a few old ones in the building, but it had been so long since they had been used that nobody knew where they were and we couldn’t find them.

The network was down for a few hours before my boss made a special trip in and made me feel like an idiot after he got there: he unplugged the cables that went to the core switch and plugged them into a different switch then said “This might slow things down a bit, but at least it will work now.” I don’t know why I didn’t think of that…

Reply

470 Fab September 5, 2011 at 3:07 pm

Had to fix a box that was borked by an admin that (possibly) accidentally deleted
/bin/grep
the system wouldn’t boot. It was full of errors everywhere and the system was hanging.
While troubleshooting in chroot environment that booted from a CD I tried
to grep config files to find out to my surprise that grep wasn’t there (2-3 hours down the road of course).

Reply

471 Fab September 5, 2011 at 3:12 pm

But my personal worst was after one month of software development in C:

$ rm * .o
rm: cannot remove `.o': No such file or directory

(hint: the space after the * was fatal)
(Reward: I learned how to make Makefiles after that)
(Reward2: I learned how to make backups as well)

Reply

472 Tom September 5, 2011 at 8:25 pm

Some filesystems have an undelete utility.
Just pull the power cable so the disks cannot sync or write anything, take a deep breath, grab some coffee and try to get the data back. I learned this after destroying some data on different filesystems, ext2/3/4/reiserfs, even ntfs.
The more data has been written and the more fragmented the filesystem is, the lower the chance of recovery.
Things are harder with RAID; it usually requires a backup to another array or tape.

Reply

473 csm September 19, 2011 at 10:49 am

In my previous company 5 years ago.

Day 1, they made me use a solaris machine to practice while waiting for my project assignment. After a boring day, I shut it down and went home.

Day 2, Cad admin is hunting me for shutting down the NIS server.

:)

Reply

474 cr September 22, 2011 at 10:44 am

Locking myself out of a remote server – done
Removing all files in ~ – done
I accidentally typed

rm -rf * /opt/whatever/

instead of

rm -rf /opt/whatever/*

I don’t really know why I did this, but it happened..
Since that time I always have a file called ‘-i’ in my home directory.
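
The file named -i works because the shell expands * before rm starts, so rm receives -i among its arguments and may treat it as the interactive option. Whether it takes effect depends on the rm implementation and on where the name sorts in the expansion, so it is a safety net rather than a guarantee. A sketch in a scratch directory that only prints what rm would be given (the other file names are made up):

```shell
# Scratch directory containing a file literally named "-i".
demo=$(mktemp -d)
cd "$demo" || exit 1
touch ./-i report.txt data.csv

# The shell expands * before rm runs, so rm would see "-i" as an argument.
set -- *
printf 'rm would receive: %s\n' "$*"
```

The same trick is why ./-i or a leading -- is needed to actually remove a file with such a name.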

Reply

475 Wolf Halton September 22, 2011 at 10:58 pm

Adding a sudoer to a machine on which I had sudo rights, but not root:

# User privilege specification
cschultz ALL=(ALL:ALL) ALL # this was already here
mjagger ALL=(ALL:ALL) ALL # this was already here
sadams ALL=(ALL:ALL) ALL # this was already here
metoo ALL=(ALL;ALL) ALL # the semicolon broke sudo so
# none of the admins can log in.

Reply

476 Robert Parten September 27, 2011 at 4:48 pm

Well let’s see:

1. Shutdown public interface on server – check
2. Locked myself out of a firewall – check
3. Caused meltdown of database servers by not reviewing a switch config – check
4. Gave wrong IP to switch in lab and deployed to production – check
5. Shutdown server via ssh shell when I meant to shutdown my machine – check

Reply

477 Nirvani September 29, 2011 at 5:14 am

Some mistakes I have done as an admin:

Changed the network (eth0) configuration of around 100 servers during a data center move and forgot to change the network ID. This was done through a script from a trusted host. Also issued ssh xyz `poweroff -f` on the admin host; the backticks made the local shell run poweroff, which killed the admin host itself.

Copied up to 50GB of data to /tmp as a temporary location (had planned to copy it to another location) and forgot. One reboot and all the data was gone.

ran crle -l /some/path/to/libs on a Solaris production box and it messed the whole system. None of the other admins could fix it and we had to go for a reinstallation.

Reply

478 Nirvani September 29, 2011 at 5:26 am

Another mistake i forgot to mention:

cat > /etc/vfstab instead of cat >> /etc/vfstab

Luckily I had another identical machine and copied the contents of its vfstab. Didn’t break anything :)

Reply

479 Superkikim October 4, 2011 at 2:59 pm

Excellent topic. Thank you for sharing with us. And thank you to all who left some more comments.

I think chown and chmod are by far worse than rm -rf, because usually you have backups and can easily restore a folder, unless you deleted your entire file system. Fast and clean. But chown and chmod? You’re in for a full restore… unless you want to lose some time believing that you may find a way to fix the mess…

Another one I did was a quite simple

apt-get autoremove

Well… it removed automatically all right… It removed the dependencies of an uninstalled application… but those dependencies were still required by quite a lot of other packages… I still don’t know if it was supposed to do that… but I don’t use autoremove anymore…

I’m still new on Linux… Working sporadically on it since 3 or 4 years… Lots of mistake to come… for sure. I’ll keep this post in my bookmarks to come by sometimes and read again.

Reply

480 Jack Wade October 7, 2011 at 10:02 pm

Not really a significant mistake in terms of downtime/destruction/damage/etc, but still. So I work for a hosting company, and some of our boxes have upwards of a thousand or so domains on them, including the Apache virtual hosts as well as named zones.

Well, one day I edited a zonefile and restarted named rather than reloading it. named takes *forever* to start up from scratch with that many zones so all the sites were unresolvable for around 7 or so minutes. (Note we don’t have a redundant/slave DNS set up, but we definitely should)

Reply

481 Aaron October 8, 2011 at 1:21 pm

A bad mix of screen and rsync. I was working on adding a slave database server to a staging environment. I opened screen with a term to the existing database and a term to the new host which was to be the slave, got my rsync command ready, and fired it off, only I was on the wrong DB and using --delete, so I wiped out the staging database in a matter of seconds. I never had so many perplexed developers calling me in my life. =) Luckily staging is a dump from prod with some sanitizations run to fudge data, so I simply needed to run the process manually. In the long run the devs weren’t that upset because I gave them a free day off. woohoo

Reply

482 Ken October 10, 2011 at 5:32 pm

Our Linux servers are -csh by default, because our production software uses csh environment variables. My workstation is bash by default (and preferred), so, upon making changes to the .login script, I use this:
if [ $this = "$that" ]
then
# more bash syntax that killed the login

Luckily, not a huge mistake, but it did kill everyone’s ability to login. So, had to mount the home directory as a sftpfs, and fix .login.

Multiple shells.

Reply

483 jd October 26, 2011 at 9:14 pm

Well, I do remember having a mc with two panes: a 10 gig folder with some clients’ emails and a 9 gig folder with their backup from a week ago. I went to delete the backup folder … the wrong one.
Yay

Reply

484 Lala October 27, 2011 at 4:01 pm

Bought a new disk for the stuff i mirror because home partition on my workstation was quite full.

mkdir /home/lala/dump
mount /dev/sdaX /home/lala/dump
mv /home/lala/mirror /home/lala/dump

Then i created a cron job:

cd /home/lala/dump/mirror ; rsync -rav --delete rsync://somestuff.org/repoX . > /dev/null 2>&1

No problem for a couple of weeks until I had to reboot my box. Next morning, looking at my completely empty home (except some RPMs…), I realized that I had forgotten to add an /etc/fstab entry for /home/lala/dump, and that a “&&” instead of “;” in the cron job would have been nice to have.
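
A minimal sketch of the two guards this story points to: check that the disk is really mounted, and only run the command if the cd succeeded. The function name run_in_mounted_dir is hypothetical, and mountpoint (from util-linux) is assumed to be installed; the paths in the usage comment are the ones from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: run a command inside a subdirectory of a mount point,
# but only if a file system is really mounted there and the cd succeeds.
# Usage: run_in_mounted_dir MOUNTPOINT SUBDIR COMMAND [ARGS...]
run_in_mounted_dir() {
    mnt=$1
    sub=$2
    shift 2

    # `mountpoint -q` (util-linux) exits non-zero when nothing is mounted there.
    if ! mountpoint -q "$mnt"; then
        echo "refusing to run: $mnt is not mounted" >&2
        return 1
    fi

    # Subshell keeps the caller's working directory untouched;
    # `&&` means a failed cd never reaches the command.
    ( cd "$mnt/$sub" && "$@" )
}

# Possible cron job body:
# run_in_mounted_dir /home/lala/dump mirror \
#     rsync -rav --delete rsync://somestuff.org/repoX .
```

Either guard alone would probably have been enough here; using both costs nothing.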

-.-

Reply

485 Blurf October 30, 2011 at 2:10 pm

Ahh..memories…..

Starting to fill up the hard drive on my development machine….
So..company buys me a nice shiny new drive, Install it late the night before, get up early to get some extra work done because I know it is going to take a while to set it all up.

mkfs the first partition on the second drive…..or… is it the second partition on the first drive? …OH crap!

Good news…there is a full backup on tape..
Bad news..it’s offsite, half hour out, half hour back, and 3 hours to restore it, *after* I rebuild a bootable system to restore it to…

It’s the little details that kill you :-P

Reply

486 Michael November 2, 2011 at 9:30 pm

Digging up old thread, but I have a story that I thought would be more common:

I needed to create an account on the production database server. The admin trusted me, so logged in as root and let me create the account while he fetched coffee. It was running an old version of MySQL and was being fussy with the GRANT command so I just added the row directly to the mysql user’s table. Problem was I got the password wrong, so I thought, no problem, I’ll just update it:
UPDATE user SET `password` = Password('mypass');

I got distracted by remembering to hash the password, and by reflex hit the semicolon [enter] before catching myself.

You know that moment when you go cold and realise you have just majorly messed up a production server that isn’t even your own and you really ought not to have been on in the first place? Fortunately I had done a select a few lines before and it was in my terminal scrollback. I had to manually enter in the 30 odd user password hashes before the admin got back with his coffee. I never told him.

I have now got in the habit of typing in “\c” and then only after I have finished the whole command and reread it do I remove the \c and add a semicolon.

Reply

487 martianunlimited November 4, 2011 at 12:19 am

My personal favourite is accidentally copying the default tcsh/csh prompt from the command line and pasting it. (ya.. i know you can just browse the history, but i think i was working with multiple terminals and trying the command line out in a different shell environment)

prompt> ./my_script.csh
prompt> prompt> ./my_script.csh
prompt: Command not found

Immediately changed my prompt after that, and I have refused to copy and paste command lines unless I absolutely have to.

Reply

488 laapsaap November 10, 2011 at 8:33 am

All linux administrators will make all of these mistakes at some point. But the difference between a master and an apprentice is that masters only make the same mistake once.

Reply

489 Barfo Rama December 16, 2011 at 6:24 pm

The difference between a guru and a master is the guru knows how to fix it before anyone notices.

Reply

490 htan6x November 15, 2011 at 9:05 am

I ran hostname -a instead of hostname on HP-UX. Noticed it right away and changed it back.

Reply

491 agreatunixadminIneverwillbe November 15, 2011 at 7:09 pm

I’ve done most all the mistakes mentioned here… for me, the worst was when early in my career I was working in a pre-production environment on a customer site, our “pilot” machine shows off our custom software to customer mgt and execs, lets them demo the software/provide feedback, that kind of thing… it’s hugely visible and must have constant availability. On this project, it’s the only one of its kind at the time due to budget constraints.

So, we spend almost a year developing the software for this particular project and had gone through a couple of rounds of feedback, software updates, etc for the customer to see, play with, approve, etc.

One day I’m on the box in the middle of the day prep’ing for an update later that night and instead of removing the contents of an old version’s backup directory, I run this at root

rm -rf *

and go back to my work, waiting for the command to finish. Only when the command was still running 10 minutes later did I realize what was going on. I stopped the rm command, but the damage was done. The box was hosed when it was supposed to have 24/7 availability.

I happened to be working with a great guy at the time who was superior with Unix. He did some command-line magic using cpio, some pipes, and some other stuff (to this day I still don’t know exactly what he did) that rebuilt the machine in real time over the network, software and all, in a few hours. No one knew what happened… we took some heat for the downtime, but blamed it on something else I don’t remember.

I always always always have backups now… local backups, as well as remote backups.. and whenever I design systems, at least N+1 redundancy.

Reply

492 wharfie November 21, 2011 at 6:53 pm

Two of my favorites, one by me and one by the “boss”.

I was in charge of Usenet for a pretty big ISP. We had two identical Usenet servers, huge systems for the time, dual Xeons with hundreds of gigabytes of storage and as much ram as the rest of the machine room combined. We’d test upgrades on one, get it all working, then change the IP and swap the cables so that the test box became the production box. Well, one day after a successful upgrade I issued the command to remove and rebuild the news spool on the test box. Of course, I was *actually* logged into the identical-in-appearance production system. Tech support started getting calls 5 seconds after I hit enter. The moral: at least change the prompts or something to be different on test/production…

Another time I worked for a bunch of scientists and wasn’t “really” the admin. My boss was the sysadmin and I was the lab assistant, although he didn’t know much about running a computer so I did everything. But he insisted on asserting himself occasionally just for appearances and one day informed me that he’d added a cron job to “clean up” during the night. Next day we came in to a server filesystem that was completely empty except for /bin/crond and /bin/rm… The guy had written a shell script that ended with (cd $HOME;rm -rf *). Of course, in those days cron ran as root… And root’s $HOME used to be /, not /root :-) Whoops.

Reply

493 Joce November 26, 2011 at 10:29 pm

$ls important_*
important_1 important_2
$mv important_*

…forgetting the destination directory name. Well, some consolation: you lose just 1/2 of the stuff…

Reply

494 zorg December 2, 2011 at 12:15 pm

ln -fs ./blabla /dev/null instead of ln -fs /dev/null ./blabla
(ipfw flush; sleep 3; ipfw add allow all from any to any;) &
but forget “&” :))

Reply

495 Doug December 13, 2011 at 7:55 pm

I’m not a sysadmin, but am an app admin. When we did a major software platform upgrade on the same hardware I had to get the new code deployed. It was one night before the release and I had to get the deploy script kicked off but it was late and I needed to get home. Screen wasn’t installed and nohup didn’t always work. So I ended up putting it in cron. Then I got sick and was out for several days, which included the major release. Everyone kept wondering why one of the clusters kept going down at the same time each day and reverting our code.

Reply

496 shaheed December 14, 2011 at 7:13 am

The usual one, which most people here have mentioned already:

rm -rf * instead of rm -rf ./*

Reply

497 matty August 28, 2012 at 11:21 pm

Do you mean rm -rf /* instead of rm -rf ./* ?

Reply

498 Pablo December 15, 2011 at 11:50 pm

The most painful one for me was discovering why an AIX server only allowed logins from the root user…

The answer:

Some TSM Genius did a chmod 666 /

Reply

499 Barfo Rama December 16, 2011 at 6:18 pm

It seems to be possible to comment out a working command in an HP-UX crontab, creating a syntax error, and thereby make the crontab disappear on exit. I always make a manual copy now, as well as auto-copy off the box periodically along with /usr/local/bin and such.

Reply

500 jason December 19, 2011 at 3:05 am

I forget the exact sequence and not about to experiment, but was trying to upgrade the libc binary on a live machine to gcc3 to compile, I think I just moved libc.so.6 to libc.so.6.old… and ln, mv, cp, etc rely on libc.so, so wouldn’t work anymore. I did manage to google a solution to get the machine going again.

Reply

501 jason December 19, 2011 at 3:08 am

crontab -r just has no reason to exist. open the cron file and delete everything if that’s what you want to do. I had a bad experience, and then could never remember if the bad one was -e (erase) or -r (remove). (-e is edit, -r is remove)

Reply

502 kishan December 23, 2011 at 10:03 am

i deleted my root dir :|
edit this command too

rm -rf ~

Reply

503 Kaeza December 27, 2011 at 6:29 am

How about this?

rm -rf "$foo/$bar";

Now guess what happens when you didn’t properly define *both* the foo and bar variables, and run the script as root…

Oh yeah… I realized when it was too late:
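
One hedged way to guard against the empty-variable case described above is the POSIX `${var:?message}` expansion, which makes the shell stop with an error when the variable is unset or empty. The function name safe_remove is hypothetical; this is a sketch, not a replacement for checking the path yourself.

```shell
#!/bin/sh
# Hypothetical helper: remove "$1/$2" only when both parts are non-empty.
# ${1:?message} aborts the expansion with an error when the parameter is
# unset or empty, so rm is never reached with "/" as its target.
# The body is a subshell ( ... ) so the abort ends only the subshell,
# not the calling script.
safe_remove() (
    base=${1:?safe_remove: base directory is empty}
    name=${2:?safe_remove: entry name is empty}
    rm -rf -- "$base/$name"
)

# Usage, with the variable names from the comment above:
# safe_remove "$foo" "$bar"
```

In a plain script the same idea can be written inline as rm -rf "${foo:?}/${bar:?}".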

Reply

504 pooh January 2, 2012 at 12:12 am

I am a beginner in Linux administration and... I accidentally deleted anaconda-ks.cfg and the install logs under the /root/ directory:

1. I unzipped files with the overwrite option over ssh.
2. I pressed the refresh button in WinSCP to make sure that the files were overwritten.
3. I did not realise that WinSCP had changed the directory after the refresh :(
4. I saw anaconda-ks.cfg and two install logs, which I did not expect there, and impulsively deleted them.
5. A minute later I realised that this was a mistake: as a result there is no install history and no template for future installations, and possibly some company scripts rely on this file.

Conclusion:

Make backups of everything important;
Make sure that you are working in the right directory;
Focus your attention on one console and avoid guis for installation tasks;

Good Luck ;-)

Reply

505 Ken Mason January 4, 2012 at 7:02 pm

I needed to place server into single user mode. I did it. Then realized I was remote = I watched my ssh term complain of lost connection. Drive to data center. Feeling rather human.

Reply

506 Wolf Halton January 8, 2012 at 5:25 am

Virtual Machines have helped with that. With VSphere/ESXi when I mess up my ssh configs, I can just find the Windows machine (that is the only OS that runs the VSphere client) and go in the back way to the server, fix my error and go happily on my way.
One time I accidentally used a semicolon instead of a colon in the sudoers file when adding a new admin to a client’s machine. The machine happened to be 1000 miles away in a data center I do not have access to otherwise. Since it was not one of my own VMs, I had to request a colleague in New England go in and fix my error.
It does make one feel human.
Cheers,
Wolf

Reply

507 Nicholas Bodley January 6, 2012 at 5:06 pm

[Please feel free to delete this, in parts, or even totally, btw, if it’s too verbose.]

Fortunately, the only semi-serious goof I ever made to a system not my own was to drop the backup Flexowriter console typewriter at the early (above-ground) NORAD/BMEWS COC. I was fired. Should have asked for help lifting it off the floor.
Hit cork tile over concrete. I also wiped some customer data (not important) from a mag. drum-based data-storage device (SEMA, quite obscure) that I was trying to interface to an IBM electromechanical tab-card machine with lots of internal sparking contacts. They weren’t really concerned.

ThinkGeek, iirc, is where you get T-shirts that say “I void warranties”. While that applies to me also, mine should say,
“I disable operating systems
(Only my own)”

At least three times, I have destroyed my Linux installations (and one Win XP; sorry).
Being over-tired and doing something risky is just stupid, when one is free to hit the pad.

Linspire was a lovely Linux distro., very nice user community. It faded when its developer died, and his son really didn’t want to continue. I hosed mine via emelFM, a twin-pane file manager, new to me. Had one pane up showing /, the other /home. Wanted to recursively ensure that all of /home was owned by {user}, not root. Chose wrong pane, because UI didn’t really make clear which pane was active. I forget details, but permissions for the system were hopelessly hosed. Backups? Good idea, but not yet…

I think it was XP; I had about six partitions, and some unallocated space. Made a partition there to put Linux into. Did a mkfs.ext3 on what I thought was the free partition. Unfortunately, partition “index” numbers (such as /dev/sda2) don’t necessarily follow the physical sequence on the platters. That can be a horrible “gotcha!”
I got my ext3fs, OK, but XP was puzzled by the leftover hybrid partition type.. Backups? Not yet… (Most of the data is probably between the superblocks and inodes, still recoverable.)

While I like the Unix philosophy of avoiding user feedback unless wanted (I’m 75, and have some feel for the command line on, say, a DECwriter), I do think it shouldn’t be that difficult for the (GNU) mkfs commands to warn the user that a partition about to be formatted contains data. Afaik, # rm -rf actually does ask whether you’re sure, at least in some distros.

I tend to delete Bash history any time I’ve used an rm command that could cause future grief. All too easy to poke the up-arrow and hit Enter.

Discovered Parted Magic, iirc; small bootable CD, actually a workable small distro in its own right. Wanted to see what the GUI for deleting the partition table looked like.
OK, so /that’s/ what it looks like. Over-tired; clicked on [execute] instead of [dismiss] (actual names differ). Should clone that HD and try C. Grenier’s [testdisk].

OK, although have long believed in backups, and too rarely done them, decided to back up openSUSE 11.1. Real fool, doing it in a very makeshift fashion (I know better). Krusader, twin-pane file mgr. Active. F5 is Copy, F6 is Move, iirc. Meant to hit F5, hit F6, instead. Watched progress, and suddenly files in / started to disappear.

So, at last, I’m running Mint Linux (do investigate Cinnamon!), and do need to remember not to waste 100s of GB backing up, say, a downloaded, but outdated Linux .iso file.

Last, had several 100 GB of mostly non-valuable videos and stills on a 1-TB HD. Hosed it (tried to expand a partition, but machine timed out and shut down during resizing). Used the wrong recovery app. written by Chr. Grenier with no luck, of course. Wanted [photorec], but was trying to do it with [testdisk]. No luck; segfaulted testdisk once. Finally tried
# dd if=/dev/zero of=/dev/sdb bs=1M (details from memory, maybe not totally correct). Took over 3 hours, and, this time, I had disabled machine shutdown after 2 hours; set it to “never”. Only then did I realize that I should have used [photorec].

Reply

508 Cody January 6, 2012 at 7:41 pm

Quick response, some may be vague. Hope it’s helpful.

Re: “Afaik, # rm -rf actually does ask whether you’re sure, at least in some distros.”

It’s an alias. rm -i does that. Do be careful you don’t rely on that though, because when the alias isn’t there.. well you can be surprised.

As for mkfs – define “data”. The best option for the future is to check disk usage on that mount point (and of course make sure it’s not mounted too; mkfs may warn about that, but I haven’t run it manually in a long while), e.g.:

df -h /

would show how much is in use / free on the root volume.
(that’s the other thing. if a partition is not mounted, nothing will be seen data wise – not at the higher level anyway).

And for backup, depending on size of what needs backup, a few thoughts:
cron job that backs up nightly/weekly, whatever ? (or even something like the program bacula).
Also, you can save a dump of the partition table, boot sector and so on (not useful for everything, however it can be of use at times).

One other thing about backups: when recovering data, never write to the disk it’s on, including restoring to it or installing the recovery program on it.

And on another note, regarding permissions. You could always do:

chown -vR user.group /home/
or
chown -vR user:group /home/

Just make sure you never do chown -Rv user.group .* or similar (in general be very careful with recursively changing ownership).

Reply

509 Cody January 6, 2012 at 7:42 pm

Hmm, last two commands got messed up. I stupidly used arrow brackets. I should have used the html codes.

after /home/ should be the user’s directory.

Reply

510 Been there done that January 9, 2012 at 5:29 pm

Working for a hosting company has given me some interesting experiences.

Like the time a customer support agent gave a customer the instruction (via email) to simply execute on his CentOS box [rpm -e XXX]. Apparently the customer didn’t understand so the customer support agent typed in [rpm -e ] and then pasted [rpm -e XXX]. I spent well over 6 hours getting RPM functional again and I can assure you it is not a fun experience.

Or the entertaining situation created when a customer support agent wanted to “grow” an existing partition for a customer on a second drive by creating a new partition with the same name and then mounted it over the existing partition. Nothing was lost, but you should have heard the customer scream about his “data loss”. Luckily, mounting a partition in this fashion doesn’t erase data but simply hides it – yet it still took several of us a couple of hours to figure out what had actually transpired…

Reply

511 Kiwi Nick January 16, 2012 at 7:41 am

A few that I’ve seen happen

  1. SQL: DELETE FROM some_table; — missing WHERE clause. Bun fight for the next 3 hours for the trained monkeys at the client to get the backup tapes for a restore.
  2. There is a Linux installation where almost EVERYONE has the same home directory (no need to store user-specific stuff on this machine, they all share the same .profile). From force of habit: userdel -r minion; ## because minion had left the company. -r wiped out the shared home dir for all users, including all the DBF files for the production Oracle database. About a day’s downtime.
  3. The same guy made a second mistake: logged in remotely to a Unix server, he goes: ifdown eth0; ## then intended to ifup eth0; to pick up a config change. But he couldn’t get his prompt back after the first command, I wonder why ;-) He rang up and told someone to force the power off, then on.
  4. An entire group of suburbs without phone service for a day: some vague rumour about something misconfigured.
  5. One that I did myself: very early days of Unix – confusing rm instead of mv. It was only a single file deleted (the intended destination didn’t exist, and rm croaked with the error, which alerted me to the mistake). But it was still 2 hours recreating the C-program assignment from memory – some one else’s who I was helping.
  6. Another one I did myself: swapping between the CVS storage area, and the equivalent working directory in my home directory: wanted to delete part of the working directory tree and start over: rm -fr stem_of_tree; ## but in the CVS area, not mine. Got someone to restore from backups. It was a pretty rudimentary CVS setup.

Reply

512 Kiwi Nick January 16, 2012 at 8:09 am

Now for some prevention.

  • mv -i ## always use -i
  • rm -i ## at least when you have your L-plates on (L-plates is Aus slang for “learner driver”).
  • Only execute something as root if it won’t work otherwise; I generally have two windows open: my normal user and the root user, and I do things like ‘ls, cat, less’ in the normal window. Then I do things like vi, or rm as root (but I do a ls just before the rm, as root).
  • I configure my root windows with a different background colour.
  • pwd; prior to rm or recursive commands.
  • prior to a recursive command on a small directory tree, find . -print; or, find . -type d -print; see how many files will be affected (or for slightly larger situations, the second one gives the range of directories to be affected).
  • When I write a script to rename files or change them in a robotic manner, I put an echo in front of the intended command, so it will give me the commands that are going to happen. If satisfied, I remove the echo, to let the command actually happen.
  • If I’m editing files in a robotic manner (by script), I create the new files with a backup extension (eg file.cxx becomes file.cxx.new). Then I diff the various original files against their new ones, to ensure it is as intended, then I rename the files, mv -i ; and keep my eyes peeled as I say yes to each one.
  • The > is your enemy. If I want to create a brand new file by redirection, I … cat brand_new_file; (expect a not found error), then some-cmd > brand_new_file;
  • If I want to append by redirection, I make a backup of the target file regardless.
  • When I edit a config file in /etc (or anywhere), I … cp -i file.conf file.conf.20120116; named after today’s date. The dates aren’t that truthful, but it’s miles better than anything.
  • Create dedicated users for running VMs and similar (especially if they are shared). That’s the approach Apache uses (usually automated).
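
The “put an echo in front of the command” habit from the list above can be sketched like this. The function name rename_cxx_to_cpp, the DRY_RUN variable and the .cxx to .cpp rename are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical bulk rename: give every *.cxx file in a directory a .cpp suffix.
# With DRY_RUN=1 (the default) each mv command is only echoed; set DRY_RUN=0
# once the printed commands look right.
rename_cxx_to_cpp() {
    dir=$1
    for f in "$dir"/*.cxx; do
        [ -e "$f" ] || continue          # the glob matched nothing
        new=${f%.cxx}.cpp
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo mv -i -- "$f" "$new"    # show what would happen
        else
            mv -i -- "$f" "$new"         # -i asks before overwriting
        fi
    done
}

# First pass:   rename_cxx_to_cpp ./src
# Second pass:  DRY_RUN=0; rename_cxx_to_cpp ./src
```

Making the dry run the default means forgetting the switch only prints commands.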

Reply

513 Beus January 16, 2012 at 4:32 pm

- Adding echo before the actual effective command when making any automated file-manipulation script and only removing it after several satisfactory test-runs is exactly my approach, too :)
– If creating a brand-new-file from redirection, I use the advantage of tabkey completion ;)
– When editing config files, I usually just copy & comment out the original line(s) and then edit – it saves time spent by diff-ing the versions. This way I sometimes get configs a few times larger than they have to be, but an occasional cleanup of unnecessary stuff during the next edit takes care of that :-)
– As to the root access – I never have any root-shell window at all and do everything under my user. If anything really needs root privileges, on machines not needing top security I avoid having to type the root/sudo password each time by just running “do command parameters” where “do” is a tiny hyper-simple program in my ~/bin (which I have in $PATH) running the rest of its commandline under root privileges (has chown root:special_group and chmod 4510 (r-s--x---) ), basically going like this:

setuid(geteuid());
execvp(argv[1],argv+1);

Mine is extended by running the shell if run without parameters but I almost never leave such a shell open unused.
I even used this on some servers – being executable by only a certain group together with hiding it in odd locations provided high enough level of security (even if someone gained an access to one of the root-allowed users’ accounts, would have to know what to run).
– Anyway, what contributes most to the commandline safety is the distinction in prompt – I always set it up to show hostname[tty]:dir> for me and root@hostname[tty]:dir> for root, sometimes in different colour. This way I’m always warned that it’s a root-shell before issuing a command. Benefit of always seeing the hostname and directory prior to even starting to type a command or manipulating files does not need any explanation (or if hostname is not distinguishing enough, just use any string to identify the machine) and knowing the tty proved very useful when hunting processes.
So, the prompt consumes the whole line sometimes (in some deep subdir trees) but the information it provides is worth that space, and it’s more convenient and safer than checking “pwd” or “hostname” each time, which you can forget sometimes (especially when tired, which is the worst time to run dangerous commands) :)
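
A rough sketch of such a prompt, assuming bash (the \h, \l and \w escapes are bash prompt escapes for hostname, terminal name and working directory). The function name set_prompt and its optional UID argument are hypothetical; the argument only exists so the root variant can be looked at without being root.

```shell
# For ~/.bashrc (bash assumed).
# hostname[tty]:dir>  for normal users, root@hostname[tty]:dir> in red for root.
set_prompt() {
    # Use the UID given as $1 if present, otherwise the current user's UID.
    if [ "${1:-$(id -u)}" -eq 0 ]; then
        # \[ ... \] marks the colour codes as non-printing for line editing.
        PS1='\[\e[1;31m\]root@\h[\l]:\w>\[\e[0m\] '
    else
        PS1='\h[\l]:\w> '
    fi
}

set_prompt
```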

Reply

514 Kiwi Nick January 16, 2012 at 8:28 am

A few more, for major reconfigurations.

  • During major reconfigurations (using a liveCD), I mount everything -r (readonly) unless the task demands I write into that partition.
  • Mount a partition and have a nosey prior to hosing it (sometimes it won’t work coz it’s never had anything in it, but still: if the mount unexpectedly works …)
  • Prior to removing a large directory tree, chmod 000; and/or, mv -i; to a new name, and leave it for a few days, to see if anything stops working.
  • rpm --test as a non-root user, prior to the real rpm.

Ironic: there’s an advert below this box saying OMG’S of the week (Telstra). Yes, it’s a bit like that.

Reply

515 Oisín January 16, 2012 at 5:35 pm

“Mount a partition and have a nosey prior to hosing it (sometimes it won’t work coz it’s never had anything in it, but still: if the mount unexpectedly works …)”

Good point, which saved my skin a few times… “I’ll format that for the new backup partition. Well, I’ll just check if there’s anything on it and… wait, isn’t that my Windows partition? That’s not possible, because /dev/sdb is supposed to … oh, I plugged in that thing and… yikes.”

Reply

516 Allen Garvin January 19, 2012 at 5:54 pm

I once untarred an archive in the root dir. This was long ago, on HP-UX 8. It changed the permissions of the root (/) to 700. Since I was logged on as root, I saw no change in behavior. However, everyone else began seeing very strange behavior, and no one else could log on at all (it would seemingly log them on, then drop them). It took about 20 minutes of looking, before finding the problem.

I did the killall thing once long ago on Solaris, too. It taught me to switch to using pkill everywhere (that had it).

Can’t tell you how many times I installed Solaris in the 90s with a fresh install, and then forgot to enable nis, and had to drive back to the data center afterwards because you couldn’t log in as root in telnet.

Reply

517 Programie February 7, 2012 at 12:25 pm

You are currently in your systems root directory (/).
You want to delete all files in /tmp but type “rm -rf /tmp/ *” (Note the space between / and *)
After the command completed successfully, you try to change to another directory. But you can’t. There is nothing!
It hasn’t happened to me yet… but hey, life can be long. :D

Another mistake which is possible: You are on a remote machine (via SSH) and think you are on your local machine. You want to re-partition your hard disk…

Reply

518 Valeri Galtsev February 7, 2012 at 3:42 pm

Hm…
with a little knowledge of unix one can realize that

rm -rf /

is _NOT_ as devastating as many here think it is (I would add that as a Unix certification exam question). Let me re-post here what I already posted:

“…
One thing I couldn’t buy as ultimately devastating though:
rm -rf /
– if I ever manage to do it as root on my *nix box I expect /bin, /boot, and part of /dev gone (and whatever else could be in / alphabetically before /dev on that box). Then the device hosting root filesystem will be deleted, and this will be end of my trouble. The rest: /home, /lib, /lib64, /sbin, /tmp, /usr, /var will stay intact. Other opinions?

Of course, you lose the system on the fly, but you can mount the partitions of the drive on another system, and you will see the other filesystems mentioned above intact.

Reply

519 cyclooctane April 5, 2012 at 9:22 pm

The only problem with the above logic is what happens when someone lets the command run to completion. (I have seen exactly that scenario happen when a young sysadmin failed to realise in time what he had done)

Anyway.
My list
Running commands on the wrong box (done)
Locking my self out of a box with a firewall rule (done)
ifdown eth0 when logged in on this NIC via ssh (done)
chown apache:apache -R / var/www/html/* (done)
dd if=/dev/zero of=/dev/sdb (when it should have been /dev/sda)

I am sure there are many more that I can not remember off the top of my head.
I remember these because they were the most painful. :(

Reply

520 Thomas Bliesener April 9, 2012 at 5:42 am

> Locking my self out of a box with a firewall rule (done)
>ifdown eth0 when logged in on this NIC via ssh (done)

These are my favorite ones on very remote systems. ;-)

1 Set a reload of the firewall in the crontab 25 mins in the future
2 Set a reboot in the crontab 30 mins in the future.
3 Set an alarm 23 mins in the future on your desk.
4 Do changes only on the command line first and when they really really work, modify the config file.

Reply

521 Tom February 25, 2012 at 7:24 am

I stopped the iscsi daemon on the wrong machine because I was typing in the wrong shell. This resulted in bsod’s on a lot of VM’s that were running production

Reply

522 LennoN February 28, 2012 at 12:57 pm

Editing the pam.d/auth despite all the warning about it :)

Cant even log into my root :)

Reply

523 d March 4, 2012 at 7:41 pm

chmod -R 000 /

Reply

524 sachin March 6, 2012 at 4:21 am

hi all ,

I was doing some database work.
I rarely use a Windows machine. That night I was using Windows and logged in using PuTTY.
I was just searching the history, where I found
/etc/init.d/mysql stop
I selected that command and started searching the history again.
I don’t even know what I was searching for when I suddenly pressed the right mouse button.
In PuTTY, doing so pastes the selection, which executed the command, and within no time the database was down. That was a production database.

Conclusion:
Never select or keep any command in the buffer.

regards,
sachin singh

Reply

525 Cesar Chirino March 8, 2012 at 3:43 am

One of the most absurd things that has happened to me is that some years ago, the operating system let me create a file named “*”. Ha ha ha!!

Imagine the efforts to delete, rename, move or something that file!

In the end I restored the file from backup and lost the latest changes… Had to write some programs again.

Reply

526 Cody March 10, 2012 at 1:24 pm

Re: Imagine the efforts to delete, rename, move or something that file!

Like this ?

$ touch \*
$ rm -i \*
rm: remove regular empty file '*'? y

As I recall there was always a way to remove (or rename, etc.) a file by that name. There are also similar tricks if a file named - exists, e.g., most programs (GNU versions at least; I cannot check others nowadays) accept -- to signify the end of - options.

Reply

527 Cody March 10, 2012 at 1:44 pm

Oh, and another way (just for reference). You actually had it kind of (hint, you quoted it). Another shell feature :
$ rm "*"

That would also delete it (look up in e.g., the bash man page QUOTING). Note that it refers to this :
The special parameters * and @ have special meaning when in double quotes (see PARAMETERS below).

* by itself is not a parameter, so it’s not referring to that. See Special Parameters section for more info.
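
The tricks from the two comments above (quoting the name, `--` to end option parsing, and a leading ./) can be tried safely in a throwaway directory. A small sketch; the function name demo_odd_names is made up, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Create awkwardly named files in a temporary directory and remove them
# without the shell or rm misreading the names.
demo_odd_names() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1

    touch ./'*' ./-r plain.txt   # a file named "*", one named "-r", a normal one

    rm -- '*'    # quotes stop the glob from expanding; -- ends option parsing
    rm ./-r      # a leading ./ also keeps rm from seeing an option

    ls           # lists what is left: plain.txt

    cd / && rm -rf -- "$dir"     # clean up the throwaway directory
)

demo_odd_names
```

The unquoted rm * would have taken plain.txt along with it, which is the whole point of quoting.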

Reply

528 Daniel Hoherd May 4, 2012 at 5:10 pm

Regarding funky characters, sometimes you can use a ? in place of the problematic character, or use some other substring that will match. For instance, if you wanted to remove a file called “~*_blah” you could “rm ./??_blah”

Reply

529 Cody May 4, 2012 at 6:51 pm

You should be very careful with ? though. Reasons – although it won’t match a ‘.’ at the beginning, it would match the dot for file.ext if you specified, file?ext or a second dot.

That may not seem like a problem, but imagine this:

touch \.\?
( create a file called .? )

rm -f ??
(won’t delete the file, would delete files [that you have permission to delete] that have two character names; and if you specify -r it would do same for directories too, minus ..).

Sure, you can do ./.? but… hope you don’t do rm -rf without escaping the ?

Contrived or not, it’s best to escape characters or make absolutely sure it won’t do anything else other than you intend (which you even mention in a different post). Okay, so if you have a letter after, fine, but then you might be tempted to specify .?* and there’s another issue there. So yes, it can be done, but it should be pointed out it is risky.

Reply

530 soundman2020 March 12, 2012 at 2:37 am

Then there was the time that I wanted to use the tcsh shell for root login, instead of the standard bash shell. So I carefully edited /etc/passwd and changed the relevant part to “/bin/ctsh” instead of “/bin/tcsh”. Guess what happened next time I wanted to log in as root …

Reply

531 Programie March 12, 2012 at 8:48 am

I know that one. I made a similar mistake myself.

My suggestion:
*NEVER* close your currently open shell before testing the new configuration in another instance. That also helps while configuring the SSH daemon. ;-)

Another mistake I made in the past: changing the SSH port without updating the firewall rules. :D
I ended up with a truly headless system. I was happy to have a VNC remote console available. ;-)

Reply

532 Daniel Hoherd May 4, 2012 at 5:08 pm

Something I’ve done in the past as a safety net for firewall edits (which isn’t fool proof but worked for me) is to open a screen session on the server whose firewall you’re modifying, then do “ssh -R 29922:localhost:22 some.other.server”, then if the firewall prevents inbound connections you can still connect through that pre-existing remote tunnel.

Reply

533 Cody March 13, 2012 at 5:50 pm

Maybe you know of it now, but.. if not:

Next time, instead of editing the passwd file, try invoking the program ‘chsh’. There’s also ‘chfn’ (for change finger information). Both have been around for aeons.

Reply

534 Collin B March 13, 2012 at 10:20 pm

I set the /bin/chmod to not be executable due to a failed copy and paste and got my commands switched around. Took some outside the box thinking but I used perl to reset it. I now use puppet to handle my configs, I use git to clone my puppet configs to a local user environment, test those changes then use git and push those changes back to puppet and update configs that way. Definitely a smarter way to go.

Reply

535 Marcus March 15, 2012 at 6:03 pm

Backup, backup and backup. Right! The most important thing. :)

Reply

536 Vincent March 29, 2012 at 6:01 pm

I wanted to change permissions on all hidden stuff of a user’s home directory :
chmod -r 644 /home/user/.*
Had to build and run a perl script to clone permissions on all /home from the last backup…

Reply

537 Vincent March 29, 2012 at 6:08 pm

btw you should try :
$ cat > -r
hello world
Ctrl-D

Mouhahaha

Reply

538 Cody March 29, 2012 at 8:14 pm

And ? What’s your point ? You miss several things, among them :

– Many commands, rm included (and other similar basic utils) have the option ‘--’ to say stop processing - options.
– It’s not hard to delete, ie :
rm -- -r
– rm -r won’t work (because missing operand).
– If you specify, say ‘rm -r ‘ it’ll remove the directory and skip the -r file.
– List goes on.
– See the first point.

In other words: Nice try, but your attempt at being ‘clever’ actually is the opposite. Maybe you should try testing your cleverness first next time ? The words that come to mind of what it makes you look like otherwise, is.. well, I won’t even go there.

Reply

539 Cody March 29, 2012 at 8:15 pm

Right, html. I should say ‘rm -r [enter]’ won’t work because it will be missing an operand.

Reply

540 Cody March 29, 2012 at 8:24 pm

Oh, and clarification of why it’ll skip it.

If you don’t have it aliased, it might be an issue (but again – see first point). However, anyone with say rm -i as the alias for rm (which is, though not the best choice – it is still a default on say red hat systems and potentially others). In all cases then, it’s pretty silly to think it’s going to cause an issue (oh and as the person above you said: backup).

So, yeah, it’s not an issue to delete files with special meaning characters. It’s also not the 70s when more was possible. I mean even these days you have less permission (as non root) to write to consoles and so on (say, echo or cat a file > /dev/pts/ …).

Reply

541 Oisín March 29, 2012 at 8:35 pm

rm ./-r
?
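
Both approaches can be tried safely in a scratch directory. A small sketch (the second file name, -f, is just an extra example):

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create two files whose names look like options.
echo 'hello world' > ./-r
echo 'hello world' > ./-f

rm -- -r        # "--" ends option parsing, so -r is treated as a filename
rm ./-f         # with a leading ./ the argument no longer starts with "-"

left=$(ls -A | wc -l | tr -d ' ')
echo "files left: $left"
cd / && rm -rf "$scratch"
```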

Reply

542 Cody March 29, 2012 at 8:40 pm

Another nice example indeed.

And interestingly, I just tried unalias on rm, and it was still smart enough to process it as the option -r rather than play around with the file. Really though, I shouldn’t be surprised given how programs do process arguments passed to the command line. It makes perfect sense given it starts with the – character.

In short, his idea is just very flawed from the start.

Reply

543 Stefan April 7, 2012 at 10:46 am

not linux related:
i cut off the power cord from our primary core switch to the ups,
while we re-configured the second switch. downtime about 20 minutes

Reply

544 Luigi April 21, 2012 at 1:41 pm

One little program that will save a lot of time: molly-guard. I found it through aptitude; it asks you for the hostname of the machine you want to shut down or reboot. This should adequately prevent accidental shutdowns.

Reply

545 Thomas Bliesener May 4, 2012 at 7:08 pm

Sometimes the filename is so ugly that you can’t even type it. In these cases you can address the file by its i-node:
1. Catch the i-node with ls -i
2. find . -inum INODE -exec rm {} \; (with INODE replaced by the number from step 1)
This trick is described in “Unix Power Tools” from O’Reilly.
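
The two steps can be sketched end to end like this. The file name with an embedded control character is a made-up example, and -xdev is my addition, since inode numbers are only unique within one filesystem:

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# A name that is awkward to type: it contains a control character (octal 001).
awkward=$(printf 'bad\001name')
touch ./"$awkward"

# Step 1: read the inode number, the first field of ls -i.
inode=$(ls -i | awk '{ print $1; exit }')
echo "inode: $inode"

# Step 2: let find locate that inode and hand the path to rm.
find . -xdev -inum "$inode" -exec rm -- {} \;

left=$(ls -A | wc -l | tr -d ' ')
echo "files left: $left"
cd / && rm -rf "$scratch"
```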

Reply

546 Oisín May 5, 2012 at 3:28 am

That is a lovely trick! Pity about the slightly painful find/rm syntax.

Reply

547 Cody May 5, 2012 at 12:02 pm

Yes, and that book happens to be quite good. I have an old edition of it and it’s still good. You could of course use the ‘-delete’ option to find to make it ‘easier’, but observe the man page warning on it first (do man find and then type /-delete to search for that).

Reply

548 Jon May 6, 2012 at 9:02 pm

Another one for mixing up terminal windows here – I had two terminal tabs open in OSX. One local, one SSH onto the server. The local mySQL had some changes I wanted to put onto the server, and I wanted the local to match the server more closely (apart from the changes).

I was about to copy the server’s httpd.conf over to local, so I went to delete the local copy beforehand. cd’d to the right folder and went to delete it…. but between the two I’d swapped to a different window and then back, losing track of which terminal tab I was in. Deleted the httpd.conf file on the server and had to shut the box down while I rooted out the backup… then had to port all my changes over again.

Reply

549 dn3s May 13, 2012 at 6:24 am

my worst mistake is typing > instead of >>. i use >> way too often for my own good on files i really shouldn’t use it on. bash completion is the second worst, when i type rm startOfFilenam[tab] [enter without looking].
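
One guard against the > versus >> slip is the shell’s noclobber option. A minimal sketch, assuming a POSIX-style shell; the file names are only examples:

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo 'zone one' > named.conf

set -o noclobber                      # same as: set -C

# With noclobber on, a plain > refuses to truncate an existing file.
if (echo 'oops' > named.conf) 2>/dev/null; then
    clobbered=yes
else
    clobbered=no
fi

echo 'zone two' >> named.conf         # appending still works
echo 'forced'   >| forced.conf        # >| deliberately overrides noclobber

lines=$(grep -c zone named.conf)
set +o noclobber
cd / && rm -rf "$scratch"
echo "clobbered=$clobbered lines=$lines"
```

It only protects interactive sessions and scripts that turn it on, so it is a safety net rather than a replacement for backups.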

Reply

550 eeq May 14, 2012 at 6:41 pm

It wasn’t all my fault, only about 98 – 99%.

Someone had somehow created a file with zero length and the name “\”, without the double-quotes of course. Well, that just had to go.

On about the fourth or fifth try I noticed it appeared to be working but taking way too long to delete one file. Ctrl-c. Real fast. And of course I was root.

Was I ever lucky. I wiped the OS but stopped it before any user data was lost. The lead sys admin moved that HD to slot 1, put a new HD into slot 0, and handed me the OS CDs. I spent the rest of the evening re-installing the OS.

Reply

551 Rich May 18, 2012 at 5:59 pm

I’ve done some of these… ‘rm -rf fileprefix *’ late at night when trying to free disk space (instead of ‘fileprefix*’) removed all our corporate binaries; fortunately, backups had just completed and restore went quickly.

Here’s a trick I depend on every day: All my ssh/xterm windows are color-coded – production server windows are loud and glaringly obvious, no matter what I am doing. Makes me nervous just typing in them…

Reply

552 10speed June 11, 2012 at 12:49 pm

My all time favourite was a co-worker’s mistake.
DELETE FROM table name WHERE id - 1000

Yup, he tried to remove all of column 1000, but it was the only one left. Luckily, with a bit of scripting and his long buffer, we were able to restore in just a few minutes.

Lesson learned: select * before delete :)

Reply

553 Phord June 12, 2012 at 6:53 pm

Ouch! I’ve had some SQL typos like that, too. Lesson I learned is to always BEGIN TRANSACTION; before any data-modifying statements. If I have several steps to do, I’ll do something like this in Notepad or vim:

BEGIN;
DELETE FROM table name WHERE id - 1000 ;
ROLLBACK;
-- COMMIT;

I try this a few times until I get it right. Once I do, I remove the ROLLBACK and run it one more time.

BEGIN;
DELETE FROM table name WHERE id = 1000 ;
-- ROLLBACK;
COMMIT;

Reply

554 Remco June 17, 2012 at 5:24 pm

My worst mistake ever was when I wanted to remove all DOT-directories (.esd-1000 etc.) in my /tmp-directory.

cd /tmp
rm -rf .*

As this took more than the expected very short time period, I realized that .* also expanded to .., so I effectively did an “rm -rf /tmp/..” which is equivalent to “rm -rf /”… As I didn’t know what files were gone, I reinstalled my system.

Nowadays rm itself refuses to do this:
rm -rf .*
rm: cannot remove directory: `.’
rm: cannot remove directory: `..’
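
A sketch of patterns that match dot-files without ever matching . or .., tried in a scratch directory. The directory names other than .esd-1000 are invented for the example:

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir .esd-1000 .X11-unix ..odd
touch .a keep.txt

# What the dangerous pattern expands to (may include . and .. depending on the shell):
echo .*

# Two patterns that cannot match . or ..:
#   .[!.]*  a dot, one character that is not a dot, then anything
#   ..?*    two dots followed by at least one more character
safe=$(echo .[!.]* ..?*)
echo "$safe"

rm -rf .[!.]* ..?*
left=$(ls -A)
echo "left: $left"
cd / && rm -rf "$scratch"
```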

Reply

555 Zeddarn June 18, 2012 at 2:15 pm

Logged in as root and deleted the password file on RHEL 6; I didn’t notice until I restarted the server. Luckily I copied the file from another server and everything was back.

Reply

556 Stv T June 20, 2012 at 10:08 am

Way back, I was trying to debug a ‘.cshrc’ which was not working correctly and wanted to see where it was hanging. I put ‘#!/bin/csh -x’ at the head of the file and started a new session. Unremarkably, I was left staring at a blank screen with no output, but succeeded in exhausting all process slots, preventing anyone else from logging in and slowing the machine to a crawl. Luckily I had another window open and was able to remove the ‘#!/bin/csh -x’ and kill all my processes, although it took about 30 minutes. Never done it since !!

Reply

557 Lou Marco August 4, 2012 at 2:09 pm

Not really quite a command line screwup, but some of you might find it humorous. It does involve volcopy, at least (remember that one?)

I used to manage a room full of 11/780’s and 750’s. They all used those old washing machine sized RP05 drives, think they held like 100 MB. They set the bus ID with a big plastic plug with a number, like a kid’s building block, that you stuck into a hole on the front. They used a removable platter stack – you could open the top of the drive like, well, a washing machine, reach in and unscrew the platters, and the whole thing came out. I took advantage of this to do full backups. Every Sunday night I’d come in, swap a spare set of platters into one of the drives, volcopy another drive to it, then just switch the ID blocks. The new copy became the old /usr or whatever, the old platters went on a shelf as a known good backup, and the now vacant drive was ready to run the next backup. Lots faster than the 9 track tape drives I would have had to use and I could run more than one at once.

Except one time I ran the volcopy before I remembered to switch ID blocks. All volcopy cared about were “volumes” – disc partitions – and the garbage scratch disc I’d just told it to back up to my /usr drive had perfectly valid partitions on it…. Whoops.

Reply

558 okobaka August 13, 2012 at 6:46 pm

1. alt+f4 on putty hurts
2. `rm cd /path/to/dir` # someone called me on the cell while executing rm and after that i typed cp /path/to/dir (alias rm -r)
3. again rm dots, stars, spaces and other weird stuff with char coding
ctrl+c and ctrl+z is your friend
My way to prevent rm was `touch 0; chmod 000 0` in directory i want to protect. Secret zero file.

Same with:
* remote mistakes, iptables, ifconfig, route, dhcpd.
* partitioning (you want write to disc? yes, oh no no no … ), cfdisk, MBR, lilo
* cat binary files, forced me to drop a shell because i didn’t know about reset/restart
* wrong umask in root profile, as root chown or chmod -R /home/user/, hip hip horray for symlinks,
* remote rsync
* Whats going on? Something is not working? Why i cant do that? lets reboot – never reboot when something is cooking. First check if you have enough space to write your damn files on partition! Permissions and so on …

… for sure i did more.

I would like to say now i’m fine with mistakes because of double check everything, but that rm pissed me today, i’m going to block some arguments to rm – damn it.

Nothing important was lost.

Reply

559 kiran October 5, 2012 at 2:43 am

while something went bad with a human err, its always better not to reboot..
I agree :)

Reply

560 Behrooz August 16, 2012 at 1:33 pm

My list:
rm -rf /
chmod 1700 /
dd if=/dev/sdb1 of=/dev/sda
dpkg -r $(dpkg --get-selections | awk '{print $1}')  # wanted to reinstall them afterwards

Reply

561 Steve Holdoway August 30, 2012 at 5:56 am

I’ve made a few (along with most of those above), and had some done by my predecessors…

Overlapping partitions… the first one fills up, overwrites the start of the second one, which was when we found out that his backups didn’t work either.

In the days before vipw, editing the password file on a full partition.

Believing the old 2 volume System V ringbinder which told me to format, not mkfs a partition. At least my backups worked that time. Steep learning curve, that.

A colleague that realised that he could replace /etc/passwd from his PC ( DEC Pathworks anyone??? ). Unfortunately a DOS formatted /etc/passwd is as much use as no /etc/passwd.

Re: the tar/untar problems with resetting the CWD permissions. Now you know the reason not to create an archive as tar cfz arch.tgz . – always add a directory above, or use tar cfz arch.tgz * .[a-zA-Z0-9]* – obviously only use the second pattern if there are hidden files to transfer.

Reply

562 Hoover September 1, 2012 at 9:23 am

A long long time ago when I was learning my ropes as a DEC VAX / Ultrix admin, I hit the BREAK key on the VT220 terminal expecting to return to the LAT server prompt in order to open a new connection to another VAX (all serial terminals connected to a terminal server back in the days, must have been the late 80s or thereabouts).

However, I did not know that when you hit BREAK on the terminal connected directly to the server’s serial port, it, well, … breaks. ;-P

The machine was our main faculty student server and had about 30 people logged in at the time, and as the classroom was right next to the admin room my boss and I instantly heard angry shouts and pitchforks being sharpened next door. He probably saved my life (or at the very least my reproductive capabilities ;-) by intoning in a rather loud voice: “Damn those pesky students! They’ve crashed math6 again!!!” ;-)

Good times, good times.

Reply

563 Jake September 11, 2012 at 8:14 am

I’m just a young guy, however I do enjoy reading everyone’s war stories. Best to learn from listening to the people who have been there. I got a boneheaded one though. While I was updating my system, I forgot to update libc first. Luckily I had a boot disk handy, and was able to roll back the mistake. Cost me about 2 hours of a Sunday.

Reply

564 UriH October 7, 2012 at 12:46 pm

Man you gave me a good laugh, shutting down the interface you’re connecting from and then you can’t ssh anymore hahaha :D

Reply

565 James October 14, 2012 at 3:43 am

Two that come to mind, looks like they’ve been mentioned already, but…

ifconfig eth0 down on a box I’m ssh’d into on the eth0 interface. >_<

and then the one that was the worst…

mounting a drive to a folder on another drive that had files in it. My FreeNAS server threw a FIT along with SMART errors (it seemed to think I was about 500GB short on disk space) :P

Reply

566 Bernhard October 29, 2012 at 9:18 am

Two more for the history:

1) “last | reboot” instead of “last reboot” …
2) ldd /bin/ls then copied the following output line from this command as root and executed:
libc.so.6 => /lib64/libc.so.6. This deletes the contents of libc.so ….

Reply

567 woot October 31, 2012 at 2:30 pm

Wow if anyone of you was on my company, i think you will be fired very fast.
I’m working since 10 years in IT typing commands every day, i never did any of these noob mistake.

Reply

568 Cody October 31, 2012 at 8:55 pm

You’d have no employees then and you’d be on your own in the end. You’d also eventually (if not already) need to fire yourself.

And as for “noob mistakes” – either you’re lying, you’re unaware of mistakes you made or you simply are so inexperienced that the commands you type are incredibly basic. Maybe even a combination of the above.

10 years isn’t all that impressive anyway.

And while I try not to criticize based on grammar (no one is perfect and this is more like a discussion forum than a professional document), I’ll make the exception (and I admit I make mistakes too). You might (although anyone with experience knows that to be a complete lie) make no mistakes at the command prompt but you sure make a lot of mistakes in your comment as far as language is concerned. I’ll refrain from pointing them out for the reasons I already mentioned (and I’m not going to stoop so low).

I will however point out something else: those who are in denial about such things will never learn from mistakes and will never grow. In other words, they’re incapable of improving and should be asking themselves “why do I bother?” rather than insulting others who are more mature, honest and willing to grow. Truthfully, people with the attitude you have are afraid of admitting to mistakes and often the severity and amount of those mistakes. Does it really make you feel better? I doubt that very much.

Hint: humans learn from mistakes. So either you’re not human or you don’t learn often.

Reply

569 defenestratexp October 31, 2012 at 9:04 pm

No offense, woot, but, if you’ve been working in I.T. for 10 years and never keyed in any mistakes, you’re either lying or not doing anything complex.

#!/bin/bash
grep woot /var/cache/competent_techs.dat >> /var/cache/lying_techs.dat
sed s/woot//g /var/cache/competent_techs.dat > /tmp/temp_list
mv /tmp/temp_list /var/cache/competent_techs.dat

Reply

570 defenestratexp October 31, 2012 at 9:06 pm

Damn! if you grep for Cody in the file on_the_warpath.txt, you will DEFINITELY get a hit. Well said, Cody.

Reply

571 Cody November 2, 2012 at 4:58 pm

Thanks :)

No one is perfect and that includes me. I’m far from perfect. But to insult someone for being human (and therefore prone to mistakes) is quite arrogant and hypocritical. I have made quite some bad mistakes (though most not command line but in source code). Bad mistakes also means basic mistakes and something I had never made even when starting (when you’d expect it to happen). But it happens. I remember being quite tired at the time but that was about it. Later on the problem was discovered and I spent a great deal of effort to track it down and did. What matters is how you respond to the problems you may run into or cause. It’s about responsibility, and always improving yourself (and if you can, others). He or she is doing the opposite.

woot reminds me of a bully. Nothing else. And anyone who has either experienced bullying first hand (as victim) or just has some insight to issues in the real world would most certainly know what that’s a sign of: very low self worth and esteem, embarrassed to admit to their own flaws and simply put people down to try to compensate for their own issues. It isn’t even authentic and they know it and everyone else knows it. The bully just is in denial. They may have issues that bother them but they are afraid to get help or whatever else. But in the end, it hurts everyone involved.

Anyway – I appreciate the kind words. :) Speaking of such – I love your little script up there!

Reply

572 10speed October 31, 2012 at 8:53 pm

@woot.

You must use windows

Reply

573 Russ Thompson November 10, 2012 at 4:35 am

Last big blooper was a simple restore of a test database — on the production server. I learned quickly how nice MySQL recovery works!

Anyone that can, color code your PS1 strings for different servers, or at least have test one color and prod another. It helps to be able to just glance at the screen and the prompt color tells (yells?) what box you are on. But then again — humanity reigns — see above.
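
A rough sketch of such a color-coded prompt for bash. The function name prompt_for_host and the convention that production hostnames start with “prod” are assumptions for the example, not anything standard:

```shell
# Hypothetical helper: choose a bash prompt string based on the hostname.
# Assumes production hosts are named prod*; adjust the pattern to your naming.
prompt_for_host() {
    case "$1" in
        prod*) printf '%s' '\[\e[1;37;41m\]\u@\h\[\e[0m\]:\w\$ ' ;;  # white on red
        *)     printf '%s' '\[\e[0;32m\]\u@\h\[\e[0m\]:\w\$ ' ;;     # plain green
    esac
}

# In ~/.bashrc (uname -n prints the node name, like hostname does):
PS1=$(prompt_for_host "$(uname -n)")
```

The \[ and \] markers tell bash that the color codes take up no width, which keeps line editing from getting confused.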

Fumble Fingers Thompson

Woot: reboot –now

Reply

574 ps November 20, 2012 at 1:43 am

my clm
pkill -fv theapp
I added -v to see which processes were killed, but pkill takes its options from pgrep, where -v inverts the match (as in grep), so I killed every process _except_ theapp

Reply

575 Akim November 20, 2012 at 4:37 pm

Good one… Must have been painful, if the server was in production…

Reply

576 DotMG November 20, 2012 at 6:44 pm

cd /home/$USER; rm -rf *

But /home/$USER didn’t exist. And the script was in /
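
A hedged sketch of the usual guard: only run rm if cd actually succeeded, and anchor the glob to the intended directory. The target path and the file important.txt are invented for the demonstration:

```shell
target="/home/no-such-user-$$"        # hypothetical path that does not exist
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch important.txt

# Unsafe form: "cd $target; rm -rf *" still runs rm in the current
# directory when cd fails. Safer: make rm depend on cd succeeding.
if cd "$target" 2>/dev/null; then
    rm -rf -- "${target:?}"/*         # :? aborts if target is empty or unset
else
    echo "refusing to delete: cannot cd to $target" >&2
fi

if [ -f "$scratch/important.txt" ]; then survived=yes; else survived=no; fi
echo "survived=$survived"
cd / && rm -rf "$scratch"
```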

Reply

577 Akim November 21, 2012 at 11:04 am

… well… should I say you deserved this one … That was my first thought…

And then, I thought that in the end, we all deserve what happens when we make mistakes :)

Reminds me of good old times with format C: /autotest … As efficient as rm -rf /

rm -rf * is quite bad in / for sure. But I wonder if chown -R or chmod -R is not worse…

With rm -rf, you know exactly what happens… You then just have to fix it. with chown or chmod… Hell… you can either decide to waste hours to try to fix it, or … just do an rm -rf and start again :D

Reply

578 Akim November 21, 2012 at 11:08 am

An easy one I made yesterday…

Connected via SSH, I made a script with:

iptables -F
iptables -P INPUT DROP
iptables -P OUTPUT DROP
iptables -P FORWARD DROP

then a full set of rules including rules allowing traffic on port 22 obviously.

I applied the script. Was working fine, except for one port. So I wanted to log iptables instead of drop. Made a new:

iptables -F
iptables -A INPUT -p tcp -m tcp --dport 22 -j LOG

and some more lines… Executed the script, and …. Well…. Had to go on vcenter to access the VM console and reset my rules…

Reply

579 Russ Hore November 26, 2012 at 4:24 pm

Being seen as the ‘Unix Expert’ meant anytime an admin wanted help they would call us and immediately log us in as root.
I wanted to see if a certain command (can’t remember which now) was installed in /bin so without looking typed
cd /bin
ls | grep thecommand
without looking at the screen.
Unfortunately the keyboard was not mapped properly and the “|” came out as a “>” overwriting grep

Reply

580 Carl Smith December 7, 2012 at 2:08 am

I once installed a newer version of Python, then removed the older one, which was the system default. Doh!

Reply

581 Creeture December 10, 2012 at 4:19 pm

You guys should really stop doing normal stuff as root all of the time. Force yourself to sudo, pfexec, su – whatever suits your fancy – *any* time you want to do anything that you think might require root privileges. It’s probably the single biggest piece of advice I can give to a youngling administrator.

Reply

582 George December 11, 2012 at 2:12 pm

a dangerous one is this one:
rm -rf /whatever
The problem is maybe you’ll hit enter after the / (maybe someone pushes your chair, or whatever). Obviously your system is dead after that.
my solution is to type:
rm /whatever
then use the left arrow, and insert the -rf missing part

Reply

583 Ron December 13, 2012 at 10:35 am

For everyone who has done the ‘rm -rf *’ thing … there is a nice trick you can add to all the servers you administer ……

One time only, issue the command touch /-i

This creates a file called ‘-i’ and when the * in an rm -rf * mistake is expanded, it includes the ‘-i’ file name … that gets passed to the ‘rm’ command which treats it as a command line option – and prompts for confirmation of the action.

The result is that you get a second chance :-)
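
The effect can be tried safely in a scratch directory. A sketch, with the caveat that it relies on rm treating the expanded “-i” as an option, and that it does nothing for commands such as rm -rf /some/dir or rm -rf /some/dir/* typed from elsewhere:

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch data1 data2
touch ./-i                     # the decoy file; "touch -- -i" works too

# The shell expands * before rm starts, so rm receives "-i" as an argument.
echo rm -rf *                  # shows the command line rm will actually see

# stdin is /dev/null here, so every confirmation prompt is answered "no".
rm -rf * </dev/null 2>/dev/null

if [ -e data1 ] && [ -e data2 ]; then survived=yes; else survived=no; fi
echo "survived=$survived"
cd / && rm -rf "$scratch"
```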

Reply

584 Rory Toma February 14, 2013 at 9:06 pm

My all-time favorite mistake:

I was on the main server that served NFS for about a dozen diskless workstations. I was adding a new one, and made a mistake, so I wanted to delete the etc directory, so I typed…

“rm -rf /etc”

instead of

“rm -rf etc”

Oops

Reply

585 Abhishek Bhardwaj February 24, 2013 at 10:22 am

Its not related to unix but I was taking a backup of .cbs file from cbex tool which is used in Clarify11.5 in production environment.
For taking backup you have to click on Export button but by mistake I clicked on Export/Purge button while it took the backup but it also deleted those files from production.
I told my local delivery team about it and then quickly put the backup files in production again.

Reply

586 Anonymous person March 26, 2013 at 2:10 pm

playing around with shell, uses dd
forget to use cd /mnt/
tries using dd instead of rm, for fun
forgets that sdb is where all projects are stored and the sdb1 is the old disk.

root@670box: / dd if=/dev/random of=/dev/sdb

Reply

587 John Smith March 26, 2013 at 11:39 pm

As a new Linux user it would be a big help if commenters could post some links to “safety scripts” and aliases. A script that intercepted a command like “rm -rf /*” and asked if I REALLY wanted to delete my entire filesystem would be nice.

I am learning to write scripts but shortcuts are always appreciated.

These kinds of scripts might also give some ideas to kernel programmers.

I think this is so true, I don’t want to cut my throat (Linux as straight razor) – tinyurl (dot) com/c9vwy2w

Reply

588 Cody October 21, 2013 at 12:00 am

Be careful with that kind of thing. Okay so you put in .bashrc the following :

alias rm='rm -i'

which then makes rm (once $HOME/.bashrc is sourced/read) interactive. Now what happens if you add a new user and/or forget that you’re a user on a new system (and have not set that up)? You assume that rm will protect you but instead you by accident remove however many files. By all means, get in the habit of adding the -i option every time you type rm but be very careful with what you put in aliases to save the time and effort (remember: yes, you save some time typing but does that matter when you have to fetch backups or worse not have backups, of important files – system files or not?). Relying on and trusting everything so easily is a dangerous game to play.

Reply

589 LoCoZeNoz ZUE March 29, 2013 at 6:22 pm

I wanted to delete all USB support on server….. and did it on my main personal laptop.
I still didnt fix em :(

Reply

590 Russ Thompson April 25, 2013 at 2:03 am

Wrong Box Story:

simple command: mysql <testDataBase.backup

But the production server, not the test server

ug!

Reply

591 Lingeswaran May 15, 2013 at 12:20 pm

Very useful post

Reply

592 salil kumar May 28, 2013 at 10:09 am

One of our cool thread server (running 4 guest ldoms) got rebooted by its own and when I was checking the server by mistake I issued command “last |reboot” instead of “last reboot” and server started rebooting again. I would have thought that it is because of faulty hardware or something else as server started rebooting.
On the top of this, I executed this command to another running Solaris box (however I logged on a test solaris server) and this server started rebooted. That is the time I realize that the command I executed was wrong..

Reply

593 Cody October 20, 2013 at 11:29 pm

Very wrong even. You sent the output of last to reboot, or put another way you piped last to reboot. Of course ‘last reboot’ does indeed work as would ‘last | grep reboot’. I would suggest you learn more about the pipe because it is incredibly powerful. A nice article explaining this and other features of the shell can be found by (if you have info installed):
info coreutils
then type:
/toolbox
then hit enter twice and read the information.

Of course, even if you already know this stuff it is easy enough to do and that you admit it is half the battle.

Reply

594 Pemuda Sunnah May 31, 2013 at 10:10 am

waw that was a nice information, i think linux better than windows! ya, because it’s open source.

Reply

595 Rod June 1, 2013 at 11:19 pm

#5 mistake is my #1, and is my vote for king! The key needs to be re-named the key… :)

Reply

596 Sawi June 14, 2013 at 4:03 pm

I wanted to move some files, but changed my mind before I completed typing the command. But I hit Return instead of ^C

[root@sawi-laptop pkgconfig]# mv *

There were two files in the directory. I ended up overwriting the second file with contents of first file!

(Try invoking ‘echo *’. This problem with wildcard expansion at shell level (which sometimes yields unpredictable behavior) is highlighted in “The Unix-Haters Handbook”.)
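
The expansion is easy to reproduce in a scratch directory. A small sketch with two made-up file names, also showing that mv -i would have asked first:

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo 'first'  > a.pc
echo 'second' > b.pc

# Prefixing echo shows the command after the shell has expanded the glob.
preview=$(echo mv *)
echo "$preview"                # source first, destination second

# -i makes mv ask before overwriting; with stdin at EOF the answer is "no".
mv -i * </dev/null 2>/dev/null
content=$(cat b.pc)
echo "b.pc still contains: $content"
cd / && rm -rf "$scratch"
```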

Reply

597 Cody October 20, 2013 at 11:53 pm

I would argue that it is more like “problem” – as in it is not a real problem. It is not unpredictable in the slightest. If you did the same thing and in the same conditions it would do the same thing, yes? Yes, it would. So how is that unpredictable? Same command, same conditions, same result. It isn’t unpredictable – it is exactly the opposite. Can you really expect it to do more? If you so need to use -i (yes, the unix hater’s handbook is old but some of their stuff is still ridiculous for that time frame. But much credit to them for allowing some co-creators of UNIX to write parts of it, especially when – I believe it was Dennis Ritchie – they got the better of the authors). Blaming a system for your own mistake is really just making an excuse for not being perfect (which, not being perfect, isn’t bad – you learn, you expand, you make mistakes – we all make mistakes) which is the way to not really learn (you learn through mistakes if you actually admit to them). No one is perfect and placing the blame elsewhere shows who is at fault more than if you were to laugh it off and make a mental note (and I don’t necessarily mean YOU personally but I do mean people in general) and learn from it. And yes, that goes for everyone myself included. I’ve more than once screwed myself over because of doing [something] while sleep deprived. However, I took the blame (as in I blamed myself as it was my fault) and I fixed the problems from backup (in one example it was an overwritten file).

Although they are not the same thing file globbing and regular expressions have rules and logic to them. That there are only two files (other than the normal ‘.’ and ‘..’ – which be thankful * doesn’t trigger) is not the fault of the shell and as a user you make a choice: functionality or baby sitting. Any one who is using mv (or cp or any thing like that) on * without first checking (or knowing) which files it will act on is asking for trouble and it isn’t the system’s fault. The same could happen in other systems. Another choice of the user: whether you back up or not (and trusting an administrator to do all backups is not always the best plan although yes some times you don’t have a choice, depending on the environment). Unless it was a very new file then there’s really no excuse not for it to be in a backup (but that’s only one mistake).

(And I hope this did not sound as an attack or anything. I’m just making some points that everyone could stand to think about from time to time, because the world isn’t perfect but accepting that and improving where you can is so much better than blaming something or someone else. If it did feel like an attack I didn’t mean it to be and I’m sorry in advance there).

Reply

598 Holmer Simpson June 19, 2013 at 3:13 pm

Unfortunately I once typed “dd if=/dev/urandom of=/dev/hda” when I meant to type “echo ‘Hello World'”.
Doh!

Reply

599 Sebb767 July 20, 2013 at 12:22 pm

I once set up a Windows-Box via VirtualBox on my Server so I could use Visual Studio on my netbook. When the VM was running, I’d tried to connect to it via remote desktop. I installed 3 remote desktop applications on my server until I realized that I was using my server’s shell …

Reply

600 Patanjali Hardikar August 20, 2013 at 9:52 pm

This is the most number of times i have said ‘f*ck’ on a single web page.
Amazing list of mistakes. Truly scared me.

Reply

601 TiTex September 4, 2013 at 7:24 am

what about
--preserve-root and --no-preserve-root options for rm , chgrp , chmod , chown ?
that should save us some headache

root@ubuntu:~# rm -rf / tmp/junk
rm: it is dangerous to operate recursively on `/’
rm: use --no-preserve-root to override this failsafe

Reply

602 Cody September 7, 2013 at 3:50 am

Maybe if they added it as default to the others (rm aside) but definitely not if you make it an alias and you ever expect to manage more than one system (even for a moment). Of course, any administrator who is doing such a thing to begin with (as in recursively running any command on / which ends up in a disaster or actually any command in any circumstance that is risky) would do well to learn of such things and frankly an administrator running a command they are not too familiar with is asking for trouble (“oops, I forgot I was root… ” Yes and if you just used sudo or log out once you’re done with root’s task you would not have that problem of forgetting. Besides: ever hear of whoami ?).
But regardless there’s a fine line between baby sitting (which I would argue that adding those options as default is exactly baby sitting) and being helpful though (the opposite of baby sitting). Also by making it “easier” or adding more warnings you are in fact making people think less (a dangerous thing) and so what happens when they come across a different system without this setup (or alias) ? All it takes is one bad movement with e.g., a fat hand or finger (or not – just being careless is enough!) or even (for homes) a cat jumping on the keyboard at the wrong time (or like I did in the days of ctrlaltdel being in /etc/inittab : having two keyboards around attached to two different systems with only one monitor and hitting ctrl-alt-del on the wrong keyboard… stupid as it may be it happens). But at that point: the mistake is made, they have no idea WHAT they did or HOW it happened and they don’t even know what to do about it (they might not even know the problem exists until later depending on what the mistake was). One can argue that they should be protected from mistakes but the truth is no one is perfect and you can learn from your mistakes if you are responsible about the mistakes (or responsible in general). This isn’t Windows though and whether some think it’s too risky, too difficult or anything in between is their problem. One must learn to not (as I wrote somewhere else in this thread) be so trusting and actually be sure you know what the command you are typing is doing and also be aware of your environment (and I mean that in the sense of permissions/access and I mean it in the physical / spatial sense).

Reply

603 Daniel September 27, 2013 at 2:03 am

I think my worst one of all time: I was experimenting with encrypted partitions, and wanted to initialize a blank partition with random data. I meant to run

dd if=/dev/urandom of=/dev/sda6
instead
dd if=/dev/urandom of=/dev/sda

(as root of course).

Ctrl-C was not fast enough… No current backups. At least I thought to dump in-memory partition tables before rebooting. And so began my research into data recovery and digital forensics.

Reply

604 Jason S November 5, 2013 at 7:32 pm

I was clearing an old project off of my Ubuntu web server…

> sudo su
# cd /var/www/projects/oldprojects

Oops. That’s the directory I wanted to delete. I should probably just move up a directory and issue:

# rm oldprojects/ -R

Instead, against my better judgement, I decided to not waste any time with an extra cd command, and issued the following command:

# rm / -R

Oh, dear God! Did I forget the period!?
CTRL+C! CTRL+C! CTRL+C!
One rebuilt web server later…

Reply

605 Jerry November 6, 2013 at 8:55 pm

I had never used Unix but I was hired on contract to head up a small programming team for an insurance company. They had a SUN Solaris system that was totally overloaded so my first task was to purchase/install a new system. This was in the early 1990s, so Unix had not yet acquired many of its most helpful commands and options.

I was working 20+ hours a day trying to understand that strange nest of daemons called Unix, configure this new hardware so I could ethernet it to the old (for work space) and install and configure SAS. A little after 3AM about 4 days after I had started I had everything done and tested. Now, all that remained was to get rid of all the work and trash files so I could have clean file systems, etc.

So, from the root I did the only easy thing “rm *”.

I spent the next two days restoring everything, after reformatting the box.

Reply

606 Keith Hinton November 8, 2013 at 6:02 pm

Hi,
I wanted to introduce myself before telling you about a mistake that is hilarious now, though it was not funny when it happened.
I’m a totally blind computer user, using a screen reader to navigate the web, and other things.
All IT-related things have been learned through experience, with no formal college education.
I’m now the person in my family who gets asked for any technical advice.
I’m regarded as the geek, and enjoy working on boxes over SSH.
Since I don’t actually have a job, I spend a lot of time working on remote Linux boxes and doing other geeky things.

Now, for the not so hilarious mistake.
If you think blind folks cannot do social engineering, I hope you’ll reconsider!
In this case, the computer ran both Windows XP (as my main production system) and Gentoo Linux at the time as a dual-boot.
I’d somehow screwed up Windows, preventing it from booting.
So I started up Gentoo, logged onto IRC, joined the channel where a lot of my blind geeky friends were, and asked for help.
Please note that at the time, my NTFS partition was mounted read/write.
Well, one of my blind IRC friends said something like:
“Sure, I’ll help. Give me a user account with permission to get full admin access.”
Then he submitted user ID and password, and instructed me to add him to the wheel group.
I’d installed the superadduser package, (I think it was) that gave me a wizard interface of some type.
So I added Mo (was his nickname on IRC) to my system, under the user account he’d asked me to setup.
As I hope you can imagine, I added him to the wheel group.
I had no idea what wheel would let him do, and while he was at work I had to go take a bath, so I was out of the room.
I came back to my Linux box, to find that it was no longer speaking at all.
I thought it had locked up.
So I powered the machine off (from a blind user’s perspective it was no longer accessible to me at all at this point; I will explain more in a second) and tried to get back into Windows.
The result was I had to reinstall Windows 98 with sighted assistance from my brother, and then upgrade again to Windows XP.
I didn’t try the complex process of Gentoo reinstalls, as I wasn’t the one back in 2005 who did it, a friend of mine actually did it over SSH for me at the time.
What I later learned was that Mo had executed a command that I had never heard of at the time, the command being:
rm -rf /
PLEASE DO NOT GO TYPING THAT IF YOU VALUE YOUR SYSTEM!
I now know what that did: it basically wiped everything (including my NTFS Windows volume), and also wiped the software speech components my Linux box used to speak through the sound card using the Speakup Linux screen reader for the text console.
That is why I thought the system had “locked”. Please remember that I work with speech or Braille (usually speech, since a Braille display is very expensive, $2000 plus, and not something I have available at the moment), so speech is the next best thing.
But without either of those (especially since a monitor is pointless), you might think a box had locked up.
How’s that for a mistake? And not one I intended to have happen.
I got used though, and Mo did that on purpose knowing full well what it would do.
That was his form of “help”.

Let me know what you think!

Reply

607 Superkikim November 9, 2013 at 7:21 am

Well, Keith, your mistake is not quite a command line mistake, but a naive (no offense intended) guy-in-need-of-help mistake…

I would never trust a guy from an IRC channel to help.

I’m not blind. If I needed help, let an unknown person onto my computer, and then went away while he tampered with my system, I sure would be making a mistake as well ;-)

The moral of the story is: Never trust anyone but close experienced friends/relatives or professionals when you need help :) Or google it and do it yourself, which might, in that case, be a bit harder while being blind. Or at least, it might take more time than if you had sight, I guess :)

I’m curious: how old are you and are you using Linux again now and then ? Your story must be quite old as you speak about Windows 98 and upgrading to XP.

Reply

608 Cody November 9, 2013 at 6:36 pm

As Superkikim points out, your mistake was you were too trusting. Trust is given all too easily then and still to this day. Anyone (this is a rhetorical question) remember hosts.equiv and its implications? To this day it is hard to imagine that it ever was a problem but oh was it a serious problem. Admins would argue that “I don’t care if the attacker has root in their own machine as long as it is not mine” while at the same time they allowed the r* commands and especially ANY host as long as you have the same login NAME. Well what do you think root can do? Exactly, create a user so they basically have control of at least one account which means they are one step closer to root access. And even when it was host-restricted you have to be realistic: although IP spoofing is a blind attack the thing to remember is IP spoofing is only part of the exploit (trust-relationship exploitation). Sad (albeit also funny) also is remote mounts being abused, too. And as I recall (might be remembering this part wrong although the exploit I know did happen) hosts.equiv was also used in order to hide the presence of a login (meaning ‘who’ for example, would not show that someone is logged in from some host – completely hidden so that they could work on the next step).

As for what I think about your comment on social engineering: I don’t believe for a second that that was social engineering. No, he didn’t have to do anything at all other than say he would help and do this so I can help. That’s akin to a coward mugging an elderly by saying “You look as if you need help getting that stuff in to your vehicle…” and then the elderly person agreeing followed by being robbed. While some might argue that this is semantics the truth of the matter is he didn’t do any engineering other than saying he’ll help (typically social engineering is to trick the user into giving the attacker information or giving them something that is of value based on who they are. Classic example: convincing people over the phone that they are in need of a login/password as they need it to fix a problem… yeah, what problem was it again? One the social engineering created from thin air in order to get into the system. That’s the only problem in that case and now the attacker is ahead of the unsuspecting victim).

Reply

609 Keith Hinton November 10, 2013 at 11:49 am

I’m 26.
As for it being social engineering or not, it was at least from the fact that this is what he intended to do the entire time once an account was created. He wanted to see if I was stupid. Or so he said. Would I just happily not do Internet research and setup an account, and do what I was told to do?
These days, no way.
In 2005?
Probably.
Although I have done my own command line mistakes, ticking off a blind friend who had helped me install Gentoo over SSH (and I was on a cruddy connection back then ) with a five second delay for terminal text to appear and be spoken by his speech system at the time.
He had been monitoring me, and I had gone to run an emerge --remove package name; at least that’s what I think it was back then.
As part of the output generated, it said “empty /usr/lib”
And I thought it wanted me to actually do it.
So without even doing Internet research, man, or anything else, I happily ran:
rm -rf /usr/lib as root.
Then my friend sent me an ssh message to my terminal somehow telling me that he demanded to see me on Skype, and he was pissed off at the time (since he actually had to reinstall the filesystem).
I then guessed Linux, or any other Unix system obviously couldn’t run well without /usr/lib.
If that particular mistake isn’t a command line one, well then let’s find one that is.

Reply

610 Keith Hinton November 10, 2013 at 11:53 am

I’d actually be curious to know if any of you have run rm -rf /usr/lib before.
Another time I did something similar to what one of you described in this thread, putting a space between the / and what followed it after rm -rf as root, thus hosing the system. I’ve even done that on a dedicated production server over SSH.
The result was I had to open a support ticket requesting that the CentOS system be installed again.
In my experience, once you execute a command like that or the rm -rf /usr/lib example in a previous comment that I referred to, you might as well not bother with control-c.

Reply

611 Cody November 11, 2013 at 2:22 am

No, I’ve not. But since many executables are dynamically linked, naturally removing libs will cause serious issues. That said: you can salvage data from mistakes, but whether or not you can depends on how long the command ran and what command it was. E.g., recursively using chown or chmod is a bad idea too if you are not 100% sure you know what you are doing (and that you are not at all going to hit parent directories, which, with a recursive command, means you are hosing your system). But I’ve had a friend who accidentally ran chown -R ../ instead of what he meant to do (cannot recall what it was exactly and I’m too distracted to bother looking at the chat log about that one). He recovered from it and it mostly just caused about an hour of extra work (because he was quick to notice and react).

As for rm and whether you should bother hitting ctrl-c (note my friend did exactly that: hit ctrl-c and had an extra hour of work which wasn’t too bad but if he had let it run its course then it would have been bad. and he had to be root for the task because chown requires that — is a security risk for non-privileged users to be able to change a file owner) it really depends as well:
First, consider if the admin (my friend does backup but many do not) is naive/something else not so nice. If they do not backup (shame on them but that won’t change anything, will it ?) then depending on the command and where they started they might be able to salvage their user data (or more) that they foolishly did not have backed up. It all depends. Bottom line is twofold (besides backing up but since many don’t do that let’s forget about that):
1. You can salvage messed up systems. Sure, some times it might be easier to start over but that is not always the case and it depends on WHO is doing that (read: their experience and expertise).
2. You (and this is something that is also ignored and I just don’t get it) need to be careful that you don’t just sit at root “in case you might need it” — really, how hard is it to use sudo or su ? An extra command versus an hour, hours or days of frustration, potential lost data, even a potential job loss (as in the company fires you).

And no, it was not social engineering. You can argue that it is all you want but then general pranks (no matter how mean or not) are social engineering. You fell for a nasty prank but he didn’t put any work in getting information out of you. He said he’d help and here’s what to do (what about people that do that and are sincere and actually do help ?). The fact he was being nasty and disguised it as help doesn’t really mean he social engineered you. Better said, he didn’t disguise himself as (for example) tech support or something like that. He didn’t go to you – you went to him. That last line is the main point. A classic example of a social engineer is Kevin Mitnick.

Reply

612 Cody May 8, 2014 at 2:17 pm

As for rm -rf on /usr/lib or its 64 counterpart, there is one other thing that can help you salvage it.

What is it? busybox or if available sash (stand alone shell, I think it stands for). Here’s a brief example. The first command shows how it is able to work when libraries are gone (of course the entire tree being gone may be a lot harder to fix but updates going afoul can cause problems with certain libraries and so by having the ability to fix them is very handy):
$ /usr/bin/ldd /sbin/busybox
not a dynamic executable
Now, while this is only one command it can do, observe (deliberately leaving out any parameters as I’m merely showing the use):
$ /sbin/busybox cp
BusyBox v1.15.1 (2013-11-23 12:50:41 UTC) multi-call binary

Usage: cp [OPTIONS] SOURCE DEST

So any command it has support for (which is quite a few) it can run even if a library is gone, corrupted or something else). A friend who reminded me of this (the other day) – I knew of busybox but never really used it – made use of it when an update (via yum, most likely) corrupted his libc (which itself would be ugly if he did not have something like busybox, very ugly indeed). Also, on this note: yes, you should bother interrupting the command, because as you can see, you do have a chance to salvage it! I learned something that day but knowing that friend it is not that surprising he was able to do so (I consider him a mentor in many respects, computers and life).

Reply

613 Tom December 16, 2013 at 9:31 pm

Upgraded Debian 2.0 production server to 2.2 skipping one major version in between…

The upgrade went fairly well, until some process crashed, because of missing libraries or having wrong version or something. It is particularly nice to realize that if I reboot this machine I am totally fucked. Tried to download missing libraries with wget, couldn’t because of the missing libs. Almost every executable crashed because of the missing libs. Finally managed to use lynx to retrieve working version of libstdc.. and bit by bit I got it upgraded. Learned a lesson though…

Also, as a real novice I was asked by a user to install a software he wanted to the freebsd… I didn’t know how the freebsd worked so I managed to reinstall the whole system base, lost all the user accounts and so on. Finalized my own grave by not realizing that the FreeBSD was wise enough to keep copies of the passwd and group file.

Reply

614 PRADEEP December 26, 2013 at 12:32 pm

I logged in to the server over SSH as root from a machine belonging to one of my DC team for troubleshooting. After finishing the work I forgot to log off. Some time later I checked over SSH from my local machine whether that session had been closed, and saw that it was still alive. So I simply used
“pkill -KILL -u root”
and then realised what I have done. It took half an hour to get the system back to on-line.

Reply

615 Daniel December 30, 2013 at 10:13 pm

Haven’t done this, but it’s still good.

:(){ :|:&};:

This is a fork bomb

Reply

616 Pooya January 24, 2014 at 12:10 pm

destroyed my HDD with “dd” more than 3 times :D

Reply

617 SIGLAZY February 4, 2014 at 3:40 pm

Killing a stalled backup process.
PID was 15932, but I accidentally hit the spacebar, so I ended up running

kill -9 1 5932

System was not happy when I killed init
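
A small guard along these lines would probably have caught it. This is only a sketch: safe_kill is a made-up wrapper name, not a standard tool, and all it does is refuse to send anything when one of the arguments is not a plain number or is PID 1.

```shell
#!/bin/sh
# safe_kill (hypothetical wrapper): usage is  safe_kill SIGNAL PID...
# Every PID is checked before anything is sent, so a stray space that
# turns one PID into "1" plus another number rejects the whole command.
safe_kill() {
    if [ "$#" -lt 2 ]; then
        echo "usage: safe_kill SIGNAL PID..." >&2
        return 2
    fi
    sig=$1
    shift
    for pid in "$@"; do
        case $pid in
            ''|*[!0-9]*)
                echo "safe_kill: '$pid' is not a PID" >&2
                return 2 ;;
            1)
                echo "safe_kill: refusing to signal PID 1 (init)" >&2
                return 1 ;;
        esac
    done
    # Only reached when every argument looked sane.
    kill -s "$sig" "$@"
}
```

With that in place, safe_kill KILL 1 5932 stops with an error instead of signalling init, while safe_kill KILL 15932 behaves like the plain kill.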

Reply

618 jeremiahfelt February 4, 2014 at 5:14 pm

Not so much a fat-finger at the keyboard (though I have had plenty of those- including wiping out an entire copy of the production dataset that took 3 days to replicate… guess I’m waiting 3 more days!), but my most fun disaster occurred when de-racking a server.

Boss and I were decabling this box for removal. He’d disconnected everything, and I was in the front to pull the thing forward. Keep in mind, this was a 2U box in 5 & 6 (very near the bottom). He said ‘OK!’ and so I went to yank it out. It stopped, bounced, and wouldn’t come out further. He said ‘Oh, just a cable snag’, and out it came.

The iLO cable, which had not been disconnected, wrapped itself around the isolate switch on the rack UPS in slots 1, 2, 3, and 4. So now we have this server out, but something is beeping. “What do we have in here that beeps?” Took out four domain controllers, the head-end Exchange, and a few other things. Turned out that some knucklehead had plugged both PDUs into the UPS we’d just shut off.

Whoops.

Доверяй, но проверяй  (Doveryai, no proveryai) – trust, but verify. :-)

Reply

619 RichG February 6, 2014 at 2:49 am

shutdown -h
on a remote server where I thought -h would get me _help_, not _halt_

Reply

620 NIX Craft February 6, 2014 at 4:48 am

Heh that was funny…

Reply

621 Moataz Elmasry February 10, 2014 at 12:47 pm

Did the following in a script

[ -d ${MY_DIR} ] && rm -rf ${MY_DIR}/*

${MY_DIR} was not defined at all, and it turns out that [ -d ] returns true, so the command
rm -rf /*
was executed
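
For anyone who wants to see this for themselves, here is a minimal sketch in POSIX sh. MY_DIR is the variable from the script above; the helper name clean_dir is made up for illustration. With a single argument, [ only checks that the string is non-empty, which is why the unquoted test passes. Quoting the variable and using the ${var:?} expansion makes the shell stop instead of expanding the path to /*.

```shell
#!/bin/sh
# With MY_DIR unset and unquoted, the test collapses to `[ -d ]`.
# A one-argument test only asks "is this string non-empty?", so it succeeds.
unset MY_DIR
if [ -d ${MY_DIR} ]; then
    echo "unquoted test passed even though MY_DIR is unset"
fi

# Quoted, the test becomes `[ -d "" ]`, which fails as one would expect.
if [ -d "${MY_DIR}" ]; then
    echo "quoted test passed"
else
    echo "quoted test failed, nothing would be deleted"
fi

# clean_dir is a hypothetical helper. ${1:?} aborts with an error when the
# argument is empty or missing, so the path can never expand to "/*".
# The body runs in a subshell so the abort does not end the calling shell.
clean_dir() (
    dir=${1:?directory argument is empty}
    [ -d "$dir" ] && rm -rf -- "${dir:?}"/*
)
```

Called as clean_dir "$MY_DIR" with MY_DIR unset, the helper returns an error and removes nothing.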

Reply

622 Craig Taylor April 23, 2014 at 3:57 pm

I didn’t see this in the original article nor in any of the comments so I thought I’d share this one.
A lot of times we copy and paste commands into a terminal window. I’ve had cases where I’ve attempted to copy something but something messed up with the copy so I’ve pasted other text. Usually this isn’t that big of a deal but one person on my team pasted a bunch of logs into the terminal and the logs had a crap load of >> symbols in it. The application we had running on that host got completely messed up and had to get reinstalled. Hard to say how to prevent this except if you have any doubts what’s in your clipboard paste it somewhere else first.

Reply

623 Michael Shigorin April 24, 2014 at 7:59 am

I’d select something known and harmless first when in doubt, and then go and select what I need to get back and paste it immediately so as not to allow extra “interrupts” in between.

Context switching is what ruins attention.

Reply

624 Cody April 24, 2014 at 2:01 pm

“Context switching is what ruins attention.”

Unless you’re an operating system! That’s a good thing though! Can you imagine how awful it would be to have an OS not able to do that? It’d be as bad as DOS only no TSRs either which is well, DOS is not very helpful even with TSRs (unless it’s for emulation of some old demo, some old game or some such). Indeed though, any interruption – be it taking a break or the bloody phone rings (possibly for the umpteenth time in the day due to bloody telemarketers) – can cause serious issues if not addressed properly. In general, yes, you should either check what’s in your ‘clip board’ or copy and paste right then and there. Of course, if you’re in vi or vim then notwithstanding copying into vim that is in a terminal session (under X, say) then you’ll need to yank (or alternatively delete which does the same but first removes the text) and then paste it then if you want to keep it (and be 100% sure of that). No idea about Escape Meta Alt Control Shift. Sorry, was that actually what I wrote? I meant emacs of course! What was I thinking? (Jokes aside, to any emacs users: I have no intent to offend let alone try to stop others from using what works for them. That said, I have to admit the Church of Emacs and its followup Cult of Vi is a very clever, very fun, very funny and a much better approach to X wars where X is OS, programming languages, text editors, shells, clients of certain services, CPUs and whatever else people might like to argue over).

Reply

625 Michael Shigorin April 24, 2014 at 2:29 pm

>> Context switching is what ruins attention.
> Unless you’re an operating system!
OS “attention” is ruined as in awful throughput, latency, or both…

Reply

626 Cody April 24, 2014 at 3:08 pm

Perhaps, perhaps not, depending on your definition. A properly programmed multi-threaded application does quite well. Improperly is another issue entirely. As for the OS itself, I can offer only two things: it does far better of a job than the human can, that much is for sure! If you’re the curious kind or you need further elaboration, try the experiment(s) in the man page for top, specifically this one (although the others probably are of interest too):
“7. STUPID TRICKS Sampler
Many of these ‘tricks’ work best when you give top a scheduling boost. So plan on starting him with a nice value of -10, assuming you’ve got the authority.

7a. Kernel Magic


I’d paste it all but it’s a bit long and I already pasted enough. You’ll note that with this you’ll see how many even cpu intensive tasks and non cpu intensive tasks all get their time and very quickly.

That aside: I’m not sure you can blame hardware bottlenecks on the OS, really… any more than you can the other way around. At least not, rationally, especially for a system that is efficient otherwise.

Reply

627 kajoj April 28, 2014 at 5:13 pm

I always wondered why commands: cmp and cp are so similar. Anyway, after trying to make comparison:
cp file1 file2
they will always be the same
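
To spell it out with a throwaway sketch (the scratch directory and file contents are made up for the demo): cmp -s reports only through its exit status, while cp silently overwrites the second file, so any comparison afterwards can only say "same".

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
demo=$(mktemp -d)
printf 'first version\n'  > "$demo/file1"
printf 'second version\n' > "$demo/file2"

# What was intended: cmp -s is silent and exits 0 only if the files match.
if cmp -s "$demo/file1" "$demo/file2"; then
    before=same
else
    before=different
fi

# What was typed: cp overwrites file2 with the contents of file1.
cp "$demo/file1" "$demo/file2"

if cmp -s "$demo/file1" "$demo/file2"; then
    after=same
else
    after=different
fi

echo "before cp: $before, after cp: $after"
rm -rf -- "$demo"
```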

Reply

628 Cody April 28, 2014 at 6:15 pm

Why? Simple: cp is a core utility and without it manipulating files (like, say, copying them) would be very difficult, to say the least (ignore dd, which I’ll be mentioning anyway.. besides, this defeats my point which is the real point to consider). cmp is for comparisons and similar is diff. As a programmer I use diff far more often than cmp (which is to suggest: very often to very rarely if ever; I much rather know the differences than where they differ, since I usually deal with source code of some form or another and even with text files[1] it is much more useful if you know how to read the output of diff). And of course cp file1 file2 will be the same with regards to cmp or otherwise: cp is copying file1 to file2. You could suggest the same for dd if=file1 of=file2 as well. Of course, this is not considering hard links (or cp’s options related to links) and neither is it considering file permissions (since the main point is to copy the contents from one inode to another), at least in the sense of two files with every single attribute as the same (but yes I know that’s beside the point). But bottom line is this: the beauty of Unix (and therefore Linux based OS’s) is the ‘do one thing well’ philosophy (combined with the pipe) allowing for a very flexible, very powerful system with many possibilities. What you notice is not even the worst (I don’t see it as a problem at all, actually, any more than diff and diff3 being ‘similar’). What is a problem – but is inevitable and again this is another beauty of the environment … allowing different developers, different packages with similar but not necessarily equivalent functionality[2] – is when two packages (e.g., in a binary based distro) have the same file name (path included) so that only one can be installed at a time. I’ve seen this when it is quite a bit different capability, too.
This just happens and there is no real reason to cp and cmp being similar in name (if that’s how you view it) other than cp was created for one purpose, cmp is by another and both names are decent for their uses.
[1]Let’s be honest: there is a reason for the old error ‘Text file busy’ and that happening with ‘programs’ – a simple text file can be a program. Even then though, if you want to see more detail between two files, diff is more useful.
[2]Examples: Apache mod_nss and mod_ssl – both offering TLS support for Apache. But this is only good: first, libraries are more likely to be named more carefully. second, some might prefer one over the other. Another really good example (though this not library but utility/service/whatever): sendmail versus postfix versus qmail.

Oh, okay, I cannot resist: there’s cases where cp file1 file2 will not result in what you expect. One cause (actually two): disk space and quotas. I imagine there are others too (not counting cp being aliased to something like ‘cp -n’ and file2 already exists or ‘cp -u’ and file2 is newer than file1, because those are ridiculous and I’m only mentioning them because I’m really bored). (-n is noclobber and -u is of course update and yes when I wrote ‘really bored’ I meant it in a way that suggests I’m deliberately coming up with ways – whether semantics or not – where it could be incorrect… I already answered the real thing you wondered so that’s my excuse).

And to be fair to you… yes, similar commands can cause a problem for those who make typos a lot, as an example (which is not really a good idea at the command prompt; everywhere else, well, fine). You can also argue this for > and >> (if that came out right: overwrite and append to file… and yes, as I’ve written here before, both are valid, both are very useful even though some are too afraid to learn how to make full use of them… which ultimately means that they will not be as proficient as they could be).

Reply

629 Cody May 8, 2014 at 3:08 pm

I’m not sure I ever submitted any mistakes… But here is one that I recently did do. It isn’t exactly a command prompt mistake but it is a BIND configuration issue and so is close enough (system administration mistake).

I got reverse delegation of my IP block and I was setting up named.conf to refer to the .in-addr.arpa zone. Well, while the zone file itself was fine on first go, I unfortunately made a typo: I ended the zone with -in-addr.arpa (- instead .) in named.conf. This means the zone was named one thing (-in-addr.arpa is not correct and it is what the zone was named due to the typo) while the zone file was expected to be something else (correctly named). This led to two or three days of wondering why the …. I was getting out of zone errors (and other errors) when doing rndc reload (as well as queries to my server), why when I did the same changes on an internal network (CNAME Delegation is the way my ISP does it) and then doing the exact same thing with my public block, I still got errors. Further frustration was when I did the same thing with a ‘public’ (but not static) IP in an internal view (= a BIND view that only answers for hosts given in the acl list specified) it worked fine. Well, there were two things that happened, the first being the one I already mentioned (single character typo). The second was that combined with the fact I have really poor vision to begin with (even after multiple eye surgeries earlier in my life and even with glasses) and the fact I had been awake (not deliberately, but unable to sleep) since 1 or so in the morning (all of these days) and working on this problem from anywhere between 4 to 6 am, and then later into the day, and I was oblivious to why I was struggling with this (I have set up PTR records plenty in the past and never had this problem). I was going completely mad (and beyond what I normally am, which is kind of scary…) until I by chance noticed the typo. 
The problem is of course that BIND will happily name the zone that way where as in programming a single character typo will (if syntax error) generate a compiler error (which if you are experienced enough – I am, so I would have preferred that – you can easily enough decipher the cryptic errors that can occur). This is more like a typo that – while is a mistake – is also valid (somewhat like a spelling checker cannot help with the fact you have a valid word but in the context it is it is incorrect when reading it. e.g., “He wants to the store.” instead of “He went to the store.”).

Cannot truthfully state I made any of the mistakes mentioned in the top 10 but I have certainly made plenty of mistakes (and I only learn more) and I could easily see myself making one of those mistakes when tired, distracted or just simply being what is expected of humans. One I just remembered I made could actually be said to be at the command prompt. Symbolic links and chmod, even when run non-recursively, can be dangerous. Especially if the link happens to point to (for example) /dev/null (which is rather important to the system… as I recall, mknod won’t work without it though I might be remembering wrong). Note: It was a mistake in a cron job and it wasn’t actually trying to change the permissions of that link (and neither the file it refers to) but rather files in that directory – and only that directory – but unfortunately I forgot about the symlink (and actually it was a stupid thing to begin with really… the best I can think of why I did it is updates [e.g., of a CMS] creating files with permissions that I didn’t like [not strict enough]).. I had a link to /dev/null for something else. I was able to fix the mistake without any trouble though. But lesson learned: if you need a null like device for anything, just check the major/minor of /dev/null and use mknod with those properties at a different location (wherever the link is). More specifically: be careful with symbolic links and what they refer to.
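
A rough way to reproduce the symlink trap safely, assuming a scratch directory in place of the real web directory and /dev/null (every path and file name below is invented for the demo). A plain chmod on a glob follows the link and changes the file it points to; letting find pick regular files only is one possible way around it. The -maxdepth option is not in POSIX, though GNU and BSD find both accept it.

```shell
#!/bin/sh
# Scratch layout: "site" holds a normal file plus a symlink that points
# at a file living somewhere else entirely.
demo=$(mktemp -d)
mkdir "$demo/site" "$demo/elsewhere"
touch "$demo/site/page.html" "$demo/elsewhere/target"
chmod 666 "$demo/elsewhere/target"
ln -s "$demo/elsewhere/target" "$demo/site/link"

# Non-recursive chmod on "everything in this directory": the symlink is
# followed, so the file outside the directory is changed as well.
chmod 600 "$demo/site"/*
mode_after_glob=$(ls -l "$demo/elsewhere/target" | cut -c1-10)

# Reset, then let find select regular files only; the link is skipped.
chmod 666 "$demo/elsewhere/target"
find "$demo/site" -maxdepth 1 -type f -exec chmod 600 {} +
mode_after_find=$(ls -l "$demo/elsewhere/target" | cut -c1-10)

echo "target after chmod on glob: $mode_after_glob"
echo "target after find -type f:  $mode_after_find"
rm -rf -- "$demo"
```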

As for the rsync suggestion, I disagree. It is quite useful on many occasions, if used properly. But here’s another way it is relevant: mirroring one site to another (that are in different parts of the world), be it over ssh or otherwise. It is very useful there and can offer many speed ups. As for dump being the only safe option, I don’t really agree that there is any 100% way. Certainly some utilities are going to do better than others and certainly some will do better with some types of data than other utilities will, but there’s no 100% fool proof solution (related to different file types is that compressing certain file types is not too useful [sure there can be some small decreases in size when – example – compressing an mp3, but there also exists the chance of an increase in size due to extra data in e.g., an archive]. Example: if you compress an mp3 file and a text file, both of the same size, with the same compressing util, and same options, the text file should win). I write should because I suppose there are exceptions to the rule (as always) but a text file is not compressed whereas an mp3 file is compressed (at least if it truly is an mp3 file with normal mp3 bitrates/etc.). Also, text is much more simple to process (and understandably so: it represents less).

630 TomD July 4, 2014 at 11:01 am

I was showing a colleague how to use the date command to output date strings for times other than now, e.g. “date 01010101”. I’ve done this many times before on Linux boxen, knowing that you need to use “date -s” to set the date. Only this was a Solaris 6 box, I was root, and “date 01010101” actually set the date!

This was the active server in a bank’s primary Oracle database cluster, and needless to say it wasn’t happy at suddenly going back in time and glorked itself!

Don’t use root as your everyday shell, kids!
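
For showing someone how a timestamp formats, a display-only form is safer than a bare date argument. The sketch below assumes GNU date: the -d option is a GNU extension and is not available in the stock Solaris date. The helper name show_time is hypothetical. Note that GNU date also treats a bare MMDDhhmm argument as a request to set the clock, so running such demos from a non-root shell remains the real safeguard.

```shell
#!/bin/sh
# Hypothetical helper: format an arbitrary timestamp without ever
# touching the system clock. Assumes GNU date (-d is a GNU extension).
show_time() {
    # -d only parses and prints the given time; -u keeps both the
    # parsing and the output in UTC so the result is reproducible
    date -u -d "$1" '+%Y-%m-%d %H:%M'
}

show_time '2001-01-01 01:01'   # prints 2001-01-01 01:01
```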

631 Cody August 20, 2014 at 1:16 pm

Ah yes… I forgot about that difference (been years since I’ve used Solaris). But yes, the real error was being root simply to show someone that kind of thing (or anything that does not require root!). And yes, time is absolutely critical…

632 Sergey August 9, 2014 at 11:00 pm

Was testing Exim configuration on a staging environment by looking at configs on a live environment, and having finished, typed:

# dpkg -P exim4

Within 2 seconds I realised I had done the above on the live server. I think I lost like 5 years of my life that day…

633 Matthew September 4, 2014 at 11:50 am

I didn’t do this, it was another admin managing the server with me. We had just migrated from one server to another and were reinstalling applications.

He was having an issue with file permissions on his user so he decided

# sudo chown -R $USER:$USER /usr/bin

With /usr/bin/sudo now owned by his user nobody could elevate themselves to superuser permissions. And without a root password set, no further administration could be done on that server.

This was two days after migration. 4-6 hours to rebuild the virtual server and re-migrate all of our data.

634 Superkikim October 22, 2014 at 4:25 am

He could have booted a rescue ISO and changed the permissions back…

635 Miguel September 9, 2014 at 5:07 pm

Before I knew Linux, working with MS-DOS 6.1:

fdisk
(selecting 1 harddisk)
(delete partition)
(yeah, I’m sure)
o.O Wrong hard disk

Thank god that some weeks earlier I had been playing with Norton Utilities and had made a backup of the partition table.

636 Cody October 22, 2014 at 4:59 pm

Just for future reference:

If you don’t actually format a partition after deleting it, you can generally recover the entire partition as long as you can boot into a working environment (e.g., from a boot disk) or do it before you reboot. If you did format it, your chances are lower, although recovery is by no means impossible (with Windows file systems, I seem to recall, you have an easier time recovering; on the other hand, Windows file systems manage files differently and suffer much more fragmentation because of it). As for how you recover partitions: it has been too long for me to explain it properly (and I’m not going to try it just for this, even though I know I could work it out), but it IS possible, so look into it. I might add something else: as I seem to recall, this is possible whether or not the partition table has changed. That is, you don’t need a backup of the partition table to recover the partition (or its contents). (There are always exceptions to the rule, of course.) Keeping a backup of it is a good idea, however. Keeping backups, period (of the partition table and in general), is what too many people neglect!

637 raphidae October 20, 2014 at 2:38 pm

Getting rid of backup files with ‘rm *~’, but having a broken ~ key is not fun when in /etc ;)
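
One way to take the ~ key out of the equation is to wrap the cleanup in a small function, so the pattern is typed once, in a file, rather than at the prompt. This is only a sketch: the name remove_backups is made up, and -maxdepth is a GNU/BSD find extension.

```shell
#!/bin/sh
# Hypothetical helper: delete editor backup files (names ending in ~)
# in a single directory, leaving everything else alone.
remove_backups() {
    dir=${1:-.}
    # The pattern is quoted so the shell never expands it; find does
    # the matching and only regular files ending in ~ are removed.
    find "$dir" -maxdepth 1 -type f -name '*~' -exec rm -- {} +
}
```

Replacing -exec rm -- {} + with -print first gives a dry run that lists what would be deleted.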

638 SK October 23, 2014 at 8:04 pm

One day I went to move some files into quarantine:

mv something /quarantine

But… I added a slash:

mv something/ /quarantine

and something was a symlink to /bin…

So I moved all of /bin. Luckily it was a mv and not an rm -rf, so booting into rescue mode and moving it back solved the problem :/
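
A small wrapper can guard against the trailing-slash surprise described above by stripping the slash and refusing to move symlinks at all. This is a sketch under assumptions: the function name quarantine and its refuse-on-symlink policy are invented for illustration, and the behaviour of a bare mv with a trailing slash on a symlink varies between systems.

```shell
#!/bin/sh
# Hypothetical wrapper: move a file or directory into a quarantine
# location, but never follow a symlink to its target.
quarantine() {
    src=$1
    dest=$2
    # Strip trailing slashes so "link/" is examined as the link itself
    while [ "${src%/}" != "$src" ] && [ "$src" != "/" ]; do
        src=${src%/}
    done
    if [ -L "$src" ]; then
        echo "refusing to move symlink: $src -> $(readlink "$src")" >&2
        return 1
    fi
    mv -- "$src" "$dest"
}
```

With this, quarantine something/ /quarantine fails with a message when something is a symlink, instead of relocating the directory it points to.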
