Surviving a Linux Filesystem Failure

Posted on November 8, 2005 · 20 comments · Last updated November 15, 2007

When you use the term filesystem failure, you mean corrupted filesystem data structures (objects such as inodes, directories, the superblock, etc.). This can be caused by any one of the following reasons:

* Mistakes by the Linux/UNIX sysadmin
* Buggy device drivers or utilities (especially third-party utilities)
* Power outage due to UPS failure (very rare on production systems)
* Kernel bugs (that is why you don't run the latest kernel on a production Linux/UNIX system; most of the time you should use a stable kernel release)

    A filesystem failure can show up in several ways:

    • The file system refuses to mount
    • The entire system hangs
    • Even if the mount operation succeeds, users may notice strange behavior on the mounted filesystem, such as system reboots, gibberish characters in directory listings, etc.

    So how do you survive a filesystem failure? Most of the time fsck (a front end to filesystem-specific checkers such as e2fsck for ext2/ext3) can fix the problem. Simply run e2fsck to check a Linux ext2/ext3 file system. Assuming the /home filesystem on the /dev/sda3 partition for demo purposes, first unmount /dev/sda3 and then type the following command:
    # e2fsck -f /dev/sda3
    Where,

    • -f : Force checking even if the file system seems clean.
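The unmount-then-check sequence above can be wrapped in a small guard so that e2fsck is never started against a mounted filesystem. This is only a minimal sketch: the function names `is_mounted` and `safe_check` are made up for illustration, and it assumes a Linux system where the device appears in /proc/mounts under the same name you pass in.

```shell
#!/bin/sh
# Hypothetical helper: succeed if device $1 is listed in the mount table.
# $2 is the mount table to read (assumed default: /proc/mounts).
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    # The first field of each mount table line is the device name.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# Hypothetical wrapper: refuse to run e2fsck on a mounted device.
safe_check() {
    dev=$1
    if is_mounted "$dev"; then
        echo "$dev is still mounted; run 'umount $dev' first" >&2
        return 1
    fi
    e2fsck -f "$dev"
}

# Usage (as root): safe_check /dev/sda3
```

Note that a device mounted through another name (for example a device-mapper path) would not be caught by this simple string comparison.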

    Please note that if the superblock is not found, e2fsck will terminate with a fatal error. However, Linux maintains multiple redundant copies of the superblock in every ext2/ext3 file system, so you can use the -b {alternative-superblock} option to work around this problem. The location of the backup superblock depends on the filesystem's block size:

    • For filesystems with 1k blocksizes, a backup superblock can be found at block 8193
    • For filesystems with 2k blocksizes, at block 16384
    • For 4k blocksizes, at block 32768.
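The three values listed above follow from the default ext2/ext3 layout: a block group holds 8 × blocksize blocks (one bitmap block with one bit per block), and the first backup superblock sits at the start of the second group. The sketch below reproduces that arithmetic. It assumes the filesystem was created with the default blocks-per-group setting, and the function name `first_backup_superblock` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first backup superblock location
# for a given block size in bytes, assuming default blocks-per-group.
first_backup_superblock() {
    bs=$1
    case "$bs" in
        # With 1k blocks the first data block is block 1, not block 0.
        1024) first_data_block=1 ;;
        2048|4096) first_data_block=0 ;;
        *) echo "unsupported block size: $bs" >&2; return 1 ;;
    esac
    # Blocks per group = 8 * block size (bits in one bitmap block).
    echo $(( bs * 8 + first_data_block ))
}

# Usage: first_backup_superblock 4096
```

For a filesystem created with non-default options the real locations can differ, so treat the result as a first guess only.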

    Tip: you can also try either of the following commands to determine the alternative superblock locations (with -n, mke2fs only displays what it would do and does not create a filesystem):
    # mke2fs -n /dev/sda3
    OR
    # dumpe2fs /dev/sda3|grep -i superblock
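If you want just the block numbers rather than the full grep output, the lines can be filtered further. This is a rough sketch that assumes dumpe2fs prints lines of the form "Backup superblock at N, Group descriptors at ..."; the exact wording may vary between e2fsprogs versions, and the function name `backup_superblocks` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical filter: read dumpe2fs output on stdin and print
# one backup superblock number per line.
backup_superblocks() {
    sed -n 's/^.*Backup superblock at \([0-9][0-9]*\),.*$/\1/p'
}

# Usage (as root): dumpe2fs /dev/sda3 2>/dev/null | backup_superblocks
```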
    To repair the file system using an alternative superblock, use the command as follows:
    # e2fsck -f -b 8193 /dev/sda3
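When several backup locations are known, it can help to find out which one e2fsck can actually open before attempting a repair. The sketch below tries each candidate with the -n option, which opens the filesystem read-only and answers "no" to all questions. It treats an exit status below 8 as "the filesystem could be opened", since 8 is the documented operational-error status. The function name `find_usable_superblock` and the `CHECKER` variable are hypothetical.

```shell
#!/bin/sh
# CHECKER is an illustrative override hook; it defaults to the real e2fsck.
CHECKER=${CHECKER:-e2fsck}

# Hypothetical helper: print the first candidate superblock that a
# read-only check can open. $1 = device, remaining args = block numbers.
find_usable_superblock() {
    dev=$1
    shift
    for sb in "$@"; do
        status=0
        # -n: read-only, answer "no" to every prompt; nothing is modified.
        "$CHECKER" -n -b "$sb" "$dev" >/dev/null 2>&1 || status=$?
        # Status 4 (errors left uncorrected) is expected on a damaged
        # filesystem; 8 and above means e2fsck could not operate at all.
        if [ "$status" -lt 8 ]; then
            echo "$sb"
            return 0
        fi
    done
    return 1
}

# Usage (as root): find_usable_superblock /dev/sda3 8193 24577 40961
```

Once a usable location is reported, the actual repair is still done by hand with e2fsck -f -b as shown above.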

    However, it is highly recommended that you make a backup before you run the fsck command on the system. Use the dd command to create a backup image of the partition (provided that you have enough spare space under /disk2):
    # dd if=/dev/sda3 of=/disk2/backup-sda3.img
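A backup image is only useful if it is a faithful copy, so it is worth comparing it against the source once dd finishes. The following is a minimal sketch, assuming the partition is unmounted and readable without errors; the function name `image_and_verify` and the 1 MiB block size are illustrative choices, not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper: copy $1 (a partition or file) to image file $2,
# then verify that the copy is byte-for-byte identical.
image_and_verify() {
    src=$1
    img=$2
    # A 1 MiB block size is usually much faster than dd's 512-byte default.
    dd if="$src" of="$img" bs=1048576 2>/dev/null || return 1
    # cmp -s is silent and exits non-zero if the two differ.
    cmp -s "$src" "$img"
}

# Usage (as root): image_and_verify /dev/sdXN /disk2/backup.img
```

To restore, the same dd command is run with if= and of= swapped.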

    If you are using Sun Solaris UNIX, see howto: Restoring a Bad Superblock.

    Please note that things get complicated if the hard disk participates in a software RAID array. Take a look at Software-RAID HOWTO - Error Recovery.

    This article is part III of the Understanding UNIX/Linux file system series. Continue reading the rest of the series:

    • Part I - Understanding Linux superblock
    • Part II - Understanding Linux superblock
    • Part III - An example of Surviving a Linux Filesystem Failure
    • Part IV - Understanding filesystem Inodes
    • Part V - Understanding filesystem directories
    • Part VI - Understanding UNIX/Linux symbolic (soft) and hard links
    • Part VII - Why isn't it possible to create hard links across file system boundaries?


    { 20 comments }

    1 Anonymous February 6, 2006 at 5:10 am

    I think you meant your dd command to have dd if=… of=…. (the of is missing). cheers, DrD

    2 nixcraft February 6, 2006 at 1:04 pm

    Typo is corrected; Thanks for the heads up!

    3 Thomas Scott January 29, 2007 at 3:56 am

    Its easy to trash the old fs and
    create a new one , because data
    is never “hooked” to a particular
    fs !
    LiLo , LoadLin et al will still
    boot linux fine .
    Its easy cause the IDE HDD is
    much easier to figure , there
    is no need for partitions , nor
    any other “dependencies” .
    And drivers are much smaller .

    The reason we dont improve ,
    is someones paycheck is in it ,
    to keep it complicated .
    Its not unlike what Bill Gates
    did to USB …

    4 Dan October 25, 2007 at 1:27 pm

    That works great IF your system will give you a terminal window. If you don’t get that far, you are SOL.

    5 maxcomx August 30, 2009 at 2:09 pm

    can’t read superblock

    max@max-laptop:~$ sudo fsck -f /dev/sdd1
    fsck 1.41.4 (27-Jan-2009)
    dosfsck 3.0.1, 23 Nov 2008, FAT32, LFN
    There are differences between boot sector and its backup.
    Differences: (offset:original/backup)
    65:01/00
    1) Copy original to backup
    2) Copy backup to original
    3) No action
    ? 3
    Both FATs appear to be corrupt. Giving up.
    max@max-laptop:~$

    Hummm… WTF… what can i do.

    6 Rod November 28, 2009 at 5:09 pm

    “That works great IF your system will give you a terminal window. If you don’t get that far, you are SOL.”

    That’s what Live CDs are for… ;)

    Thanx for this page-I was able to completely recover a borked system. A great resource!

    7 snevi December 3, 2009 at 9:12 am

    Thank u very much for this tips!! it helps a lot to recover a main disc in my server!!!
    thanks.

    8 justin January 25, 2010 at 10:04 am

    Thanks man! Recoverd Files from my HDD which i thought was broken!
    i owe you a beer.

    9 j01z April 18, 2011 at 1:43 pm

    Thanks a lot! just recovered my baby’s pictures, you’re now her fairy godfather ;p

    10 victor elorza May 16, 2011 at 2:18 am

    hello!
    every command i run results in this message:
    e2fsck: attempt to read block form filesystem resulted in short read while trying to open /dev/sda1. Could this be a zero-length partition?

    i can determine alternative superblock locations but the command e2fsck -f -b 32768 /dev/sda1 + with other superblock locations doesn’t work.
    please help thanks !!

    11 Miguel May 24, 2011 at 7:21 am

    Hi,

    I tried some data recovery programs and they all found the files on /dev/sdc3 BUT

    ANY attempt to “fsck -f -b -y /dev/sdc3″ results in

    /: Attempt to read block from filesystem resulted in short read while reading block 525

    /: Attempt to read block from filesystem resulted in short read reading journal superblock

    fsck.ext2: Attempt to read block from filesystem resulted in short read while checking ext3 journal for /

    any Idea before I jump out of the window ?

    many thanks

    12 Miguel May 24, 2011 at 7:22 am

    oops…

    fsck -f -b “any superblock location” -y /dev/sdc3

    13 Fabio Souza August 28, 2011 at 4:54 pm

    Thanks man, you save my life… God bless you!

    14 Alec October 24, 2011 at 1:12 am

    I have nothing to add except a “thank you”. Easily-comprehensible tutorials are hard to come by.

    15 Ben October 31, 2011 at 12:37 am

    This post saves the world

    16 Randy Kramer March 6, 2012 at 2:31 pm

    Several comments:
    * I was most interested in learning more about a superblock, and that part of the series was very helpful to me.
    * I haven’t started digging into it, but I’d like to learn more about how a corrupted filesystem is fixed, my specific question is, are there circumstances in which a filesystem is fixed by deleting all or part of a file? (I know this was done in early versions of Dos and Windows, but I haven’t done much with either in the last 12 years, maybe that is no longer the case.) But I want to find out if that is, or ever was, the case in Linux.
    * Your tutorials are very well written, but you have some typos / grammos. If you’re interested, I can send you (sooner or later) copies of the content with the errors I see fixed. (Just as one example, the title of this page has a problem–you’re mixing singular and plural stuff, the title should either be: “Surviving Linux Filesystem Failures” (all plural) or “Surviving a Linux Filesystem Failure” (all singular).

    17 pramod a g May 3, 2012 at 2:57 pm

    A million thanks man i was able to recover my corrupted Ubuntu.

    18 slobodan_hb May 16, 2012 at 5:45 am

    Thanx

    19 bb October 25, 2012 at 5:03 pm

    Thanks for this tutorial.

    20 Dash February 16, 2013 at 3:21 am

    No matter what I do, I get following msg.

    root@tdsrv002 [~]# e2fsck -f -b 32768 -y /dev/xvdj
    e2fsck 1.41.12 (17-May-2010)
    e2fsck: Bad magic number in super-block while trying to open /dev/xvdj

    The superblock could not be read or does not describe a correct ext2
    filesystem. If the device is valid and it really contains an ext2
    filesystem (and not swap or ufs or something else), then the superblock
    is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193

    I tried all block locations which I found here.

    root@tdsrv002 [~]# mke2fs -n /dev/xvdj
    mke2fs 1.41.12 (17-May-2010)
    Filesystem label=
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    Stride=0 blocks, Stripe width=0 blocks
    1966080 inodes, 7864320 blocks
    393216 blocks (5.00%) reserved for the super user
    First data block=0
    Maximum filesystem blocks=4294967296
    240 block groups
    32768 blocks per group, 32768 fragments per group
    8192 inodes per group
    Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000

    Thanks for any hint to fix this issue.
