superblock

Within each file system, a file is represented by a single inode number, and all hard links to that file are based upon this inode number.

Linking across file systems would therefore lead to confusing references for UNIX or Linux. For example, consider the following scenario:

* File system: /home
* Directory: /home/vivek
* Hard link: /home/vivek/file2
* Original file: /home/vivek/file1

Now you create a hard link as follows:
$ touch file1
$ ln file1 file2
$ ls -l

Output:

-rw-r--r--  2 vivek vivek    0 2006-01-30 13:28 file1
-rw-r--r--  2 vivek vivek    0 2006-01-30 13:28 file2

Now look at the inode numbers of file1 and file2:
$ ls -i file1
782263 file1
$ ls -i file2
782263 file2
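
The same check can be scripted. The following is a minimal sketch, assuming GNU coreutils `stat` is available; `workdir` is a throwaway directory created only for the demonstration. It compares the inode number and the device ID of both names, since an inode number only identifies a file within one file system.

```shell
# Sketch: confirm that two names are hard links to the same file.
# Assumes GNU coreutils stat; workdir is a throwaway directory.
workdir=$(mktemp -d)
touch "$workdir/file1"
ln "$workdir/file1" "$workdir/file2"

# %i = inode number, %d = device ID of the containing file system, %h = link count
inode1=$(stat -c '%i' "$workdir/file1")
inode2=$(stat -c '%i' "$workdir/file2")
dev1=$(stat -c '%d' "$workdir/file1")
dev2=$(stat -c '%d' "$workdir/file2")
links=$(stat -c '%h' "$workdir/file1")

if [ "$inode1" = "$inode2" ] && [ "$dev1" = "$dev2" ]; then
    echo "same file: inode $inode1 on device $dev1, $links links"
fi
rm -rf "$workdir"
```

Two paths refer to the same file only when both the device ID and the inode number match.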

As you can see, the inode number is the same for the hard link called file2 in the inode table of the /home file system. Now suppose you try to create a hard link to this file from the /tmp file system. Inode numbers are unique only within a single file system, so the reference would be ambiguous: is that a link to inode no. 782263 in the /home or in the /tmp file system? To avoid this problem, UNIX and Linux do not allow creating hard links across file system boundaries. Continue reading the rest of the Understanding Linux file system series (this is part VII):

  • Part I - Understanding Linux superblock
  • Part II - Understanding Linux superblock
  • Part III - An example of Surviving a Linux Filesystem Failures
  • Part IV - Understanding filesystem Inodes
  • Part V - Understanding filesystem directories
  • Part VI - Understanding UNIX/Linux symbolic (soft) and hard links
  • Part VII - Why isn't it possible to create hard links across file system boundaries?

Understanding UNIX / Linux filesystem directories

You use DNS (domain name system) to translate between domain names and IP addresses.

Similarly, files are referred to by file name, not by inode number. So what is the purpose of a directory? You can group files according to your usage; for example, all configuration files are stored under the /etc directory. The purpose of a directory is to make the connection between file names and their associated inode numbers. Inside every directory you will find two entries: . (the current directory) and .. (a pointer to the parent directory, i.e. the directory immediately above the one you are in now). Both appear in every directory; in the root directory, .. simply points back to the root itself.
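
To see that . and .. are ordinary name-to-inode entries, you can compare inode numbers directly. This is a small sketch, assuming GNU coreutils `stat`; `parent` and `child` are made-up names for a throwaway directory pair.

```shell
# Sketch: "." and ".." are directory entries that map to inode numbers.
# Assumes GNU coreutils stat; parent/child are throwaway names.
parent=$(mktemp -d)
mkdir "$parent/child"

parent_inode=$(stat -c '%i' "$parent")
child_inode=$(stat -c '%i' "$parent/child")
dot_inode=$(stat -c '%i' "$parent/child/.")       # the child directory itself
dotdot_inode=$(stat -c '%i' "$parent/child/..")   # resolves to the parent

echo "parent=$parent_inode  child/..=$dotdot_inode"
echo "child=$child_inode  child/.=$dot_inode"
rm -rf "$parent"
```

Each pair of numbers printed on one line should be identical, because both paths lead to the same inode.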

Directory

A directory contained inside another directory is called a subdirectory. Together, the directories form a tree structure. Use the tree command to see the directory tree structure:
$ tree /etc | less
A directory has an inode just like a file. It is a specially formatted file containing records which associate each name with an inode number. Please note the following limitations of directories under the ext2/3 file systems:

  • There is an upper limit of about 32,000 subdirectories in a single directory (ext2/3 cap an inode's link count at 32000).
  • There is a "soft" upper limit of about 10-15k files in a single directory

However, the official ext2/3 documentation points out that “Using a hashed directory index (which is under development) allows 100k-1M+ files in a single directory without performance problems”. Here are my two favorite aliases related to directories:
$ alias ..='cd ..'
$ alias d='ls -l | grep -E "^d"'
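
Aliases are usually only available in interactive shells, so a script needs a function instead. Here is a rough sketch of a script-friendly counterpart to the d alias; `list_dirs` is a hypothetical helper name, and the `-mindepth`/`-maxdepth` options assume GNU or BSD find.

```shell
# Sketch of a script-friendly counterpart to the "d" alias.
# list_dirs is a hypothetical helper; -mindepth/-maxdepth need GNU or BSD find.
list_dirs() {
    find "${1:-.}" -mindepth 1 -maxdepth 1 -type d | sort
}

demo=$(mktemp -d)
mkdir "$demo/conf" "$demo/logs"
touch "$demo/notes.txt"

# Arithmetic expansion strips any padding that wc may add.
dir_count=$(( $(list_dirs "$demo" | wc -l) ))
echo "subdirectories: $dir_count"
rm -rf "$demo"
```

Unlike the alias, the function takes the directory as an argument and ignores regular files without parsing `ls` output.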


Well, I'm sure all of you know the basic commands related to directory and file management. This is part V of the Understanding UNIX/Linux file system series; the full list of parts appears at the end of part VII above.

Surviving a Linux Filesystem Failure

When you use the term filesystem failure, you mean corrupted filesystem data structures (or objects such as inodes, directories, the superblock etc.). This can be caused by any one of the following reasons:

* Mistakes by the Linux/UNIX sys admin
* Buggy device drivers or utilities (especially third party utilities)
* Power outage (very rare on production systems) due to UPS failure
* Kernel bugs (that is why you don't run the latest kernel on a production Linux/UNIX system; most of the time you need to use a stable kernel release)

    A filesystem failure can have the following effects:

    • The file system will refuse to mount
    • The entire system hangs
    • Even if the filesystem mount operation succeeds, users may notice strange behavior once it is mounted, such as system reboots, gibberish characters in directory listings etc.

    So how the hell are you going to survive a filesystem failure? Most of the time fsck (a front end to file-system-specific checkers such as e2fsck) can fix the problem. To check a Linux ext2/ext3 file system with e2fsck (assuming the /home file system on the /dev/sda3 partition for demo purposes), first unmount /dev/sda3, then type the following command:
    # e2fsck -f /dev/sda3
    Where,

    • -f : Force checking even if the file system seems clean.

    Please note that if the superblock is not found, e2fsck will terminate with a fatal error. However, Linux maintains multiple redundant copies of the superblock in every file system, so you can use the -b {alternative-superblock} option to get around this problem. The location of the backup superblock depends on the filesystem's block size:

    • For filesystems with 1k blocksizes, a backup superblock can be found at block 8193
    • For filesystems with 2k blocksizes, at block 16384
    • For 4k blocksizes, at block 32768.
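
The three block numbers above follow from simple arithmetic. This is a sketch under one assumption: the file system was created with the default of 8 * block size blocks per group (one bitmap block covers one group). `first_backup_superblock` is a hypothetical helper name.

```shell
# Sketch: where the first backup superblock lands for a given block size.
# Assumes the default of 8 * block_size blocks per group.
first_backup_superblock() {
    block_size=$1
    blocks_per_group=$((block_size * 8))
    # With 1 KiB blocks the primary superblock sits in block 1, so groups start there.
    if [ "$block_size" -eq 1024 ]; then
        first_data_block=1
    else
        first_data_block=0
    fi
    echo $((first_data_block + blocks_per_group))
}

for bs in 1024 2048 4096; do
    echo "block size $bs: backup superblock at $(first_backup_superblock "$bs")"
done
```

A file system created with a non-default blocks-per-group value will not match these numbers, so treat the result as a first guess only.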

    Tip: you can also try any one of the following commands to determine the alternative superblock locations (the -n option makes mke2fs only print what it would do, without creating a file system):
    # mke2fs -n /dev/sda3
    OR
    # dumpe2fs /dev/sda3|grep -i superblock
    To repair the file system using an alternative superblock, use the command as follows:
    # e2fsck -f -b 8193 /dev/sda3

    However, it is highly recommended that you make a backup before you run the fsck command on the system. Use the dd command to create a backup (provided that you have spare space under /disk2):
    # dd if=/dev/sda3 of=/disk2/backup-sda3.img
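
It is worth confirming that the image is a faithful copy before attempting any repair. The sketch below uses a regular file in place of the partition so that it can be tried without root; `src.img` and `backup.img` are made-up names.

```shell
# Sketch: take an image with dd and verify it before running any repair.
# A regular file stands in for the partition; src.img and backup.img are made-up names.
work=$(mktemp -d)
dd if=/dev/urandom of="$work/src.img" bs=1024 count=64 2>/dev/null
dd if="$work/src.img" of="$work/backup.img" bs=4096 2>/dev/null

# cmp -s is silent and reports only through its exit status.
if cmp -s "$work/src.img" "$work/backup.img"; then
    backup_ok=yes
else
    backup_ok=no
fi
echo "backup verified: $backup_ok"
rm -rf "$work"
```

On a real system the comparison would be between the unmounted partition and the image file, which takes as long as reading both in full.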

    If you are using Sun Solaris UNIX, see howto: Restoring a Bad Superblock.

    Please note that things start to get complicated if the hard disk participates in a software RAID array. Take a look at Software-RAID HOWTO - Error Recovery. This article/tip is part III of the Understanding UNIX/Linux file system series; the full list of parts appears at the end of part VII above.

    Understanding UNIX / Linux filesystem Superblock

    This is the second part of "Understanding UNIX/Linux file system". Let us take the example of a 20 GB hard disk. The entire disk space is subdivided into multiple file system blocks. And what are the blocks used for?

    Unix / Linux filesystem blocks

    The blocks are used for two different purposes:

    1. Most blocks store user data, i.e. files.
    2. Some blocks in every file system store the file system's metadata. So what the hell is metadata?

    In simple words, metadata describes the structure of the file system. The most common metadata structures are the superblock, inodes and directories. The following paragraphs describe each of them.

    Superblock

    Each file system is different and has a type, like ext2, ext3 etc. Further, each file system has a size, like 5 GB or 10 GB, and a status, such as its mount status. In short, each file system has a superblock, which contains information about the file system such as:

    • File system type
    • Size
    • Status
    • Information about other metadata structures
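
Some of these file-system-wide values can be read without root. The following is a minimal sketch, assuming GNU coreutils `stat`, whose -f option queries the file system rather than the file; the kernel reports these values for the mounted file system, and on ext2/3 they are largely derived from the superblock.

```shell
# Sketch: print a few file-system-wide values for the file system holding "/".
# Assumes GNU coreutils stat; -f queries the file system instead of the file.
# %T = type, %S = fundamental block size, %b = total blocks, %f = free blocks
fs_type=$(stat -f -c '%T' /)
fs_block_size=$(stat -f -c '%S' /)
fs_blocks=$(stat -f -c '%b' /)
fs_free=$(stat -f -c '%f' /)
echo "type=$fs_type block_size=$fs_block_size blocks=$fs_blocks free=$fs_free"
```

For the raw on-disk superblock fields of an ext2/3 file system, dumpe2fs remains the tool to use.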

    If this information is lost, you are in trouble (data loss), so Linux maintains multiple redundant copies of the superblock in every file system. This is very important in many emergency situations; for example, you can use the backup copies to restore a damaged primary superblock. The following command displays the primary and backup superblock locations on /dev/sda3:
    # dumpe2fs /dev/sda3 | grep -i superblock
    Output:

    Primary superblock at 0, Group descriptors at 1-1
    Backup superblock at 32768, Group descriptors at 32769-32769
    Backup superblock at 98304, Group descriptors at 98305-98305
    Backup superblock at 163840, Group descriptors at 163841-163841
    Backup superblock at 229376, Group descriptors at 229377-229377
    Backup superblock at 294912, Group descriptors at 294913-294913
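
The backup locations in this output are not arbitrary. The sketch below reproduces them under two assumptions: 4 KiB blocks (32768 blocks per group) and the sparse_super feature, which keeps backups only in group 1 and in groups whose number is a power of 3, 5 or 7. The helper names `is_power_of` and `has_backup` are made up for illustration.

```shell
# Sketch: reproduce the backup superblock locations for the first ten block groups.
# Assumes 4 KiB blocks (32768 blocks per group) and the sparse_super feature.
is_power_of() {
    n=$1; base=$2
    while [ "$n" -gt 1 ] && [ $((n % base)) -eq 0 ]; do
        n=$((n / base))
    done
    [ "$n" -eq 1 ]
}

has_backup() {
    g=$1
    [ "$g" -eq 1 ] || is_power_of "$g" 3 || is_power_of "$g" 5 || is_power_of "$g" 7
}

blocks_per_group=32768
backups=""
group=1
while [ "$group" -lt 10 ]; do
    if has_backup "$group"; then
        backups="$backups $((group * blocks_per_group))"
    fi
    group=$((group + 1))
done
echo "backup superblocks at:$backups"
```

Groups 1, 3, 5, 7 and 9 qualify, which gives the five block numbers listed by dumpe2fs.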

    The importance of Linux partitions

    Disk partitioning is the creation of separate divisions of a hard disk drive using partition editors such as fdisk. Once a disk is divided into several partitions, directories and files of different categories may be stored in different partitions.

    Many new Linux sys admins (or Windows admins) create only two partitions, / (root) and swap, for the entire hard drive. This is really a bad idea. You need to consider the following points while partitioning a disk.

    Purposes for Disk Partitioning

    An operating system like Windows / Linux can be installed on a single, unpartitioned hard disk. However, the ability to divide a hard disk into multiple partitions offers some important advantages. If you are running Linux on a server, consider the following facts:

    • Ease of use - Makes it easier to recover a corrupted file system or operating system installation.
    • Performance - Smaller file systems are more efficient. You can tune a file system for its application, such as log or cache files. A dedicated swap partition can also improve performance (this may not be true with the latest Linux kernel 2.6).
    • Security - Separating the operating system files from user files may result in a better and more secure system. Restricting the growth of certain file systems is possible using various techniques.
    • Backup and Recovery - Easier backup and recovery.
    • Stability and efficiency - You can increase disk space efficiency by formatting partitions with different block sizes, depending on usage. For example, if the data consists of lots of small files, it is better to use a small block size.
    • Testing - Boot multiple operating systems such as Linux, Windows and FreeBSD from a single hard disk.
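
The block size point can be made concrete with a little arithmetic. This is a rough sketch that ignores inodes, directories and other metadata; the file count and file size are made-up inputs, and `allocated_bytes` is a hypothetical helper name.

```shell
# Sketch: space consumed by many small files at two block sizes.
# Ignores inodes, directories and other metadata; the inputs are made up.
allocated_bytes() {
    files=$1; file_size=$2; block_size=$3
    blocks_per_file=$(( (file_size + block_size - 1) / block_size ))   # round up
    echo $(( files * blocks_per_file * block_size ))
}

small=$(allocated_bytes 1000 500 1024)
large=$(allocated_bytes 1000 500 4096)
echo "1000 files of 500 bytes: $small bytes with 1k blocks, $large bytes with 4k blocks"
```

Every file occupies at least one whole block, so the same 500,000 bytes of data take four times as much space with 4k blocks as with 1k blocks.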


    File systems that need their own partitions
    Partition   Purpose
    /usr        This is where most executable binaries, the kernel source tree and much documentation go.
    /var        This is where spool directories such as those for mail and printing go. In addition, it contains the error log directory.
    /tmp        This is where most temporary data files are stored by apps.
    /boot       This is where your kernel images and boot loader configuration go.
    /home       This is where users' home directories go.

    Let us assume you have a 120 GB SCSI hard disk with / (root) and swap partitions only. One of your users (internal, external, or a cracker) runs something which eats up all your hard disk space (a DoS attack). For example, consider the following tiny script that a user can run in the /tmp directory:

    #!/bin/sh
    man bash > $(mktemp)
    $0

    Anyone can run the above script via cron (if allowed), or even with the nohup command:
    $ nohup bad-script &

    The result can be a total disaster, as the entire file system comes under a Denial of Service attack. It will even bypass the disk quota restriction. One of our junior Linux sys admins created only two partitions. Later, a poorly written application ate up all the space in /var/log/. The end result was a memo for him (as he had not followed the internal docs that have guidelines for partition setup on client servers). Bottom line: create separate partitions on a Linux server.
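
A simple usage check can give early warning before a file system fills up. The following is a minimal sketch; `usage_percent` is a hypothetical helper, and the 90% threshold is an arbitrary choice.

```shell
# Sketch: warn when a file system is nearly full.
# usage_percent is a hypothetical helper; the 90% threshold is arbitrary.
usage_percent() {
    # df -P prints one POSIX-format line per file system; column 5 is the capacity, e.g. 42%
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

pct=$(usage_percent /tmp)
if [ "$pct" -ge 90 ]; then
    echo "WARNING: /tmp is ${pct}% full"
else
    echo "/tmp is ${pct}% full"
fi
```

Run from cron, a check like this reports on whichever file system holds /tmp, which is the whole root file system when /tmp has no partition of its own.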

    If you do not have a partition scheme, then the following problems can arise:

    1. Runaway processes.
    2. Denial of Service attack against disk space (see above example script).
    3. Users can download or compile SUID programs in /tmp or even in /home.
    4. Performance tuning is not possible.
    5. Mounting /usr as read only to improve security is not possible.

    All of these attacks can be stopped by adding the following options to the /etc/fstab file:

    • nosuid - Ignore SUID/SGID bits on this partition
    • nodev - Do not interpret character or block special devices on this partition
    • noexec - Do not allow execution of any binaries on this partition
    • ro - Mount the file system read-only
    • quota - Enable disk quota

    Please note that the above options can be set only if you have a separate partition. Make sure you create the partitions as above, with special options set on each partition:

    • /home - Set the nosuid and nodev options, along with disk quota
    • /usr - Set the nodev option
    • /tmp - The nodev, nosuid and noexec options must be enabled

    For example, the entry in /etc/fstab for /home should read as follows:

    /dev/sda1  /home          ext3    defaults,nosuid,nodev 1 2
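
Whether an fstab file actually carries the recommended options can be checked mechanically. The sketch below audits a sample file rather than the real /etc/fstab; the device names, the sample entries and the `has_option` helper are all made up for illustration.

```shell
# Sketch: check that an fstab-style file applies the options recommended above.
# The device names and the sample file are made up for illustration.
sample_fstab=$(mktemp)
cat > "$sample_fstab" <<'EOF'
/dev/sda1  /home  ext3  defaults,nosuid,nodev   1 2
/dev/sda5  /usr   ext3  defaults,nodev          1 2
/dev/sda6  /tmp   ext3  defaults,nodev,nosuid   1 2
EOF

# has_option FILE MOUNTPOINT OPTION -> exit status 0 if the option is listed
has_option() {
    awk -v mp="$2" -v opt="$3" '
        $1 !~ /^#/ && $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == opt) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

missing=""
for opt in nodev nosuid noexec; do
    has_option "$sample_fstab" /tmp "$opt" || missing="$missing $opt"
done
echo "/tmp is missing:${missing:- nothing}"
rm -f "$sample_fstab"
```

In this sample the /tmp entry lacks noexec, so the script reports it; pointing `has_option` at /etc/fstab would audit the real configuration.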

    Here is the mount command output from one of my OpenBSD production servers:

    /dev/wd0a on / type ffs (local)
    /dev/wd1a on /home type ffs (local, nodev, nosuid, with quotas)
    /dev/wd0d on /root type ffs (local)
    /dev/wd0e on /usr type ffs (local, nodev)
    /dev/wd0f on /tmp type ffs (local, nodev)
    /dev/wd0h on /var type ffs (local, nodev, nosuid)
    /dev/wd0g on /var/log type ffs (local, nodev)

    How do I obtain information about partitions?

    There are several ways to obtain information about partitions on Linux / UNIX-like operating systems.

    List partitions:

    fdisk -l

    Report file system disk space usage:

    df -h
    OR
    df -k

    Display partition mount options including mount points

    mount
    Sample output:

    /dev/sda2 on / type ext3 (rw,relatime,errors=remount-ro)
    tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
    /proc on /proc type proc (rw,noexec,nosuid,nodev)
    sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
    varrun on /var/run type tmpfs (rw,nosuid,mode=0755)
    varlock on /var/lock type tmpfs (rw,noexec,nosuid,nodev,mode=1777)
    udev on /dev type tmpfs (rw,mode=0755)
    tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
    devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
    fusectl on /sys/fs/fuse/connections type fusectl (rw)
    /dev/sda1 on /media/sda1 type fuseblk (rw,nosuid,nodev,allow_other,default_permissions,blksize=4096)
    /dev/sda5 on /share type fuseblk (rw,nosuid,nodev,allow_other,default_permissions,blksize=4096)
    /dev/sdb2 on /disk1p2 type ext3 (rw,relatime,errors=remount-ro)
    securityfs on /sys/kernel/security type securityfs (rw)
    debugfs on /sys/kernel/debug type debugfs (rw)
    binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
    gvfs-fuse-daemon on /home/vivek/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=vivek)
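
Long listings like this are easier to audit with a filter. This is a rough sketch that reads lines in the Linux "device on /mountpoint type fstype (options)" format and prints the mount points lacking a given option; `lacks_option` is a hypothetical helper, and mount points containing spaces are not handled.

```shell
# Sketch: list mount points from Linux `mount` output that lack a given option.
# lacks_option is a hypothetical helper; mount points with spaces are not handled.
lacks_option() {
    awk -v opt="$1" '
        {
            opts = $NF                      # last field: "(rw,nosuid,...)"
            gsub(/[()]/, "", opts)
            n = split(opts, list, ",")
            found = 0
            for (i = 1; i <= n; i++) if (list[i] == opt) found = 1
            if (!found) print $3            # third field is the mount point
        }
    '
}

sample='/dev/sda2 on / type ext3 (rw,relatime,errors=remount-ro)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
/dev/sdb2 on /disk1p2 type ext3 (rw,relatime,errors=remount-ro)'

no_nosuid=$(printf '%s\n' "$sample" | lacks_option nosuid)
echo "$no_nosuid"
```

On a live Linux system the same filter can be fed directly, for example with `mount | lacks_option nosuid`.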
    

    Display / edit file system configuration options

    less /etc/fstab
    or
    vi /etc/fstab

    Quickly remount /usr in ro mode

    mount -o remount,ro /usr

    Quickly mount all file systems configured in /etc/fstab

    mount -a
