≡ Menu

File system

If your network is heavily loaded you may see some problem with Common Internet File System (CIFS) and NFS under Linux. By default Linux CIFS mount command will try to cache files open by the client. You can use mount option forcedirectio when mounting the CIFS filesystem to disable caching on the CIFS client. This is tested with NETAPP and other storage devices and Novell, CentOS, UNIX and Red Hat Linux systems. This is the only way to avoid data mis-compare and problems.

The default is to attempt to cache ie try to request oplock on files opened by the client (forcedirectio is off). Foredirectio also can indirectly alter the network read and write size, since i/o will now match what was requested by the application, as readahead and writebehind is not being performed by the page cache when forcedirectio is enabled for a mount

mount -t cifs //mystorage/data2 -o username=vivek,password=myPassword,rw,bg,vers=3,proto=tcp,hard,intr,rsize=32768,wsize=32768,forcedirectio,llock /data2

Refer mount.cifs man page, docs stored at Documentation/filesystems/cifs.txt and fs/cifs/README in the linux kernel source tree for additional options and information.

NFS is pretty old file sharing technology for UNIX based system and storage systems. However, it suffers from performance issues. NFSv4.1 address data access issues by adding a new feature called parallel NFS (pNFS) - a method of introducing Data Access Parallelism. The end result is ultra fast file sharing for clusters and high availability configurations.

The Network File System (NFS) is a stalwart component of most modern local area networks (LANs). But NFS is inadequate for the demanding input- and output-intensive applications commonly found in high-performance computing -- or, at least it was. The newest revision of the NFS standard includes Parallel NFS (pNFS), a parallelized implementation of file sharing that multiplies transfer rates by orders of magnitude.

In addition to pNFS, NFSv4.1 provides Sessions, Directory Delegation and Notifications, Multi-server Namespace, ACL/SACL/DACL, Retention Attributions, and SECINFO_NO_NAME.

Fig.01: The conceptual organization of pNFS - Image credit IBM

Fig.01: The conceptual organization of pNFS - Image credit IBM

According to wikipedia:

The NFSv4.1 protocol defines a method of separating the meta-data (names and attributes) of a filesystem from the location of the file data; it goes beyond the simple name/data separation of striping the data amongst a set of data servers. This is different from the traditional NFS server which holds the names of files and their data under the single umbrella of the server. There exists products which are multi-node NFS servers, but the participation of the client in separation of meta-data and data is limited. The NFSv4.1 client can be enabled to be a direct participant in the exact location of file data and avoid solitary interaction with the single NFS server when moving data.

The NFSv4.1 pNFS server is a collection of server resources or components; these are assumed to be controlled by the meta-data server.

The pNFS client still accesses a single meta-data server for traversal or interaction with the namespace; when the client moves data to and from the server it may be directly interacting with the set of data servers belonging to the pNFS server collection.

More information about pNFS

  1. Scale your file system with Parallel NFS
  2. Linux NFS Overview, FAQ and HOWTO Documents
  3. NFSv4 delivers seamless network access
  4. Nfsv4 Status Pages
  5. NFS article from the Wikipedia

Linux tgtadm: Setup iSCSI Target ( SAN )

Linux target framework (tgt) aims to simplify various SCSI target driver (iSCSI, Fibre Channel, SRP, etc) creation and maintenance. The key goals are the clean integration into the scsi-mid layer and implementing a great portion of tgt in user space.

The developer of IET is also helping to develop Linux SCSI target framework (stgt) which looks like it might lead to an iSCSI target implementation with an upstream kernel component. iSCSI Target can be useful:

a] To setup stateless server / client (used in diskless setups).
b] Share disks and tape drives with remote client over LAN, Wan or the Internet.
c] Setup SAN - Storage array.
d] To setup loadbalanced webcluser using cluster aware Linux file system etc.

In this tutorial you will learn how to have a fully functional Linux iSCSI SAN using tgt framework.
[click to continue…]

This is an user contributed article.

Midnight Commander (mc) is an user-friendly text-based file manager UI for Unix. Using mc, you can browse the filesystem easily and manipulate the files and directories quickly. You will not miss the standard command line prompt, which is also available within the mc itself. If you are new to mc, Midnight Commander (mc) Guide: Powerful Text based File Manager for Unix article will give you a quick jumpstart. In this article, let us review how to solve couple of common annoyance about viewing a file in mc.

Use vi as default editor and viewer in mc

Mc uses mcedit for file editor and mcview for file viewer. Like most of you, I'm very comfortable with vi and would like to use vi for both viewing and editing than the mc's internal editor and viewer.

Launch mc in color mode by typing “mc -c” from the command line. Press F9 (or Esc followed by 9) to activate the top menu → select Options menu → select Configurations menu-item, which will display “Configure Options” dialog → de-select the check-box next to “Use Internal Edit” and “Use Internal View”, as shown below to disable, internal editor and viewer.

After this change, when you select a file and press Esc 3 to view or Esc 4 to edit, mc will use vi.

Fig.01 Mc configure options to disable internal editor and viewer

Fig.01 Mc configure options to disable internal editor and viewer

Change the Enter key behavior to view file using vi instead of executing it.

When you select a shell script and press enter, mc will execute it by default. Also, by default when you press enter on text files, nothing happens. I prefer to view the shell script when I press enter key. Also, I would like to view the text file using vi when I press enter key. You can achieve this by modifying the mc extension file as shown below.

Press F9 (or Esc followed by 9) to active the top menu. From command menu → select “Edit extension file” men-item → This will display the extension file. Go to the bottom of the extension file and change the value of open and view parameter values as shown below.

# Default target for anything not described above default/*

        Open=%var{EDITOR:vi} %f
        View=%var{EDITOR:vi} %f 

After the above change, when you press enter key on a shell script, it will open it in vi instead of executing it. This will also open text files in vi when you press enter after selecting it.

A Redundant Array of Independent Drives (or Disks), also known as Redundant Array of Inexpensive Drives (or Disks) (RAID) is an term for data storage schemes that divide and/or replicate data among multiple hard drives. RAID can be designed to provide increased data reliability or increased I/O performance, though one goal may compromise the other. There are 10 RAID level. But which one is recommended for data safety and performance considering that hard drives are commodity priced?

I did some research in last few months and based upon my experince I started to use RAID10 for both Vmware / XEN Virtualization and database servers. A few MS-Exchange and Oracle admins also recommended RAID 10 for both safety and performance over RAID 5.

Quick RAID 10 overview (raid 10 explained)

RAID 10 = Combining features of RAID 0 + RAID 1. It provides optimization for fault tolerance.

RAID 0 helps to increase performance by striping volume data across multiple disk drives.

RAID 1 provides disk mirroring which duplicates your data.

In some cases, RAID 10 offers faster data reads and writes than RAID 5 because it does not need to manage parity.

Fig.01: Raid 10 in action

Fig.01: Raid 10 in action

RAID 5 vs RAID 10

From Art S. Kagel research findings:

If a drive costs $1000US (and most are far less expensive than that) then switching from a 4 pair RAID10 array to a 5 drive RAID5 array will save 3 drives or $3000US. What is the cost of overtime, wear and tear on the technicians, DBAs, managers, and customers of even a recovery scare? What is the cost of reduced performance and possibly reduced customer satisfaction? Finally what is the cost of lost business if data is unrecoverable? I maintain that the drives are FAR cheaper! Hence my mantra:
NO RAID5! NO RAID5! NO RAID5! NO RAID5! NO RAID5! NO RAID5! NO RAID5!

Is RAID 5 Really a Bargain?

Cary Millsap, manager of Hotsos LLC and the editor of Hotsos Journal found the following facts - Is RAID 5 Really a Bargain?":

  • RAID 5 costs more for write-intensive applications than RAID 1.
  • RAID 5 is less outage resilient than RAID 1.
  • RAID 5 suffers massive performance degradation during partial outage.
  • RAID 5 is less architecturally flexible than RAID 1.
  • Correcting RAID 5 performance problems can be very expensive.

My practical experience with RAID arrays configuration

To make picture clear, I'm putting RAID 10 vs RAID 5 configuration for high-load database, Vmware / Xen servers, mail servers, MS - Exchange mail server etc:

RAID LevelTotal array capacityFault toleranceRead speedWrite speed
RAID-10
500GB x 4 disks
1000 GB1 disk4X2X
RAID-5
500GB x 3 disks
1000 GB1 disk2XSpeed of a RAID 5 depends upon the controller implementation

You can clearly see RAID 10 outperforms RAID 5 at fraction of cost in terms of read and write operations.

A note about backup

Any RAID level will not protect you from multiple disk failures. While one disk is off line for any reason, your disk array is not fully redundant. Therefore, old good tape backups are always recommended.

Please add your thoughts and experience in the comments below.

Further readings:

Seagate Barracuda: 1.5TB Hard Drive Launched

Wow, this is a large size desktop hard disk for storing movies, tv shows, music / mp3s, and photos. You can also load multiple operating systems using vmware or other software for testing purpose. This hard disk comes with 5 year warranty and can transfer at 300MB/s. From the article:

It's been more than 18 months since Hitachi reached the terabyte mark with the Deskstar 7K1000. In that time, all the major players in the hard drive industry have spun up terabytes of their own, and in some cases, offered multiple models targeting different markets. With so many options available and more than enough time for the milestone capacity's initial buzz to fade, it's no wonder that the current crop of 1TB drives is more affordable than we've ever seen from a flagship capacity. The terabyte, it seems, is old news.

Fig.01: Seagate's Barracuda 7200.11 1.5TB hard drive

Fig.01: Seagate's Barracuda 7200.11 1.5TB hard drive

The real question is about reliability. How reliable is the hard disk? So far my Seagate 500GB hard disk working fine. I might get one to dump all my multimedia data / files :)

Sun Solaris on its Deathbed – Claims Jim Zemlin

Jim Zemlin is executive director of the Linux Foundation claims Solaris UNIX is irrelevant and Linux is future. From the article:

Linux is enjoying growth, with a contingent of devotees too large to be called a cult following at this point. Solaris, meanwhile, has thrived as a longstanding, primary Unix platform geared to enterprises.

Sun officials believe the 16-year-old Solaris platform remains a pivotal, innovative platform. But at the Linux Foundation, there is a no-conciliatory stance; the attitude there is to tell Solaris and Sun to move out of the way. "The future is Linux and Microsoft Windows," says foundation Executive Director Jim Zemlin. "It is not Unix or Solaris."

Is Sun Solaris on its deathbed?

Sure Linux has great value but Solaris has its own market share. They make great OS with good features such as DTrace, ZFS and many more. Many government and defense project selects Solaris for Database and many mission critical applications, while Linux used for Web, mail and proxy services.

What do you think?