File system

The sar command collects, report, or save UNIX / Linux system activity information. It will save selected counters in the operating system to the /var/log/sa/sadd file. From the collected data, you get lots of information about your server:

  1. CPU utilization
  2. Memory paging and its utilization
  3. Network I/O, and transfer statistics
  4. Process creation activity
  5. All block devices activity
  6. Interrupts/sec etc.

The sar command output can be used for identifying server bottlenecks. However, analyzing information provided by sar can be difficult, so use kSar tool. kSar takes sar command output and plots a nice easy to understand graph over a period of time.
[continue reading…]

This is an interesting visualization techniques for software analysis. From the article:

Despite being a very important part of any operating system, file systems tend to get little attention. The first part is a detail analysis of one particular Linux Kernel tree and the second is a shorter one done over a large number of file systems from Linux Kernel 2.6.0 to 2.6.29. After that there is a small section that shows some aspects of the BSD family. After conclusions there is an appendix consisting of three things: the first one explains how the file systems for Linux were compiled, the second one shows timelines for the releases of Linux Kernel, FreeBSD, NetBSD and OpenBSD; the last is a detailed map of the external symbols of the kernel modules analyzed in the second section.

A Visual Expedition Inside the Linux File Systems

The tail command is one of the best tool to view log files in a real time using tail -f /path/to/log.file syntax on a Unix-like systems. The program MultiTail lets you view one or multiple files like the original tail program. The difference is that it creates multiple windows on your console (with ncurses). This is one of those dream come true program for UNIX sys admin job. You can browse through several log files at once and do various operations like search for errors and more.
[continue reading…]

Linux and other Unix-like operating systems use the term “swap” to describe both the act of moving memory pages between RAM and disk and the region of a disk the pages are stored on. It is common to use a whole partition of a hard disk for swapping. However, with the 2.6 Linux kernel, swap files are just as fast as swap partitions. Now, many admins (both Windows and Linux/UNIX) follow an old rule of thumb that your swap partition should be twice the size of your main system RAM. Let us say I’ve 32GB RAM, should I set swap space to 64 GB? Is 64 GB of swap space required? How big should your Linux / UNIX swap space be?
[continue reading…]

I’ve already written a small tutorial about finding out if a file exists or not under Linux / UNIX bash shell. However, couple of our regular readers like to know more about a directory checking using if and test shell command.

General syntax to see if a directory exists or not

[ -d directory ]
OR
test directory
See if a directory exists or not with NOT operator:
[ ! -d directory ]
OR
! test directory

Find out if /tmp directory exists or not

Type the following command:
$ [ ! -d /tmp ] && echo 'Directory /tmp not found'
OR
$ [ -d /tmp ] && echo 'Directory found' || echo 'Directory /tmp not found'

Sample Shell Script to gives message if directory exists

Here is a sample shell script:

#!/bin/bash
DIR="$1"
 
if [ $# -ne 1 ]
then
	echo "Usage: $0 {dir-name}"
	exit 1
fi
 
if [ -d "$DIR" ]
then
	echo "$DIR directory  exists!"
else
	echo "$DIR directory not found!"
fi

If your network is heavily loaded you may see some problem with Common Internet File System (CIFS) and NFS under Linux. By default Linux CIFS mount command will try to cache files open by the client. You can use mount option forcedirectio when mounting the CIFS filesystem to disable caching on the CIFS client. This is tested with NETAPP and other storage devices and Novell, CentOS, UNIX and Red Hat Linux systems. This is the only way to avoid data mis-compare and problems.

The default is to attempt to cache ie try to request oplock on files opened by the client (forcedirectio is off). Foredirectio also can indirectly alter the network read and write size, since i/o will now match what was requested by the application, as readahead and writebehind is not being performed by the page cache when forcedirectio is enabled for a mount

mount -t cifs //mystorage/data2 -o username=vivek,password=myPassword,rw,bg,vers=3,proto=tcp,hard,intr,rsize=32768,wsize=32768,forcedirectio,llock /data2

Refer mount.cifs man page, docs stored at Documentation/filesystems/cifs.txt and fs/cifs/README in the linux kernel source tree for additional options and information.

NFS is pretty old file sharing technology for UNIX based system and storage systems. However, it suffers from performance issues. NFSv4.1 address data access issues by adding a new feature called parallel NFS (pNFS) – a method of introducing Data Access Parallelism. The end result is ultra fast file sharing for clusters and high availability configurations.

The Network File System (NFS) is a stalwart component of most modern local area networks (LANs). But NFS is inadequate for the demanding input- and output-intensive applications commonly found in high-performance computing — or, at least it was. The newest revision of the NFS standard includes Parallel NFS (pNFS), a parallelized implementation of file sharing that multiplies transfer rates by orders of magnitude.

In addition to pNFS, NFSv4.1 provides Sessions, Directory Delegation and Notifications, Multi-server Namespace, ACL/SACL/DACL, Retention Attributions, and SECINFO_NO_NAME.

Fig.01: The conceptual organization of pNFS - Image credit IBM

According to wikipedia:

The NFSv4.1 protocol defines a method of separating the meta-data (names and attributes) of a filesystem from the location of the file data; it goes beyond the simple name/data separation of striping the data amongst a set of data servers. This is different from the traditional NFS server which holds the names of files and their data under the single umbrella of the server. There exists products which are multi-node NFS servers, but the participation of the client in separation of meta-data and data is limited. The NFSv4.1 client can be enabled to be a direct participant in the exact location of file data and avoid solitary interaction with the single NFS server when moving data.

The NFSv4.1 pNFS server is a collection of server resources or components; these are assumed to be controlled by the meta-data server.

The pNFS client still accesses a single meta-data server for traversal or interaction with the namespace; when the client moves data to and from the server it may be directly interacting with the set of data servers belonging to the pNFS server collection.

More information about pNFS

  1. Scale your file system with Parallel NFS
  2. Linux NFS Overview, FAQ and HOWTO Documents
  3. NFSv4 delivers seamless network access
  4. Nfsv4 Status Pages
  5. NFS article from the Wikipedia

Linux target framework (tgt) aims to simplify various SCSI target driver (iSCSI, Fibre Channel, SRP, etc) creation and maintenance. The key goals are the clean integration into the scsi-mid layer and implementing a great portion of tgt in user space.

The developer of IET is also helping to develop Linux SCSI target framework (stgt) which looks like it might lead to an iSCSI target implementation with an upstream kernel component. iSCSI Target can be useful:

a] To setup stateless server / client (used in diskless setups).
b] Share disks and tape drives with remote client over LAN, Wan or the Internet.
c] Setup SAN – Storage array.
d] To setup loadbalanced webcluser using cluster aware Linux file system etc.

In this tutorial you will learn how to have a fully functional Linux iSCSI SAN using tgt framework.
[continue reading…]