Unix / Linux Shell: Parse Text CSV File Separator By Field

June 26, 2012 · 7 comments · Last updated March 4, 2014


I work for a small ISP (Internet Service Provider), and we use Linux and Unix-like operating systems with the bash shell. I want to write a shell script that parses a CSV file line by line. Each line must then be parsed again, field by field. The sample input file is as follows:

example.com,username,groupname,homedir,md5password,permission,secondarygroup

I need to extract each of these fields (example.com, username, groupname, homedir, md5password, permission, secondarygroup) and pass them to different system utilities. How do I write a shell script to automate this task and use the bash shell to parse a text file?

Tutorial details
Difficulty: Easy
Root privileges: No
Requirements: Bash/Ksh, awk
Estimated completion time: 10m

You can use the bash while loop and read built-in command:

  1. while loop command - The while statement is used to execute a list of commands repeatedly.
  2. read command - Use the read command if you want to receive input while running a script. The read statement accepts input from the keyboard or file.
  3. $IFS - The Internal Field Separator (IFS) is used for word splitting after expansion and by the read builtin command to split lines into words.
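
To see how these three pieces fit together, here is a minimal sketch that splits a single comma-separated line. The function name split_line and its variable names are hypothetical and chosen only for illustration:

```shell
#!/bin/bash
# split_line is a hypothetical helper used only for illustration.
# It splits one comma-separated line into three fields with read.
split_line() {
  local domain user group
  # IFS=',' applies to this read command only; -r keeps backslashes literal.
  IFS=',' read -r domain user group <<< "$1"
  printf '%s|%s|%s\n' "$domain" "$user" "$group"
}

split_line "example.com,username,groupname"
```

If the line holds more fields than there are variables, read puts the remainder of the line, commas included, into the last variable.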

Examples

In this example, a comma-separated values (CSV) text file is read using a combination of the while and read commands. Here is a sample file.cvs:

cyberciti.biz,foo,bar,user1,password1,homedir1,chroot
nixcraft.com,oof,rab,user2,password2,homedir,nochroot

Create a shell script as follows:

#!/bin/bash
input="/path/to/your/input/file.cvs"
# Set "," as the field separator using $IFS
# and read line by line using while read combo 
while IFS=',' read -r f1 f2 f3 f4 f5 f6 f7
do
  echo "$f1 $f2 $f3 $f4 $f5 $f6 $f7"
done < "$input"
 

Save and close the file. Run it as follows:
$ ./script.sh
Sample outputs:

cyberciti.biz foo bar user1 password1 homedir1 chroot
nixcraft.com oof rab user2 password2 homedir nochroot
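
Input files sometimes start with a header row or lack a newline after the last line. The following is a minimal sketch of one way to cope with both cases. The function name print_fields and the assumption that line 1 is a header are illustrative and not part of the script above:

```shell
#!/bin/bash
# print_fields is a hypothetical helper: it prints the first two fields of
# each record, assuming (for illustration) that line 1 is a header row.
print_fields() {
  local input="$1" header f1 f2 rest
  {
    # Consume and discard the header line.
    read -r header
    # '|| [ -n "$f1" ]' still processes a final line that lacks a newline.
    while IFS=',' read -r f1 f2 rest || [ -n "$f1" ]
    do
      echo "$f1 $f2"
    done
  } < "$input"
}

tmp=$(mktemp)
printf 'domain,user,group\ncyberciti.biz,foo,bar\nnixcraft.com,oof,rab' > "$tmp"
print_fields "$tmp"
rm -f "$tmp"
```

The extra variable rest collects any fields after the second one, so f2 holds exactly one field.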

You can pass f1, f2 and other fields as per your requirements to other Linux or Unix commands:

#!/bin/bash
input="/path/to/your/input/file.cvs"
while IFS=',' read -r f1 f2 f3 f4 f5 f6 f7
do
  /path/to/useradd -d "$f4" "$f2"
done < "$input"
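
Before running a command such as useradd against real data, it can help to print what would be executed. This is a minimal dry-run sketch. The function name show_useradd_cmds and the rule of skipping records with an empty username or home directory are assumptions made for illustration:

```shell
#!/bin/bash
# show_useradd_cmds is a hypothetical helper: it prints the useradd command
# for each record instead of running it, so the input can be reviewed first.
show_useradd_cmds() {
  local input="$1" f1 f2 f3 f4 f5 f6 f7
  while IFS=',' read -r f1 f2 f3 f4 f5 f6 f7
  do
    # Skip records without a username (f2) or home directory (f4).
    if [ -z "$f2" ] || [ -z "$f4" ]; then
      continue
    fi
    # %q quotes each value so the printed line is safe to paste into bash.
    printf '/path/to/useradd -d %q %q\n' "$f4" "$f2"
  done < "$input"
}

tmp=$(mktemp)
printf '%s\n' 'example.com,username,groupname,/home/username,md5password,permission,secondarygroup' > "$tmp"
show_useradd_cmds "$tmp"
rm -f "$tmp"
```

Once the printed commands look right, the printf line can be swapped for the real call.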
 

IFS is a special shell variable. The Internal Field Separator (IFS) is used for word splitting after expansion and by the read builtin command to split lines into words.
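
Two details are easy to miss. Writing IFS=',' in front of read changes the separator for that one command only, and this kind of splitting knows nothing about CSV quoting. Here is a minimal sketch of both points, where the function name naive_split and the sample line are made up for illustration:

```shell
#!/bin/bash
# naive_split is a hypothetical helper that splits its argument on every comma.
naive_split() {
  local f1 f2 f3
  # The assignment prefix affects only this read; the shell's IFS is untouched.
  IFS=',' read -r f1 f2 f3 <<< "$1"
  printf 'f1=[%s] f2=[%s] f3=[%s]\n' "$f1" "$f2" "$f3"
}

# The double-quoted first field is cut at its embedded comma.
naive_split '"example, my",username,groupname'
```

The quoted field "example, my" ends up spread over f1 and f2, so every later field shifts by one. For data like this, a CSV-aware tool is a better fit than plain IFS splitting.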

Parse a CSV file using awk

Awk uses a field separator variable called FS; it does not use the name IFS that POSIX-compliant shells use, as described above. The syntax is as follows:

 
awk 'BEGIN { FS = "," } ; { do_something_here }' < input.cvs
awk 'BEGIN { FS = "," } ; { do_something_here } END{ clean_up_here }' < input.cvs
awk 'BEGIN { FS = "," } ; { print }' < input.cvs
 

In this example, display the domain name from file.cvs:

 
awk 'BEGIN { FS = "," } ; { print $1 }' < file.cvs
 

Sample outputs:

cyberciti.biz
nixcraft.com
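
The field separator can also be set with awk's -F option instead of a BEGIN block. Here is a minimal sketch that uses a throwaway sample file. The function name list_domains is hypothetical:

```shell
#!/bin/bash
# list_domains is a hypothetical helper: it prints the first field of every
# line, prefixed with the line number, and a record count at the end.
list_domains() {
  # -F ',' sets FS before any input is read, like BEGIN { FS = "," }.
  awk -F ',' '{ print NR ": " $1 } END { print NR " records" }' "$1"
}

tmp=$(mktemp)
printf '%s\n' 'cyberciti.biz,foo,bar,user1,password1,homedir1,chroot' \
              'nixcraft.com,oof,rab,user2,password2,homedir,nochroot' > "$tmp"
list_domains "$tmp"
rm -f "$tmp"
```

NR is awk's running record number, so in the END block it holds the total number of lines read.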

In this final example, pass $1, $2 and other fields to other Linux or Unix commands as per your requirements, using awk's system(cmd) function to execute cmd:

 
awk 'BEGIN { FS = "," } ; { cmd = "/path/to/useradd -d " $4 " " $2; system(cmd) }' < file.cvs
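
Because system() hands the assembled string to a shell, it can be worth printing the commands first and checking them. This is a minimal sketch of that idea. The function name print_cmds and the NF check are illustrative additions:

```shell
#!/bin/bash
# print_cmds is a hypothetical helper: it prints one useradd command per
# record instead of executing it with system().
print_cmds() {
  # NF >= 4 skips lines that do not have a home directory field ($4).
  awk -F ',' 'NF >= 4 { print "/path/to/useradd -d " $4 " " $2 }' "$1"
}

tmp=$(mktemp)
printf '%s\n' 'example.com,username,groupname,/home/username,md5password,permission,secondarygroup' > "$tmp"
print_cmds "$tmp"
rm -f "$tmp"
```

Fields that contain spaces or shell metacharacters would need quoting before the printed lines are run.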
 

7 comments

1 Oliver July 3, 2012 at 7:48 am

Why don’t you use awk? Are you restricted to using bash only? If not, I would use awk, as it is designed to do exactly what you want: process a file line by line, splitting it up into fields using IFS.


2 Nix Craft March 4, 2014 at 12:05 pm

The faq has been updated with the awk example. I appreciate your post.


3 Jon Tollerton July 18, 2012 at 7:14 pm

Also, this likely fails if you have any internal commas in the data, which most applications treat by double quoting the whole field.

example.com,username,groupname,homedir,md5password,permission,secondarygroup
would work, but

"example, my",username,groupname,homedir,md5password,permission,secondarygroup
would cause fields to be misaligned from your expectations.


4 TORNADO August 23, 2012 at 12:52 pm

I agree with Jon regarding commas in the data.

Some time ago I found a tool named csvtool :)
It is in ocaml-csv package in the epel repo.


5 hepha March 3, 2014 at 3:21 pm

The second example should be
while IFS=',' read


6 Nix Craft March 4, 2014 at 12:05 pm

Thanks for the heads up!


7 foobar March 4, 2014 at 1:43 pm

Or you can use the -F option in awk instead of the BEGIN block to define the separator value.

