Unix / Linux Shell: Parse a Text CSV File Field by Field

I work for a small ISP (Internet Service Provider), and we use Linux and Unix-like operating systems with the bash shell. I want to write a shell script that parses a CSV file line by line and then parses each line field by field. The sample input file is as follows:

example.com,username,groupname,homedir,md5password,permission,secondarygroup

I need to extract each of these fields (example.com, username, groupname, homedir, md5password, permission, secondarygroup) and pass them to different system utilities. How do I write a shell script to automate this task and use the bash shell to parse a text file?

Tutorial details
Difficulty: Easy
Root privileges: No
Requirements: Bash/Ksh, awk
Time: 10m

You can use the bash while loop and the read built-in command:

  1. while loop command – The while statement is used to execute a list of commands repeatedly.
  2. read command – Use the read command if you want to receive input while running a script. The read statement accepts input from the keyboard or file.
  3. $IFS – The Internal Field Separator (IFS) is used for word splitting after expansion and to split lines into words with the read builtin command.
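
The three pieces listed above can be tried on a single record before touching a file. The following is only a minimal sketch: the function name split_line and the sample record are assumptions made for illustration. It also shows that read puts everything left over, commas included, into the last variable.

```shell
#!/bin/bash
# split_line is a hypothetical helper used only for this illustration.
# It splits one comma-separated record into fields using read and IFS.
split_line() {
  local first second rest
  # IFS=',' applies only to this read command; the global IFS is unchanged.
  IFS=',' read -r first second rest <<< "$1"
  printf 'first=%s second=%s rest=%s\n' "$first" "$second" "$rest"
}

split_line "example.com,username,groupname,homedir"
```

Because only three variable names are given, the third one receives the remainder of the line (groupname,homedir).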

Examples

In this example, read a comma-separated values (CSV) text file using a combination of the while and read commands. Here is a sample file.csv:

cyberciti.biz,foo,bar,user1,password1,homedir1,chroot
nixcraft.com,oof,rab,user2,password2,homedir,nochroot

Create a shell script as follows:

#!/bin/bash
input="/path/to/your/input/file.csv"
# Set "," as the field separator using $IFS 
# and read line by line using while read combo 
while IFS=',' read -r f1 f2 f3 f4 f5 f6 f7
do 
  echo "$f1 $f2 $f3 $f4 $f5 $f6 $f7"
done < "$input"

Save and close the file. Make it executable and run it as follows:
$ chmod +x script.sh
$ ./script.sh
Sample outputs:

cyberciti.biz foo bar user1 password1 homedir1 chroot
nixcraft.com oof rab user2 password2 homedir nochroot
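
One caveat with a while read loop: read returns a non-zero status when the last line of the file has no trailing newline, so that record is skipped even though its fields were read. A possible workaround, shown here as a sketch, is to also test whether the first field is non-empty. The temporary file and the three-field records are assumptions made to keep the example self-contained.

```shell
#!/bin/bash
# Assumption: sample data is written to a temporary file WITHOUT a final newline.
input=$(mktemp)
printf 'cyberciti.biz,foo,bar\nnixcraft.com,oof,rab' > "$input"

count=0
# The extra test after || lets the loop body run for a last line
# that lacks a trailing newline (as long as its first field is not empty).
while IFS=',' read -r f1 f2 f3 || [ -n "$f1" ]
do
  count=$((count + 1))
  echo "$f1 $f2 $f3"
done < "$input"

rm -f "$input"
echo "records: $count"
```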

You can pass f1, f2, and other fields to other Linux or Unix commands as required:

#!/bin/bash
input="/path/to/your/input/file.csv"
while IFS=',' read -r f1 f2 f3 f4 f5 f6 f7
do 
  /path/to/useradd -d "$f4" "$f2"
done < "$input"
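
Before running a command such as useradd for real, it may help to print what would be executed. The sketch below is a dry run under stated assumptions: print_useradd_commands is a hypothetical helper name, the field positions mirror the loop above, and records with an empty user name or home directory are simply skipped.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints a useradd command for each record
# instead of running it. Reads CSV records from standard input.
# Assumption: field 2 is the user name and field 4 is the home directory.
print_useradd_commands() {
  local f1 f2 f3 f4 f5 f6 f7
  while IFS=',' read -r f1 f2 f3 f4 f5 f6 f7
  do
    # Skip blank lines and records missing the user name or home directory.
    if [ -z "$f2" ] || [ -z "$f4" ]; then
      continue
    fi
    # %q (a bash printf extension) quotes each value for safe reuse in a shell.
    printf 'useradd -d %q %q\n' "$f4" "$f2"
  done
}

print_useradd_commands <<'EOF'
cyberciti.biz,foo,bar,user1,password1,homedir1,chroot

nixcraft.com,oof,rab,user2,password2,homedir,nochroot
EOF
```

Once the printed commands look right, the printf line can be swapped for the actual command.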

IFS is a special shell variable. The Internal Field Separator (IFS) is used for word splitting after expansion and to split lines into words with the read builtin command.
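
When the number of fields is not known in advance, the same IFS-based splitting can fill an array instead of named variables. This is a bash-specific sketch (read -a is not available in every shell; ksh uses read -A), and the variable names line and fields are assumptions made for illustration.

```shell
#!/bin/bash
# Bash-specific: read -a stores the split words in an indexed array.
line="nixcraft.com,oof,rab,user2,password2,homedir,nochroot"
IFS=',' read -r -a fields <<< "$line"

echo "number of fields: ${#fields[@]}"
echo "first field: ${fields[0]}"
echo "last field: ${fields[${#fields[@]}-1]}"
```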

Parse a CSV file using awk

Awk uses a field separator variable called FS; it does not use the name IFS, which is used by the POSIX-compliant shells as described above. The syntax is as follows:

awk 'BEGIN { FS = "," } ; { do_something_here }' < input.csv
awk 'BEGIN { FS = "," } ; { do_something_here } END{ clean_up_here }' < input.csv
awk 'BEGIN { FS = "," } ; { print }' < input.csv

In this example, display the domain names from file.csv:

awk 'BEGIN { FS = "," } ; { print $1 }' < file.csv

Sample outputs:

cyberciti.biz
nixcraft.com
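
The field separator can also be set with awk's -F option instead of a BEGIN block. The sketch below uses that form together with the built-in NR (record number) and NF (number of fields) variables to check that every record has the expected number of fields. The function name count_fields is an assumption, and the data is supplied inline through a here-document.

```shell
#!/bin/bash
# count_fields is a hypothetical helper: for each CSV record on standard
# input it prints the record number, the first field, and the field count.
count_fields() {
  awk -F ',' '{ print NR ": " $1 " has " NF " fields" }'
}

count_fields <<'EOF'
cyberciti.biz,foo,bar,user1,password1,homedir1,chroot
nixcraft.com,oof,rab,user2,password2,homedir,nochroot
EOF
```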

In this final example, pass $1, $2, and other fields to other Linux or Unix commands using awk's system(cmd) function, which executes cmd:

awk 'BEGIN { FS = "," } ; { cmd = "/path/to/useradd -d " $4 " " $2; system(cmd) }' < file.csv
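
Because system() hands the assembled string to a shell, a field that contains spaces or shell metacharacters would be interpreted by that shell. A cautious first step is to print the command strings and review them before executing anything. The following is a sketch of that idea: preview_commands is a hypothetical helper name, and the NF >= 4 guard is an assumption that skips records too short to hold a home directory field.

```shell
#!/bin/bash
# preview_commands is a hypothetical helper: it prints the command string
# that would be built for each record instead of passing it to system().
preview_commands() {
  awk -F ',' 'NF >= 4 { print "/path/to/useradd -d " $4 " " $2 }'
}

preview_commands <<'EOF'
cyberciti.biz,foo,bar,user1,password1,homedir1,chroot
nixcraft.com,oof,rab,user2,password2,homedir,nochroot
EOF
```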

8 comments
  • Oliver Jul 3, 2012 @ 7:48

    Why don’t you use awk? Are you restricted to using bash only? If not, I would use awk, as it is designed to do exactly what you want: process a file line by line, splitting it into fields using FS.

    • 🐧 Nix Craft Mar 4, 2014 @ 12:05

      The faq has been updated with the awk example. I appreciate your post.

  • Jon Tollerton Jul 18, 2012 @ 19:14

    Also, this likely fails if you have any internal commas in the data, which most applications treat by double quoting the whole field.

    example.com,username,groupname,homedir,md5password,permission,secondarygroup
    would work, but

    "example, my",username,groupname,homedir,md5password,permission,secondarygroup
    would cause fields to be misaligned from your expectations.

  • TORNADO Aug 23, 2012 @ 12:52

    I agree with Jon regarding commas in the data.

    Some time ago I found a tool named csvtool :)
    It is in ocaml-csv package in the epel repo.

  • hepha Mar 3, 2014 @ 15:21

    The second example should be
    while IFS=',' read

  • foobar Mar 4, 2014 @ 13:43

    Or you can use the -F option in awk instead of the BEGIN block to define the separator value.

  • be simple Feb 12, 2016 @ 18:43

    cut -d',' -f 2,3,4,5

    but you can use awk, python, C++, Windows10 with VisualStudio… some other monstroidal something..
