How To Use awk In Bash Scripting

Last updated August 14, 2009

How do I use awk pattern scanning and processing language under bash scripts? Can you provide a few examples?

Awk is an excellent tool for building UNIX/Linux shell scripts. AWK is a programming language designed for processing text-based data, whether it comes from files, data streams, or shell pipes. In other words, you can combine awk with shell scripts or use it directly at a shell prompt.

Print a Text File

awk '{ print }' /etc/passwd
awk '{ print $0 }' /etc/passwd

Print Specific Field

Use : as the input field separator and print only the first field, i.e. the usernames (all other fields are ignored):
awk -F':' '{ print $1 }' /etc/passwd
Send output to sort command using a shell pipe:
awk -F':' '{ print $1 }' /etc/passwd | sort
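
As a small extension of the commands above, the following sketch prints two fields instead of one. It runs against a temporary sample file rather than the real /etc/passwd; the sample entries and the variable names passwd_sample and user_shells are made up for illustration.

```shell
# Build a small sample in /etc/passwd format (entries are made up for illustration)
passwd_sample=$(mktemp)
cat > "$passwd_sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
vivek:x:1000:1000:Vivek:/home/vivek:/bin/bash
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
EOF

# Field 1 is the username and field 7 is the login shell.
# The comma in print joins the fields with a space (the default output separator).
user_shells=$(awk -F':' '{ print $1, $7 }' "$passwd_sample")
echo "$user_shells"

rm -f "$passwd_sample"
```

Using a comma between $1 and $7 keeps the output readable; writing the two fields next to each other without a comma would concatenate them with no separator.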

Pattern Matching

You can print a line only if it matches a pattern. For example, display all lines from an Apache log file where the HTTP status code is 500 (the 9th field holds the status code of each HTTP request):
awk '$9 == 500 { print $0}' /var/log/httpd/access.log
The part outside the curly braces is called the “pattern”, and the part inside is the “action”. The comparison operators are the ones from C, and the C conditional operator ?: is available as well:

== != < > <= >= ?:

If no pattern is given, then the action applies to all lines. If no action is given, then the entire line is printed. If “print” is used all by itself, the entire line is printed. Thus, the following are equivalent:
awk '$9 == 500 ' /var/log/httpd/access.log
awk '$9 == 500 {print} ' /var/log/httpd/access.log
awk '$9 == 500 {print $0} ' /var/log/httpd/access.log
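
To make the pattern/action split concrete, here is a minimal sketch that uses a different comparison operator and a custom action. The log lines are invented sample data in the common Apache access log layout, and the names log_sample, server_errors and not_found are illustrative only.

```shell
# Invented sample data: field 1 is the client IP, field 7 the path, field 9 the status
log_sample=$(mktemp)
cat > "$log_sample" <<'EOF'
192.168.1.10 - - [14/Aug/2009:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512
192.168.1.11 - - [14/Aug/2009:10:00:05 +0000] "GET /app.php HTTP/1.1" 500 128
192.168.1.12 - - [14/Aug/2009:10:00:09 +0000] "GET /missing HTTP/1.1" 404 64
192.168.1.11 - - [14/Aug/2009:10:00:12 +0000] "GET /report.php HTTP/1.1" 503 0
EOF

# Pattern: status is 500 or higher. Action: print client IP, path and status.
server_errors=$(awk '$9 >= 500 { print $1, $7, $9 }' "$log_sample")
echo "$server_errors"

# Pattern only, no action: the whole matching line is printed.
not_found=$(awk '$9 == 404' "$log_sample")
echo "$not_found"

rm -f "$log_sample"
```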

Print Lines Containing tom, jerry OR vivek

Print every line that matches at least one of the patterns (the matches may be on separate lines):
awk '/tom|jerry|vivek/' /etc/passwd
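
The | inside the regular expression means "or". If you really need lines that contain several words at once, one possible approach is to combine separate patterns with &&. The sketch below contrasts the two; the sample file and the names users, any_match and both_match are made up for illustration.

```shell
# Sample data in /etc/passwd format (entries are made up for illustration)
users=$(mktemp)
cat > "$users" <<'EOF'
tom:x:1001:1001:Tom:/home/tom:/bin/bash
jerry:x:1002:1002:Jerry, friend of tom:/home/jerry:/bin/bash
vivek:x:1000:1000:Vivek:/home/vivek:/bin/bash
root:x:0:0:root:/root:/bin/bash
EOF

# OR: count lines containing tom, jerry or vivek
any_match=$(awk '/tom|jerry|vivek/ { n++ } END { print n + 0 }' "$users")

# AND: print the username of lines containing both tom and jerry
both_match=$(awk -F':' '/tom/ && /jerry/ { print $1 }' "$users")

echo "any: $any_match, both: $both_match"
rm -f "$users"
```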

Print 1st Line From File

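awk "NR==1{print;exit}" /etc/resolv.conf
To print an arbitrary line instead, store its number in the shell variable line; the double quotes let the shell expand $line before awk sees the program:
awk "NR==$line{print;exit}" /etc/resolv.conf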
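
An alternative to expanding $line inside a double-quoted program is to keep the program in single quotes and hand the number over with -v. This is only a sketch: the sample file stands in for /etc/resolv.conf, and the names resolv_sample, n and nth_line are illustrative.

```shell
# Stand-in for /etc/resolv.conf (addresses are from the documentation range)
resolv_sample=$(mktemp)
printf 'search example.com\nnameserver 192.0.2.1\nnameserver 192.0.2.2\n' > "$resolv_sample"

line=2

# -v copies the shell value into the awk variable n; the program itself stays
# in single quotes, so the shell never touches it.
nth_line=$(awk -v n="$line" 'NR == n { print; exit }' "$resolv_sample")
echo "$nth_line"

rm -f "$resolv_sample"
```

Keeping the awk program in single quotes avoids surprises when the program itself contains $ or other characters that the shell would otherwise interpret.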

Simple Arithmetic

Get the sum of all the numbers in the first column:
awk '{total += $1} END {print total}' earnings.txt
Bash arithmetic is integer-only, but awk can calculate with floating point numbers:
awk 'BEGIN {printf "%.3f\n", 2005.50 / 3}'
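
Both ideas can be combined. The following sketch sums a column and also prints the average; the numbers are made-up sample data, and earnings and summary are illustrative names.

```shell
# Made-up sample data: one amount per line
earnings=$(mktemp)
printf '100.50\n200.25\n50\n' > "$earnings"

# total accumulates field 1; in END, NR holds the number of lines read.
# The NR > 0 guard avoids a division by zero on an empty file.
summary=$(awk '{ total += $1 }
               END { if (NR > 0) printf "total=%.2f average=%.2f\n", total, total / NR }' "$earnings")
echo "$summary"

rm -f "$earnings"
```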

Call AWK From Shell Script

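A shell script to list all IP addresses that access your website. The script uses awk to process the log file, and the sanity checks are done with plain shell commands. HTTPDLOG and OUT are not defined in the listing as published, so the values below are placeholders; adjust them for your setup.

[ $# -eq 0 ] && { echo "Usage: $0 domain-name"; exit 1; }
# Assumed placeholders, not part of the original listing:
HTTPDLOG="/var/log/httpd/$1/access.log"   # hypothetical per-domain log path
OUT="/tmp/ip-list.$$"                     # temporary work file
if [ -f "$HTTPDLOG" ]; then
	awk '{print}' "$HTTPDLOG" > "$OUT"
	awk '{ print $1 }' "$OUT" | sort -n | uniq -c | sort -n
else
	echo "$HTTPDLOG not found. Make sure domain exists and setup correctly."
fi
/bin/rm -f "$OUT"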

AWK and Shell Functions

Here is another example. chrootCpSupportFiles() finds the shared libraries required by a program (such as perl or php-cgi), or by a shared library specified on the command line, and copies them to the destination. The code calls awk to print selected fields from the ldd output:

chrootCpSupportFiles() {
# Set CHROOT directory name
local BASE="$1"         # JAIL ROOT
local pFILE="$2"        # copy bin file libs
[ ! -d $BASE ] && mkdir -p $BASE || :
# Third field of ldd output is the library path; drop "(0x...)" address entries
FILES="$(ldd $pFILE | awk '{ print $3 }' |egrep -v ^'\(')"
for i in $FILES
do
  dcc="$(dirname $i)"
  [ ! -d $BASE$dcc ] && mkdir -p $BASE$dcc || :
  /bin/cp $i $BASE$dcc
done
# Copy the dynamic loader (ld-linux) as well
sldl="$(ldd $pFILE | grep 'ld-linux' | awk '{ print $1}')"
sldlsubdir="$(dirname $sldl)"
if [ ! -f $BASE$sldl ];
then
        /bin/cp $sldl $BASE$sldlsubdir
fi
}

This function can be called as follows:
chrootCpSupportFiles /lighttpd-jail /usr/local/bin/php-cgi

AWK and Shell Pipes

List your top 10 favorite commands:
history | awk '{print $2}' | sort | uniq -c | sort -rn | head
Sample Output:

    172 ls
    144 cd
     69 vi
     62 grep
     41 dsu
     36 yum
     29 tail
     28 netstat
     21 mysql
     20 cat
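
The counting step does not have to be done by sort and uniq -c. As a sketch of an alternative, awk can count with an associative array, leaving only the final ordering to sort. The input below is invented sample data in the layout printed by history (number, then command); history_sample and top_commands are illustrative names.

```shell
# Invented sample in "history" layout: entry number, command, arguments
history_sample=$(mktemp)
printf '%s\n' '1 ls -l' '2 cd /tmp' '3 ls' '4 grep foo bar.txt' '5 ls -a' '6 cd' > "$history_sample"

# count[] is indexed by command name (field 2); END prints "count command" pairs,
# which sort -rn then orders from most to least used.
top_commands=$(awk '{ count[$2]++ }
                    END { for (cmd in count) print count[cmd], cmd }' "$history_sample" |
               sort -rn | head)
echo "$top_commands"

rm -f "$history_sample"
```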

whois | awk '/Domain Expiration Date:/ { print $6"-"$5"-"$9 }'

Awk Program File

You can put all awk commands in a file and call the same from a shell script using the following syntax:
awk -f myprogram.awk input.txt
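
For illustration, here is a minimal end-to-end sketch of that syntax. The program and input are written to temporary files so the example is self-contained; the program contents and the names prog, input and program_output are assumptions, not the author's myprogram.awk.

```shell
# A tiny awk program file (contents are an assumption for illustration)
prog=$(mktemp)
cat > "$prog" <<'EOF'
# Count non-empty lines and report the total at the end
NF > 0 { n++ }
END    { print n + 0, "non-empty lines" }
EOF

# Sample input with one blank line
input=$(mktemp)
printf 'alpha\n\nbeta\ngamma\n' > "$input"

# -f reads the awk program from the file instead of the command line
program_output=$(awk -f "$prog" "$input")
echo "$program_output"

rm -f "$prog" "$input"
```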

Awk in Shell Scripts – Passing Shell Variables TO Awk

You can pass shell variables to awk using the -v option:

echo | awk -v x=$n1 -v y=$n2 -f program.awk

This assigns the value of the shell variable n1 to the awk variable x (and n2 to y) before execution of the program begins. Such variable values are available to the BEGIN block of an AWK program:

{print ans}
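
The listing above shows only part of program.awk, so the following is a self-contained sketch of the same idea rather than a reconstruction of it. It assumes the program simply adds the two values; n1, n2 and sum are illustrative names.

```shell
n1=5
n2=10

# x and y are set by -v before the program starts, so BEGIN can already use them.
# Because the program has only a BEGIN block, awk reads no input and "echo |" is not needed.
sum=$(awk -v x="$n1" -v y="$n2" 'BEGIN { ans = x + y; print ans }')
echo "$sum"
```

Quoting "$n1" and "$n2" protects against values that contain spaces.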

Posted by: Vivek Gite

The author is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector.

31 comments

  1. Thanks for this helpful post, but is it possible to call a bash user-defined function within awk?
    function Print_NATE (){
    echo $1
    }
    awk '{ Print_name "ALOK" }'

    I know the awk syntax is wrong; can you provide me with a small HOW TO on this?


  2. some of my one liner awk tricks:

    — To convert squid log timestamps to readable, sortable format:

    gawk '{print strftime("%m/%d %H:%M:%S ",$1)" "substr($0,12,999)}' access.log > dated

    — To avoid having to cut/paste the above, I have in my .profile file:

    alias tim="gawk '{print strftime(\"%m/%d %H:%M:%S \",\$1)\" \"substr(\$0,12,999)}'"

    — To use AWK to process comma separated data:

    awk -F, '{print $1, "," $6}' excel-save.csv> extract.csv

    — To count complex pattern occurrence

    awk '{if (substr($2,1,4) == "2008" && $4 == "Exception") {print $1}}' test|grep -c ^
  3. An alternate way to call awk programs (or “scripts”), in either Linux or cygwin:

    have awk script start with the line:
    #!/usr/bin/gawk -f

    have bash calling sequences like:
    ./script input-file
    ./script -v param1="08/15/" -v param2="MISS/5" dated > output-file

  4. Hope this illustrates passing arguments from bash to an awk script:

    $ ./
    1 is a valid month number
    4 is a valid month number
    8 is a valid month number
    12 is a valid month number
    18 is not a valid month number
    300 is not a valid month number
    $ cat
    # demonstrating how to pass a parameter from bash to an awk script
    for tester in 1 4 8 12 18 300; do
    ./monthcheck.awk -v awkparam1=$tester monthlist
    done
    $ cat monthcheck.awk
    #!/usr/bin/gawk -f
    BEGIN {
       answer = "is not a valid month number"
    }
    {
       if ( $1 == awkparam1 ) {
          answer = "is a valid month number"
       }
    }
    END {
       print awkparam1  " " answer
    }
    $ cat monthlist
  5. HI nerdoug
    Thanks for the nice post "Hope this illustrates passing arguments from bash to an awk script:"
    But I am looking for just the inverse case:
    "passing arguments from awk to a bash user-defined function."

    It will be a great help if you can come up with some example.

    Thanks for your time and such a useful note.


  6. # =====================================================
    # method 1A: returning a value from inline awk command via output stream
    $ cat expenses
    name date hotel breakfast lunch dinner beer
    Doug 2009-07-20 159 10 15 32 44
    Doug 2009-07-21 159 0 12 25 0
    Doug 2009-07-22 159 8 15 41 87
    Doug 2009-07-23 159 0 0 10 0
    Doug 2009-07-24 0 11 15 0 0
    $ cat
    # bash script which calls awk to return a value
    #    - here, value is biggest daily food expense in file
    #    -also does arithmetic in awk, easier than in bash
    #    -value is returned as the output of the awk command/script
    #    -this sacrifices production of a regular output stream
    #    -the first if skips over first data file line with column titles
    bash_max=`awk '{if ($1 !="name") {t=$4+$5+$6; if (t>big) {big=t}}} END {print big}' expenses`
    echo " value returned to bash is "$bash_max
    if [ "$bash_max" -gt 60 ] ; then
       echo " -daily food expense guideline violated"
    else
       echo " -no food expense violation"
    fi
    $ ./
     value returned to bash is 64
     -daily food expense guideline violated
    # method 1B: returning a value from awk script via output stream
    $ cat
    # bash script which calls awk script to return a value
    bash_max=`./scan.awk expenses`
    echo " value returned to bash is "$bash_max
    if [ "$bash_max" -gt 60 ] ; then
       echo " -daily food expense guideline violated"
    else
       echo " -no food expense violation"
    fi
    $ cat scan.awk
    #!/usr/bin/awk -f
    # awk script to return largest food expenditure from input file
    {
      if ($1 !="name") {
         t = $4 + $5 + $6
         if (t>big) {
            big = t
         }
      }
    }
    END {print big}
    $ ./
     value returned to bash is 64
     -daily food expense guideline violated
    # method 2: returning value via temporary disk file, with normal output stream
    $ cat
    # bash script which calls awk to return a value
    #    -value is returned via a temporary disk file
    #    -still allows production of a regular output stream
    # run the awk scan creating summary file, and max value in disk file
    ./scan-disk.awk expenses > summary
    # retrieve value written to disk
    bash_max=`cat big.txt`
    if [ "$bash_max" -gt 60 ] ; then
       echo "*** daily food expense guideline violated"
       echo "    see following details"
       cat summary
    else
       echo " -no food expense violation"
    fi
    $ cat scan-disk.awk
    #!/usr/bin/awk -f
    # awk script to return largest food expenditure from input file
    # -by writing resulting value to a temporary disk file
    # -also produce normal output stream
    {
      if ($1 !="name") {
         t = $4 + $5 + $6
         if (t>big) {
            big = t
         }
         # produce reformatted output stream
         print $2 " food=" t " beer= "$7
      }
    }
    END {
       print big > "big.txt"
    }
    $ ./
    *** daily food expense guideline violated
        see following details
    2009-07-20 food=57 beer= 44
    2009-07-21 food=37 beer= 0
    2009-07-22 food=64 beer= 87
    2009-07-23 food=10 beer= 0
    2009-07-24 food=26 beer= 0
    # Method I can't get to work
    # I was hoping that using this command to embed a shell command in awk would work...
    #       t = system("export bash_max=123")
    # but I can't get it to work, perhaps due to too many levels of child processes?
  7. but I can’t get it to work, perhaps due to too many levels of child processes?

    Yes, you need to keep everything in same shell. Awk calls sh whenever system() is used. To OP, you better use perl or python if you need really complicated stuff.
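
The point about child processes can be checked with a short sketch. It first shows that a variable set through system() never reaches the calling shell, then shows one workable direction for handing an awk value to a shell function: capture awk's output with command substitution and pass it on as an argument. The function print_name and the variable names are hypothetical.

```shell
# 1. A variable exported inside system() lives in a child shell only
unset bash_max
awk 'BEGIN { system("bash_max=123; export bash_max") }'
after_system=${bash_max:-unset}
echo "after system(): $after_system"

# 2. Returning the value on awk's standard output works
bash_max=$(awk 'BEGIN { print 123 }')
echo "after command substitution: $bash_max"

# 3. Passing an awk result to a shell function (hypothetical function name)
print_name() {
    echo "Hello, $1"
}
name=$(awk 'BEGIN { print "ALOK" }')
greeting=$(print_name "$name")
echo "$greeting"
```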

  8. dear sir,

    I would like to know how I can use awk with a conditional statement.

    For example:

    Suppose there are two conditions on the 3rd column (values 14 and 17) and two more conditions (ab and cd, in the 4th and 6th columns respectively),
    and it should proceed if condition "14de" or "17de" is true.

    (All the data is in the "xyz.txt" file.)
    I have written this:

    'cat xyz.txt | awk -F"," '{ if(((substr($3,1,2)==14) || (substr($3,1,2)==17))) && (substr($4,1,4)=="\"ab\"") && (substr($6,1,4)!="\"cd\"")) print($4) }' "| wc -l'


    1. I have a comment on your bash, and a couple on your awk.

      bash: doing cat infile | awk '{stuff}' > outfile causes each line in the input file to be processed twice, once by cat, and once by awk.
      If instead you do awk '{stuff}' infile > outfile there's only one pass through the file. If you're processing big files, this can be significant.

      awk: I'm not sure if your boolean logic will work, because I don't understand what you meant by the goal of proceeding if condition "14de" or "17de" is true. You can do some of the filtering in a regular expression, and perhaps all of it depending on the formatting of your data. For example, I think you can replace the first part of your if logic with a regular expression match on the 3rd field, like this:
      awk -F, '$3 ~ /^1[47]/ { if ((substr($4,1,4) == "\"ab\"") && (substr($6,1,4) != "\"cd\"")) { print $4 } }' xyz.txt | wc -l

      I don't think the last single quote and the last double quote are needed in your awk.

      hope this helps.

  9. @Vivek.
    Thanks for your topic.

    BTW, regarding passing shell variables to [awk]
    that can be used in the BEGIN block,
    one can also use the ENVIRON array:


    export myawkvar=something; awk 'BEGIN {print "BEGIN:" ENVIRON["myawkvar"] ":"}'

    Would you add this built-in array to your topic?
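
A minimal runnable sketch of the ENVIRON approach, using the same variable name as the one-liner above; environ_output is an illustrative name.

```shell
# The variable must be exported, otherwise awk does not see it in its environment
myawkvar="something"
export myawkvar

# ENVIRON is indexed by environment variable name and is usable in BEGIN
environ_output=$(awk 'BEGIN { print "BEGIN:" ENVIRON["myawkvar"] ":" }')
echo "$environ_output"
```

Unlike -v, values read through ENVIRON are taken literally, so backslash sequences in the value are not interpreted.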


  10. hi,
    I wanted to use awk in a bash function like
    function myfun(){
    local myval=$(awk -F= '/$1/ {print $2}' myconfig.file)
    echo $myval
    }
    mymain=$(myfun "string")

    ->but the problem is i have to use single quotes for awk in myfun function, where i cannot reference the variable $1, which is passed on from my main script.
    ->is there a solution to this ??
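
One possible way around the quoting problem is to leave the awk program in single quotes and pass the function argument in with -v, then compare it against the first field. This is a sketch under assumptions: the config file is a made-up key=value sample standing in for myconfig.file, and conf and key are illustrative names.

```shell
# Made-up key=value file standing in for myconfig.file
conf=$(mktemp)
printf 'host=example.org\nport=8080\n' > "$conf"

myfun() {
    # "$1" is the shell function's first argument; inside awk it is known as key,
    # while awk's own $1 is the text before the = sign.
    awk -F= -v key="$1" '$1 == key { print $2 }' "$conf"
}

mymain=$(myfun "port")
echo "$mymain"

rm -f "$conf"
```

An exact comparison ($1 == key) is used here instead of a regular expression so that special characters in the argument are not interpreted as a pattern.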

  11. a directory contains few folders where my files are located( all files in all folders need to be processed).now,how to use awk recursively for all files. in grep we have
    $grep -r string *.asc
    wat abt awk..?

      1. No, Anantha, this won’t do.
        Again, think of Unix philosophy: Each command should do only one thing, and do it well.

        Most of the time, this implies using pipes.

        Precisely in this case, each time a program needs to traverse directories recursively, the right tool is [find]. Then pipe its result into [xargs], that will call [awk].

        Search The Fantastic Web with these commands names (unix find xargs awk) and it’s up to you now =>
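
As a rough sketch of the find, xargs and awk combination described above: the directory tree and file contents are created on the fly for illustration, and -print0 with xargs -0 is assumed to be available (it is on GNU and BSD systems, but it is not part of older POSIX specifications).

```shell
# Throwaway directory tree with a few .asc files (contents are made up)
topdir=$(mktemp -d)
mkdir -p "$topdir/a" "$topdir/b/c"
printf 'string one\nother\n' > "$topdir/a/first.asc"
printf 'nothing here\n' > "$topdir/b/second.asc"
printf 'a string\nanother string\n' > "$topdir/b/c/third.asc"
printf 'string\n' > "$topdir/b/c/ignored.txt"

# find walks the tree, xargs hands the file names to awk,
# and awk prints each matching line prefixed with its file name.
matches=$(find "$topdir" -type f -name '*.asc' -print0 |
          xargs -0 awk '/string/ { print FILENAME ": " $0 }')
echo "$matches"

rm -rf "$topdir"
```

The null-separated form keeps file names with spaces intact; find's own -exec awk ... {} + would be another way to get the same result without xargs.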


  12. Hi to all!
    I have a question on using awk in bash scripts.
    I am completely confused.

    I have a file

    and a script
    awk '{print $1}' $voc

    When I run
    >./ voc.txt
    I get
    as expected.

    But when I run the script
    awk '{print " word" NR " = (any * '" $1 "' space @increment" NR ") * ;"}' $voc
    I get
    word1 = (any * voc1.txt space @increment1) * ;
    word2 = (any * voc1.txt space @increment2) * ;

    If I put a backslash before $, I get

    word1 = (any * '$1' space @increment1) * ;
    word2 = (any * '$1' space @increment2) * ;

    How can I get

    word1 = (any * 'book' space @increment1) * ;
    word2 = (any * 'help' space @increment2) * ;


  13. Hi, can someone help me put together a bash script that runs columns
    1 2 3 4 5 6
    and not 1

    with column headers column1 = tj/art

    and then redirect to a file >>

  14. Hi All,

    I have a shell script which works on a Unix platform and is able to load data into staging and interface tables, but after it was migrated to another instance on a Linux platform, it no longer works and cannot load data.

    Do the awk and sed commands need any changes on Linux?

    Please help in this regard…

    Please find the shell script below:

    #print "Dollar 1 $1"
    org_id=$(print $1 | awk '{print $9}' | sed 's/"//g')
    print $org_id 
    LOGIN=$(print $1 | awk '{print $3}' | sed 's/"//g' | sed 's/FCP_LOGIN=//') 
    print $datafile
    print $GT_TOP
    print $forecast_file
    #  sed 's/
    //' $forecast_file  > GT_TOP/bin/$org_id.dat
    #  mv  $GT_TOP/bin/$org_id.dat $forecast_file
    dos2ux $forecast_file> GT_TOP/bin/test.csv
    rm $forecast_file
    mv $GT_TOP/bin/test.csv $forecast_file
    chmod 777 $forecast_file
    sqlldr $LOGIN control=$GT_TOP/bin/GB_FORECAST_UPLOAD.ctl data=$forecast_file
    sqlplus -s $LOGIN << EOF
    1. $ cat file

      $ awk -F/ '{print "/"$2"/"$3}' file

      $ awk -F/ '{print $2,$3}' file
      a b
      a b
      a b

  15. can anyone help me with this


    Write a bash shell script that prints out only the even numbered UID’s from /etc/passwd.

    1) Use a for loop with awk to extract the third field in /etc/passwd to a variable x.

    2) Set a variable xmod2=$(($x%2))

    3) Use an if condition to test if xmod2 is 0.

    4) If it is zero, then the UID is even, so echo $x.

  16. Hi ,
    Can anyone tell me how to do this? I want my script to take its input from a separate file, field by field. My input file has fields separated by ; for example: madhuri;beautiful;girl;innocent; The first input should take the first field (the name), the second input should take the second word, and so on. How can I write this? Can you explain it to me, please?
