Linux / UNIX Display Lines Common in Two Files

Last updated April 12, 2008

Q. I’m trying to use the diff command, but it is not working. I’d like to display the lines that are common to file1 and file2. How do I do it?

A. Use the comm command; it compares two sorted files line by line. With no options, it produces three-column output: column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.
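
As a minimal sketch of that three-column layout, assuming two small sorted files whose names and contents are made up for illustration (fruits1.txt and fruits2.txt are hypothetical):

```shell
# Hypothetical sample data; both files are already sorted.
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$workdir/fruits1.txt"
printf 'banana\ncherry\ndate\n' > "$workdir/fruits2.txt"

# No options: column 1 = only in the first file, column 2 = only in the
# second file, column 3 = in both. Columns are indented with tabs.
three_col=$(comm "$workdir/fruits1.txt" "$workdir/fruits2.txt")
printf '%s\n' "$three_col"

rm -rf "$workdir"
```

In the output, apple has no indentation (only in the first file), date is indented by one tab (only in the second file), and banana and cherry are indented by two tabs (present in both).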

To Display Those Lines That Are Common to File1 and File2

Type the command as follows:
$ comm /path/to/file1 /path/to/file2
$ comm -1 /path/to/file1 /path/to/file2
$ comm -2 /path/to/file1 /path/to/file2
$ comm -3 /path/to/file1 /path/to/file2

Where,

  • -1 : suppress lines unique to FILE1
  • -2 : suppress lines unique to FILE2
  • -3 : suppress lines that appear in both files
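
The options can be combined, so suppressing columns one and two together leaves only the common lines. A minimal sketch, assuming two hypothetical unsorted files (list1.txt and list2.txt, made up for illustration) that are first sorted into temporary copies because comm expects sorted input:

```shell
# Hypothetical unsorted sample data.
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/list1.txt"
printf 'fig\nkiwi\napple\n' > "$workdir/list2.txt"

# comm expects sorted input, so sort each file into a temporary copy.
sort "$workdir/list1.txt" > "$workdir/list1.sorted"
sort "$workdir/list2.txt" > "$workdir/list2.sorted"

# -12 keeps only column 3 (lines in both files),
# -23 keeps only column 1 (lines only in list1),
# -13 keeps only column 2 (lines only in list2).
common=$(comm -12 "$workdir/list1.sorted" "$workdir/list2.sorted")
only1=$(comm -23 "$workdir/list1.sorted" "$workdir/list2.sorted")
only2=$(comm -13 "$workdir/list1.sorted" "$workdir/list2.sorted")

printf 'common:\n%s\n' "$common"
printf 'only in list1:\n%s\n' "$only1"
printf 'only in list2:\n%s\n' "$only2"

rm -rf "$workdir"
```

Sorting into copies keeps the original files untouched; sort and comm should run under the same locale so that they agree on the ordering.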

You can also try this Perl code (it was posted by someone in the comp.unix.shell newsgroup):

$ perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/'  file1 file2
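
For comparison, a similar result for unsorted files can be sketched with awk from the shell. This is not the posted Perl code, only an assumed equivalent: it records every line of the first file, then prints each line of the second file that was recorded, once. The file names and contents are made up for illustration.

```shell
# Hypothetical unsorted sample data.
workdir=$(mktemp -d)
printf 'delta\nalpha\ncharlie\n' > "$workdir/first.txt"
printf 'charlie\nbravo\nalpha\nalpha\n' > "$workdir/second.txt"

# While reading the first file (NR == FNR), record each line and move on.
# While reading the second file, print a line the first time it is found
# among the recorded lines.
common=$(awk 'NR == FNR { seen[$0] = 1; next }
              ($0 in seen) && !printed[$0]++' \
         "$workdir/first.txt" "$workdir/second.txt")
printf '%s\n' "$common"

rm -rf "$workdir"
```

The common lines come out in the order they appear in the second file. One caveat of this sketch: if the first file is empty, NR == FNR stays true while the second file is read, so nothing is printed.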

Posted by: Vivek Gite

The author is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector.

23 comments

  1. I used the line of Perl to find the common lines between two files, but I needed it to ignore case. The solution, provided by mu, was:

    perl -ne 'print if ($seen{lc $_} .= @ARGV) =~ /10$/' file1 file2

    Thanks mu!

    Aengus

  2. I am not able to find common lines between 2 files with comm -3. I use diff instead:
    Use regex of contents of the file instead of \d+

    diff -y <(sort file1) out

    And for finding lines that are present in file1 but not in file2:
    diff --suppress-common-lines <(sort file1) ” > output

  3. Why not:

    comm -1 -2 /path/to/file1 /path/to/file2

    This works on the command line on OSX for showing only the lines common to the 2 files (though perhaps it is limited to OSX’s flavor of BSD; I haven’t tested elsewhere).

    1. It works only for sorted files, so you need:
      sort /path/to/file1 > /path/to/file1_sorted
      sort /path/to/file2 > /path/to/file2_sorted
      comm -1 -2 /path/to/file1_sorted /path/to/file2_sorted

  4. This script takes two files as arguments $1 and $2 and prints the lines common to the
    two files.
    ===============================
    #!/bin/sh
    # For every line of the first file, scan the whole second file.
    while read -r line
    do
        echo "Searching for : "
        echo "$line"
        while read -r lines
        do
            # Print the line again when the second file contains it
            if [ "$lines" = "$line" ]
            then
                echo "$line"
            fi
        done < "$2"
    done < "$1"
    ======================================
    Save this into a file, then run it: ./anyname file1 file2
    NOTE: The lines in the input files must have no extra spaces before or after them.

      1. ===============================
        #!/bin/sh
        while read -r line
        do
            echo "Searching for : "
            echo "$line"
            while read -r lines
            do
                if [ "$lines" = "$line" ]
                then
                    echo "$line"
                fi
            done < "$2"
        done < "$1"
        =============================================
        It works.
        anyname.sh contains the script written above.
        You have to change the privileges of the file to 777:
        chmod 777 anyname.sh
        Then run it:
        ./anyname.sh

        file1 and file2 can be any text files.

          1. You must run the script with arguments, like this:

            ./anyname.sh file1 file2

  5. In Unix, let's assume I have two files, file1 and file2.
    In file1
    13
    14
    15
    16
    17
    20
    21
    23
    24
    27
    and in file2
    16
    17
    18
    24
    25
    30
    32

    Question: I would like to print only the strings (words) that file1 and file2 have in common, one per line in a single column, i.e.,
    16
    17
    24
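
    One possible way to get that output, sketched with grep rather than comm so the files do not need to be sorted. The temporary directory is an assumption for illustration; the data is the sample from the question above.

    ```shell
    # Recreate the two sample files from the question in a scratch directory.
    workdir=$(mktemp -d)
    printf '%s\n' 13 14 15 16 17 20 21 23 24 27 > "$workdir/file1"
    printf '%s\n' 16 17 18 24 25 30 32 > "$workdir/file2"

    # -f file2: read the patterns from file2, one per line.
    # -F: treat them as fixed strings. -x: match whole lines only.
    common=$(grep -Fxf "$workdir/file2" "$workdir/file1")
    printf '%s\n' "$common"

    rm -rf "$workdir"
    ```

    This prints 16, 17 and 24, one per line. Because both sample files happen to be in sorted order, comm -1 -2 file1 file2 should give the same lines here.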

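  6. Paste the code below into a file called intersect.sh:

    #!/bin/sh
    # For each line of the first file, print the matching lines of the
    # second file with their line numbers (-n). -F and -x compare the
    # line as a fixed string against whole lines, not as a regex.
    while IFS= read -r line
    do
        grep -n -x -F -- "$line" "$2"
    done < "$1"

    Then make sure you have execute permissions on the file:

    chmod 777 intersect.sh

    Then run this command, substituting the names of your two files: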

    ./intersect.sh nameOfLongerFile nameOfShorterFile

  7. Is there a quick way to get the matching lines plus (in one file) adjacent lines? The following works, but hits near the end of the files are incredibly inefficient:

    while read -r p; do grep -A1 "$p" file2; done < file1

    This will match all lines from file1 in file2 (or at least the first match?) and provide the next line from file2 as well.
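
    A possible single-pass alternative, sketched under the assumption that the lines of file1 should be matched as fixed substrings rather than regular expressions. The sample contents are made up; -A is the same context option used in the loop above, available in GNU and BSD grep.

    ```shell
    # Hypothetical sample data in a scratch directory.
    workdir=$(mktemp -d)
    printf 'beta\ndelta\n' > "$workdir/file1"
    printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > "$workdir/file2"

    # -f file1: use every line of file1 as a pattern, so file2 is read once.
    # -F: treat the patterns as fixed strings.
    # -A1: also print one line of trailing context after each match.
    matches=$(grep -A1 -F -f "$workdir/file1" "$workdir/file2")
    printf '%s\n' "$matches"

    rm -rf "$workdir"
    ```

    Here grep reads file2 only once instead of once per line of file1. When matched groups are not adjacent, GNU grep separates them with a line containing --.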

  8. In Python:

    import sys

    # Read both files into lists of lines (line endings stripped)
    with open(sys.argv[1], "r") as file1:
        lines1 = file1.read().splitlines()
    with open(sys.argv[2], "r") as file2:
        lines2 = file2.read().splitlines()
    print("File %s has %s lines" % (sys.argv[1], len(lines1)))
    print("File %s has %s lines" % (sys.argv[2], len(lines2)))

    found = []
    notfound = []
    for idx, l in enumerate(lines1):
        print("Doing line %s in file %s : %s" % (idx, sys.argv[1], l))
        for idx2, l2 in enumerate(lines2):
            if l == l2:
                print("Found %s ( == %s ) in file %s at index %s" % (l, l2, sys.argv[2], idx2))
                found.append(l)
                break  # stop at the first match in the second file
        else:
            # The inner loop ended without a match
            notfound.append(l)
        print("Found : %s , Not Found : %s" % (len(found), len(notfound)))

    python nameofthescript.py file1 file2

    You can comment out the prints, of course.

  9. I have a file with the following data:

    adsdfsf 10 sdfd 10
    sdsgfjf 5 hdvfd 10
    gdyfdfd 20 jdfhd 564

    I want to compare column one with column two. If anybody knows how, please reply to my email address.
