Sed Find and Display Text Between Two Strings or Words

April 12, 2008 · 36 comments · Last updated April 17, 2008


Q. How do I find the text between the strings FOO and BAR, inclusive, using sed on the command line?

A. Sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream - a file or input from a pipeline.

To output all the text in a file called 'test.txt' between 'FOO' and 'BAR' (whole lines, including the lines that contain FOO and BAR), type the following command at a shell prompt. The -n option suppresses automatic printing of the pattern space:
$ sed -n '/WORD1/,/WORD2/p' /path/to/file
$ sed -n '/FOO/,/BAR/p' test.txt
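As a quick way to see what the range prints, here is a minimal sketch that builds a throwaway sample file. The file contents and the variable names `sample` and `between` are made up for illustration; only the sed command itself comes from the text above.

```shell
# Build a small sample file (hypothetical content, for illustration only).
sample=$(mktemp)
printf '%s\n' 'header' 'FOO start' 'middle line' 'BAR end' 'footer' > "$sample"

# -n turns off the default output; p prints only the lines in the FOO..BAR range.
between=$(sed -n '/FOO/,/BAR/p' "$sample")
printf '%s\n' "$between"

rm -f "$sample"
```

The range works on whole lines, so the lines holding FOO and BAR are printed in full, not just the text after FOO and before BAR.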

To list all virtual host entries in httpd.conf, type:
# sed -n '/<VirtualHost*/,/<\/VirtualHost>/p' /etc/httpd/conf/httpd.conf


{ 36 comments… read them below or add one }

1 kaosmonk April 14, 2008 at 7:00 am

This is just what I’ve needed! I’ve been playing with awk these days, trying to do the same. As these are my first steps in the text manipulation field, I’d appreciate it if you could help me out with the problems I’ve encountered. First, I want to print only the text between two words/strings and not the two words/strings themselves (the command you gave prints WORD1 and WORD2, and I do not want them to be printed at all). I believe it’s possible to do some search and replace after I run the above sed command, but I wonder if it can be done in a single pass. Secondly, let’s say I have two files with different strings, where each string is on its own line in the file. What I want to do is append each string of the second file to the first, so that I have a file formatted like this:
first_file’s_string1, second_file’s_string1
first_file’s_string2, second_file’s_string2

I couldn’t find a way to do this no matter what I’ve tried. Can someone help me with this? Would appreciate a lot!
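For the second question (joining two files line by line), one possible approach is the standard `paste` utility rather than sed. This is only a sketch: the temporary files, their contents, and the variable names `f1`, `f2` and `joined` are assumptions made for the example.

```shell
# Two hypothetical input files, one string per line.
f1=$(mktemp)
f2=$(mktemp)
printf '%s\n' apple banana > "$f1"
printf '%s\n' red yellow > "$f2"

# paste joins line N of the first file with line N of the second;
# -d, uses a comma as the separator instead of the default tab.
joined=$(paste -d, "$f1" "$f2")
printf '%s\n' "$joined"

rm -f "$f1" "$f2"
```

Note that `-d,` puts a bare comma between the two strings; a space after the comma would need an extra step.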


2 Binny V A April 14, 2008 at 7:38 pm

, is a range operator in sed – it’s very useful. I think it can be used to specify 2 line numbers and it will return all the lines between those 2 lines.


3 nixCraft April 14, 2008 at 7:49 pm

Yes, it is a range operator. If you know the line numbers, that works well, but most of the time you need to select data using dynamic conditions.
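To illustrate the line-number form of the range mentioned here, a small sketch (the five-line input and the variable name `picked` are made up for the example):

```shell
# Print lines 2 through 4 of the input; -n suppresses everything else.
picked=$(printf '%s\n' one two three four five | sed -n '2,4p')
printf '%s\n' "$picked"
```

The same `start,end` syntax accepts patterns, as in '/FOO/,/BAR/p', which is what makes it useful when the line numbers are not known in advance.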


4 gameboy April 22, 2008 at 11:05 am

Hi,
thanks, it’s a good tip!!
I noticed a little mistake in your regular expression:
“/<VirtualHost*/”
means <VirtualHos followed by zero or more t characters

This would be better:
“/<VirtualHost.*/”

see ya
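A small sketch of the point about `t*`, using grep to count matching lines. The three sample lines and the variable names `loose` and `strict` are invented for the example; the second pattern (with a trailing space) is just one possible tightening.

```shell
# Three hypothetical lines: a truncated tag, a real VirtualHost tag, an unrelated tag.
lines=$(printf '%s\n' '<VirtualHos' '<VirtualHost *:80>' '<Directory />')

# 't*' means "zero or more t", so the truncated '<VirtualHos' line matches too.
loose=$(printf '%s\n' "$lines" | grep -c '<VirtualHost*')

# Requiring a literal space after the word matches only the real tag.
strict=$(printf '%s\n' "$lines" | grep -c '<VirtualHost ')

printf 'loose=%s strict=%s\n' "$loose" "$strict"
```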


5 Gagan Brahmi May 5, 2008 at 2:12 am

A better option is to put a space after the VirtualHost string, to ensure that we only match VirtualHost entries that have some value after them.

For example:-

# sed -n '/<VirtualHost /,/<\/VirtualHost>/p' /etc/httpd/conf/httpd.conf


6 Robsteranium June 5, 2008 at 3:23 pm

I arrived here struggling with this one but I’ve just figured it out!

/WORD1/,/WORD2/{
/WORD1/d
/WORD2/d
p
}

hth
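For anyone who wants to try the script above, here is a sketch of how it might be run. It assumes the script is meant to be used with `sed -n`; the sample input and the variable name `inner` are made up.

```shell
# Run the script with -n so that only the explicit p prints anything.
inner=$(printf '%s\n' before WORD1 line1 line2 WORD2 after |
  sed -n '/WORD1/,/WORD2/{
/WORD1/d
/WORD2/d
p
}')
printf '%s\n' "$inner"
```

One caveat: any line inside the range that happens to contain WORD1 or WORD2 is dropped as well, not only the first and last lines.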


7 yoander June 11, 2008 at 4:29 pm

Very useful tip about sed


8 reza September 10, 2008 at 11:58 pm

hi
i have a file like this:

ther: Part II (1974) …. Vito Corleone
… aka Mario Puzo’s The Godfather: Part II (USA: complete title)
# Mean Streets (1973) …. Johnny Boy
# Bang the Drum Slowly (1973) …. Bruce Pearson
# The Gang That

I need to delete all between ‘)’ and the ‘#’ (keep the # start of each line as is)
output:

ther: Part II (1974)
# Mean Streets (1973)
# Bang the Drum Slowly (1973)
# The Gang That
….

how do i do it?


9 codegazer May 1, 2009 at 12:48 am

Hi,
I have been reading your blog for quite some time now and it is very informative.
On this particular script, I need some advice. Say I have 2 words, Begin and End that I am looking for in a file. For Ex.

Begin
line1
line2
End

I am trying to modify your script. I want to delete what lies between these 2 patterns viz. Begin/End but I need to keep the words Begin and End. I end up deleting those as well.

Any inputs ?

Thanks
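One possible way to do this, offered as a sketch rather than the only answer: inside the range, branch past the delete command for the two marker lines. The sample input and the variable name `kept` are assumptions for the example.

```shell
# Inside the Begin..End range: 'b' with no label jumps to the end of the
# script, so the marker lines are printed; every other line in the range is deleted.
kept=$(printf '%s\n' keep1 Begin line1 line2 End keep2 |
  sed '/Begin/,/End/{
/Begin/b
/End/b
d
}')
printf '%s\n' "$kept"
```

Lines outside the range are untouched because sed prints them by default (no -n here).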


10 sedbeginner May 14, 2009 at 11:25 pm

I have a similar problem, I have a long line of text and within this line I have my start and end variable and I need the text between start and end.

Example:

t6gd68g d9d8j5%9j30j 0jf087*(&&^*2hd920STARTid8 =e72920 2d9nf9END93nf300j90

needed output between START and END, resulting in id8 =e72920 2d9nf9

Any Idea how to do this?

Cheers.


11 Vasileios Sotiras September 9, 2012 at 9:06 pm

$ echo 't6gd68g d9d8j5%9j30j 0jf087*(&&^*2hd920STARTid8 =e72920 2d9nf9END93nf300j90' | sed -n 's/.*START//;s/END.*//p'
id8 =e72920 2d9nf9


12 pawlx0r October 17, 2012 at 6:27 am

Life saver! I’ve been trying to figure that out forever.

Since it helped me, I guess I can add that if you have to grab text between two directories in a path like:

/directory1/blablablabla/directory2/ and all you want is blablablabla
Your code is perfect, and all you have to do is change the /’s to something else so that you can use /’s in the text, I used % –such as:

sed -n 's%.*directory1/%%;s%/directory2.*%%p'

I hope this helps someone else.


13 Sergani October 20, 2013 at 10:26 am

Works like a charm, thanks!


14 Nico Maas June 17, 2009 at 7:21 am

Hi there,
I got some problem.
I got this html Code:

1 neue Nachricht

And need to extract the Text “1 neue Nachricht” – but that thing can change.
Any way in doing this by sed?

Thanks


15 Nico Maas June 17, 2009 at 7:24 am

I meant this Code:

<a href="../messages/index.php" class="navilink">1 neue Nachricht</a>


16 Robsteranium June 17, 2009 at 2:18 pm

You may find that a web scraper using xpath is more appropriate for extracting data from web pages. Try ScRubyt!


17 dinu August 21, 2009 at 10:10 am

Shell script to print contents of file from given line number to next given number of lines


18 Rajiv January 7, 2010 at 5:21 am

I have a scenario where I need to copy all the lines in a logfile between two specific times, say 5 PM to 6 PM. As the file is huge, we are unable to open it. Please suggest a workaround.

Thanks
Rajiv.


19 AVKLINUX June 24, 2010 at 11:38 pm

what if I only want to match 2 strings for 1st instance . ?

not globally.

THANKS
AVK
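A possible approach for matching only the first instance, sketched under the assumption that the markers are Begin and End (the sample input and the variable name `first` are made up): print the range as usual, but quit as soon as the end marker has been printed.

```shell
# Print the first Begin..End block, then quit (q) right after its End line,
# so later blocks are never read.
first=$(printf '%s\n' Begin yyyy yyyy End ....... Begin xxxx xxxx End |
  sed -n '/Begin/,/End/{
p
/End/q
}')
printf '%s\n' "$first"
```

Quitting early also means sed stops reading the rest of the input, which can help on large files.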


20 Alex July 7, 2010 at 11:27 am

Hello,
What if I have a txt like this:
Begin
yyyy
yyyy
End
…….
Begin
xxxx
xxxx
End

what if I want to display the text between “begin” and “end” but only for the first sequence, or only for the 2nd or 3rd?
Thanks a lot.
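Picking the 2nd or 3rd block is awkward in plain sed, so here is a sketch that calls awk from the shell instead. It assumes the markers sit alone on their lines; the variable names `n`, `count`, `inside` and `second` are invented for the example.

```shell
# Count each Begin line; print lines only while inside the n-th block.
second=$(printf '%s\n' Begin yyyy yyyy End ....... Begin xxxx xxxx End |
  awk -v n=2 '
    /^Begin$/ { count++; inside = 1 }   # a new block starts
    inside && count == n { print }      # print only the wanted block
    /^End$/   { inside = 0 }            # the block is over
  ')
printf '%s\n' "$second"
```

Changing `n=2` to `n=1` or `n=3` selects a different block; the Begin and End lines of the chosen block are included in the output.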


21 gova August 29, 2012 at 11:57 am

sed -n '1~3p' filename…. this prints the first line and then every 3rd line from there on (the first~step address is a GNU sed extension)….. I know this may not completely fulfill your need, but let me know if you have found what you need…


22 Radheshyam May 9, 2011 at 2:15 am

Hi,

I found a similar command. Here the strings at both ends (FOO and BAR) will be ignored.

sed -e "s/.*FOO//;s/BAR.*//" test.txt
For example:
echo "Test to find string between words FOO-/*+_and_+*/- BAR" | sed -e "s/.*FOO//;s/BAR.*//"


23 John May 18, 2011 at 8:41 pm

For some reason none of listed example does not working properly when trying to search for ,no matter which command is applied it always show rest of content in file.Could be because second word in search is and that causing problem for some reason.But i managed to make it work by this command:
sed -n ‘s//&/p’ index.html
It shows entire line where is located,but that is not problem,since the key was to find out content between and tag.


24 Swaroop July 11, 2011 at 6:53 am

I have found a problem. I need to extract the text between two strings and replace the characters with some symbol.

Ex : initially.. ABC ghr fhufhuw XYZ
Output :ABC !!!!!!!!!!!!!!! XYZ.
Can you please , help me out.I need to fix this.


25 saravana September 29, 2011 at 4:37 am

Hi,

I need to modify /etc/filesystems file in IBM AIX ,where i have to search for a existing FS stanza (eg /opt) then after that stanza i had to put my new FS stanza.

/dev/optlv:
dev = /dev/optlv
Vfs = jfs2
mount = yes

/new/lv #newly inserted line
dev = /new/lv #newly inserted line
Vfs = jfs2 #newly inserted line

/tmp
dev = /dev/tmplv
……………………….


26 bhavya December 5, 2011 at 8:07 pm

Hi ,
There are some problems I am facing with linux right now.

This is what I am doing to generate a cryptographic signature from a bash script….

step1. I encrypt a text : encrypt(bhavyakailkhurabhavyakailkhura) and get a encrypted string sdfghjklsdfghjklzxcvbnmghjkdfghjkzxcvbnmzxc.
step2. I copy that string manually from mouse no commands used and append it with a file..

$ cat textfile.txt…
user=bhavya
signature={sdfghjklsdfghjklzxcvbnmghjkdfghjkzxcvbnmzxc}

step3. Using “sed” on text file I send text between “{” and “}” to another file bhavya.txt
step4. Now when I try to decrypt bhavya.txt it says there is an error reading the file. I checked that the text is the same as the encrypted one. When I copy the original encrypted text from step 1 and decrypt it, it works.

I think the problem is that the sed command changes the spacing in the text, so it is not the same as copying with the mouse. Do you have any idea how we can solve this problem?

Sorry for clumsy and lengthy explanation.

Thanks


27 j0hny December 10, 2011 at 9:06 pm

when using sed try to add some tr commands… eg. sed | tr -d’ ‘ | tr -d’\n’ | tr -d’\t’

this will delete all spaces, newlines etc that sed might add to your string

to check the spacing you can try: cat file.txt | od -c (find the delimiter), use sed as usual (when it fails), then cat new_by_sed_created_file.txt | od -c (check the changes)


28 j0hny December 10, 2011 at 9:10 pm

sorry, should be: sed | tr -d ' ' | tr -d '\n' | tr -d '\t'
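As a sketch of the extraction step being discussed, here is one way to pull out only the text between the braces and inspect it byte by byte. The input mirrors the textfile.txt shown in the question; the variable name `sig` is an assumption, and whether a stray newline was really the cause of the decryption error is only a guess.

```shell
# Keep only what sits between { and } on the signature line.
sig=$(printf '%s\n' 'user=bhavya' 'signature={sdfghjklsdfghjklzxcvbnmghjkdfghjkzxcvbnmzxc}' |
  sed -n 's/^signature={\(.*\)}$/\1/p')

# printf '%s' writes the value without a trailing newline;
# od -c shows every byte, so stray spaces or newlines become visible.
printf '%s' "$sig" | od -c
```

sed itself always ends each output line with a newline, so redirecting its output straight to a file leaves a final newline that a mouse copy would not include; capturing into a variable and writing with `printf '%s'` avoids that.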


29 Moyente January 12, 2012 at 9:48 pm

in the same concept, how can i extract all the substring starting with <ds:include and ending with .xml from a long textual string?
example input (remember, the input content is in one line):
10120

expected output:
ds:include ds:uri=”$HOME/lookup/lookup_table.xml
ds:include ds:uri=”$HOME/trans/transform.xml

Thanks

Moyente


30 Genuineapps April 10, 2012 at 9:50 am

thank you


31 Akin Okon May 19, 2012 at 5:20 am

Thanks I found this very useful!!!


32 Juliana May 30, 2012 at 7:32 pm

What if I search for “Element2.blk” and the lines after the “Element2.blk” line should be deleted? Noting but excluding the Element.blk line?


33 rohan August 2, 2012 at 9:31 pm

I need to find lines between two words if third word match.
eg

JkMountFile /etc/httpd/conf/test1-uriworkermap.properties
ServerName test1.com

JkMountFile /etc/httpd/conf/test2-uriworkermap.properties
ServerName test2.com

want to search test1.com and print lines between –


34 Sridev October 1, 2012 at 3:50 pm

Hi
Pls let me know how can I fetch the string between the SQL>.
SQL> SQL> SQL> 0 1343761635 TV Alerts-STB TV Alerts 58 6 3 30| 7 1343761635 TV Alerts-STB TV Alerts 58 6 3 30| 7 1343761635 TV Alerts-STB TV Alerts 58 6 3 30| SQL>

The output I should see is 0 1343761635 TV Alerts-STB TV Alerts 58 6 3 30| 7 1343761635 TV Alerts-STB TV Alerts 58 6 3 30| 7 1343761635 TV Alerts-STB TV Alerts 58 6 3 30|


35 Mohammed-Egypt November 16, 2012 at 3:41 pm

Hi everybody here…
I really enjoyed this article and your comments…
Please, I want help from every programmer here…
My hard problem is this: I have Notepad++ v6.2 with nearly all plugins…
I used the “Copyall links” extension with the Mozilla browser and copied many website links
for a special operation. So I now have 32 text files, and every file holds a lot of data: 12.000 lines with hosts and paths like this ((www.*.com/anywords/anywords/….))

So my question is: ((how can I keep only the hosts, or only the domain?))
In other words: ((how can I copy only the words between {www} and {.com}?))
Please, I want your powerful experience for doing this, either by copying these websites to another file or by inverting the selection to delete everything non-selected. In the end I want the function or string that does this. I have searched all over the net but I didn’t find any help, so please answer me. Whoever answers, please send me a message so I know you answered, at this e-mail:
http://WWW.COM2100@YAHOO.COM
THANKS FOR YOU ALL


36 SSengupta January 28, 2014 at 12:58 pm

Thanks guys. It’s a great blog to get your doubts cleared.

But what if we need the portion from a text based on some keyword.
For eg. My file is like below,

————————————————
Order=[
1
2
3
4
5
Order=[
6
7
8
9
10
Order=[
11
12
13
14
15
Order=[
————————————————
Now i want the middle portion where i found EO427849242. I tried with sed but it does not give me the desired result.

I used the command,
sed -n '/Sandy Order/,/Sandy Order/p' Filename

and it gives me the all the portion in the file from Start=Sandy Order and end=Sandy Order

When I use the below command,
sed -n '/Sandy Order/,/Sandy Order/p' Filename | grep EO427849242

it only gave me the line
Order=[

But what i need is
—–
Order=[
11
12
13
14
15
—–

Can anyone please help me on this. It will be really helpful.

