pdftotext: Linux / UNIX Convert a PDF File To Text Format

by Vivek Gite · 7 comments

Question: I've downloaded a configuration file in PDF format. I do not have a GUI installed on my remote Linux / UNIX server. How do I convert a PDF (Portable Document Format) file to text format from the command line so that I can view the file over a remote ssh session?

Answer: Use the pdftotext utility to convert Portable Document Format (PDF) files to plain text. It reads the PDF file and writes a text file. If the text file is not specified, pdftotext converts file.pdf to file.txt. If the text file is given as -, the text is sent to stdout.
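
The two output rules above can be sketched in shell. This is only an illustration: default_txt_name is a hypothetical helper that approximates the default naming rule by stripping a trailing .pdf, and view_pdf assumes less is available as a pager on the server.

```shell
#!/bin/sh
# Approximate the default output name: file.pdf -> file.txt.
# (Hypothetical helper for illustration; pdftotext picks the name itself
# when no text file is given.)
default_txt_name() {
    printf '%s.txt\n' "${1%.pdf}"
}

# Send the extracted text to stdout ("-") and page through it,
# which suits a remote ssh session with no GUI.
view_pdf() {
    pdftotext "$1" - | less
}

default_txt_name hp-manual.pdf   # prints: hp-manual.txt
```

With the functions loaded, view_pdf hp-manual.pdf would page through the text without writing a file to disk.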

Install pdftotext under RedHat / RHEL / Fedora / CentOS Linux

pdftotext is provided by the poppler-utils package on most Linux distributions. Under RedHat / RHEL / Fedora / CentOS, run the following as root:
# yum install poppler-utils
OR use the following under Debian / Ubuntu Linux:
$ sudo apt-get install poppler-utils
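
Before or after installing, you may want to confirm that the binary is actually on your PATH. A minimal sketch follows; have_cmd is a hypothetical helper name, not part of poppler-utils.

```shell
#!/bin/sh
# Return success if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd pdftotext; then
    echo "pdftotext is installed"
else
    echo "pdftotext not found; install the poppler-utils package" >&2
fi
```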

pdftotext syntax

pdftotext [options] {PDF-file} [text-file]

How do I convert a pdf to text?

To convert a PDF file called hp-manual.pdf to hp-manual.txt, enter:
$ pdftotext hp-manual.pdf hp-manual.txt
To convert only pages 5 through 10, set the first page with -f and the last page with -l:
$ pdftotext -f 5 -l 10 hp-manual.pdf hp-manual.txt
To convert a PDF file that is protected and encrypted with an owner password, enter:
$ pdftotext -opw 'password' hp-manual.pdf hp-manual.txt
To convert a PDF file that is protected and encrypted with a user password, enter:
$ pdftotext -upw 'password' hp-manual.pdf hp-manual.txt
The -eol option sets the end-of-line convention used for the text output. You can set it to unix, dos or mac. For UNIX / Linux systems, enter:
$ pdftotext -eol unix hp-manual.pdf hp-manual.txt
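
If you have several PDF files to convert, the same command can be wrapped in a loop. The sketch below is one possible approach, not the only one: convert_all is a hypothetical helper name, and it assumes the files end in lowercase .pdf.

```shell
#!/bin/sh
# Convert every *.pdf in a directory to a .txt file beside it,
# using UNIX line endings. Prints the name of each file written.
convert_all() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when no PDF matches.
        [ -e "$pdf" ] || continue
        txt="${pdf%.pdf}.txt"
        pdftotext -eol unix "$pdf" "$txt" && printf '%s\n' "$txt"
    done
}
```

For example, convert_all /path/to/manuals would write one .txt file next to each PDF in that directory; with no argument it works on the current directory.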

Further readings:

  • man page pdftotext

{ 7 comments }

1 virus 11.14.08 at 2:02 pm

This is the simplest way; being simple means not perfect. Try:

$ less file.pdf

2 vivek 11.14.08 at 2:40 pm

@virus,

You need the help of lesspipe, and then it may work out ;)
eval "$(lesspipe)"
less file.pdf

3 BKB 12.09.08 at 1:05 am

I’m very glad to have found this tip since I’ve been looking for a way to index the contents of PDF files. “pdftotext” works on text in foreign languages and character sets, too, and outputs the text as UTF-8, which is excellent.

4 Ritika Garg 04.09.09 at 8:43 am

How to convert filename.f to filename.pdf in linux?

5 Patricia 08.03.09 at 8:15 am

how to convert filename.txt to filename.pdf in ubuntu?
thank you

6 virus 08.05.09 at 7:26 am

Patricia
print the file as pdf format

7 Patricia 08.08.09 at 6:56 am

I mean, is there any command to convert a text file to a PDF file?
thank you
