
pdftotext: Linux / UNIX Convert a PDF File To Text Format

Question: I’ve downloaded a configuration file in PDF format. There is no GUI installed on my remote Linux / UNIX server. How do I convert a PDF (Portable Document Format) file to text from the command line so that I can view it over a remote ssh session?

Answer: Use the pdftotext utility to convert Portable Document Format (PDF) files to plain text. It reads the PDF file and writes a text file. If the text file is not specified, pdftotext converts file.pdf to file.txt. If the text file is given as -, the text is sent to stdout.
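
For the remote ssh case in the question, the - form can be combined with a pager so that no intermediate text file is written. The following is a minimal sketch, not part of poppler-utils: the helper name pdfview is hypothetical, and it assumes the -layout option (not covered in this tutorial), which asks pdftotext to keep the original physical layout of the page.

```shell
# pdfview: hypothetical helper, not shipped with poppler-utils.
# Sends the extracted text to stdout ("-") and pipes it into a pager.
pdfview() {
    if [ "$#" -ne 1 ]; then
        echo "usage: pdfview file.pdf" >&2
        return 2
    fi
    # -layout (an assumption here) keeps columns and indentation
    # roughly as they appear in the PDF.
    # PAGER is left unquoted on purpose so values such as "less -R" work.
    pdftotext -layout "$1" - | ${PAGER:-less}
}
```

Once the function is defined in your shell, running pdfview hp-manual.pdf should open the text in less, or in whatever $PAGER is set to.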

Install pdftotext under RedHat / RHEL / Fedora / CentOS and Debian / Ubuntu Linux

pdftotext is part of the poppler-utils package on various Linux distributions. Under RedHat / RHEL / Fedora / CentOS Linux, enter:
# yum install poppler-utils
Or, under Debian / Ubuntu Linux, enter:
$ sudo apt-get install poppler-utils
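
After installing, it can be worth confirming that the binary is actually on your PATH before scripting around it. A small sketch, assuming a hypothetical helper name check_pdftotext:

```shell
# check_pdftotext: hypothetical helper.
# Reports whether pdftotext can be found; returns non-zero if it cannot.
check_pdftotext() {
    if command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext found: $(command -v pdftotext)"
    else
        echo "pdftotext not found; install the poppler-utils package" >&2
        return 1
    fi
}
```

For example, check_pdftotext && pdftotext -v prints the location and then the version information.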

pdftotext syntax

pdftotext [options] {PDF-file} [text-file]

How do I convert a PDF to text?

To convert a PDF file called hp-manual.pdf to hp-manual.txt, enter:
$ pdftotext hp-manual.pdf hp-manual.txt
To convert only pages 5 through 10, set the first page with -f and the last page with -l:
$ pdftotext -f 5 -l 10 hp-manual.pdf hp-manual.txt
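
The page range options also work when the text is sent to stdout, which makes it easy to search part of a document without creating a file. A rough sketch, assuming a hypothetical helper name search_pdf_pages:

```shell
# search_pdf_pages: hypothetical helper.
# Usage: search_pdf_pages FIRST LAST FILE PATTERN
search_pdf_pages() {
    if [ "$#" -ne 4 ]; then
        echo "usage: search_pdf_pages first last file.pdf pattern" >&2
        return 2
    fi
    # "-" sends pages FIRST..LAST to stdout. grep -n prefixes each match
    # with its line number in the extracted text, not the PDF page number.
    pdftotext -f "$1" -l "$2" "$3" - | grep -n -- "$4"
}
```

For example, search_pdf_pages 5 10 hp-manual.pdf 'Listen' would list matching lines from pages 5 to 10 only; the pattern shown is just an illustration.
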
To convert a PDF file protected and encrypted with an owner password, enter:
$ pdftotext -opw 'password' hp-manual.pdf hp-manual.txt
To convert a PDF file protected and encrypted with a user password, enter:
$ pdftotext -upw 'password' hp-manual.pdf hp-manual.txt
The -eol option sets the end-of-line convention for the text output. You can set it to unix, dos or mac. For UNIX / Linux systems, enter:
$ pdftotext -eol unix hp-manual.pdf hp-manual.txt
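
If you have a whole directory of PDF files, the single-file command can be wrapped in a loop. The following is a sketch under stated assumptions: the helper name convert_pdfs_in_dir is hypothetical, and the output name is passed explicitly so each .txt file lands next to its .pdf.

```shell
# convert_pdfs_in_dir: hypothetical helper.
# Converts every *.pdf in the given directory (default: current directory).
convert_pdfs_in_dir() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDF files
        [ -e "$pdf" ] || continue
        # Strip the .pdf suffix and write NAME.txt beside NAME.pdf
        if pdftotext "$pdf" "${pdf%.pdf}.txt"; then
            echo "converted: $pdf"
        else
            echo "failed: $pdf" >&2
        fi
    done
}
```

For example, convert_pdfs_in_dir ~/manuals would process each PDF in that directory and report any file that pdftotext could not convert.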

Further readings:

  • man page pdftotext

{ 9 comments }
  • virus November 14, 2008, 2:02 pm

    this is the simplest way, being simple means not perfect. try:

    $ less file.pdf

  • nixCraft November 14, 2008, 2:40 pm

    @virus,

    You need the help of lesspipe, and then it may work out ;)
    eval "$(lesspipe)"
    less file.pdf

  • BKB December 9, 2008, 1:05 am

    I’m very glad to have found this tip since I’ve been looking for a way to index the contents of PDF files. “pdftotext” works on text in foreign languages and character sets, too, and outputs the text as UTF-8, which is excellent.

  • Ritika Garg April 9, 2009, 8:43 am

    How to convert filename.f to filename.pdf in linux?

  • Patricia August 3, 2009, 8:15 am

    how to convert filename.txt to filename.pdf in ubuntu?
    thank you

  • virus August 5, 2009, 7:26 am

    Patricia
    print the file as pdf format

  • Patricia August 8, 2009, 6:56 am

    i mean, is there any command to convert text file to pdf file?
    thank you

  • panchicore October 12, 2010, 4:00 pm

    with mandriva you can install it with:

    [panchicore@localhost ~]$ su
    [root@localhost ~]$ urpmi poppler

    now you have available: pdftex pdftoabw pdftohtml pdftoppm pdftops pdftosrc pdftotext

  • hamid December 18, 2014, 6:32 am

    perfect!
    this is amazing thanks a lot
