Question: I’ve downloaded a configuration file in PDF format. I do not have a GUI installed on my remote Linux / UNIX server. How do I convert a PDF (Portable Document Format) file to text from the command line so that I can view it over a remote ssh session?
Answer: Use the pdftotext utility to convert Portable Document Format (PDF) files to plain text. It reads the PDF file and writes a text file. If the text file is not specified, pdftotext converts file.pdf to file.txt. If the text file is given as -, the text is sent to stdout.
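Since the goal is to read the file over an ssh session, the stdout form can be piped straight into a pager, which avoids creating an intermediate text file. The following is a minimal sketch rather than a definitive recipe: the function name view_pdf is made up for illustration, it assumes pdftotext is already installed, and it falls back to less when $PAGER is not set. The -layout option asks pdftotext to keep the original physical layout as far as it can.

```shell
# view_pdf is a hypothetical helper name, not part of poppler-utils.
# Usage: view_pdf file.pdf
view_pdf() {
    # "-" as the text-file argument sends the converted text to stdout;
    # -layout asks pdftotext to preserve the original physical layout.
    # $PAGER is left unquoted on purpose so values like "less -R" work.
    pdftotext -layout "$1" - | ${PAGER:-less}
}
```

For example, `view_pdf hp-manual.pdf` would open the converted text in the pager.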
Install pdftotext under RedHat / RHEL / Fedora / CentOS Linux
pdftotext is part of the poppler-utils package on most Linux distributions:
# yum install poppler-utils
Or use the following under Debian / Ubuntu Linux:
$ sudo apt-get install poppler-utils
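After installing, it can be worth confirming that the command is actually on your PATH before using it in scripts. A small sketch, assuming a POSIX shell; have_cmd is a hypothetical helper name, and it checks for pdftotext when called without an argument:

```shell
# have_cmd is a hypothetical helper; with no argument it checks for pdftotext.
# It returns 0 when the command is found and non-zero otherwise.
have_cmd() {
    command -v "${1:-pdftotext}" >/dev/null 2>&1
}

# Usage (prints a hint instead of failing later with "command not found"):
#   have_cmd || echo "pdftotext not found; install poppler-utils" >&2
```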
pdftotext syntax
pdftotext [options] {PDF-file} [text-file]
How do I convert a PDF to text?
To convert a PDF file called hp-manual.pdf to hp-manual.txt, enter:
$ pdftotext hp-manual.pdf hp-manual.txt
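If you have a whole directory of PDF files, the same command can be run in a loop, relying on the default naming (file.pdf becomes file.txt) when no text file is given. This is only a sketch under those assumptions; convert_all_pdfs is a hypothetical helper name, and it only looks at files ending in .pdf in the given directory (no recursion):

```shell
# convert_all_pdfs is a hypothetical helper name.
# Usage: convert_all_pdfs /path/to/dir   (defaults to the current directory)
convert_all_pdfs() {
    for pdf in "${1:-.}"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no .pdf files.
        [ -e "$pdf" ] || continue
        # With no text-file argument, file.pdf is written to file.txt.
        pdftotext "$pdf" || echo "failed: $pdf" >&2
    done
}
```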
To convert only pages 5 through 10, set the first page with -f and the last page with -l:
$ pdftotext -f 5 -l 10 hp-manual.pdf hp-manual.txt
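The page range options combine well with stdout output when you only want to search part of a document. A rough sketch, assuming grep is available; grep_pdf_pages is a hypothetical helper name, and the line numbers printed by grep -n refer to the extracted text, not to PDF page numbers:

```shell
# grep_pdf_pages is a hypothetical helper name.
# Usage: grep_pdf_pages FIRST LAST PATTERN file.pdf
grep_pdf_pages() {
    # -f/-l limit the conversion to the given page range;
    # "-" writes the text to stdout so grep can read it.
    pdftotext -f "$1" -l "$2" "$4" - | grep -n -- "$3"
}
```

For example, `grep_pdf_pages 5 10 'timeout' hp-manual.pdf` would search pages 5 to 10 for the word timeout.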
To convert a PDF file that is encrypted with an owner password, enter:
$ pdftotext -opw 'password' hp-manual.pdf hp-manual.txt
To convert a PDF file that is encrypted with a user password, enter:
$ pdftotext -upw 'password' hp-manual.pdf hp-manual.txt
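Typing the password on the command line leaves it in your shell history. One possible workaround, sketched here under the assumption that the password is stored on the first line of a file only you can read, is to read it from that file. convert_with_upw is a hypothetical helper name; note that the password is still passed as an argument, so it can be visible in the process list while pdftotext runs:

```shell
# convert_with_upw is a hypothetical helper name.
# Usage: convert_with_upw password-file input.pdf output.txt
convert_with_upw() {
    pw=
    # Read the first line of the password file; accept a last line
    # without a trailing newline, and fail if nothing could be read.
    IFS= read -r pw < "$1" || [ -n "$pw" ] || return 1
    pdftotext -upw "$pw" "$2" "$3"
}
```

The same pattern should work for an owner password by using -opw instead of -upw.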
The -eol option sets the end-of-line convention used for the text output. You can set it to unix, dos or mac. For UNIX / Linux systems, enter:
$ pdftotext -eol unix hp-manual.pdf hp-manual.txt
Further reading:
- pdftotext man page (man pdftotext)
Comments
This is the simplest way, though being simple means it is not perfect. Try:
$ less file.pdf
@virus,
You need the help of lesspipe, and then it may work out ;)
eval "$(lesspipe)"
less file.pdf
I’m very glad to have found this tip since I’ve been looking for a way to index the contents of PDF files. “pdftotext” works on text in foreign languages and character sets, too, and outputs the text as UTF-8, which is excellent.
How do I convert filename.f to filename.pdf in Linux?
How do I convert filename.txt to filename.pdf in Ubuntu?
thank you
Patricia
Print the file in PDF format.
I mean, is there any command to convert a text file to a PDF file?
thank you
With Mandriva you can install it with:
[panchicore@localhost ~]$ su
[root@localhost ~]$ urpmi poppler
Now you have these available: pdftex pdftoabw pdftohtml pdftoppm pdftops pdftosrc pdftotext
Perfect!
This is amazing, thanks a lot.