pdfimages: Extract and Save Images From a Portable Document Format (PDF) File
Q. How do I extract images from a PDF file under Linux / UNIX shell account?
A. pdfimages is a Portable Document Format (PDF) image extractor for Linux / UNIX operating systems. It saves images from a PDF file as Portable Pixmap (PPM), Portable Bitmap (PBM), or JPEG files. pdfimages reads the given PDF file, scans one or more pages, and writes one PPM, PBM, or JPEG file for each image. Each file is named image-root-nnn.xxx, where image-root is the output prefix you supply, nnn is the image number, and xxx is the image type (.ppm, .pbm, .jpg).
pdfimages is provided by the poppler-utils package on many Linux distributions. To install it, run one of the following as root:
# yum install poppler-utils
OR
# apt-get install poppler-utils
pdfimages syntax
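pdfimages [options] /path/to/file.pdf /path/to/output/image-root
The second argument is a file name prefix (the image root), not a directory. To extract every image from the PDF file called bar.pdf and save the images as /tmp/image-000.ppm, /tmp/image-001.ppm, and so on, enter:
$ pdfimages bar.pdf /tmp/image
Sample output: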
$ ls /tmp/image*
image-000.ppm image-1025.ppm image-1140.ppm image-1256.ppm image-247.ppm image-374.ppm image-501.ppm image-628.ppm image-755.ppm image-882.ppm
image-001.ppm image-1026.ppm image-1141.ppm image-1257.ppm image-248.ppm image-375.ppm image-502.ppm image-629.ppm image-756.ppm image-883.ppm
image-002.ppm image-1027.ppm image-1142.ppm image-1258.ppm image-249.ppm image-376.ppm image-503.ppm image-630.ppm image-757.ppm image-884.ppm
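The listing mixes three-digit and four-digit numbers because the image number is zero-padded to at least three digits and simply grows wider after 999. The sketch below reproduces that naming pattern with printf. The function name image_name is hypothetical, and the exact format string is an assumption based on the names shown above:

```shell
# image_name ROOT NUMBER EXT
# Hypothetical helper: prints the file name pdfimages would be expected
# to use for a given image root, image number, and extension
# (assumed pattern: root-nnn.ext, padded to at least three digits).
image_name() {
    printf '%s-%03d.%s\n' "$1" "$2" "$3"
}

image_name /tmp/image 0 ppm      # prints /tmp/image-000.ppm
image_name /tmp/image 1025 ppm   # prints /tmp/image-1025.ppm
```

Because of the mixed widths, a plain ls sorts image-1025.ppm before image-247.ppm. If your ls supports it, ls -v sorts the files in numeric order.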
Normally, all images are written as PBM (for monochrome images) or PPM (for non-monochrome images) files. With the -j option, images in DCT format are saved as JPEG files. All non-DCT images are saved in PBM/PPM format as usual:
$ pdfimages -j bar.pdf /tmp/image
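After a run with -j, the output usually contains a mix of .jpg, .ppm, and .pbm files. A minimal sketch for counting how many of each type were written is shown below. The function name count_images is hypothetical, and it assumes the files follow the image-root-nnn.xxx naming described earlier:

```shell
# count_images ROOT
# Hypothetical helper: counts extracted files per type for an image root
# such as /tmp/image (matches /tmp/image-*.jpg, *.ppm, *.pbm).
count_images() {
    root=$1
    for ext in jpg ppm pbm; do
        n=0
        for f in "$root"-*."$ext"; do
            # When nothing matches, the glob stays literal, so test for existence.
            if [ -e "$f" ]; then
                n=$((n + 1))
            fi
        done
        printf '%s: %d\n' "$ext" "$n"
    done
}
```

For example, after the command above:
$ count_images /tmp/image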
The -f option specifies the first page to scan. To start scanning at page 5 and continue to the end of the document, enter:
$ pdfimages -j -f 5 bar.pdf /tmp/image
The -l option specifies the last page to scan. To scan only the first 5 pages (pages 1 through 5), enter:
$ pdfimages -j -l 5 bar.pdf /tmp/image
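The -f and -l options can be combined to scan a range of pages. The following is a minimal sketch of a wrapper that checks the range before calling pdfimages. The function name extract_range is hypothetical and not part of poppler-utils:

```shell
# extract_range FIRST LAST PDF ROOT
# Hypothetical wrapper: extracts images from pages FIRST through LAST
# of PDF and writes them using ROOT as the image root.
extract_range() {
    first=$1
    last=$2
    pdf=$3
    root=$4

    # Reject ranges that pdfimages could not scan sensibly.
    if [ "$first" -lt 1 ] || [ "$first" -gt "$last" ]; then
        echo "extract_range: invalid page range $first-$last" >&2
        return 1
    fi

    pdfimages -j -f "$first" -l "$last" "$pdf" "$root"
}
```

For example, to extract images from pages 2 through 5 of bar.pdf, enter:
$ extract_range 2 5 bar.pdf /tmp/image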