How to extract an image from a PDF

This program extracts a picture from a PDF using Java. It depends on an external JAR file (Apache PDFBox), which must be on the classpath so that the imports in the program resolve. The implementation is given below.

Algorithm:

Extract images using the Apache PDFBox library.
Load the existing PDF document using file I/O.
Create an object of the PDFRenderer class.
Render a page of the PDF document as a BufferedImage.
Write the rendered image to a new file.
Close the document.

Note: The Apache PDFBox JAR file must be downloaded and added to the classpath before running the program. For more details on the library, refer to the Apache PDFBox documentation.

Implementation:

// Extracting images from a PDF using Java
import java.io.*;
import java.awt.image.BufferedImage;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

class GFG {
	public static void main(String[] args) throws Exception
	{
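		// Load the existing PDF document using file I/O.
		// PDDocument.load is the PDFBox 2.x API; PDFBox 3.x
		// uses org.apache.pdfbox.Loader.loadPDF(File) instead.
		File newFile
			= new File("C:/Documents/GeeksforGeeks.pdf");
		PDDocument pdfDocument = PDDocument.load(newFile);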

		// Instantiate the PDFRenderer class,
		// i.e. create its object
		PDFRenderer pdfRenderer
			= new PDFRenderer(pdfDocument);

		// Render the first page (index 0)
		// of the PDF document
		// as a BufferedImage
		BufferedImage img = pdfRenderer.renderImage(0);
		// Write the rendered image to a new file.
		// The format name must match the file
		// extension, so "PNG" goes with ".png"
		ImageIO.write(
			img, "PNG",
			new File("C:/Documents/GeeksforGeeks.png"));
		System.out.println(
			"Image has been extracted successfully");

		// Closing the PDF document
		pdfDocument.close();
	}
}
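
The write step passes both a format name and a file name to ImageIO.write, and nothing forces the two to agree. ImageIO.write also returns false, rather than throwing, when no writer exists for the requested format. The following is a minimal sketch of one way to guard against both problems, using only the standard library. The class ImageWriteSketch and the helpers formatFor and writeImage are hypothetical names introduced for illustration; they are not part of PDFBox or of the program above. A blank BufferedImage stands in for the rendered page.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Locale;
import javax.imageio.ImageIO;

class ImageWriteSketch {

    // Hypothetical helper: derive the ImageIO format name
    // from the target file's extension, e.g. "out.PNG" -> "png".
    static String formatFor(File target) {
        String name = target.getName();
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            throw new IllegalArgumentException("No file extension: " + name);
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Hypothetical helper: write the image in the format implied by the
    // file name, and fail loudly if ImageIO has no writer for it.
    static void writeImage(BufferedImage img, File target) throws IOException {
        String format = formatFor(target);
        if (!ImageIO.write(img, format, target)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the image returned by the renderer.
        BufferedImage img =
            new BufferedImage(100, 50, BufferedImage.TYPE_INT_RGB);
        File out = File.createTempFile("page-", ".png");
        writeImage(img, out);
        System.out.println("Wrote " + out.getName() + " as " + formatFor(out));
        out.delete();
    }
}
```

In the program above, a call such as writeImage(img, new File("C:/Documents/GeeksforGeeks.png")) could replace the direct ImageIO.write call, assuming the helper class is available alongside it.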
