How to extract an image from a PDF

 
Algorithm:
 
  1. Use the Apache PDFBox library to extract the image.
  2. Load the existing PDF document using file I/O.
  3. Create an object of the PDFRenderer class.
  4. Render a page of the PDF document as a BufferedImage.
  5. Write the rendered image to a new file.
  6. Close the document.

Note: Apache PDFBox is an external library, so its JAR files must be downloaded and added to the classpath before running the program. Refer to the Apache PDFBox documentation for more details on the module.

Implementation:

 

 
 
// Rendering a PDF page as an image using Java and Apache PDFBox
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

class GFG {
	public static void main(String[] args) throws Exception
	{
		// Existing PDF document to be loaded
		File pdfFile
			= new File("C:/Documents/GeeksforGeeks.pdf");

		// try-with-resources closes the document even if
		// rendering or writing fails.
		// PDDocument.load is the PDFBox 2.x API; PDFBox 3.x
		// uses Loader.loadPDF(file) instead.
		try (PDDocument pdfDocument
			 = PDDocument.load(pdfFile)) {

			// Creating the PDFRenderer object
			PDFRenderer pdfRenderer
				= new PDFRenderer(pdfDocument);

			// Rendering the first page (index 0)
			// as a BufferedImage
			BufferedImage img = pdfRenderer.renderImage(0);

			// Writing the image to a new file; the format
			// name must match the file extension
			ImageIO.write(
				img, "PNG",
				new File("C:/Documents/GeeksforGeeks.png"));
			System.out.println(
				"Image has been extracted successfully");
		}
	}
}
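
Step 5 of the algorithm relies on the format name passed to ImageIO.write agreeing with the extension of the output file. The following is a minimal sketch, using only the Java standard library, of one way to keep the two in sync and to notice when no writer handled the image. The class ImageWriteSketch and the helpers formatFor and save are hypothetical names introduced for illustration, and a blank BufferedImage stands in for the page rendered by PDFBox.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Locale;
import javax.imageio.ImageIO;

class ImageWriteSketch {

	// Hypothetical helper: derives the ImageIO format name
	// from the extension of the target file name
	static String formatFor(String fileName)
	{
		int dot = fileName.lastIndexOf('.');
		if (dot < 0 || dot == fileName.length() - 1) {
			throw new IllegalArgumentException(
				"No file extension in: " + fileName);
		}
		String ext = fileName.substring(dot + 1)
						 .toLowerCase(Locale.ROOT);
		// "jpeg" and "jpg" refer to the same writer
		return ext.equals("jpeg") ? "jpg" : ext;
	}

	// Hypothetical helper: writes the image and fails loudly
	// when ImageIO finds no writer for the format
	static void save(BufferedImage img, File target)
		throws IOException
	{
		String format = formatFor(target.getName());
		if (!ImageIO.write(img, format, target)) {
			throw new IOException(
				"No ImageIO writer for format: " + format);
		}
	}

	public static void main(String[] args) throws IOException
	{
		// Blank image standing in for a rendered PDF page
		BufferedImage img = new BufferedImage(
			10, 10, BufferedImage.TYPE_INT_RGB);
		File out = File.createTempFile("page", ".png");
		save(img, out);
		System.out.println("Wrote " + out.getName());
	}
}
```

In the program above, the call to ImageIO.write could be replaced by save(img, new File("C:/Documents/GeeksforGeeks.png")), so that changing the extension is enough to change the output format.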

 
