In this article, we extract the contents of a Java class file using the Apache Tika library. Apache Tika is used for document type detection and content extraction from various file formats. It relies on a range of document parsers and type detection techniques to detect and extract data, and it provides one generic API for parsing different file formats: all of the parser libraries are encapsulated behind a single interface called the Parser interface.
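To make the idea of "one interface in front of many parsers" concrete, here is a minimal stand-alone sketch that uses only the Java standard library. It is not Tika's implementation: the names `ParserDispatchSketch`, `MiniParser`, `ClassFileParser`, `PlainTextParser`, and `AutoDetect` are hypothetical, and detection is reduced to a single check of the `CA FE BA BE` magic bytes that open every compiled class file. Real detection in Tika is considerably more elaborate.

```java
import java.nio.charset.StandardCharsets;

// All names here (ParserDispatchSketch, MiniParser, ...) are hypothetical
// illustrations and are not part of Apache Tika.
public class ParserDispatchSketch {

    // One generic interface that every format-specific parser implements.
    interface MiniParser {
        String parse(byte[] data);
    }

    // Stand-in for a class-file parser: only reports the size of the input.
    static class ClassFileParser implements MiniParser {
        @Override
        public String parse(byte[] data) {
            return "Java class file (" + data.length + " bytes)";
        }
    }

    // Stand-in for a plain-text parser: decodes the bytes as UTF-8.
    static class PlainTextParser implements MiniParser {
        @Override
        public String parse(byte[] data) {
            return new String(data, StandardCharsets.UTF_8);
        }
    }

    // Detects the type from the leading "magic" bytes.
    // Compiled class files start with CA FE BA BE; anything else is
    // treated as plain text in this simplified sketch.
    static String detect(byte[] data) {
        boolean classMagic = data.length >= 4
                && (data[0] & 0xFF) == 0xCA && (data[1] & 0xFF) == 0xFE
                && (data[2] & 0xFF) == 0xBA && (data[3] & 0xFF) == 0xBE;
        return classMagic ? "application/java-vm" : "text/plain";
    }

    // Callers use this single entry point; it picks the delegate parser.
    static class AutoDetect implements MiniParser {
        @Override
        public String parse(byte[] data) {
            MiniParser delegate = "application/java-vm".equals(detect(data))
                    ? new ClassFileParser()
                    : new PlainTextParser();
            return delegate.parse(data);
        }
    }

    public static void main(String[] args) {
        byte[] classHeader = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);

        MiniParser parser = new AutoDetect();
        System.out.println(parser.parse(classHeader)); // prints "Java class file (8 bytes)"
        System.out.println(parser.parse(text));        // prints "hello"
    }
}
```

The calling code only ever sees `MiniParser`, so a new format can be supported by adding another implementation and another detection rule without touching the callers.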
The following table describes the important methods used in the solution:
Method | Description |
---|---|
BodyContentHandler() | It creates a content handler that writes XHTML body character events to an internal string buffer. |
Metadata() | It constructs new, empty metadata. |
ParseContext() | It creates a parse context object that is used to pass context information to Tika parsers. |
parse() | Invoked on an instantiated parser object, it parses the input stream, sends the extracted content to the handler, and fills in the metadata object. |
Example: Java code to extract the contents of a Java class file
    // Java program to extract the contents
    // of a Java class file
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;

    // importing Apache Tika libraries
    import org.apache.tika.exception.TikaException;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.AutoDetectParser;
    import org.apache.tika.parser.ParseContext;
    import org.apache.tika.parser.Parser;
    import org.apache.tika.sax.BodyContentHandler;
    import org.xml.sax.SAXException;

    public class ParserExtraction {
        public static void main(final String[] args)
                throws IOException, SAXException, TikaException {
            // create a File object for the input file
            File f = new File("AddTwoNumbers.java");

            // AutoDetectParser detects the file type and delegates to a matching parser
            Parser parser = new AutoDetectParser();

            // instantiate the BodyContentHandler that collects the extracted text
            BodyContentHandler handler = new BodyContentHandler();

            // create the Metadata object
            Metadata metadata = new Metadata();

            // create a parse context object
            ParseContext context = new ParseContext();

            // parse the file; try-with-resources closes the stream afterwards
            try (FileInputStream inputstream = new FileInputStream(f)) {
                parser.parse(inputstream, handler, metadata, context);
            }

            // display the file content
            System.out.println("File content : " + handler.toString());
        }
    }
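As a quick cross-check on what a parser is looking at, the fixed header of a compiled class file can be read with the standard library alone. The sketch below is independent of Tika and only illustrates the file layout: a 4-byte magic number `0xCAFEBABE`, followed by a 2-byte minor version and a 2-byte major version, all big-endian. The class name `ClassHeaderReader` and its `describe` helper are hypothetical, and the file path is taken from the command line as an assumption about how you might run it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Apache Tika: reads the class file header.
public class ClassHeaderReader {

    // Returns the version fields of a class file, e.g. "major=52 minor=0".
    // Throws IllegalArgumentException if the magic number does not match.
    static String describe(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes); // big-endian by default
        int magic = buffer.getInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a Java class file");
        }
        int minor = buffer.getShort() & 0xFFFF; // unsigned 16-bit value
        int major = buffer.getShort() & 0xFFFF; // 52 corresponds to Java 8
        return "major=" + major + " minor=" + minor;
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Assumed usage: pass the path of a compiled .class file
            System.out.println(describe(Files.readAllBytes(Paths.get(args[0]))));
        } else {
            // Built-in sample: the first 8 bytes of a Java 8 class file
            byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
            System.out.println(describe(sample)); // prints "major=52 minor=0"
        }
    }
}
```

Note that this applies to compiled `.class` files; a `.java` source file is plain text and would fail the magic-number check.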
Copyright 2016-2023 www.programmingshark.com - All Rights Reserved.