Channel: Baeldung

Convert XML to HTML in Java

1. Introduction

In this tutorial, we'll describe how to convert XML to HTML using common Java libraries and template engines: JAXP, StAX, FreeMarker, and Mustache.

2. An XML to Unmarshal

Let's start off with a simple XML document that we'll unmarshal into a suitable Java representation before we convert it into HTML. We'll bear in mind a few key goals:

  1. Keep the same XML for all of our samples
  2. Create a syntactically and semantically valid HTML5 document at the end
  3. Convert all XML elements into text

Let's use a simple Jenkins notification as our sample XML:

<?xml version="1.0" encoding="UTF-8"?>
<notification>
    <from>builds@baeldung.com</from>
    <heading>Build #7 passed</heading>
    <content>Success: The Jenkins CI build passed</content>
</notification>

It's pretty straightforward: it includes a root element and some nested elements.

We'll aim to remove all of the unique XML tags and print out key-value pairs when we create our HTML file.

3. JAXP

Java API for XML Processing (JAXP) is the standard Java API for parsing and transforming XML. It bundles the SAX and DOM parsers together with the transformation API. Unlike JAXB, it doesn't bind XML to POJOs, so in this section "unmarshalling" simply means parsing the XML into a DOM tree, and "marshalling" means serializing a DOM tree back to text. We'll make use of the built-in DOM helpers.

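JAXP ships with the JDK, so most projects won't need an extra dependency. If we do need the standalone API, we can add the Maven dependency for JAXP to our project: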

<dependency>
    <groupId>javax.xml</groupId>
    <artifactId>jaxp-api</artifactId>
    <version>1.4.2</version>
</dependency>

3.1. Unmarshalling Using DOM Builder

Let's begin by first unmarshalling our XML file into a Java Element object:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

Document input = factory
  .newDocumentBuilder()
  .parse(resourcePath);
Element xml = input.getDocumentElement();
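
To try the parsing step without a file on disk, here is a minimal, self-contained sketch that reads the sample notification from a String. The class name DomParseSketch, the SAMPLE_XML constant, and the parseRoot helper are illustrative names of ours, not part of the article's code base:

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomParseSketch {

    // The sample notification from section 2, inlined for convenience
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<notification>"
        + "<from>builds@baeldung.com</from>"
        + "<heading>Build #7 passed</heading>"
        + "<content>Success: The Jenkins CI build passed</content>"
        + "</notification>";

    // Hypothetical helper: parses the XML with the same hardened factory settings
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document input = factory
              .newDocumentBuilder()
              .parse(new InputSource(new StringReader(xml)));
            return input.getDocumentElement();
        } catch (Exception e) {
            // Wrapped so callers of this sketch don't have to handle checked exceptions
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(SAMPLE_XML);
        System.out.println(root.getTagName());
    }
}
```

Because the DOCTYPE declaration is disallowed, a document that tries to declare external entities should be rejected at parse time instead of being silently expanded.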

3.2. Extracting the XML File Contents Into a Map

Now, let's build a Map with the relevant contents of our XML file:

Map<String, String> map = new HashMap<>();
map.put("heading", 
  xml.getElementsByTagName("heading")
    .item(0)
    .getTextContent());
map.put("from", String.format("from: %s",
  xml.getElementsByTagName("from")
    .item(0)
    .getTextContent()));
map.put("content", 
  xml.getElementsByTagName("content")
    .item(0)
    .getTextContent());
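
One thing to watch for: item(0) returns null when the element is missing, so the chained getTextContent() call would throw a NullPointerException on incomplete input. A possible guard, sketched here with a hypothetical textOf helper and fallback values that we chose purely for illustration, could look like this:

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomTextSketch {

    // Hypothetical helper: text of the first descendant with this tag, or the fallback
    static String textOf(Element parent, String tag, String fallback) {
        Node node = parent.getElementsByTagName(tag).item(0);
        return node == null ? fallback : node.getTextContent().trim();
    }

    static Map<String, String> toMap(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Element root = factory
              .newDocumentBuilder()
              .parse(new InputSource(new StringReader(xml)))
              .getDocumentElement();

            Map<String, String> map = new HashMap<>();
            map.put("heading", textOf(root, "heading", ""));
            // "unknown" is an arbitrary placeholder for a missing sender
            map.put("from", String.format("from: %s", textOf(root, "from", "unknown")));
            map.put("content", textOf(root, "content", ""));
            return map;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Only the heading is present; the other two entries use their fallbacks
        System.out.println(toMap("<notification><heading>Build #7 passed</heading></notification>"));
    }
}
```

Whether a missing element should fall back to a default or fail loudly depends on the use case; the fallback is just one option.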

3.3. Marshalling Using DOM Builder

Marshalling our XML into an HTML file is a little more involved.

Let's prepare a transfer Document that we'll use to write out the HTML:

Document doc = factory
  .newDocumentBuilder()
  .newDocument();

Next, we'll fill the Document with the Elements in our map:

Element html = doc.createElement("html");
html.setAttribute("lang", "en");

Element head = doc.createElement("head");

Element title = doc.createElement("title");
title.setTextContent(map.get("heading"));

head.appendChild(title);
html.appendChild(head);

Element body = doc.createElement("body");

Element from = doc.createElement("p");
from.setTextContent(map.get("from"));

Element success = doc.createElement("p");
success.setTextContent(map.get("content"));

body.appendChild(from);
body.appendChild(success);

html.appendChild(body);
doc.appendChild(html);

Finally, let's marshal our Document object using a TransformerFactory:

TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");

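try (Writer output = new StringWriter()) {
    Transformer transformer = transformerFactory.newTransformer();
    transformer.transform(new DOMSource(doc), new StreamResult(output));
    // output is only in scope inside this block, so we read the result here
    String result = output.toString();
}

Calling output.toString() inside the try block, while the writer is still in scope, gives us the HTML representation.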

Note that some of the extra features and attributes we set on our factory were taken from the recommendations of the OWASP project to avoid XXE injection.
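
The snippets above leave the serialization settings at their defaults. As a sketch of how we might make them explicit, the example below builds the same DOM tree and sets two standard output properties: OutputKeys.METHOD to request HTML serialization, and OutputKeys.DOCTYPE_SYSTEM with the "about:legacy-compat" value that HTML5 allows for generators that can't emit the short doctype. The DomToHtmlSketch class and its render method are illustrative names, and the exact whitespace of the result depends on the Transformer implementation:

```java
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomToHtmlSketch {

    // Hypothetical helper: builds the HTML tree and serializes it to a String
    static String render(String heading, String from, String content) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
              .newDocumentBuilder()
              .newDocument();

            Element html = doc.createElement("html");
            html.setAttribute("lang", "en");

            Element head = doc.createElement("head");
            Element title = doc.createElement("title");
            title.setTextContent(heading);
            head.appendChild(title);
            html.appendChild(head);

            Element body = doc.createElement("body");
            for (String text : new String[] { from, content }) {
                Element p = doc.createElement("p");
                p.setTextContent(text);
                body.appendChild(p);
            }
            html.appendChild(body);
            doc.appendChild(html);

            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

            Transformer transformer = transformerFactory.newTransformer();
            // Ask for HTML serialization explicitly instead of relying on auto-detection
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            // Requests a doctype declaration that HTML5 accepts from XML tooling
            transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "about:legacy-compat");

            StringWriter output = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(output));
            return output.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not render HTML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(
          "Build #7 passed",
          "from: builds@baeldung.com",
          "Success: The Jenkins CI build passed"));
    }
}
```

Setting the method explicitly makes the intent visible to readers of the code, even where the implementation would pick HTML on its own for a root element named html.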

4. StAX

Another library we can use is the Streaming API for XML (StAX). Like JAXP, StAX has been around for a long time, since 2004.

The DOM-based approach we used with JAXP hides the parsing behind a ready-made document tree. That's great for simple tasks or projects but less so when we need to iterate or have explicit and fine-grained control over element parsing itself. That's where StAX, a pull parser that hands us one event at a time, comes in handy.

The StAX API has been part of the JDK since Java 6. If we need the standalone API, we can add the Maven dependency for the StAX API to our project:

<dependency>
    <groupId>javax.xml.stream</groupId>
    <artifactId>stax-api</artifactId>
    <version>1.0-2</version>
</dependency>

4.1. Unmarshalling Using StAX

We'll use a simple iteration control flow to store XML values into our Map:

XMLInputFactory factory = XMLInputFactory.newInstance();
factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);

// Declared outside the try block so we can reuse it when we write the HTML
Map<String, String> map = new HashMap<>();
XMLStreamReader input = null;
try (FileInputStream file = new FileInputStream(resourcePath)) {
    input = factory.createXMLStreamReader(file);

    while (input.hasNext()) {
        input.next();
        if (input.isStartElement()) {
            if (input.getLocalName().equals("heading")) {
                map.put("heading", input.getElementText());
            }
            if (input.getLocalName().equals("from")) {
                map.put("from", String.format("from: %s", input.getElementText()));
            }
            if (input.getLocalName().equals("content")) {
                map.put("content", input.getElementText());
            }
        }
    }
} finally {
    if (input != null) {
        input.close();
    }
}
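
For experimenting without a file, the same pull loop can be sketched against a String. This version uses a switch on the element name instead of the chained if statements; StaxReadSketch and toMap are illustrative names, and the behavior is meant to match the loop above:

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxReadSketch {

    // Hypothetical helper: pulls the three values we care about out of the XML
    static Map<String, String> toMap(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);

        Map<String, String> map = new HashMap<>();
        try {
            XMLStreamReader input = factory.createXMLStreamReader(new StringReader(xml));
            while (input.hasNext()) {
                // We only react to start tags; every other event is skipped
                if (input.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                switch (input.getLocalName()) {
                    case "heading":
                        map.put("heading", input.getElementText());
                        break;
                    case "from":
                        map.put("from", String.format("from: %s", input.getElementText()));
                        break;
                    case "content":
                        map.put("content", input.getElementText());
                        break;
                    default:
                        // The root element and any unknown elements are ignored
                        break;
                }
            }
            input.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return map;
    }

    public static void main(String[] args) {
        String xml = "<notification>"
          + "<from>builds@baeldung.com</from>"
          + "<heading>Build #7 passed</heading>"
          + "<content>Success: The Jenkins CI build passed</content>"
          + "</notification>";
        System.out.println(toMap(xml));
    }
}
```

Note that getElementText() consumes events up to the matching end tag, so the loop simply continues with the next sibling afterwards.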

4.2. Marshalling Using StAX

Now, let's use our map and write out the HTML:

try (Writer output = new StringWriter()) {
    XMLStreamWriter writer = XMLOutputFactory
      .newInstance()
      .createXMLStreamWriter(output);

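    // writeDTD() writes its argument as-is in the JDK's default implementation,
    // which is why it works for the HTML5 doctype
    writer.writeDTD("<!DOCTYPE html>");
    writer.writeStartElement("html");
    writer.writeAttribute("lang", "en");
    writer.writeStartElement("head");
    // Reusing writeDTD() to emit the unclosed META tag as raw markup
    writer.writeDTD("<META http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">");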
    writer.writeStartElement("title");
    writer.writeCharacters(map.get("heading"));
    writer.writeEndElement();
    writer.writeEndElement();

    writer.writeStartElement("body");

    writer.writeStartElement("p");
    writer.writeCharacters(map.get("from"));
    writer.writeEndElement();

    writer.writeStartElement("p");
    writer.writeCharacters(map.get("content"));
    writer.writeEndElement();

    writer.writeEndElement();
    writer.writeEndDocument();
    writer.flush();
}

As in the JAXP example, we can call output.toString(), while the writer is still in scope, to get the HTML representation.
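
A useful property of writing text through writeCharacters() is that markup-significant characters are escaped for us, which matters when the XML values end up inside HTML. The small sketch below illustrates this with the JDK's default StAX implementation; StaxEscapeSketch and paragraph are illustrative names of ours:

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StaxEscapeSketch {

    // Hypothetical helper: wraps arbitrary text in a <p> element
    static String paragraph(String text) {
        try {
            StringWriter output = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory
              .newInstance()
              .createXMLStreamWriter(output);

            writer.writeStartElement("p");
            // Special characters in the text are written as entity references
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.flush();
            writer.close();

            return output.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write HTML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(paragraph("a < b & c"));
    }
}
```

This is also a reason to prefer writeCharacters() over string concatenation when the input isn't fully trusted.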

5. Using Template Engines

As an alternative to writing the HTML representation by hand, we can use template engines. There are multiple options in the Java ecosystem. Let's explore some of them.

5.1. Using Apache FreeMarker

Apache FreeMarker is a Java-based template engine for generating text output (HTML web pages, e-mails, configuration files, source code, etc.) based on templates and changing data.

In order to use it, we'll need to add the freemarker dependency to our Maven project:

<dependency>
    <groupId>org.freemarker</groupId>
    <artifactId>freemarker</artifactId>
    <version>2.3.29</version>
</dependency>

First, let's create a template using the FreeMarker syntax:

<!DOCTYPE html>
<html lang="en">
<head>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>${heading}</title>
</head>
<body>
<p>${from}</p>
<p>${content}</p>
</body>
</html>

Now, let's reuse our map and fill the gaps in the template:

Configuration cfg = new Configuration(Configuration.VERSION_2_3_29);
cfg.setDirectoryForTemplateLoading(new File(templateDirectory));
cfg.setDefaultEncoding(StandardCharsets.UTF_8.toString());
cfg.setTemplateExceptionHandler(TemplateExceptionHandler.RETHROW_HANDLER);
cfg.setLogTemplateExceptions(false);
cfg.setWrapUncheckedExceptions(true);
cfg.setFallbackOnNullLoopVariable(false);
Template temp = cfg.getTemplate(templateFile);
try (Writer output = new StringWriter()) {
    // map is the Map we filled from the XML earlier
    temp.process(map, output);
}

5.2. Using Mustache

Mustache is a logic-less template engine. Mustache can be used for HTML, config files, source code, and pretty much anything else. It works by expanding tags in a template using values provided in a hash or object.

To use it, we'll need to add the mustache dependency to our Maven project:

<dependency>
    <groupId>com.github.spullara.mustache.java</groupId>
    <artifactId>compiler</artifactId>
    <version>0.9.6</version>
</dependency>

Let's start by creating a template using the Mustache syntax:

<!DOCTYPE html>
<html lang="en">
<head>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>{{heading}}</title>
</head>
<body>
<p>{{from}}</p>
<p>{{content}}</p>
</body>
</html>

Now, let's fill the template with our map:

MustacheFactory mf = new DefaultMustacheFactory();
Mustache mustache = mf.compile(templateFile);
try (Writer output = new StringWriter()) {
    // map is the Map we filled from the XML earlier
    mustache.execute(output, map);
    output.flush();
}
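
To make the idea of "expanding tags using values from a hash" more concrete, here is a toy illustration written with plain JDK classes. It is not Mustache and not how the library is implemented; it only mimics two behaviors of simple {{name}} tags that the Mustache specification describes: values are HTML-escaped and missing keys render as an empty string. All names in it are ours:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagExpansionSketch {

    // Matches simple tags such as {{heading}}; sections and partials are out of scope
    private static final Pattern TAG = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Minimal HTML escaping for text content and attribute values
    static String escapeHtml(String value) {
        return value
          .replace("&", "&amp;")
          .replace("<", "&lt;")
          .replace(">", "&gt;")
          .replace("\"", "&quot;");
    }

    // Replaces every {{key}} with the escaped value from the map, or "" if absent
    static String expand(String template, Map<String, String> values) {
        Matcher matcher = TAG.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.getOrDefault(matcher.group(1), "");
            // quoteReplacement keeps '$' and '\' in the value from being interpreted
            matcher.appendReplacement(out, Matcher.quoteReplacement(escapeHtml(value)));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("heading", "Build #7 passed");
        map.put("from", "from: builds@baeldung.com");
        System.out.println(expand("<title>{{heading}}</title><p>{{from}}</p><p>{{content}}</p>", map));
    }
}
```

A real engine adds sections, partials, caching, and much more, which is why we reach for the library rather than a regular expression in practice.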

6. The Resulting HTML

In the end, all our code samples produce essentially the same HTML document (whitespace may differ between approaches):

<!DOCTYPE html>
<html lang="en">
<head>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>Build #7 passed</title>
</head>
<body>
<p>from: builds@baeldung.com</p>
<p>Success: The Jenkins CI build passed</p>
</body>
</html>

7. Conclusion

In this tutorial, we've learned the basics of using JAXP, StAX, FreeMarker, and Mustache to convert XML into HTML.

As always, the complete code samples seen here are available over on GitHub.

