Parsing an XML File Using StAX

1. Introduction

In this tutorial, we'll illustrate how to parse an XML file using StAX. We'll implement a simple XML parser and see how it works with an example.

2. Parsing with StAX

StAX (the Streaming API for XML) is one of several XML libraries in Java. It's memory-efficient and has been included in the JDK since Java 6. StAX doesn't load the entire XML document into memory. Instead, it pulls data from a stream in a forward-only fashion. StAX offers both a cursor API (XMLStreamReader) and an event iterator API; in this tutorial, we read the stream with an XMLEventReader object.

3. XMLEventReader Class

In StAX, each piece of the document, such as a start tag, an end tag, or a run of character data, is an event. XMLEventReader reads an XML file as a stream of XMLEvent objects. The methods we need for parsing live on the events themselves rather than on the reader. The most important ones are:

  • XMLEvent.isStartElement(): checks if the current event is a StartElement (start tag)
  • XMLEvent.isEndElement(): checks if the current event is an EndElement (end tag)
  • XMLEvent.asCharacters(): returns the current event as a Characters event, and throws a ClassCastException if it isn't one
  • StartElement.getName() and EndElement.getName(): get the qualified name (QName) of the element
  • StartElement.getAttributes(): returns an Iterator of the start element's attributes
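To make the event model concrete, here's a minimal sketch that labels each event the reader produces for a small in-memory document. The class name EventKindsDemo and the method describeEvents() are our own illustrative names, not part of StAX:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class EventKindsDemo {

    // Returns a short label for each start tag, end tag, and text event, in document order.
    // describeEvents() is a hypothetical helper written for this illustration.
    static List<String> describeEvents(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
          .createXMLEventReader(new StringReader(xml));
        List<String> labels = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    labels.add("start:" + event.asStartElement().getName().getLocalPart());
                } else if (event.isEndElement()) {
                    labels.add("end:" + event.asEndElement().getName().getLocalPart());
                } else if (event.isCharacters()) {
                    labels.add("text:" + event.asCharacters().getData());
                }
                // Other events (StartDocument, EndDocument, comments, ...) are skipped.
            }
        } finally {
            reader.close();
        }
        return labels;
    }

    public static void main(String[] args) throws XMLStreamException {
        describeEvents("<name>Baeldung</name>").forEach(System.out::println);
    }
}
```

For <name>Baeldung</name>, the list starts with a start label for name, ends with an end label for name, and carries the text in between as one or more text labels.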

4. Implementing a Simple XML Parser

The first step in parsing an XML file is to read it. We need an XMLInputFactory to create an XMLEventReader for reading our file:

XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
XMLEventReader reader = xmlInputFactory.createXMLEventReader(new FileInputStream(path));
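XMLEventReader.close() frees the reader's own resources but doesn't close the underlying input source, so in real code we'd typically close both. The following is a minimal sketch of one way to do that; ReaderSetupDemo and countStartElements() are hypothetical names used only for this illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

class ReaderSetupDemo {

    // Counts the start tags in a file, closing both the reader and the stream afterwards.
    static int countStartElements(Path path) throws IOException, XMLStreamException {
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        // try-with-resources closes the file stream; the reader is closed in finally.
        try (InputStream in = Files.newInputStream(path)) {
            XMLEventReader reader = xmlInputFactory.createXMLEventReader(in);
            try {
                int count = 0;
                while (reader.hasNext()) {
                    if (reader.nextEvent().isStartElement()) {
                        count++;
                    }
                }
                return count;
            } finally {
                reader.close();
            }
        }
    }
}
```

XMLEventReader doesn't implement AutoCloseable, which is why the sketch closes it in a finally block instead of listing it in the try-with-resources header.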

Now that the XMLEventReader is ready, we move forward through the stream with nextEvent():

while (reader.hasNext()) {
    XMLEvent nextEvent = reader.nextEvent();
}

Inside the loop, we look for our desired start tag:

if (nextEvent.isStartElement()) {
    StartElement startElement = nextEvent.asStartElement();
    if (startElement.getName().getLocalPart().equals("desired")) {
        //...
    }
}

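Once we've found it, we can read its attributes. The element's text arrives as a separate Characters event, so we advance the reader before reading it:

String url = startElement.getAttributeByName(new QName("url")).getValue();
// the text is the event that follows the start tag
nextEvent = reader.nextEvent();
String name = nextEvent.asCharacters().getData();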
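Two things can go wrong when reading attributes and text: getAttributeByName() returns null when the attribute is missing, and an empty element has no Characters event at all. Here's a defensive sketch that handles both, assuming a document with a single website element. The names AttributeAndTextDemo, readText(), and describeWebsite() are hypothetical helpers of ours:

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class AttributeAndTextDemo {

    // Collects consecutive Characters events; returns "" for an empty element.
    static String readText(XMLEventReader reader) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        // peek() looks at the next event without consuming it.
        while (reader.hasNext() && reader.peek().isCharacters()) {
            text.append(reader.nextEvent().asCharacters().getData());
        }
        return text.toString();
    }

    // Returns "<name> @ <url>" for a document that contains one website element.
    static String describeWebsite(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
          .createXMLEventReader(new StringReader(xml));
        String url = null;
        String name = null;
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (!event.isStartElement()) {
                    continue;
                }
                StartElement start = event.asStartElement();
                String tag = start.getName().getLocalPart();
                if (tag.equals("website")) {
                    // getAttributeByName() returns null if the attribute is absent.
                    Attribute attr = start.getAttributeByName(new QName("url"));
                    url = (attr != null) ? attr.getValue() : null;
                } else if (tag.equals("name")) {
                    name = readText(reader);
                }
            }
        } finally {
            reader.close();
        }
        return name + " @ " + url;
    }
}
```

Appending every consecutive Characters event also covers the case where the parser splits a long text run into several events.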

We can also check if we've reached an end tag:

if (nextEvent.isEndElement()) {
    EndElement endElement = nextEvent.asEndElement();
}
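As a small illustration of pairing start and end tags, the sketch below tracks the nesting depth of a document by incrementing a counter on each start tag and decrementing it on each end tag. DepthDemo and maxDepth() are hypothetical names, not part of StAX:

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class DepthDemo {

    // Returns the deepest element nesting level found in the document.
    static int maxDepth(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
          .createXMLEventReader(new StringReader(xml));
        int depth = 0;
        int max = 0;
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    depth++;                      // entering an element
                    max = Math.max(max, depth);
                } else if (event.isEndElement()) {
                    depth--;                      // leaving an element
                }
            }
        } finally {
            reader.close();
        }
        return max;
    }
}
```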

5. Parsing Example

To get a better understanding, let's run our parser on a sample XML file:

<?xml version="1.0" encoding="UTF-8"?>
<websites>
    <website url="https://baeldung.com">
        <name>Baeldung</name>
        <category>Online Courses</category>
        <status>Online</status>
    </website>
    <website url="http://example.com">
        <name>Example</name>
        <category>Examples</category>
        <status>Offline</status>
    </website>
    <website url="http://localhost:8080">
        <name>Localhost</name>
        <category>Tests</category>
        <status>Offline</status>
    </website>
</websites>

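Let's parse the XML and store all the data in a list of entity objects called websites. We assume that the websites list and the website variable holding the current entity are declared before the loop: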

while (reader.hasNext()) {
    XMLEvent nextEvent = reader.nextEvent();
    if (nextEvent.isStartElement()) {
        StartElement startElement = nextEvent.asStartElement();
        switch (startElement.getName().getLocalPart()) {
            case "website":
                website = new WebSite();
                Attribute url = startElement.getAttributeByName(new QName("url"));
                if (url != null) {
                    website.setUrl(url.getValue());
                }
                break;
            case "name":
                nextEvent = reader.nextEvent();
                website.setName(nextEvent.asCharacters().getData());
                break;
            case "category":
                nextEvent = reader.nextEvent();
                website.setCategory(nextEvent.asCharacters().getData());
                break;
            case "status":
                nextEvent = reader.nextEvent();
                website.setStatus(nextEvent.asCharacters().getData());
                break;
        }
    }
    if (nextEvent.isEndElement()) {
        EndElement endElement = nextEvent.asEndElement();
        if (endElement.getName().getLocalPart().equals("website")) {
            websites.add(website);
        }
    }
}

To get all the properties of each website, we check startElement.getName().getLocalPart() for each start element and set the corresponding property. When we reach the website's end element, we know that our entity is complete, so we add it to our websites list.
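For reference, here's a self-contained sketch of the whole parser that can be run as-is. It makes several assumptions that aren't in the snippets above: the WebSite entity is reduced to plain fields instead of setters, the XML comes from an in-memory string rather than a file, and IS_COALESCING is enabled so that each element's text arrives as a single Characters event. It also assumes every name, category, and status element is non-empty and nested inside a website element:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class StaxWebsitesDemo {

    // Simplified stand-in for the article's entity (plain fields instead of setters).
    static class WebSite {
        String url;
        String name;
        String category;
        String status;
    }

    static List<WebSite> parse(InputStream in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character data so each element's text is one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, true);
        XMLEventReader reader = factory.createXMLEventReader(in);
        List<WebSite> websites = new ArrayList<>();
        WebSite website = null;
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    StartElement start = event.asStartElement();
                    switch (start.getName().getLocalPart()) {
                        case "website":
                            website = new WebSite();
                            Attribute url = start.getAttributeByName(new QName("url"));
                            if (url != null) {
                                website.url = url.getValue();
                            }
                            break;
                        case "name":
                            website.name = reader.nextEvent().asCharacters().getData();
                            break;
                        case "category":
                            website.category = reader.nextEvent().asCharacters().getData();
                            break;
                        case "status":
                            website.status = reader.nextEvent().asCharacters().getData();
                            break;
                        default:
                            break; // ignore other elements, such as the root
                    }
                } else if (event.isEndElement()
                  && event.asEndElement().getName().getLocalPart().equals("website")) {
                    websites.add(website); // the entity is complete
                }
            }
        } finally {
            reader.close();
        }
        return websites;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<websites>"
          + "<website url=\"https://baeldung.com\"><name>Baeldung</name>"
          + "<category>Online Courses</category><status>Online</status></website>"
          + "<website url=\"http://example.com\"><name>Example</name>"
          + "<category>Examples</category><status>Offline</status></website>"
          + "</websites>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (WebSite w : parse(in)) {
            System.out.println(w.name + " (" + w.url + "): " + w.status);
        }
    }
}
```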

6. Conclusion

In this tutorial, we learned how to parse an XML file using the StAX library. The example XML file and the full parser code are available over on GitHub.

