1. Introduction
Application logs are important resources for troubleshooting, measuring performance, or simply checking the behavior of a software application.
In this tutorial, we’ll learn how to implement structured logging in Java and the advantages of this technique over unstructured logging.
2. Structured vs. Unstructured Logs
Before jumping into code, let’s understand the key difference between unstructured and structured logs.
An unstructured log is a piece of information printed with consistent formatting but without structure. It’s simply a block of text with some variables concatenated and formatted within it.
Let’s look at one example of an unstructured log taken from a demo Spring application:
22:25:48.111 [restartedMain] INFO o.s.d.r.c.RepositoryConfigurationDelegate - Finished Spring Data repository scanning in 42 ms. Found 1 JPA repository interfaces.
The above log shows the timestamp, thread name, log level, abbreviated fully-qualified class name, and a description of what Spring is doing at that time. It’s a helpful piece of information when we’re observing application behavior.
However, it’s harder to extract information from an unstructured log. For instance, it’s not trivial to identify and extract the class name that generated that log, as we might need to use String manipulation logic to find it.
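To make that cost concrete, here is a minimal sketch of the string manipulation needed to pull the class name out of the line above. The UnstructuredLogParser class and the regular expression are illustrative assumptions; the pattern only fits this particular layout and would need to change whenever the log format changes:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any logging library
class UnstructuredLogParser {

    // Assumed layout: "<time> [<thread>] <LEVEL> <logger> - <message>"
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) \\[([^\\]]+)] (\\w+)\\s+(\\S+) - (.*)$");

    // Returns the logger (class) name, or empty if the line doesn't match the assumed layout
    static Optional<String> extractLoggerName(String line) {
        Matcher matcher = LINE.matcher(line);
        return matcher.matches() ? Optional.of(matcher.group(4)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "22:25:48.111 [restartedMain] INFO o.s.d.r.c.RepositoryConfigurationDelegate"
            + " - Finished Spring Data repository scanning in 42 ms. Found 1 JPA repository interfaces.";
        System.out.println(extractLoggerName(line).orElse("<no match>"));
    }
}
```

Running main prints o.s.d.r.c.RepositoryConfigurationDelegate, but any line that deviates from the assumed layout yields no match at all.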
In contrast, structured logs show each piece of information individually in a dictionary-like way. We can think of them as informational objects instead of Strings. Let’s look at a possible structured log solution applied to the unstructured log example:
{
    "timestamp": "22:25:48.111",
    "thread": "restartedMain",
    "log_level": "INFO",
    "class": "o.s.d.r.c.RepositoryConfigurationDelegate",
    "message": "Finished Spring Data repository scanning in 42 ms. Found 1 JPA repository interfaces."
}
In a structured log, it’s easier to extract a specific field value since we can access it using its name. Hence, we don’t need to process text and find specific patterns therein to extract information. For example, in our code, we can simply use the class field to access the class name that generated the log.
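As a rough illustration of access by name, the sketch below models a subset of the structured event as a plain Map. The StructuredLogAccess class is hypothetical, and building the map by hand is an assumption made to keep the example dependency-free; in practice, a JSON parser would typically produce an equivalent structure:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class; the map stands in for a parsed JSON log event
class StructuredLogAccess {

    // Builds a subset of the structured event shown above
    static Map<String, String> sampleEvent() {
        Map<String, String> event = new LinkedHashMap<>();
        event.put("timestamp", "22:25:48.111");
        event.put("log_level", "INFO");
        event.put("class", "o.s.d.r.c.RepositoryConfigurationDelegate");
        event.put("message",
            "Finished Spring Data repository scanning in 42 ms. Found 1 JPA repository interfaces.");
        return event;
    }

    public static void main(String[] args) {
        // No pattern matching needed: the field name is the lookup key
        System.out.println(sampleEvent().get("class"));
    }
}
```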
3. Configuring Structured Logs
In this section, we’ll dive into the details of implementing structured logging in Java applications using logback and slf4j libraries.
3.1. Dependencies
To make things work properly, we need to add a few dependencies to our pom.xml file:
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>2.0.9</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.4.14</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-core</artifactId>
    <version>1.4.14</version>
</dependency>
The slf4j-api dependency provides the logging facade, while logback-classic and logback-core provide the implementation behind it. Together, they make it easy to set up logging in Java applications. Note that if we’re using Spring Boot, we don’t need to add these three dependencies because spring-boot-starter-logging already brings them in transitively.
Let’s add another dependency, logstash-logback-encoder, that helps to implement structured log formats and layouts:
<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>7.4</version>
</dependency>
Remember to always use the latest possible version of the dependencies mentioned.
3.2. Configuring the Basics of logback for Structured Logs
To log information in a structured way, we need to configure logback. To do so, let’s create a logback.xml file with some initial content:
<configuration>
    <appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder">
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="jsonConsoleAppender"/>
    </root>
</configuration>
In the above file, we configured an appender named jsonConsoleAppender that is backed by the existing ConsoleAppender class from logback-core.
We’ve also set an encoder pointing to the LogstashEncoder class from the logstash-logback-encoder library. That encoder is responsible for transforming a log event into JSON format and outputting the information.
With all that set, let’s see a sample log entry:
{"@timestamp":"2023-12-20T22:16:25.2831944-03:00","@version":"1","message":"Example log message","logger_name":"info_logger","thread_name":"main","level":"INFO","level_value":20000,"custom_message":"my_message","password":"123456"}
The above log line is structured in JSON format, with metadata fields such as @timestamp and level alongside custom fields like custom_message and password.
3.3. Improving Structured Logs
To make our logger more readable and secure, let’s modify our logback.xml:
<configuration>
    <appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder">
            <includeCallerData>true</includeCallerData>
            <jsonGeneratorDecorator class="net.logstash.logback.decorate.CompositeJsonGeneratorDecorator">
                <decorator class="net.logstash.logback.decorate.PrettyPrintingJsonGeneratorDecorator"/>
                <decorator class="net.logstash.logback.mask.MaskingJsonGeneratorDecorator">
                    <defaultMask>XXXX</defaultMask>
                    <path>password</path>
                </decorator>
            </jsonGeneratorDecorator>
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="jsonConsoleAppender"/>
    </root>
</configuration>
Here, we’ve added a few tags to improve the readability of the output, added more metadata, and obfuscated some fields. Let’s look at each one individually:
- configuration: The root tag containing the logging configuration
- appender name: The appender name that we’ve defined to reuse in other tags
- appender class: The fully-qualified name of the class that implements the logging appender. We’ve used the ConsoleAppender class from logback-core.
- encoder class: The logging encoder implementation, which in our case is the LogstashEncoder from logstash-logback-encoder
- includeCallerData: Adds more information about the caller code that originated that log line
- jsonGeneratorDecorator: References the CompositeJsonGeneratorDecorator class, which lets us combine several nested decorator tags in a single encoder.
- decorator class: We’ve used the PrettyPrintingJsonGeneratorDecorator class to pretty-print the JSON output, showing each field on a separate line.
- decorator class: Here, the MaskingJsonGeneratorDecorator class masks the values of the fields we configure in it.
- defaultMask: The value that replaces the content of the fields defined in the path tag. This is useful to mask sensitive data such as PII and helps keep our logs compliant with data protection requirements.
- path: The name of the field to which the mask defined in the defaultMask tag is applied
With the new configuration, the log entry from section 3.2 should look similar to:
{
    "@timestamp" : "2023-12-20T22:44:58.0961616-03:00",
    "@version" : "1",
    "message" : "Example log message",
    "logger_name" : "info_logger",
    "thread_name" : "main",
    "level" : "INFO",
    "level_value" : 20000,
    "custom_message" : "my_message",
    "password" : "XXXX",
    "caller_class_name" : "StructuredLog4jExampleUnitTest",
    "caller_method_name" : "givenStructuredLog_whenUseLog4j_thenExtractCorrectInformation",
    "caller_file_name" : "StructuredLog4jExampleUnitTest.java",
    "caller_line_number" : 16
}
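To build some intuition for what the masking step does to an event, here is a small conceptual sketch in plain Java. It is not how MaskingJsonGeneratorDecorator is implemented; the FieldMaskingSketch class and its mask() method are hypothetical, and the sketch only handles top-level fields of an event modeled as a Map:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Conceptual illustration only; the real decorator works on the JSON generator
class FieldMaskingSketch {

    // Returns a copy of the event where values of the listed fields are replaced by the mask
    static Map<String, Object> mask(Map<String, Object> event, Set<String> maskedFields, String mask) {
        Map<String, Object> masked = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : event.entrySet()) {
            masked.put(entry.getKey(), maskedFields.contains(entry.getKey()) ? mask : entry.getValue());
        }
        return masked;
    }

    public static void main(String[] args) {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("custom_message", "my_message");
        event.put("password", "123456");
        System.out.println(mask(event, Set.of("password"), "XXXX"));
    }
}
```

Running main prints {custom_message=my_message, password=XXXX}, mirroring the effect of the defaultMask and path tags on the password field.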
4. Implementing Structured Logs
To illustrate structured logging, we’ll use a demo application with a User class.
4.1. Creating the Demo User class
Let’s first create a Java POJO named User:
public class User {
    private String id;
    private String name;
    private String password;

    // getters, setters, and all-args constructor
}
4.2. Exercising Use-Cases of Structured Loggers
Let’s create a test class to illustrate the usage of structured logging:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class StructuredLog4jExampleUnitTest {
    Logger logger = LoggerFactory.getLogger("logger_name_example");
    //...
}
Here, we created a variable to store an instance of the Logger interface. We’ve used the LoggerFactory.getLogger() method with an arbitrary name as a parameter to get a valid implementation of Logger.
Now, let’s define a test case to print a message at info level:
@Test
void whenInfoLoggingData_thenFormatItCorrectly() {
    User user = new User("1", "John Doe", "123456");
    logger.atInfo().setMessage("Processed user successfully")
      .addKeyValue("user_info", user)
      .log();
}
In the above code, we’ve defined a User with some data. Then, we used the addKeyValue() method of the LoggingEventBuilder returned by atInfo() to attach the user_info field to the log event.
Let’s see how the logger outputs the log with the newly added information user_info:
{
    "@timestamp" : "2023-12-21T23:58:03.0581889-03:00",
    "@version" : "1",
    "message" : "Processed user successfully",
    "logger_name" : "logger_name_example",
    "thread_name" : "main",
    "level" : "INFO",
    "level_value" : 20000,
    "user_info" : {
        "id" : "1",
        "name" : "John Doe",
        "password" : "XXXX"
    },
    "caller_class_name" : "StructuredLog4jExampleUnitTest",
    "caller_method_name" : "whenInfoLoggingData_thenFormatItCorrectly",
    "caller_file_name" : "StructuredLog4jExampleUnitTest.java",
    "caller_line_number" : 21
}
Logs are also helpful in identifying errors in our code. So, let’s use LoggingEventBuilder to log an error from a catch block:
@Test
void givenStructuredLog_whenUseLog4j_thenExtractCorrectInformation() {
    User user = new User("1", "John Doe", "123456");
    try {
        throwExceptionMethod();
    } catch (RuntimeException ex) {
        logger.atError().addKeyValue("user_info", user)
          .setMessage("Error processing given user")
          .addKeyValue("exception_class", ex.getClass().getSimpleName())
          .addKeyValue("error_message", ex.getMessage())
          .log();
    }
}
In the test above, we’ve added more key-value pairs for the exception message and class name. Let’s see the log output:
{
    "@timestamp" : "2023-12-22T00:04:52.8414988-03:00",
    "@version" : "1",
    "message" : "Error processing given user",
    "logger_name" : "logger_name_example",
    "thread_name" : "main",
    "level" : "ERROR",
    "level_value" : 40000,
    "user_info" : {
        "id" : "1",
        "name" : "John Doe",
        "password" : "XXXX"
    },
    "exception_class" : "RuntimeException",
    "error_message" : "Error saving user data",
    "caller_class_name" : "StructuredLog4jExampleUnitTest",
    "caller_method_name" : "givenStructuredLog_whenUseLog4j_thenExtractCorrectInformation",
    "caller_file_name" : "StructuredLog4jExampleUnitTest.java",
    "caller_line_number" : 35
}
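The test calls throwExceptionMethod(), whose body isn’t shown. Judging only from the exception_class and error_message values in the output, a minimal stand-in could look like the sketch below. The FailingOperation wrapper class is a hypothetical name, and the original helper may well differ:

```java
// Hypothetical wrapper class; only the thrown type and message are taken from the log output
class FailingOperation {

    // Stand-in for throwExceptionMethod(): always fails with the message seen in the log
    static void throwExceptionMethod() {
        throw new RuntimeException("Error saving user data");
    }

    public static void main(String[] args) {
        try {
            throwExceptionMethod();
        } catch (RuntimeException ex) {
            // The same two values the test adds as exception_class and error_message
            System.out.println(ex.getClass().getSimpleName() + ": " + ex.getMessage());
        }
    }
}
```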
5. Advantages of Structured Logging
Structured logging has some advantages over unstructured logging, like readability and efficiency.
5.1. Readability
Logs are typically one of the best tools to troubleshoot software, measure performance, and check if the applications behave as expected. Thus, creating a system where we can read log lines more easily is important.
Structured logs show data as a dictionary, which makes it easier for the human brain to search for a specific field across the log line. It’s the same concept as searching for a specific chapter in a book using an index versus reading the content page by page.
5.2. Efficiency
In general, data visualization tools like Kibana, New Relic, and Splunk use a query language to search for a specific value across all log lines in a specific time window. Log search queries are easier to write when using structured logging since the data is in a key-value format.
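The idea behind such field-based queries can be sketched in plain Java, with each log event modeled as a Map. The LogQuerySketch class and whereFieldEquals() method are hypothetical and only mimic what a query language does; they don’t represent the API of any of the tools mentioned:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of a "field equals value" query over structured events
class LogQuerySketch {

    // Selects the events whose given field equals the expected value
    static List<Map<String, Object>> whereFieldEquals(
        List<Map<String, Object>> events, String field, Object expected) {
        return events.stream()
            .filter(event -> expected.equals(event.get(field)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> events = List.of(
            Map.of("level", "INFO", "message", "Example log message"),
            Map.of("level", "ERROR", "message", "Error processing given user"));
        // Matches on the level field only, never on words inside the message text
        System.out.println(whereFieldEquals(events, "level", "ERROR").size());
    }
}
```

Because the filter targets one named field, an INFO message that merely contains the word ERROR wouldn’t be returned, which is harder to guarantee with a plain text search.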
Additionally, using structured logging, it’s easier to create business metrics about the data provided. In that case, searching for business data in a consistent, structured format is easier and more efficient than searching for specific words in the whole log text.
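As a small illustration of deriving a metric from structured events, the sketch below counts events per value of a field. The LogMetricsSketch class and countBy() method are hypothetical, and the events are again modeled as simple Maps rather than read from a real log store:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch of a simple metric computed from structured events
class LogMetricsSketch {

    // Counts events per value of the given field; events missing the field are skipped
    static Map<String, Long> countBy(List<Map<String, String>> events, String field) {
        return events.stream()
            .filter(event -> event.containsKey(field))
            .collect(Collectors.groupingBy(event -> event.get(field), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Map<String, String>> events = List.of(
            Map.of("level", "INFO"),
            Map.of("level", "ERROR"),
            Map.of("level", "INFO"));
        System.out.println(countBy(events, "level"));
    }
}
```

Running main prints {ERROR=1, INFO=2}. The same grouping could be applied to a business field, such as a hypothetical order_status key, without parsing any message text.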
Finally, queries over structured data can match on specific fields instead of scanning the whole log text, which might decrease cloud computing costs depending on the tool used.
6. Conclusion
In this article, we saw one way to implement structured logging in Java using slf4j and logback.
Using formatted, structured logs allows machines and humans to read them faster, making our application easier to troubleshoot and reducing the complexity of consuming log events.
As always, the source code is available over on GitHub.