1. Overview
In this quick article, we'll take a look at the differences between YAML and JSON through quick and practical examples.
2. Format
To get a clearer picture, let's start by looking at the JSON and YAML representations of a simple POJO:
class Person {
    String name;
    Integer age;
    List<String> hobbies;
    Person manager;
}
First, let's look at its JSON representation:
{
    "name":"John Smith",
    "age":26,
    "hobbies":[
        "sports",
        "cooking"
    ],
    "manager":{
        "name":"Jon Doe",
        "age":45,
        "hobbies":[
            "fishing"
        ],
        "manager":null
    }
}
JSON syntax is somewhat verbose: it relies on punctuation such as curly braces {} for objects, square brackets [] for arrays, and double quotes around every key and string.
Next, let's see how the same structure would look in YAML:
name: John Smith
age: 26
hobbies:
  - sports
  - cooking
manager:
  name: Jon Doe
  age: 45
  hobbies:
    - fishing
  manager:
YAML's syntax looks a bit friendlier, as it uses indentation to denote nesting and '-' to mark array elements.
Both formats are easy to read, but YAML tends to be the more human-readable of the two.
Another point in YAML's favor is the number of lines it takes to represent the same information: YAML takes only 11 lines, while JSON takes 16.
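To make the mapping between the two notations concrete, here is a minimal hand-rolled sketch that prints the same Person in both formats. It is illustrative only: the PersonFormats class and its toJson and toYaml helpers are hypothetical names, no escaping or quoting of special characters is performed, and real code would normally rely on a JSON or YAML library instead.

```java
import java.util.List;
import java.util.stream.Collectors;

final class PersonFormats {

    // Same shape as the Person class above, with a constructor added for brevity.
    static final class Person {
        final String name;
        final Integer age;
        final List<String> hobbies;
        final Person manager;

        Person(String name, Integer age, List<String> hobbies, Person manager) {
            this.name = name;
            this.age = age;
            this.hobbies = hobbies;
            this.manager = manager;
        }
    }

    // JSON: braces for objects, brackets for arrays, quotes around keys and strings.
    static String toJson(Person p) {
        if (p == null) {
            return "null";
        }
        String hobbies = p.hobbies.stream()
                .map(h -> "\"" + h + "\"")
                .collect(Collectors.joining(","));
        return "{\"name\":\"" + p.name + "\",\"age\":" + p.age
                + ",\"hobbies\":[" + hobbies + "],\"manager\":" + toJson(p.manager) + "}";
    }

    // YAML: nesting is expressed by indentation, list items by a leading "- ".
    static String toYaml(Person p, String indent) {
        StringBuilder sb = new StringBuilder();
        sb.append(indent).append("name: ").append(p.name).append('\n');
        sb.append(indent).append("age: ").append(p.age).append('\n');
        sb.append(indent).append("hobbies:\n");
        for (String hobby : p.hobbies) {
            sb.append(indent).append("  - ").append(hobby).append('\n');
        }
        // An empty value after the colon is read as null.
        sb.append(indent).append("manager:\n");
        if (p.manager != null) {
            sb.append(toYaml(p.manager, indent + "  "));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Person boss = new Person("Jon Doe", 45, List.of("fishing"), null);
        Person john = new Person("John Smith", 26, List.of("sports", "cooking"), boss);
        System.out.println(toJson(john));
        System.out.print(toYaml(john, ""));
    }
}
```

The JSON helper emits the compact, single-line form, while the YAML helper has to thread the current indentation through every recursive call, because in YAML the indentation itself carries the structure.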
3. Size
We've seen in the previous section that YAML is represented in fewer lines than JSON, but does that mean that it takes less space?
Let's imagine a deeply nested structure, a chain of seven nested child keys whose innermost value is null, represented as JSON:
{
    "child":{
        "child":{
            "child":{
                "child":{
                    "child":{
                        "child":{
                            "child":null
                        }
                    }
                }
            }
        }
    }
}
The same structure would look similar in YAML:
child:
  child:
    child:
      child:
        child:
          child:
            child:
At first sight, it might look like JSON takes more space. In reality, the JSON specification treats whitespace and newlines between tokens as insignificant, so the document can be shortened as follows:
{"child":{"child":{"child":{"child":{"child":{"child":{"child":null}}}}}}}
We can see that the compact form is much shorter: it occupies only 74 bytes, while the YAML format takes 97 bytes, because YAML's block style cannot drop the indentation and newlines that carry its structure.
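The comparison can be reproduced with a few lines of plain Java. This is a minimal sketch under stated assumptions: the SizeComparison class and its helper methods are illustrative names, and the YAML string is built with two-space indentation, \n line endings, and no trailing newline, so the YAML byte count it reports depends on those choices and need not match the figure quoted above.

```java
import java.nio.charset.StandardCharsets;

final class SizeComparison {

    // Builds compact JSON with `depth` nested "child" keys; the innermost value is null.
    static String compactJson(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("{\"child\":");
        }
        sb.append("null");
        for (int i = 0; i < depth; i++) {
            sb.append('}');
        }
        return sb.toString();
    }

    // Builds the block-style YAML equivalent, indenting each level by `indent` spaces.
    static String blockYaml(int depth, int indent) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            if (i > 0) {
                sb.append('\n');
            }
            sb.append(" ".repeat(i * indent)).append("child:");
        }
        return sb.toString();
    }

    // Size of the text when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("JSON bytes: " + utf8Length(compactJson(7)));
        System.out.println("YAML bytes: " + utf8Length(blockYaml(7, 2)));
    }
}
```

For seven levels, the compact JSON comes to 74 bytes. The YAML string is larger under any indentation width of at least one space, since the indentation grows with every additional level.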
4. YAML Features
Besides the basic features that JSON provides, YAML comes with additional functionality as we'll see next.
4.1. Comments
YAML allows comments, introduced by #, a feature that JSON lacks and that is often missed when working with JSON files:
# This is a simple comment
name: John
4.2. Multi-Line Strings
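Another feature missing in JSON but present in YAML is multi-line strings. In JSON, line breaks inside a string must be written as \n escapes, whereas YAML's literal block style (|) preserves them as written: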
website: |
  line1
  line2
  line3
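For comparison, the following sketch shows what the same value looks like once it has to be written as a JSON string. It assumes the YAML parser's default handling of a literal block, which keeps the line breaks and a single trailing newline. The MultiLineJson class and its toJsonString helper are hypothetical names used only for illustration.

```java
final class MultiLineJson {

    // Wraps a value in quotes and escapes the characters JSON does not allow raw inside a string.
    static String toJsonString(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // The value of "website" in the YAML snippet above.
        String website = "line1\nline2\nline3\n";
        // prints {"website":"line1\nline2\nline3\n"}
        System.out.println("{\"website\":" + toJsonString(website) + "}");
    }
}
```

The three readable lines collapse into one string with embedded \n escapes, which is why long text blocks are awkward to maintain by hand in JSON.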
4.3. Aliases and Anchors
We can mark an item with an anchor using & and then refer to it elsewhere through an alias using *:
httpPort: 80
httpsPort: &httpsPort 443
defaultPort: *httpsPort
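When this document is loaded, defaultPort ends up with the same value as httpsPort. The sketch below mimics that substitution for flat "key: value" lines only. It is a toy written for illustration, not a YAML parser: the AnchorSketch class and its resolve method are hypothetical, values are kept as strings, and nested structures, quoting, and malformed lines are not handled.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class AnchorSketch {

    // Resolves &anchor definitions and *alias references on flat "key: value" lines.
    static Map<String, String> resolve(String doc) {
        Map<String, String> anchors = new HashMap<>();
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : doc.split("\n")) {
            if (line.isBlank()) {
                continue;
            }
            String[] kv = line.split(":\\s*", 2);
            String key = kv[0].trim();
            String value = kv[1].trim();
            if (value.startsWith("&")) {
                // "&name value": remember the value under the anchor name.
                String[] parts = value.split("\\s+", 2);
                anchors.put(parts[0].substring(1), parts[1]);
                value = parts[1];
            } else if (value.startsWith("*")) {
                // "*name": substitute the value recorded for that anchor.
                value = anchors.get(value.substring(1));
            }
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        String doc = "httpPort: 80\nhttpsPort: &httpsPort 443\ndefaultPort: *httpsPort";
        // prints {httpPort=80, httpsPort=443, defaultPort=443}
        System.out.println(resolve(doc));
    }
}
```

JSON has no equivalent mechanism, so a value shared between several keys has to be repeated in each place.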
5. Performance
Because the JSON specification is so much simpler, parsing and serializing JSON is typically much faster than doing the same with YAML.
We're going to implement a simple benchmark to compare the parsing speed of YAML and JSON using JMH.
For the YAML benchmark, we're going to use the well-known SnakeYAML library, and for the JSON benchmark, we'll use org.json:
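import java.util.concurrent.TimeUnit;
import org.json.JSONObject;
import org.openjdk.jmh.annotations.*;
import org.yaml.snakeyaml.Yaml;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@Measurement(batchSize = 10_000, iterations = 5)
@Warmup(batchSize = 10_000, iterations = 5)
@State(Scope.Thread)
public class Bench {

    // JMH requires the benchmark class, its state classes, and its benchmark methods to be public.
    public static void main(String[] args) throws Exception {
        org.openjdk.jmh.Main.main(args);
    }

    // One Yaml instance per benchmark thread, since a Yaml instance is not thread-safe.
    @State(Scope.Thread)
    public static class YamlState {
        public Yaml yaml = new Yaml();
    }

    @Benchmark
    public Object benchmarkYaml(YamlState yamlState) {
        return yamlState.yaml.load("foo: bar");
    }

    @Benchmark
    public Object benchmarkJson() {
        // Returning the result lets JMH consume it, so no explicit Blackhole is needed.
        return new JSONObject("{\"foo\": \"bar\"}");
    }
}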
As we might have expected, JSON is the winner, with roughly 30 times the throughput of YAML in this run:
Benchmark             Mode  Cnt    Score   Error  Units
Main2.benchmarkJson  thrpt   50  644.085 ± 9.962  ops/s
Main2.benchmarkYaml  thrpt   50   20.351 ± 0.312  ops/s
6. Library Availability
JSON grew out of JavaScript, the standard language of the web, and is itself standardized (ECMA-404 and RFC 8259), so it's almost impossible to find a language that doesn't fully support it.
On the other hand, YAML is widely supported, but it's not a formal standard in the way JSON is. Libraries exist for most popular programming languages, but because the specification is large and complex, they might not implement all of it.
7. What Should I Choose?
This might be a difficult question to answer and a subjective one in many cases.
If we need to expose a set of REST APIs to other front-end or back-end applications, we should probably go with JSON as it's the de facto industry standard.
If we need to create a configuration file that will often be read/updated by humans, YAML might be a good option.
Of course, there might also be use cases where both YAML and JSON would be a good fit, and it will be just a matter of taste.
8. Conclusion
In this quick article, we've learned the main differences between YAML and JSON and what aspects to consider to make an informed decision as to which one we should choose.