1. Overview
JSON is a widely used structured data format found in most modern APIs and data services. It's particularly popular in web applications due to its lightweight nature and compatibility with JavaScript.
Unfortunately, shells such as Bash can't interpret and work with JSON directly. This means that working with JSON on the command line can be cumbersome, often involving text manipulation with a combination of tools such as sed and grep.
In this tutorial, we'll take a look at how we can alleviate this awkwardness using jq – an eloquent command-line processor for JSON.
2. Installation
Let's begin by installing jq, which is available in most operating system package repositories. It's also possible to download the binary directly or build it from source.
Once we've installed the package, let's verify the installation by simply running jq:
$ jq
jq - commandline JSON processor [version 1.6]

Usage:  jq [options] <jq filter> [file...]
        jq [options] --args <jq filter> [strings...]
        jq [options] --jsonargs <jq filter> [JSON_TEXTS...]
...
If the installation was successful, we'll see the version, some usage examples and other information displayed in the console.
3. Working with Simple Filters
jq is built around the concept of filters that work over a stream of JSON. Each filter takes an input and emits JSON to standard output. As we're going to see, there are many predefined filters that we can use. We can also combine these filters using pipes to quickly construct and apply complex operations and transformations to our JSON data.
3.1. Prettify JSON
Let's start with the simplest filter of all, which is also one of the most useful and frequently used features of jq:
echo '{"fruit":{"name":"apple","color":"green","price":1.20}}' | jq '.'
In this example, we echo a simple JSON string and pipe it directly into our jq command. Then, we use the identity filter '.', which takes the input and produces it unchanged as output, with the caveat that by default jq pretty-prints all output.
This gives us the output:
{
  "fruit": {
    "name": "apple",
    "color": "green",
    "price": 1.2
  }
}
We can also apply this filter directly to a JSON file:
jq '.' fruit.json
Being able to prettify JSON is particularly useful when we want to retrieve data from an API and see the response in a clear, readable format.
Let's hit a simple API using curl to see this in practice:
curl http://api.open-notify.org/iss-now.json | jq '.'
This gives us a JSON response for the current position of the International Space Station:
{
  "message": "success",
  "timestamp": 1572386230,
  "iss_position": {
    "longitude": "-35.4232",
    "latitude": "-51.3109"
  }
}
3.2. Accessing Properties
We can access property values by using another simple filter: the .field operator. To retrieve a property value, we write a dot followed by the property name.
Let's see this by building on our simple fruit example:
jq '.fruit' fruit.json
Here we are accessing the fruit property which gives us all the children of this key:
{
  "name": "apple",
  "color": "green",
  "price": 1.2
}
We can also chain property values together which allows us to access nested objects:
jq '.fruit.color' fruit.json
As expected, this simply returns the color of our fruit:
"green"
If we need to retrieve multiple keys we can separate them using a comma:
jq '.fruit.color,.fruit.price' fruit.json
This results in an output containing both property values:
"green"
1.2
An important point to note is that if one of the properties has spaces or special characters, then we need to wrap the property name in quotes when accessing it from the jq command:
echo '{ "with space": "hello" }' | jq '."with space"'
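As a small sketch of the same idea, jq also accepts a bracket form for property names that need quoting. The sample document and the variable names below are illustrative assumptions, not part of the original example:

```shell
# Hypothetical sample document: one key contains a space, one does not.
doc='{ "with space": "hello", "plain": "world" }'

# Quoted form, as used in the example above.
quoted=$(echo "$doc" | jq '."with space"')

# Bracket form, which should select the same property.
bracket=$(echo "$doc" | jq '.["with space"]')

echo "$quoted"
echo "$bracket"
```

Both commands should print the same JSON string, so which form we use is largely a matter of taste.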
4. JSON Arrays
Let's now take a look at how we can work with arrays in JSON data. We typically use arrays to represent a list of items. And, as in many programming languages, we use square brackets to denote the start and end of an array.
4.1. Iteration
We'll start with a really basic example to demonstrate how to iterate over an array:
echo '["x","y","z"]' | jq '.[]'
Here we see the array/object value iterator .[] in use, which prints each item in the array on a separate line:
"x"
"y"
"z"
Now let's imagine we want to represent a list of fruit in a JSON document:
[
  {
    "name": "apple",
    "color": "green",
    "price": 1.2
  },
  {
    "name": "banana",
    "color": "yellow",
    "price": 0.5
  },
  {
    "name": "kiwi",
    "color": "green",
    "price": 1.25
  }
]
In this example, each item in the array is an object which represents a fruit.
Let's take a look at how we can extract the name of each fruit from each object in the array:
jq '.[] | .name' fruits.json
First, we need to iterate over the array using .[]. Then we can pass each object in the array to the next filter in the command using a pipe |. The last step is to output the name field from each object using .name:
"apple"
"banana"
"kiwi"
We can also use a slightly more concise version and access the property directly on each object in the array:
jq '.[].name' fruits.json
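When we feed these names into other shell tools, the surrounding quotes often get in the way. The sketch below uses jq's -r (--raw-output) option to drop them. The inline data is assumed to mirror fruits.json, and the variable names are only for illustration:

```shell
# Sample data assumed to match the fruits.json document shown above.
fruits='[{"name":"apple","color":"green","price":1.2},{"name":"banana","color":"yellow","price":0.5},{"name":"kiwi","color":"green","price":1.25}]'

# Default output: each name is printed as a JSON string, quotes included.
quoted_names=$(echo "$fruits" | jq '.[].name')

# With -r, string results are printed without the surrounding quotes.
raw_names=$(echo "$fruits" | jq -r '.[].name')

echo "$quoted_names"
echo "$raw_names"
```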
4.2. Accessing By Index
Of course, as with arrays in most languages, we can access an item directly by passing its zero-based index:
jq '.[1].price' fruits.json
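To make the indexing behavior concrete, here is a minimal sketch with inline data assumed to mirror fruits.json. It also tries a negative index and an index past the end of the array. The variable names are hypothetical:

```shell
# Sample data assumed to match the fruits.json document shown above.
fruits='[{"name":"apple","color":"green","price":1.2},{"name":"banana","color":"yellow","price":0.5},{"name":"kiwi","color":"green","price":1.25}]'

# Index 1 is the second element, because indexes start at 0.
second_price=$(echo "$fruits" | jq '.[1].price')

# A negative index counts from the end, so -1 is the last element.
last_name=$(echo "$fruits" | jq '.[-1].name')

# An index beyond the end of the array yields null rather than an error.
missing=$(echo "$fruits" | jq '.[10]')

echo "$second_price"   # 0.5
echo "$last_name"      # "kiwi"
echo "$missing"        # null
```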
4.3. Slicing
Finally, jq also supports slicing of arrays, another powerful feature. This is particularly useful when we need to return a subarray of an array.
Again, let's see this using a simple array of numbers:
echo '[1,2,3,4,5,6,7,8,9,10]' | jq '.[6:9]'
In this example the result will be a new array with a length of 3, containing the elements from index 6 (inclusive) to index 9 (exclusive):
[
  7,
  8,
  9
]
It's also possible to omit one of the indexes when using the slicing functionality:
echo '[1,2,3,4,5,6,7,8,9,10]' | jq '.[:6]' | jq '.[-2:]'
In this example, since we specified only the second argument in .[:6], the slice starts from the beginning of the array and runs up to, but not including, index 6. It's the same as doing .[0:6].
The second slicing operation uses a negative index, which counts backward from the end of the array.
Note the subtle difference in the second slice: here we pass the index as the first argument. This means we start two elements from the end (-2) and, as the second argument is empty, run until the end of the array.
This gives us the output:
[
  5,
  6
]
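It can help to look at the intermediate result of the first slice before the second one is applied. The following sketch does this in two steps, using the -c (--compact-output) option so that each array fits on one line. The variable names are illustrative:

```shell
numbers='[1,2,3,4,5,6,7,8,9,10]'

# Step 1: everything before index 6, that is, the first six elements.
first_six=$(echo "$numbers" | jq -c '.[:6]')

# Step 2: the last two elements of that intermediate array.
# Both slices are chained inside a single jq program here.
last_two=$(echo "$numbers" | jq -c '.[:6] | .[-2:]')

echo "$first_six"   # [1,2,3,4,5,6]
echo "$last_two"    # [5,6]
```

Chaining the two slices inside one jq program gives the same result as piping between two separate jq processes.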
5. Using Functions
jq has many powerful built-in functions that we can use to perform a variety of useful operations. In this section, we're going to take a look at some of them.
5.1. Getting Keys
Sometimes we may want to get the keys of an object as an array as opposed to the values. We can do this using the keys function:
jq '.fruit | keys' fruit.json
This gives us the keys sorted alphabetically:
[
  "color",
  "name",
  "price"
]
5.2. Returning the Length
Another handy function for arrays and objects is the length function. We can use this function to return the array’s length or the number of properties on an object:
jq '.fruit | length' fruit.json
We can even use the length function on string values as well:
jq '.fruit.name | length' fruit.json
In the first example, we get "3", as the fruit object has three properties. In the second example, we see "5" as the output, as the name property has five characters: "apple".
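The two commands above can be tried without a file by passing the document inline. This sketch assumes data equivalent to fruit.json and adds an array case for comparison. The variable names are hypothetical:

```shell
# Sample data assumed to match the fruit.json document used above.
fruit='{"fruit":{"name":"apple","color":"green","price":1.2}}'

# Number of properties on the fruit object.
prop_count=$(echo "$fruit" | jq '.fruit | length')

# Number of characters in the name string.
name_length=$(echo "$fruit" | jq '.fruit.name | length')

# Number of elements in an array.
array_length=$(echo '["x","y","z"]' | jq 'length')

echo "$prop_count"     # 3
echo "$name_length"    # 5
echo "$array_length"   # 3
```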
5.3. Mapping Values
The map function is a powerful function we can use to apply a filter or function to each element of an array:
jq 'map(has("name"))' fruits.json
In this example, we're applying the has function to each item in the array and looking to see if there is a name property. In our simple fruits JSON, we get true in each result item.
We can also use the map function to apply operations to the elements in an array. Let's imagine we want to increase the price of each fruit:
jq 'map(.price+2)' fruits.json
This gives us a new array with each price incremented:
[
  3.2,
  2.5,
  3.25
]
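One way to think about map is as shorthand for iterating over the array, applying a filter to each item, and collecting the results into a new array. The sketch below compares the two spellings on inline data assumed to mirror fruits.json. The variable names are illustrative:

```shell
# Sample data assumed to match the fruits.json document shown above.
fruits='[{"name":"apple","color":"green","price":1.2},{"name":"banana","color":"yellow","price":0.5},{"name":"kiwi","color":"green","price":1.25}]'

# Using map to apply the .name filter to every element.
with_map=$(echo "$fruits" | jq -c 'map(.name)')

# The longhand form: iterate, apply the filter, and wrap the results in [].
with_iteration=$(echo "$fruits" | jq -c '[.[] | .name]')

echo "$with_map"         # ["apple","banana","kiwi"]
echo "$with_iteration"   # ["apple","banana","kiwi"]
```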
5.4. Min and Max
If we need to find the minimum or maximum element of an input array, we can utilize the min and max functions:
jq '[.[].price] | min' fruits.json
Likewise, we can also find the most expensive fruit in our JSON document:
jq '[.[].price] | max' fruits.json
Note that in these two examples, we've constructed a new array by wrapping the array iteration in []. This new array contains only the prices, and it's this list that we pass to the min or max function.
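The min and max functions return only the price itself. If we'd rather know which fruit that price belongs to, the related min_by and max_by built-ins compare whole objects by a given field. This is a sketch on inline data assumed to mirror fruits.json, with hypothetical variable names:

```shell
# Sample data assumed to match the fruits.json document shown above.
fruits='[{"name":"apple","color":"green","price":1.2},{"name":"banana","color":"yellow","price":0.5},{"name":"kiwi","color":"green","price":1.25}]'

# Pick the object with the lowest price, then print its name without quotes.
cheapest=$(echo "$fruits" | jq -r 'min_by(.price) | .name')

# Pick the object with the highest price in the same way.
priciest=$(echo "$fruits" | jq -r 'max_by(.price) | .name')

echo "$cheapest"   # banana
echo "$priciest"   # kiwi
```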
5.5. Selecting Values
The select function is another impressive utility that we can use for querying JSON. We can think of it as a bit like a simple version of XPath for JSON:
jq '.[] | select(.price>0.5)' fruits.json
This selects all the fruit with a price greater than 0.5. Likewise, we can also make selections based on the value of a property:
jq '.[] | select(.color=="yellow")' fruits.json
We can even combine conditions to build up complex selections:
jq '.[] | select(.color=="yellow" and .price>=0.5)' fruits.json
This will give us all yellow fruit matching a given price condition:
{
  "name": "banana",
  "color": "yellow",
  "price": 0.5
}
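The select examples above emit each matching object as a separate JSON value. If we need the matches as a single array, for example to count them, we can wrap the expression in [] or use map. A minimal sketch, assuming inline data equivalent to fruits.json and illustrative variable names:

```shell
# Sample data assumed to match the fruits.json document shown above.
fruits='[{"name":"apple","color":"green","price":1.2},{"name":"banana","color":"yellow","price":0.5},{"name":"kiwi","color":"green","price":1.25}]'

# Collect the names of all green fruit into one array.
green_names=$(echo "$fruits" | jq -c '[.[] | select(.color=="green") | .name]')

# map(select(...)) keeps the matching objects in an array, which we can count.
green_count=$(echo "$fruits" | jq 'map(select(.color=="green")) | length')

echo "$green_names"   # ["apple","kiwi"]
echo "$green_count"   # 2
```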
5.6. Support For Regular Expressions
Next up, we're going to look at the test function, which enables us to check whether an input matches a given regular expression:
jq '.[] | select(.name|test("^a.")) | .price' fruits.json
Simply put, here we want to output the price of all the fruit whose name starts with the letter "a". Note that the trailing dot in the pattern also requires at least one more character after the "a".
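To see how the pattern behaves at the edges, the following sketch runs test against a few standalone strings. It also passes "i" as a second argument, which jq treats as a case-insensitive flag. The inputs and variable names are made up for illustration, and the sketch assumes a jq build with regular expression support:

```shell
# "^a." needs an "a" followed by at least one more character.
needs_two=$(echo '"a"' | jq 'test("^a.")')

# "^a" only requires the string to start with "a".
prefix_only=$(echo '"a"' | jq 'test("^a")')

# The "i" flag makes the match case-insensitive.
ignore_case=$(echo '"Apple"' | jq 'test("^a"; "i")')

echo "$needs_two"     # false
echo "$prefix_only"   # true
echo "$ignore_case"   # true
```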
5.7. Finding Unique Values
One common use case is finding the unique occurrences of a particular value within an array, or removing duplicates.
Let's see which unique colors we have in our fruits JSON document:
jq 'map(.color) | unique' fruits.json
In this example, we use the map function to create a new array containing only colors. Then we pass this new array as a whole to the unique function using a pipe |.
This gives us an array with two distinct fruit colors:
[
  "green",
  "yellow"
]
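If we only want the number of distinct colors rather than the values themselves, we can pipe the result of unique into length. A short sketch on inline data assumed to mirror fruits.json, with a hypothetical variable name:

```shell
# Sample data assumed to match the fruits.json document shown above.
fruits='[{"name":"apple","color":"green","price":1.2},{"name":"banana","color":"yellow","price":0.5},{"name":"kiwi","color":"green","price":1.25}]'

# Extract the colors, remove duplicates, then count what is left.
color_count=$(echo "$fruits" | jq 'map(.color) | unique | length')

echo "$color_count"   # 2
```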
5.8. Deleting Keys From JSON
Sometimes we might also want to remove a key and corresponding value from JSON objects. For this, jq provides the del function:
jq 'del(.fruit.name)' fruit.json
This outputs the fruit object without the deleted key:
{
  "fruit": {
    "color": "green",
    "price": 1.2
  }
}
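The del function also accepts several comma-separated paths, and it can be combined with map to strip keys from every object in an array. The sketch below assumes inline data equivalent to fruit.json and fruits.json. The variable names are illustrative:

```shell
# Sample data assumed to match the fruit.json and fruits.json documents above.
fruit='{"fruit":{"name":"apple","color":"green","price":1.2}}'
fruits='[{"name":"apple","color":"green","price":1.2},{"name":"banana","color":"yellow","price":0.5},{"name":"kiwi","color":"green","price":1.25}]'

# Remove two keys from a single object in one call.
name_only=$(echo "$fruit" | jq -c 'del(.fruit.color, .fruit.price)')

# Remove the same keys from every object in an array.
names_only=$(echo "$fruits" | jq -c 'map(del(.color, .price))')

echo "$name_only"    # {"fruit":{"name":"apple"}}
echo "$names_only"   # [{"name":"apple"},{"name":"banana"},{"name":"kiwi"}]
```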
6. Transforming JSON
Frequently when working with data structures such as JSON, we might want to transform one data structure into another. This can be useful when working with large JSON structures where we're interested in only a few properties or values.
In this example, we'll use some JSON from Wikipedia which describes a list of page entries:
{
  "query": {
    "pages": [
      {
        "21721040": {
          "pageid": 21721040,
          "ns": 0,
          "title": "Stack Overflow",
          "extract": "Some interesting text about Stack Overflow"
        }
      },
      {
        "21721041": {
          "pageid": 21721041,
          "ns": 0,
          "title": "Baeldung",
          "extract": "A great place to learn about Java"
        }
      }
    ]
  }
}
For our purposes, we're only really interested in the title and extract of each page entry. So let's take a look at how we can transform this document:
jq '.query.pages | [.[] | map(.) | .[] | {page_title: .title, page_description: .extract}]' wikipedia.json
Let's take a look at the command in more detail to understand it properly:
- First, we begin by accessing the pages array and passing that array into the next filter in the command via a pipe.
- Then we iterate over this array and pass each object inside the pages array to the map function, where we simply create a new array holding the values of each object.
- Next, we iterate over this array and, for each item, create an object containing two keys: page_title and page_description.
- The .title and .extract references are used to populate the two new keys.
This gives us a nice new lean JSON structure:
[
  {
    "page_title": "Stack Overflow",
    "page_description": "Some interesting text about Stack Overflow"
  },
  {
    "page_title": "Baeldung",
    "page_description": "A great place to learn about Java"
  }
]
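The steps above can also be written a little more compactly. Since each entry in pages is an object with a single key, iterating over that object with .[] yields the page data directly, so the intermediate map(.) step isn't strictly needed. The sketch below is one possible alternative, not the article's command. It assumes inline data equivalent to wikipedia.json, and the variable names are hypothetical:

```shell
# Sample data assumed to match the wikipedia.json document shown above.
wikipedia='{"query":{"pages":[{"21721040":{"pageid":21721040,"ns":0,"title":"Stack Overflow","extract":"Some interesting text about Stack Overflow"}},{"21721041":{"pageid":21721041,"ns":0,"title":"Baeldung","extract":"A great place to learn about Java"}}]}}'

# The command from the article, in compact output form.
original=$(echo "$wikipedia" | jq -c '.query.pages | [.[] | map(.) | .[] | {page_title: .title, page_description: .extract}]')

# A shorter variant: map over pages, unwrap each single-key object with .[],
# and build the new object from its title and extract.
shorter=$(echo "$wikipedia" | jq -c '.query.pages | map(.[] | {page_title: .title, page_description: .extract})')

# Print just the titles from the transformed document.
titles=$(echo "$shorter" | jq -r '.[].page_title')

echo "$shorter"
echo "$titles"
```

Both variants should produce the same array, so the choice between them comes down to readability.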
7. Conclusion
In this tutorial, we've covered some of the basic capabilities that jq provides for processing and manipulating JSON via the command line.
First, we looked at some of the essential filters jq offers and saw how they can be used as the building blocks for more complex operations.
Later, we saw how to use a number of built-in functions that come bundled with jq. Then, we concluded with a complex example showing how we could transform one JSON document into another.
Of course, be sure to check out the excellent cookbook for more interesting examples and, as always, the full source code of the article is available over on GitHub.