How to Use jq Command to Process JSON in Linux
jq is a powerful and highly flexible parser that can stream and filter JSON data out of files and UNIX pipes. This article will teach you the basics of jq, present code examples, and introduce some alternative implementations that you can install today.
What Is jq Used For?
The most common use for jq is processing and manipulating JSON responses from Software-as-a-Service (SaaS) APIs. For instance, you can use jq along with cURL to tap into DigitalOcean’s API endpoints to get your account details.
Aside from that, jq is also a powerful utility for managing large JSON files. Some of the most popular database programs today, such as MongoDB, PostgreSQL, and MySQL, support JSON as a way to store data. As such, learning jq gives you an edge in understanding how those database systems work.
Good to know: learn some of the best tools to edit JSON files inside Chrome.
Installing and Using jq
To start with jq, install its binary package on your system (the command below is for Debian and Ubuntu):
sudo apt install jq
Find an open API endpoint that you can test jq on. In my case, I’m going to use ipinfo.io’s IP checker API.
The most basic filter for jq is the dot (.) filter. This will pretty-print the JSON response exactly as jq received it on its standard input:
curl https://ipinfo.io/ | jq '.'
Another basic filter is the pipe (|) symbol. This is a special filter that passes the output of one filter as the input of another:
curl https://ipinfo.io/ | jq '. | .ip'
The value after the pipe operator is an “Object Identifier-Index.” This looks up the key with the matching name in your JSON input and prints its value on the terminal. In this case, I’m looking for the value of the “ip” key.
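Object identifiers can also be chained with dots to reach nested keys. Here is a quick standalone sketch using made-up JSON, so it runs without any network call:

```shell
# Chain identifiers to look up a key inside a nested object.
echo '{"user": {"name": "Ramces", "id": 1}}' | jq '.user.name'
# prints "Ramces"
```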
With the basics done and dusted, the following sections will show you some of the tricks that you can do using jq.
Good to know: learn how you can manipulate text streams using sed in Linux.
1. Creating a Basic Feed Reader with jq
Most modern websites today offer open API endpoints for reading data on their platforms. For example, every GitHub repository has its own API URL from which you can retrieve its latest commits and issues.
You can use an API endpoint like this with jq to create your own simple “RSS-like” feed. To start, use cURL to test if the endpoint is working properly:
curl https://api.github.com/repos/bitcoin/bitcoin/issues
Run the following to print the first entry in your feed:
curl https://api.github.com/repos/bitcoin/bitcoin/issues | jq '.[0]'
This will show the different fields that the GitHub API sends to jq. You can use these to create your own custom JSON object by piping the input to the curly braces ({}) filter:
curl https://api.github.com/repos/bitcoin/bitcoin/issues | jq '.[0] | { title: .title }'
Adding the comma (,) filter inside the curly braces allows you to add multiple fields to your custom object:
curl https://api.github.com/repos/bitcoin/bitcoin/issues | jq '.[0] | {title: .title, url: .html_url, author: .user.login}'
Removing the “0” inside the square brackets will apply your jq filter to the entire feed:
curl https://api.github.com/repos/bitcoin/bitcoin/issues | jq '.[] | {title: .title, url: .html_url, author: .user.login}'
You can also create a small Bash script to display the latest issues from your favorite GitHub project. Paste the following block of code inside an empty shell script file:
#!/bin/bash

# usage: ./script.sh [0 ... 29]

REPO="https://api.github.com/repos/bitcoin/bitcoin/issues"

curl "$REPO" | jq ".[$1] | {title: .title, url: .html_url, author: .user.login}"
Save your file, then run the following command to make it executable:
chmod u+x ./script.sh
Test your new feed reader by listing the latest issue in your favorite GitHub repo:
./script.sh 0
2. Reading and Searching through a JSON Database
Aside from reading data off of APIs, you can also use jq to manage JSON files in your local machine. Start by creating a simple JSON database file using your favorite text editor:
nano ./database.json
Paste the following block of data inside your file, then save it:
[
  {"id": 1, "name": "Ramces", "balance": 20},
  {"id": 2, "name": "Alice", "balance": 30},
  {"id": 3, "name": "Bob", "balance": 10},
  {"id": 4, "name": "Charlie", "balance": 20},
  {"id": 5, "name": "Maria", "balance": 50}
]
Test whether jq reads your JSON file properly by printing the first object in your database array:
jq '.[0]' database.json
Make a query on your JSON database using the “Object Identifier-Index” filter. In my case, I am searching for the value of the “.name” key on every entry in my database:
jq '.[] | .name' database.json
You can also use some of jq’s built-in functions to filter your queries based on certain qualities. For example, you can search for and print all the JSON objects that have a “.name” value with more than six characters:
jq '.[] | select((.name|length)>6)' database.json
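The select function also works with numeric comparisons. Here is a small sketch using inline data in the same shape as database.json, so it runs on its own:

```shell
# Keep only the objects whose balance exceeds 20 (-c prints compact output).
echo '[{"name": "Alice", "balance": 30}, {"name": "Bob", "balance": 10}]' | jq -c '.[] | select(.balance > 20)'
# prints {"name":"Alice","balance":30}
```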
Operating on JSON Databases with jq
In addition, jq can operate on JSON databases much like a basic spreadsheet. For instance, the following command prints the sum of the “.balance” keys for every object in the database:
jq '[.[] | .balance] | add' database.json
You can even extend this by adding a conditional statement to your query. The following only adds the “.balance” values if the “.name” value of the second object is “Alice”:
jq 'if .[1].name == "Alice" then [ .[] | .balance ] | add else "Second name is not Alice" end' database.json
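The same pattern extends to other arithmetic. For example, dividing the sum by the number of entries gives the average balance, sketched here with inline data matching the shape of database.json:

```shell
# Collect every balance into an array, then divide the sum by the count.
echo '[{"balance": 20}, {"balance": 30}, {"balance": 10}]' | jq '[.[] | .balance] | add / length'
# prints 20
```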
It’s possible to temporarily remove variables from your JSON database. This can be useful if you’re testing your filter and you want to make sure that it can still process your dataset:
jq 'del(.[1].name) | .[]' database.json
You can also insert new variables to your database using the “+” operator. For example, the following line adds the variable “active: true” to the first object in the database:
jq '.[0] + {active: true}' database.json
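jq can also update an existing value in place with the |= operator, which feeds the current value into the filter on its right. A minimal sketch with inline data:

```shell
# Add 5 to the first object's balance without touching the other entries.
echo '[{"id": 1, "balance": 20}, {"id": 2, "balance": 30}]' | jq -c '.[0].balance |= . + 5'
# prints [{"id":1,"balance":25},{"id":2,"balance":30}]
```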
Note: Redirecting the output of your jq command straight back into the same file (jq '.[0] + {active: true}' database.json > database.json) will not work, because the shell truncates the file before jq reads it. To make your changes permanent, write to a temporary file first, then replace the original: jq '.[0] + {active: true}' database.json > tmp.json && mv tmp.json database.json
3. Transforming Non-JSON Data in jq
Another brilliant feature of jq is that it can accept and work with non-JSON data. To achieve that, the program uses an alternative “slurp mode” where it converts any space and newline delimited data into a JSON array.
You can enable this feature by piping data into jq with the -s flag:
echo '1 2' | jq -s .
One advantage of converting your raw data into an array is that you can address them using array index numbers. The following command adds two values by referring to their converted array location:
echo '1 2' | jq -s '.[0] + .[1]'
You can take this array location further and construct new JSON code around it. For instance, this code converts the text from the echo command to a JSON object through the curly braces filter:
echo '6 "Mallory" 10' | jq -s '{"id": .[0], "name": .[1], "balance": .[2]}'
Apart from taking in raw data, jq can also return non-JSON data as its output. This is useful if you’re using jq as part of a larger shell script and you only need the result from its filters.
To do that, run jq with the -r flag. For example, the following command reads all the names from my database file and returns them as plain text:
jq -r '.[] | .name' database.json
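Combined with -r, jq’s built-in @csv format string can turn a query into spreadsheet-ready output. A sketch with inline data in the same shape as database.json:

```shell
# Build an array per object, then render each one as a CSV row.
echo '[{"name": "Alice", "balance": 30}, {"name": "Bob", "balance": 10}]' | jq -r '.[] | [.name, .balance] | @csv'
# prints:
# "Alice",30
# "Bob",10
```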
Alternative JSON Parsers to jq
Since the code for jq is open source, various developers have created their own versions of the JSON parser. Each of these has its own unique selling point that either improves on or changes a core part of jq.
1. jaq
Jaq is a powerful JSON parser that provides a near-identical feature set to jq.
Written in Rust, jaq’s biggest selling point is that it can run the jq language up to 30 times faster than the original parser while still retaining backward compatibility. This alone makes it valuable when you’re running large jq filters and want to get the most performance out of your machine.
That said, one downside of jaq is that it’s not currently available in the Debian, Ubuntu, or Fedora repositories. The only way to obtain it is to either install it through Homebrew or compile it from source.
2. gojq
Gojq is an alternative JSON parser that’s written entirely in Go. It provides an accessible and easy-to-use version of jq that you can install on almost any platform.
The original jq program can be incredibly terse in its error messages. As a result, debugging jq scripts is especially hard for a new jq user. Gojq solves this issue by showing you where the mistake is in your script as well as providing detailed messages on the kind of error that happened.
Another selling point of gojq is that it can read and process both JSON and YAML files. This can be especially helpful if you’re a Docker and Docker Compose user and you want to automate your deployment workflow.
Gojq’s biggest issue is that it removes some of the features that come by default with the original jq parser. For instance, options such as --ascii-output, --seq, and --sort-keys don’t exist in gojq.
FYI: learn how to improve your JSON code by using some of the best JSON beautifiers today.
3. fq
Unlike jaq and gojq, fq is a comprehensive software toolkit that can parse both text and binary data. It can work with a variety of popular formats such as JSON, YAML, HTML, and even FLAC.
The biggest feature of fq is that it contains a built-in hex reader for files. This makes it trivial to look at a file’s internal structure to determine how it’s made and whether there’s anything wrong with it. Aside from that, fq also uses the same syntax as jq when dealing with text, which makes it easy to learn for anyone already familiar with jq.
One downside of this ambitious goal is that fq is still under heavy development. As such, some of the program’s features and behaviors are still subject to sweeping changes.
Exploring jq, how it works, and what makes it special is just the first step in learning how to make programs on your computer. Take a deep dive into the wonderful world of coding by reading the basics of shell programming.