# jq & Data Processing

## Overview

jq is often described as "sed for JSON": a lightweight command-line processor for JSON data. Combined with curl, it is a standard way to interact with REST APIs from the terminal. yq extends the same idea to YAML.

## Installation

| Platform | jq | yq |
|----------|----|----|
| Windows | `choco install jq` / `winget install jqlang.jq` | `choco install yq` |
| macOS | `brew install jq` | `brew install yq` |
| Linux | `apt install jq` | `snap install yq` |

## jq Basics

### Identity (pretty-print)

```bash
echo '{"name":"Alice","age":30}' | jq '.'
```

### Field access

```bash
echo '{"name":"Alice","address":{"city":"NYC"}}' | jq '.name'
# "Alice"
echo '{"name":"Alice","address":{"city":"NYC"}}' | jq '.address.city'
# "NYC"
```

### Array indexing

```bash
echo '[10,20,30,40,50]' | jq '.[0]'
# 10
echo '[10,20,30,40,50]' | jq '.[-1]'
# 50
echo '[10,20,30,40,50]' | jq '.[2:5]'
# [30,40,50]
```

### Array iteration

```bash
echo '[{"name":"Alice"},{"name":"Bob"}]' | jq '.[]'
```

### Pipe

```bash
curl -s https://api.example.com/data | jq '.users[] | .name'
```

### Select / filter

```bash
echo '[{"name":"Alice","age":35},{"name":"Bob","age":25}]' | jq '.[] | select(.age > 30)'
```

### Map

```bash
echo '[{"name":"Alice"},{"name":"Bob"}]' | jq '[.[] | .name]'
# ["Alice","Bob"]
# Equivalent shorthand: jq 'map(.name)'
```

### Keys, length, type

```bash
echo '{"a":1,"b":2}' | jq 'keys'
# ["a","b"]
echo '[1,2,3]' | jq 'length'
# 3
echo '"hello"' | jq 'type'
# "string"
```

### Construct objects

```bash
echo '{"first":"Alice","contact":{"email":"a@b.com"}}' | jq '{name: .first, email: .contact.email}'
```

### String interpolation

```bash
echo '{"name":"Alice","email":"a@b.com"}' | jq '"\(.name) - \(.email)"'
# "Alice - a@b.com"
```

## Common jq Recipes

### Extract specific fields

```bash
jq '.[] | {name, email}' users.json
```

### Filter by value

```bash
jq '[.[] | select(.status == "active")]' users.json
```

### Count items

```bash
jq '.users | length' data.json
```

### Sort by field

```bash
jq '.users | sort_by(.age)' data.json
```

### Group by field

```bash
jq '.users | group_by(.department)' data.json
```

### Flatten arrays

```bash
# Flattens one level of nesting
jq '[.[][]]' nested.json
```

### JSON to CSV

```bash
jq -r '.[] | [.name, .email, .age] | @csv' users.json
```

### Unique values

```bash
jq '[.[].category] | unique' items.json
```

### Merge objects

```bash
# Recursive merge; on conflicting keys the second file wins
jq -s '.[0] * .[1]' defaults.json overrides.json
```

### Conditional

```bash
echo '{"age":21}' | jq 'if .age > 18 then "adult" else "minor" end'
```

## yq Basics

### Read a field

```bash
yq '.metadata.name' deployment.yaml
```

### Update a field

```bash
yq -i '.spec.replicas = 3' deployment.yaml
```

### Convert YAML to JSON

```bash
yq -o json deployment.yaml
```

### Convert JSON to YAML

```bash
# -o yaml states the output format explicitly instead of relying on version-dependent defaults
yq -P -o yaml data.json
```

### Merge YAML files

```bash
yq eval-all '. as $item ireduce ({}; . * $item)' base.yaml override.yaml
```

## Combining with curl

```bash
curl -s https://api.github.com/repos/torvalds/linux | jq '{name: .name, stars: .stargazers_count, language: .language}'
```

Example output (the star count changes over time):

```json
{
  "name": "linux",
  "stars": 178000,
  "language": "C"
}
```

## Best Practices

- Use jq for any JSON manipulation in shell scripts; it is purpose-built and reliable
- Pipe `curl -s` to jq for API work; this is the standard pattern for CLI-based API interaction
- Use yq for editing Kubernetes manifests, Helm values, and CI config files in place
- Prefer jq over grep/awk for JSON; regex on JSON is fragile and error-prone
- Use `-r` for raw string output in scripts so downstream commands do not receive JSON-quoted strings
- Learn `select()` and `map()`; they handle the vast majority of filtering and transformation tasks
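
The last two practices can be sketched in a short script. This is a minimal illustration, not a definitive pattern: the `users` sample data and the variable names `active_names`, `emails`, and `quoted` are hypothetical, and the data is inlined so that the script runs without a network call. It assumes only that jq is on the `PATH`.

```bash
#!/bin/sh
set -eu

# Hypothetical sample data, inlined for illustration only.
users='[
  {"name":"Alice","email":"alice@example.com","status":"active","age":35},
  {"name":"Bob","email":"bob@example.com","status":"inactive","age":25},
  {"name":"Carol","email":"carol@example.com","status":"active","age":41}
]'

# select() keeps only matching elements; -r strips the JSON quotes so
# each name arrives as a plain line that other commands can consume.
active_names=$(printf '%s' "$users" | jq -r '.[] | select(.status == "active") | .name')

# map() transforms every element and returns a single array;
# -c prints that array on one line.
emails=$(printf '%s' "$users" | jq -c 'map(.email)')

# Without -r the string keeps its JSON quotes.
quoted=$(printf '%s' "$users" | jq '.[0].name')

printf '%s\n' "$active_names"
printf '%s\n' "$emails"
printf '%s\n' "$quoted"
```

The quoted form in the last line is what a downstream `grep` or filename substitution would receive if `-r` were left out, which is why raw output is usually the safer default in scripts.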