This article covers advanced JOLT transformations including cardinality, modify operations, wildcards, and chained specifications for complex JSON restructuring.
This continues from Part 3: JOLT Transformations Part 1.
Most of the JSON I deal with comes from APIs that weren’t designed with my database schema in mind. Nested objects three levels deep, arrays that sometimes contain one item and sometimes fifty, field names in camelCase when I need snake_case. The basic JOLT operations from Part 1 handle simple cases, but real-world API responses usually need multiple transformation passes. That’s what the advanced operations below are for.
Cardinality Operation
Cardinality controls whether values are arrays or single values. Use it to normalize inconsistent APIs that sometimes return arrays and sometimes single values.
ONE - Force Single Value
Input:
{
"tags": ["urgent"],
"category": ["electronics", "gadgets"],
"priority": "high"
}
Spec:
[
{
"operation": "cardinality",
"spec": {
"tags": "ONE",
"category": "ONE",
"priority": "ONE"
}
}
]
Output:
{
"tags": "urgent",
"category": "electronics",
"priority": "high"
}
MANY - Force Array
Input:
{
"singleItem": "value1",
"alreadyArray": ["a", "b"],
"nested": {
"item": "single"
}
}
Spec:
[
{
"operation": "cardinality",
"spec": {
"singleItem": "MANY",
"alreadyArray": "MANY",
"nested": {
"item": "MANY"
}
}
}
]
Output:
{
"singleItem": ["value1"],
"alreadyArray": ["a", "b"],
"nested": {
"item": ["single"]
}
}
Modify Operations
The modify operation transforms values using built-in functions. There are two variants:
modify-default-beta- Only modify if field doesn’t existmodify-overwrite-beta- Always modify
String Functions
Input:
{
"name": " john doe ",
"email": "[email protected]",
"code": "abc123"
}
Spec:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"name": "=trim(@(1,name))",
"email": "=toLower(@(1,email))",
"code": "=toUpper(@(1,code))",
"displayName": "=concat(@(1,name),' <',@(1,email),'>')"
}
}
]
Output:
{
"name": "john doe",
"email": "[email protected]",
"code": "ABC123",
"displayName": "john doe <[email protected]>"
}
Available String Functions
toLower(string)- Convert to lowercasetoUpper(string)- Convert to uppercasetrim(string)- Remove leading/trailing whitespaceconcat(str1, str2, ...)- Concatenate stringssubstring(string, start, end)- Extract substringsplit(string, delimiter)- Split into arrayjoin(delimiter, array)- Join array to string
Math Functions
Input:
{
"price": 19.99,
"quantity": 3,
"discount": 0.1,
"values": [10, 20, 30, 40]
}
Spec:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"subtotal": "=doubleSum(@(1,price),@(1,price),@(1,price))",
"total": "=divide(@(1,subtotal),=intSum(1,-@(1,discount)))",
"average": "=avg(@(1,values))",
"min": "=min(@(1,values))",
"max": "=max(@(1,values))",
"rounded": "=toInteger(@(1,price))"
}
}
]
Available Math Functions
intSum(n1, n2, ...)- Sum integersdoubleSum(n1, n2, ...)- Sum doublesdivide(dividend, divisor)- Divisionmultiply(n1, n2)- Multiplication (via divide with 1/n)min(array)- Minimum valuemax(array)- Maximum valueavg(array)- Averageabs(number)- Absolute valuetoInteger(value)- Convert to integertoDouble(value)- Convert to doubletoLong(value)- Convert to long
Type Conversion
Input:
{
"stringNum": "42",
"stringBool": "true",
"intValue": 100
}
Spec:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"asInteger": "=toInteger(@(1,stringNum))",
"asBoolean": "=toBoolean(@(1,stringBool))",
"asString": "=toString(@(1,intValue))"
}
}
]
Output:
{
"stringNum": "42",
"stringBool": "true",
"intValue": 100,
"asInteger": 42,
"asBoolean": true,
"asString": "100"
}
Reference Syntax
The @ symbol references values:
@(0,fieldName)- Current level field@(1,fieldName)- Parent level field@(2,fieldName)- Grandparent level field@- Current value being processed
Advanced Wildcards
Matching Multiple Patterns
Use | to match multiple keys:
Input:
{
"firstName": "John",
"lastName": "Doe",
"middleName": "William",
"age": 30
}
Spec:
[
{
"operation": "shift",
"spec": {
"firstName|lastName|middleName": "names.&",
"age": "demographics.age"
}
}
]
Output:
{
"names": {
"firstName": "John",
"lastName": "Doe",
"middleName": "William"
},
"demographics": {
"age": 30
}
}
Matching Array Elements
Input:
{
"items": [
{"id": 1, "name": "Apple", "type": "fruit"},
{"id": 2, "name": "Carrot", "type": "vegetable"},
{"id": 3, "name": "Banana", "type": "fruit"}
]
}
Spec:
[
{
"operation": "shift",
"spec": {
"items": {
"*": {
"id": "products[&1].productId",
"name": "products[&1].productName",
"type": "products[&1].category"
}
}
}
}
]
Output:
{
"products": [
{"productId": 1, "productName": "Apple", "category": "fruit"},
{"productId": 2, "productName": "Carrot", "category": "vegetable"},
{"productId": 3, "productName": "Banana", "category": "fruit"}
]
}
Conditional Matching by Value
Match based on field values using the @ reference on the left side:
Input:
{
"events": [
{"type": "click", "data": "button1"},
{"type": "view", "data": "page1"},
{"type": "click", "data": "link2"}
]
}
Spec:
[
{
"operation": "shift",
"spec": {
"events": {
"*": {
"type": {
"click": {
"@(2,data)": "clicks[]"
},
"view": {
"@(2,data)": "views[]"
}
}
}
}
}
}
]
Output:
{
"clicks": ["button1", "link2"],
"views": ["page1"]
}
Hash (#) for Array Collection
The # symbol creates arrays by collecting matched values:
Input:
{
"users": {
"u1": {"name": "Alice", "role": "admin"},
"u2": {"name": "Bob", "role": "user"},
"u3": {"name": "Charlie", "role": "admin"}
}
}
Spec:
[
{
"operation": "shift",
"spec": {
"users": {
"*": {
"name": "allNames[]",
"role": {
"admin": {
"#admin": "roles.admins[]",
"@(2,name)": "adminNames[]"
},
"user": {
"#user": "roles.users[]",
"@(2,name)": "userNames[]"
}
}
}
}
}
}
]
Output:
{
"allNames": ["Alice", "Bob", "Charlie"],
"roles": {
"admins": ["admin", "admin"],
"users": ["user"]
},
"adminNames": ["Alice", "Charlie"],
"userNames": ["Bob"]
}
Chained Specifications
Complex transformations often require multiple passes:
Input:
{
"order": {
"id": "ORD-001",
"items": [
{"sku": "A1", "qty": 2, "price": 10.00},
{"sku": "B2", "qty": 1, "price": 25.00}
],
"customer": {
"name": " JOHN DOE ",
"email": "[email protected]"
}
}
}
Chained Spec:
[
{
"operation": "shift",
"spec": {
"order": {
"id": "orderId",
"items": {
"*": {
"sku": "lineItems[&1].sku",
"qty": "lineItems[&1].quantity",
"price": "lineItems[&1].unitPrice"
}
},
"customer": {
"name": "customer.rawName",
"email": "customer.rawEmail"
}
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"customer": {
"name": "=trim(@(1,rawName))",
"email": "=toLower(@(1,rawEmail))"
},
"lineItems": {
"*": {
"lineTotal": "=doubleSum(=multiply(@(1,quantity),@(1,unitPrice)))"
}
}
}
},
{
"operation": "remove",
"spec": {
"customer": {
"rawName": "",
"rawEmail": ""
}
}
},
{
"operation": "default",
"spec": {
"status": "pending",
"currency": "USD",
"metadata": {
"version": "1.0"
}
}
}
]
Output:
{
"orderId": "ORD-001",
"lineItems": [
{"sku": "A1", "quantity": 2, "unitPrice": 10.00},
{"sku": "B2", "quantity": 1, "unitPrice": 25.00}
],
"customer": {
"name": "JOHN DOE",
"email": "[email protected]"
},
"status": "pending",
"currency": "USD",
"metadata": {
"version": "1.0"
}
}
Sort Operation
Alphabetically sort object keys (useful for consistent output):
Spec:
[
{
"operation": "sort"
}
]
This recursively sorts all keys in the document.
Summary
This article covered advanced JOLT features:
- Cardinality for ONE/MANY value normalization
- Modify operations with string and math functions
- Advanced wildcards for pattern matching
- Conditional routing by value
- Chained specifications for complex transformations
- Sort operation for consistent output
The next article explores dynamic HTTP listeners with portpxy for webhook handling.
Resources
Next: Dynamic HTTP Listeners with portpxy
Check out the next article in this series, Apache NiFi: Dynamic HTTP Listeners with portpxy.
This blog post, titled: "Apache NiFi: JOLT Transformations Part 2: Apache NiFi Part 4" by Craig Johnston, is licensed under a Creative Commons Attribution 4.0 International License.
