This article introduces JOLT, a JSON-to-JSON transformation library, as used in Apache NiFi, and covers its fundamental operations: shift, default, and remove.
If you’ve ever pulled data from a third-party API and tried to insert it into a database, you know the JSON structure rarely matches your schema. Field names are different, nested objects need flattening, some fields need defaults, others need to go away entirely. JOLT lets you define those transformations declaratively instead of writing code. I use it for reshaping webhook payloads, flattening API responses, and cleaning up data before it hits Postgres.
What is JOLT?
JOLT is a JSON transformation library that lets you restructure JSON documents declaratively. In NiFi, the JoltTransformJSON processor applies JOLT specifications to FlowFile content.
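JOLT provides the following operations. In a chain spec they run in the order you list them in the spec array, not in the order shown here:
- shift - Move and rename data
- default - Add default values for missing fields
- remove - Delete fields
- sort - Sort map keys alphabetically
- cardinality - Control array/single value handling
- modify - Transform values with functions (the operation names are modify-overwrite-beta and modify-default-beta)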
Shift Operation
The shift operation moves data from the input to the output. The left side matches input paths, and the right side defines output paths.
Basic Field Mapping
Input:
{
"firstName": "John",
"lastName": "Doe",
"age": 30
}
Spec:
[
{
"operation": "shift",
"spec": {
"firstName": "first_name",
"lastName": "last_name",
"age": "user_age"
}
}
]
Output:
{
"first_name": "John",
"last_name": "Doe",
"user_age": 30
}
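To make the left-side/right-side idea concrete, here is a minimal plain-Java sketch of what a flat shift does. It is an illustration of the semantics only, not Jolt's implementation; the class name FlatShiftSketch and the extra nickname field are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration of a flat shift; this is NOT how Jolt is implemented.
final class FlatShiftSketch {

    // For each "inputKey -> outputKey" rule, copy the value under its new name.
    // Input keys that no rule mentions are left out of the output.
    static Map<String, Object> shift(Map<String, Object> input, Map<String, String> spec) {
        Map<String, Object> output = new LinkedHashMap<>();
        for (Map.Entry<String, String> rule : spec.entrySet()) {
            if (input.containsKey(rule.getKey())) {
                output.put(rule.getValue(), input.get(rule.getKey()));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("firstName", "John");
        input.put("lastName", "Doe");
        input.put("age", 30);
        input.put("nickname", "JD"); // hypothetical field with no rule, so it is dropped

        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("firstName", "first_name");
        spec.put("lastName", "last_name");
        spec.put("age", "user_age");

        // prints {first_name=John, last_name=Doe, user_age=30}
        System.out.println(shift(input, spec));
    }
}
```

The point worth remembering is the dropped field: shift only emits what the spec matches, so anything you want to keep has to appear on the left side.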
Nested Object Mapping
Input:
{
"user": {
"name": "John",
"contact": {
"email": "[email protected]",
"phone": "555-1234"
}
}
}
Spec:
[
{
"operation": "shift",
"spec": {
"user": {
"name": "profile.displayName",
"contact": {
"email": "profile.email",
"phone": "profile.telephone"
}
}
}
}
]
Output:
{
"profile": {
"displayName": "John",
"email": "[email protected]",
"telephone": "555-1234"
}
}
Flattening Nested Structures
Input:
{
"order": {
"id": "ORD-123",
"customer": {
"name": "Jane",
"address": {
"city": "Seattle",
"state": "WA"
}
}
}
}
Spec:
[
{
"operation": "shift",
"spec": {
"order": {
"id": "orderId",
"customer": {
"name": "customerName",
"address": {
"city": "city",
"state": "state"
}
}
}
}
}
]
Output:
{
"orderId": "ORD-123",
"customerName": "Jane",
"city": "Seattle",
"state": "WA"
}
Creating Nested from Flat
Input:
{
"orderId": "ORD-123",
"customerName": "Jane",
"customerEmail": "[email protected]",
"itemName": "Widget",
"itemPrice": 29.99
}
Spec:
[
{
"operation": "shift",
"spec": {
"orderId": "order.id",
"customerName": "order.customer.name",
"customerEmail": "order.customer.email",
"itemName": "order.items[0].name",
"itemPrice": "order.items[0].price"
}
}
]
Output:
{
"order": {
"id": "ORD-123",
"customer": {
"name": "Jane",
"email": "[email protected]"
},
"items": [
{
"name": "Widget",
"price": 29.99
}
]
}
}
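The dotted output paths above build nested objects on demand. A rough plain-Java sketch of that idea follows. It assumes a simplified path syntax (dots only, no array segments such as items[0]) and is not Jolt's own code; DottedPathSketch and putPath are names invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: writes a value at a dot-separated path.
final class DottedPathSketch {

    // Walk the path, creating intermediate maps where they do not exist yet,
    // then store the value under the last segment.
    // Array segments such as items[0] are deliberately not handled here.
    @SuppressWarnings("unchecked")
    static void putPath(Map<String, Object> root, String path, Object value) {
        String[] parts = path.split("\\.");
        Map<String, Object> node = root;
        for (int i = 0; i < parts.length - 1; i++) {
            node = (Map<String, Object>) node.computeIfAbsent(
                    parts[i], k -> new LinkedHashMap<String, Object>());
        }
        node.put(parts[parts.length - 1], value);
    }

    public static void main(String[] args) {
        Map<String, Object> output = new LinkedHashMap<>();
        putPath(output, "order.id", "ORD-123");
        putPath(output, "order.customer.name", "Jane");

        // prints {order={id=ORD-123, customer={name=Jane}}}
        System.out.println(output);
    }
}
```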
Using Wildcards
The * wildcard matches any key at the current level.
Input:
{
"data": {
"user1": {"name": "Alice", "score": 85},
"user2": {"name": "Bob", "score": 92},
"user3": {"name": "Charlie", "score": 78}
}
}
Spec:
[
{
"operation": "shift",
"spec": {
"data": {
"*": {
"name": "users.&1.username",
"score": "users.&1.points"
}
}
}
}
]
Output:
{
"users": {
"user1": {"username": "Alice", "points": 85},
"user2": {"username": "Bob", "points": 92},
"user3": {"username": "Charlie", "points": 78}
}
}
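If the wildcard feels abstract, it can help to read it as a loop. The sketch below is a plain-Java approximation of this one transformation, assuming the input shape shown above: the loop plays the role of *, and the loop key is what the & reference reuses in the output path. WildcardSketch is a hypothetical name and none of this is Jolt code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: the "*" match written out as an explicit loop.
final class WildcardSketch {

    @SuppressWarnings("unchecked")
    static Map<String, Object> reshape(Map<String, Object> data) {
        Map<String, Object> users = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : data.entrySet()) { // every key "*" would match
            String matchedKey = entry.getKey();                   // the key the & reference reuses
            Map<String, Object> user = (Map<String, Object>) entry.getValue();

            Map<String, Object> reshaped = new LinkedHashMap<>();
            if (user.containsKey("name")) {
                reshaped.put("username", user.get("name"));
            }
            if (user.containsKey("score")) {
                reshaped.put("points", user.get("score"));
            }
            if (!reshaped.isEmpty()) {
                users.put(matchedKey, reshaped);
            }
        }
        Map<String, Object> output = new LinkedHashMap<>();
        output.put("users", users);
        return output;
    }

    public static void main(String[] args) {
        Map<String, Object> alice = new LinkedHashMap<>();
        alice.put("name", "Alice");
        alice.put("score", 85);
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("user1", alice);

        // prints {users={user1={username=Alice, points=85}}}
        System.out.println(reshape(data));
    }
}
```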
Key Reference (&)
The & symbol references a key matched earlier in the spec, counting levels up from the current position.
- & or &0 - Current matched key
- &1 - Parent matched key
- &2 - Grandparent matched key
Input:
{
"departments": {
"engineering": ["Alice", "Bob"],
"marketing": ["Charlie", "Diana"]
}
}
Spec:
[
{
"operation": "shift",
"spec": {
"departments": {
"*": {
"*": "employees.&1.members[]"
}
}
}
}
]
Output:
{
"employees": {
"engineering": {"members": ["Alice", "Bob"]},
"marketing": {"members": ["Charlie", "Diana"]}
}
}
Default Operation
The default operation adds values for missing fields without overwriting existing ones.
Input:
{
"name": "John",
"status": "active"
}
Spec:
[
{
"operation": "default",
"spec": {
"name": "Unknown",
"email": "[email protected]",
"status": "pending",
"role": "user",
"metadata": {
"version": "1.0",
"source": "api"
}
}
}
]
Output:
{
"name": "John",
"status": "active",
"email": "[email protected]",
"role": "user",
"metadata": {
"version": "1.0",
"source": "api"
}
}
Note that name and status retain their original values.
Nested Defaults
Input:
{
"user": {
"id": 123
}
}
Spec:
[
{
"operation": "default",
"spec": {
"user": {
"name": "Anonymous",
"preferences": {
"theme": "light",
"notifications": true
}
}
}
}
]
Output:
{
"user": {
"id": 123,
"name": "Anonymous",
"preferences": {
"theme": "light",
"notifications": true
}
}
}
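The "fill in only what is missing" rule, including the nested case, can be sketched as a small recursive function. This is a plain-Java approximation rather than Jolt's implementation; in this sketch a key whose value is null is treated the same as a missing key, and DefaultSketch and applyDefaults are illustrative names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: recursive "add if absent" over nested maps.
final class DefaultSketch {

    // Walk the defaults; add a value only where the target has none.
    // Assumption of this sketch: a null value counts as missing.
    @SuppressWarnings("unchecked")
    static void applyDefaults(Map<String, Object> target, Map<String, Object> defaults) {
        for (Map.Entry<String, Object> d : defaults.entrySet()) {
            Object existing = target.get(d.getKey());
            Object fallback = d.getValue();
            if (fallback instanceof Map) {
                if (existing == null) {
                    existing = new LinkedHashMap<String, Object>();
                    target.put(d.getKey(), existing);
                }
                if (existing instanceof Map) {
                    applyDefaults((Map<String, Object>) existing, (Map<String, Object>) fallback);
                }
            } else if (existing == null) {
                target.put(d.getKey(), fallback);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("id", 123);
        Map<String, Object> target = new LinkedHashMap<>();
        target.put("user", user);

        Map<String, Object> prefs = new LinkedHashMap<>();
        prefs.put("theme", "light");
        prefs.put("notifications", true);
        Map<String, Object> userDefaults = new LinkedHashMap<>();
        userDefaults.put("name", "Anonymous");
        userDefaults.put("preferences", prefs);
        Map<String, Object> defaults = new LinkedHashMap<>();
        defaults.put("user", userDefaults);

        applyDefaults(target, defaults);
        // prints {user={id=123, name=Anonymous, preferences={theme=light, notifications=true}}}
        System.out.println(target);
    }
}
```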
Remove Operation
The remove operation deletes fields from the output.
Input:
{
"id": 123,
"name": "John",
"password": "secret123",
"ssn": "123-45-6789",
"email": "[email protected]"
}
Spec:
[
{
"operation": "remove",
"spec": {
"password": "",
"ssn": ""
}
}
]
Output:
{
"id": 123,
"name": "John",
"email": "[email protected]"
}
Removing Nested Fields
Input:
{
"user": {
"id": 123,
"profile": {
"name": "John",
"internalId": "INT-456",
"email": "[email protected]"
},
"audit": {
"createdBy": "admin",
"modifiedBy": "system"
}
}
}
Spec:
[
{
"operation": "remove",
"spec": {
"user": {
"profile": {
"internalId": ""
},
"audit": ""
}
}
}
]
Output:
{
"user": {
"id": 123,
"profile": {
"name": "John",
"email": "[email protected]"
}
}
}
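The remove spec mirrors the shape of the input: a nested object means "descend", and an empty string means "delete this key". A plain-Java sketch of that reading is below. It is a simplified model (no wildcards, no arrays) and not Jolt's code; RemoveSketch is a made-up name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: recursive removal driven by a spec-shaped map.
final class RemoveSketch {

    // A nested map in the spec means "look inside the same key of the target";
    // any other value (the article uses "") means "delete this key".
    @SuppressWarnings("unchecked")
    static void remove(Map<String, Object> target, Map<String, Object> spec) {
        for (Map.Entry<String, Object> rule : spec.entrySet()) {
            Object ruleValue = rule.getValue();
            if (ruleValue instanceof Map) {
                Object child = target.get(rule.getKey());
                if (child instanceof Map) {
                    remove((Map<String, Object>) child, (Map<String, Object>) ruleValue);
                }
            } else {
                target.remove(rule.getKey());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> profile = new LinkedHashMap<>();
        profile.put("name", "John");
        profile.put("internalId", "INT-456");
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("id", 123);
        user.put("profile", profile);
        user.put("audit", "placeholder"); // stands in for the audit object
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("user", user);

        Map<String, Object> profileSpec = new LinkedHashMap<>();
        profileSpec.put("internalId", "");
        Map<String, Object> userSpec = new LinkedHashMap<>();
        userSpec.put("profile", profileSpec);
        userSpec.put("audit", "");
        Map<String, Object> spec = new LinkedHashMap<>();
        spec.put("user", userSpec);

        remove(input, spec);
        // prints {user={id=123, profile={name=John}}}
        System.out.println(input);
    }
}
```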
Combining Operations
Operations execute in array order. Combine them for complex transformations.
Input:
{
"raw_data": {
"user_id": "U123",
"user_name": "johndoe",
"temp_token": "abc123",
"email_address": "[email protected]"
}
}
Spec:
[
{
"operation": "shift",
"spec": {
"raw_data": {
"user_id": "user.id",
"user_name": "user.username",
"temp_token": "user.token",
"email_address": "user.email"
}
}
},
{
"operation": "default",
"spec": {
"user": {
"role": "member",
"verified": false
}
}
},
{
"operation": "remove",
"spec": {
"user": {
"token": ""
}
}
}
]
Output:
{
"user": {
"id": "U123",
"username": "johndoe",
"email": "[email protected]",
"role": "member",
"verified": false
}
}
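One way to picture a chain is as a list of functions, each receiving the previous step's output. The sketch below models that in plain Java with three hand-written steps that loosely mirror the shift, default, and remove stages above. It is only an analogy for the ordering behaviour, not how Jolt builds its chain, and ChainSketch and its step names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustration only: a chain as an ordered list of functions.
final class ChainSketch {

    static Map<String, Object> transform(Map<String, Object> input) {
        List<UnaryOperator<Map<String, Object>>> steps = new ArrayList<>();

        // Step 1 (shift-like): rename the fields we care about.
        steps.add(in -> {
            Map<String, Object> out = new LinkedHashMap<>();
            out.put("id", in.get("user_id"));
            out.put("token", in.get("temp_token"));
            return out;
        });
        // Step 2 (default-like): add a role only if none is present.
        steps.add(in -> {
            Map<String, Object> out = new LinkedHashMap<>(in);
            out.putIfAbsent("role", "member");
            return out;
        });
        // Step 3 (remove-like): drop the token, which exists only after step 1.
        steps.add(in -> {
            Map<String, Object> out = new LinkedHashMap<>(in);
            out.remove("token");
            return out;
        });

        // Each step sees the output of the one before it.
        Map<String, Object> current = input;
        for (UnaryOperator<Map<String, Object>> step : steps) {
            current = step.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("user_id", "U123");
        input.put("temp_token", "abc123");

        // prints {id=U123, role=member}
        System.out.println(transform(input));
    }
}
```

Order matters here for the same reason it does in the spec: the remove step targets a key that does not exist until the shift step has renamed it.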
NiFi Processor Configuration
In NiFi, use the JoltTransformJSON processor:
- Add JoltTransformJSON to your flow
- Set Jolt Specification to your spec (or reference a file)
- Set Jolt Transform to “Chain” for multiple operations
Using Expression Language
Reference FlowFile attributes in specs:
[
{
"operation": "default",
"spec": {
"processedAt": "${now():format('yyyy-MM-dd HH:mm:ss')}",
"source": "${filename}"
}
}
]
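Transform Cache Size controls how many compiled specs the processor keeps in memory. Keep it above zero when the same spec is applied repeatedly so the spec is not recompiled for every FlowFile.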
Testing JOLT Specs
Use these approaches to test your specs:
NiFi’s Built-in Tester
The processor's configuration dialog includes an “Advanced” button that opens a JOLT spec tester. Paste input JSON and verify the output before running the flow.
Online Tools
- JOLT Transform Demo
- Paste input and spec to see output immediately
Unit Testing in Java
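import static org.junit.Assert.assertEquals;
import java.util.List;
import org.junit.Test;
import com.bazaarvoice.jolt.Chainr;
import com.bazaarvoice.jolt.JsonUtils;

// Assumes JUnit 4 and Jolt on the test classpath; the class name is illustrative.
public class UserTransformTest {
    @Test
    public void testTransformation() {
        // Load the spec from the classpath and build the chain of operations.
        List<Object> spec = JsonUtils.classpathToList("/specs/user-transform.json");
        Chainr chainr = Chainr.fromSpec(spec);

        // Transform the sample input and compare it with the expected document.
        Object input = JsonUtils.classpathToObject("/testdata/input.json");
        Object expected = JsonUtils.classpathToObject("/testdata/expected.json");
        Object actual = chainr.transform(input);
        assertEquals(expected, actual);
    }
}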
Summary
This article covered JOLT basics:
- Shift to restructure and rename fields
- Default to add missing values
- Remove to delete unwanted fields
- Combining operations for complex transformations
The next article explores advanced JOLT operations including cardinality, modify, and wildcard patterns.
Next: JOLT Transformations Part 2
Check out the next article in this series, Apache NiFi: JOLT Transformations Part 2.
This blog post, titled: "Apache NiFi: JOLT Transformations Part 1: Apache NiFi Part 3" by Craig Johnston, is licensed under a Creative Commons Attribution 4.0 International License.
