Transforms
Transform your data before insert.
A transform changes the shape of data while RawTree inserts it.
Many telemetry and cloud services send one large JSON object that contains the real events inside an array. That wrapper object is useful for delivery, but it is not always the shape you want to query. A transform tells RawTree which array contains the real rows and which wrapper fields should be copied onto each row.
Without a transform, RawTree stores the JSON object you send. With a transform, RawTree reads the JSON object, extracts the nested records, and stores each extracted record as its own row.
Example
CloudWatch Logs sends payloads shaped like this:
{
"owner": "123456789012",
"logGroup": "/aws/lambda/api",
"logStream": "2024/01/15/[$LATEST]abcdef",
"logEvents": [
{ "id": "1", "timestamp": 1705312800000, "message": "started" },
{ "id": "2", "timestamp": 1705312800100, "message": "finished" }
]
}
If you insert that JSON without a transform, RawTree inserts one row: the wrapper object with a nested logEvents array.
If you insert it with --transform cloudwatch-logs, RawTree inserts two rows, one for each log event:
{
"id": "1",
"timestamp": 1705312800000,
"message": "started",
"owner": "123456789012",
"logGroup": "/aws/lambda/api",
"logStream": "2024/01/15/[$LATEST]abcdef"
}
{
"id": "2",
"timestamp": 1705312800100,
"message": "finished",
"owner": "123456789012",
"logGroup": "/aws/lambda/api",
"logStream": "2024/01/15/[$LATEST]abcdef"
}
Now you can query the log events directly:
SELECT timestamp, message, logGroup
FROM cloudwatch_logs
ORDER BY timestamp
LIMIT 10;
When to use transforms
Use a transform when all of these are true:
- Your source format is one of the supported formats below.
- One JSON object contains many records inside a nested array.
- You want to query those records as table rows.
Do not use a transform when your data is already one event per JSON object or one event per JSONL line. Insert that data normally.
RawTree supports built-in transforms only. Custom transform code is not currently supported; transform the data before insert if you need a custom shape.
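If your source is not one of the built-in formats, one way to transform the data before insert is to flatten it yourself and write JSONL. The sketch below is only an illustration: the `batchId` field, the `events` array, and the function names are hypothetical, and letting wrapper values overwrite same-named record fields is a choice made for this example.

```python
import json


def flatten(payload, array_key, copy_keys):
    """Return one dict per item in payload[array_key], with selected wrapper fields copied on."""
    wrapper = {key: payload[key] for key in copy_keys if key in payload}
    rows = []
    for record in payload.get(array_key, []):
        row = dict(record)   # start with the nested record
        row.update(wrapper)  # then copy the wrapper fields onto it
        rows.append(row)
    return rows


def to_jsonl(rows):
    """Serialize rows as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


# Hypothetical payload; "batchId" and "events" are illustrative names only.
payload = {
    "batchId": "b-1",
    "events": [{"type": "click"}, {"type": "view"}],
}
rows = flatten(payload, array_key="events", copy_keys=["batchId"])
jsonl = to_jsonl(rows)
```

Because the result is already one event per line, it can be written to a file and inserted normally, without --transform.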
How to insert with a transform
Use --transform with inline JSON, JSON files, or JSONL files.
rtree insert --table traces \
--data '{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"api"}}]},"scopeSpans":[{"scope":{"name":"http"},"spans":[{"name":"GET /health","spanId":"abc"}]}]}' \
--transform otlp-traces
The API accepts the transform name as a query parameter on a JSON body insert.
curl -X POST "https://api.rawtree.com/v1/tables/traces?transform=otlp-traces" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"api"}}]},"scopeSpans":[{"scope":{"name":"http"},"spans":[{"name":"GET /health","spanId":"abc"}]}]}'
After insert, query the destination table as usual.
rtree query "SELECT * FROM traces LIMIT 10"
Transforms are not supported with URL inserts. If you use ?url=, transform the data before hosting it.
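For reference, the JSON body insert shown with curl above can also be built with Python's standard library. This is a sketch: the endpoint, query parameter, and headers are copied from the curl example, while the function name and the placeholder API key are illustrative. The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.rawtree.com/v1"  # base URL taken from the curl example


def build_insert_request(table, transform, payload, api_key):
    """Build (but do not send) a JSON body insert with ?transform=<name>."""
    query = urllib.parse.urlencode({"transform": transform})
    url = f"{API_BASE}/tables/{urllib.parse.quote(table)}?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same payload as the curl example: a single resource span object.
payload = {
    "resource": {"attributes": [{"key": "service.name", "value": {"stringValue": "api"}}]},
    "scopeSpans": [{"scope": {"name": "http"}, "spans": [{"name": "GET /health", "spanId": "abc"}]}],
}
request = build_insert_request("traces", "otlp-traces", payload, api_key="YOUR_API_KEY")
# To send it: urllib.request.urlopen(request)
```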
Supported transforms
| Transform | Input shape | Output rows |
|---|---|---|
| otlp-traces | OTLP export with resourceSpans, or one resource span object with resource and scopeSpans | One row per span |
| otlp-logs | OTLP export with resourceLogs, or one resource log object with resource and scopeLogs | One row per log record |
| otlp-metrics | OTLP export with resourceMetrics, or one resource metric object with resource and scopeMetrics | One row per data point |
| cloudwatch-logs | CloudWatch Logs payload with logEvents | One row per log event |
| cloudtrail | CloudTrail payload with Records | One row per CloudTrail record |
Unknown transform names return a 400 response with the list of supported names.
OpenTelemetry output
The OpenTelemetry transforms merge resource attributes into each emitted row. Attribute values such as stringValue, intValue, boolValue, and doubleValue are unwrapped into their inner values. Complex attribute values such as arrayValue and kvlistValue keep their OTLP wrapper object.
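The unwrapping rule can be sketched in Python as follows. This illustrates the behavior described above and is not RawTree's implementation; the function names are made up. OTLP JSON often carries intValue as a string, and this sketch returns the inner value unchanged because the text does not say whether it is converted.

```python
SCALAR_KEYS = ("stringValue", "intValue", "boolValue", "doubleValue")


def unwrap_value(value):
    """Unwrap scalar OTLP values; keep complex wrappers (arrayValue, kvlistValue) as-is."""
    for key in SCALAR_KEYS:
        if key in value:
            return value[key]
    return value


def flatten_attributes(attributes):
    """Turn an OTLP attribute list into a flat {key: value} dict."""
    return {attr["key"]: unwrap_value(attr.get("value", {})) for attr in attributes}


attributes = [
    {"key": "service.name", "value": {"stringValue": "api"}},
    {"key": "retries", "value": {"intValue": "3"}},
    {"key": "tags", "value": {"arrayValue": {"values": [{"stringValue": "a"}]}}},
]
flat = flatten_attributes(attributes)
```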
For traces and logs, RawTree also adds scope.name when the source scope includes a name.
For metrics, RawTree emits one row per data point and adds metric metadata:
| Field | Description |
|---|---|
| metric.name | The OTLP metric name |
| metric.type | The OTLP metric type, such as gauge, sum, histogram, summary, or exponentialHistogram |
| metric.unit | The metric unit when present |
| metric.description | The metric description when present |
| scope.name | The instrumentation scope name when present |
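As a rough Python illustration of the metric fields in the table, the sketch below emits one row per data point for a single OTLP metric. It assumes data points live under a dataPoints array inside the metric type object, as in OTLP JSON; the function name and its arguments are hypothetical, and this is not RawTree's implementation.

```python
METRIC_TYPES = ("gauge", "sum", "histogram", "summary", "exponentialHistogram")


def metric_rows(metric, scope_name=None, resource_fields=None):
    """Emit one row per data point of a single OTLP metric, adding metric.* metadata."""
    rows = []
    for metric_type in METRIC_TYPES:
        body = metric.get(metric_type)
        if body is None:
            continue
        for point in body.get("dataPoints", []):
            row = dict(point)                  # start with the data point itself
            row.update(resource_fields or {})  # then the resource attributes
            if scope_name:
                row["scope.name"] = scope_name
            row["metric.name"] = metric.get("name")
            row["metric.type"] = metric_type
            for optional in ("unit", "description"):
                if metric.get(optional):       # only added when present
                    row[f"metric.{optional}"] = metric[optional]
            rows.append(row)
    return rows


metric = {
    "name": "cpu.usage",
    "unit": "%",
    "gauge": {"dataPoints": [{"timeUnixNano": "1700000000000000000", "asDouble": 42.5}]},
}
rows = metric_rows(metric, scope_name="runtime", resource_fields={"host.name": "node-1"})
```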
Result shapes
The exact output columns depend on the source fields. Each transform starts with the nested record it unwraps, then merges selected wrapper fields into the row.
otlp-traces
Input can be a full OTLP export with resourceSpans, or a single resource span object with resource and scopeSpans.
For each item in scopeSpans[].spans[], RawTree inserts one row. The row starts with the span object itself, then RawTree adds resource attributes such as service.name, and adds scope.name when the source scope has a name.
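The two accepted input shapes and the row construction can be sketched in Python. This is an illustration of the description above, not RawTree's code; the function names are hypothetical, and for brevity only stringValue resource attributes are unwrapped.

```python
def resource_spans_of(payload):
    """Accept a full export (resourceSpans) or a single resource span object."""
    if "resourceSpans" in payload:
        return payload["resourceSpans"]
    if "scopeSpans" in payload:
        return [payload]
    return []


def span_rows(payload):
    """Return one row per span, with resource attributes and scope.name merged in."""
    rows = []
    for resource_span in resource_spans_of(payload):
        attrs = resource_span.get("resource", {}).get("attributes", [])
        # Simplification: only stringValue attributes are handled in this sketch.
        resource_fields = {
            attr["key"]: attr["value"]["stringValue"]
            for attr in attrs
            if "stringValue" in attr.get("value", {})
        }
        for scope_span in resource_span.get("scopeSpans", []):
            scope_name = scope_span.get("scope", {}).get("name")
            for span in scope_span.get("spans", []):
                row = dict(span)
                row.update(resource_fields)
                if scope_name:
                    row["scope.name"] = scope_name
                rows.append(row)
    return rows


single = {
    "resource": {"attributes": [{"key": "service.name", "value": {"stringValue": "api"}}]},
    "scopeSpans": [{"scope": {"name": "http"}, "spans": [{"name": "GET /health", "spanId": "abc"}]}],
}
rows_from_single = span_rows(single)
rows_from_export = span_rows({"resourceSpans": [single]})
```

Both calls produce the same single row, since a full export is a list of resource span objects.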
Example output row from a span named GET /health:
{
"spanId": "abc",
"name": "GET /health",
"service.name": "api",
"scope.name": "http"
}
otlp-logs
Input can be a full OTLP export with resourceLogs, or a single resource log object with resource and scopeLogs.
For each item in scopeLogs[].logRecords[], RawTree inserts one row. The row starts with the log record object itself, then RawTree adds resource attributes such as service.name, and adds scope.name when the source scope has a name.
Example output row from an INFO log record:
{
"timeUnixNano": "1700000000000000000",
"severityText": "INFO",
"body": { "stringValue": "request started" },
"service.name": "api",
"scope.name": "logger"
}
otlp-metrics
Input can be a full OTLP export with resourceMetrics, or a single resource metric object with resource and scopeMetrics.
For every metric data point under gauge, sum, histogram, summary, or exponentialHistogram, RawTree inserts one row. The row starts with the data point object itself, then RawTree adds resource attributes, optional scope.name, and metric fields such as metric.name, metric.type, metric.unit, and metric.description.
Example output row from a cpu.usage gauge data point:
{
"timeUnixNano": "1700000000000000000",
"asDouble": 42.5,
"host.name": "node-1",
"scope.name": "runtime",
"metric.name": "cpu.usage",
"metric.type": "gauge",
"metric.unit": "%"
}
cloudwatch-logs
Input must contain a top-level logEvents array.
For each item in logEvents[], RawTree inserts one row. The row starts with the log event object itself, then RawTree adds top-level CloudWatch fields such as owner, logGroup, logStream, and messageType.
Example output row from a CloudWatch log event:
{
"id": "37199704885633600550210639780",
"timestamp": 1705312800000,
"message": "START RequestId: abc-123",
"owner": "123456789012",
"logGroup": "/aws/lambda/my-function",
"logStream": "2024/01/15/[$LATEST]abcdef",
"messageType": "DATA_MESSAGE"
}
cloudtrail
Input must contain a top-level Records array.
For each item in Records[], RawTree inserts the CloudTrail record object itself. RawTree does not add wrapper fields for this transform.
Example output row from a CloudTrail GetObject record:
{
"eventVersion": "1.08",
"eventTime": "2024-01-15T10:30:00Z",
"eventSource": "s3.amazonaws.com",
"eventName": "GetObject",
"awsRegion": "us-east-1",
"sourceIPAddress": "203.0.113.42",
"userIdentity": { "type": "IAMUser", "userName": "alice" }
}
For transforms that merge wrapper fields, if the unwrapped record already contains a field with the same name as a merged wrapper field, the wrapper value takes precedence and is the value stored in that column. For example, OpenTelemetry scope.name comes from the scope wrapper, and CloudWatch logGroup comes from the top-level wrapper.
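In Python terms, this precedence behaves like a dict merge in which the wrapper fields are applied last. A minimal sketch, with a hypothetical function name and a made-up record that happens to carry its own logGroup:

```python
def merge_row(record, wrapper_fields):
    """Merge wrapper fields onto a record; on a name clash the wrapper value wins."""
    return {**record, **wrapper_fields}


record = {"message": "started", "logGroup": "from-record"}  # illustrative clash
wrapper = {"logGroup": "/aws/lambda/api"}
row = merge_row(record, wrapper)
```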
AWS output
The cloudwatch-logs transform unwraps logEvents. Each emitted row keeps the log event fields and also includes these wrapper fields when present:
| Field | Description |
|---|---|
| owner | AWS account owner from the wrapper payload |
| logGroup | CloudWatch log group |
| logStream | CloudWatch log stream |
| messageType | CloudWatch message type |
The cloudtrail transform unwraps the top-level Records array. CloudTrail records are inserted as the emitted rows without additional wrapper fields.
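The difference between the two AWS transforms can be sketched in Python. This is an illustration under the assumption that exactly the four wrapper fields in the table are copied; the function names are hypothetical, and this is not RawTree's implementation.

```python
CLOUDWATCH_WRAPPER_FIELDS = ("owner", "logGroup", "logStream", "messageType")


def cloudwatch_rows(payload):
    """One row per log event, with the wrapper fields that are present copied on."""
    wrapper = {key: payload[key] for key in CLOUDWATCH_WRAPPER_FIELDS if key in payload}
    return [{**event, **wrapper} for event in payload.get("logEvents", [])]


def cloudtrail_rows(payload):
    """One row per CloudTrail record, with no wrapper fields added."""
    return [dict(record) for record in payload.get("Records", [])]


cloudwatch_payload = {
    "owner": "123456789012",
    "logGroup": "/aws/lambda/api",
    "logStream": "2024/01/15/[$LATEST]abcdef",
    "logEvents": [
        {"id": "1", "timestamp": 1705312800000, "message": "started"},
        {"id": "2", "timestamp": 1705312800100, "message": "finished"},
    ],
}
cloudtrail_payload = {"Records": [{"eventName": "GetObject", "awsRegion": "us-east-1"}]}

cw_rows = cloudwatch_rows(cloudwatch_payload)
ct_rows = cloudtrail_rows(cloudtrail_payload)
```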
Empty results
A transform can emit zero rows if the input JSON does not contain the expected wrapper arrays. For example, otlp-traces requires spans under resourceSpans or scopeSpans, and cloudtrail requires a Records array.
If an insert succeeds but inserts fewer rows than expected, check the source shape and query the table with SELECT * FROM <table> LIMIT 10.
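A quick client-side check of the source shape can catch the most common cause before you insert. The sketch below only looks for the expected top-level array named for each transform on this page; the function name is hypothetical, and a payload that passes can still produce zero rows if the arrays are empty or the nested levels are missing.

```python
EXPECTED_KEYS = {
    "otlp-traces": ("resourceSpans", "scopeSpans"),
    "otlp-logs": ("resourceLogs", "scopeLogs"),
    "otlp-metrics": ("resourceMetrics", "scopeMetrics"),
    "cloudwatch-logs": ("logEvents",),
    "cloudtrail": ("Records",),
}


def missing_wrapper(payload, transform):
    """Return the expected top-level keys when none is present as an array, else None."""
    expected = EXPECTED_KEYS[transform]  # KeyError for an unsupported transform name
    if any(isinstance(payload.get(key), list) for key in expected):
        return None
    return expected


# Key names are case-sensitive: "records" is not "Records".
problem = missing_wrapper({"records": []}, "cloudtrail")
no_problem = missing_wrapper({"Records": [{"eventName": "GetObject"}]}, "cloudtrail")
```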