Transformations & Field Mapping
Overview
The transform engine processes data at the NDJSON (newline-delimited JSON) intermediate representation layer. This means transforms work consistently across all input formats (CSV, XML, JSON, NDJSON) and are applied in a streaming manner without buffering entire datasets.
Key Features
- 🧮 30+ Compute Functions — String manipulation, math, logic, type conversions
- 🎯 Type Coercion — Convert strings to numbers, booleans, timestamps
- 🛡️ Error Policies — Control behavior for missing fields and coercion failures
Basic Usage
import { ConvertBuddy } from "convert-buddy-js";
const buddy = new ConvertBuddy();
const result = await buddy.convert(csvData, {
outputFormat: "json",
transform: {
mode: "replace",
fields: [
{ targetFieldName: "id", required: true },
{ targetFieldName: "full_name", compute: "concat(first, ' ', last)" },
{ targetFieldName: "age", coerce: { type: "i64" }, defaultValue: 0 },
{ targetFieldName: "email", compute: "lower(trim(email))" },
],
onMissingField: "null",
onCoerceError: "null",
},
});Transform Modes
Replace Mode (default)
mode: "replace" outputs only the fields you explicitly map. All original fields are discarded unless mapped.
// Input: { "first": "Alice", "last": "Smith", "age": 30, "city": "NYC" }
// Output: { "full_name": "Alice Smith", "age": 30 }
transform: {
mode: "replace",
fields: [
{ targetFieldName: "full_name", compute: "concat(first, ' ', last)" },
{ targetFieldName: "age" },
],
}Augment Mode
mode: "augment" preserves all original fields and adds or overwrites the fields you map. Perfect for adding computed columns to existing data.
// Input: { "name": "Alice", "age": 30 }
// Output: { "name": "Alice", "age": 30, "name_upper": "ALICE", "is_adult": true }
transform: {
mode: "augment",
fields: [
{ targetFieldName: "name_upper", compute: "upper(name)" },
{ targetFieldName: "is_adult", compute: "gte(age, 18)" },
],
}Field Mapping
Each field mapping supports multiple options:
type FieldMap = {
targetFieldName: string; // Output field name (required)
originFieldName?: string; // Input field name (defaults to targetFieldName)
required?: boolean; // Fail if missing
defaultValue?: any; // Fallback value when missing or null
coerce?: Coerce; // Type conversion specification
compute?: string; // Expression to compute value
};Examples
Basic pass-through:
{ targetFieldName: "name" } // Maps input "name" to output "name"Rename field:
{ targetFieldName: "fullName", originFieldName: "full_name" }Required field:
{ targetFieldName: "id", required: true }Default value:
{ targetFieldName: "status", defaultValue: "active" }Computed field:
{ targetFieldName: "full_name", compute: "concat(first, ' ', last)" }Type coercion:
{ targetFieldName: "age", coerce: { type: "i64" } }Compute Functions
Transform expressions support 30+ built-in functions. All functions are evaluated in Rust/WASM for maximum performance.
String Functions
| Function | Description | Example |
|---|---|---|
concat(...) | Join strings and values | |
lower(s) | Convert to lowercase | "hello" |
upper(s) | Convert to uppercase | "HELLO" |
trim(s) | Remove leading/trailing whitespace | "text" |
substring(s, start, end) | Extract substring | "hel" |
replace(s, old, new) | Replace text | |
len(s) | String length | 5 |
starts_with(s, prefix) | Check if starts with prefix | |
ends_with(s, suffix) | Check if ends with suffix | |
contains(s, substr) | Check if contains substring | |
pad_start(s, len, char) | Left-pad to length | "00042" |
pad_end(s, len, char) | Right-pad to length | "x__" |
trim_start(s) | Remove leading whitespace | |
trim_end(s) | Remove trailing whitespace | |
repeat(s, n) | Repeat string n times | "ababab" |
reverse(s) | Reverse string | "olleh" |
Math Functions
| Function | Description | Example |
|---|---|---|
+, -, *, / | Arithmetic operators | |
round(n) | Round to nearest integer | 4 |
floor(n) | Round down | 3 |
ceil(n) | Round up | 4 |
abs(n) | Absolute value | 5 |
min(...) | Minimum value | |
max(...) | Maximum value | |
Logic & Comparison
| Function | Description | Example |
|---|---|---|
if(cond, true_val, false_val) | Conditional expression | |
not(bool) | Boolean NOT | |
eq(a, b) | Equals | |
ne(a, b) | Not equals | |
gt(a, b) | Greater than | |
gte(a, b) | Greater than or equal | |
lt(a, b) | Less than | |
lte(a, b) | Less than or equal | |
Type Checking & Utilities
| Function | Description | Example |
|---|---|---|
is_null(val) | Check if null | |
is_number(val) | Check if number | |
is_string(val) | Check if string | |
is_bool(val) | Check if boolean | |
to_string(val) | Convert to string | "42" |
parse_int(s) | Parse string to integer | 123 |
parse_float(s) | Parse string to float | 3.14 |
coalesce(...) | First non-null value | |
default(val, fallback) | Fallback for null | |
Complex Expressions
Functions can be nested and combined:
// Nested functions
compute: "upper(trim(concat(first, ' ', last)))"
// Arithmetic with functions
compute: "round((price - discount) * 1.1)"
// Conditional logic with comparisons
compute: "if(gte(age, 18), 'adult', if(gte(age, 13), 'teen', 'child'))"
// Multiple operations
compute: "pad_start(to_string(round(price)), 10, '0')"Type Coercion
Convert field values to specific types:
type Coerce =
| { type: "string" }
| { type: "i64" } // 64-bit integer
| { type: "f64" } // 64-bit float
| { type: "bool" }
| { type: "timestamp_ms"; format?: "iso8601" | "unix_ms" | "unix_s" };String Coercion
// Number → String: 42 → "42"
// Boolean → String: true → "true"
// Null → String: null → ""
{ targetFieldName: "age_str", coerce: { type: "string" } }Integer Coercion (i64)
// String → Integer: "42" → 42
// Float → Integer: 3.7 → 3 (truncates)
// Boolean → Integer: true → 1, false → 0
{ targetFieldName: "count", coerce: { type: "i64" } }Float Coercion (f64)
// String → Float: "3.14" → 3.14
// Integer → Float: 42 → 42.0
{ targetFieldName: "price", coerce: { type: "f64" } }Boolean Coercion
// String → Boolean: "true"/"false" → true/false
// Number → Boolean: 0 → false, non-zero → true
{ targetFieldName: "active", coerce: { type: "bool" } }Timestamp Coercion
// ISO8601 → Unix milliseconds
{
targetFieldName: "created_ts",
coerce: { type: "timestamp_ms", format: "iso8601" }
}
// "2024-01-15T10:30:00Z" → 1705315800000
// Unix seconds → milliseconds
{
targetFieldName: "updated_ts",
coerce: { type: "timestamp_ms", format: "unix_s" }
}
// 1705315800 → 1705315800000Error Handling
onMissingField
Controls behavior when a mapped field is missing from input:
"error"— Fail the entire conversion"null"— Insertnullfor missing fields"drop"— Omit the field from output
transform: {
fields: [{ targetFieldName: "optional" }],
onMissingField: "null", // Missing → null
}onMissingRequired
Controls behavior when a required field is missing:
"error"— Fail the conversion (default)"abort"— Stop processing this record
transform: {
fields: [{ targetFieldName: "id", required: true }],
onMissingRequired: "error",
}onCoerceError
Controls behavior when type coercion fails:
"error"— Fail the entire conversion"null"— Set field tonullon failure"dropRecord"— Silently skip records that fail coercion
// Skip records where age cannot be parsed
transform: {
fields: [
{ targetFieldName: "age", coerce: { type: "i64" } }
],
onCoerceError: "dropRecord", // Skip invalid records
}Performance Considerations
- ✅ Zero JS overhead — All transforms execute in Rust/WASM
- ✅ Streaming — Processes line-by-line without buffering
- ✅ Memory efficient — Uses WASM linear memory, no JS objects
- ✅ Type safe — Expression parser validates syntax at compile time
- ⚡ Low overhead — Typically adds <10% to conversion time
For extremely complex transformations requiring external API calls, database lookups, or custom business logic, use the onRecords callback for post-processing in JavaScript.
Complete Examples
E-commerce Product Enrichment
const controller = buddy.stream("products.csv", {
outputFormat: "json",
transform: {
mode: "augment",
fields: [
// Price calculations
{
targetFieldName: "price_with_tax",
compute: "round(price * 1.1 * 100) / 100"
},
{
targetFieldName: "discount_amount",
compute: "round((original_price - price) * 100) / 100"
},
// Categorization
{
targetFieldName: "price_tier",
compute: "if(gt(price, 100), 'premium', if(gt(price, 50), 'mid', 'budget'))"
},
// SKU normalization
{
targetFieldName: "sku_normalized",
compute: "upper(trim(replace(sku, '-', '')))"
},
],
},
onRecords: async (ctrl, records) => {
await db.products.upsertMany(records);
},
});User Data Cleaning Pipeline
const controller = buddy.stream("users-export.csv", {
outputFormat: "ndjson",
transform: {
mode: "replace",
fields: [
{ targetFieldName: "id", required: true },
{
targetFieldName: "email",
compute: "lower(trim(email))",
required: true
},
{
targetFieldName: "full_name",
compute: "trim(concat(first_name, ' ', last_name))"
},
{
targetFieldName: "age",
coerce: { type: "i64" },
defaultValue: null
},
{
targetFieldName: "is_verified",
originFieldName: "verified",
coerce: { type: "bool" },
defaultValue: false
},
{
targetFieldName: "created_ts",
originFieldName: "created_at",
coerce: { type: "timestamp_ms", format: "iso8601" }
},
],
onMissingRequired: "error",
onCoerceError: "dropRecord", // Skip malformed records
},
onRecords: async (ctrl, records, stats) => {
console.log(`Processed ${stats.recordsOut} valid records`);
await importToDatabase(records);
},
});