Matcher
The matcher takes two datasets and finds records that correspond to each other. It outputs three arrays: matched pairs, unmatched records from the left side, and unmatched records from the right side.
Basic Usage
{
"type": "matcher",
"properties": {
"left": "@input.invoices",
"right": "@input.payments",
"matchOn": ["invoice_id"],
"outputMatched": "matched",
"outputUnmatchedLeft": "unmatched_invoices",
"outputUnmatchedRight": "unmatched_payments"
}
}
This matches invoices to payments by exact invoice_id. Records with matching IDs land in @matched. Invoices with no payment go to @unmatched_invoices. Payments with no invoice go to @unmatched_payments.
Properties Reference
| Property | Type | Required | Description |
|---|---|---|---|
| left | array / @path / doc: | Yes | First dataset: inline array, @path reference, or doc: uploaded document |
| right | array / @path / doc: | Yes | Second dataset: inline array, @path reference, or doc: uploaded document |
| matchOn | string[] | Yes | Fields that must match exactly |
| tolerance | number | No | Absolute numeric tolerance applied to the amount field. A tolerance of 0.02 means amounts differing by more than $0.02 won't match |
| dateWindowDays | number | No | Date tolerance in days (±N). Applied to the date field |
| fuzzyThreshold | number | No | Text similarity threshold 0–100. Applied to the field specified by descriptionKey |
| descriptionKey | string | No | Field name for fuzzy text matching |
| rules | array | No | Custom matching rules |
| outputMatched | string | No | Context key for matched pairs (default: "matched") |
| outputUnmatchedLeft | string | No | Context key for unmatched left records (default: "unmatchedLeft") |
| outputUnmatchedRight | string | No | Context key for unmatched right records (default: "unmatchedRight") |
Matching Criteria
Exact Key Matching (matchOn)
Fields listed in matchOn must match exactly. This is the primary matching criterion: two records are compared further only if their matchOn fields align.
{
"matchOn": ["invoice_id"]
}
Multiple keys create a composite match — all must match:
{
"matchOn": ["vendor_id", "invoice_number"]
}
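The composite behaviour can be pictured as comparing tuples of field values. The following is a minimal Python sketch of that idea, not the matcher's actual implementation; the helper names composite_key and keys_align and the sample records are hypothetical, and treating a missing field as None is an assumption.

```python
def composite_key(record, match_on):
    """Build a tuple of the matchOn field values, in the order listed."""
    # Assumption: a missing field is read as None; the real matcher may differ.
    return tuple(record.get(field) for field in match_on)


def keys_align(left, right, match_on):
    """Return True only when every matchOn field is equal on both records."""
    return composite_key(left, match_on) == composite_key(right, match_on)


match_on = ["vendor_id", "invoice_number"]
invoice = {"vendor_id": "V-17", "invoice_number": "1001", "amount": 250}
payment_same = {"vendor_id": "V-17", "invoice_number": "1001", "amount": 250}
payment_other_vendor = {"vendor_id": "V-42", "invoice_number": "1001", "amount": 250}

print(keys_align(invoice, payment_same, match_on))
print(keys_align(invoice, payment_other_vendor, match_on))
```

The second payment shares the invoice number but not the vendor, so it does not align under the composite key.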
Numeric Tolerance (tolerance)
Allow the amount field to differ by an absolute value. A tolerance of 50 means amounts within $50 of each other are still considered a match.
{
"matchOn": ["invoice_id"],
"tolerance": 50
}
With this configuration, an invoice for $1,000.00 would match a payment of $950.00–$1,050.00. Smaller tolerances such as 0.02 are still valid when you truly mean two cents.
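As an illustration of the arithmetic, the check can be sketched as an inclusive absolute-difference comparison. This is a sketch under assumptions rather than the matcher's code: the function name within_tolerance is hypothetical, and Decimal is used here only so that small tolerances such as 0.02 are not disturbed by binary floating-point rounding.

```python
from decimal import Decimal


def within_tolerance(left_amount, right_amount, tolerance):
    """Return True when the amounts differ by no more than the tolerance."""
    # Convert through str() so 10.02 becomes Decimal("10.02") exactly.
    diff = abs(Decimal(str(left_amount)) - Decimal(str(right_amount)))
    # Inclusive bound, consistent with the $950.00 to $1,050.00 range above.
    return diff <= Decimal(str(tolerance))


print(within_tolerance(1000.00, 950.00, 50))    # on the lower edge of the range
print(within_tolerance(1000.00, 1050.01, 50))   # one cent outside the range
print(within_tolerance(10.00, 10.02, 0.02))     # two-cent tolerance
```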
Date Window (dateWindowDays)
Allow date fields to differ by up to N days:
{
"matchOn": ["invoice_id"],
"dateWindowDays": 3
}
An invoice dated January 10 would match a payment dated January 7–13.
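The ±N window can be sketched as a comparison of whole-day differences. The sketch below assumes ISO 8601 date strings (as in the worked example on this page) and an inclusive window; the function name within_date_window is hypothetical and the matcher's own date parsing may differ.

```python
from datetime import date


def within_date_window(left_date, right_date, window_days):
    """Return True when the two ISO dates are at most window_days apart."""
    left = date.fromisoformat(left_date)
    right = date.fromisoformat(right_date)
    # Subtracting dates yields a timedelta; .days is the signed day count.
    return abs((left - right).days) <= window_days


print(within_date_window("2025-01-10", "2025-01-07", 3))  # 3 days earlier
print(within_date_window("2025-01-10", "2025-01-13", 3))  # 3 days later
print(within_date_window("2025-01-10", "2025-01-14", 3))  # 4 days later
```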
Fuzzy Text Matching (fuzzyThreshold + descriptionKey)
Compare text fields using fuzzy string similarity. The threshold is 0–100 where 100 is an exact match:
{
"matchOn": ["vendor_id"],
"fuzzyThreshold": 85,
"descriptionKey": "description"
}
This matches records where vendor_id is identical and the description fields are at least 85% similar. Useful for matching line-item descriptions that may be worded differently across systems.
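To get a feel for a 0–100 similarity scale, the sketch below uses Python's standard difflib.SequenceMatcher. This is a stand-in chosen for illustration only: this page does not say which similarity algorithm the matcher uses, so scores from the real matcher may differ. The function names and the case-insensitive comparison are assumptions.

```python
from difflib import SequenceMatcher


def similarity_percent(a, b):
    """Score two strings from 0 to 100, where 100 means identical."""
    # Assumption: comparison is case-insensitive.
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_match(left, right, description_key, threshold):
    """Return True when the description fields meet the threshold."""
    score = similarity_percent(left[description_key], right[description_key])
    return score >= threshold


invoice = {"vendor_id": "V-17", "description": "Monthly service fee"}
payment = {"vendor_id": "V-17", "description": "Monthly service"}

print(fuzzy_match(invoice, payment, "description", 85))
```

Because different algorithms score the same pair differently, it is worth checking a threshold against real data before relying on it.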
Custom Rules (rules)
Define additional matching rules evaluated by the condition engine:
{
"matchOn": ["invoice_id"],
"rules": [
{
"condition": {
"lessOrEqual": [
{ "abs": { "subtract": ["@left.amount", "@right.amount"] } },
50
]
}
}
]
}
Custom rules use the same condition operators as workflow conditions, with @left and @right referencing the current pair being compared.
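To make the shape of the rule above concrete, here is a toy Python evaluator for just the three operators it uses (lessOrEqual, abs, subtract). It is a hypothetical sketch, not the condition engine itself: the real engine supports more operators and may resolve @left and @right paths differently.

```python
def evaluate(node, left, right):
    """Recursively evaluate a condition node against one left/right pair."""
    if isinstance(node, str) and node.startswith("@left."):
        return left[node[len("@left."):]]
    if isinstance(node, str) and node.startswith("@right."):
        return right[node[len("@right."):]]
    if isinstance(node, dict):
        # Each operator node is assumed to hold exactly one operator key.
        op, arg = next(iter(node.items()))
        if op == "abs":
            return abs(evaluate(arg, left, right))
        if op == "subtract":
            a, b = (evaluate(x, left, right) for x in arg)
            return a - b
        if op == "lessOrEqual":
            a, b = (evaluate(x, left, right) for x in arg)
            return a <= b
        raise ValueError(f"unsupported operator in this sketch: {op}")
    return node  # literal value such as 50


rule = {
    "condition": {
        "lessOrEqual": [
            {"abs": {"subtract": ["@left.amount", "@right.amount"]}},
            50,
        ]
    }
}

print(evaluate(rule["condition"], {"amount": 2500.0}, {"amount": 2475.0}))
```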
Output Format
Matched Pairs
Each matched record contains both the left and right record:
[
{
"a": { "invoice_id": "INV-001", "amount": 1000, "vendor": "Acme" },
"b": { "invoice_id": "INV-001", "amount": 1000, "vendor": "Acme Corp" },
"match_score": 0.95,
"amount_difference": 0
}
]
The a field is the left record, b is the right record. match_score reflects overall match quality. amount_difference shows numeric deviation when tolerance matching is used.
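A downstream step might read these fields to separate clean matches from matches that passed only because of the tolerance. The Python sketch below assumes the pair structure shown above; the second sample pair, its score, and the helper name pairs_with_variance are made up for illustration, and the sign convention of amount_difference is not specified here, so the check only tests for a non-zero value.

```python
matched = [
    {
        "a": {"invoice_id": "INV-001", "amount": 1000},
        "b": {"invoice_id": "INV-001", "amount": 1000},
        "match_score": 0.95,
        "amount_difference": 0,
    },
    {
        # Illustrative pair, not taken from real matcher output.
        "a": {"invoice_id": "INV-002", "amount": 2500},
        "b": {"invoice_id": "INV-002", "amount": 2475},
        "match_score": 0.9,
        "amount_difference": 25,
    },
]


def pairs_with_variance(pairs):
    """Return matched pairs whose amounts were not identical."""
    return [p for p in pairs if p.get("amount_difference", 0) != 0]


for pair in pairs_with_variance(matched):
    print(pair["a"]["invoice_id"], pair["amount_difference"])
```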
Unmatched Records
Unmatched arrays contain the original records with no modifications:
[
{ "invoice_id": "INV-099", "amount": 5000, "vendor": "NewVendor" }
]
Worked Example
Input:
{
"invoices": [
{ "invoice_id": "INV-001", "amount": 1000.00, "date": "2025-01-10", "description": "Monthly service fee" },
{ "invoice_id": "INV-002", "amount": 2500.00, "date": "2025-01-15", "description": "Equipment rental" },
{ "invoice_id": "INV-003", "amount": 750.00, "date": "2025-01-20", "description": "Consulting hours" }
],
"payments": [
{ "invoice_id": "INV-001", "amount": 1000.00, "date": "2025-01-12", "description": "Monthly service" },
{ "invoice_id": "INV-002", "amount": 2475.00, "date": "2025-01-15", "description": "Equip rental Jan" }
]
}
Matcher configuration:
{
"type": "matcher",
"properties": {
"left": "@input.invoices",
"right": "@input.payments",
"matchOn": ["invoice_id"],
"tolerance": 50,
"dateWindowDays": 3,
"fuzzyThreshold": 80,
"descriptionKey": "description",
"outputMatched": "reconciled",
"outputUnmatchedLeft": "exceptions"
}
}
Results:
@reconciled: INV-001 (amounts equal, dates 2 days apart and within the 3-day window), INV-002 (amount difference $25 within tolerance, descriptions 80%+ similar)
@exceptions: INV-003 (no matching payment found)
Using Uploaded Documents
Instead of embedding datasets in the execution payload, upload files to Document Storage and reference them with the doc: prefix:
{
"type": "matcher",
"properties": {
"left": "doc:doc_a1b2c3d4e5f6",
"right": "doc:doc_x7y8z9w0v1u2",
"matchOn": ["invoice_id"],
"tolerance": 50,
"outputMatched": "matched",
"outputUnmatchedLeft": "exceptions"
}
}
CSV files resolve to Array<Object> with header rows as keys. JSON files resolve as-is. Pin a specific version with doc:doc_xxx@2 for audit reproducibility.
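The CSV-to-records shape can be pictured with Python's csv.DictReader, which also uses the header row as keys. This is only an analogy for the resulting structure: the helper name csv_to_records is hypothetical, and in this sketch every value stays a string, whereas this page does not say whether the platform converts numeric columns.

```python
import csv
import io


def csv_to_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


sample = "invoice_id,amount\nINV-001,1000.00\nINV-002,2500.00\n"
records = csv_to_records(sample)

print(records[0])
```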
Large Dataset Optimization
For large datasets (10,000+ records per side), the matcher automatically switches to an indexed matching strategy when the infrastructure supports it. This provides significant performance improvements by pre-indexing records by their matchOn keys rather than performing pairwise comparison.
No configuration change is needed — the matcher detects the optimal strategy based on dataset size automatically.
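The difference between pairwise comparison and an indexed strategy can be sketched in a few lines of Python. The code below is an illustration of the general technique, not the matcher's internals: it assumes one-to-one matching where each right record is used at most once, and the function name match_indexed is hypothetical.

```python
from collections import defaultdict


def match_indexed(left, right, match_on):
    """Match on exact keys using a hash index instead of nested loops."""
    # Build the index once: key tuple -> positions of right records.
    index = defaultdict(list)
    for pos, rec in enumerate(right):
        index[tuple(rec.get(f) for f in match_on)].append(pos)

    matched, unmatched_left, used = [], [], set()
    for rec in left:
        key = tuple(rec.get(f) for f in match_on)
        # Take the first right record with this key that is still unused.
        pos = next((p for p in index.get(key, []) if p not in used), None)
        if pos is None:
            unmatched_left.append(rec)
        else:
            used.add(pos)
            matched.append({"a": rec, "b": right[pos]})

    unmatched_right = [r for p, r in enumerate(right) if p not in used]
    return matched, unmatched_left, unmatched_right


invoices = [{"invoice_id": "INV-001"}, {"invoice_id": "INV-002"}, {"invoice_id": "INV-003"}]
payments = [{"invoice_id": "INV-002"}, {"invoice_id": "INV-001"}, {"invoice_id": "INV-777"}]

matched, unmatched_invoices, unmatched_payments = match_indexed(invoices, payments, ["invoice_id"])
print(len(matched), unmatched_invoices, unmatched_payments)
```

Each left record costs a dictionary lookup rather than a scan of the whole right side, which is where the saving on large inputs comes from.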
Matcher as a common front-end step. Many operational workflows place a matcher near the front of the pipeline. Matched records flow into deterministic processing, while exceptions are routed to AI agents or human review. This is a common graduated exception-handling pattern: deterministic rules for clear cases, AI for ambiguous cases, humans for edge cases.