CSV Schemas
CSV is a simple, flat, text-based format. It does not define types, nesting, arrays, or structure.
MAPS supports CSV schemas by providing a minimal configuration that allows CSV rows to be mapped into Typed Events.
This page describes how CSV schemas are defined, how the server interprets them, and how they integrate into the MAPS processing pipeline.
1. Overview
CSV offers:
- flat rows
- string-based values
- optional numeric parsing
- no native schema language
MAPS wraps CSV with a lightweight schema definition so CSV data can be:
- validated (row column count checked against the header count)
- converted to Typed Events
- used in filtering, transformation, and statistics
- converted to other formats (JSON, Avro, Protobuf, etc.)
2. Schema Format (SchemaConfig)
A CSV schema in MAPS is represented as:
{
  "format": "csv",
  "schema": {
    "headerValues": "col1, col2, col3",
    "interpretNumericStrings": true
  }
}
It maps directly to:
@Getter
@Setter
public static final class CsvConfig {
    private String headerValues;              // comma-separated list of column names
    private boolean interpretNumericStrings;  // convert numeric-looking strings into numbers
}
2.1 headerValues
A comma‑separated list of field names.
- Used to name each column in the CSV row.
- Parsed using uniVocity CsvParser.
- Whitespace around commas is ignored unless quoted.
Example:
"name, id, email"
Produces fields:
- name
- id
- email
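As a rough sketch of that step, the header line can be split with the same uniVocity CsvParser; the parseHeaderValues helper below is illustrative only, not part of the MAPS codebase:

import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;

final class HeaderParsingSketch {

    // Hypothetical helper: splits "name, id, email" into ["name", "id", "email"].
    static String[] parseHeaderValues(String headerValues) {
        CsvParserSettings settings = new CsvParserSettings();
        settings.setIgnoreLeadingWhitespaces(true);   // " id"  -> "id"
        settings.setIgnoreTrailingWhitespaces(true);  // "id "  -> "id"
        CsvParser parser = new CsvParser(settings);
        return parser.parseLine(headerValues);        // one header line -> array of field names
    }

    public static void main(String[] args) {
        for (String field : parseHeaderValues("name, id, email")) {
            System.out.println(field);                // prints name, id, email on separate lines
        }
    }
}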
2.2 interpretNumericStrings
Controls whether MAPS attempts to convert CSV strings into numbers:
When true:
- "42" → int 42
- "3.14" → double 3.14
- "00123" → 123
- "1e3" → 1000.0
When false:
- All fields remain strings.
This impacts:
- typed filtering
- schema-to-schema conversions
- statistics accuracy
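MAPS performs this conversion internally; purely as a sketch of the behaviour the flag implies, the interpretValue helper below (hypothetical, not MAPS API) tries integer parsing first, then floating point, and otherwise leaves the value as a string:

// Hypothetical sketch of numeric interpretation, not the actual MAPS implementation.
final class NumericInterpretationSketch {

    static Object interpretValue(String raw, boolean interpretNumericStrings) {
        if (!interpretNumericStrings || raw == null) {
            return raw;                        // leave the field as a plain string
        }
        try {
            return Integer.parseInt(raw);      // "42" -> 42, "00123" -> 123
        } catch (NumberFormatException ignored) { }
        try {
            return Double.parseDouble(raw);    // "3.14" -> 3.14, "1e3" -> 1000.0
        } catch (NumberFormatException ignored) { }
        return raw;                            // not numeric, keep the string
    }
}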
3. Typed Event Mapping
Each CSV row becomes a Typed Event.
Example CSV:
alice, 1001, [email protected]
With schema:
{
  "headerValues": "name, id, email",
  "interpretNumericStrings": true
}
Typed Event:
{
  "name": "alice",
  "id": 1001,
  "email": "[email protected]"
}
CSV supports no nested structures.
Every field is top-level.
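Putting the two previous sketches together, one row can be mapped onto the header names to produce the flat structure shown above; again, the class and method names here are illustrative rather than MAPS API:

import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;

import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: maps one CSV row onto the configured header names.
final class RowMappingSketch {

    static Map<String, Object> mapRow(String[] headers, String row, boolean interpretNumericStrings) {
        String[] values = new CsvParser(new CsvParserSettings()).parseLine(row);

        Map<String, Object> event = new LinkedHashMap<>();
        for (int i = 0; i < headers.length; i++) {
            String raw = i < values.length ? values[i] : null;   // missing columns become null
            event.put(headers[i], NumericInterpretationSketch.interpretValue(raw, interpretNumericStrings));
        }
        return event;   // e.g. {name=alice, id=1001, email=...}
    }
}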
4. Example CSV SchemaConfig
{
  "versionId": "1",
  "name": "Temperature CSV",
  "description": "BME688 sensor CSV data",
  "labels": {
    "uniqueId": "3a0d7bc0-9d6c-4c2d-a67f-e37d70f0cafe",
    "resource": "sensor",
    "interface": "sensor.bme688.csv"
  },
  "format": "csv",
  "schema": {
    "headerValues": "timestamp, temperature, humidity",
    "interpretNumericStrings": true
  }
}
5. Limitations of CSV Schemas
CSV is intentionally simple, but this means:
- ❌ No nested objects
- ❌ No arrays
- ❌ No enums
- ❌ No type constraints (min/max, regex, etc.)
- ❌ No binary fields
- ❌ No native timestamp type (timestamps are carried as strings)
CSV schemas cannot express the richness of other formats.
They are best used for logging, integration with legacy systems, or simple time-series feeds.
6. Integration with MAPS
After parsing and typing, CSV-derived Typed Events work seamlessly with:
- filtering
- statistics
- transformations
- schema-to-schema conversions
- multi-protocol publishing
7. Example Usage Flow
7.1 Ingest CSV via MQTT
Topic:
sensors/bme688/csv
Payload:
2025-01-01T10:00:00Z,20.1,45
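A producer can publish that row with any standard MQTT client. The sketch below uses the Eclipse Paho Java client; the broker URL and QoS level are assumptions for illustration, not MAPS defaults:

import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttMessage;

// Sketch: publish one CSV row to the topic above using Eclipse Paho.
// Broker URL (tcp://localhost:1883) and QoS 1 are assumptions.
public final class CsvPublishSketch {

    public static void main(String[] args) throws Exception {
        MqttClient client = new MqttClient("tcp://localhost:1883", MqttClient.generateClientId());
        client.connect();

        MqttMessage message = new MqttMessage("2025-01-01T10:00:00Z,20.1,45".getBytes());
        message.setQos(1);
        client.publish("sensors/bme688/csv", message);

        client.disconnect();
    }
}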
7.2 MAPS applies schema and produces Typed Event
{
  "timestamp": "2025-01-01T10:00:00Z",
  "temperature": 20.1,
  "humidity": 45
}
7.3 Transform to JSON
MAPS can automatically re‑encode into JSON, Avro, Protobuf, etc.
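This re-encoding happens inside MAPS. Purely to illustrate the resulting JSON shape, the same flat field map could be serialized with Jackson like this (illustration only, not a statement about MAPS internals):

import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: serializing a flat field map to JSON with Jackson.
final class JsonEncodeSketch {

    public static void main(String[] args) throws Exception {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("timestamp", "2025-01-01T10:00:00Z");
        event.put("temperature", 20.1);
        event.put("humidity", 45);

        String json = new ObjectMapper().writeValueAsString(event);
        System.out.println(json);   // {"timestamp":"2025-01-01T10:00:00Z","temperature":20.1,"humidity":45}
    }
}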
8. Best Practices
- Prefer interpretNumericStrings = true unless you require strict textual fields.
- Avoid leading/trailing spaces in header values.
- Ensure the number of CSV columns matches the header count.
- Use CSV for simple or legacy data, not complex structures.