Avro Schemas

Apache Avro is a compact, row-oriented serialization format designed for high-throughput data systems. MAPS treats Avro as a first-class schema type, with tight integration into the Typed Event pipeline.

1. Format Overview

Avro defines data using a JSON schema and encodes records in a compact binary format.

Key characteristics

Schema stored as JSON, data encoded as binary
Strong typing with support for:
- records, arrays, maps
- enums, unions, fixed, logical types
Well-suited for:
- telemetry streams
- log/event pipelines
- long-lived topic-based data with evolution over time
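
To make the compactness concrete, here is a minimal sketch of how two Avro primitives are laid out on the wire. It assumes the rules in the Avro specification: long values are zig-zag encoded and written as variable-length bytes, double values occupy 8 little-endian bytes, and record fields are written in schema order with no field names. The class and method names (AvroEncodingSketch, encodeReading) are hypothetical and hand-rolled for illustration; in practice the Apache Avro library performs this encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hand-rolled illustration of Avro binary encoding rules; not the Avro library. */
class AvroEncodingSketch {

  /** Avro long: zig-zag encode, then emit 7 bits per byte, low-order group first. */
  static void writeLong(ByteArrayOutputStream out, long value) {
    long zigZag = (value << 1) ^ (value >> 63);
    while ((zigZag & ~0x7FL) != 0) {
      out.write((int) ((zigZag & 0x7F) | 0x80)); // high bit marks "more bytes follow"
      zigZag >>>= 7;
    }
    out.write((int) zigZag);
  }

  /** Avro double: 8 bytes, IEEE 754, little-endian. */
  static void writeDouble(ByteArrayOutputStream out, double value) {
    byte[] bytes = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putDouble(value).array();
    out.write(bytes, 0, bytes.length);
  }

  /** Encodes a hypothetical two-field record (temperature: double, timestamp: long). */
  static byte[] encodeReading(double temperature, long timestampMillis) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writeDouble(out, temperature);   // fields are written in schema order
    writeLong(out, timestampMillis); // no field names or type tags are written
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] payload = encodeReading(21.5, 1_700_000_000_000L);
    System.out.println("Encoded size in bytes: " + payload.length);
  }
}
```

Because field names live only in the schema, the payload carries values alone, which is why the reader must have the schema available to decode it.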

Why use Avro in MAPS?

Efficient binary encoding
Built-in schema evolution features (defaults, aliases, unions)
Good fit for high-volume IoT and analytics streams
Plays well with downstream big-data / lake / warehouse tooling

2. SchemaConfig for Avro

All Avro schemas in MAPS are stored as a SchemaConfig:

format must be "avro".
schema holds the Avro JSON schema.
schemaBase64 is typically null for Avro.
labels carry routing and discovery metadata (including CoAP interface/resource when exposed over CoAP).

2.1 Required fields for Avro

At the SchemaConfig level:

format → "avro"
name → logical schema name
versionId → logical schema version
schema → valid Avro JSON schema
labels.matchExpression → regex mapping topics to this schema
labels.uniqueId → stable schema identifier
labels.interface → optional: CoAP if value, if exposed via CoAP
labels.resource → optional: CoAP rt value if exposed via CoAP
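
The field list above can be checked mechanically before a schema is registered. The following is a minimal sketch, assuming the SchemaConfig has already been parsed into nested java.util.Map values. AvroSchemaConfigCheck and missingFields are hypothetical names and are not part of the MAPS API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-registration check for the required Avro SchemaConfig fields. */
class AvroSchemaConfigCheck {

  /** Returns the names of required fields that are absent or, for format, not "avro". */
  static List<String> missingFields(Map<String, ?> config) {
    List<String> missing = new ArrayList<>();
    if (!"avro".equals(config.get("format"))) {
      missing.add("format");
    }
    for (String key : List.of("name", "versionId", "schema")) {
      if (config.get(key) == null) {
        missing.add(key);
      }
    }
    // labels.interface and labels.resource are optional, so only two labels are checked.
    Object labels = config.get("labels");
    Map<?, ?> labelMap = labels instanceof Map ? (Map<?, ?>) labels : Map.of();
    for (String key : List.of("matchExpression", "uniqueId")) {
      if (labelMap.get(key) == null) {
        missing.add("labels." + key);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    Map<String, Object> incomplete = Map.of("format", "avro", "name", "BME688-Avro");
    System.out.println("Missing: " + missingFields(incomplete));
  }
}
```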

3. Example Avro SchemaConfig (BME688)

Below is an example Avro-based SchemaConfig for the BME688 sensor payload.

{
  "versionId": "1",
  "name": "BME688-Avro",
  "description": "BME688 VOC, pressure, temperature and humidity telemetry (Avro-encoded)",
  "labels": {
    "comments": "I2C device BME688 VOC, Pressure, Temperature and Humidity Sensor",
    "uniqueId": "b1dc43de-4c9b-5d86-9425-cf958eeb598d",
    "resource": "sensor",
    "interface": "sensor.bme688"
  },
  "format": "avro",
  "schema": {
    "type": "record",
    "name": "BME688Reading",
    "namespace": "io.mapsmessaging.sensors",
    "fields": [
      {
        "name": "temperature",
        "type": "double",
        "doc": "Unit: °C, range -40.0 to 85.0"
      },
      {
        "name": "humidity",
        "type": "double",
        "doc": "Unit: %RH, range 10.0 to 90.0"
      },
      {
        "name": "pressure",
        "type": "double",
        "doc": "Unit: hPa, range 300.0 to 1100.0"
      },
      {
        "name": "gas",
        "type": "double",
        "doc": "Unit: Ω, range 0.0 to 65535.0"
      },
      {
        "name": "heaterStatus",
        "type": "string"
      },
      {
        "name": "gasMode",
        "type": "string"
      },
      {
        "name": "dewPoint",
        "type": "double",
        "doc": "Unit: °C, range -50.0 to 100.0"
      },
      {
        "name": "condensationRisk",
        "type": "double",
        "doc": "Risk score in [0.0, 1.0]"
      },
      {
        "name": "timestamp",
        "type": {
          "type": "long",
          "logicalType": "timestamp-millis"
        },
        "doc": "Event time, epoch millis"
      }
    ]
  }
}

Notes:

The Avro schema sits directly in schema as standard Avro JSON.
timestamp uses Avro's logicalType: "timestamp-millis" to align with MAPS' normalised time handling.
Ranges and units are carried in the Avro doc field.
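
As a small illustration of the timestamp note, an Avro timestamp-millis value is a long counting milliseconds since the Unix epoch (UTC), so it maps directly onto java.time.Instant. The sketch below uses only the JDK; TimestampMillisSketch is a hypothetical helper and says nothing about how MAPS normalises time internally.

```java
import java.time.Instant;

/** Hypothetical helper showing the long <-> Instant mapping for timestamp-millis. */
class TimestampMillisSketch {

  /** Value to place in a timestamp-millis field. */
  static long toTimestampMillis(Instant eventTime) {
    return eventTime.toEpochMilli();
  }

  /** Interprets a decoded timestamp-millis field as an instant on the UTC timeline. */
  static Instant fromTimestampMillis(long epochMillis) {
    return Instant.ofEpochMilli(epochMillis);
  }

  public static void main(String[] args) {
    long encoded = toTimestampMillis(Instant.now());
    System.out.println(encoded + " -> " + fromTimestampMillis(encoded));
  }
}
```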

4. How MAPS Uses Avro Schemas

At runtime, MAPS:

Resolves the SchemaConfig by topic via matchExpression / bindings.
Loads the Avro JSON schema from schema.
Uses the Avro schema to decode binary Avro payloads into a Typed Event:
- field names and types come from the Avro schema
- logical types (like timestamps) are normalised internally
The Typed Event flows through:
- filtering
- transformations
- statistics
- format conversion (e.g. Avro → JSON / Protobuf / CBOR)
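
The first step, resolving a schema from a topic, can be sketched with a plain regex lookup. This is an illustration only: TopicSchemaResolver is a hypothetical class, the topic names and expression are made up, and both full-string matching and first-match-wins ordering are assumptions rather than documented MAPS behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical lookup from topic name to schema name via matchExpression regexes. */
class TopicSchemaResolver {

  private final List<Map.Entry<Pattern, String>> bindings = new ArrayList<>();

  /** Registers a schema name under the regex taken from labels.matchExpression. */
  void register(String matchExpression, String schemaName) {
    bindings.add(Map.entry(Pattern.compile(matchExpression), schemaName));
  }

  /** Returns the first registered schema whose expression matches the whole topic. */
  Optional<String> resolve(String topic) {
    for (Map.Entry<Pattern, String> binding : bindings) {
      if (binding.getKey().matcher(topic).matches()) {
        return Optional.of(binding.getValue());
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    TopicSchemaResolver resolver = new TopicSchemaResolver();
    resolver.register("sensors/bme688/.*", "BME688-Avro");
    System.out.println(resolver.resolve("sensors/bme688/kitchen"));
  }
}
```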

Schema evolution rules defined at the Avro level (e.g. added fields with defaults) are respected when decoding.
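
The effect of a default on decoding can be shown with a simplified model of Avro's record resolution rules: a field present in the reader schema but absent from the written data takes the reader's default, and a written field unknown to the reader is ignored. The sketch below models records as maps and is not the Avro library's resolver; SchemaResolutionSketch and its parameters are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified, map-based model of Avro reader/writer record resolution. */
class SchemaResolutionSketch {

  static Map<String, Object> resolve(Map<String, ?> written,
                                     List<String> readerFields,
                                     Map<String, ?> readerDefaults) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (String field : readerFields) {
      if (written.containsKey(field)) {
        result.put(field, written.get(field));        // value supplied by the writer
      } else if (readerDefaults.containsKey(field)) {
        result.put(field, readerDefaults.get(field)); // added field: fall back to default
      } else {
        throw new IllegalStateException("No value or default for field: " + field);
      }
    }
    return result; // written fields unknown to the reader are dropped
  }

  public static void main(String[] args) {
    Map<String, Object> oldRecord = new LinkedHashMap<>();
    oldRecord.put("temperature", 21.5);
    // "altitude" is a hypothetical field added in a newer reader schema with default 0.0.
    System.out.println(resolve(oldRecord, List.of("temperature", "altitude"), Map.of("altitude", 0.0)));
  }
}
```

A reader field with neither a written value nor a default is an error in this model, which is why new fields should always be given defaults.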

5. Warnings & Best Practices

Keep namespace stable; it forms part of the Avro type identity.
Prefer double for sensor telemetry to avoid unnecessary rounding artefacts.
Use Avro logical types where appropriate:
- timestamp-millis / timestamp-micros for event time
- date for date-only values
When changing schemas:
- add fields with sensible defaults
- avoid incompatible type changes
- use aliases when renaming fields
Only use schemaBase64 for Avro if you truly need to store a compiled/binary representation; otherwise keep the canonical form as Avro JSON in schema.
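
The recommendation to prefer double can be illustrated with plain Java: a value that passes through a 32-bit float field generally comes back slightly different from the double that was measured. The class name and sample value below are hypothetical.

```java
/** Shows the rounding introduced when a double reading is stored as an Avro float. */
class FloatPrecisionSketch {

  /** Simulates writing a reading to a float field and reading it back as a double. */
  static double roundTripThroughFloat(double value) {
    return (double) (float) value;
  }

  public static void main(String[] args) {
    double reading = 21.37; // hypothetical temperature in °C
    double restored = roundTripThroughFloat(reading);
    System.out.println("original=" + reading + " restored=" + restored
        + " identical=" + (reading == restored));
  }
}
```

The difference is small, but it can surface in equality checks, aggregates, and format conversions further down the pipeline.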

6. Example

This Java example loads an Avro schema from a file and constructs an AvroSchemaConfig from it.

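import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;
import com.google.gson.JsonElement;
import com.google.gson.JsonParser;
// Imports for the MAPS classes (AvroSchemaConfig, UuidGenerator, NamedVersions) are omitted here.

// Loads ./src/main/avro/<name>.avsc and wraps it in an AvroSchemaConfig.
// Note: `uuid` is not declared in this snippet; it is presumably a namespace UUID
// supplied by the enclosing class.
public static AvroSchemaConfig getAvroSchema(String name, String title, String description, String matcher, String type) throws IOException {
  File file = new File("./src/main/avro/" + name + ".avsc");

  // Read the whole .avsc file as UTF-8 text.
  String schemaJson;
  try (InputStream is = new FileInputStream(file)) {
    schemaJson = new String(is.readAllBytes(), StandardCharsets.UTF_8);
  }

  // Derive a name-based (SHA-1) id from the file path; fall back to a generated id.
  UUID schemaId;
  try {
    schemaId = UuidGenerator.getInstance().generate(NamedVersions.SHA1, uuid, file.getAbsolutePath());
  } catch (NoSuchAlgorithmException e) {
    e.printStackTrace();
    schemaId = UuidGenerator.getInstance().generate();
  }

  // Parse the Avro JSON schema and populate the config.
  JsonElement element = JsonParser.parseString(schemaJson);
  AvroSchemaConfig config = new AvroSchemaConfig();
  config.setSchema(element.getAsJsonObject());
  config.setComments(description);
  config.setTitle(title);
  config.setVersion(1);
  config.setMatchExpression(matcher);
  config.setUniqueId(schemaId);
  config.setResourceType(type);
  return config;
}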

Example of a file called ballast.avsc

{
  "type": "record",
  "name": "BallastTelemetry",
  "namespace": "io.mapsmessaging.ship",
  "fields": [
    { "name": "fore_tank_level", "type": "float" },
    { "name": "aft_tank_level", "type": "float" },
    { "name": "stbd_tank_level", "type": "float" },
    { "name": "port_tank_level", "type": "float" }
  ]
}

Example of a file called cargo.avsc

{
  "type": "record",
  "name": "CargoMonitorTelemetry",
  "namespace": "io.mapsmessaging.ship",
  "fields": [
    { "name": "container_temp", "type": "float" },
    { "name": "humidity", "type": "float" },
    { "name": "shock_detected", "type": "boolean" }
  ]
}

Example of a file called engine-room.avsc

{
  "type": "record",
  "name": "EngineRoomTelemetry",
  "namespace": "io.mapsmessaging.ship",
  "fields": [
    { "name": "rpm", "type": "int" },
    { "name": "oil_pressure", "type": "float" },
    { "name": "temperature", "type": "float" }
  ]
}
