Filtering Overview

MAPS Filtering lets you select, transform, and route events using a concise expression language.
It follows JMS selector semantics with extensions for JSON helpers and ML model scoring.


Standards & Compatibility

MAPS filtering is compatible with JMS Message Selectors and follows the semantics defined in the JMS specification §3.8.1 “Message Selectors.” Message selectors are a subset of the SQL‑92 conditional expression syntax (X/Open CAE Specification — SQL, Version 2, ISBN 1‑85912‑151‑9, March 1996).

What this means in practice:

  • Operators & precedence follow JMS selector rules (case‑insensitive keywords; string, numeric, boolean literals).
  • Supported predicates include =, <>, <, <=, >, >=, BETWEEN, LIKE [ESCAPE], IN (...), and IS [NOT] NULL.
  • No FROM/SELECT — selectors are expressions, not full SQL statements.
  • MAPS extensions add capabilities beyond JMS: EXTENSION(...) helpers (e.g., JSON), {time} macro, and ML functions.

If you keep to pure JMS selector constructs, your filters will behave per the JMS standard. Adding MAPS extensions is optional and backward‑compatible (they’re parsed only when used).


Where filtering is used

  • On ingress/egress routes (per topic/queue/bridge)
  • At satellite boundaries (pre‑uplink batching, post‑downlink re‑publish)
  • Inter-server connections (links between MAPS servers, or between MAPS and other messaging servers)
  • Inside pipelines (drop, reroute, annotate, alert)

Identifier Resolution

When MAPS evaluates a filter expression, identifiers are resolved against the event in the following order:

  1. Event dictionary lookup
    Many protocols — including MQTT 5, STOMP, and AMQP — carry native event dictionaries (key–value maps).
    If the identifier matches a key in this dictionary, its value is used in the filter.

  2. Schema-based extraction
    If the identifier is not present in the event dictionary, MAPS checks whether the topic or queue has a schema configured.
    If so, MAPS parses the payload using that schema and extracts the corresponding field value.

This layered resolution lets filters operate consistently across protocols and payload formats, without requiring data to be pre-normalized.
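
The two-step order above can be sketched in Python. This is an illustration only, not MAPS code: the function name, the `schema_parser` callable, and the choice to return `None` for an unresolved identifier are all assumptions made for the example.

```python
from typing import Any, Callable, Mapping, Optional

# Hypothetical type: a schema parser turns raw payload bytes into a field map.
SchemaParser = Callable[[bytes], Mapping[str, Any]]


def resolve_identifier(
    name: str,
    event_dictionary: Mapping[str, Any],
    payload: bytes,
    schema_parser: Optional[SchemaParser] = None,
) -> Any:
    """Return the value for `name`, or None when it cannot be resolved."""
    # Step 1: the protocol-level event dictionary wins when the key is present.
    if name in event_dictionary:
        return event_dictionary[name]
    # Step 2: fall back to schema-based extraction, if a schema is configured.
    if schema_parser is not None:
        return schema_parser(payload).get(name)
    # Assumption: an identifier found in neither place behaves like NULL.
    return None
```

With `json.loads` standing in for the schema parser, `resolve_identifier("temp", {"temp": 10}, b'{"temp": 99}', json.loads)` takes the dictionary value, because the dictionary is consulted first.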


Core concepts

  • Event fields (identifiers): dotted paths and array indices are supported, e.g. engine.temp, level1.level2.array[0].name.
  • Case‑insensitive keywords: AND, OR, NOT, TRUE, FALSE, NULL, etc.
  • Strings: single quotes; escape with doubled quotes: 'it''s'.
  • Numbers: integers and floating‑point (93f, 123.4e9).
  • Timestamps: {time} expands to the current time in epoch milliseconds.
  • Null tests: field IS NULL / field IS NOT NULL.
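
To make the dotted-path and array-index notation concrete, here is a minimal Python sketch that walks such a path through a nested structure. The helper name `get_path` is hypothetical, and returning `None` for a missing step is an assumption; the sketch does not describe how MAPS implements lookups.

```python
import re
from typing import Any

# One path step: a name, optionally followed by [index] suffixes.
_STEP = re.compile(r"([^.\[\]]+)((?:\[\d+\])*)")


def get_path(event: Any, path: str) -> Any:
    """Walk a dotted path with optional array indices; None if a step is missing."""
    current = event
    for step in path.split("."):
        match = _STEP.fullmatch(step)
        if match is None:
            raise ValueError(f"malformed path step: {step!r}")
        name, indices = match.group(1), match.group(2)
        if not isinstance(current, dict) or name not in current:
            return None
        current = current[name]
        # Apply each [n] suffix in order, e.g. array[0].
        for index in re.findall(r"\[(\d+)\]", indices):
            if not isinstance(current, list) or int(index) >= len(current):
                return None
            current = current[int(index)]
    return current
```

For an event shaped like `{"level1": {"level2": {"array": [{"name": "test"}]}}}`, the path `level1.level2.array[0].name` resolves to `'test'`.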

Syntax (at a glance)

Literals

'string'   42   3.14   TRUE   FALSE   NULL   {time}

Identifiers

sensorId   engine.temp   metrics[2]   title

Operators (precedence high → low)

Level  Operators / forms                                Notes
1      +x, -x                                           unary plus/minus
2      x * y, x / y                                     multiply / divide
3      x + y, x - y                                     add / subtract
4      x < y, x > y, x <= y, x >= y                     comparisons
5      x = y, x <> y, x BETWEEN a AND b,                equality class
       x LIKE 'pat' [ESCAPE '\'], x IN ('a','b',...),
       x IS [NOT] NULL
6      NOT expr                                         logical NOT
7      expr AND expr                                    logical AND
8      expr OR expr                                     logical OR

Parentheses (...) override precedence.
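
As a rough illustration of how these levels group an unparenthesized expression, the Python sketch below compares an implicit form with its fully parenthesized equivalent. Python happens to rank arithmetic, comparison, not, and, or in the same relative order, so it serves as a stand-in here; it is not the MAPS evaluator, and the variable names are invented for the example.

```python
def implicit(temp: float, humidity: float, armed: bool) -> bool:
    # No parentheses: grouping is decided purely by precedence.
    return temp > 85 or humidity + 10 > 50 and not armed


def explicit(temp: float, humidity: float, armed: bool) -> bool:
    # Same expression with the precedence-driven grouping spelled out.
    return (temp > 85) or (((humidity + 10) > 50) and (not armed))


def left_to_right(temp: float, humidity: float, armed: bool) -> bool:
    # A different grouping, shown only to contrast with the two above.
    return ((temp > 85) or ((humidity + 10) > 50)) and (not armed)
```

`implicit` and `explicit` agree for every input, while `left_to_right` gives a different answer for `temp=90, armed=True`, which is why parentheses are worth adding whenever AND and OR are mixed.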

LIKE & ESCAPE

  • % matches any sequence, _ matches one char: title LIKE 'Thrill%', word LIKE 'l_se'
  • Escape example: underscored LIKE '\_%' ESCAPE '\'
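
The wildcard rules can be made concrete by translating a LIKE pattern into a regular expression. The following Python sketch is one way to do that; the function names are hypothetical and it is not taken from the MAPS implementation.

```python
import re
from typing import Optional


def like_to_regex(pattern: str, escape: Optional[str] = None) -> re.Pattern:
    """Translate a LIKE pattern into a regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape and i + 1 < len(pattern):
            # The escape character makes the next character literal.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")  # any sequence of characters, including none
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def like(value: str, pattern: str, escape: Optional[str] = None) -> bool:
    # fullmatch anchors the pattern at both ends, as LIKE requires.
    return like_to_regex(pattern, escape).fullmatch(value) is not None
```

Under this sketch, `like("Thriller", "Thrill%")` and `like("lose", "l_se")` both match, and `like("_foo", "\\_%", "\\")` matches only because the escaped underscore is treated literally.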

IN (strings)

Country IN ('UK', 'US', 'France')
Country NOT IN ('UK', 'US', 'France')

BETWEEN

releaseYear BETWEEN 1980 AND 1989
17 BETWEEN 16 AND 18
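
In JMS selectors, x BETWEEN a AND b is equivalent to x >= a AND x <= b, so both bounds are inclusive. A small Python sketch of that equivalence (the helper name is hypothetical, and NULL handling is deliberately left out):

```python
def between(x: float, low: float, high: float) -> bool:
    """Inclusive range test: same as x >= low and x <= high."""
    return low <= x <= high
```

So a `releaseYear` of 1980 or 1989 satisfies `releaseYear BETWEEN 1980 AND 1989`, while 1990 does not.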

Extensions

JSON / custom parsers

Call the extension loader to use helper parsers, such as JSON:

EXTENSION('json', 'temperature') > 40
EXTENSION(format, 'temperature') BETWEEN 20 AND 40
  • First argument: the parser name ('json' or an identifier like format).
  • Following arguments: string parameters to the extension.
  • Returns a value that can be compared or composed in expressions.
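
Conceptually, the first example above parses the payload as JSON, pulls out one field, and compares it. The Python sketch below mimics that behaviour; `json_extension` and `matches` are hypothetical names, and treating unparsable payloads or missing keys as "no match" is an assumption rather than documented MAPS behaviour.

```python
import json
from typing import Any


def json_extension(payload: bytes, key: str) -> Any:
    """Hypothetical stand-in for EXTENSION('json', key): parse and extract one field."""
    try:
        document = json.loads(payload)
    except ValueError:
        # Covers malformed JSON and undecodable bytes (assumed to yield no value).
        return None
    if isinstance(document, dict):
        return document.get(key)
    return None


def matches(payload: bytes) -> bool:
    # Mirrors: EXTENSION('json', 'temperature') > 40
    value = json_extension(payload, "temperature")
    return isinstance(value, (int, float)) and value > 40
```

For instance, `matches(b'{"temperature": 41.5}')` is true, while a payload without a `temperature` key does not match.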

Time macro

  • {time} injects the current epoch milliseconds, useful for age tests.
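
An age test compares the current time against a timestamp carried by the event. The Python sketch below shows the arithmetic involved; the function and parameter names are hypothetical, and it assumes the event stores its timestamp in epoch milliseconds. A corresponding filter might look something like `{time} - timestamp > 60000`, though how `{time}` composes with arithmetic should be checked against your MAPS version.

```python
import time
from typing import Optional


def is_older_than(
    event_millis: int,
    max_age_millis: int,
    now_millis: Optional[int] = None,
) -> bool:
    """True when the event is older than max_age_millis."""
    if now_millis is None:
        # Current epoch milliseconds: the value {time} stands for.
        now_millis = int(time.time() * 1000)
    return now_millis - event_millis > max_age_millis
```

Passing `now_millis` explicitly keeps the check deterministic in tests; omit it to compare against the wall clock.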

Machine Learning filters

You can invoke ML models directly in filters. The general form is:

ML_FUNCTION(arg1, arg2, feature1, feature2, ...)
<comparison> <threshold>

Note: Arguments are identifiers (not quoted strings). Model names like model_rf.arff are valid identifiers.

Supported functions (tokens)

TensorFlow, k-means, g-means, x-means, k-means_lloyd,
linear_regression, ols, ridge, lasso, decision_tree, naive_bayes,
hierarchical, pca, pca_fit, pca_cor, random_forest, logistic_regression,
mlp, qda, lda, isolation_forest, knn, model_exists

Common patterns

  • Model presence
    model_exists(example_model.arff) = TRUE
  • Clustering (distance)
    k-means(distance, model_kmeans.arff) > 1.0
    k-means(distance, model_kmeans.arff, temp, humidity) > 1.0
  • Regression (predict)
    ols(predict, model_ols.arff, temp, humidity) < 30.0
    ridge(predict, model_ridge.arff, temp, humidity) < 30.0
    lasso(predict, model_lasso.arff, temp, humidity) < 30.0
  • Classification
    decision_tree(classify, model_dt.arff, temp, humidity) = 1
    naive_bayes(classify, model_nb.arff, temp, humidity) = 1
    random_forest(classify, model_rf.arff, temp, humidity) = 1
    logistic_regression(classify, model_logreg.arff, temp, humidity) = 1
    logistic_regression(classifyprob, model_logreg.arff, temp, humidity) > 0.9
    mlp(predict, model_mlp.arff, temp, humidity) = 1
    mlp(predictprob, model_mlp.arff, temp, humidity) > 0.9
    qda(predict, model_qda.arff, temp, humidity) = 1
    qda(predictprob, model_qda.arff, temp, humidity) > 0.9
    lda(predict, model_lda.arff, temp, humidity) = 1
    lda(predictprob, model_lda.arff, temp, humidity) > 0.9
    knn(classify, model_knn.arff, temp, humidity) = 1
  • Anomaly detection
    isolation_forest(score, model_iso.arff) > 1.2
    isolation_forest(score, model_iso.arff, temp, humidity) > 1.2
    isolation_forest(is_anomaly, model_iso.arff) = 1
    isolation_forest(is_anomaly, model_iso.arff, temp, humidity) = 1
  • PCA (explained variance)
    pca_fit(explainedvariance[1], model_pca_fit.arff) > 0.7
    pca_fit(explainedvariance[2], model_pca_fit.arff, temp, humidity) > 0.7
    pca_cor(explainedvariance[3], model_pca_cor.arff) > 0.7
    pca_cor(explainedvariance[4], model_pca_cor.arff, temp, humidity) > 0.7
  • TensorFlow
    TensorFlow(sensor_safety_model, temp, humidity, co2) < 10

Tip: Combine ML with logical operators and thresholds:

( isolation_forest(is_anomaly, model_iso.arff, temp, vib) = 1 )
AND ( engine.temp > 85 )

Examples

  • Booleans & arithmetic
    TRUE
    (3+3)*2 = 12
    20 BETWEEN 10 / 2 AND 30 - 8
  • Strings & LIKE
    title = 'Sam''s'
    title LIKE 'Thrill%'
    word LIKE 'l_se'
  • IN / NOT IN
    'WORKS' IN ('no working','working','WORKS')
    Country NOT IN ('UK', 'US', 'France')
  • Paths & arrays
    level1.level2.array[0].name = 'test'
    pca_fit(explainedvariance[1], model.arff) > 0.7
  • Nulls
    releaseYear IS NOT NULL

Actions (typical usage)

  • drop (do not forward)
  • forward / route to another topic or protocol
  • annotate (add fields like anomaly=true, score=...)
  • alert (raise events to monitoring/ops)

The exact action syntax depends on where the filter is attached (route/listener/bridge).


Performance & diagnostics

  • Keep expressions simple; avoid deep nesting on hot paths.
  • Prefer batching and deduplication before satellite uplink.
  • Use logs/metrics to troubleshoot: parse errors, evaluation time, match counts.