Model Introduction

AI Models

An AI model in Koios runs machine learning inference on live data from your devices. Models read input values from tags, feed them through a pre-trained neural network (ONNX or TensorFlow Lite), and write predictions back to output tags — all in real time, at a configurable scan rate.

How Models Work

On every scan, a model follows this cycle:

  1. Collect — read current and recent historical values from all input tags
  2. Calibrate — apply per-binding gain and bias for sensor drift or unit corrections (no-op at defaults)
  3. Normalize — scale each input to the range the model was trained on
  4. Infer — run the model file to produce predictions
  5. Denormalize — scale predictions back to real-world units
  6. Write — apply inverse calibration and push predictions to output tags

This repeats at the model's scan rate (e.g. every 1s, 5s, or 30s). Models can also run in on-demand mode for synchronized device reads/writes, or be grouped in a scan group for batched execution.
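
The six steps can be sketched in a few lines of Python. This is an illustration only, not the Koios implementation: the `Binding` class, `run_cycle`, and the stand-in `infer` callable are hypothetical names, calibration is assumed to take the form `gain * v + bias`, and every binding is assumed to use Min-Max normalization.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Binding:
    """Hypothetical binding: calibration parameters plus a Min-Max range."""
    gain: float = 1.0
    bias: float = 0.0
    range_min: float = 0.0
    range_max: float = 1.0

    def calibrate(self, v: float) -> float:
        return self.gain * v + self.bias        # assumed linear form

    def uncalibrate(self, v: float) -> float:
        return (v - self.bias) / self.gain      # inverse of calibrate

    def normalize(self, v: float) -> float:
        return (v - self.range_min) / (self.range_max - self.range_min)

    def denormalize(self, n: float) -> float:
        return n * (self.range_max - self.range_min) + self.range_min


def run_cycle(
    raw_inputs: Sequence[float],
    inputs: Sequence[Binding],
    outputs: Sequence[Binding],
    infer: Callable[[List[float]], List[float]],
) -> List[float]:
    # Steps 1-3: values are already collected; calibrate, then normalize.
    x = [b.normalize(b.calibrate(v)) for b, v in zip(inputs, raw_inputs)]
    # Step 4: run the model (here any callable stands in for the model file).
    y = infer(x)
    # Steps 5-6: denormalize, then undo calibration; the caller writes to tags.
    return [b.uncalibrate(b.denormalize(n)) for b, n in zip(outputs, y)]


inputs = [Binding(range_min=0.0, range_max=100.0)]
outputs = [Binding(range_min=0.0, range_max=50.0)]
result = run_cycle([25.0], inputs, outputs, infer=lambda x: [x[0] * 2])
```

In this toy run the input 25.0 normalizes to 0.25, the stand-in model doubles it to 0.5, and denormalizing against the output range gives 25.0.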

Key Components

1. Model File

The trained model file (ONNX or TFLite) contains the neural network weights and structure. You can upload multiple versions and switch between them without reconfiguring bindings.

| Format | Extension | Description |
| --- | --- | --- |
| ONNX | .onnx | Open Neural Network Exchange; exported from PyTorch, scikit-learn, etc. |
| TFLite | .tflite | TensorFlow Lite; optimized for edge deployment |

2. Bindings

Bindings connect the model's inputs and outputs to tags:

  • Input bindings read values from tags and pass them to the model
  • Output bindings receive predictions from the model and optionally write them to tags

Every input binding must be assigned to a tag. Output bindings can be left unassigned.

3. Normalization

Each binding's normalization controls how values are scaled. It is defined by two settings:

Normalization type — the mathematical formula:

| Type | Formula | Output Range |
| --- | --- | --- |
| None | Passthrough | Raw values |
| Min-Max | (v - min) / (max - min) | [0, 1] |
| Symmetric | 2*(v - min)/(max - min) - 1 | [-1, 1] |
| Z-Score | (v - mean) / std | Unbounded |

Normalization source — where the parameters come from:

| Source | Description |
| --- | --- |
| Tag Range | Uses the tag's configured range_min / range_max |
| Custom | Uses custom values set directly on the binding |

Z-Score always forces Custom source (tags don't have meaningful mean/std values).
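
The four normalization formulas and their inverses (used when denormalizing predictions) can be written out directly. The function names and the string keys for each type below are hypothetical, chosen for illustration; only the formulas come from the table above.

```python
def normalize(v, kind, *, lo=0.0, hi=1.0, mean=0.0, std=1.0):
    """Scale a real-world value using one of the documented formulas."""
    if kind == "none":
        return v
    if kind == "min_max":
        return (v - lo) / (hi - lo)
    if kind == "symmetric":
        return 2 * (v - lo) / (hi - lo) - 1
    if kind == "z_score":
        return (v - mean) / std
    raise ValueError(f"unknown normalization type: {kind}")


def denormalize(n, kind, *, lo=0.0, hi=1.0, mean=0.0, std=1.0):
    """Algebraic inverse of normalize(), back to real-world units."""
    if kind == "none":
        return n
    if kind == "min_max":
        return n * (hi - lo) + lo
    if kind == "symmetric":
        return (n + 1) / 2 * (hi - lo) + lo
    if kind == "z_score":
        return n * std + mean
    raise ValueError(f"unknown normalization type: {kind}")


# A value of 75 on a tag with range 50..150:
min_max = normalize(75.0, "min_max", lo=50.0, hi=150.0)      # 0.25
symmetric = normalize(75.0, "symmetric", lo=50.0, hi=150.0)  # -0.5
# Z-Score needs custom mean/std because tags carry no such statistics:
z = normalize(75.0, "z_score", mean=70.0, std=2.5)           # 2.0
```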

4. Calibration

Each binding can apply a linear gain-and-bias transform on top of the raw value — useful for sensor drift, engineering unit conversion, or fine-tuning a model's response without retraining. Defaults are identity (gain 1.0, bias 0.0), so existing bindings see no change. See Calibration (Gain & Bias) for the full pipeline.
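
As a rough illustration of a gain-and-bias transform, the sketch below assumes the forward form `gain * raw + bias` and its algebraic inverse. The exact form and order of operations Koios uses is described on the Calibration page, so treat the function names and formula here as assumptions.

```python
def calibrate(raw, gain=1.0, bias=0.0):
    """Assumed forward transform applied to an input value."""
    return gain * raw + bias


def uncalibrate(value, gain=1.0, bias=0.0):
    """Assumed inverse transform applied before writing an output."""
    if gain == 0:
        raise ValueError("gain must be non-zero to invert the transform")
    return (value - bias) / gain


# Defaults are the identity, so an unconfigured binding is unchanged.
unchanged = calibrate(21.5)

# A sensor that reads about 2% low with a 0.3 offset, corrected and restored.
corrected = calibrate(98.0, gain=1.02, bias=0.3)
restored = uncalibrate(corrected, gain=1.02, bias=0.3)
```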

5. Configuration

| Setting | Description | Default |
| --- | --- | --- |
| Output Application | Absolute writes the prediction directly; Relative adds the predicted delta to the current tag value | Absolute |
| Output Mode | Continuous maps each output neuron 1:1 to an output binding; Discrete selects from an action map via argmax | Continuous |
| Scan Rate | How often inference runs (seconds) | 1s |
| Sample Rate | Interval between historical samples in the input tensor (seconds) | 1s |
| On-Demand | Request fresh device reads before inference and write outputs afterward | Off |
| On-Demand Timeout | Max wait for fresh reads (seconds) | 3s |
| Memory Only | Store history in process memory instead of the time-series database | Off |
| Scan Group | Assign to a group for synchronized execution | None |
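
The two output settings are easiest to see side by side. The sketch below is illustrative: `apply_output` and `select_action` are hypothetical helpers, and the action map is assumed to be a simple list indexed by output neuron, which may not match how Koios stores it.

```python
def apply_output(prediction, current, application="absolute"):
    """Return the value to write to the output tag."""
    if application == "absolute":
        return prediction                # write the prediction as-is
    if application == "relative":
        return current + prediction      # prediction is a delta
    raise ValueError(f"unknown output application: {application}")


def select_action(outputs, action_map):
    """Discrete mode: pick the action whose output neuron is largest."""
    best = max(range(len(outputs)), key=outputs.__getitem__)  # argmax
    return action_map[best]


absolute_value = apply_output(2.5, current=40.0)                          # 2.5
relative_value = apply_output(2.5, current=40.0, application="relative")  # 42.5
action = select_action([0.1, 0.7, 0.2], ["hold", "open", "close"])        # "open"
```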

Memory Only Mode

When enabled, the model stores its input history in an in-memory buffer instead of querying the time-series database. This eliminates the database read/write round-trip on every cycle, enabling ultra-low-latency inference for fast control loops.

How it works: The predict engine maintains a rolling buffer per input tag, appending one sample per scan cycle from the live data cache. The model reads from this buffer instead of the database.

Trade-offs:

| Aspect | Standard | Memory Only |
| --- | --- | --- |
| Input history source | Time-series database | In-memory buffer |
| Execution metrics | Full (charts, missed scans) | Avg cycle duration only |
| Data on restart | Persisted | Lost; model warms up from zero |
| Latency | Database query per cycle | Near-zero (cache + memory) |

Requirements:

  • Requires On-Demand to be enabled (the predict engine must actively pull fresh reads)
  • Cannot be in a Scan Group

Warmup: After a restart, the buffer is empty. The model shows a "Memory buffer warming up" message until the buffer covers the full input window (input_depth samples spaced sample_rate apart, i.e. input_depth * sample_rate seconds). During warmup, inference does not run.
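
A rolling buffer with a warmup check can be sketched with `collections.deque`. The `RollingBuffer` class is hypothetical, and the sketch assumes the scan rate equals the sample rate, so that one appended sample per scan fills the window after `input_depth` scans.

```python
from collections import deque


class RollingBuffer:
    """Hypothetical per-tag history buffer for Memory Only mode."""

    def __init__(self, input_depth):
        # maxlen makes the deque drop the oldest sample automatically.
        self.samples = deque(maxlen=input_depth)

    def append(self, value):
        self.samples.append(value)

    @property
    def warmed_up(self):
        return len(self.samples) == self.samples.maxlen

    def window(self):
        """Return the input window, or None while still warming up."""
        if not self.warmed_up:
            return None
        return list(self.samples)


buf = RollingBuffer(input_depth=3)
buf.append(1.0)
buf.append(2.0)
during_warmup = buf.window()   # None: only 2 of 3 samples collected
buf.append(3.0)
first_window = buf.window()    # [1.0, 2.0, 3.0]
buf.append(4.0)
next_window = buf.window()     # [2.0, 3.0, 4.0]
```

Because the buffer lives in process memory, a restart creates a fresh, empty instance, which is why the model has to warm up again.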

Scan Groups

A scan group runs multiple models together on a shared schedule. When on-demand is enabled, all member models' reads and writes are combined into a single network request per device — reducing I/O on slow networks.

See Scan Groups for details.
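
To illustrate the batching idea, the sketch below merges the input tags of several models into one read list per device. The data shapes and the `batch_reads` helper are assumptions made for this example, as is the deduplication of tags shared between models; the Scan Groups page describes the actual behavior.

```python
from collections import defaultdict


def batch_reads(models):
    """Combine per-model tag lists into one sorted read list per device.

    `models` maps a model name to a list of (device, tag) pairs.
    """
    per_device = defaultdict(set)
    for tags in models.values():
        for device, tag in tags:
            per_device[device].add(tag)   # shared tags are read once
    return {device: sorted(tags) for device, tags in per_device.items()}


group = {
    "model_a": [("plc1", "temp"), ("plc1", "flow")],
    "model_b": [("plc1", "temp"), ("plc2", "level")],
}
requests = batch_reads(group)
# {"plc1": ["flow", "temp"], "plc2": ["level"]}
```

Here two models that would otherwise issue their own reads end up with one request for `plc1` and one for `plc2`.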

Input Depth and Historical Data

Models typically need a window of historical data, defined by:

  • Input depth — number of historical samples (read from the model file)
  • Sample rate — time interval between each sample

For example, input_depth=10 at sample_rate=0.5s needs the last 5 seconds of data.
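
The window arithmetic is simple enough to check by hand. In the sketch below, `window_seconds` and `sample_offsets` are hypothetical helpers, and placing the newest sample at offset 0 (the current scan) is an assumption.

```python
def window_seconds(input_depth, sample_rate):
    """Length of history the model needs, in seconds."""
    return input_depth * sample_rate


def sample_offsets(input_depth, sample_rate):
    """Seconds before 'now' for each sample, oldest first."""
    return [i * sample_rate for i in range(input_depth - 1, -1, -1)]


needed = window_seconds(10, 0.5)    # 5.0 seconds of history
offsets = sample_offsets(4, 0.5)    # [1.5, 1.0, 0.5, 0.0]
```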

Model Status

| Status | Meaning |
| --- | --- |
| Running | Actively making predictions at its scan rate |
| Stopped | Disabled or not started |
| Failed | Error during inference; check the error code and message |

Each binding also has its own status. A binding can fail if the bound tag is disabled, there isn't enough historical data, or the value is outside the normalization range.

Model Lifecycle

  1. Create — name, output application, output mode, scan rate
  2. Upload a model file — ONNX or TFLite on the Files tab
  3. Assign bindings — map inputs and outputs to tags on the Bindings tab
  4. Initialize history — backfill data if input tags are new
  5. Enable — activate real-time inference

What You See on a Model

Model List

Table with status, name, input/output counts, and timestamps. Supports filtering, search, and bulk actions (enable, disable, delete, export).

Model Detail

| Tab | Content |
| --- | --- |
| Overview | Live status, last prediction, scan progress, active file info, tensor chart, recent events |
| Files | Upload, manage, and switch model file versions |
| Bindings | Configure input/output bindings, normalization, failure detection |
| Configuration | Name, description, output settings, scan rate, sample rate, advanced settings |
| Execution | Cycle timing charts and performance metrics |
| Parameters | Read-only table of all model fields |
| Logs | Real-time predict engine log viewer |
| Cross References | Tags, components, and other entities referencing this model |
