AI Models
An AI model in Koios runs machine learning inference on live data from your devices. Models read input values from tags, feed them through a pre-trained neural network (ONNX or TensorFlow Lite), and write predictions back to output tags — all in real time, at a configurable scan rate.
How Models Work
On every scan, a model follows this cycle:
- Collect — read current and recent historical values from all input tags
- Calibrate — apply per-binding gain and bias for sensor drift or unit corrections (no-op at defaults)
- Normalize — scale each input to the range the model was trained on
- Infer — run the model file to produce predictions
- Denormalize — scale predictions back to real-world units
- Write — apply inverse calibration and push predictions to output tags
This repeats at the model's scan rate (e.g. every 1s, 5s, or 30s). Models can also run in on-demand mode for synchronized device reads/writes, or be grouped in a scan group for batched execution.
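The cycle above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Koios implementation: the `BindingConfig` class, the range-based scaling, the form of the inverse calibration, and the `infer` callable are all hypothetical, and the Collect step is reduced to a single current value per input.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class BindingConfig:
    """Hypothetical per-binding settings: calibration plus a scaling range."""
    gain: float = 1.0   # calibration gain (identity by default)
    bias: float = 0.0   # calibration bias (identity by default)
    lo: float = 0.0     # lower bound of the assumed training range
    hi: float = 1.0     # upper bound of the assumed training range


def run_scan(
    raw_inputs: Sequence[float],
    in_bindings: Sequence[BindingConfig],
    out_bindings: Sequence[BindingConfig],
    infer: Callable[[List[float]], List[float]],
) -> List[float]:
    """Run one scan: calibrate, normalize, infer, denormalize, un-calibrate."""
    # Calibrate: linear gain-and-bias on each collected value.
    calibrated = [b.gain * x + b.bias for x, b in zip(raw_inputs, in_bindings)]
    # Normalize: map each value onto the 0..1 range (assumed range-based scaling).
    normalized = [(x - b.lo) / (b.hi - b.lo) for x, b in zip(calibrated, in_bindings)]
    # Infer: the model file is stood in for by any callable.
    predictions = infer(normalized)
    # Denormalize: map predictions back to real-world units.
    real = [p * (b.hi - b.lo) + b.lo for p, b in zip(predictions, out_bindings)]
    # Write: apply the assumed inverse calibration; the caller pushes to tags.
    return [(y - b.bias) / b.gain for y, b in zip(real, out_bindings)]
```

In a real deployment the `infer` callable would wrap the uploaded ONNX or TFLite file; here any function that maps a list of floats to a list of floats will do.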
Key Components
1. Model File
The trained model file (ONNX or TFLite) contains the neural network weights and structure. You can upload multiple versions and switch between them without reconfiguring bindings.
2. Bindings
Bindings connect the model's inputs and outputs to tags:
- Input bindings read values from tags and pass them to the model
- Output bindings receive predictions from the model and optionally write them to tags
Every input binding must be assigned to a tag. Output bindings can be left unassigned.
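The assignment rule can be expressed as a simple pre-flight check. The sketch below is illustrative only; the `Binding` class and its fields are hypothetical names, not part of a Koios API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Binding:
    """Hypothetical binding record."""
    name: str
    direction: str              # "input" or "output"
    tag: Optional[str] = None   # None means no tag is assigned


def unassigned_inputs(bindings: List[Binding]) -> List[str]:
    """Return the names of input bindings that still lack a tag.

    Output bindings are ignored because they may stay unassigned.
    """
    return [b.name for b in bindings if b.direction == "input" and b.tag is None]
```

An empty result means every input binding has a tag; any names returned would need a tag before the model could run.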
3. Normalization
Each binding has normalization settings that control how values are scaled. Two settings work together:
Normalization type — the mathematical formula:
Normalization source — where the parameters come from:
Z-Score always uses the Custom source, because tags don't carry meaningful mean or standard deviation values.
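To make the scaling concrete, the sketch below shows Z-Score scaling next to a range-based (min-max) scaling, each with its inverse for the denormalize step. These are the standard textbook formulas; that Koios uses exactly these forms is an assumption, and the function names are hypothetical.

```python
def minmax_normalize(x: float, lo: float, hi: float) -> float:
    """Map x from the range [lo, hi] onto [0, 1]."""
    return (x - lo) / (hi - lo)


def minmax_denormalize(n: float, lo: float, hi: float) -> float:
    """Inverse of minmax_normalize."""
    return n * (hi - lo) + lo


def zscore_normalize(x: float, mean: float, std: float) -> float:
    """Express x as a number of standard deviations from the mean."""
    return (x - mean) / std


def zscore_denormalize(z: float, mean: float, std: float) -> float:
    """Inverse of zscore_normalize."""
    return z * std + mean
```

The Z-Score functions need a mean and a standard deviation taken from the training data, which is why those parameters have to be supplied by hand rather than read from a tag.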
4. Calibration
Each binding can apply a linear gain-and-bias transform on top of the raw value — useful for sensor drift, engineering unit conversion, or fine-tuning a model's response without retraining. Defaults are identity (gain 1.0, bias 0.0), so existing bindings see no change. See Calibration (Gain & Bias) for the full pipeline.
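As a small worked example, a gain-and-bias transform can convert units on the way in and undo the conversion on the way out. The function names are hypothetical, and the inverse form `(value - bias) / gain` is an assumption about how the output side reverses the transform.

```python
def calibrate(raw: float, gain: float = 1.0, bias: float = 0.0) -> float:
    """Apply the linear transform gain * raw + bias (identity at defaults)."""
    return gain * raw + bias


def uncalibrate(value: float, gain: float = 1.0, bias: float = 0.0) -> float:
    """Assumed inverse of calibrate, used on the output side."""
    return (value - bias) / gain


# Example: a sensor reports Celsius but the model was trained on Fahrenheit.
fahrenheit = calibrate(100.0, gain=1.8, bias=32.0)
celsius_again = uncalibrate(fahrenheit, gain=1.8, bias=32.0)
```

With the default gain of 1.0 and bias of 0.0, both functions return their input unchanged.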
5. Configuration
Memory Only Mode
When enabled, the model stores its input history in an in-memory buffer instead of querying the time-series database. This eliminates the database read/write round-trip on every cycle, enabling ultra-low-latency inference for fast control loops.
How it works: The predict engine maintains a rolling buffer per input tag, appending one sample per scan cycle from the live data cache. The model reads from this buffer instead of the database.
Trade-offs:
Requirements:
- On-Demand must be enabled (the predict engine must actively pull fresh reads)
- The model cannot be in a Scan Group
Warmup: After a restart, the buffer is empty. The model shows a "Memory buffer warming up" message and does not run inference until the buffer covers the full input window (input_depth * sample_rate seconds of data).
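The buffer and warmup behavior can be sketched with a fixed-length deque. This is a hedged illustration, not the predict engine's code: the `MemoryBuffer` class is hypothetical, and it assumes the scan rate equals the sample rate, so that `input_depth` samples fill the window.

```python
from collections import deque
from typing import List, Optional


class MemoryBuffer:
    """Hypothetical rolling buffer for one input tag."""

    def __init__(self, input_depth: int) -> None:
        # A bounded deque drops the oldest sample once it is full.
        self._samples: deque = deque(maxlen=input_depth)

    def append(self, value: float) -> None:
        """Add one sample per scan cycle."""
        self._samples.append(value)

    @property
    def warmed_up(self) -> bool:
        """True once the buffer holds a full window of samples."""
        return len(self._samples) == self._samples.maxlen

    def window(self) -> Optional[List[float]]:
        """Return the window oldest-first, or None while still warming up."""
        return list(self._samples) if self.warmed_up else None
```

A caller would skip inference whenever `window()` returns `None`, which mirrors the warmup period after a restart.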
Scan Groups
A scan group runs multiple models together on a shared schedule. When on-demand is enabled, all member models' reads and writes are combined into a single network request per device — reducing I/O on slow networks.
See Scan Groups for details.
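The idea of combining reads into one request per device can be illustrated as a grouping step. The sketch below is an assumption about the general technique, not a description of how Koios builds its requests; the function name and the `(device, tag)` tuples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def batch_reads_by_device(
    models: Dict[str, List[Tuple[str, str]]],
) -> Dict[str, List[str]]:
    """Merge the input tags of all member models into one read list per device.

    `models` maps a model name to its (device, tag) inputs. Tags shared by
    several models appear only once in the combined request.
    """
    per_device = defaultdict(set)
    for inputs in models.values():
        for device, tag in inputs:
            per_device[device].add(tag)
    # Sort tags so the resulting request is deterministic.
    return {device: sorted(tags) for device, tags in per_device.items()}
```

Two models that both read the same tag from one device would produce a single entry for that tag, so the device is asked once rather than twice.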
Input Depth and Historical Data
Models typically need a window of historical data, defined by:
- Input depth — number of historical samples (read from the model file)
- Sample rate — time interval between each sample
For example, input_depth=10 at sample_rate=0.5s needs the last 5 seconds of data.
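The window arithmetic can be written out directly. The helper names below are hypothetical, and the assumption that the newest sample sits at the current time (age 0) is for illustration only.

```python
from typing import List


def history_window_seconds(input_depth: int, sample_rate_s: float) -> float:
    """Length of history the model needs, in seconds."""
    return input_depth * sample_rate_s


def sample_ages(input_depth: int, sample_rate_s: float) -> List[float]:
    """Age in seconds of each sample in the window, oldest first.

    Assumes the newest sample is taken at the current time (age 0).
    """
    return [(input_depth - 1 - i) * sample_rate_s for i in range(input_depth)]
```

For the example above, `history_window_seconds(10, 0.5)` evaluates to 5.0, matching the 5 seconds of data the model needs.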
Model Status
Each binding also has its own status. A binding can fail if the bound tag is disabled, there isn't enough historical data, or the value is outside the normalization range.
Model Lifecycle
- Create — name, output application, output mode, scan rate
- Upload a model file — ONNX or TFLite on the Files tab
- Assign bindings — map inputs and outputs to tags on the Bindings tab
- Initialize history — backfill data if input tags are new
- Enable — activate real-time inference
What You See on a Model
Model List
A table showing each model's status, name, input/output counts, and timestamps. It supports filtering, search, and bulk actions (enable, disable, delete, export).
Model Detail
What's Next
- Training a Model — what to prepare before deploying a model
- Model Inference Requirements — tensor shapes and data preparation
- Creating a Model — step-by-step guide
- Managing Model Files — uploading and versioning
- Assigning Bindings — mapping inputs and outputs to tags
- Configuring a Model — all configuration settings explained
- On-Demand Inference — synchronize inference with fresh device data
- Scan Groups — grouped execution with shared on-demand
- Monitoring a Model — live values, diagnostics, and execution performance
