Troubleshooting a Model
When a model encounters a problem during its inference cycle, Koios records three pieces of diagnostic information at both the model level and the binding level:
- Error Code — a category that identifies what kind of failure occurred
- Error Message — a short description of the problem (e.g. "Input tag is disabled")
- Error Detail — additional context explaining the root cause and what to do about it
Together, these three fields tell you what went wrong, why, and where to start investigating.
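As a rough mental model, the three fields can be pictured as one small record attached to the model and to each binding. The sketch below is illustrative only: the `Diagnostic` class, its field names, and the detail text are hypothetical and do not reflect how Koios stores this information internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    """Hypothetical record mirroring the three diagnostic fields."""
    code: int      # Error Code: category of the failure
    message: str   # Error Message: short description of the problem
    detail: str    # Error Detail: root cause and what to do about it

    def summary(self) -> str:
        """Format the three fields on one line, e.g. for a log entry."""
        return f"[{self.code}] {self.message}: {self.detail}"


# Example values; the detail text is invented for illustration.
example = Diagnostic(
    code=11,
    message="Binding Tag Disabled",
    detail="Re-enable the tag or its parent device.",
)
print(example.summary())
```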
Where Errors Appear
Model Detail Page
On a model's Overview tab, the status hero banner at the top shows the current state. When a model is in a Failed state, a red error strip appears below the status showing the error message and error detail.
Model List
In the model list table, each model shows a colored status icon. When a model has failed, hovering over the red icon displays the error message as a tooltip.
Bindings Page
The Bindings tab shows each input and output binding as a card with a colored left border indicating its status. When a binding has failed, its card turns red and expands to show the error message and error detail inline. This is the most important diagnostic view — it tells you which specific binding caused the model to fail.
How the Inference Pipeline Works
Understanding the pipeline helps you read error codes. Each inference cycle runs these steps in order — if any step fails, the pipeline stops and the model reports the error:
- Validate bindings — check that each input tag is assigned, enabled, and running
- Query history — fetch historical data from the time-series database
- Preprocess data — interpolate, check ranges, normalize, assemble the input tensor
- Run inference — execute the ONNX or TFLite model
- Write results — denormalize predictions, check output ranges, write to tags
A binding can fail at any stage. The error code tells you which stage failed.
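The stop-on-first-failure behavior described above can be sketched in a few lines. This is a minimal illustration, not Koios's implementation: `run_cycle`, `StageError`, and `make_stage` are hypothetical names, and the stage bodies are placeholders that only record that they ran.

```python
class StageError(Exception):
    """Raised by a stage to signal that it failed (hypothetical)."""


def run_cycle(stages):
    """Run stages in order; return the name of the first failing stage, or None."""
    for name, step in stages:
        try:
            step()
        except StageError:
            return name  # the pipeline stops here; later stages never run
    return None


executed = []  # records which stages actually ran


def make_stage(name, fails=False):
    """Build a placeholder stage that logs its name and optionally fails."""
    def step():
        executed.append(name)
        if fails:
            raise StageError(name)
    return step


stages = [
    ("validate_bindings", make_stage("validate_bindings")),
    ("query_history", make_stage("query_history", fails=True)),
    ("preprocess_data", make_stage("preprocess_data")),
    ("run_inference", make_stage("run_inference")),
    ("write_results", make_stage("write_results")),
]

failed_stage = run_cycle(stages)
print(failed_stage)  # query_history
print(executed)      # ['validate_bindings', 'query_history']
```

Because the history query fails in this example, preprocessing, inference, and the write step never execute, which is why an error code points at a single stage.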
Model Error Codes
These errors apply to the model as a whole. When the model fails, check the bindings for more specific errors.
File Errors
Binding Errors
Inference Errors
System Errors
Binding Error Codes
These errors apply to individual input or output bindings. They appear on the Bindings tab and are the most useful diagnostic information when troubleshooting.
Tag State Errors
These errors mean the binding's tag is in a bad state — the fix is at the tag or device level, not the model.
On-Demand Errors
These errors are specific to models or scan groups configured for on-demand inference.
History & Data Errors
These errors mean the binding's tag is working, but the historical data needed for inference is insufficient or too old.
Range & Normalization Errors
These errors relate to the normalization or failure range configuration on a binding.
Rate of Change Errors
Data Processing Errors
Output Errors
These errors apply to output bindings after inference has completed.
Other
Common Scenarios
Model failed — all inputs show "Tag Disabled"
All input bindings have error code 11 (Binding Tag Disabled). The model depends on tags that have been disabled, either individually or because their parent device was disabled. Re-enable the tags or the device.
Model failed — all inputs show "Not Enough Historical Depth"
This is normal when a model is first enabled or when the device was recently restarted. The model needs time to accumulate enough historical data. Wait for the device to complete enough scan cycles to fill the model's sample window (depth × scan rate).
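To estimate how long to wait, multiply the depth by the scan interval. The helper below is a sketch under two assumptions that are not stated in this guide: that the scan rate is expressed as a time interval in seconds, and that each scan contributes exactly one sample. The function name `fill_time_seconds` is hypothetical.

```python
def fill_time_seconds(depth: int, scan_interval_s: float) -> float:
    """Time needed to collect `depth` samples at one sample per scan."""
    if depth <= 0 or scan_interval_s <= 0:
        raise ValueError("depth and scan_interval_s must be positive")
    return depth * scan_interval_s


print(fill_time_seconds(60, 1.0))   # 60.0
print(fill_time_seconds(120, 0.5))  # 60.0
```

If the model is still reporting this error well after the estimated time has passed, the cause is more likely a tag or device problem than a cold start.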
Model failed — one input shows "Upstream Model Failure"
This binding reads from a tag that is written by another model's output. That upstream model is currently failing, so this model can't get valid input. Fix the upstream model first; this model will then recover automatically on its next successful cycle.
Model is running but output shows "Value Out of Range"
The model produced a prediction, but the output value fell outside the failure bounds. This could mean:
- The input data has drifted outside the range the model was trained on
- The failure bounds are set too tightly for the model's expected output
- The normalization range doesn't match the training data's range
Review the output binding's failure range settings on the Configuration tab.
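The interaction between the normalization range and the failure bounds is easier to see with numbers. The sketch below assumes min-max normalization to the interval [0, 1], which this guide does not confirm; the function names and all range values are hypothetical.

```python
def denormalize(y_norm: float, lo: float, hi: float) -> float:
    """Invert min-max scaling: map a value in [0, 1] back to [lo, hi]."""
    return lo + y_norm * (hi - lo)


def out_of_range(value: float, fail_lo: float, fail_hi: float) -> bool:
    """True when the value lies outside the failure bounds."""
    return value < fail_lo or value > fail_hi


# Hypothetical settings for one output binding.
norm_lo, norm_hi = 0.0, 200.0   # normalization range
fail_lo, fail_hi = 10.0, 180.0  # failure bounds

# A raw model output above 1.0 suggests the inputs have drifted
# beyond the range the model was trained on.
prediction = denormalize(1.25, norm_lo, norm_hi)
print(prediction)                                  # 250.0
print(out_of_range(prediction, fail_lo, fail_hi))  # True
```

The same raw output can pass or fail depending on either range, so check both the normalization range and the failure bounds before concluding that the model itself is wrong.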
Model failed — "On-demand read failed"
The model (or its scan group) is configured for on-demand inference, but the device didn't respond to the read request in time. Check that:
- The device is powered on and reachable on the network
- The device connection parameters are correct (see Troubleshooting a Device)
- The On-Demand Timeout is long enough for the device to respond — some devices take several seconds
Model failed — "Input tensor shape mismatch"
The number of input bindings doesn't match what the model file expects. This usually happens after uploading a new model file with a different architecture. Go to the Bindings tab and reconcile the bindings to match the new model's input/output structure.
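Conceptually, this error is a count comparison between what the model file declares and what is configured. The sketch below illustrates that comparison only; `check_input_count` is a hypothetical helper, and the message text after the quoted error name is invented for the example.

```python
from typing import Optional


def check_input_count(expected_inputs: int, bound_inputs: int) -> Optional[str]:
    """Return an error message if the binding count differs, else None."""
    if bound_inputs != expected_inputs:
        return (
            "Input tensor shape mismatch: model file expects "
            f"{expected_inputs} inputs, but {bound_inputs} bindings are configured"
        )
    return None


# Four bindings were configured for the original model file.
old_model = check_input_count(expected_inputs=4, bound_inputs=4)
# A newly uploaded file with a different architecture expects six inputs.
new_model = check_input_count(expected_inputs=6, bound_inputs=4)
print(old_model)
print(new_model)
```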
Scan group failed — all models show "On-demand read failed"
When a scan group's shared on-demand read fails, all models in the group fail together. The group-level error appears on the scan group's detail page. Once the device connection issue is fixed, all models in the group recover on the next cycle.
How Errors Are Resolved
Model and binding errors clear automatically when the next inference cycle completes successfully. You don't need to manually acknowledge or dismiss them — once the underlying issue is resolved (e.g. the tag is re-enabled, the device recovers, or enough history accumulates), the next successful cycle resets all error codes and clears the error messages.
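The clearing behavior can be pictured as a status record that is overwritten on every cycle. This is a minimal sketch with hypothetical names (`Status`, `record_cycle`), not Koios's actual code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Status:
    """Hypothetical holder for the error fields of a model or binding."""
    error_code: Optional[int] = None
    error_message: Optional[str] = None

    def record_cycle(self, error: Optional[Tuple[int, str]] = None) -> None:
        """Store the error from a failed cycle, or clear everything on success."""
        if error is None:
            self.error_code = None
            self.error_message = None
        else:
            self.error_code, self.error_message = error


status = Status()
status.record_cycle((11, "Binding Tag Disabled"))  # a failed cycle
failed_snapshot = (status.error_code, status.error_message)

status.record_cycle()  # the next cycle succeeds, with no manual acknowledgement
cleared_snapshot = (status.error_code, status.error_message)

print(failed_snapshot)   # (11, 'Binding Tag Disabled')
print(cleared_snapshot)  # (None, None)
```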
If a model is stuck in a failed state:
- Check the Bindings tab for specific per-binding errors — these are more actionable than the model-level error
- Review the model's logs (on the Logs tab) — set the log level to Debug for maximum detail about each pipeline stage
- Verify all input tags are enabled and running — a single disabled or failed tag can block the entire model
- Check the tag devices — if a device is down, all tags on it fail, which cascades to every model using those tags
- Try toggling the model off and on with the Enabled switch
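The cascade from a device to its tags and on to the models that use them can be traced with two lookups. The sketch below uses made-up device, tag, and model names and a hypothetical `affected_models` helper; it only illustrates why a single device outage can block several models at once.

```python
from typing import Dict, List


def affected_models(
    down_device: str,
    tag_device: Dict[str, str],
    model_inputs: Dict[str, List[str]],
) -> List[str]:
    """Return models with at least one input tag on the device that is down."""
    failed_tags = {tag for tag, dev in tag_device.items() if dev == down_device}
    return sorted(
        model
        for model, tags in model_inputs.items()
        if failed_tags.intersection(tags)
    )


# Which device each tag belongs to (illustrative names).
tag_device = {"temp": "plc_a", "flow": "plc_a", "level": "plc_b"}
# Which tags each model reads as inputs.
model_inputs = {
    "model_1": ["temp", "flow"],
    "model_2": ["flow", "level"],
    "model_3": ["level"],
}

blocked = affected_models("plc_a", tag_device, model_inputs)
print(blocked)  # ['model_1', 'model_2']
```

Here `model_2` is blocked even though only one of its two input tags is on the failed device, which matches the point above that a single failed tag can block the entire model.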
