Troubleshooting a Model

When a model encounters a problem during its inference cycle, Koios records three pieces of diagnostic information at both the model level and the binding level:

  • Error Code — a category that identifies what kind of failure occurred
  • Error Message — a short description of the problem (e.g. "Input tag is disabled")
  • Error Detail — additional context explaining the root cause and what to do about it

Together, these three fields tell you what went wrong, why, and where to start investigating.

Where Errors Appear

Model Detail Page

On a model's Overview tab, the status hero banner at the top shows the current state. When a model is in a Failed state, a red error strip appears below the status showing the error message and error detail.

Model List

In the model list table, each model shows a colored status icon. When a model has failed, hovering over the red icon displays the error message as a tooltip.

Bindings Page

The Bindings tab shows each input and output binding as a card with a colored left border indicating its status. When a binding has failed, its card turns red and expands to show the error message and error detail inline. This is the most important diagnostic view — it tells you which specific binding caused the model to fail.

How the Inference Pipeline Works

Understanding the pipeline helps you read error codes. Each inference cycle runs these steps in order — if any step fails, the pipeline stops and the model reports the error:

  1. Validate bindings — check that each input tag is assigned, enabled, and running
  2. Query history — fetch historical data from the time-series database
  3. Preprocess data — interpolate, check ranges, normalize, assemble the input tensor
  4. Run inference — execute the ONNX or TFLite model
  5. Write results — denormalize predictions, check output ranges, write to tags

A binding can fail at any stage. The error code tells you which stage failed.
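
The stop-at-first-failure behavior described above can be sketched as follows. This is a minimal illustration, not Koios code: the stage names mirror the five steps, and `STAGES`, `run_cycle`, and the handler callables are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Stage names mirror the five pipeline steps listed above.
STAGES = [
    "validate_bindings",
    "query_history",
    "preprocess_data",
    "run_inference",
    "write_results",
]


def run_cycle(handlers: Dict[str, Callable[[], None]]) -> Optional[Tuple[str, str]]:
    """Run each stage in order and return (stage, message) for the first failure.

    Returns None when every stage completes. Once a stage raises, the
    remaining stages are skipped, so one cycle reports at most one error.
    """
    for stage in STAGES:
        try:
            handlers[stage]()
        except Exception as exc:  # assumption: a real engine would use typed errors
            return stage, str(exc)
    return None
```

Because later stages never run after a failure, fixing one error can reveal another from a later stage on the next cycle.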

Model Error Codes

These errors apply to the model as a whole. When the model fails, check the bindings for more specific errors.

File Errors

Code | Name | Description
1 | No File Given | No model file has been uploaded. Go to the Files tab and upload an ONNX or TFLite file.
2 | No File Found | The model file was uploaded previously but is now missing from storage. Re-upload the file on the Files tab.
3 | Failed to Parse File | The uploaded file is corrupted, not a valid ONNX/TFLite model, or uses an ONNX IR version above 13 (the highest the runtime loads). Verify the file opens in your training environment; if it was exported at a newer IR version, re-export at IR 13 or lower (or annotate it with the model-utils library, which clamps the IR version automatically) and re-upload.
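
Before uploading, you can check an ONNX file's IR version yourself. The sketch below assumes the open-source `onnx` Python package is installed in your training environment; `ir_version_ok` and `read_ir_version` are hypothetical helper names, and the limit of 13 comes from the description of code 3 above.

```python
MAX_SUPPORTED_IR = 13  # highest IR version the runtime loads, per the table above


def ir_version_ok(ir_version: int, max_supported: int = MAX_SUPPORTED_IR) -> bool:
    """Return True when the IR version is within the supported limit."""
    return 1 <= ir_version <= max_supported


def read_ir_version(path: str) -> int:
    """Read the IR version from an ONNX file without loading external weights."""
    import onnx  # third-party; imported lazily so ir_version_ok works without it

    return onnx.load(path, load_external_data=False).ir_version
```

For example, `ir_version_ok(read_ir_version("model.onnx"))` returns False for a file exported at a newer IR version, which is the case that needs a re-export.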

Binding Errors

Code | Name | Description
4 | Failed to Get Model Depth | The model file's metadata is missing the input depth (number of historical samples). Re-upload the model file to set the input depth correctly.
5 | Failed to Get Bindings | Binding configuration could not be loaded from the database. This is rare and usually indicates a database connectivity issue.
6 | Bindings Invalid State | Bindings exist but their configuration is inconsistent, for example an input binding with no tag assigned or an output index that doesn't match the model file. Review bindings on the Bindings tab.

Inference Errors

Code | Name | Description
7 | Failed to Structure History | Historical data could not be organized into the format the model expects. This usually means there's a data type issue or an unexpected gap in the history.
8 | Failed to Get Predictions | The model executed but produced no output. This can happen if the model file is incompatible with the input tensor shape. Check that the number of bindings matches the model's expected inputs.
9 | Binding Prediction Index Missing | An output binding's Output Index doesn't match any of the model's output positions. Review output bindings and ensure their indices align with the model architecture.
10 | Failed to Scale Predictions | Prediction denormalization failed for one or more output bindings. Check that output bindings have valid normalization ranges configured.
11 | Failed to Write Predictions | Results could not be written to the live data cache. This is rare and usually indicates a cache connectivity issue.

System Errors

Code | Name | Description
12 | Thread Error | The inference engine's execution thread encountered an internal error. This typically resolves on the next scan cycle. If it persists, check the predict engine service logs.
13 | Generic Exception | An unclassified error occurred. The error detail contains the specific exception; check the model's logs for the full stack trace.
999 | Unlicensed | The model cannot run because the Koios license does not cover it. Disable unused models or upgrade your license.
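
If you handle these codes in your own tooling, a lookup table keeps each code's category and name together. The sketch below only transcribes the model-level tables above; `MODEL_ERRORS` and `describe_model_error` are hypothetical names, not part of any Koios API.

```python
# Model-level error codes, transcribed from the tables above.
MODEL_ERRORS = {
    1: ("File", "No File Given"),
    2: ("File", "No File Found"),
    3: ("File", "Failed to Parse File"),
    4: ("Binding", "Failed to Get Model Depth"),
    5: ("Binding", "Failed to Get Bindings"),
    6: ("Binding", "Bindings Invalid State"),
    7: ("Inference", "Failed to Structure History"),
    8: ("Inference", "Failed to Get Predictions"),
    9: ("Inference", "Binding Prediction Index Missing"),
    10: ("Inference", "Failed to Scale Predictions"),
    11: ("Inference", "Failed to Write Predictions"),
    12: ("System", "Thread Error"),
    13: ("System", "Generic Exception"),
    999: ("System", "Unlicensed"),
}


def describe_model_error(code: int) -> str:
    """Format a model-level code as '[Category] code: Name'."""
    category, name = MODEL_ERRORS.get(code, ("Unknown", "Unrecognized code"))
    return f"[{category}] {code}: {name}"
```

This table is valid for model-level codes only; binding-level codes reuse several of the same numbers with different meanings.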

Binding Error Codes

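These errors apply to individual input or output bindings. They appear on the Bindings tab and are the most useful diagnostic information when troubleshooting. Binding codes are numbered separately from model codes, so the same number means different things at each level: code 11 is Failed to Write Predictions on a model but Binding Tag Disabled on a binding.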

Tag State Errors

These errors mean the binding's tag is in a bad state — the fix is at the tag or device level, not the model.

Code | Name | Description
11 | Binding Tag Disabled | The binding's tag is disabled (stopped). Enable the tag or its parent device.
12 | Binding Tag Bad Quality | The tag's value was read but the data source reported poor quality (e.g. OPC-UA quality status is not "Good"). The value may be stale or unreliable.
13 | Binding Tag Failed | The binding's tag is in a failed state. Check the tag's own error; see Troubleshooting a Tag.
14 | Binding Tag Not Assigned | The input binding has no tag selected. Go to the Bindings tab and assign a tag.
17 | Upstream Model Failure | This input tag is written by another model's output, and that upstream model is currently failing. Fix the upstream model first; this binding will recover automatically.

On-Demand Errors

These errors are specific to models or scan groups configured for on-demand inference.

Code | Name | Description
20 | On-Demand Read Failed | The device did not respond to the on-demand read request within the timeout. Check that the device is online and reachable, or increase the On-Demand Timeout on the model or scan group's Configuration tab.

History & Data Errors

These errors mean the binding's tag is working, but the historical data needed for inference is insufficient or too old.

Code | Name | Description
1 | Not Enough Historical Depth | There isn't enough historical data to fill the model's input window. This is common when a model is first enabled; wait for enough scan cycles to accumulate the required depth. The error detail shows how much more data is needed.
2 | Stale History Data | The most recent data point is too old. This means the tag's device has stopped collecting or the data is arriving with significant delay. The error detail shows how far outside the allowed window the data is.
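
The two history checks can be pictured as a depth test followed by a freshness test. The sketch below is an assumption about the general shape of such a check, not the Koios implementation: `check_history` and the `max_age` parameter are hypothetical, and the source does not say how the allowed window is configured.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence


def check_history(
    timestamps: Sequence[datetime],
    depth: int,
    max_age: timedelta,
    now: datetime,
) -> Optional[int]:
    """Return binding code 1 or 2, or None when the history is usable."""
    # Code 1: fewer samples than the model's input window requires.
    if len(timestamps) < max(depth, 1):
        return 1
    # Code 2: enough samples, but the newest one is older than the allowed window.
    if now - max(timestamps) > max_age:
        return 2
    return None
```

The depth test runs first, so a freshly enabled model reports code 1 until its window fills, even if its few samples are also old.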

Range & Normalization Errors

These errors relate to the normalization or failure range configuration on a binding.

Code | Name | Description
3 | No Range Given | The binding requires a normalization range but none is configured. Set Range Min and Range Max on the tag, or enable Custom Range on the binding and set custom values. For Z-Score normalization, set Custom Mean and Custom Std Dev.
4 | Invalid Range Given | The normalization range is invalid, typically because min is greater than or equal to max, or Z-Score std dev is zero. Correct the range values on the tag or binding.
5 | Value Out of Range | An input or output value exceeded the configured failure bounds. The error detail shows the specific value and which bound was exceeded. Review the failure range settings or widen the bounds if the value is expected.
6 | Failed to Normalize | The normalization calculation produced an error (e.g. division by zero from an invalid range). Check the normalization type and range values.
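
To see why an invalid range leads to these errors, it helps to look at the arithmetic. The sketch below uses the textbook min-max and Z-Score formulas; the function names are hypothetical, and scaling min-max to the interval 0..1 is an assumption, since the source does not state the target interval Koios uses.

```python
from typing import Optional


def normalize_min_max(
    value: float, range_min: Optional[float], range_max: Optional[float]
) -> float:
    """Scale value so that range_min maps to 0 and range_max maps to 1."""
    if range_min is None or range_max is None:
        raise ValueError("No Range Given (code 3)")
    if range_min >= range_max:
        # An equal min and max would divide by zero below.
        raise ValueError("Invalid Range Given (code 4)")
    return (value - range_min) / (range_max - range_min)


def normalize_z_score(
    value: float, mean: Optional[float], std_dev: Optional[float]
) -> float:
    """Express value as a number of standard deviations from the mean."""
    if mean is None or std_dev is None:
        raise ValueError("No Range Given (code 3)")
    if std_dev == 0:
        # A zero standard deviation would divide by zero below.
        raise ValueError("Invalid Range Given (code 4)")
    return (value - mean) / std_dev
```

With a range of 0 to 100, a reading of 50 normalizes to 0.5; with a mean of 10 and a standard deviation of 4, a reading of 12 also normalizes to 0.5.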

Rate of Change Errors

Code | Name | Description
19 | Rate of Change Exceeded | The input or output value is changing faster than the configured threshold. The error detail shows the actual rate, the threshold, and the direction. This is a safety check; if the rate is expected, increase the ROC threshold on the binding's Configuration tab.
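
A rate-of-change check compares the slope between two consecutive samples with a threshold. The sketch below is illustrative only: `check_roc` is a hypothetical name, and it assumes the rate is measured in units per second with one symmetric threshold for both directions, which the source does not confirm.

```python
from typing import Optional, Tuple


def check_roc(
    prev_value: float,
    prev_time_s: float,
    value: float,
    time_s: float,
    threshold: float,
) -> Optional[Tuple[str, float]]:
    """Return (direction, rate) when the threshold is exceeded, else None."""
    dt = time_s - prev_time_s
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")
    rate = (value - prev_value) / dt  # units per second
    if abs(rate) <= threshold:
        return None
    return ("rising" if rate > 0 else "falling", rate)
```

A value that moves from 10 to 16 over 2 seconds has a rate of 3 units per second, so a threshold of 2 flags it as rising.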

Data Processing Errors

Code | Name | Description
7 | Failed to Structure History | The binding's historical data could not be interpolated or organized. This can happen if the data contains unexpected values or if the timestamps are inconsistent.
16 | Failed to Structure Input Data | The preprocessed data could not be assembled into a valid input tensor. This usually indicates a mismatch between the data shape and what the model expects.

Output Errors

These errors apply to output bindings after inference has completed.

Code | Name | Description
8 | Prediction Index Missing | The model's output doesn't have a value at this binding's configured Output Index. Verify the output index matches the model architecture.
9 | Failed to Scale Prediction | The predicted value could not be denormalized back to the tag's original scale. Check the output binding's normalization range.
10 | Failed to Write Prediction | The output value could not be written to the live data cache. This is rare and usually indicates a cache connectivity issue.
18 | General Model Failure | The model failed to produce a prediction, so this output binding has no value. The root cause is in the input bindings or the model-level error; check those first.

Other

Code | Name | Description
15 | Binding Not in Dictionary | Internal error: the binding was expected in a lookup table but wasn't found. This should not occur in normal operation.
999 | Unlicensed | The binding cannot be processed because the Koios license does not cover it.
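
Because each binding-code category points to a different place to look (tag, device, history, or binding configuration), grouping codes by category can speed up triage. The sketch below transcribes the groupings from the tables above; `BINDING_CATEGORIES` and `binding_category` are hypothetical names, not part of any Koios API.

```python
# Binding-level codes grouped by the section headings above.
BINDING_CATEGORIES = {
    "Tag State": {11, 12, 13, 14, 17},
    "On-Demand": {20},
    "History & Data": {1, 2},
    "Range & Normalization": {3, 4, 5, 6},
    "Rate of Change": {19},
    "Data Processing": {7, 16},
    "Output": {8, 9, 10, 18},
    "Other": {15, 999},
}


def binding_category(code: int) -> str:
    """Return the category a binding-level code belongs to."""
    for category, codes in BINDING_CATEGORIES.items():
        if code in codes:
            return category
    return "Unknown"
```

For example, `binding_category(17)` returns "Tag State", which signals that the fix lies outside this model's own configuration.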

Common Scenarios

Model failed — all inputs show "Tag Disabled"

All input bindings have error code 11 (Binding Tag Disabled). The model depends on tags that have been stopped — either individually or because their parent device was disabled. Re-enable the tags or device.

Model failed — all inputs show "Not Enough Historical Depth"

This is normal when a model is first enabled or when the device was recently restarted. The model needs time to accumulate enough historical data. Wait for the device to complete enough scan cycles to fill the model's sample window, which spans roughly depth × scan interval. For example, a depth of 60 samples at a 5-second scan interval needs about 5 minutes of data.
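
To estimate how long the wait will be, multiply the number of samples still missing by the time between scans. This is a rough sketch: `warmup_seconds` is a hypothetical helper, and it assumes one sample per scan cycle with no gaps.

```python
def warmup_seconds(depth: int, scan_interval_s: float, samples_collected: int = 0) -> float:
    """Estimate the seconds remaining until the model's sample window is full."""
    remaining = max(depth - samples_collected, 0)
    return remaining * scan_interval_s
```

A model with a depth of 60 on a 5-second scan needs about 300 seconds from a cold start, or about 75 seconds if 45 samples are already stored.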

Model failed — one input shows "Upstream Model Failure"

This binding reads from a tag that is written by another model's output. That upstream model is currently failing, so this model can't get valid input. Fix the upstream model first — this one will recover automatically.
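
When models are chained several levels deep, the model to fix is the failed one that has no failed upstream of its own. The sketch below walks such a chain; the `upstream` mapping and `failed` set are hypothetical inputs you would assemble yourself from the Bindings tabs, not data Koios exposes in this form.

```python
from typing import Dict, List, Set


def root_failures(model: str, upstream: Dict[str, List[str]], failed: Set[str]) -> Set[str]:
    """Return the failed models upstream of `model` (or `model` itself) that
    have no failed upstream model of their own."""
    roots: Set[str] = set()
    seen: Set[str] = set()
    stack = [model]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles and repeated visits
        seen.add(current)
        failed_parents = [m for m in upstream.get(current, []) if m in failed]
        if current in failed and not failed_parents:
            roots.add(current)
        stack.extend(failed_parents)
    return roots
```

If model C reads from B and B reads from A, and all three have failed, the function returns only A, the one to fix first.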

Model is running but output shows "Value Out of Range"

The model produced a prediction, but the output value exceeded the failure bounds. This could mean:

  • The input data has drifted outside the range the model was trained on
  • The failure bounds are set too tightly for the model's expected output
  • The normalization range doesn't match the training data's range

Review the output binding's failure range settings on the Configuration tab.

Model failed — "On-demand read failed"

The model (or its scan group) is configured for on-demand inference, but the device didn't respond to the read request in time. Check that:

  • The device is powered on and reachable on the network
  • The device connection parameters are correct (see Troubleshooting a Device)
  • The On-Demand Timeout is long enough for the device to respond — some devices take several seconds

Model failed — "Input tensor shape mismatch"

The number of input bindings doesn't match what the model file expects. This usually happens after uploading a new model file with a different architecture. Go to the Bindings tab and reconcile the bindings to match the new model's input/output structure.

Scan group failed — all models show "On-demand read failed"

When a scan group's shared on-demand read fails, all models in the group fail together. The group-level error appears on the scan group's detail page. Fix the device connection issue, then all models in the group will recover on the next cycle.

How Errors Are Resolved

Model and binding errors clear automatically when the next inference cycle completes successfully. You don't need to manually acknowledge or dismiss them — once the underlying issue is resolved (e.g. the tag is re-enabled, the device recovers, or enough history accumulates), the next successful cycle resets all error codes and clears the error messages.

If a model is stuck in a failed state:

  1. Check the Bindings tab for specific per-binding errors — these are more actionable than the model-level error
  2. Review the model's logs (on the Logs tab) — set the log level to Debug for maximum detail about each pipeline stage
  3. Verify all input tags are enabled and running — a single disabled or failed tag can block the entire model
  4. Check the tag devices — if a device is down, all tags on it fail, which cascades to every model using those tags
  5. Try toggling the model off and on with the Enabled switch