---
title: "Troubleshooting a Model"
description: "How to diagnose and resolve model and binding errors in Koios"
source_url: https://ai-ops.com/docs/models/troubleshooting
---

# Troubleshooting a Model

When a model encounters a problem during its inference cycle, Koios records three pieces of diagnostic information at both the **model level** and the **binding level**:

- **Error Code** — a category that identifies _what kind_ of failure occurred
- **Error Message** — a short description of the problem (e.g. "Input tag is disabled")
- **Error Detail** — additional context explaining the root cause and what to do about it

Together, these three fields tell you what went wrong, why, and where to start investigating.

## Where Errors Appear

### Model Detail Page

On a model's **Overview** tab, the status hero banner at the top shows the current state. When a model is in a **Failed** state, a red error strip appears below the status showing the error message and error detail.

### Model List

In the model list table, each model shows a colored status icon. When a model has failed, hovering over the red icon displays the error message as a tooltip.

### Bindings Page

The **Bindings** tab shows each input and output binding as a card with a colored left border indicating its status. When a binding has failed, its card turns red and expands to show the error message and error detail inline. This is the most important diagnostic view — it tells you _which specific binding_ caused the model to fail.

> [!TIP] Start with the bindings
> When a model fails, the model-level error is often generic (e.g. "All input bindings failed validation"). The binding-level errors tell you the actual cause — a disabled tag, a missing range, stale data, etc. Always check the Bindings tab first.

## How the Inference Pipeline Works

Understanding the pipeline helps you read error codes. Each inference cycle runs these steps in order — if any step fails, the pipeline stops and the model reports the error:

1. **Validate bindings** — check that each input tag is assigned, enabled, and running
2. **Query history** — fetch historical data from the time-series database
3. **Preprocess data** — interpolate, check ranges, normalize, assemble the input tensor
4. **Run inference** — execute the ONNX or TFLite model
5. **Write results** — denormalize predictions, check output ranges, write to tags

A binding can fail at any stage. The error code tells you which stage failed.
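
The stop-at-first-failure behavior can be pictured with a short Python sketch. The stage names mirror the five steps above; `run_cycle`, `make_step`, and the error text are hypothetical illustrations, not Koios APIs.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]


def run_cycle(steps: List[Step]) -> Optional[Tuple[str, str]]:
    """Run the steps in order and stop at the first one that raises.

    Returns (stage_name, message) for the failing stage, or None on success.
    """
    for name, step in steps:
        try:
            step()
        except Exception as exc:  # any failure ends the cycle
            return name, str(exc)
    return None


executed: List[str] = []


def make_step(name: str, error: Optional[str] = None) -> Step:
    """Build a dummy stage that records its name and optionally fails."""
    def step() -> None:
        executed.append(name)
        if error is not None:
            raise RuntimeError(error)
    return name, step


cycle = [
    make_step("validate_bindings"),
    make_step("query_history", error="most recent sample is too old"),
    make_step("preprocess_data"),
    make_step("run_inference"),
    make_step("write_results"),
]

failure = run_cycle(cycle)
print(failure)   # ('query_history', 'most recent sample is too old')
print(executed)  # ['validate_bindings', 'query_history']
```

Because the later stages never run, an error reported at one stage says nothing about whether the stages after it would have succeeded.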

## Model Error Codes

These errors apply to the model as a whole. When the model fails, check the bindings for more specific errors.

### File Errors

| Code | Name | Description |
|------|------|-------------|
| 1 | **No File Given** | No model file has been uploaded. Go to the **Files** tab and upload an ONNX or TFLite file. |
| 2 | **No File Found** | The model file was uploaded previously but is now missing from storage. Re-upload the file on the Files tab. |
| 3 | **Failed to Parse File** | The uploaded file is corrupted, not a valid ONNX/TFLite model, or uses an ONNX IR version above 13 (the highest the runtime loads). Verify the file opens in your training environment; if it was exported at a newer IR version, re-export at IR 13 or lower — or annotate it with the model-utils library, which clamps the IR version automatically — and re-upload. |

### Binding Errors

| Code | Name | Description |
|------|------|-------------|
| 4 | **Failed to Get Model Depth** | The model file's metadata is missing the input depth (number of historical samples). Re-upload the model file to set the input depth correctly. |
| 5 | **Failed to Get Bindings** | Binding configuration could not be loaded from the database. This is rare and usually indicates a database connectivity issue. |
| 6 | **Bindings Invalid State** | Bindings exist but their configuration is inconsistent — for example, an input binding with no tag assigned or an output index that doesn't match the model file. Review bindings on the Bindings tab. |

### Inference Errors

| Code | Name | Description |
|------|------|-------------|
| 7 | **Failed to Structure History** | Historical data could not be organized into the format the model expects. This usually means there's a data type issue or an unexpected gap in the history. |
| 8 | **Failed to Get Predictions** | The model executed but produced no output. This can happen if the model file is incompatible with the input tensor shape. Check that the number of bindings matches the model's expected inputs. |
| 9 | **Binding Prediction Index Missing** | An output binding's **Output Index** doesn't match any of the model's output positions. Review output bindings and ensure their indices align with the model architecture. |
| 10 | **Failed to Scale Predictions** | Prediction denormalization failed for one or more output bindings. Check that output bindings have valid normalization ranges configured. |
| 11 | **Failed to Write Predictions** | Results could not be written to the live data cache. This is rare and usually indicates a cache connectivity issue. |

### System Errors

| Code | Name | Description |
|------|------|-------------|
| 12 | **Thread Error** | The inference engine's execution thread encountered an internal error. This typically resolves on the next scan cycle. If it persists, check the predict engine service logs. |
| 13 | **Generic Exception** | An unclassified error occurred. The error detail contains the specific exception — check the model's logs for the full stack trace. |
| 999 | **Unlicensed** | The model cannot run because the Koios license does not cover it. Disable unused models or upgrade your license. |
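
If you process model error codes in your own scripts or dashboards, the tables above can be mirrored as a Python enum. The numeric values come from this page, with 0 meaning no error; the member names and the `model_error_group` helper are illustrative assumptions, not identifiers exposed by Koios.

```python
from enum import IntEnum


class ModelError(IntEnum):
    NONE = 0
    NO_FILE_GIVEN = 1
    NO_FILE_FOUND = 2
    FAILED_TO_PARSE_FILE = 3
    FAILED_TO_GET_MODEL_DEPTH = 4
    FAILED_TO_GET_BINDINGS = 5
    BINDINGS_INVALID_STATE = 6
    FAILED_TO_STRUCTURE_HISTORY = 7
    FAILED_TO_GET_PREDICTIONS = 8
    BINDING_PREDICTION_INDEX_MISSING = 9
    FAILED_TO_SCALE_PREDICTIONS = 10
    FAILED_TO_WRITE_PREDICTIONS = 11
    THREAD_ERROR = 12
    GENERIC_EXCEPTION = 13
    UNLICENSED = 999


def model_error_group(code: int) -> str:
    """Return the table heading a model error code is listed under."""
    error = ModelError(code)  # raises ValueError for codes not listed here
    if error is ModelError.NONE:
        return "No Error"
    if error <= ModelError.FAILED_TO_PARSE_FILE:
        return "File Errors"
    if error <= ModelError.BINDINGS_INVALID_STATE:
        return "Binding Errors"
    if error <= ModelError.FAILED_TO_WRITE_PREDICTIONS:
        return "Inference Errors"
    return "System Errors"


print(model_error_group(8))    # Inference Errors
print(model_error_group(999))  # System Errors
```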

> [!NOTE] Error code 0 means no error
> An error code of **0** (None) means the model has no active error. You'll see this on healthy models or after an error has been resolved.

## Binding Error Codes

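These errors apply to individual input or output bindings. They appear on the **Bindings** tab and are usually the most useful starting point when troubleshooting. Binding codes are numbered separately from model codes, so the same number means different things at each level: code 11, for example, is **Failed to Write Predictions** on a model but **Binding Tag Disabled** on a binding.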

### Tag State Errors

These errors mean the binding's tag is in a bad state — the fix is at the tag or device level, not the model.

| Code | Name | Description |
|------|------|-------------|
| 11 | **Binding Tag Disabled** | The binding's tag is disabled (stopped). Enable the tag or its parent device. |
| 12 | **Binding Tag Bad Quality** | The tag's value was read but the data source reported poor quality (e.g. OPC-UA quality status is not "Good"). The value may be stale or unreliable. |
| 13 | **Binding Tag Failed** | The binding's tag is in a failed state. Check the tag's own error — see [Troubleshooting a Tag](https://ai-ops.com/docs/tags/troubleshooting.md). |
| 14 | **Binding Tag Not Assigned** | The input binding has no tag selected. Go to the Bindings tab and assign a tag. |
| 17 | **Upstream Model Failure** | This input tag is written by another model's output, and that upstream model is currently failing. Fix the upstream model first — this binding will recover automatically. |

### On-Demand Errors

These errors are specific to models or scan groups configured for [on-demand inference](https://ai-ops.com/docs/models/on-demand-inference.md).

| Code | Name | Description |
|------|------|-------------|
| 20 | **On-Demand Read Failed** | The device did not respond to the on-demand read request within the timeout. Check that the device is online and reachable, or increase the **On-Demand Timeout** on the model or scan group's Configuration tab. |

### History & Data Errors

These errors mean the binding's tag is working, but the historical data needed for inference is insufficient or too old.

| Code | Name | Description |
|------|------|-------------|
| 1 | **Not Enough Historical Depth** | There isn't enough historical data to fill the model's input window. This is common when a model is first enabled — wait for enough scan cycles to accumulate the required depth. The error detail shows how much more data is needed. |
| 2 | **Stale History Data** | The most recent data point is too old. This usually means the tag's device has stopped collecting or the data is arriving with significant delay. The error detail shows how far outside the allowed window the data is. |
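
As a rough mental model, the two history checks can be sketched in Python as follows. The function name, its parameters, and the wording of the detail strings are assumptions for illustration; Koios performs its own checks internally and its actual error detail text may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


def check_history(
    timestamps: List[datetime],
    depth: int,
    max_age: timedelta,
    now: datetime,
) -> Optional[Tuple[int, str]]:
    """Return (binding error code, detail), or None if the history is usable."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    if len(timestamps) < depth:
        missing = depth - len(timestamps)
        return 1, f"need {missing} more samples"  # Not Enough Historical Depth
    age = now - max(timestamps)
    if age > max_age:
        overdue = (age - max_age).total_seconds()
        return 2, f"latest sample is {overdue:.0f}s past the allowed window"  # Stale History Data
    return None


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
# Five samples, ten seconds apart, the newest one 90 seconds old.
samples = [now - timedelta(seconds=90 + 10 * i) for i in range(5)]

print(check_history(samples, depth=8, max_age=timedelta(seconds=60), now=now))
# (1, 'need 3 more samples')
print(check_history(samples, depth=5, max_age=timedelta(seconds=60), now=now))
# (2, 'latest sample is 30s past the allowed window')
```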

> [!NOTE] New models need time to warm up
> When you first enable a model, input bindings will show "Not Enough Historical Depth" until the device has collected enough scan cycles to fill the model's sample window. This is normal — the model will start running automatically once enough data accumulates.

### Range & Normalization Errors

These errors relate to the normalization or failure range configuration on a binding.

| Code | Name | Description |
|------|------|-------------|
| 3 | **No Range Given** | The binding requires a normalization range but none is configured. Set **Range Min** and **Range Max** on the tag, or enable **Custom Range** on the binding and set custom values. For Z-Score normalization, set **Custom Mean** and **Custom Std Dev**. |
| 4 | **Invalid Range Given** | The normalization range is invalid — typically because min is greater than or equal to max, or Z-Score std dev is zero. Correct the range values on the tag or binding. |
| 5 | **Value Out of Range** | An input or output value exceeded the configured failure bounds. The error detail shows the specific value and which bound was exceeded. Review the failure range settings or widen the bounds if the value is expected. |
| 6 | **Failed to Normalize** | The normalization calculation produced an error (e.g. division by zero from an invalid range). Check the normalization type and range values. |
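
The range checks behind codes 3 and 4 can be sketched in Python. This assumes min-max normalization scales values into the 0 to 1 interval and that Z-Score uses the usual `(value - mean) / std` form; the class and function names are hypothetical and Koios may scale differently.

```python
from typing import Optional


class NormalizationError(ValueError):
    """Carries a binding error code from the table above."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(message)
        self.code = code


def min_max(value: float, lo: Optional[float], hi: Optional[float]) -> float:
    """Scale value into [0, 1] using a min/max range (assumed target interval)."""
    if lo is None or hi is None:
        raise NormalizationError(3, "No Range Given")
    if lo >= hi:
        raise NormalizationError(4, "Invalid Range Given")
    return (value - lo) / (hi - lo)


def z_score(value: float, mean: Optional[float], std: Optional[float]) -> float:
    """Standardize value using a mean and standard deviation."""
    if mean is None or std is None:
        raise NormalizationError(3, "No Range Given")
    # The table names a zero std dev; rejecting negatives too is an assumption.
    if std <= 0:
        raise NormalizationError(4, "Invalid Range Given")
    return (value - mean) / std


print(min_max(75.0, 50.0, 100.0))  # 0.5
print(z_score(12.0, 10.0, 4.0))    # 0.5
```

Validating the range before dividing is what keeps an invalid configuration from surfacing later as a division-by-zero failure (code 6).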

### Rate of Change Errors

| Code | Name | Description |
|------|------|-------------|
| 19 | **Rate of Change Exceeded** | The input or output value is changing faster than the configured threshold. The error detail shows the actual rate, the threshold, and the direction. This is a safety check — if the rate is expected, increase the ROC threshold on the binding's Configuration tab. |
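
One plausible way to picture a rate-of-change check is shown below. It assumes the rate is the change in value divided by the elapsed seconds between two consecutive samples and that the threshold applies in both directions; the function names and detail format are hypothetical, and the exact Koios calculation is not specified on this page.

```python
from typing import Optional


def rate_of_change(prev_value: float, prev_time_s: float,
                   value: float, time_s: float) -> float:
    """Change in value per second between two consecutive samples."""
    dt = time_s - prev_time_s
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")
    return (value - prev_value) / dt


def check_roc(rate: float, threshold: float) -> Optional[str]:
    """Return a detail string if the rate magnitude exceeds the threshold."""
    if abs(rate) <= threshold:
        return None
    direction = "rising" if rate > 0 else "falling"
    return f"rate {abs(rate):.2f}/s exceeds threshold {threshold:.2f}/s ({direction})"


rate = rate_of_change(100.0, 0.0, 130.0, 10.0)
print(rate)                   # 3.0
print(check_roc(rate, 2.0))   # rate 3.00/s exceeds threshold 2.00/s (rising)
print(check_roc(-1.5, 2.0))   # None
```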

### Data Processing Errors

| Code | Name | Description |
|------|------|-------------|
| 7 | **Failed to Structure History** | The binding's historical data could not be interpolated or organized. This can happen if the data contains unexpected values or if the timestamps are inconsistent. |
| 16 | **Failed to Structure Input Data** | The preprocessed data could not be assembled into a valid input tensor. This usually indicates a mismatch between the data shape and what the model expects. |

### Output Errors

These errors apply to output bindings after inference has completed.

| Code | Name | Description |
|------|------|-------------|
| 8 | **Prediction Index Missing** | The model's output doesn't have a value at this binding's configured **Output Index**. Verify the output index matches the model architecture. |
| 9 | **Failed to Scale Prediction** | The predicted value could not be denormalized back to the tag's original scale. Check the output binding's normalization range. |
| 10 | **Failed to Write Prediction** | The output value could not be written to the live data cache. This is rare and usually indicates a cache connectivity issue. |
| 18 | **General Model Failure** | The model failed to produce a prediction, so this output binding has no value. The root cause is in the input bindings or the model-level error — check those first. |

### Other

| Code | Name | Description |
|------|------|-------------|
| 15 | **Binding Not in Dictionary** | Internal error — the binding was expected in a lookup table but wasn't found. This should not occur in normal operation. |
| 999 | **Unlicensed** | The binding cannot be processed because the Koios license does not cover it. |
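
For scripted triage, the binding tables above can be collected into one lookup. The codes, names, and groups come from this page; the `root_causes` helper, the binding names, and the use of 0 for a healthy binding are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# code -> (name, table heading)
BINDING_ERRORS: Dict[int, Tuple[str, str]] = {
    1: ("Not Enough Historical Depth", "History & Data"),
    2: ("Stale History Data", "History & Data"),
    3: ("No Range Given", "Range & Normalization"),
    4: ("Invalid Range Given", "Range & Normalization"),
    5: ("Value Out of Range", "Range & Normalization"),
    6: ("Failed to Normalize", "Range & Normalization"),
    7: ("Failed to Structure History", "Data Processing"),
    8: ("Prediction Index Missing", "Output"),
    9: ("Failed to Scale Prediction", "Output"),
    10: ("Failed to Write Prediction", "Output"),
    11: ("Binding Tag Disabled", "Tag State"),
    12: ("Binding Tag Bad Quality", "Tag State"),
    13: ("Binding Tag Failed", "Tag State"),
    14: ("Binding Tag Not Assigned", "Tag State"),
    15: ("Binding Not in Dictionary", "Other"),
    16: ("Failed to Structure Input Data", "Data Processing"),
    17: ("Upstream Model Failure", "Tag State"),
    18: ("General Model Failure", "Output"),
    19: ("Rate of Change Exceeded", "Rate of Change"),
    20: ("On-Demand Read Failed", "On-Demand"),
    999: ("Unlicensed", "Other"),
}

GENERAL_MODEL_FAILURE = 18  # a consequence of another failure, not a cause


def root_causes(bindings: Dict[str, int]) -> List[str]:
    """List failing bindings, skipping healthy ones (assumed 0) and code 18."""
    report = []
    for name, code in bindings.items():
        if code in (0, GENERAL_MODEL_FAILURE):
            continue
        label, group = BINDING_ERRORS.get(code, ("Unknown", "Unknown"))
        report.append(f"{name}: {code} {label} [{group}]")
    return report


bindings = {"inlet_temp": 11, "flow": 0, "predicted_pressure": 18}
print(root_causes(bindings))
# ['inlet_temp: 11 Binding Tag Disabled [Tag State]']
```

Filtering out code 18 reflects the guidance in the Output Errors table: an output binding with General Model Failure is a symptom, so the bindings worth investigating are the ones that remain.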

## Common Scenarios

### Model failed — all inputs show "Tag Disabled"

All input bindings have error code 11 (Binding Tag Disabled). The model depends on tags that have been stopped — either individually or because their parent device was disabled. Re-enable the tags or device.

### Model failed — all inputs show "Not Enough Historical Depth"

This is normal when a model is first enabled or when the device was recently restarted. The model needs time to accumulate enough historical data. Wait for the device to complete enough scan cycles to fill the model's sample window (depth × scan rate).
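
To estimate how long the warm-up will take, multiply the number of samples still needed by the time between scans. The sketch below assumes the scan rate is a fixed interval expressed in seconds; the function and parameter names are hypothetical.

```python
def warmup_seconds(depth: int, scan_interval_s: float,
                   samples_collected: int = 0) -> float:
    """Estimate the seconds remaining until the input window is full."""
    if depth <= 0 or scan_interval_s <= 0:
        raise ValueError("depth and scan_interval_s must be positive")
    remaining = max(depth - samples_collected, 0)
    return remaining * scan_interval_s


print(warmup_seconds(depth=60, scan_interval_s=5.0))                        # 300.0
print(warmup_seconds(depth=60, scan_interval_s=5.0, samples_collected=45))  # 75.0
```

With a depth of 60 and a 5-second scan interval, a freshly enabled model would need about five minutes of data before its first successful cycle.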

### Model failed — one input shows "Upstream Model Failure"

This binding reads from a tag that is written by another model's output. That upstream model is currently failing, so this model can't get valid input. Fix the upstream model first — this one will recover automatically.
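
When models are chained several levels deep, it helps to walk the chain back to the model that failed first. The sketch below assumes you have assembled, by hand or from your own records, a map of which models feed which; the function, the map, and the model names are all hypothetical and do not correspond to a Koios API.

```python
from typing import Dict, List, Set


def find_root_models(model: str,
                     upstream: Dict[str, List[str]],
                     failed: Set[str]) -> List[str]:
    """Follow failing upstream models until reaching ones with no failing upstream."""
    roots: List[str] = []
    seen: Set[str] = set()
    stack = [model]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        failing_parents = [m for m in upstream.get(current, []) if m in failed]
        if failing_parents:
            stack.extend(failing_parents)  # keep walking upstream
        elif current in failed:
            roots.append(current)  # failing, with nothing failing above it
    return sorted(roots)


upstream = {
    "quality_model": ["soft_sensor"],
    "soft_sensor": ["filter_model"],
    "filter_model": [],
}
failed = {"quality_model", "soft_sensor", "filter_model"}

print(find_root_models("quality_model", upstream, failed))  # ['filter_model']
```

In this example only `filter_model` needs direct attention; the two downstream models should recover on their own once it is healthy again.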

### Model is running but output shows "Value Out of Range"

The model produced a prediction, but the output value exceeded the failure bounds. This could mean:

- The input data has drifted outside the range the model was trained on
- The failure bounds are set too tightly for the model's expected output
- The normalization range doesn't match the training data's range

Review the output binding's failure range settings on the Configuration tab.

### Model failed — "On-demand read failed"

The model (or its scan group) is configured for on-demand inference, but the device didn't respond to the read request in time. Check that:

- The device is powered on and reachable on the network
- The device connection parameters are correct (see [Troubleshooting a Device](https://ai-ops.com/docs/devices/troubleshooting.md))
- The **On-Demand Timeout** is long enough for the device to respond — some devices take several seconds

### Model failed — "Input tensor shape mismatch"

The number of input bindings doesn't match what the model file expects. This usually happens after uploading a new model file with a different architecture. Go to the **Bindings** tab and reconcile the bindings to match the new model's input/output structure.

### Scan group failed — all models show "On-demand read failed"

When a scan group's shared on-demand read fails, all models in the group fail together. The group-level error appears on the scan group's detail page. Fix the device connection issue, then all models in the group will recover on the next cycle.

## How Errors Are Resolved

Model and binding errors clear automatically when the next inference cycle completes successfully. You don't need to manually acknowledge or dismiss them — once the underlying issue is resolved (e.g. the tag is re-enabled, the device recovers, or enough history accumulates), the next successful cycle resets all error codes and clears the error messages.

If a model is stuck in a failed state:

1. Check the **Bindings** tab for specific per-binding errors — these are more actionable than the model-level error
2. Review the model's **logs** (on the Logs tab) — set the log level to **Debug** for maximum detail about each pipeline stage
3. Verify all input tags are **enabled and running** — a single disabled or failed tag can block the entire model
4. Check the tag devices — if a device is down, all tags on it fail, which cascades to every model using those tags
5. Try toggling the model off and on with the **Enabled** switch
