On-Demand Scanning

On-demand scanning allows AI models to trigger an immediate device read or write outside the normal scan cycle. The device side controls when cached data is fresh enough to reuse and how long to wait before executing (to batch concurrent requests).

Two sides of on-demand

This page covers device-side settings. For the model-side settings and full on-demand cycle, see On-Demand Inference.

Settings Reference

Found on the device's Configuration tab under Advanced Configuration.

Setting	Description	Default	Range
On-Demand Freshness	Max age (seconds) of cached data before a fresh read is required	1s	0–3,600s
On-Demand Batch Window	Time (seconds) to wait before executing, batching concurrent requests	0.1s	0–10s

On-Demand Freshness

When an on-demand read is requested, Koios checks how old the cached data is. If the cached data is newer than the freshness threshold, it is returned immediately and no device read is needed.

Freshness = 5s

    Cache age: 3s  →  Return cached (fresh enough)
    Cache age: 8s  →  Read device (stale)

Setting freshness to 0 means every request triggers a fresh read. This guarantees the freshest data but increases device I/O.
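The freshness decision above can be sketched in a few lines. This is only an illustration, not Koios code: the function name `serve_from_cache` and its parameters are hypothetical, and the strict comparison at the exact threshold is an assumption chosen so that a freshness of 0 always forces a device read.

```python
def serve_from_cache(cache_age_s: float, freshness_s: float) -> bool:
    """Return True if cached data is newer than the freshness threshold.

    Hypothetical helper for illustration; not part of any Koios API.
    """
    # Strict comparison (an assumption): with freshness 0, no cache age
    # passes, so every request falls through to a device read.
    return cache_age_s < freshness_s


# Reproduce the two cases from the example above (freshness = 5s).
for age in (3.0, 8.0):
    action = "return cached" if serve_from_cache(age, 5.0) else "read device"
    print(f"cache age {age}s -> {action}")
```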

Freshness vs scan rate

If your device scans frequently (e.g. every 1s) and freshness is set to 3s, the regular scan cycle keeps the cache younger than the threshold, so most on-demand requests will be served from cache without extra reads.

On-Demand Batch Window

When multiple models request reads from the same device at nearly the same time, the batch window groups them into a single device read.

Batch window = 0.5s

  t=0.0s  Model A requests read  ─┐
  t=0.1s  Model B requests read   ├─→ Single read at t=0.5s
  t=0.3s  Model C requests read  ─┘

Increasing the batch window improves efficiency but adds latency to every on-demand cycle. The model must wait for the window to close before the device is polled.

Batch window adds to model timeout

A 2s batch window means the model waits up to 2s before the device is even read. Make sure the model's on-demand timeout accommodates: batch window + device read time. If a binding's on-demand read is timing out, see A Model or Binding Isn't Running to diagnose it.

Setting the batch window to 0 means every request is executed immediately with no batching.
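To make the grouping concrete, here is a minimal simulation of a batch window. It is a sketch under assumptions, not the actual scheduler: the function `plan_reads` is hypothetical, the window is assumed to open at the first request of each batch (as in the timeline above), and a request arriving exactly when the window closes is assumed to start a new batch.

```python
from typing import List, Tuple


def plan_reads(
    request_times_s: List[float], batch_window_s: float
) -> List[Tuple[float, List[float]]]:
    """Group request arrival times into (read_time, requests) batches.

    Hypothetical model: each batch opens at its first request and the
    device is read once when the window closes.
    """
    batches: List[Tuple[float, List[float]]] = []
    for t in sorted(request_times_s):
        # Join the open batch only if its window has not closed yet.
        if batches and t < batches[-1][0]:
            batches[-1][1].append(t)
        else:
            batches.append((t + batch_window_s, [t]))
    return batches


# Models A, B, C from the timeline above, with a 0.5s batch window.
print(plan_reads([0.0, 0.1, 0.3], 0.5))
# The same requests with the window set to 0: one read per request.
print(plan_reads([0.0, 0.1, 0.3], 0.0))
```

With a 0.5s window the three requests share one read. With a window of 0 each request gets its own read, which matches the no-batching behavior described above.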

How It Fits Together

Model requests read → Freshness check
                         ├─ Fresh → Return cached data
                         └─ Stale → Batch window → Read device → Return fresh data
                                                                       ↓
                                                                 Model infers
                                                                       ↓
                                                                 Model writes

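Total on-demand latency when the cache is stale: batch window + device read time. When the cache is fresh, neither applies and the cached data is returned immediately. The stale-path total must be less than the model's on-demand timeout.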
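The latency budget can be checked with simple arithmetic. The sketch below uses hypothetical helper names and an assumed device read time of 0.5s; it treats a cache hit as zero added latency, following the flow above, and ignores any other overhead.

```python
def on_demand_latency_s(
    batch_window_s: float, device_read_s: float, cache_is_fresh: bool
) -> float:
    """Worst-case device-side latency for one on-demand read (hypothetical model)."""
    if cache_is_fresh:
        # Cached data is returned immediately: no batch wait, no device read.
        return 0.0
    return batch_window_s + device_read_s


def fits_timeout(batch_window_s: float, device_read_s: float, timeout_s: float) -> bool:
    """Check the stale-cache path, which is the slowest case, against the timeout."""
    return on_demand_latency_s(batch_window_s, device_read_s, False) < timeout_s


# A 2s batch window with an assumed 0.5s device read time.
print(fits_timeout(2.0, 0.5, timeout_s=2.0))  # 2.5s does not fit in a 2s timeout
print(fits_timeout(2.0, 0.5, timeout_s=5.0))  # 2.5s fits in a 5s timeout
```

Budgeting against the stale path is the conservative choice, since any request may miss the cache.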

Configuration Guidelines

Scenario	Freshness	Batch Window
Single model, fast control loop	0s	0s
Single model, moderate polling	3–5s	0s
Multiple models, same scan rate	3–5s	0.5s
Multiple models, high device I/O cost	5–10s	1–2s

Start simple

The defaults, 1s freshness and 0.1s batch window, suit most cases. Drop both to 0 for always-fresh, un-batched reads, or raise them later to reduce device I/O.
