---
title: "On-Demand Scanning"
description: "Configure device-side on-demand read and write settings for AI model synchronization"
source_url: https://ai-ops.com/docs/devices/on-demand-scanning
---

# On-Demand Scanning

On-demand scanning allows AI models to trigger an immediate device read or write outside the normal scan cycle. The device side controls **when cached data is fresh enough to reuse** and **how long to wait before executing** (to batch concurrent requests).

> [!NOTE] Two sides of on-demand
> This page covers **device-side** settings. For the model-side settings and full on-demand cycle, see [On-Demand Inference](https://ai-ops.com/docs/models/on-demand-inference.md).

## Settings Reference

Found on the device's **Configuration** tab under **Advanced Configuration**.

| Setting | Description | Default | Range |
|---------|-------------|---------|-------|
| **On-Demand Freshness** | Max age (seconds) of cached data before a fresh read is required | 1s | 0–3,600s |
| **On-Demand Batch Window** | Time (seconds) to wait before executing, batching concurrent requests | 0.1s | 0–10s |
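
The ranges and defaults in the table can be expressed as a small validation sketch. The class and field names below (`OnDemandSettings`, `freshness_s`, `batch_window_s`) are hypothetical and are not part of any Koios API; only the numeric limits and defaults come from the table.

```python
from dataclasses import dataclass

# Ranges are taken from the settings table above (seconds).
FRESHNESS_RANGE_S = (0.0, 3600.0)
BATCH_WINDOW_RANGE_S = (0.0, 10.0)


@dataclass(frozen=True)
class OnDemandSettings:
    """Hypothetical container for a device's on-demand settings."""
    freshness_s: float = 1.0     # documented default
    batch_window_s: float = 0.1  # documented default

    def __post_init__(self) -> None:
        lo, hi = FRESHNESS_RANGE_S
        if not lo <= self.freshness_s <= hi:
            raise ValueError(f"freshness_s must be within {lo}-{hi} s")
        lo, hi = BATCH_WINDOW_RANGE_S
        if not lo <= self.batch_window_s <= hi:
            raise ValueError(f"batch_window_s must be within {lo}-{hi} s")
```

Constructing `OnDemandSettings(batch_window_s=11)` raises `ValueError`, which mirrors the upper limit of 10s shown in the table.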

---

## On-Demand Freshness

When an on-demand read is requested, Koios checks the age of the cached data. If the data is newer than the freshness threshold, Koios returns it immediately and skips the device read.

```text
Freshness = 5s

    Cache age: 3s  →  Return cached (fresh enough)
    Cache age: 8s  →  Read device (stale)
```

**Setting to 0** means every request triggers a fresh read. This guarantees the freshest data but increases device I/O.
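
The freshness decision can be sketched as a single comparison. This is an illustration only, not the Koios implementation: `should_read_device` is a hypothetical name, and treating an age exactly equal to the threshold as stale is an assumption (the text only says data newer than the threshold is reused).

```python
def should_read_device(cache_age_s: float, freshness_s: float) -> bool:
    """Return True when the cached value is too old to reuse.

    Assumption: an age equal to the threshold counts as stale.
    """
    return cache_age_s >= freshness_s


# The two cases from the diagram above (freshness = 5 s).
print(should_read_device(3.0, 5.0))  # False: cached value is reused
print(should_read_device(8.0, 5.0))  # True: device is read
```

Under this comparison, a threshold of 0 makes every non-negative cache age stale, so every request reads the device.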

> [!TIP] Freshness vs scan rate
> If your device has a fast scan rate (e.g. 1s) and freshness is set to 3s, most on-demand requests will be served from cache without extra reads.

---

## On-Demand Batch Window

When multiple models request reads from the same device at nearly the same time, the batch window groups them into a **single device read**.

```text
Batch window = 0.5s

  t=0.0s  Model A requests read  ─┐
  t=0.1s  Model B requests read   ├─→ Single read at t=0.5s
  t=0.3s  Model C requests read  ─┘
```
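
The grouping in the timeline above can be approximated with a short simulation over request timestamps. This is a sketch under stated assumptions, not the actual scheduler: the function name is hypothetical, the window is assumed to open at the first request that arrives while no window is open, and a request arriving exactly when the window closes is assumed to start a new batch.

```python
def group_into_batches(request_times_s, batch_window_s):
    """Group request timestamps into batches that share one device read.

    Returns a list of (read_time_s, [request_times]) tuples.
    Assumption: the window opens at the first unbatched request.
    """
    batches = []
    window_close = None
    for t in sorted(request_times_s):
        if window_close is not None and t < window_close:
            batches[-1][1].append(t)  # request joins the open window
        else:
            window_close = t + batch_window_s  # open a new window
            batches.append((window_close, [t]))
    return batches


# Timeline from the diagram above: three requests, one read at t=0.5s.
print(group_into_batches([0.0, 0.1, 0.3], 0.5))
# [(0.5, [0.0, 0.1, 0.3])]
```

With `batch_window_s=0`, each request forms its own batch in this sketch, so nothing is grouped.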

Increasing the batch window lets more requests share a single read, which reduces device I/O, but it **adds latency** to every on-demand cycle that reaches the device. The model must wait for the window to close before the device is polled.

> [!WARNING] Batch window adds to model timeout
> A 2s batch window means the model waits up to 2s before the device is even read. Make sure the model's on-demand timeout accommodates: batch window + device read time.

**Setting to 0** means every request is executed immediately with no batching.

---

## How It Fits Together

```text
Model requests read → Freshness check
                         ├─ Fresh → Return cached data
                         └─ Stale → Batch window → Read device → Return fresh data
                                                                       ↓
                                                                 Model infers
                                                                       ↓
                                                                 Model writes
```

Total on-demand latency when the cache is stale: **batch window + device read time**. A request served from cache skips both. The stale-cache total must be less than the model's on-demand timeout.
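
As a rough check of this budget, the worst case (stale cache) can be compared against the model's timeout. The function names and the example numbers below are hypothetical; the device read time in particular has to be measured for your device.

```python
def worst_case_latency_s(batch_window_s: float, device_read_s: float) -> float:
    """Stale-cache case: wait out the batch window, then read the device."""
    return batch_window_s + device_read_s


def fits_timeout(batch_window_s: float, device_read_s: float,
                 model_timeout_s: float) -> bool:
    """True when the worst-case latency is below the model's timeout."""
    return worst_case_latency_s(batch_window_s, device_read_s) < model_timeout_s


# Hypothetical numbers: 2 s batch window, 0.5 s device read.
print(worst_case_latency_s(2.0, 0.5))  # 2.5
print(fits_timeout(2.0, 0.5, 3.0))     # True
print(fits_timeout(2.0, 0.5, 2.0))     # False
```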

---

## Configuration Guidelines

| Scenario | Freshness | Batch Window |
|----------|-----------|--------------|
| Single model, fast control loop | 0s | 0s |
| Single model, moderate polling | 3–5s | 0s |
| Multiple models, same scan rate | 3–5s | 0.5s |
| Multiple models, high device I/O cost | 5–10s | 1–2s |

> [!TIP] Start simple
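> Set freshness and batch window to 0 for the simplest behavior: every request triggers its own immediate device read. Note that the defaults are 1s and 0.1s, not 0. Raise the values later if you need to reduce device I/O.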
