|**1️⃣ Re‑use the last host**| If the model was previously run on a specific host and that host is currently free, select it. | The host identifier is stored in a Redis key `:last_host`. The strategy checks that the host is not occupied by another model and attempts an atomic acquisition of a provider on that host. |
|**2️⃣ Re‑use any known host**| Prefer any host that already has the model loaded. | A Redis set `:hosts` tracks all hosts where the model is currently loaded. The strategy scans the provider list, picks a free provider on one of those hosts, and locks it atomically. |
|**3️⃣ Pick an unused host**| Spread the load to a fresh host when no suitable “known” host is available. | It looks for a provider whose host is **not** present in the `:hosts` set and is not occupied, then acquires it. |
|**4️⃣ Fallback to plain first‑available**| Guarantees a result even if the optimisation steps fail. | If none of the previous steps succeed, the strategy delegates to the base `FirstAvailableStrategy`, which simply selects the first free provider. |
|**5️⃣ Book‑keeping**| Keep the optimisation data up‑to‑date for future requests. | After a provider is successfully acquired, the host is recorded as the *last host* (`:last_host`), added to the model‑specific host set (`:hosts`), and marked as occupied in a Redis hash `:occupancy`. |
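
The selection order in the table above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: plain dictionaries and sets stand in for the Redis keys `:last_host`, `:hosts` and `:occupancy`, and the names `Provider`, `OptimState` and `pick_provider` are hypothetical. It also assumes that a host counts as usable for a model when it is either unoccupied or already occupied by that same model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass(frozen=True)
class Provider:
    """Hypothetical provider record: a named endpoint living on a host."""
    name: str
    host: str


@dataclass
class OptimState:
    """In-memory stand-in for the Redis keys described in the table."""
    last_host: Dict[str, str] = field(default_factory=dict)   # model -> host (`:last_host`)
    hosts: Dict[str, Set[str]] = field(default_factory=dict)  # model -> hosts (`:hosts`)
    occupancy: Dict[str, str] = field(default_factory=dict)   # host -> model (`:occupancy`)
    busy: Set[str] = field(default_factory=set)               # names of acquired providers


def pick_provider(model: str, providers: List[Provider],
                  state: OptimState) -> Optional[Provider]:
    known = state.hosts.get(model, set())
    last = state.last_host.get(model)

    def host_usable(host: str) -> bool:
        # Assumption: usable means unoccupied or occupied by this same model.
        return state.occupancy.get(host, model) == model

    def first(pred: Callable[[Provider], bool]) -> Optional[Provider]:
        # Return the first provider that is not acquired and satisfies pred.
        for p in providers:
            if p.name not in state.busy and pred(p):
                return p
        return None

    chosen = (
        # Step 1: re-use the last host of this model.
        (first(lambda p: p.host == last and host_usable(p.host)) if last else None)
        # Step 2: re-use any host known to have the model loaded.
        or first(lambda p: p.host in known and host_usable(p.host))
        # Step 3: pick a host that is neither known nor occupied.
        or first(lambda p: p.host not in known and p.host not in state.occupancy)
        # Step 4: fall back to plain first-available.
        or first(lambda p: True)
    )
    if chosen is None:
        return None

    # Step 5: book-keeping for future requests.
    state.busy.add(chosen.name)
    state.last_host[model] = chosen.host
    state.hosts.setdefault(model, set()).add(chosen.host)
    state.occupancy[chosen.host] = model
    return chosen
```

In this sketch, two consecutive requests for the same model land on the same host while it still has a free provider, and a request for a different model is sent to a host that nothing occupies yet. A real multi-worker deployment would additionally need each check-and-acquire step to be atomic, which a single-process sketch like this does not show.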

**When to use it**

* **Low‑latency / high‑throughput workloads** – Re‑using the same host avoids the overhead of re‑loading a large model,
  resulting in faster responses.
* **Environments with a limited number of hosts** – The strategy maximises the utilization of already‑occupied hosts
  while still allowing the load to be spread when needed.
* **Multi‑worker deployments** – Because all state is stored in Redis and providers are acquired atomically, many
  processes (or even different machines) can safely share the optimisation logic without race conditions.
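
To illustrate why atomic acquisition matters when several workers compete for one provider, here is a small sketch under stated assumptions. `FakeKeyStore` is a hypothetical in-memory stand-in that mimics the "set only if the key does not exist" semantics of Redis `SET ... NX` (in redis-py, `Redis.set(name, value, nx=True)`). The key layout `provider:<name>:lock` is invented for the example, and the source does not say which Redis primitive the strategy actually uses.

```python
import threading


class FakeKeyStore:
    """In-memory stand-in emulating Redis `SET key value NX` semantics."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set_nx(self, key, value):
        # Check and write happen under one lock, so they form a single atomic step.
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def delete(self, key):
        with self._lock:
            self._data.pop(key, None)


def try_acquire(store, provider, worker_id):
    # Hypothetical key layout; returns True only for the worker that wins.
    return store.set_nx(f"provider:{provider}:lock", worker_id)


def race(n_workers=8):
    """Let n_workers threads compete for one provider; return the winners."""
    store = FakeKeyStore()
    winners = []

    def worker(i):
        if try_acquire(store, "gpu-0", f"worker-{i}"):
            winners.append(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

However many workers call `try_acquire` for the same provider, `race()` returns exactly one winner. Replacing the single atomic step with a separate "check, then write" would allow two workers to both see the provider as free and both claim it.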

**Summary**
`first_available_optim` combines the simplicity of the *first‑available* approach with smart host reuse, reducing
model‑loading latency and improving overall throughput while still providing a reliable fallback mechanism. All
coordination is performed via Redis, ensuring safe concurrent operation across multiple workers.

---

## Environments and Redis installation

The connection details for Redis can be configured using environment variables: