site-src/guides/epp-configuration/config-text.md
Lines changed: 20 additions & 18 deletions
@@ -34,19 +34,19 @@ featureGates:
 
 The first two lines of the configuration are constant and must appear as is.
 
-The plugins section defines the set of plugins that will be instantiated and their parameters. This section is described in more detail in the section [Configuring Plugins via text](#configuring-plugins-via-text)
+The `featureGates` section allows the enablement of experimental features of the IGW. This section is
+described in more detail in the section [Feature Gates](#feature-gates)
+
+The `plugins` section defines the set of plugins that will be instantiated and their parameters. This section is described in more detail in the section [Configuring Plugins via text](#configuring-plugins-via-text)
 
-The schedulingProfiles section defines the set of scheduling profiles that can be used in scheduling
+The `schedulingProfiles` section defines the set of scheduling profiles that can be used in scheduling
 requests to pods. This section is described in more detail in the section [Configuring Plugins via YAML](#configuring-plugins-via-yaml)
 
-The saturationDetector section configures the saturation detector, which is used to determine if special
+The `saturationDetector` section configures the saturation detector, which is used to determine if special
 action needs to be taken due to the system being overloaded or saturated. This section is described in more detail in the section [Saturation Detector configuration](#saturation-detector-configuration)
 
-The data section configures the data layer, which is used to gather metrics and other data used in making scheduling decisions.
-This section is described in more detail in the section [Data Layer configuration](#data-layer-configuration)
-
-The featureGates sections allows the enablement of experimental features of the IGW. This section is
-described in more detail in the section [Feature Gates](#feature-gates)
+The `data` section configures the data layer, which is used to gather information (such as metrics) used in making scheduling
+decisions. This section is described in more detail in the section [Data Layer configuration](#data-layer-configuration)
 
 ## Configuring Plugins via YAML
 
@@ -69,7 +69,7 @@ In addition, the set of instantiated plugins can also include a picker, which ch
 the request is scheduled after filtering and scoring. If one is not referenced in a SchedulingProfile, an
 instance of `MaxScorePicker` will be added to the SchedulingProfile in question.
 
-The plugins section defines the set of plugins that will be instantiated and their parameters.
+The `plugins` section defines the set of plugins that will be instantiated and their parameters.
 Each entry in this section has the following form:
 
 ```yaml
@@ -88,7 +88,7 @@ field is omitted, the plugin's type will be used as its name.
 - *parameters* which is optional, defines the set of parameters used to configure the plugin in question.
 The actual set of parameters varies from plugin to plugin.
 
-The schedulingProfiles section defines the set of scheduling profiles that can be used in scheduling
+The `schedulingProfiles` section defines the set of scheduling profiles that can be used in scheduling
 requests to pods. The number of scheduling profiles one defines, depends on the use case. For simple
 serving of requests, one is enough. For disaggregated prefill, two profiles are required. Each entry
 in this section has the following form:
@@ -313,7 +313,7 @@ The Saturation Detector determines that the cluster is saturated by looking at t
 - KV cache utilization
 - Metrics staleness
 
-The Saturation Detector is configured via the saturationDetector section of the overall configuration.
+The Saturation Detector is configured via the `saturationDetector` section of the overall configuration.
 It has the following form:
 
 ```yaml
@@ -323,7 +323,7 @@ saturationDetector:
   metricsStalenessThreshold: 150ms
 ```
 
-The various sub-fields of the saturationDetector section are:
+The various sub-fields of the `saturationDetector` section are:
 
 - The `queueDepthThreshold` field which defines the backend waiting queue size above which a
 pod is considered to have insufficient capacity for new requests. This field is optional, if
@@ -338,18 +338,18 @@ as having no capacity for safety. This field is optional, if omitted a value of
 ## Data Layer configuration
 
 The Data Layer collects metrics and other data used in scheduling decisions made by the various configured
-filters and plugins. The exact data collected varies by the DataSource and Extractors configured. The basic ones
-collect Prometheus metrics from the Model Servers in the InferencePool.
+plugins. The exact data collected varies by the DataSource and Extractors configured. The baseline
+provided in GAIE collect Prometheus metrics from the Model Servers in the InferencePool.
 
-The Data Layer is configured via the data section of the overall configuration. It has the following form:
+The Data Layer is configured via the `data` section of the overall configuration. It has the following form:
 
 ```yaml
 data:
   sources:
   - pluginRef: source1
     extractors:
-    - extarctor1
-    - extractor2
+    - pluginRef: extractor1
+    - pluginRef: extractor2
 ```
 
 The data section has one field *sources* which configures the set of DataSources to be used to gather the metrics
@@ -358,7 +358,9 @@ and other data used for scheduling.
 Each entry in the sources list has the following fields:
 
 - *pluginRef* is a reference to the name of the plugin instance to be used.
-- *extractors* specifies the list of the extractor plugin instances, by name, to be used with this DataSource.
+- *extractors* specifies the list of the extractors to be used with this DataSource. Each entry in the extractors
+list has the following field:
+    - *pluginRef* is a reference to the name of the plugin instances to be used.
 
 **Note**: The names of the plugin instances mentioned above, refer to plugin instances defined in the plugins section
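
To illustrate the note above, here is a rough sketch of how the `pluginRef` values in the `data` section might line up with entries in the `plugins` section after this change. The `type` values are hypothetical placeholders, not plugin types known to ship with GAIE, and the `name`/`type` field spellings are assumed from the plugin entry description earlier in the file.

```yaml
# Sketch only: every "type" value below is a hypothetical placeholder.
plugins:
- name: source1
  type: example-data-source        # hypothetical plugin type
- name: extractor1
  type: example-extractor          # hypothetical plugin type
- name: extractor2
  type: another-example-extractor  # hypothetical plugin type
data:
  sources:
  - pluginRef: source1             # refers to the plugin named "source1"
    extractors:
    - pluginRef: extractor1        # new form: an object with a pluginRef field
    - pluginRef: extractor2        # old form was a bare name: "- extractor2"
```

Under this reading, each extractor entry becomes an object rather than a bare string, which presumably leaves room for additional per-extractor fields later.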