
Commit fea388f

removes unneeded metrics
Signed-off-by: Elena Kolevska <elena@kolevska.com>
1 parent 8da8ccd commit fea388f

1 file changed: +10 -52 lines changed


REDIS_TESTING_METRICS_SPECIFICATION.md

Lines changed: 10 additions & 52 deletions
@@ -24,7 +24,6 @@ Redis Test Apps → OpenTelemetry Collector → Prometheus → Grafana
 | `redis_operations_total` | Counter | `1` | Total number of Redis operations executed | `operation`, `status`, `app_name`, `instance_id`, `version`, `error_type` | Records every Redis command |
 | `redis_operation_duration` | Histogram | `ms` | Duration of Redis operations in milliseconds | `operation`, `status`, `app_name`, `instance_id`, `version` | Buckets: `[0.1, 0.5, 1, 2, 5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000]` |
 | `redis_connections_total` | Counter | `1` | Total number of Redis connection attempts | `status`, `app_name`, `instance_id`, `version` | Tracks connection success/failure |
-| `redis_active_connections` | Gauge | `1` | Current number of active Redis connections | `app_name`, `instance_id`, `version` | Current state, not cumulative |
 | `redis_reconnection_duration_ms` | Histogram | `ms` | Duration of Redis reconnection attempts | `app_name`, `instance_id`, `version` | Buckets: `[100, 500, 1000, 2000, 5000, 10000, 30000, 60000]` |

 ### Label Definitions
@@ -45,46 +44,11 @@ Redis Test Apps → OpenTelemetry Collector → Prometheus → Grafana
 redis_operations_total{operation="SET", status="success", app_name="python-basic-rw", instance_id="abc123", version="1.0.0", error_type="none"} 1500
 redis_operations_total{operation="GET", status="error", app_name="python-basic-rw", instance_id="abc123", version="1.0.0", error_type="timeout"} 5

-# Active connections gauge
-redis_active_connections{app_name="python-basic-rw", instance_id="abc123", version="1.0.0"} 5
-
 # Connection attempts counter
 redis_connections_total{status="success", app_name="python-basic-rw", instance_id="abc123", version="1.0.0"} 10
 redis_connections_total{status="error", app_name="python-basic-rw", instance_id="abc123", version="1.0.0"} 2
 ```

-## Standard Redis Operations
-
-All applications should support these standard Redis operations for consistent testing:
-
-| Category | Operation | Description | Use Case |
-|----------|-----------|-------------|----------|
-| **Core** | `SET` | Set string value | Basic key-value operations |
-| | `GET` | Get string value | Basic key-value operations |
-| | `DEL` | Delete key | Cleanup and key management |
-| | `INCR` | Increment integer value | Counters and atomic operations |
-| | `PING` | Connection test | Health checks |
-| **Lists** | `LPUSH` | Push to left of list | Queue operations |
-| | `RPUSH` | Push to right of list | Stack operations |
-| | `LPOP` | Pop from left of list | Queue processing |
-| | `RPOP` | Pop from right of list | Stack processing |
-| | `LRANGE` | Get range from list | List inspection |
-| | `LLEN` | Get list length | List size monitoring |
-| **Sets** | `SADD` | Add to set | Unique collections |
-| | `SREM` | Remove from set | Set management |
-| | `SCARD` | Get set cardinality | Set size monitoring |
-| | `SMEMBERS` | Get all set members | Set inspection |
-| **Hashes** | `HSET` | Set hash field | Object storage |
-| | `HGET` | Get hash field | Object field access |
-| | `HDEL` | Delete hash field | Object field management |
-| | `HGETALL` | Get all hash fields | Object inspection |
-| **Sorted Sets** | `ZADD` | Add to sorted set | Ranked collections |
-| | `ZREM` | Remove from sorted set | Ranked management |
-| | `ZCARD` | Get sorted set cardinality | Ranked size monitoring |
-| | `ZRANGE` | Get range from sorted set | Ranked queries |
-| **Pub/Sub** | `PUBLISH` | Publish message | Message broadcasting |
-| | `SUBSCRIBE` | Subscribe to channel | Message consumption |
-
 ## Label Standards

 ### App Name Convention
@@ -195,20 +159,20 @@ All metric queries should support these filters for consistent dashboard behavio

 ## Example Grafana Queries

-**Note on Time Ranges**: The `[5m]` in these queries represents a 5-minute time window for rate calculations. This provides:
-- **Smoothed metrics**: Reduces noise from short-term spikes
-- **Stable rates**: More reliable rate calculations over time
-- **Better visualization**: Smoother graphs in Grafana
+**Note on Time Ranges**: The `[10s]` in these queries represents a 10-second time window for rate calculations. This provides:
+- **Near real-time response**: Changes visible within 10 seconds
+- **Good balance**: Responsive enough for monitoring while reducing noise
+- **Fast failure detection**: Quickly shows when Redis goes down
+

-For real-time monitoring, you can use shorter windows like `[30s]` or `[1m]`, but expect more volatile graphs.

 ### Operations Rate by Status
 ```promql
-# 5-minute rate (recommended for stable visualization)
-sum(rate(redis_operations_total{app_name=~"$app_name", instance_id=~"$instance_id", operation=~"$operation", version=~"$version"}[5m])) by (operation, status)
+# 10-second rate (recommended for near real-time monitoring)
+sum(rate(redis_operations_total{app_name=~"$app_name", instance_id=~"$instance_id", operation=~"$operation", version=~"$version"}[10s])) by (operation, status)

-# 1-minute rate (more real-time, more volatile)
-sum(rate(redis_operations_total{app_name=~"$app_name", instance_id=~"$instance_id", operation=~"$operation", version=~"$version"}[1m])) by (operation, status)
+# 5-minute rate (smoother, less responsive)
+sum(rate(redis_operations_total{app_name=~"$app_name", instance_id=~"$instance_id", operation=~"$operation", version=~"$version"}[5m])) by (operation, status)
 ```

 ### Average Latency by Operation
@@ -229,11 +193,6 @@ histogram_quantile(0.95, sum(rate(redis_operation_duration_bucket{app_name=~"$ap
 histogram_quantile(0.95, sum(rate(redis_operation_duration_bucket{app_name=~"$app_name", instance_id=~"$instance_id", operation=~"$operation", version=~"$version"}[1m])) by (operation, le))
 ```

-### Active Connections by App
-```promql
-redis_active_connections{app_name=~"$app_name", instance_id=~"$instance_id", version=~"$version"}
-```
-
 ### Connection Success Rate
 ```promql
 # 5-minute success rate (recommended)
@@ -410,7 +369,6 @@ spec:

 3. **Incorrect Metric Types**
 - Use Counter for cumulative values (operations_total, connections_total)
-- Use Gauge for current state (active_connections)
 - Use Histogram for distributions (duration metrics)

 4. **Performance Issues**
@@ -420,7 +378,7 @@ spec:

 ### Validation Checklist

-- [ ] All 5 required metrics are implemented
+- [ ] All 4 required metrics are implemented
 - [ ] All required labels are present and correctly formatted
 - [ ] App name follows naming convention
 - [ ] Instance ID is unique per application instance
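
A note on the `[10s]` window introduced in this commit: PromQL's `rate()` needs at least two samples inside the range to return a value, so a 10-second window only yields data if the scrape (or export) interval is comfortably shorter than 10 seconds. The diff does not state that interval. The following is a minimal Python sketch of the effect, assuming a hypothetical 15-second interval and a deliberately simplified rate calculation (no extrapolation to the window edges, so it is not Prometheus's exact algorithm). The name `simple_rate` and the sample data are illustrative only.

```python
from typing import List, Optional, Tuple

# (unix_timestamp_seconds, counter_value)
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample], now: float, window_s: float) -> Optional[float]:
    """Per-second increase across the samples that fall in (now - window_s, now].

    Simplified on purpose: unlike PromQL's rate(), it does not extrapolate
    to the window boundaries. Returns None when fewer than two samples
    fall inside the window, which mirrors rate() returning no data.
    """
    in_window = [(t, v) for t, v in samples if now - window_s < t <= now]
    if len(in_window) < 2:
        return None

    increase = 0.0
    for (_, prev), (_, cur) in zip(in_window, in_window[1:]):
        # A drop in a counter is treated as a reset: count from zero again.
        increase += cur - prev if cur >= prev else cur

    elapsed = in_window[-1][0] - in_window[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical data: one sample every 15 s, counter growing by 10 ops/s.
samples = [(float(t), 10.0 * t) for t in range(0, 61, 15)]

print(simple_rate(samples, now=60.0, window_s=10.0))  # None: only one sample in the window
print(simple_rate(samples, now=60.0, window_s=60.0))  # several samples, so a rate is returned
```

If the pipeline's interval is in fact longer than about 5 seconds, panels using `[10s]` may render as empty or gappy; a window of several scrape intervals is a common way to avoid that.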
