Commit fc1b743

Adds app name support for metrics
Signed-off-by: Elena Kolevska <elena@kolevska.com>
1 parent e2988eb commit fc1b743

9 files changed: +356 -28 lines changed

.env.example

Lines changed: 4 additions & 0 deletions
@@ -59,6 +59,10 @@ OTEL_EXPORTER_OTLP_TIMEOUT=10
 OTEL_EXPORTER_JAEGER_ENDPOINT=
 OTEL_RESOURCE_ATTRIBUTES=service.name=redis-py-test-app,service.version=1.0.0
 
+# Multi-App Identification
+APP_NAME=python
+INSTANCE_ID=python-redis-test-1
+
 # Output Configuration
 OUTPUT_FILE=
 QUIET=false

METRICS.md

Lines changed: 206 additions & 0 deletions
@@ -0,0 +1,206 @@
+# Redis Test App Metrics Reference
+
+Complete reference for all metrics collected by the Redis testing applications with multi-app support.
+
+## 🏷️ Multi-Application Support
+
+All metrics include labels for filtering by application name:
+
+- **`app_name`**: Application name (python, go, java, etc.)
+- **`service_name`**: Service identifier (redis-py-test-app, redis-go-test-app, etc.)
+- **`instance_id`**: Unique instance identifier
+
+## 📊 Core Metrics
+
+### **1. Total Operations (Success/Error)**
+```
+redis_operations_total{operation="SET", status="success", app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"}
+```
+- **Type**: Counter
+- **Description**: Total number of Redis operations
+- **Labels**: operation, status, app_name, service_name, instance_id
+- **Values**: Increments on each operation
+
+### **2. Operation Latency (Percentiles)**
+```
+redis_operation_duration_seconds{operation="GET", app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"}
+```
+- **Type**: Histogram
+- **Description**: Duration of Redis operations for percentile calculation
+- **Labels**: operation, app_name, service_name, instance_id
+- **Buckets**: 0.0001s to 10s (optimized for high-performance Redis)
+
+### **3. Throughput (Operations per Second)**
+```
+redis_operations_per_second{app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"}
+```
+- **Type**: Gauge
+- **Description**: Current operations per second
+- **Labels**: app_name, service_name, instance_id
+- **Updates**: Real-time calculation
+
+### **4. Error Rate Percentage**
+```
+redis_error_rate_percent{app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"}
+```
+- **Type**: Gauge
+- **Description**: Current error rate as percentage
+- **Labels**: app_name, service_name, instance_id
+- **Range**: 0-100%
+
+### **5. Reconnection Duration**
+```
+redis_reconnection_duration_seconds{app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"}
+```
+- **Type**: Histogram
+- **Description**: Time taken for reconnection attempts
+- **Labels**: app_name, service_name, instance_id
+- **Buckets**: 0.1s to 60s
+
+### **6. Active Connections**
+```
+redis_active_connections{app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"}
+```
+- **Type**: Gauge
+- **Description**: Current number of active Redis connections
+- **Labels**: app_name, service_name, instance_id
+
+### **7. Connection Attempts**
+```
+redis_connections_total{status="success", app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"}
+```
+- **Type**: Counter
+- **Description**: Total connection attempts
+- **Labels**: status, app_name, service_name, instance_id
+
+### **8. Average Latency**
+```
+redis_average_latency_seconds{operation="SET", app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"}
+```
+- **Type**: Gauge
+- **Description**: Average operation latency (for quick overview)
+- **Labels**: operation, app_name, service_name, instance_id
+
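Every metric above carries the same three identity labels. As a minimal sketch, assuming a hypothetical helper named `format_series` (it is not part of this commit), a series selector like the ones listed could be assembled from a label dictionary as follows:

```python
from typing import Dict


def format_series(metric: str, labels: Dict[str, str]) -> str:
    """Render a metric name and its labels in Prometheus selector syntax."""

    def escape(value: str) -> str:
        # Escape backslashes and double quotes so label values stay valid.
        return value.replace("\\", "\\\\").replace('"', '\\"')

    body = ", ".join(f'{name}="{escape(value)}"' for name, value in labels.items())
    return f"{metric}{{{body}}}"


# Identity labels shared by every metric (values taken from the examples above).
IDENTITY = {
    "app_name": "python",
    "service_name": "redis-py-test-app",
    "instance_id": "python-redis-test-1",
}

series = format_series("redis_operations_per_second", IDENTITY)
print(series)
```

Per-metric labels such as `operation` or `status` would be merged into the dictionary before formatting; the helper only illustrates the label layout and does not register anything with a metrics library.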
+## 🎯 Grafana Query Examples
+
+### **Filter by Application Type**
+```promql
+# Python app operations only
+redis_operations_total{app_name="python"}
+
+# Go app operations only
+redis_operations_total{app_name="go"}
+
+# Java app operations only
+redis_operations_total{app_name="java"}
+```
+
+### **Compare Applications**
+```promql
+# Throughput comparison across app types
+sum(rate(redis_operations_total[5m])) by (app_name)
+
+# Latency comparison (95th percentile); the le label must survive the aggregation
+histogram_quantile(0.95, sum(rate(redis_operation_duration_seconds_bucket[5m])) by (le, app_name))
+
+# Error rate comparison
+avg(redis_error_rate_percent) by (app_name)
+```
+
+### **Multi-Instance Monitoring**
+```promql
+# All Python instances
+redis_operations_per_second{app_name="python"}
+
+# Specific instance
+redis_operations_per_second{instance_id="python-redis-test-1"}
+
+# Instance comparison
+sum(rate(redis_operations_total[5m])) by (instance_id)
+```
+
+### **Operation-Specific Analysis**
+```promql
+# SET operation latency across all apps
+histogram_quantile(0.95, sum(rate(redis_operation_duration_seconds_bucket{operation="SET"}[5m])) by (le, app_name))
+
+# GET operation throughput by app type
+sum(rate(redis_operations_total{operation="GET"}[5m])) by (app_name)
+
+# Error rate for specific operations (sum both sides so the status label does not block matching)
+sum(rate(redis_operations_total{operation="LPUSH", status="error"}[5m])) / sum(rate(redis_operations_total{operation="LPUSH"}[5m])) * 100
+```
+
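Percentile queries on a histogram only work when the `le` (bucket upper bound) label is still present after aggregation, because the quantile is estimated from cumulative bucket counts. The following is a rough sketch of that estimation, assuming linear interpolation inside a bucket; the function name `estimate_quantile` and the sample counts are illustrative and not taken from the app:

```python
import math
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Rank falls in the +Inf bucket: report the last finite bound.
                return lower_bound
            if count == lower_count:
                # Degenerate case (q == 0 with an empty first bucket).
                return upper_bound
            # Interpolate linearly between the bucket's bounds.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Illustrative cumulative counts keyed by the `le` label (seconds).
sample = [(0.001, 50.0), (0.01, 90.0), (0.1, 100.0), (math.inf, 100.0)]
p95 = estimate_quantile(0.95, sample)
print(f"estimated p95: {p95:.4f}s")
```

Dropping `le` would collapse all buckets into a single count per app, which leaves nothing to interpolate over.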
+## 🔧 Configuration for Different Apps
+
+### **Python App (.env.docker)**
+```bash
+APP_NAME=redis-py-6.2.0
+INSTANCE_ID=python-redis-test-1
+OTEL_SERVICE_NAME=redis-py-test-app
+```
+
+### **Go App (example)**
+```bash
+APP_NAME=go-9.3.2
+INSTANCE_ID=go-redis-test-1
+OTEL_SERVICE_NAME=redis-go-test-app
+```
+
+### **Java App (example)**
+```bash
+APP_NAME=jedis-6.0.0
+INSTANCE_ID=java-redis-test-1
+OTEL_SERVICE_NAME=redis-java-test-app
+```
+
+## 📈 Dashboard Panels
+
+### **Overview Panel**
+```promql
+# Total operation rate (ops/sec) across all apps
+sum(rate(redis_operations_total[5m]))
+
+# Success rate across all apps
+sum(rate(redis_operations_total{status="success"}[5m])) / sum(rate(redis_operations_total[5m])) * 100
+```
+
+### **Performance Comparison Panel**
+```promql
+# Throughput by app type
+sum(rate(redis_operations_total[5m])) by (app_name)
+
+# Average latency by app type
+avg(redis_average_latency_seconds) by (app_name)
+```
+
+### **Error Analysis Panel**
+```promql
+# Error rate by app type
+avg(redis_error_rate_percent) by (app_name)
+
+# Errors by operation and app type
+sum(rate(redis_operations_total{status="error"}[5m])) by (operation, app_name)
+```
+
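Averaging the `redis_error_rate_percent` gauge weights every instance equally, regardless of how much traffic each one handles. A traffic-weighted figure can instead be derived from the counter rates by summing errors and totals per app before dividing. The sketch below shows that arithmetic in Python; the function name `error_rate_percent` and the rate values are made up for illustration:

```python
from collections import defaultdict
from typing import Dict, Tuple

# (app_name, status) -> operations per second; illustrative numbers only.
rates: Dict[Tuple[str, str], float] = {
    ("python", "success"): 950.0,
    ("python", "error"): 50.0,
    ("go", "success"): 1990.0,
    ("go", "error"): 10.0,
}


def error_rate_percent(op_rates: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Return the error percentage per app, summing across statuses first."""
    totals: Dict[str, float] = defaultdict(float)
    errors: Dict[str, float] = defaultdict(float)
    for (app_name, status), rate in op_rates.items():
        totals[app_name] += rate
        if status == "error":
            errors[app_name] += rate
    # Skip apps with no traffic to avoid dividing by zero.
    return {app: 100.0 * errors[app] / total for app, total in totals.items() if total > 0}


per_app = error_rate_percent(rates)
print(per_app)
```

The same idea in PromQL would sum the `status="error"` rate and the overall rate by `app_name` and divide the two.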
+## 🚀 Benefits
+
+### **Multi-App Monitoring**
+- **Compare performance** across different language implementations
+- **Identify bottlenecks** in specific app types
+- **Track improvements** across Redis client libraries
+- **Isolate issues** to specific applications or instances
+
+### **Comprehensive Coverage**
+- **All required metrics** from your specification
+- **Operation-specific tracking** for detailed analysis
+- **Real-time calculations** for throughput and error rates
+- **Percentile support** for latency analysis
+
+### **Production Ready**
+- **Scalable labeling** for large deployments
+- **Efficient storage** with appropriate metric types
+- **Standard formats** compatible with Prometheus and Grafana
+- **Easy filtering** and aggregation in Grafana
+
+This metrics setup gives you complete visibility into Redis performance across all your testing applications! 🎯

cli.py

Lines changed: 5 additions & 1 deletion
@@ -86,6 +86,8 @@ def cli():
 @click.option('--otel-endpoint', default=lambda: get_env_or_default('OTEL_EXPORTER_OTLP_ENDPOINT', None), help='OpenTelemetry OTLP endpoint')
 @click.option('--otel-service-name', default=lambda: get_env_or_default('OTEL_SERVICE_NAME', 'redis-load-test'), help='OpenTelemetry service name')
 @click.option('--otel-export-interval', type=int, default=lambda: get_env_or_default('OTEL_EXPORT_INTERVAL', 5000, int), help='OpenTelemetry export interval in milliseconds')
+@click.option('--app-name', default=lambda: get_env_or_default('APP_NAME', 'python'), help='Application name for multi-app filtering (python, go, java, etc.)')
+@click.option('--instance-id', default=lambda: get_env_or_default('INSTANCE_ID', None), help='Unique instance identifier')
 @click.option('--output-file', default=lambda: get_env_or_default('OUTPUT_FILE', None), help='Output file for metrics export (JSON)')
 @click.option('--quiet', is_flag=True, default=lambda: get_env_or_default('QUIET', False, bool), help='Suppress periodic stats output')
 @click.option('--config-file', default=lambda: get_env_or_default('CONFIG_FILE', None), help='Load configuration from YAML/JSON file')
@@ -309,7 +311,9 @@ def _build_config_from_args(kwargs) -> RunnerConfig:
         otel_enabled=kwargs['otel_enabled'],
         otel_endpoint=kwargs['otel_endpoint'],
         otel_service_name=kwargs['otel_service_name'],
-        otel_export_interval_ms=kwargs['otel_export_interval']
+        otel_export_interval_ms=kwargs['otel_export_interval'],
+        app_name=kwargs['app_name'],
+        instance_id=kwargs['instance_id']
     )
 
     return config
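The new options resolve their defaults through `get_env_or_default`, whose body is not part of this diff. The following is a hypothetical stand-in named `env_or_default`, written only to illustrate the lookup order (environment variable first, then the hard-coded default); treating an empty variable as unset and the accepted boolean spellings are assumptions, not the repository's behavior:

```python
import os
from typing import Any, Callable, Optional


def env_or_default(name: str, default: Any, cast: Optional[Callable[[str], Any]] = None) -> Any:
    """Hypothetical stand-in for the repo's get_env_or_default helper."""
    raw = os.environ.get(name)
    # Assumption: an empty value (e.g. `OUTPUT_FILE=`) counts as unset.
    if raw is None or raw == "":
        return default
    if cast is bool:
        # Assumption: common truthy spellings; everything else is False.
        return raw.strip().lower() in ("1", "true", "yes", "on")
    return cast(raw) if cast else raw


# Example: APP_NAME is set, INSTANCE_ID is not.
os.environ["APP_NAME"] = "go"
os.environ.pop("INSTANCE_ID", None)

app_name = env_or_default("APP_NAME", "python")
instance_id = env_or_default("INSTANCE_ID", None)
print(app_name, instance_id)
```

Because click evaluates a callable `default` lazily, a lookup like this runs when the command is invoked rather than at import time, so values loaded into the environment before invocation are still picked up.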

config.py

Lines changed: 4 additions & 0 deletions
@@ -138,6 +138,10 @@ class RunnerConfig:
     otel_export_interval_ms: int = 5000
     otel_resource_attributes: Dict[str, str] = field(default_factory=dict)
 
+    # Multi-app identification
+    app_name: str = "python"
+    instance_id: Optional[str] = None
+
 
 class WorkloadProfiles:
     """Pre-defined workload profiles with intuitive, descriptive names."""
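Since `instance_id` defaults to `None`, whatever code attaches labels to metrics has to decide what to emit when no ID was configured. The diff does not show that logic, so the sketch below is only one possible approach: `IdentityConfig` is a hypothetical subset of `RunnerConfig`, and the `<app>-<host>-<pid>` fallback is an assumption chosen for illustration.

```python
import os
import socket
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class IdentityConfig:
    """Hypothetical subset of RunnerConfig holding only the identity fields."""

    otel_service_name: str = "redis-load-test"
    app_name: str = "python"
    instance_id: Optional[str] = None

    def identity_labels(self) -> Dict[str, str]:
        """Return the three labels shared by every metric."""
        # Assumption: fall back to "<app>-<host>-<pid>" when no instance ID is set.
        fallback = f"{self.app_name}-{socket.gethostname()}-{os.getpid()}"
        return {
            "app_name": self.app_name,
            "service_name": self.otel_service_name,
            "instance_id": self.instance_id or fallback,
        }


configured = IdentityConfig(
    otel_service_name="redis-py-test-app",
    instance_id="python-redis-test-1",
).identity_labels()
defaulted = IdentityConfig().identity_labels()
print(configured)
print(defaulted)
```

A generated fallback keeps series from different processes distinct, at the cost of creating a new series on every restart; an explicit `INSTANCE_ID` avoids that churn.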

docker-compose.yml

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ services:
     ports:
       - "4317:4317" # OTLP gRPC receiver
       - "4318:4318" # OTLP HTTP receiver
-      - "8888:8888" # Prometheus metrics
+      - "8889:8889" # Prometheus metrics
       - "14250:14250" # Jaeger gRPC
     depends_on:
       - jaeger
