| 1 | +# Redis Test App Metrics Reference |
| 2 | + |
| 3 | +Complete reference for all metrics collected by the Redis testing applications with multi-app support. |
| 4 | + |
| 5 | +## 🏷️ Multi-Application Support |
| 6 | + |
| 7 | +All metrics include the following labels, which allow filtering by application, service, and instance: |
| 8 | + |
| 9 | +- **`app_name`**: Application name (python, go, java, etc.) |
| 10 | +- **`service_name`**: Service identifier (redis-py-test-app, redis-go-test-app, etc.) |
| 11 | +- **`instance_id`**: Unique instance identifier |
| 12 | + |
| 13 | +## 📊 Core Metrics |
| 14 | + |
| 15 | +### **1. Total Operations (Success/Error)** |
| 16 | +``` |
| 17 | +redis_operations_total{operation="SET", status="success", app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"} |
| 18 | +``` |
| 19 | +- **Type**: Counter |
| 20 | +- **Description**: Total number of Redis operations |
| 21 | +- **Labels**: operation, status, app_name, service_name, instance_id |
| 22 | +- **Values**: Increments on each operation |
| 23 | + |
| 24 | +### **2. Operation Latency (Percentiles)** |
| 25 | +``` |
| 26 | +redis_operation_duration_seconds{operation="GET", app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"} |
| 27 | +``` |
| 28 | +- **Type**: Histogram |
| 29 | +- **Description**: Duration of Redis operations for percentile calculation |
| 30 | +- **Labels**: operation, app_name, service_name, instance_id |
| 31 | +- **Buckets**: 0.0001s to 10s (optimized for high-performance Redis) |
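To make the bucket range concrete, here is a minimal sketch of how an observed latency maps onto histogram buckets. The boundary values in `LATENCY_BUCKETS` are hypothetical: this reference only states the 0.0001s to 10s range, not the exact boundaries, and the helper names are illustrative.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds in seconds, spanning the documented
# 0.0001s to 10s range. The boundaries the test apps actually use are
# not listed in this reference.
LATENCY_BUCKETS = [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0]


def bucket_upper_bound(duration_seconds, buckets=LATENCY_BUCKETS):
    """Return the `le` bound of the first bucket that contains the observation."""
    index = bisect_left(buckets, duration_seconds)
    # Observations above the largest bound land in the implicit +Inf bucket.
    return buckets[index] if index < len(buckets) else float("inf")


def cumulative_counts(observations, buckets=LATENCY_BUCKETS):
    """Count observations per bucket cumulatively, as Prometheus histograms do."""
    bounds = list(buckets) + [float("inf")]
    return {bound: sum(1 for o in observations if o <= bound) for bound in bounds}
```

Because the counts are cumulative, each bucket includes every observation at or below its bound, which is what makes percentile estimation from `_bucket` series possible.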
| 32 | + |
| 33 | +### **3. Throughput (Operations per Second)** |
| 34 | +``` |
| 35 | +redis_operations_per_second{app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"} |
| 36 | +``` |
| 37 | +- **Type**: Gauge |
| 38 | +- **Description**: Current operations per second |
| 39 | +- **Labels**: app_name, service_name, instance_id |
| 40 | +- **Updates**: Real-time calculation |
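One way such a gauge could be derived is a sliding window over recent operation timestamps. The sketch below is an assumption about the approach, not the test apps' actual implementation; `ThroughputTracker` and its parameters are hypothetical names.

```python
import time
from collections import deque


class ThroughputTracker:
    """Tracks operation timestamps in a sliding window to derive ops/sec."""

    def __init__(self, window_seconds=1.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the window can be tested deterministically
        self._timestamps = deque()

    def record(self):
        """Call once per completed Redis operation."""
        self._timestamps.append(self._clock())

    def ops_per_second(self):
        """Drop timestamps older than the window, then average over the window."""
        cutoff = self._clock() - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) / self.window_seconds
```

The value returned by `ops_per_second()` is what would be written to the gauge on each update. A shorter window reacts faster but produces a noisier series.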
| 41 | + |
| 42 | +### **4. Error Rate Percentage** |
| 43 | +``` |
| 44 | +redis_error_rate_percent{app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"} |
| 45 | +``` |
| 46 | +- **Type**: Gauge |
| 47 | +- **Description**: Current error rate as a percentage |
| 48 | +- **Labels**: app_name, service_name, instance_id |
| 49 | +- **Range**: 0-100% |
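The percentage itself is a simple ratio of failed to total operations. A minimal sketch, assuming the gauge is computed from success and error counts (the function name is hypothetical):

```python
def error_rate_percent(success_count, error_count):
    """Return the share of failed operations as a value between 0 and 100."""
    total = success_count + error_count
    if total == 0:
        # No operations yet: report 0 rather than dividing by zero.
        return 0.0
    return 100.0 * error_count / total
```

For example, 5 errors out of 100 operations yields a gauge value of 5.0.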
| 50 | + |
| 51 | +### **5. Reconnection Duration** |
| 52 | +``` |
| 53 | +redis_reconnection_duration_seconds{app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"} |
| 54 | +``` |
| 55 | +- **Type**: Histogram |
| 56 | +- **Description**: Time taken for reconnection attempts |
| 57 | +- **Labels**: app_name, service_name, instance_id |
| 58 | +- **Buckets**: 0.1s to 60s |
| 59 | + |
| 60 | +### **6. Active Connections** |
| 61 | +``` |
| 62 | +redis_active_connections{app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"} |
| 63 | +``` |
| 64 | +- **Type**: Gauge |
| 65 | +- **Description**: Current number of active Redis connections |
| 66 | +- **Labels**: app_name, service_name, instance_id |
| 67 | + |
| 68 | +### **7. Connection Attempts** |
| 69 | +``` |
| 70 | +redis_connections_total{status="success", app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"} |
| 71 | +``` |
| 72 | +- **Type**: Counter |
| 73 | +- **Description**: Total connection attempts |
| 74 | +- **Labels**: status, app_name, service_name, instance_id |
| 75 | + |
| 76 | +### **8. Average Latency** |
| 77 | +``` |
| 78 | +redis_average_latency_seconds{operation="SET", app_name="python", service_name="redis-py-test-app", instance_id="python-redis-test-1"} |
| 79 | +``` |
| 80 | +- **Type**: Gauge |
| 81 | +- **Description**: Average operation latency (for quick overview) |
| 82 | +- **Labels**: operation, app_name, service_name, instance_id |
| 83 | + |
| 84 | +## 🎯 Grafana Query Examples |
| 85 | + |
| 86 | +### **Filter by Application Type** |
| 87 | +```promql |
| 88 | +# Python app operations only |
| 89 | +redis_operations_total{app_name="python"} |
| 90 | + |
| 91 | +# Go app operations only |
| 92 | +redis_operations_total{app_name="go"} |
| 93 | + |
| 94 | +# Java app operations only |
| 95 | +redis_operations_total{app_name="java"} |
| 96 | +``` |
| 97 | + |
| 98 | +### **Compare Applications** |
| 99 | +```promql |
| 100 | +# Throughput comparison across app types |
| 101 | +sum(rate(redis_operations_total[5m])) by (app_name) |
| 102 | + |
| 103 | +# Latency comparison (95th percentile) |
| 104 | +histogram_quantile(0.95, sum(rate(redis_operation_duration_seconds_bucket[5m])) by (le, app_name)) |
| 105 | + |
| 106 | +# Error rate comparison |
| 107 | +avg(redis_error_rate_percent) by (app_name) |
| 108 | +``` |
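The 95th-percentile query relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The sketch below is a simplified Python rendition of that idea for illustration only. It skips several edge cases the real function handles (for example `q` outside the 0 to 1 range), and the sample bucket values in the usage note are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative (upper_bound, count) pairs.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("this sketch only supports 0 < q <= 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if math.isinf(upper):
        # The quantile lies beyond the largest finite bound, so that bound is returned.
        return buckets[-2][0]
    # The first bucket is assumed to start at 0; later ones start at the previous bound.
    lower, below = (0.0, 0) if index == 0 else buckets[index - 1]
    # Linear interpolation within the selected bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)
```

With buckets `[(0.001, 50), (0.005, 90), (0.01, 100), (math.inf, 100)]`, the 0.95 quantile falls halfway through the 0.005 to 0.01 bucket, so the estimate is about 0.0075 seconds. The result can never be more precise than the bucket boundaries allow.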
| 109 | + |
| 110 | +### **Multi-Instance Monitoring** |
| 111 | +```promql |
| 112 | +# All Python instances |
| 113 | +redis_operations_per_second{app_name="python"} |
| 114 | + |
| 115 | +# Specific instance |
| 116 | +redis_operations_per_second{instance_id="python-redis-test-1"} |
| 117 | + |
| 118 | +# Instance comparison |
| 119 | +sum(rate(redis_operations_total[5m])) by (instance_id) |
| 120 | +``` |
| 121 | + |
| 122 | +### **Operation-Specific Analysis** |
| 123 | +```promql |
| 124 | +# SET operation latency across all apps |
| 125 | +histogram_quantile(0.95, sum(rate(redis_operation_duration_seconds_bucket{operation="SET"}[5m])) by (le, app_name)) |
| 126 | + |
| 127 | +# GET operation throughput by app type |
| 128 | +sum(rate(redis_operations_total{operation="GET"}[5m])) by (app_name) |
| 129 | + |
| 130 | +# Error rate (%) for a specific operation |
| 131 | +sum(rate(redis_operations_total{operation="LPUSH", status="error"}[5m])) / sum(rate(redis_operations_total{operation="LPUSH"}[5m])) * 100 |
| 132 | +``` |
| 133 | + |
| 134 | +## 🔧 Configuration for Different Apps |
| 135 | + |
| 136 | +### **Python App (.env.docker)** |
| 137 | +```bash |
| 138 | +APP_NAME=redis-py-6.2.0 |
| 139 | +INSTANCE_ID=python-redis-test-1 |
| 140 | +OTEL_SERVICE_NAME=redis-py-test-app |
| 141 | +``` |
| 142 | + |
| 143 | +### **Go App (example)** |
| 144 | +```bash |
| 145 | +APP_NAME=go-9.3.2 |
| 146 | +INSTANCE_ID=go-redis-test-1 |
| 147 | +OTEL_SERVICE_NAME=redis-go-test-app |
| 148 | +``` |
| 149 | + |
| 150 | +### **Java App (example)** |
| 151 | +```bash |
| 152 | +APP_NAME=jedis-6.0.0 |
| 153 | +INSTANCE_ID=java-redis-test-1 |
| 154 | +OTEL_SERVICE_NAME=redis-java-test-app |
| 155 | +``` |
| 156 | + |
| 157 | +## 📈 Dashboard Panels |
| 158 | + |
| 159 | +### **Overview Panel** |
| 160 | +```promql |
| 161 | +# Total operation rate (ops/sec) across all apps |
| 162 | +sum(rate(redis_operations_total[5m])) |
| 163 | + |
| 164 | +# Success rate across all apps |
| 165 | +sum(rate(redis_operations_total{status="success"}[5m])) / sum(rate(redis_operations_total[5m])) * 100 |
| 166 | +``` |
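Both overview queries are built on `rate()`, which converts a monotonically increasing counter into a per-second value. The sketch below shows the core idea in Python under simplifying assumptions: it accounts for counter resets but omits the window-boundary extrapolation that Prometheus applies, so its results only approximate real query output. The function name is hypothetical.

```python
def simple_rate(samples):
    """Per-second increase of a counter over (timestamp_seconds, value) samples.

    Returns None when there are too few samples to compute a rate.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so the whole
        # current value counts as increase since the reset.
        increase += current - previous if current >= previous else current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None
```

A counter that grows from 100 to 220 over 120 seconds yields a rate of 1.0 operations per second. Summing such rates across series gives the overall throughput shown in the panel.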
| 167 | + |
| 168 | +### **Performance Comparison Panel** |
| 169 | +```promql |
| 170 | +# Throughput by app type |
| 171 | +sum(rate(redis_operations_total[5m])) by (app_name) |
| 172 | + |
| 173 | +# Average latency by app type |
| 174 | +avg(redis_average_latency_seconds) by (app_name) |
| 175 | +``` |
| 176 | + |
| 177 | +### **Error Analysis Panel** |
| 178 | +```promql |
| 179 | +# Error rate by app type |
| 180 | +avg(redis_error_rate_percent) by (app_name) |
| 181 | + |
| 182 | +# Errors by operation and app type |
| 183 | +sum(rate(redis_operations_total{status="error"}[5m])) by (operation, app_name) |
| 184 | +``` |
| 185 | + |
| 186 | +## 🚀 Benefits |
| 187 | + |
| 188 | +### **Multi-App Monitoring** |
| 189 | +- ✅ **Compare performance** across different language implementations |
| 190 | +- ✅ **Identify bottlenecks** in specific app types |
| 191 | +- ✅ **Track improvements** across Redis client libraries |
| 192 | +- ✅ **Isolate issues** to specific applications or instances |
| 193 | + |
| 194 | +### **Comprehensive Coverage** |
| 195 | +- ✅ **All required metrics** from your specification |
| 196 | +- ✅ **Operation-specific tracking** for detailed analysis |
| 197 | +- ✅ **Real-time calculations** for throughput and error rates |
| 198 | +- ✅ **Percentile support** for latency analysis |
| 199 | + |
| 200 | +### **Production Ready** |
| 201 | +- ✅ **Scalable labeling** for large deployments |
| 202 | +- ✅ **Efficient storage** with appropriate metric types |
| 203 | +- ✅ **Standard formats** compatible with Prometheus-style monitoring tools |
| 204 | +- ✅ **Easy filtering** and aggregation in Grafana |
| 205 | + |
| 206 | +This metrics setup gives you complete visibility into Redis performance across all your testing applications! 🎯 |