Sentry Metrics¶
Tux uses Sentry metrics to track performance, usage patterns, and system health. Metrics help you identify bottlenecks, monitor usage trends, and correlate performance data with errors.
Overview¶
Sentry metrics are automatically enabled when Sentry is initialized. The metrics system provides three types of metrics:
- Counters - Track event occurrences (command usage, errors, cache hits/misses)
- Distributions - Track measurements with percentiles (execution times, latencies)
- Gauges - Track current values (cache sizes, connection pool usage)
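To make the difference between the three types concrete, the sketch below models them with plain Python classes. It does not use the Sentry SDK or Tux's metrics module. The class names `Counter`, `Distribution`, and `Gauge` are illustrative assumptions that only show how each type aggregates the values it receives.

```python
from dataclasses import dataclass, field


@dataclass
class Counter:
    """Accumulates a running total of occurrences."""

    total: int = 0

    def add(self, amount: int = 1) -> None:
        self.total += amount


@dataclass
class Distribution:
    """Keeps every sample so aggregates can be computed afterwards."""

    samples: list[float] = field(default_factory=list)

    def record(self, value: float) -> None:
        self.samples.append(value)

    def summary(self) -> dict[str, float]:
        return {
            "min": min(self.samples),
            "max": max(self.samples),
            "avg": sum(self.samples) / len(self.samples),
        }


@dataclass
class Gauge:
    """Remembers only the most recently reported value."""

    value: float = 0.0

    def set(self, value: float) -> None:
        self.value = value


usage = Counter()
latency = Distribution()
cache_size = Gauge()

# Three simulated command invocations: (duration in ms, cache size afterwards)
for duration_ms, size in [(12.0, 3), (48.0, 4), (30.0, 5)]:
    usage.add()
    latency.record(duration_ms)
    cache_size.set(size)
```

After the loop the counter holds the number of events, the distribution can still answer questions about the spread of durations, and the gauge holds only the last cache size.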
Available Metrics Functions¶
All metrics functions are available from tux.services.sentry.metrics:
from tux.services.sentry.metrics import (
    record_command_metric,
    record_database_metric,
    record_api_metric,
    record_cog_metric,
    record_cache_metric,
    record_task_metric,
)
Automatic Metrics¶
Command Execution¶
Command metrics are automatically recorded for all commands:
- bot.command.execution_time (distribution) - Command execution time in milliseconds
- bot.command.usage (counter) - Command usage count
- bot.command.failures (counter) - Command failure count
Attributes:
- command - Command name
- command_type - Type of command (prefix, slash, hybrid)
- success - Whether command succeeded
- error_type - Error type if command failed
Integration: Automatically recorded in track_command_end() in src/tux/services/sentry/context.py
Database Operations¶
Database metrics are recorded automatically for every database operation that runs through the retry logic:
- bot.database.operation.duration (distribution) - Operation duration in milliseconds
- bot.database.operation.count (counter) - Operation count
- bot.database.retries (counter) - Retry count
- bot.database.failures (counter) - Failure count
Attributes:
- operation - Operation type (query, insert, update, delete)
- table - Table name if applicable
- retry_count - Number of retries
- success - Whether operation succeeded
- error_type - Error type if operation failed
Integration: Automatically recorded in _execute_with_retry() in src/tux/database/service.py
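The sketch below shows one way a retry wrapper could collect the attributes listed above. It is not the code in src/tux/database/service.py. The names `execute_with_retry`, `record_stub`, and `flaky_query` are hypothetical, and `record_stub` merely stands in for `record_database_metric`, whose real signature may differ.

```python
import asyncio
import time
from collections.abc import Awaitable, Callable
from typing import Any

# Hypothetical stand-in for record_database_metric; it only collects attributes.
recorded: list[dict[str, Any]] = []


def record_stub(**attributes: Any) -> None:
    recorded.append(attributes)


async def execute_with_retry(
    operation: Callable[[], Awaitable[Any]],
    *,
    name: str,
    max_retries: int = 3,
) -> Any:
    """Run `operation`, retrying on failure, and record one metric per call."""
    start = time.perf_counter()
    for attempt in range(max_retries + 1):
        try:
            result = await operation()
        except Exception as exc:
            # Only the final failed attempt is recorded and re-raised.
            if attempt == max_retries:
                record_stub(
                    operation=name,
                    duration_ms=(time.perf_counter() - start) * 1000,
                    retry_count=attempt,
                    success=False,
                    error_type=type(exc).__name__,
                )
                raise
        else:
            record_stub(
                operation=name,
                duration_ms=(time.perf_counter() - start) * 1000,
                retry_count=attempt,
                success=True,
            )
            return result


state = {"attempts": 0}


async def flaky_query() -> str:
    """Fail twice, then succeed, to exercise the retry path."""
    state["attempts"] += 1
    if state["attempts"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = asyncio.run(execute_with_retry(flaky_query, name="query"))
```

The duration covers all attempts, so a slow operation that succeeded after retries remains distinguishable from one that was simply slow on the first try.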
Manual Metrics Recording¶
API Calls¶
Track external API call performance:
import time

import httpx

from tux.services.sentry.metrics import record_api_metric

# Run inside an async function
start_time = time.perf_counter()
try:
    # httpx.get() is synchronous; an awaitable request needs AsyncClient
    async with httpx.AsyncClient() as client:
        response = await client.get("https://api.example.com/data")
    # Raise httpx.HTTPStatusError for 4xx/5xx responses so the except branch runs
    response.raise_for_status()
    duration_ms = (time.perf_counter() - start_time) * 1000
    record_api_metric(
        service="example_api",
        endpoint="/data",
        duration_ms=duration_ms,
        status_code=response.status_code,
        method="GET",
        success=True,
    )
except httpx.HTTPStatusError as e:
    duration_ms = (time.perf_counter() - start_time) * 1000
    record_api_metric(
        service="example_api",
        endpoint="/data",
        duration_ms=duration_ms,
        status_code=e.response.status_code,
        method="GET",
        success=False,
    )
Metrics Emitted:
- bot.api.call.duration (distribution) - API call latency
- bot.api.call.count (counter) - API call count
- bot.api.rate_limits (counter) - Rate limit hits
- bot.api.failures (counter) - API failure count
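The API example repeats the same timing arithmetic in the success and failure branches. One possible way to factor that out is a small context manager. This is a sketch only: `measure_ms` and `Timer` are hypothetical helpers, not part of tux.services.sentry.metrics.

```python
import time
from collections.abc import Iterator
from contextlib import contextmanager


class Timer:
    """Holds the elapsed time of a measured block in milliseconds."""

    def __init__(self) -> None:
        self.duration_ms = 0.0


@contextmanager
def measure_ms() -> Iterator[Timer]:
    """Time the enclosed block; the duration is set even if the block raises."""
    timer = Timer()
    start = time.perf_counter()
    try:
        yield timer
    finally:
        timer.duration_ms = (time.perf_counter() - start) * 1000


with measure_ms() as timer:
    time.sleep(0.01)  # stands in for the API call

print(f"call took {timer.duration_ms:.1f} ms")
```

Because the duration is written in a `finally` clause, `timer.duration_ms` can be passed to `record_api_metric()` from either the success path or an `except` block wrapped around the `with` statement.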
Cog Operations¶
Track cog loading, unloading, and reloading:
import time

from tux.services.sentry.metrics import record_cog_metric

start_time = time.perf_counter()
try:
    await bot.load_extension("tux.modules.tools.tldr")
    duration_ms = (time.perf_counter() - start_time) * 1000
    record_cog_metric(
        cog_name="tldr",
        operation="load",
        duration_ms=duration_ms,
        success=True,
    )
except Exception as e:
    duration_ms = (time.perf_counter() - start_time) * 1000
    record_cog_metric(
        cog_name="tldr",
        operation="load",
        duration_ms=duration_ms,
        success=False,
        error_type=type(e).__name__,
    )
Metrics Emitted:
- bot.cog.operation.count (counter) - Cog operation count
- bot.cog.operation.duration (distribution) - Operation duration
- bot.cog.failures (counter) - Cog operation failures
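When several cogs are loaded in one pass, the same pattern can run in a loop so that one failing cog does not hide the metrics of the others. The sketch below is self-contained and makes two assumptions: `fake_load_extension` stands in for `bot.load_extension`, and `record_stub` stands in for `record_cog_metric`. Both are hypothetical.

```python
import asyncio
import time
from typing import Any

# Hypothetical stand-in for record_cog_metric; it only collects attributes.
recorded: list[dict[str, Any]] = []


def record_stub(**attributes: Any) -> None:
    recorded.append(attributes)


async def fake_load_extension(path: str) -> None:
    """Stand-in for bot.load_extension that fails for one module."""
    if path.endswith(".broken"):
        raise ImportError(f"cannot import {path}")


async def load_all(extensions: list[str]) -> None:
    """Load each extension, recording one metric per cog and continuing on failure."""
    for path in extensions:
        cog_name = path.rsplit(".", 1)[-1]  # "tux.modules.tools.tldr" -> "tldr"
        start = time.perf_counter()
        try:
            await fake_load_extension(path)
        except Exception as exc:
            record_stub(
                cog_name=cog_name,
                operation="load",
                duration_ms=(time.perf_counter() - start) * 1000,
                success=False,
                error_type=type(exc).__name__,
            )
        else:
            record_stub(
                cog_name=cog_name,
                operation="load",
                duration_ms=(time.perf_counter() - start) * 1000,
                success=True,
            )


asyncio.run(load_all(["tux.modules.tools.tldr", "tux.modules.tools.broken"]))
```

Whether a failed load should abort startup or only be recorded is a design decision for the caller; this sketch records the failure and continues.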
Cache Operations¶
Track cache performance:
import time

from tux.services.sentry.metrics import record_cache_metric

start_time = time.perf_counter()
cached_value = cache.get("key")
duration_ms = (time.perf_counter() - start_time) * 1000

# Compare against None so falsy cached values (0, "", []) still count as hits;
# this assumes cache.get() returns None on a miss
if cached_value is not None:
    record_cache_metric(
        cache_name="tldr",
        operation="get",
        hit=True,
        duration_ms=duration_ms,
    )
else:
    record_cache_metric(
        cache_name="tldr",
        operation="get",
        miss=True,
        duration_ms=duration_ms,
    )
    # Fetch and cache value
    value = await fetch_value()
    cache.set("key", value)
    # Record cache size
    record_cache_metric(
        cache_name="tldr",
        operation="set",
        size=len(cache),
    )
Metrics Emitted:
- bot.cache.hits (counter) - Cache hits
- bot.cache.misses (counter) - Cache misses
- bot.cache.operation.duration (distribution) - Cache operation duration
- bot.cache.size (gauge) - Current cache size
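The hit and miss counters are most useful when combined into a hit rate. The sketch below shows that relationship with a dict-backed cache that counts locally. `CountingCache` is a hypothetical class for illustration, not one of Tux's cache implementations.

```python
from typing import Any, Optional


class CountingCache:
    """Dict-backed cache that counts hits and misses (illustrative only)."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[Any]:
        # Membership test rather than truthiness, so falsy values are still hits
        if key in self._data:
            self.hits += 1
            return self._data[key]
        self.misses += 1
        return None

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value

    def __len__(self) -> int:
        return len(self._data)

    def hit_rate(self) -> float:
        """Fraction of lookups served from the cache (0.0 when nothing was looked up)."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = CountingCache()
cache.get("ls")  # miss
cache.set("ls", "list directory contents")
cache.get("ls")  # hit
cache.get("ls")  # hit
cache.get("tar")  # miss
```

In a dashboard the same ratio would be derived from the two counters, hits / (hits + misses), while the gauge reports the value of `len(cache)` at the time of the last update.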
Background Tasks¶
Track background task execution:
import time

from tux.services.sentry.metrics import record_task_metric


async def background_task():
    start_time = time.perf_counter()
    try:
        # Task logic
        await do_work()
        duration_ms = (time.perf_counter() - start_time) * 1000
        record_task_metric(
            task_name="daily_cleanup",
            duration_ms=duration_ms,
            success=True,
            task_type="scheduled",
        )
    except Exception as e:
        duration_ms = (time.perf_counter() - start_time) * 1000
        record_task_metric(
            task_name="daily_cleanup",
            duration_ms=duration_ms,
            success=False,
            error_type=type(e).__name__,
            task_type="scheduled",
        )
Metrics Emitted:
- bot.task.execution_time (distribution) - Task execution time
- bot.task.executions (counter) - Task execution count
- bot.task.failures (counter) - Task failure count
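If many tasks need the same try/except timing block, a decorator is one way to avoid repeating it. The following sketch assumes a hypothetical `tracked_task` decorator and uses `record_stub` in place of `record_task_metric` so that it runs without Tux installed.

```python
import asyncio
import functools
import time
from collections.abc import Awaitable, Callable
from typing import Any

# Hypothetical stand-in for record_task_metric; it only collects attributes.
recorded: list[dict[str, Any]] = []


def record_stub(**attributes: Any) -> None:
    recorded.append(attributes)


def tracked_task(task_name: str, task_type: str = "scheduled"):
    """Wrap an async task so its duration and outcome are recorded on every run."""

    def decorator(func: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
        @functools.wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                result = await func(*args, **kwargs)
            except Exception as exc:
                record_stub(
                    task_name=task_name,
                    duration_ms=(time.perf_counter() - start) * 1000,
                    success=False,
                    error_type=type(exc).__name__,
                    task_type=task_type,
                )
                raise  # the caller still sees the original exception
            record_stub(
                task_name=task_name,
                duration_ms=(time.perf_counter() - start) * 1000,
                success=True,
                task_type=task_type,
            )
            return result

        return wrapper

    return decorator


@tracked_task("daily_cleanup")
async def daily_cleanup() -> int:
    await asyncio.sleep(0)  # stands in for real work
    return 42


removed = asyncio.run(daily_cleanup())
```

Unlike the inline example, this wrapper re-raises after recording, so task supervision code can still react to the failure.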
Integration Points¶
Recommended Integration Locations¶
- TLDR Cache Updates (src/tux/modules/tools/tldr.py)
  - Use record_cache_metric() for cache operations
  - Track cache update durations
- Discord API Wrappers (HTTP clients)
  - Use record_api_metric() for Discord API calls
  - Track rate limits and latencies
- Cog Setup (src/tux/core/setup/cog_setup.py)
  - Use record_cog_metric() for cog loading operations
  - Track load times and failures
- Background Tasks (src/tux/core/task_monitor.py)
  - Use record_task_metric() for periodic tasks
  - Track execution times and failures
- Cache Systems (prefix manager, emoji manager)
  - Use record_cache_metric() for cache operations
  - Track hit/miss rates and sizes
Viewing Metrics in Sentry¶
- Navigate to Discover in Sentry
- Select Metrics from the dropdown
- Search for metrics by name (e.g., bot.command.execution_time)
- Filter by attributes (e.g., command:ping, success:true)
- View aggregations (p50, p90, p95, p99, min, max, avg)
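For readers unfamiliar with percentile aggregations, the sketch below computes them over a handful of sample durations using the nearest-rank method. This method is chosen only for illustration; Sentry may compute percentiles with a different interpolation. The `percentile` function and the sample values are made up for the example.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Ten command durations in milliseconds, one of them a slow outlier
durations_ms = [12.0, 15.0, 11.0, 14.0, 13.0, 250.0, 16.0, 12.5, 13.5, 14.5]

p50 = percentile(durations_ms, 50)
p90 = percentile(durations_ms, 90)
p95 = percentile(durations_ms, 95)
avg = sum(durations_ms) / len(durations_ms)
```

Here the average (about 37 ms) is pulled up by the single outlier, while p50 stays at 13.5 ms and only the high percentiles expose the slow call. This is why distributions are usually read through percentiles rather than the mean.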
Best Practices¶
- Use Appropriate Metric Types
  - Counters for events (usage, errors, hits/misses)
  - Distributions for measurements (times, latencies, sizes)
  - Gauges for current values (cache sizes, connection counts)
- Include Relevant Attributes
  - Add attributes that help filter and group metrics
  - Use consistent attribute names across related metrics
- Record Metrics at Key Points
  - Record success and failure metrics
  - Include timing for performance-critical operations
  - Track retries and error types
- Avoid Excessive Metrics
  - Don't record metrics for every operation
  - Focus on important operations and failure points
  - Use sampling for high-frequency operations if needed
- Correlate with Errors
  - Metrics automatically include trace context
  - Correlate metric spikes with error occurrences
  - Use attributes to filter metrics by error type
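The sampling advice for high-frequency operations can be sketched as a thin wrapper that forwards only one call out of every `n` to the real recorder. `EveryNth` is a hypothetical helper, and the lambda stands in for a function such as `record_cache_metric`.

```python
from collections.abc import Callable
from typing import Any


class EveryNth:
    """Forward one call out of every `n` to the wrapped recorder."""

    def __init__(self, record: Callable[..., None], n: int) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        self._record = record
        self._n = n
        self._seen = 0

    def __call__(self, **attributes: Any) -> bool:
        """Return True when the call was forwarded, False when it was skipped."""
        self._seen += 1
        if self._seen % self._n != 0:
            return False
        self._record(**attributes)
        return True


recorded: list[dict[str, Any]] = []
sampled = EveryNth(lambda **attrs: recorded.append(attrs), n=10)

# 100 simulated cache lookups produce 10 recorded metrics
for _ in range(100):
    sampled(cache_name="prefix", operation="get", hit=True)
```

Sampling keeps distributions representative but makes counters under-report by roughly a factor of `n`, so sampled counts need to be scaled back up when read.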
Example: Complete Integration¶
import time

import httpx

from tux.services.sentry.metrics import record_api_metric, record_cache_metric


async def fetch_tldr_page(platform: str, command: str, lang: str) -> str:
    """Fetch TLDR page with metrics tracking."""
    cache_key = f"tldr:{platform}:{command}:{lang}"

    # Check cache (assumes cache.get() returns None on a miss)
    start = time.perf_counter()
    cached = cache.get(cache_key)
    cache_duration = (time.perf_counter() - start) * 1000
    if cached is not None:
        record_cache_metric(
            cache_name="tldr",
            operation="get",
            hit=True,
            duration_ms=cache_duration,
        )
        return cached
    record_cache_metric(
        cache_name="tldr",
        operation="get",
        miss=True,
        duration_ms=cache_duration,
    )

    # Fetch from API
    api_start = time.perf_counter()
    try:
        # httpx.get() is synchronous; an awaitable request needs AsyncClient
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"https://tldr.sh/api/v1/pages/{platform}/{command}",
                params={"lang": lang},
            )
        # Raise httpx.HTTPStatusError for 4xx/5xx responses so the except branch runs
        response.raise_for_status()
        api_duration = (time.perf_counter() - api_start) * 1000
        record_api_metric(
            service="tldr",
            endpoint=f"/pages/{platform}/{command}",
            duration_ms=api_duration,
            status_code=response.status_code,
            method="GET",
            success=True,
        )
        content = response.text
        cache.set(cache_key, content)
        record_cache_metric(
            cache_name="tldr",
            operation="set",
            size=len(cache),
        )
        return content
    except httpx.HTTPStatusError as e:
        api_duration = (time.perf_counter() - api_start) * 1000
        record_api_metric(
            service="tldr",
            endpoint=f"/pages/{platform}/{command}",
            duration_ms=api_duration,
            status_code=e.response.status_code,
            method="GET",
            success=False,
        )
        raise
Related Documentation¶
- Sentry Integration - General Sentry setup and configuration
- Choosing Instrumentation - When to use transactions/spans vs metrics
- Transactions and Spans - How to use transactions and spans
- Context and Data - Tags, context, scopes, users
- Error Handling - Error handling patterns