Structured Logging in Python
In modern DevOps workflows, observability plays a central role: operating distributed systems depends on telemetry data such as metrics, traces, and logs. Logs are often the most detailed source of information during troubleshooting, but traditional unstructured log messages, written as free-form text, make it difficult to extract useful information automatically.
Structured logging organizes log data into a consistent, machine-readable format. For a primer on telemetry systems and observability fundamentals, see the blog post Introduction to Telemetry Systems.
What Is Structured Logging?
Structured logging captures log data as key-value pairs or JSON objects, assigning specific fields such as timestamp, severity, context, and message, rather than composing free-form text.
For example, consider an unstructured log message like the illustrative entry below:
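```
2024-05-14 10:23:45,123 ERROR payment Payment failed for order 8472, user 1234: card declined
```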
The same information expressed as a structured log entry might look like this:
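```json
{
  "timestamp": "2024-05-14T10:23:45.123Z",
  "level": "ERROR",
  "logger": "payment",
  "message": "Payment failed: card declined",
  "order_id": 8472,
  "user_id": 1234
}
```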
Why Structured Logging Is Useful
Extracting data from unstructured logs often requires complex regular expressions or custom parsers. Changes in log message formats can break those downstream processes. Parsing unstructured logs is also resource-intensive, especially at scale.
Structured logs simplify automated ingestion, filtering, correlation, and querying within telemetry systems. They can integrate with observability platforms such as Elasticsearch, OpenSearch, Loki, or cloud-native logging solutions.
Introduction to Python Logging
Python includes built-in logging capabilities via the logging module. Loggers are organized hierarchically: the root logger sits at the top, and child loggers inherit behavior from their parents. For instance, the logger a.b is a child of a.
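A quick illustration of the hierarchy (the logger names are arbitrary):

```python
import logging

logger_a = logging.getLogger("a")     # child of the root logger
logger_ab = logging.getLogger("a.b")  # child of "a": inherits its effective level
                                      # and propagates records to it
```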
Every logger has several components. The log level defines the minimum severity required to process messages. Filters offer additional control over which records are processed. Formatters define the output format. Handlers direct log records to destinations such as console, files, or remote services.
Python defines standard log levels: DEBUG, INFO, WARNING, ERROR, and CRITICAL. If a logger’s level is set to NOTSET, it inherits its effective level from the nearest ancestor with an explicit level.
When a log record is created, it is evaluated against the logger’s level. Filters are applied, then the record is passed to each handler. After processing by local handlers, the record propagates to the parent logger. Handlers are executed in sequence, one at a time.
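A minimal sketch of this flow (the logger name and messages are illustrative):

```python
import logging

logger = logging.getLogger("payments")
logger.setLevel(logging.INFO)              # minimum severity this logger will process

handler = logging.StreamHandler()          # destination: standard error
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s"))
logger.addHandler(handler)

logger.debug("card lookup started")        # dropped: below the logger's level
logger.info("payment processed")           # passes the level check, is formatted and handled,
                                           # then propagates toward the root logger
```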
Implementing Structured Logging in Python
The logging module does not include native structured logging, but it can be extended with custom formatters. A simple JSON formatter might look like the sketch below (the field names are one reasonable schema, not a fixed standard):
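```python
import json
import logging


class JSONFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        log_entry = {
            "timestamp": self.formatTime(record, self.datefmt),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Include exception details when the record carries them.
        if record.exc_info:
            log_entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(log_entry)
```

Attaching this formatter to a handler makes every record processed by that handler come out as one JSON document per line.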
Logging configuration can also be defined declaratively, for example as a dictionary passed to logging.config.dictConfig (often loaded from a YAML or JSON file). The sketch below shows one way to apply the custom formatter; the module path for JSONFormatter is a hypothetical placeholder:
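```python
import logging
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        # "()" tells dictConfig to instantiate a user-defined formatter class.
        "json": {"()": "myapp.logging_utils.JSONFormatter"},  # hypothetical module path
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "json",
        },
    },
    "root": {
        "level": "INFO",
        "handlers": ["console"],
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
logging.getLogger(__name__).info("service started")  # emitted as a JSON line
```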
Best Practices
When developing libraries, create loggers with logging.getLogger(__name__) and avoid configuring handlers, formatters, filters, or log levels. Leave configuration to the consuming application.
For applications, use centralized configuration. Attach handlers and filters only to the root logger. Child loggers should not define handlers, filters, or levels. Thanks to message propagation, all log records flow to the root logger where consistent processing occurs.
Even though the root logger handles message processing, avoid using it directly. Always create child loggers for both libraries and applications using logging.getLogger(__name__); this preserves clear origin information.
Small applications may only need one child logger. Larger applications often assign a dedicated child logger to each subsystem or module.
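A minimal sketch of this division of responsibility (the module and function names are hypothetical):

```python
# mylib/client.py -- library code: create a module-level logger, but do not configure it
import logging

logger = logging.getLogger(__name__)  # resolves to "mylib.client"


def fetch_order(order_id):
    logger.info("fetching order %s", order_id)
```

The consuming application then applies its centralized configuration, such as the dictConfig call shown earlier, to the root logger; records from mylib.client propagate there and are processed consistently.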
Python’s logging system is synchronous and blocking; in high-volume scenarios this can affect performance. To mitigate this, the QueueHandler can offload log processing to a separate thread.
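A minimal sketch using QueueHandler together with QueueListener from logging.handlers (the messages are illustrative):

```python
import logging
import logging.handlers
import queue

log_queue = queue.Queue(-1)  # unbounded queue between the application and the listener thread

# Application side: the handler only enqueues records, so logging calls return quickly.
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.handlers.QueueHandler(log_queue))

# Listener side: a background thread dequeues records and passes them to the real handlers.
stream_handler = logging.StreamHandler()  # could carry the JSONFormatter from above
listener = logging.handlers.QueueListener(log_queue, stream_handler)
listener.start()

logging.getLogger(__name__).info("order processed")  # does not block on handler I/O

listener.stop()  # processes any remaining records and stops the background thread
```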
Maintaining a consistent field schema across all log records simplifies ingestion, indexing, and querying in telemetry backends.
Conclusion
In this post, we examined the challenges of unstructured logs and the benefits of structured logging, especially for telemetry systems. We reviewed how Python’s built-in logging module works, including its logger hierarchy, levels, filters, handlers, and formatters. We implemented structured logging by extending the default formatter with a custom JSON formatter. Finally, we covered best practices for configuring logging in both libraries and applications, emphasizing centralized control and consistent log schema design.
Structured logging in Python helps create reliable, machine-readable logs that integrate more easily into observability platforms, making system monitoring and troubleshooting more effective.
Happy engineering!