Python Debugger (pdb)

debugging, Python, command-line, step execution, breakpoints, interactive

Debugging Tool

Overview

Python Debugger (pdb) is Python's standard debugger. It provides command-line control over Python program execution, allowing you to set breakpoints, examine variables, and perform step execution.

Details

Python Debugger (pdb) is an interactive debugger included in Python's standard library, and it has evolved alongside the language since the early 1990s. Because it ships with every Python installation, it requires no additional setup and provides powerful debugging capabilities through a simple command-line interface. Its command set is modeled on GDB's, so it feels familiar to users of that debugger, while remaining tailored to Python's dynamic nature.

The defining feature of pdb is full interactive control over a running Python program. You can pause execution at any point to inspect variable values, call functions, evaluate expressions, and walk the call stack. Its post-mortem mode allows detailed analysis of program state after an exception has occurred. With a little extra work, such as subclassing Pdb and redirecting its input and output over a socket, it can even be used to debug programs running in separate processes or on remote machines.

Despite the proliferation of sophisticated IDE debuggers, pdb remains widely used thanks to its lightweight footprint and simplicity. It particularly excels where no GUI is available, such as investigating issues in production environments, debugging in CI pipelines, or troubleshooting inside Docker containers. Since Python 3.7, the built-in breakpoint() function has made dropping into pdb even simpler.
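
As a minimal sketch of that workflow, the snippet below pauses execution at breakpoint(); the function and variable names are only for illustration.

# minimal_breakpoint.py (illustrative sketch)
def total_price(prices, tax_rate=0.1):
    subtotal = sum(prices)
    # Execution pauses here and a (Pdb) prompt opens;
    # inspect the running state with, e.g.: p subtotal
    breakpoint()  # Python 3.7+; equivalent to: import pdb; pdb.set_trace()
    return subtotal * (1 + tax_rate)

print(total_price([10, 20, 30]))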

Third-party tools that extend pdb, such as pdb++ (pdbpp) and ipdb, add syntax highlighting, tab completion, and other interface improvements. In modern Python development, pdb remains a fundamental debugging tool.
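
For example, these drop-in replacements can be installed from PyPI and wired into breakpoint() without changing any code (my_script.py below is just a placeholder name):

# Install the enhanced debuggers (package names as published on PyPI)
pip install ipdb pdbpp

# Route the built-in breakpoint() to ipdb for this run only
PYTHONBREAKPOINT=ipdb.set_trace python my_script.py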

Pros and Cons

Pros

  • Built-in: Available from Python installation, no additional setup required
  • Lightweight: Minimal resource consumption during debugging
  • Simple Operation: Intuitive command-line interface
  • Complete Control: Detailed control and inspection of program execution
  • Post-mortem Analysis: Detailed state analysis after exceptions
  • Remote Debugging: Possible over a network with a small Pdb subclass that redirects its I/O
  • IDE-independent: Consistent debugging experience in any environment
  • Extensibility: Custom commands and scripting capabilities

Cons

  • Learning Curve: Command-line operation requires learning
  • No GUI: Lack of visual interface
  • Limited Features: Limited functionality compared to IDE debuggers
  • Efficiency: Reduced operational efficiency in large programs
  • Display Limitations: Difficult visualization of complex data structures
  • Poor Integration: Weak integration with modern development tools
  • Beginner Barrier: Operations can be difficult for Python beginners

Usage Examples

Basic Debugging Setup

# basic_debugging.py
import pdb

def calculate_average(numbers):
    # Insert breakpoint here
    pdb.set_trace()  # Traditional method
    # Or use Python 3.7+ built-in:
    # breakpoint()
    
    total = sum(numbers)
    count = len(numbers)
    
    # This will cause a ZeroDivisionError if numbers is empty
    average = total / count
    return average

def main():
    # Test with various inputs
    test_cases = [
        [1, 2, 3, 4, 5],
        [10, 20, 30],
        [],  # This will cause an error
    ]
    
    for numbers in test_cases:
        try:
            avg = calculate_average(numbers)
            print(f"Average of {numbers}: {avg}")
        except ZeroDivisionError:
            print(f"Cannot calculate average of empty list")

if __name__ == "__main__":
    main()

# Basic pdb commands:
# h(elp) - Show command help
# l(ist) - Show current code
# n(ext) - Execute next line (step over)
# s(tep) - Step into function
# c(ontinue) - Continue execution
# p variable - Print variable value
# pp variable - Pretty-print variable
# w(here) - Show stack trace
# u(p) - Move up in stack
# d(own) - Move down in stack
# q(uit) - Quit debugger
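
To give a feel for these commands, an abbreviated session for the script above might look roughly like this (paths and line numbers will vary, output is trimmed):

$ python basic_debugging.py
> .../basic_debugging.py(11)calculate_average()
-> total = sum(numbers)
(Pdb) p numbers
[1, 2, 3, 4, 5]
(Pdb) n
> .../basic_debugging.py(12)calculate_average()
-> count = len(numbers)
(Pdb) p total
15
(Pdb) c
Average of [1, 2, 3, 4, 5]: 3.0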

Advanced Breakpoint Usage

# advanced_breakpoints.py
import pdb
import sys

class DataProcessor:
    def __init__(self):
        self.data = []
        self.processed_count = 0
    
    def add_data(self, item):
        self.data.append(item)
    
    def process_all(self):
        for index, item in enumerate(self.data):
            # Conditional breakpoint
            # Only break when processing specific items
            if index > 2:
                pdb.set_trace()
            
            self.process_item(item)
            self.processed_count += 1
    
    def process_item(self, item):
        # Simulate processing
        if isinstance(item, str):
            return item.upper()
        elif isinstance(item, (int, float)):
            return item * 2
        else:
            raise TypeError(f"Unsupported type: {type(item)}")

# Using pdb from command line
# python -m pdb advanced_breakpoints.py

# Setting breakpoints programmatically
def debug_specific_condition(value):
    # Only enter debugger under specific conditions
    if value > 100:
        import pdb; pdb.set_trace()
    
    result = value ** 2
    return result

# Using breakpoint() with environment variable
# PYTHONBREAKPOINT=0 python script.py  # Disable all breakpoints
# PYTHONBREAKPOINT=ipdb.set_trace python script.py  # Use ipdb instead

if __name__ == "__main__":
    processor = DataProcessor()
    processor.add_data("hello")
    processor.add_data(42)
    processor.add_data("world")
    processor.add_data(3.14)
    processor.add_data([1, 2, 3])  # This will cause an error
    
    try:
        processor.process_all()
    except Exception as e:
        print(f"Error during processing: {e}")
        # Enter post-mortem debugging
        pdb.post_mortem()
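
Conditional breaks do not have to be hard-coded as above; they can also be set from the (Pdb) prompt with the break, condition, and commands commands. The session below is an illustrative sketch against this script (the line number is only an example):

$ python -m pdb advanced_breakpoints.py
(Pdb) b advanced_breakpoints.py:25, isinstance(item, list)
Breakpoint 1 at .../advanced_breakpoints.py:25
(Pdb) commands 1
(com) pp item
(com) end
(Pdb) c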

Post-mortem Debugging

# post_mortem_debugging.py
import pdb
import sys
import traceback

def risky_operation(x, y):
    """Perform a risky mathematical operation"""
    result = x / y  # Potential ZeroDivisionError
    return result ** 2

def complex_calculation(data):
    """Process data with multiple potential failure points"""
    results = []
    
    for i, item in enumerate(data):
        try:
            # Multiple operations that could fail
            value = float(item)
            normalized = value / max(data)
            result = risky_operation(normalized, i)
            results.append(result)
        except Exception as e:
            print(f"Error processing item {i}: {e}")
            raise
    
    return results

# Method 1: Using pdb.pm() in interactive session
def demo_post_mortem_interactive():
    try:
        data = [5, 10, 0, 20, "invalid", 30]
        result = complex_calculation(data)
    except Exception:
        # Enter post-mortem debugging on the last exception
        import pdb; pdb.pm()

# Method 2: Using sys.excepthook for automatic post-mortem
def enable_post_mortem_hook():
    """Enable automatic post-mortem debugging on unhandled exceptions"""
    def debug_hook(type, value, tb):
        traceback.print_exception(type, value, tb)
        pdb.post_mortem(tb)
    
    sys.excepthook = debug_hook

# Method 3: Context manager for post-mortem debugging
class PostMortemDebugger:
    def __enter__(self):
        return self
    
    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is not None:
            print(f"\nException occurred: {exc_type.__name__}: {exc_val}")
            pdb.post_mortem(exc_tb)
        return False  # Don't suppress the exception

# Usage example
def main():
    # Enable automatic post-mortem debugging
    # enable_post_mortem_hook()
    
    # Using context manager
    with PostMortemDebugger():
        data = [5, 10, 0, 20, 15]
        result = complex_calculation(data)
        print(f"Results: {result}")

if __name__ == "__main__":
    # Run with: python -m pdb post_mortem_debugging.py (type c to start)
    # pdb enters post-mortem mode automatically if an exception propagates
    main()
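
Post-mortem debugging also works straight from the interactive interpreter: run a script with python -i so the interpreter stays open after the traceback, then call pdb.pm(). A rough sketch, using a hypothetical failing_script.py whose exception propagates uncaught:

$ python -i failing_script.py
Traceback (most recent call last):
  ...
ZeroDivisionError: division by zero
>>> import pdb; pdb.pm()
> .../failing_script.py(4)divide()
-> return x / y
(Pdb) p x, y
(10, 0)
(Pdb) q
>>>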

Remote and Script Debugging

# remote_debugging.py
import pdb
import socket
import sys

# Remote debugging using pdb
class RemotePdb(pdb.Pdb):
    """PDB subclass that accepts remote connections"""
    
    def __init__(self, host='127.0.0.1', port=4444):
        self.host = host
        self.port = port
        
        # Create socket for remote connection
        self.server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.server_socket.bind((self.host, self.port))
        self.server_socket.listen(1)
        
        print(f"Remote debugger listening on {self.host}:{self.port}")
        print(f"Connect with: telnet {self.host} {self.port}")
        
        self.connection, address = self.server_socket.accept()
        
        # Redirect I/O to socket
        handle = self.connection.makefile('rw')
        super().__init__(stdin=handle, stdout=handle)
        
        print(f"Remote debugger connected from {address}")

def set_remote_trace(host='127.0.0.1', port=4444):
    """Convenience function to set remote trace"""
    debugger = RemotePdb(host, port)
    debugger.set_trace(sys._getframe().f_back)

# Example usage in application
def application_logic():
    data = load_data()
    
    # Enable remote debugging at this point
    # set_remote_trace()  # Uncomment to enable
    
    processed = process_data(data)
    return processed

def load_data():
    """Simulate data loading"""
    return [1, 2, 3, 4, 5]

def process_data(data):
    """Process data with potential debugging points"""
    results = []
    
    for item in data:
        # Can also use conditional remote debugging
        if item > 3:
            # set_remote_trace()  # Debug only specific conditions
            pass
        
        result = item ** 2
        results.append(result)
    
    return results

# Script debugging with pdb
def debug_script(script_path):
    """Debug a Python script using pdb"""
    import runpy
    
    # Set up debugging environment
    sys.argv = [script_path]  # Set command line arguments
    
    # Run the script under the debugger; pass runpy in via globals so
    # the statement can resolve it when pdb executes it
    pdb.run(f"runpy.run_path({script_path!r})", globals={"runpy": runpy})

if __name__ == "__main__":
    # Example: Debug another script
    # debug_script('another_script.py')
    
    # Run normal application
    result = application_logic()
    print(f"Results: {result}")

Interactive Debugging Techniques

# interactive_debugging.py
import pdb
import sys
import inspect
import dis

class DebugHelper:
    """Helper class for advanced debugging techniques"""
    
    @staticmethod
    def explore_object(obj):
        """Explore object attributes and methods interactively"""
        pdb.set_trace()
        
        # In pdb, you can:
        # - dir(obj) to see all attributes
        # - vars(obj) to see instance variables
        # - type(obj) to check object type
        # - help(obj) for documentation
        # - inspect.getmembers(obj) for detailed info
        
        return obj
    
    @staticmethod
    def trace_function_calls(func):
        """Decorator to trace function calls"""
        def wrapper(*args, **kwargs):
            # Get function details
            func_name = func.__name__
            func_args = inspect.signature(func).bind(*args, **kwargs)
            func_args.apply_defaults()
            
            print(f"Calling {func_name}{func_args}")
            pdb.set_trace()  # Break before function execution
            
            result = func(*args, **kwargs)
            
            print(f"{func_name} returned: {result}")
            return result
        
        return wrapper

# Example: Debugging list comprehensions
def debug_comprehension():
    """Debug complex list comprehensions"""
    data = range(10)
    
    # Hard to debug list comprehension
    # result = [x**2 for x in data if x % 2 == 0]
    
    # Debuggable version
    result = []
    for x in data:
        if x % 2 == 0:
            pdb.set_trace()  # Debug each iteration
            squared = x ** 2
            result.append(squared)
    
    return result

# Debugging with bytecode inspection
def analyze_bytecode(func):
    """Analyze function bytecode for debugging"""
    print(f"Bytecode for {func.__name__}:")
    dis.dis(func)
    pdb.set_trace()
    
    # In pdb, you can examine:
    # - func.__code__.co_varnames - local variable names
    # - func.__code__.co_consts - constants used
    # - func.__code__.co_names - global names referenced

# Custom pdb commands
class CustomPdb(pdb.Pdb):
    """Extended pdb with custom commands"""
    
    def do_inspect(self, arg):
        """Inspect object details - usage: inspect <object>"""
        try:
            obj = eval(arg, self.curframe.f_globals, self.curframe.f_locals)
            print(f"Type: {type(obj)}")
            print(f"ID: {id(obj)}")
            print(f"Size: {sys.getsizeof(obj)} bytes")
            print(f"Attributes: {dir(obj)}")
            if hasattr(obj, '__dict__'):
                print(f"Instance dict: {obj.__dict__}")
        except Exception as e:
            print(f"Error inspecting object: {e}")
    
    def do_locals(self, arg):
        """Show all local variables"""
        frame = self.curframe
        for key, value in frame.f_locals.items():
            print(f"{key} = {repr(value)}")

# Example usage
@DebugHelper.trace_function_calls
def calculate(x, y, operation='+'):
    """Perform calculation with tracing"""
    operations = {
        '+': lambda a, b: a + b,
        '-': lambda a, b: a - b,
        '*': lambda a, b: a * b,
        '/': lambda a, b: a / b,
    }
    
    return operations[operation](x, y)

if __name__ == "__main__":
    # Test custom debugging
    helper = DebugHelper()
    
    # Debug object exploration
    test_obj = {"key": "value", "number": 42}
    helper.explore_object(test_obj)
    
    # Debug function calls
    result = calculate(10, 5, operation='*')
    
    # Debug comprehension
    squares = debug_comprehension()
    
    # Use custom pdb
    # CustomPdb().set_trace()
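
One way to make a subclass like CustomPdb the default target of breakpoint() is to point the PYTHONBREAKPOINT environment variable at a small helper; the debug_tools module and the my_app.py name below are hypothetical.

# debug_tools.py (hypothetical helper module)
import sys
from interactive_debugging import CustomPdb

def set_trace(*args, **kwargs):
    # Called by breakpoint() when PYTHONBREAKPOINT=debug_tools.set_trace;
    # start CustomPdb in the caller's frame so the custom 'inspect' and
    # 'locals' commands are available at every breakpoint() call
    CustomPdb().set_trace(sys._getframe().f_back)

# Usage (shell):
#   PYTHONBREAKPOINT=debug_tools.set_trace python my_app.py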

Debugging Best Practices

# debugging_best_practices.py
import pdb
import logging
import functools
import contextlib

# 1. Conditional debugging based on environment
import os

DEBUG = os.getenv('DEBUG', 'False').lower() == 'true'

def conditional_breakpoint():
    """Only break if DEBUG environment variable is set"""
    if DEBUG:
        pdb.set_trace()

# 2. Debugging decorator
def debug_on_error(func):
    """Decorator that enters pdb on exception"""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            print(f"Error in {func.__name__}")
            pdb.post_mortem()
            raise
    return wrapper

# 3. Context manager for debugging blocks
@contextlib.contextmanager
def debug_context(name="Debug Block"):
    """Context manager for debugging specific code blocks"""
    print(f"Entering {name}")
    pdb.set_trace()
    try:
        yield
    finally:
        print(f"Exiting {name}")

# 4. Debugging with logging integration
class DebugLogger:
    """Integrate logging with debugging"""
    
    def __init__(self, name):
        self.logger = logging.getLogger(name)
        self.logger.setLevel(logging.DEBUG)
        
        # Console handler with debug info
        handler = logging.StreamHandler()
        formatter = logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
        )
        handler.setFormatter(formatter)
        self.logger.addHandler(handler)
    
    def debug_point(self, message, locals_dict=None):
        """Log and optionally enter debugger"""
        self.logger.debug(message)
        if locals_dict:
            self.logger.debug(f"Locals: {locals_dict}")
        
        if DEBUG:
            pdb.set_trace()

# 5. Pretty printing in pdb
def pdb_pp(obj):
    """Pretty print object in pdb session"""
    import pprint
    pprint.pprint(obj)

# 6. Save and restore debugging session
class DebugSession:
    """Save and restore debugging context"""
    
    def __init__(self):
        self.breakpoints = []
        self.watch_expressions = []
    
    def save_session(self, filename):
        """Save current debugging session"""
        import pickle
        
        session_data = {
            'breakpoints': self.breakpoints,
            'watch_expressions': self.watch_expressions,
        }
        
        with open(filename, 'wb') as f:
            pickle.dump(session_data, f)
    
    def load_session(self, filename):
        """Load debugging session"""
        import pickle
        
        with open(filename, 'rb') as f:
            session_data = pickle.load(f)
        
        self.breakpoints = session_data['breakpoints']
        self.watch_expressions = session_data['watch_expressions']

# Example usage
@debug_on_error
def risky_function(data):
    """Function that might fail"""
    result = []
    
    for item in data:
        # Using debug context
        with debug_context(f"Processing {item}"):
            if isinstance(item, str):
                result.append(item.upper())
            else:
                result.append(item * 2)
    
    return result

def main():
    # Set up debug logger
    debug_logger = DebugLogger(__name__)
    
    # Example data processing
    data = ["hello", 42, "world", None]  # None will cause error
    
    debug_logger.debug_point("Starting data processing", locals())
    
    try:
        results = risky_function(data)
        print(f"Results: {results}")
    except Exception as e:
        print(f"Processing failed: {e}")

if __name__ == "__main__":
    # Run with different debugging modes:
    # python debugging_best_practices.py
    # DEBUG=True python debugging_best_practices.py
    # python -m pdb debugging_best_practices.py
    
    main()

"""
PDB Tips and Tricks:

1. Use aliases for common commands:
   alias pi for k in %1.__dict__.keys(): print("%1.%s = %r" % (k, %1.__dict__[k]))
   
2. .pdbrc file for startup commands (note: ll/longlist is already built in):
   # ~/.pdbrc
   alias pl pp locals()
   alias pd pp dir(%1)
   
3. Use python -m pdb -c continue script.py to run normally and drop into
   post-mortem debugging when an uncaught exception occurs

4. At the (Pdb) prompt:
   - !<statement> to execute a Python statement
   - global <var>; <var> = value to assign to a global variable
   - run [args] (or restart [args]) to restart the program with new arguments
   
5. Debugging in production:
   - Use conditional breakpoints sparingly
   - Consider using logging instead of breakpoints
   - Use post-mortem debugging for crashed processes
"""