Debugging Techniques in Python
Python Programming for Beginners
Error Handling and Debugging
Debugging Techniques
Debugging is a critical skill for every programmer. Even the most experienced developers write code with bugs, but what sets them apart is their ability to efficiently identify and fix these issues. In this section, we’ll explore various debugging techniques and tools available in Python.
Understanding Common Error Types
Before diving into debugging techniques, it’s helpful to understand the common types of errors you might encounter:
- Syntax Errors: Occur when the Python parser is unable to understand your code due to violations of the language’s syntax rules.
- Runtime Errors (Exceptions): Occur during the execution of a program, even if the syntax is correct.
- Logical Errors: The code runs without raising exceptions, but produces incorrect results due to flaws in the logic.
Print Debugging
The simplest debugging technique is to add print()
statements to your code to inspect variable values and program flow:
def calculate_average(numbers):
print(f"Input: {numbers}") # Debug print
total = 0
for num in numbers:
total += num
print(f"Added {num}, total is now {total}") # Debug print
average = total / len(numbers)
print(f"Calculated average: {average}") # Debug print
return average
# Test the function
result = calculate_average([10, 20, 30, 40, 50])
print(f"Final result: {result}")
Advantages of Print Debugging:
- Simple and requires no special tools
- Works in any environment
- Easy to understand
Disadvantages of Print Debugging:
- Can clutter the code and output
- Must be manually added and removed
- Not interactive
- Inefficient for complex issues
Best Practices for Print Debugging:
- Use descriptive messages:
print(f"User data after processing: {user_data}")
- Format output for readability:
print(f"Status: {'Success' if status else 'Failure'}")
- Use temporary variables for complex expressions:
result = complex_calculation(x, y, z) print(f"Result of calculation: {result}")
Using the logging
Module
A more sophisticated alternative to print()
statements is the built-in logging
module:
import logging
# Configure logging
logging.basicConfig(
level=logging.DEBUG,
format='%(asctime)s - %(levelname)s - %(message)s',
filename='debug.log' # Optional: log to a file instead of console
)
def calculate_average(numbers):
logging.debug(f"Input: {numbers}")
if not numbers:
logging.error("Empty list provided")
raise ValueError("Cannot calculate average of empty list")
total = 0
for num in numbers:
total += num
logging.debug(f"Added {num}, total is now {total}")
average = total / len(numbers)
logging.info(f"Calculated average: {average}")
return average
# Test the function
try:
result = calculate_average([10, 20, 30, 40, 50])
print(f"Final result: {result}")
except Exception as e:
logging.exception("An error occurred")
print(f"Error: {e}")
Advantages of Logging:
- Different log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- Can be enabled/disabled without code changes
- Can log to files, network, email, etc.
- Includes timestamps and other metadata
- Can stay in production code
Note: The logging levels help categorize the importance of log messages:
- DEBUG: Detailed information, typically for diagnosing problems
- INFO: Confirmation that things are working as expected
- WARNING: An indication that something unexpected happened, but the program is still working
- ERROR: Due to a more serious problem, the software has not been able to perform some function
- CRITICAL: A serious error, indicating that the program itself may be unable to continue running
Using a Debugger
Python comes with a built-in debugger called pdb
(Python Debugger) that allows you to interactively debug your code:
import pdb
def complex_function(x, y):
result = x * 2
pdb.set_trace() # Start the debugger at this point
result = result + y
return result * 2
# Test the function
output = complex_function(5, 10)
print(output)
When the debugger is activated, you’ll be presented with a command prompt. Here are some common commands:
Command | Description |
---|---|
n (next) | Execute the current line and move to the next one |
s (step) | Step into a function call |
c (continue) | Continue execution until the next breakpoint |
q (quit) | Quit the debugger |
l (list) | Show the current location in the code |
p expression | Print the value of an expression |
pp expression | Pretty-print the value of an expression |
w (where) | Show the current call stack |
b line_number | Set a breakpoint at a specific line |
h (help) | Show help on debugger commands |
Important:
In Python 3.7 and later, you can use the built-in breakpoint()
function instead of importing pdb
and calling pdb.set_trace()
:
def complex_function(x, y):
result = x * 2
breakpoint() # Start the debugger at this point
result = result + y
return result * 2
Debugging in IDE: Visual Studio Code
Most modern IDEs include powerful debugging tools. Let’s look at how debugging works in Visual Studio Code:
Set a Breakpoint: Click in the gutter (the space to the left of the line numbers) to set a breakpoint.
Start Debugging: Press F5 or click the “Run and Debug” button in the sidebar.
Use the Debug Controls:
- Continue (F5): Resume execution until the next breakpoint
- Step Over (F10): Execute the current line and move to the next one
- Step Into (F11): Step into a function call
- Step Out (Shift+F11): Execute the rest of the current function and return to the caller
- Restart (Ctrl+Shift+F5): Restart the debugging session
- Stop (Shift+F5): End the debugging session
Inspect Variables: View and modify variable values in the “Variables” panel.
Watch Expressions: Add expressions to the “Watch” panel to monitor their values.
View the Call Stack: See the sequence of function calls that led to the current point.
Note: Similar debugging features are available in other IDEs like PyCharm, Jupyter Notebooks, and Spyder.
Debugging with assert
Statements
Assert statements provide a simple way to check that conditions are as expected:
def calculate_rectangle_area(length, width):
assert length > 0, "Length must be positive"
assert width > 0, "Width must be positive"
area = length * width
assert area > 0, "Area calculation error"
return area
# Test the function
try:
result = calculate_rectangle_area(5, -2)
print(f"Area: {result}")
except AssertionError as e:
print(f"Error: {e}") # Output: Error: Width must be positive
assert
statements are particularly useful for:
- Validating function inputs and outputs
- Checking invariants (conditions that should always be true)
- Verifying that your code’s assumptions are correct
Important:
Assert statements can be completely removed when Python is run with the -O
(optimize) flag, so don’t use them for input validation in production code. Use proper exception handling instead.
Exception Handling for Debugging
While exceptions are primarily for handling runtime errors, they can also be useful for debugging:
def process_data(data):
try:
result = data['value'] * 2
return result
except KeyError:
# Print debugging information
print(f"KeyError in process_data. Available keys: {data.keys()}")
# Re-raise the exception with more information
raise KeyError(f"Missing 'value' key in data: {data}")
except TypeError as e:
print(f"TypeError in process_data: {e}")
print(f"Data type: {type(data)}, Value type: {type(data.get('value', None))}")
raise # Re-raise the original exception
# Test the function
try:
result = process_data({'other_key': 10})
print(result)
except Exception as e:
print(f"Error caught: {e}")
This approach helps diagnose issues by:
- Providing context about what was happening when the error occurred
- Logging relevant variable values
- Adding more specific error messages
Debugging with Tracebacks
When an exception occurs, Python provides a traceback that shows the sequence of function calls leading to the error:
Traceback (most recent call last):
File "example.py", line 20, in <module>
result = process_data({'other_key': 10})
File "example.py", line 4, in process_data
result = data['value'] * 2
KeyError: 'value'
To get the traceback as a string for logging or analysis:
import traceback
try:
# Code that might raise an exception
result = 10 / 0
except Exception:
# Get the traceback as a string
trace = traceback.format_exc()
print(f"An error occurred:\n{trace}")
# Log the traceback for future reference
with open("error_log.txt", "a") as f:
f.write(f"\n--- New Error ---\n{trace}\n")
Debugging Complex Data Structures
For complex data structures, Python’s pretty-printing module (pprint
) can be invaluable:
from pprint import pprint
complex_data = {
'users': [
{'name': 'Alice', 'age': 30, 'roles': ['admin', 'user'], 'metadata': {'joined': '2020-01-15', 'last_login': '2023-05-20'}},
{'name': 'Bob', 'age': 25, 'roles': ['user'], 'metadata': {'joined': '2021-03-10', 'last_login': '2023-05-18'}}
],
'settings': {
'theme': 'dark',
'notifications': {'email': True, 'sms': False, 'push': True},
'permissions': ['read', 'write', 'delete']
}
}
# Standard print
print("Using print():")
print(complex_data)
# Pretty print
print("\nUsing pprint():")
pprint(complex_data, width=40, depth=2) # Control the formatting
# You can also use pprint with a depth limit for nested structures
print("\nWith depth limit:")
pprint(complex_data, depth=1) # Only show the first level
Rubber Duck Debugging
A surprisingly effective technique is “rubber duck debugging”:
- Get a rubber duck (or any object/imaginary person)
- Explain your code to the duck, line by line
- Often, the act of explaining the problem helps you find the solution
This technique works because it forces you to:
- Articulate what each part of your code is supposed to do
- Compare your expectations with what’s actually happening
- Consider aspects of the problem you might have overlooked
Time-Travel Debugging Techniques
Sometimes, bugs are intermittent or hard to reproduce. In these cases, “time-travel” debugging techniques can help:
Logging with Timestamps:
import time def process_transaction(transaction): start_time = time.time() log_entry = f"[{time.strftime('%H:%M:%S')}] Processing transaction {transaction['id']}\n" # Various processing steps... time.sleep(0.5) # Simulate processing time log_entry += f"[{time.strftime('%H:%M:%S')}] Transaction processed in {time.time() - start_time:.2f} seconds\n" with open("transaction_log.txt", "a") as f: f.write(log_entry)
Record and Replay: Save inputs and intermediate states to reproduce issues later.
Debugging a Real Example
Let’s walk through debugging a flawed function that calculates the median of a list of numbers:
def calculate_median(numbers):
# Sort the numbers
sorted_numbers = sorted(numbers)
# Find the middle position
n = len(sorted_numbers)
middle = n / 2
# Return the median
if n % 2 == 1:
# Odd number of elements
return sorted_numbers[middle]
else:
# Even number of elements
return (sorted_numbers[middle - 1] + sorted_numbers[middle]) / 2
# Test with odd number of elements
print(calculate_median([7, 2, 5, 1, 9])) # Should be 5
# Test with even number of elements
print(calculate_median([7, 2, 5, 1, 9, 8])) # Should be 6
When we run this code, we get:
Traceback (most recent call last):
File "median.py", line 16, in <module>
print(calculate_median([7, 2, 5, 1, 9]))
File "median.py", line 10, in calculate_median
return sorted_numbers[middle]
TypeError: list indices must be integers or slices, not float
Let’s debug this step by step:
Identify the Error: The error message tells us that we’re trying to use a float as a list index, which isn’t allowed.
Locate the Problem: The error is on line 10, where we’re using
middle
as an index.Add Debug Prints:
def calculate_median(numbers): print(f"Input: {numbers}") # Sort the numbers sorted_numbers = sorted(numbers) print(f"Sorted: {sorted_numbers}") # Find the middle position n = len(sorted_numbers) middle = n / 2 print(f"n: {n}, middle: {middle}, type(middle): {type(middle)}") # ... rest of the function
Fix the Issue: Now we can see that
middle
is a float (5/2 = 2.5), but list indices must be integers. We need to convertmiddle
to an integer:# For odd number of elements if n % 2 == 1: return sorted_numbers[int(middle)]
Test and Verify: After making this change, we should test both cases again to ensure they work correctly.
Fully corrected function:
def calculate_median(numbers):
# Sort the numbers
sorted_numbers = sorted(numbers)
# Find the middle position
n = len(sorted_numbers)
middle = n // 2 # Use integer division
# Return the median
if n % 2 == 1:
# Odd number of elements
return sorted_numbers[middle]
else:
# Even number of elements
return (sorted_numbers[middle - 1] + sorted_numbers[middle]) / 2
# Test with odd number of elements
print(calculate_median([7, 2, 5, 1, 9])) # 5
# Test with even number of elements
print(calculate_median([7, 2, 5, 1, 9, 8])) # 6.0
Unit Testing as a Debugging Tool
Unit tests can be a powerful debugging tool by verifying that specific parts of your code work as expected:
import unittest
def calculate_rectangle_area(length, width):
return length * width
class TestCalculateArea(unittest.TestCase):
def test_positive_values(self):
self.assertEqual(calculate_rectangle_area(5, 4), 20)
def test_zero_values(self):
self.assertEqual(calculate_rectangle_area(0, 5), 0)
self.assertEqual(calculate_rectangle_area(5, 0), 0)
def test_negative_values(self):
# Should area be negative or should we raise an error?
with self.assertRaises(ValueError):
calculate_rectangle_area(-5, 4)
# Run the tests
if __name__ == "__main__":
unittest.main()
This helps us:
- Verify that fixes actually solve the problem
- Prevent regression (reintroducing bugs)
- Document expected behavior
Debugging Checklist
When faced with a bug, follow this systematic approach:
Reproduce the Bug:
- Create a minimal test case that consistently triggers the bug.
Understand the Error Message:
- Read the error message carefully, noting the exception type and line number.
- Examine the traceback to understand the sequence of function calls.
Inspect Variable Values:
- Add print statements or use a debugger to check variable values.
- Verify that inputs and outputs match your expectations.
Check Assumptions:
- Are you assuming something about the data that’s not true?
- Are function arguments in the correct order?
- Are you using the right data type?
Isolate the Problem:
- Comment out sections of code to pinpoint where the issue lies.
- Simplify complex expressions into multiple steps.
Fix the Bug:
- Make one change at a time.
- Test after each change to verify that it fixes the issue.
Add Tests:
- Create a test that would have caught this bug.
- Run the test to confirm your fix works.
Document the Issue and Solution:
- Add comments explaining the bug and how you fixed it.
- Update documentation if necessary.
Exercises
Exercise 1: The following function is supposed to count the frequency of each character in a string. However, it has a bug. Use print debugging to find and fix the issue:
def count_characters(text):
counter = {}
for char in text:
if char in counter:
counter[char] += 1
else:
counter[char] = 1
return counter
result = count_characters("hello world")
print(result) # Expected: {'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1}
Exercise 2: The following function attempts to find the largest number in a list that is divisible by a given divisor. However, it has multiple bugs. Use the pdb debugger to find and fix them:
def largest_divisible(numbers, divisor):
largest = 0
for num in numbers:
if num % divisor == 0:
if num > largest:
largest = num
return largest
numbers = [12, 30, 21, 8, 25, 15, 22]
result = largest_divisible(numbers, 5)
print(result) # Expected: 30
Exercise 3: Create a function called analyze_student_scores
that takes a list of student score dictionaries and returns statistics about the scores (min, max, average). Include error handling and debugging features like assertions and logging. Test your function with valid and invalid inputs.
# Example input:
scores = [
{'name': 'Alice', 'score': 85},
{'name': 'Bob', 'score': 92},
{'name': 'Charlie', 'score': 78},
# ...possibly more students
]
# Your function should handle cases like:
# - Empty list
# - Missing 'score' key
# - Non-numeric scores
Hint for Exercise 1: Add print statements to display the value of char
and counter
during each iteration of the loop.
In the next section, we’ll explore using the Python debugger in more detail, learning how to set breakpoints, step through code, and inspect variables to find bugs efficiently.