Defensive Programming in Python

1

Behavior From Examples

Why this matters

Defensive programming starts before the first if: you decide which inputs belong to the function’s contract and which ones are caller mistakes. Good examples and non-examples keep validation focused instead of turning it into guesswork.

🎯 You will learn to

Analyze valid and invalid examples as partitions of the input space.
Convert examples into a small contract before writing code.
Distinguish wrong type, malformed value, out-of-range value, and valid edge case.

Predict the contract

Study this intended function:

def parse_percentage(text: str) -> float:
    ...

It should accept these strings:

"0%" returns 0.0
"25%" returns 25.0
"100%" returns 100.0
" 7% " returns 7.0

It should reject these values:

"", because there is no percentage text
"25", because the percent sign is missing
"-1%", because the value is below the valid range
"101%", because the value is above the valid range
"abc%", because the numeric part is not a number
25, because the value has the wrong type

Before you continue, write one sentence that describes the contract. A strong version says something like: “The caller must pass a string that, after surrounding whitespace is ignored, is a numeric percentage with a percent sign and a value from 0 through 100.”

Why no code yet?

This first step is a classification checkpoint. If you cannot sort an input into the right failure category, the implementation usually becomes either too permissive ("25" is accepted) or too vague (ValueError for everything, including wrong types).
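One way to make this checkpoint concrete, before any parsing logic exists, is to record the examples as data with a partition label on each. The sketch below is only an illustration; the names ACCEPTED, REJECTED, and partitions, and the category strings, are hypothetical and not part of the starter files.

```python
# Examples from the lists above, recorded as data before any implementation.
# ACCEPTED maps each valid input to its expected result.
ACCEPTED = {
    "0%": 0.0,      # valid edge case: lower boundary
    "25%": 25.0,
    "100%": 100.0,  # valid edge case: upper boundary
    " 7% ": 7.0,    # surrounding whitespace is ignored
}

# REJECTED pairs each invalid input with the failure category it belongs to.
REJECTED = [
    ("", "malformed value"),
    ("25", "malformed value"),
    ("-1%", "out of range"),
    ("101%", "out of range"),
    ("abc%", "malformed value"),
    (25, "wrong type"),
]

# Group the rejected inputs by category to see each partition at a glance.
partitions = {}
for value, category in REJECTED:
    partitions.setdefault(category, []).append(value)

for category, values in partitions.items():
    print(f"{category}: {values!r}")
```

If an example is hard to label, that is usually a sign the contract sentence still needs work.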

Starter files

percentages.py

def parse_percentage(text: str) -> float:
    """Return the numeric value in a percentage string.

    Contract draft:
      - caller passes a string
      - surrounding whitespace is ignored
      - the cleaned value ends with "%"
      - the number before "%" is in the inclusive range 0..100
    """
    raise NotImplementedError("You will implement validation in later steps.")

Step 1 — Knowledge Check

Min. score: 80%

1. parse_percentage(25) is invalid for which reason?

Wrong type: the function contract asks for a string.
Malformed value: the percent sign is missing.
Malformed-value checks happen after the caller has supplied the right kind of value. Here the caller supplied an int, not a str.
Out of range: 25 is too large.
The value 25 would be in range if it came from a valid percentage string. The first problem is the type.
Valid edge case: 25 means 25 percent.
Accepting a bare number would silently expand the contract. This function is specifically parsing percentage strings.

The caller violated the type part of the contract. Defensive code should report that as a type problem, not try to reinterpret the integer.

2. parse_percentage("25") is invalid for which reason?

Wrong type: the input is not a string.
"25" is a string, so the type part of the contract is satisfied. The shape of the string is the problem.
Malformed value: the required percent sign is missing.
Out of range: 25 is outside 0 through 100.
The numeric value 25 is inside the allowed range. Range checks happen after the string has the required form.
Valid edge case: 25 is exactly on a boundary.
The boundary values are 0 and 100. This input also lacks the percent sign the parser promises to require.

The value has the right type but the wrong shape. A parser that accepts both "25" and "25%" is no longer enforcing the stated contract.

3. parse_percentage("101%") is invalid for which reason?

Wrong type: strings cannot represent percentages.
Strings are exactly what this parser accepts. The type is not the issue.
Malformed value: 101 is not numeric.
101 can be parsed as a number. The problem appears after parsing, when the value is compared with the allowed range.
Out of range: the numeric value is above 100.
Valid edge case: the percent sign is present.
Having a percent sign is necessary, but not sufficient. The numeric value still has to be between 0 and 100 inclusive.

This is a range violation: the input has the right type and shape, but the parsed numeric value is outside the contract.

4. Which input is a valid edge case for this contract?

"0%"
"-1%"
A value just below the lower boundary is useful to test, but it belongs in the rejected partition.
"abc%"
The percent sign is present, but the numeric part cannot be parsed as a number.
""
The empty string is malformed because it has no number and no percent sign.

"0%" sits exactly on the lower boundary and is included by the contract. Boundary values are worth testing precisely because one wrong comparison can reject them.

5. Four programmers each propose a different implementation strategy for parse_percentage. Which strategy is the most defensive — and why?

Reject non-strings with TypeError, reject malformed/out-of-range strings with ValueError, return the value only after every check passes.
Coerce anything that looks numeric: float(str(text).rstrip('%')) / 100.0. Always return a number.
Coercion silently widens the contract. parse_percentage(25) would be accepted and return 0.25, while parse_percentage(None) would fail inside float("None") with a message about a string the caller never passed.
Wrap the body in try: ... except Exception: return 0.0 so the caller never sees an exception.
Broad exception swallowing hides the kind of failure (empty string vs malformed number vs out-of-range) behind one indistinguishable return value. The caller has no way to repair the input.
Return None for any input that doesn’t parse cleanly; let the caller check the result.
Sentinel returns force every caller to add a check, and they confuse ‘unparseable’ with ‘legitimately 0.0%’. Explicit exceptions are easier for callers to handle correctly.

Defensive programming is about making the failure tell the caller what to fix. Distinct exception classes do that — they make the kind of mistake (wrong type vs wrong shape vs wrong value) visible at the point of failure. The other three strategies all silently broaden, swallow, or flatten the failure signal.
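The first strategy could look roughly like the sketch below. The tutorial itself leaves parse_percentage unimplemented, so this is one possible shape rather than the reference solution. It assumes that "numeric" means whatever float() accepts, which is broader than plain digits (for example "2.5e1%" would pass); a stricter contract would need a stricter check.

```python
def parse_percentage(text: str) -> float:
    """Return the numeric value in a percentage string such as "25%"."""
    # Wrong type: report it before looking at the content.
    if not isinstance(text, str):
        raise TypeError("text must be a string")

    cleaned = text.strip()
    # Malformed value: the percent sign is required.
    if not cleaned.endswith("%"):
        raise ValueError(f"missing percent sign: {text!r}")

    try:
        # Assumption: float() defines what counts as a number here.
        value = float(cleaned[:-1])
    except ValueError as exc:
        raise ValueError(f"not a numeric percentage: {text!r}") from exc

    # Out of range: nan fails this comparison too, so it is rejected here.
    if not (0.0 <= value <= 100.0):
        raise ValueError(f"percentage must be from 0 through 100: {text!r}")
    return value


print(parse_percentage(" 7% "))  # 7.0
```

Each check maps to one partition from the examples, in the order type, shape, range.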

2

Precise Exceptions

Why this matters

Defensive code is more useful when the failure tells the caller what kind of mistake they made. Python already has a useful split: TypeError for the wrong kind of object, ValueError for the right kind of object with an unacceptable value.
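Python's built-ins follow the same split, which can be checked directly. The helper name exception_name below is hypothetical and exists only for this demonstration.

```python
def exception_name(func, *args):
    """Call func(*args) and return the name of the exception class it raises."""
    try:
        func(*args)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


# A str is a kind of object int() accepts, but this text is not a number.
print(exception_name(int, "abc"))   # ValueError
# A list is not a kind of object int() accepts at all.
print(exception_name(int, [1, 2]))  # TypeError
# len() needs a sized object; an int has no length.
print(exception_name(len, 5))       # TypeError
# chr() accepts ints, but -1 is outside the valid code point range.
print(exception_name(chr, -1))      # ValueError
```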

🎯 You will learn to

Apply TypeError and ValueError to different validation failures.
Use pytest.raises to verify the exact exception type.
Reject bool where an integer count is required.

Your task

Open repeat_text.py. Implement:

def repeat(text: str, times: int) -> str:
    ...

Contract:

text must be a str; otherwise raise TypeError.
times must be an int, but not bool; otherwise raise TypeError.
times must be non-negative; otherwise raise ValueError.
Valid inputs return text * times.

The tests check exact exception classes. A subclass or a broad Exception is not enough because the exception choice is the learning target here.
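The reason the tests add an identity check is that matching by class, whether in an except clause or in pytest.raises, also accepts subclasses. The sketch below shows the difference with the standard library only; raised_class is a hypothetical helper name.

```python
import json


def raised_class(func, *args):
    """Return the class of the ValueError (or subclass) raised by func(*args)."""
    try:
        func(*args)
    except ValueError as exc:
        return type(exc)
    return None


cls = raised_class(json.loads, "{not valid json")
print(cls.__name__)                 # JSONDecodeError
print(issubclass(cls, ValueError))  # True: "except ValueError" caught it
print(cls is ValueError)            # False: an exact-class check would fail
```

An assertion such as excinfo.type is ValueError is the pytest equivalent of the last line.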

Starter files

repeat_text.py

def repeat(text: str, times: int) -> str:
    """Return text repeated times times."""
    return text * times

test_repeat_text.py

import pytest
from repeat_text import repeat


def test_repeats_text_for_valid_count():
    assert repeat("ha", 3) == "hahaha"
    assert repeat("ha", 0) == ""


def test_wrong_text_type_is_type_error():
    with pytest.raises(TypeError) as excinfo:
        repeat(123, 2)
    assert excinfo.type is TypeError


def test_wrong_times_type_is_type_error():
    with pytest.raises(TypeError) as excinfo:
        repeat("ha", "3")
    assert excinfo.type is TypeError


def test_negative_times_is_value_error():
    with pytest.raises(ValueError) as excinfo:
        repeat("ha", -1)
    assert excinfo.type is ValueError

Solution

repeat_text.py

def repeat(text: str, times: int) -> str:
    """Return text repeated times times."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    if isinstance(times, bool) or not isinstance(times, int):
        raise TypeError("times must be an integer")
    if times < 0:
        raise ValueError("times must be non-negative")
    return text * times

The implementation checks the type part of the contract first, then the value constraint. That keeps the exception type aligned with the caller’s mistake.

Step 2 — Knowledge Check

Min. score: 80%

1. repeat("ha", "3") raises TypeError rather than ValueError. Why is TypeError the right choice here?

Because "3" is the wrong kind of object — the contract wanted an integer, not a string.
Because the string "3" can’t be converted to an integer.
"3" can be converted with int("3") — but the contract doesn’t ask the function to do that coercion. The kind mismatch is the violation.
Because TypeError is always preferred over ValueError for safety.
There’s no general rule that TypeError is safer. The split has semantic meaning: kind vs value. Both belong in defensive code at the right places.
Because Python forces TypeError whenever string × string is attempted.
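Python would raise TypeError on "ha" * "3" by itself, but its message (can’t multiply sequence by non-int of type 'str') describes the failed operation rather than the argument the caller got wrong. Checking the kind mismatch first gives the caller a clear diagnostic.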

TypeError signals ‘wrong kind of object’; ValueError signals ‘right kind, unacceptable value’. The two carry distinct information for the caller — and choosing the right one is part of the defensive contract.

2. repeat("ha", True) raises TypeError even though True is an int in Python. What is this defending against?

The bool-as-int trap: isinstance(True, int) returns True, so a check that accepts integers would silently accept booleans — almost never what the caller meant.
A bug in CPython where booleans corrupt string multiplication.
There’s no such bug. "ha" * True is well-defined Python — it returns "ha". The defense is against caller confusion, not language brokenness.
The risk that True evaluates to 0 in arithmetic contexts.
True evaluates to 1 in arithmetic, not 0. And that’s exactly the trap: repeat("ha", True) would silently return "ha" instead of raising — likely hiding a caller bug.
Nothing — the test is overly strict and should be removed.
The test is enforcing a contract distinction the language doesn’t enforce for you. That’s the whole point of defensive programming.

Python’s bool is a subclass of int, so isinstance(True, int) is True. The defensive check isinstance(x, bool) or not isinstance(x, int) catches True/False before accepting them as counts — because passing a boolean to a function that wants a count is almost certainly a caller mistake.
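The behavior described above is easy to confirm in a few lines. The helper is_plain_int is a hypothetical name for the check used in the solution.

```python
def is_plain_int(value: object) -> bool:
    """Return True for int values that are not bool."""
    return isinstance(value, int) and not isinstance(value, bool)


print(isinstance(True, int))  # True: bool is a subclass of int
print("ha" * True)            # ha (True behaves like 1)
print(repr("ha" * False))     # '' (False behaves like 0)
print(is_plain_int(True))     # False
print(is_plain_int(3))        # True
```

A stricter alternative is type(value) is int, which also rejects every other int subclass; whether that is desirable depends on the contract.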

3. A reviewer suggests collapsing all three checks in repeat into one line: if not (isinstance(text, str) and isinstance(times, int) and times >= 0): raise ValueError("invalid input"). What does this change cost you?

It collapses three distinct failure messages into one undifferentiated ‘invalid input’, and it raises ValueError even when the failure was a type problem.
Nothing — it’s cleaner and behaves identically.
Behavior is not identical — the failure message and the exception class both lose information. The caller can no longer tell whether they passed a string instead of an int, or a negative count.
It makes the function slower because of short-circuit evaluation.
Short-circuit evaluation makes it marginally faster, not slower. Performance isn’t the issue here.
It breaks pytest.raises because the test framework can’t parse compound conditions.
pytest.raises works fine with any code that raises. The cost is diagnostic, not technical: the caller gets less information.

Defensive programming is about information at the failure site. Collapsing checks erases the distinction between ‘wrong kind’ and ‘wrong value’ — the very signal the caller needs to fix their code.

3

Rejecting Quiet Repairs

Why this matters

Defensive programming is not the same as making every bad call work somehow. Quietly repairing a caller bug can move the failure far away from the cause, which makes the next bug harder to debug.

🎯 You will learn to

Evaluate when an automatic repair hides a caller mistake.
Reject invalid relationships between arguments.
Preserve the contract of a small numeric helper.

Your task

clamp(value, lower, upper) should return value limited to the inclusive range [lower, upper].

Valid examples:

clamp(5, 0, 10)    # 5
clamp(-2, 0, 10)   # 0
clamp(12, 0, 10)   # 10

The starter code tries to be helpful by swapping the bounds when lower > upper. Delete that behavior. The contract should reject reversed bounds with ValueError; callers need to know they passed the arguments in the wrong order.

Starter files

ranges.py

def clamp(value: float, lower: float, upper: float) -> float:
    """Return value limited to the inclusive interval [lower, upper]."""
    if lower > upper:
        # This hides the caller's bug. Replace it with a clear failure.
        lower, upper = upper, lower
    if value < lower:
        return lower
    if value > upper:
        return upper
    return value

test_ranges.py

import pytest
from ranges import clamp


def test_value_inside_range_is_unchanged():
    assert clamp(5, 0, 10) == 5


def test_value_below_range_returns_lower_bound():
    assert clamp(-2, 0, 10) == 0


def test_value_above_range_returns_upper_bound():
    assert clamp(12, 0, 10) == 10


def test_reversed_bounds_are_rejected():
    with pytest.raises(ValueError) as excinfo:
        clamp(5, 10, 0)
    assert excinfo.type is ValueError

Solution

ranges.py

def clamp(value: float, lower: float, upper: float) -> float:
    """Return value limited to the inclusive interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower must be less than or equal to upper")
    if value < lower:
        return lower
    if value > upper:
        return upper
    return value

The function still helps valid callers by clamping values, but it refuses to reinterpret a reversed range. That keeps the failure close to the caller’s mistake.

Step 3 — Knowledge Check

Min. score: 80%

1. Why is silently swapping reversed bounds in clamp(7, 9, 3) worse than raising ValueError?

It hides a caller bug. The next time the caller passes (value, max, min) instead of (value, min, max), the function appears to work — but the caller’s mental model is broken, and the next bug will be harder to find.
It violates Python’s style guide.
PEP 8 doesn’t mention this. The harm is semantic, not stylistic — and far more serious than a lint warning.
It costs measurable performance because of the tuple unpacking.
Performance is irrelevant at this scale. The cost is correctness: a broken caller passes tests and ships.
It changes the return type from float to tuple[float, float].
The return type doesn’t change. The harm is in the caller’s invisible-to-tests confusion about argument order.

Defensive programming is partly about preserving the visibility of bugs. Silent repair shifts a bug from ‘detected immediately’ to ‘detected weeks later in production’, where it is far more expensive to diagnose.
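A side-by-side comparison makes the difference visible. The names clamp_lenient and clamp_strict are hypothetical; they stand for the starter code and the solution respectively, written compactly with min and max.

```python
def clamp_lenient(value, lower, upper):
    """Starter behavior: quietly swap reversed bounds."""
    if lower > upper:
        lower, upper = upper, lower
    return max(lower, min(value, upper))


def clamp_strict(value, lower, upper):
    """Solution behavior: reject reversed bounds."""
    if lower > upper:
        raise ValueError("lower must be less than or equal to upper")
    return max(lower, min(value, upper))


# A caller who wrongly believes the signature is (value, maximum, minimum):
print(clamp_lenient(7, 9, 3))  # 7, so the mistake goes unnoticed

try:
    clamp_strict(7, 9, 3)
except ValueError as exc:
    print(f"rejected: {exc}")  # the mistake is reported at the call site
```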

2. Postel’s Law (“be liberal in what you accept, conservative in what you send”) is a famous internet design principle. A web framework’s URL router silently strips trailing slashes from URLs before matching. Is this the same anti-pattern as clamp’s silent swap?

No — the trailing-slash strip is at a trust boundary (untrusted user input from HTTP) where normalization is intentional and is usually logged. clamp is an internal contract between trusted code where silent repair hides programmer mistakes.
Yes — any silent transformation of input is always bad.
This is the canonical Postel’s Law misconception. There’s a meaningful distinction between boundary tolerance (normalize HTTP input, web forms, file encodings) and interior strictness (function arguments between trusted modules).
No — Postel’s Law was deprecated by RFC 8546, so neither pattern is acceptable.
The RFC that pushes back on Postel’s Law is RFC 9413 (2023), not RFC 8546, and it does not deprecate the principle outright. The practical rule that emerged isn’t ‘never normalize’; it’s ‘normalize at the boundary, be strict in the interior.’ Many HTTP frameworks still normalize trailing slashes.
Yes — but only because URLs and numeric ranges are fundamentally different domains.
The domain isn’t the point. The point is who is on the other side of the call: an untrusted HTTP client (normalize) or another part of your own program (fail loud).

Defensive programming has a location dimension: at the system boundary (HTTP input, file parsing, command-line args), liberal acceptance with documented normalization is often correct. Inside trusted code, silent repair hides bugs and should be rejected — exactly what clamp enforces.
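The location dimension can be sketched as two functions with different attitudes. Everything here is hypothetical (ROUTES, normalize_request_path, lookup_route) and does not describe any particular web framework.

```python
# Hypothetical routing table for the illustration.
ROUTES = {"/": "home", "/users": "user_list"}


def normalize_request_path(raw_path: str) -> str:
    """Boundary: tolerate trailing slashes from an untrusted client."""
    stripped = raw_path.rstrip("/")
    return stripped or "/"  # keep the root path as "/"


def lookup_route(path: str) -> str:
    """Interior: callers must pass an already-normalized path."""
    if path != "/" and path.endswith("/"):
        raise ValueError(f"path is not normalized: {path!r}")
    return ROUTES[path]  # unknown paths raise KeyError in this sketch


# The boundary normalizes once; the interior stays strict.
print(lookup_route(normalize_request_path("/users/")))  # user_list

try:
    lookup_route("/users/")  # internal code skipped the boundary step
except ValueError as exc:
    print(f"rejected: {exc}")
```

The normalization is documented and happens in exactly one place, so internal callers that skip it are caught rather than accommodated.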

3. A teammate proposes that clamp(value, lower, upper) also coerce non-numeric inputs: clamp("5", 0, 10) would try float("5") and continue. Is this a good defensive addition?

No — it silently widens the contract. The next caller who passes a string by mistake will see clamp ‘work’, and the actual bug (wrong variable being passed) goes undetected.
Yes — being more accepting makes the function more robust.
This is the exact same anti-pattern as the silent bound-swap, just dressed up as ‘helpfulness’. Robustness ≠ accepting any input — it means responding to bad input in a way that helps the caller fix their code.
Yes — float("5") is a cheap conversion, so the cost is negligible.
Conversion cost isn’t the issue. The issue is the contract: clamp is documented to accept floats, and silently expanding to strings means the documentation lies.
No — but only because float("5e1") would also succeed, which is unexpected.
float("5e1") succeeding is a curiosity, not the main problem. The main problem is that the contract no longer matches the docstring.

The pattern in this step generalizes: any time a function silently expands the set of inputs it accepts, it’s hiding caller bugs. The right move is to reject early with a precise exception, not to coerce.
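Rejecting early instead of coercing might look like the following sketch. The names require_real and clamp_checked are hypothetical, and excluding bool is an assumption carried over from Step 2 rather than something the clamp contract states.

```python
from numbers import Real


def require_real(name: str, value: object) -> None:
    """Raise TypeError unless value is a real number (bool excluded by assumption)."""
    if isinstance(value, bool) or not isinstance(value, Real):
        raise TypeError(f"{name} must be a real number, got {type(value).__name__}")


def clamp_checked(value, lower, upper):
    """Like clamp, but rejects non-numeric arguments instead of coercing them."""
    require_real("value", value)
    require_real("lower", lower)
    require_real("upper", upper)
    if lower > upper:
        raise ValueError("lower must be less than or equal to upper")
    return max(lower, min(value, upper))


print(clamp_checked(12, 0, 10))  # 10
try:
    clamp_checked("5", 0, 10)
except TypeError as exc:
    print(f"rejected: {exc}")
```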

4

State Integrity

Why this matters

Defensive programming is not only about function inputs. Objects also need valid state, and the easiest time to protect that state is at construction.

🎯 You will learn to

Enforce a class invariant in __post_init__.
Validate construction-time state before methods rely on it.
Reject method inputs that do not belong to the public contract.

Your task

Open intervals.py. The class represents a closed interval, meaning both endpoints are included. Its invariant is:

start <= end

Implement two pieces:

__post_init__ rejects invalid construction with ValueError.
contains(value) returns whether an integer value is inside the interval, and raises TypeError for non-integers or bool.

The class is frozen=True, so valid instances cannot be edited after construction. That makes the construction check especially important.

Starter files

intervals.py

from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class ClosedInterval:
    start: int
    end: int

    def __post_init__(self) -> None:
        # Enforce the invariant here.
        pass

    def contains(self, value: int) -> bool:
        # Return True when value is between start and end inclusive.
        return False

test_intervals.py

import pytest
from intervals import ClosedInterval


def test_valid_interval_contains_endpoints_and_middle():
    interval = ClosedInterval(3, 7)
    assert interval.contains(3) is True
    assert interval.contains(5) is True
    assert interval.contains(7) is True


def test_valid_interval_rejects_outside_values():
    interval = ClosedInterval(3, 7)
    assert interval.contains(2) is False
    assert interval.contains(8) is False


def test_reversed_interval_cannot_be_constructed():
    with pytest.raises(ValueError) as excinfo:
        ClosedInterval(7, 3)
    assert excinfo.type is ValueError


def test_contains_rejects_non_integer_values():
    interval = ClosedInterval(3, 7)
    with pytest.raises(TypeError) as excinfo:
        interval.contains("5")
    assert excinfo.type is TypeError

Solution

intervals.py

from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class ClosedInterval:
    start: int
    end: int

    def __post_init__(self) -> None:
        if not isinstance(self.start, int) or isinstance(self.start, bool):
            raise TypeError("start must be an integer")
        if not isinstance(self.end, int) or isinstance(self.end, bool):
            raise TypeError("end must be an integer")
        if self.start > self.end:
            raise ValueError("start must be less than or equal to end")

    def contains(self, value: int) -> bool:
        if not isinstance(value, int) or isinstance(value, bool):
            raise TypeError("value must be an integer")
        return self.start <= value <= self.end

The object rejects invalid state once, at construction, and each method can rely on the invariant after that. Method-specific validation is still needed for new caller input.

Step 4 — Knowledge Check

Min. score: 80%

1. Why does ClosedInterval validate start <= end in __post_init__ rather than inside contains?

Because validating once at construction means every method can trust the invariant afterward; re-checking in every method would be redundant and would scale badly with more methods.
Because Python doesn’t allow validation inside dataclass methods.
Dataclasses can validate anywhere; nothing in Python forces the check into __post_init__. The reason is structural: the invariant only needs to be established once.
Because __post_init__ runs slightly faster than method calls.
Performance isn’t relevant at this scale. The reason is structural: avoiding redundant checks across many methods.
Because dataclasses can’t have method bodies that raise.
Dataclasses can absolutely raise inside methods. The reason for the constructor check is about where the invariant is established.

Construction is the one point where invalid state can enter the object. After __post_init__ returns, the invariant is established, and because the class is frozen=True, ordinary attribute assignment can’t break it later. Every method can rely on start <= end without re-checking.

2. The contains(value) method still validates that value is a non-bool integer, even though the interval itself is already a valid object. Why?

Because value is a new caller input — it’s not part of the interval’s invariant. Method arguments need their own validation; the object’s invariant only protects the object’s own state.
Because Python’s type hints aren’t checked at runtime, so validation must run twice.
Type hints not being runtime-checked is true, but unrelated. The distinction is between object invariants (state of self) and method preconditions (incoming arguments).
Because contains could be called before __post_init__ finishes.
__post_init__ runs as part of __init__, before the constructor returns. contains can’t be called before that finishes.
Because frozen=True doesn’t protect against attribute changes via reflection.
frozen=True is bypassable, yes — but that’s not why contains validates value. value is a fresh argument from the caller and has no relationship to the existing invariant.

Invariants protect object state. Preconditions protect method arguments. They’re two distinct concerns: the invariant is about what self looks like; the precondition is about what the caller just handed you.
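How far the invariant can be trusted after construction depends on what frozen=True actually guarantees. The sketch below uses a trimmed copy of ClosedInterval (no slots, no type checks) to show that ordinary assignment is blocked while a deliberate object.__setattr__ call is not.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class ClosedInterval:
    start: int
    end: int

    def __post_init__(self) -> None:
        if self.start > self.end:
            raise ValueError("start must be less than or equal to end")


interval = ClosedInterval(3, 7)

# Ordinary assignment is blocked, so the invariant survives normal use.
try:
    interval.start = 10
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True
print(assignment_blocked)  # True

# Deliberate bypass: nothing re-runs __post_init__, so the invariant is now broken.
object.__setattr__(interval, "start", 10)
print(interval.start <= interval.end)  # False
```

The bypass requires intent, which is why the class can treat it as outside the contract rather than re-validating in every method.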

3. ClosedInterval(True, 5) raises TypeError even though True is technically an int. This is the same trap as Step 2’s repeat("ha", True). What is being defended against in both cases?

The bool-as-int subclass trap: passing a boolean where the contract documents an integer is almost certainly a caller mistake — likely a confused conditional or a return value from == used in the wrong place.
A CPython implementation bug in isinstance.
There’s no such bug. isinstance(True, int) returns True by design — that’s exactly the trap.
An ambiguity in True * 5 vs 5 * True.
Both operations are well-defined and return 5. The harm is semantic, not arithmetic.
The risk that booleans take up more memory than integers.
bool and int use the same memory representation. Memory isn’t the concern.

Both repeat and ClosedInterval defend the same way: explicitly check isinstance(x, bool) alongside isinstance(x, int). The pattern is worth memorizing: it is a common idiom for Python functions that want ‘an integer (and not a bool)’.

5

Visible Error Causes

Why this matters

except Exception: return {} looks friendly until production starts using the empty default after a missing file or broken JSON document. Defensive code should translate low-level failures into domain language while preserving the original cause for debugging.

🎯 You will learn to

Replace broad exception swallowing with targeted exception handling.
Raise a domain-specific exception with raise ... from exc.
Preserve useful failure information without leaking low-level details into callers.

Your task

Open config_loader.py. Replace the broad fallback with a clear contract:

load_json_config(path) returns a JSON object as a dict.
Missing or unreadable files raise ConfigError(...) from exc.
Invalid JSON raises ConfigError(...) from exc.
Valid JSON that is not an object raises ConfigError without pretending it is a usable config.

Do not catch Exception or BaseException. Catch the failures you can translate.

🔍 How the hidden check works

A pytest assertion that only looks at behavior could miss a sneaky except Exception: that happens to translate to the right exception in the test cases. To guard against that, the hidden check parses your code’s Abstract Syntax Tree (AST) and explicitly rejects any except handler whose type is Exception or BaseException. Static analysis tools such as Bandit and Ruff (rule BLE001) inspect code the same way, and the approach is worth knowing as a defensive-programming technique in its own right: when you can’t trust behavior alone, parse the structure.
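The hidden check itself is not shown, but a structural check of this kind can be sketched with the standard ast module. The function name find_broad_handlers is hypothetical, and treating a bare except: as broad is an assumption of this sketch. It only recognizes the plain names Exception and BaseException, so an alias or builtins.Exception would slip past it.

```python
import ast

BROAD_NAMES = {"Exception", "BaseException"}


def find_broad_handlers(source: str) -> list[int]:
    """Return line numbers of except handlers that catch Exception or BaseException."""
    tree = ast.parse(source)
    line_numbers = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.ExceptHandler):
            continue
        if node.type is None:
            # Assumption: a bare "except:" counts as broad too.
            line_numbers.append(node.lineno)
            continue
        # "except (A, B):" stores its classes in a Tuple node.
        candidates = node.type.elts if isinstance(node.type, ast.Tuple) else [node.type]
        if any(isinstance(c, ast.Name) and c.id in BROAD_NAMES for c in candidates):
            line_numbers.append(node.lineno)
    return sorted(line_numbers)


SAMPLE = '''
try:
    risky()
except Exception:
    pass
try:
    risky()
except (OSError, ValueError):
    pass
'''

print(find_broad_handlers(SAMPLE))  # [4]
```

The sample is parsed, never executed, so the undefined risky() is harmless.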

Starter files

config_loader.py

import json
from pathlib import Path
from typing import Any


class ConfigError(RuntimeError):
    """Raised when an application config file cannot be loaded."""


def load_json_config(path: Path) -> dict[str, Any]:
    """Load a JSON config object from path."""
    try:
        with Path(path).open(encoding="utf-8") as config_file:
            return json.load(config_file)
    except Exception:
        return {}

test_config_loader.py

import json
import pytest
from pathlib import Path
from config_loader import ConfigError, load_json_config


def test_valid_config_file_returns_dict():
    path = Path("/tutorial/app-config.json")
    path.write_text(json.dumps({"debug": True, "port": 8080}), encoding="utf-8")
    assert load_json_config(path) == {"debug": True, "port": 8080}


def test_missing_file_raises_config_error_with_cause():
    path = Path("/tutorial/does-not-exist.json")
    with pytest.raises(ConfigError) as excinfo:
        load_json_config(path)
    assert isinstance(excinfo.value.__cause__, OSError)


def test_invalid_json_raises_config_error_with_cause():
    path = Path("/tutorial/broken-config.json")
    path.write_text("{not valid json", encoding="utf-8")
    with pytest.raises(ConfigError) as excinfo:
        load_json_config(path)
    assert isinstance(excinfo.value.__cause__, json.JSONDecodeError)

Solution

config_loader.py

import json
from pathlib import Path
from typing import Any


class ConfigError(RuntimeError):
    """Raised when an application config file cannot be loaded."""


def load_json_config(path: Path) -> dict[str, Any]:
    """Load a JSON config object from path."""
    config_path = Path(path)
    try:
        text = config_path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        # UnicodeDecodeError covers files that exist but are not valid UTF-8.
        raise ConfigError(f"could not read config file: {config_path}") from exc

    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ConfigError(f"invalid JSON config file: {config_path}") from exc

    if not isinstance(data, dict):
        raise ConfigError("config file must contain a JSON object")
    return data

The loader now catches only the failures it can translate. raise ... from exc keeps the low-level exception in __cause__, so debugging tools still show the real source of the failure.

Step 5 — Knowledge Check

Min. score: 80%

1. The original buggy load_json_config did except Exception: return {}. What is the worst long-term harm of this pattern in production?

Every caller now thinks ‘empty config’ and ‘unreadable file’ look identical, so failure modes that should be fatal (missing config) get silently treated as benign (no overrides). The bug surfaces as wrong behavior, far from the cause.
It catches KeyboardInterrupt, which can crash the interpreter.
except Exception does not catch KeyboardInterrupt, which derives from BaseException rather than Exception. The real harm is information loss, not interpreter safety.
It returns the wrong type — dict instead of list.
The return type is correct (dict). The harm is semantic — the dict is empty because of a failure, indistinguishable from an empty dict that legitimately had no entries.
It causes a memory leak because the exception traceback is never freed.
Python’s garbage collector handles exception tracebacks fine. The harm is informational: the cause is gone.

Broad-except + sentinel return is the canonical ‘silent corruption’ anti-pattern. Failures that should stop the program become failures that change its behavior in subtle ways — far harder to debug than a clean crash.

2. What does raise ConfigError(...) from exc give callers that raise ConfigError(...) alone does not?

It preserves the original OSError or JSONDecodeError in ConfigError.__cause__, so the full chain shows up in tracebacks and tools like pytest and Sentry can drill into the real root cause.
It makes the exception faster to raise.
Both forms have the same raise cost. The difference is informational — what shows up in the traceback.
It changes the exception class to a subclass of the original.
The class is still ConfigError. from exc only sets the __cause__ attribute; it doesn’t change the class hierarchy.
It allows the exception to be caught by except OSError: blocks.
ConfigError is its own class. from exc doesn’t make it polymorphic with OSError. The chain is purely diagnostic.

raise X from Y sets X.__cause__ = Y, which Python prints as ‘The above exception was the direct cause of the following exception…’ in tracebacks. The caller catches ConfigError (the right abstraction), but a debugger can still see the underlying file I/O or JSON parse failure that actually caused it.
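The effect on __cause__ can be observed directly. The function parse_timeout below is a hypothetical stand-in for any code that translates a low-level failure into ConfigError.

```python
class ConfigError(RuntimeError):
    """Raised when a config value cannot be loaded."""


def parse_timeout(text: str) -> int:
    """Translate a low-level ValueError into a domain-level ConfigError."""
    try:
        return int(text)
    except ValueError as exc:
        raise ConfigError(f"invalid timeout setting: {text!r}") from exc


try:
    parse_timeout("soon")
except ConfigError as error:
    caught = error  # keep a reference; "error" is cleared when the block ends

print(type(caught).__name__)            # ConfigError
print(type(caught.__cause__).__name__)  # ValueError
print(isinstance(caught, ValueError))   # False: the class hierarchy is unchanged
```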

3. The solution checks if not isinstance(data, dict): raise ConfigError(...). Why is this validation step necessary even after json.loads succeeded?

Because valid JSON can be a list, number, string, boolean, or null at the top level, not just an object. The function’s contract specifically promises a dict config, so json.loads returning a list is a contract violation, not a parse error.
Because json.loads can return None if the file is empty.
An empty JSON file would be a JSONDecodeError, not None. The issue isn’t empty files — it’s valid JSON of the wrong shape.
Because isinstance(data, dict) defends against the bool-subclass-of-int trap.
The bool trap is about integer arguments, not JSON parsing. This check is about top-level JSON shape.
Because Python’s json module is unreliable on Windows.
The json module is cross-platform stable. The issue is the JSON spec itself: any JSON value (array, number, string, boolean, null, object) is a valid top-level value.

JSON has six kinds of top-level value; only one is a dict (a JSON ‘object’). [1, 2, 3] is valid JSON. So is "hello". json.loads succeeds for all of them, then hands you something that’s not the dict your function promised. That gap between ‘parses cleanly’ and ‘matches the contract’ is exactly where shape validation lives.
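A quick loop over one document of each kind shows what the parser hands back; only the first result is a dict.

```python
import json

DOCUMENTS = ['{"debug": true}', "[1, 2, 3]", '"hello"', "42", "true", "null"]

# Every document parses cleanly, but only the first yields a dict.
for document in DOCUMENTS:
    value = json.loads(document)
    print(f"{document} -> {type(value).__name__}")

# Empty input is not valid JSON at all, so it fails at the parse step instead.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    print(f"empty input -> {type(exc).__name__}")
```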

6

Port Range Parser

Why this matters

Real defensive programming combines the pieces: classify input, reject invalid forms, choose precise exceptions, and guarantee that valid results obey a postcondition. A small parser is enough to practice the full loop without adding domain clutter.

🎯 You will learn to

Create a parser that rejects wrong types, malformed strings, out-of-range values, and reversed ranges.
Apply postcondition reasoning to returned tuples.
Test boundary values around a finite numeric domain.

Your task

Implement:

def parse_port_range(spec: str) -> tuple[int, int]:
    ...

Accepted forms:

"8080" returns (8080, 8080).
"8000-8080" returns (8000, 8080).
Surrounding whitespace around the whole spec is okay.

Rejected forms:

wrong type: raise TypeError
malformed string: raise ValueError
port outside 0..65535: raise ValueError
reversed range such as "9000-8000": raise ValueError

Postcondition for every successful call: the result is a pair of integers where 0 <= start <= end <= 65535.

🎓 After this capstone — should you always program defensively?

Across this tutorial you’ve built a complete defensive vocabulary: classify inputs into partitions, raise the right exception class, reject silent repair, protect object invariants, translate boundary failures into domain errors, and combine all of it into a parser whose contract is verifiable from the outside.

That’s the right toolkit at a system boundary — anywhere your code receives data from a user, a network call, a file, a database, or another team’s API. The further you go from that boundary, between trusted modules you wrote yourself, the more the same checks start to feel like noise that compounds across a large codebase.

Bertrand Meyer, who coined the term Design by Contract in the 1980s, argued against applying defensive programming uniformly. Design by Contract asks a sharper question: instead of “how do I reject bad input?”, it asks “whose job was this?” That single shift changes what code looks like in the interior of a system.

Next: the Design by Contract in Python tutorial picks up where this one ends. It uses the same Python and the same techniques you just practiced, but reframes them around responsibility allocation — and gives you a criterion for deciding where defensive programming earns its keep and where it costs more than it gives.

Starter files

ports.py

def parse_port_range(spec: str) -> tuple[int, int]:
    """Parse a single port or inclusive port range."""
    raise NotImplementedError("parse the port range")

test_ports.py

import pytest
from ports import parse_port_range


def assert_valid_port_range(result):
    assert isinstance(result, tuple)
    assert len(result) == 2
    start, end = result
    assert isinstance(start, int)
    assert isinstance(end, int)
    assert 0 <= start <= end <= 65535


def test_single_port_becomes_one_point_range():
    result = parse_port_range("8080")
    assert result == (8080, 8080)
    assert_valid_port_range(result)


def test_range_preserves_start_and_end():
    result = parse_port_range("8000-8080")
    assert result == (8000, 8080)
    assert_valid_port_range(result)


def test_boundary_ports_are_valid():
    assert parse_port_range("0") == (0, 0)
    assert parse_port_range("65535") == (65535, 65535)
    assert parse_port_range("0-65535") == (0, 65535)


def test_wrong_type_is_type_error():
    with pytest.raises(TypeError) as excinfo:
        parse_port_range(8080)
    assert excinfo.type is TypeError


def test_malformed_specs_are_value_error():
    for spec in ["", "abc", "80/http", "80-", "-90", "80-90-100"]:
        with pytest.raises(ValueError):
            parse_port_range(spec)


def test_out_of_range_ports_are_value_error():
    for spec in ["-1", "65536", "80-70000"]:
        with pytest.raises(ValueError):
            parse_port_range(spec)


def test_reversed_range_is_value_error():
    with pytest.raises(ValueError):
        parse_port_range("9000-8000")

Solution

ports.py

def parse_port_range(spec: str) -> tuple[int, int]:
    """Parse a single port or inclusive port range."""
    if not isinstance(spec, str):
        raise TypeError("spec must be a string")

    cleaned = spec.strip()
    if not cleaned:
        raise ValueError("port range cannot be empty")

    if "-" in cleaned:
        parts = cleaned.split("-")
        if len(parts) != 2 or not parts[0] or not parts[1]:
            raise ValueError("range must be start-end")
        if any(part.strip() != part for part in parts):
            raise ValueError("range cannot contain embedded whitespace")
        start_text, end_text = parts
    else:
        start_text = end_text = cleaned

    if not start_text.isdecimal() or not end_text.isdecimal():
        raise ValueError("ports must be decimal integers")

    start = int(start_text)
    end = int(end_text)
    if not (0 <= start <= 65535 and 0 <= end <= 65535):
        raise ValueError("ports must be in the range 0..65535")
    if start > end:
        raise ValueError("start port must be less than or equal to end port")
    return (start, end)

The parser checks type first, then string shape, then numeric range, then the relationship between endpoints. The return happens only after the postcondition is true.

Step 6 — Knowledge Check

Min. score: 80%

1. A student implementation turns "9000-8000" into (8000, 9000) so the returned tuple is ordered. What is the main problem?

It hides a caller bug by silently repairing a reversed range.
It should return a list instead of a tuple.
The contract promises a tuple, and the tuple shape is not the failure. The issue is accepting an invalid relationship between the two endpoints.
It should raise TypeError because the values are strings.
The function receives a string specification by design. Wrong type would be something like None or an integer argument.
It fails only because 9000 is outside the valid port range.
Both 8000 and 9000 are valid port numbers. The invalid part is their order in a range.

Reversed ranges are caller mistakes. Silently swapping them repeats the quiet-repair problem from clamp.

2. Which assertion best captures the postcondition of a successful parse_port_range call?

result is not None and both tuple slots can be indexed
A non-None result could still be malformed, reversed, or out of range. That oracle is too weak.
result is a 2-item tuple and 0 <= start <= end <= 65535
str(result).startswith('(') and the string contains a comma
String formatting is not the contract. A list or invalid tuple could still produce a parenthesized-looking string.
result[0] < result[1] so endpoints are strictly ordered
A single port such as "8080" should return (8080, 8080), where equality is valid.

The postcondition includes tuple shape, integer endpoints, range limits, and ordering with equality allowed.

3. Which failure should be a TypeError rather than a ValueError?

parse_port_range(None)
parse_port_range("abc")
"abc" has the right type but cannot be parsed as the required string form, so it is a value problem.
parse_port_range("65536")
"65536" has the right type and shape but is outside the valid port range, so it is a value problem.
parse_port_range("9000-8000")
"9000-8000" has the right type and numeric parts, but the endpoint relationship is invalid, so it is a value problem.

TypeError is for the wrong kind of argument. The malformed, out-of-range, and reversed examples are all strings, so they are ValueError cases.

4. Retrieval from Step 3: a teammate proposes that parse_port_range("9000-8000") should silently swap to (8000, 9000) because "the user obviously meant the lower port first." What principle from clamp (Step 3) applies here?

Silent repair hides caller bugs. The next time someone writes parse_port_range(f"{max_port}-{min_port}") because they confused the variable order, the function appears to work — and the actual bug surfaces somewhere far away, much later.
Silent repair is fine because the result is mathematically equivalent.
Mathematical equivalence isn’t the question. The question is whether the caller knows they passed the arguments wrong. Silent repair means they don’t.
Silent repair is fine here because port ranges are commutative.
Port ranges are not commutative: "9000-8000" violates the contract as a reversed range, regardless of whether the resulting set of ports would be the same.
Silent repair is acceptable as long as the function logs a warning.
Logging is a partial mitigation, but at the internal contract boundary, raising is still the right move. (Logging-and-continuing is sometimes acceptable at the system boundary — see Step 3’s Postel’s Law question.)

This is the same pattern as clamp’s reversed-bounds swap. The harm is lost visibility of the bug: silent repair shifts caller errors from ‘detected immediately’ to ‘detected weeks later in production.’
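A minimal sketch of the difference, using two hypothetical helpers. The names strict_range and repairing_range are illustrative only and are not part of the tutorial's code:

```python
def strict_range(start: int, end: int) -> tuple[int, int]:
    """Reject a reversed range so the caller learns about the mistake."""
    if start > end:
        raise ValueError(f"start {start} must not exceed end {end}")
    return (start, end)


def repairing_range(start: int, end: int) -> tuple[int, int]:
    """Anti-pattern: quietly reorder the endpoints."""
    return (min(start, end), max(start, end))


min_port, max_port = 8000, 9000

# Caller bug: the arguments are passed in the wrong order.
repaired = repairing_range(max_port, min_port)  # looks fine, bug stays hidden

try:
    strict_range(max_port, min_port)
except ValueError as exc:
    message = str(exc)  # the bug is reported at the call site
```

Both helpers would yield the same tuple for correct input. They differ only in whether the swapped-argument bug is reported at the call site or left hidden.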

5. Retrieval from Step 5: imagine parse_port_range was extended to read its input from a config file — port_range = json.load(...)["ports"]. Which defensive pattern from load_json_config (Step 5) should re-enter the picture?

Catch only the specific failures you can translate (OSError, json.JSONDecodeError), use raise ConfigError(...) from exc to preserve the cause, then validate the loaded shape before handing it to the parser.
Wrap the whole flow in except Exception: return None so the program never crashes.
Broad except + sentinel return is the exact anti-pattern Step 5 trained you to avoid. It would convert ‘missing config file’ and ‘malformed JSON’ and ‘wrong shape’ into one indistinguishable None.
Skip JSON validation — parse_port_range will validate anyway.
Defense in depth: validating shape at the JSON boundary catches problems earlier and with better diagnostics than letting them surface inside parse_port_range. A JSON list arriving where a string was expected gives a much more confusing error from the parser than from an explicit type check at the boundary.
Disable the AST check from Step 5, since this is a different file.
The AST check is a tool for the tutorial autograder, not a runtime constraint. The defensive pattern (no broad except) applies to every place you read external data, not just load_json_config.

Defensive programming has layers: boundary validation (Step 5’s shape check), targeted exception translation (Step 5’s from exc), and interior contract enforcement (Step 6’s port parser). Each layer covers what the others can’t. Skipping any one of them lets a class of bug through.
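One way the first two layers might fit together is sketched below. The helper load_port_spec and the "ports" key layout are assumptions made for illustration, and ConfigError is redefined here only so the example is self-contained:

```python
import json
import tempfile
from pathlib import Path


class ConfigError(Exception):
    """Domain error for configuration problems (stands in for Step 5's class)."""


def load_port_spec(path: Path) -> str:
    """Hypothetical helper: read the "ports" entry from a JSON config file."""
    # Layer 1: translate only the failures we can name, keeping the cause.
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except OSError as exc:
        raise ConfigError(f"cannot read config file {path}") from exc
    except json.JSONDecodeError as exc:
        raise ConfigError(f"config file {path} is not valid JSON") from exc

    # Layer 2: validate the loaded shape before the parser ever sees it.
    if not isinstance(data, dict):
        raise ConfigError("config must be a JSON object")
    if "ports" not in data:
        raise ConfigError('config is missing the "ports" key')
    spec = data["ports"]
    if not isinstance(spec, str):
        raise ConfigError('"ports" must be a string such as "8000-8080"')
    return spec


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.json"
    good.write_text('{"ports": "8000-8080"}', encoding="utf-8")
    bad = Path(tmp) / "bad.json"
    bad.write_text('["8000-8080"]', encoding="utf-8")

    spec = load_port_spec(good)
    try:
        load_port_spec(bad)
    except ConfigError as exc:
        shape_error = str(exc)
    try:
        load_port_spec(Path(tmp) / "missing.json")
    except ConfigError as exc:
        missing_cause = exc.__cause__
```

The string returned by load_port_spec would then go to parse_port_range, which supplies the third layer: its own type, shape, range, and ordering checks.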