1

Private Is Not a Secret

Why this matters

Most CS students learn private before they learn what it’s for. So when someone asks “did you hide the information?”, they answer “yes — the field is private.” That answer is wrong often enough that a billion-dollar industry of code reviews exists. The next five minutes are an inoculation: you will hold a class whose fields are “private” and use its own public API to plant a fake entry — proving the secret leaked anyway.

🎯 You will learn to

  • Distinguish information hiding (a design decision about who is allowed to know what) from private (a syntax feature that helps enforce it, when used carefully)
  • Analyze a public method signature for representation leaks — even when every field uses the _underscore convention

✏️ Predict before you run

Open watch_history.py. Every field starts with _ — Python’s convention for “private.” Will an outside caller still be able to make history.recent() return [..., {"title": "Pirated Movie", "year": 1999}] without calling history.add(...)?

  • (a) No — the underscore convention prevents outside code from touching _history.
  • (b) No — the only public modifier is add(), so _history can only change via add().
  • (c) Yes — because recent() returns the actual internal list, the caller can mutate it from outside.
  • (d) Compile error — Python won’t let you index into a _-prefixed attribute.

Commit to a letter, then try the task.

Reveal (after you've tried it)

Answer **(c)**. The "private" field has not been hidden — only renamed. `recent()` hands the caller a *reference* to `self._history`, which is a `list`, and `list.append(...)` mutates in place. Visibility modifiers stop nothing on their own. The **representation decision** ("storage is a list of dicts you may mutate by reference") has leaked through the public API.

Three other ways this same class still leaks even if `recent()` returned `tuple(self._history)`:

  1. Callers see the element type (`dict[str, ...]`) and start writing code that depends on the keys.
  2. Switching `_history` to a `dict[str, Episode]` keyed by title would change what `recent()` returns and break every caller.
  3. `for entry in history.recent():` quietly depends on iteration order.

Information hiding isn't a marker on a field — it's a decision about *which design decisions clients may depend on*. The next five steps train that judgment.
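A minimal sketch of the aliasing described in the reveal, and of why a tuple snapshot alone does not close every hole. The class names `AliasedHistory` and `SnapshotHistory` are hypothetical stand-ins for illustration; they are not part of watch_history.py.

```python
class _BaseHistory:
    """Hypothetical base: stores entries as a list of dicts."""

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})


class AliasedHistory(_BaseHistory):
    def recent(self) -> list[dict]:
        return self._history  # the very same list object the class stores


class SnapshotHistory(_BaseHistory):
    def recent(self) -> tuple[dict, ...]:
        return tuple(self._history)  # new container, but the same dict objects


aliased = AliasedHistory()
aliased.add("Severance", 2022)
aliased.recent().append({"title": "Pirated Movie", "year": 1999})  # no add() call

snap = SnapshotHistory()
snap.add("Severance", 2022)
view = snap.recent()          # a tuple: it has no append()
view[0]["title"] = "Tampered" # but each element is still a shared, mutable dict
```

After this runs, `aliased` holds two entries and `snap` still holds one, but that one entry's title has been rewritten from outside. Swapping the container type narrows the leak; it does not remove the caller's dependence on the dict-shaped elements.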

Your task — make the leak happen

Open watch_history.py. Below the # TODO marker, write one line that — using only the public API history.recent() — plants a {"title": "Pirated Movie", "year": 1999} entry. Do not call history.add(...). When the test runs, history.recent() must return a list containing that planted entry.

The point isn’t that you should write code like this. The point is that the design lets you, even though the author thought they had hidden the field.

Starter files
watch_history.py
class WatchHistory:
    """Stores what you have watched.

    The author thought ``_history`` was 'hidden' because of the
    underscore convention.
    """

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> list[dict]:
        # The author meant this as a read-only view.
        # It is not. Find out why in the task below.
        return self._history


if __name__ == "__main__":
    history = WatchHistory()
    history.add("Stranger Things", 2016)
    history.add("Severance", 2022)

    # TODO: Without calling ``history.add(...)``, plant the fake entry
    # ``{"title": "Pirated Movie", "year": 1999}`` so it shows up in
    # ``history.recent()``. One line is enough. Use only the public
    # API — meaning the method ``history.recent()``.
    # planted = ...

    print(history.recent())
2

The Playlist's Secret

Why this matters

Step 1 showed you a leak the size of one line. Real codebases hide leaks the size of an org chart — and the same leak forces three teams to coordinate every time the data shape changes. The fastest way to feel the difference is to do a small refactor on a Playlist class and then run the same client tests against a different internal storage. If the client breaks, the secret was never hidden.

Connection to the chapter’s KWIC example. Parnas (1972) made the same point at the system level using a Key Word In Context indexer: decomposing by processing steps (input → shift → sort → print) spread the line-storage decision across every module, so a change to line storage broke every module. Decomposing by hidden decisions (line storage, shift generation, sort ordering) kept each change local. This step plays out Parnas’s argument at the class level on a Playlist.

🎯 You will learn to

  • Apply the five-step routine for hiding a secret: name the change, name the secret, list the minimum client assumptions, remove the leak, verify with a swap test
  • Analyze which lines of client code depend on the internal representation
  • Create a new domain operation on the hidden module without changing any client

The scene

You are building a music app. Playlist exposes a raw tracks: list[dict]. Three client features reach in: a textual summary, a top-picks list, and a “party-ready?” check. Run app.py and see the current behavior. Then the product manager files a ticket: “Add remove(title) — needs to be O(1).” That requires switching internal storage from list to dict keyed by title. Run the same client code afterward and watch it break.
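To make the ticket's O(1) requirement concrete, here is a small sketch comparing removal by title in the two storage shapes. The track records are hypothetical and trimmed to a single key; nothing here is taken from playlist.py.

```python
# Hypothetical track records; only the title matters for this sketch.
as_list = [{"title": "Bad Guy"}, {"title": "Levitating"}, {"title": "Heat Waves"}]
as_dict = {track["title"]: track for track in as_list}

# List-backed removal: every element is inspected, so the cost grows
# with the length of the playlist (O(n)).
as_list = [track for track in as_list if track["title"] != "Levitating"]

# Dict-backed removal: one hash lookup on average, independent of size.
as_dict.pop("Levitating", None)
```

One design consequence worth noticing: a dict keyed by title can hold only one track per title, so the storage change also quietly changes what "two tracks with the same title" means.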

✏️ Predict before you run

Look at playlist.py and app.py. If we change Playlist.tracks from a list[dict] to a dict[str, Track] (keyed by title, for O(1) remove), how many files will need to change to keep the demo running?

  • (a) Only playlist.py — the field is private-ish; clients should not care.
  • (b) playlist.py and app.py — every loop and indexing operation in app.py will break.
  • (c) Only app.py; Playlist itself is fine.
  • (d) Zero files — Python’s duck typing handles it.

Commit to a letter. The first test enforces this.

The five-step routine

You will use this exact routine in every refactor from here on. Memorize the headings, not the lines:

1. Name the change.        What is about to change, and why is it likely?
2. Name the secret.        Which design decision should one module own?
3. Minimum client assumptions.   What does the client *actually* need to know?
4. Remove the leak.        Replace exposed representation with domain operations.
5. Verify with a swap.     Same client, different implementation, same output.

Filled in for Playlist:

1. Change:    Internal storage may move from list to dict for O(1) remove.
2. Secret:    How tracks are stored and ordered for retrieval.
3. Client needs:  top N by popularity, total minutes, average popularity, count.
4. Remove:    Stop exposing ``tracks``; expose ``top_tracks(n)``,
              ``total_duration_minutes()``, ``average_popularity()``, ``__len__``.
5. Verify:    Hidden test runs the same client against a dict-backed Playlist.
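Step 5 of the routine can be sketched on a different, deliberately tiny domain so it does not give away the Playlist refactor. `ListCart`, `DictCart`, and `receipt` are hypothetical names invented for this illustration; the hidden test's real structure is not shown in this lesson.

```python
class ListCart:
    """Hypothetical list-backed implementation."""

    def __init__(self) -> None:
        self._items: list[tuple[str, int]] = []

    def add(self, name: str, price_cents: int) -> None:
        self._items.append((name, price_cents))

    def total_cents(self) -> int:
        return sum(price for _, price in self._items)

    def __len__(self) -> int:
        return len(self._items)


class DictCart:
    """Hypothetical dict-backed implementation with the same operations."""

    def __init__(self) -> None:
        self._by_name: dict[str, int] = {}

    def add(self, name: str, price_cents: int) -> None:
        self._by_name[name] = price_cents

    def total_cents(self) -> int:
        return sum(self._by_name.values())

    def __len__(self) -> int:
        return len(self._by_name)


def receipt(cart) -> str:
    # The client uses only domain operations, never the storage itself.
    return f"{len(cart)} items, {cart.total_cents() / 100:.2f} total"


def run_client(cart_class) -> str:
    cart = cart_class()
    cart.add("tea", 450)
    cart.add("mug", 1200)
    return receipt(cart)


# The swap: same client, different implementation, same output.
swap_passes = run_client(ListCart) == run_client(DictCart)
```

If `receipt` had reached into `cart._items`, the second run would have raised an `AttributeError`, which is exactly the signal a swap test is designed to produce.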

Your task

Refactor playlist.py so the public API is domain operations, not a raw list:

add(title, artist, duration_sec, popularity)
top_tracks(n) -> list[Track]          # n highest by popularity
total_duration_minutes() -> float
average_popularity() -> float         # 0 if empty
__len__() -> int

Define a Track dataclass with the four attributes (title, artist, duration_sec, popularity) so top_tracks returns domain objects, not dicts.
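A short sketch of what returning domain objects instead of dicts can buy. `Episode` is a hypothetical dataclass used only for illustration, and `frozen=True` is an assumption here: the task text above does not require it.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Episode:  # hypothetical domain object, not part of the exercise
    title: str
    minutes: int


as_dict = {"title": "Pilot", "minutes": 48}
as_obj = Episode(title="Pilot", minutes=48)

# A misspelled dict key is accepted silently and just adds a new entry.
as_dict["mintues"] = 50

# A frozen dataclass rejects assignment after construction.
try:
    as_obj.minutes = 50
    mutated = True
except FrozenInstanceError:
    mutated = False
```

The dict now carries both `"minutes"` and the misspelled `"mintues"`, while the dataclass instance is unchanged. Named attributes also give a type checker something to verify, which string keys do not.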

Refactor app.py so describe_playlist and is_party_appropriate use only the four methods above. They must not touch playlist.tracks, _tracks, or any dict keys.

One method is up to you: in app.py, write is_party_appropriate(playlist) — returns True if total duration > 90 minutes AND average popularity > 60.

🪞 Before clicking Next

Once all four tests pass, take 15 seconds and answer in your own words — out loud, to a rubber duck, or in your head:

In ≤25 words, what did this refactor actually buy you? Constraint: don’t use the words “private”, “list”, or “encapsulate” — those are mechanisms, and the answer is about a decision.

Forcing yourself to say it without those words is the point. If your sentence keeps reaching for them, you are still describing the implementation, not the design choice. The first quiz question rewards you for finding the decision-level wording.

The self-explanation effect (Chi et al., 1994) says that putting an idea into your own words strengthens learning more than re-reading does. The forbidden-word constraint comes from Variation Theory (Marton): forcing different language for the same concept is what makes the schema transferable.

Starter files
playlist.py
"""Music playlist.

STARTING STATE: leaks the internal list[dict] through ``tracks``.
Refactor so the only public surface is *domain operations*.
"""


class Playlist:
    def __init__(self) -> None:
        self.tracks: list[dict] = []

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self.tracks.append({
            "title": title,
            "artist": artist,
            "duration_sec": duration_sec,
            "popularity": popularity,
        })

# TODO (Step 4 of the routine — "Remove the leak"):
#   1. Define a ``Track`` dataclass with title, artist, duration_sec, popularity.
#   2. Replace ``self.tracks`` with a hidden ``self._tracks``.
#   3. Add the four domain methods listed in the instructions.
app.py
"""Client of Playlist. Currently reaches into the raw list.

Refactor ``describe_playlist`` and ``is_party_appropriate`` to use
ONLY the new domain methods on Playlist (no ``playlist.tracks``,
no dict indexing).
"""

from playlist import Playlist


def describe_playlist(playlist: Playlist) -> str:
    total_minutes = sum(t["duration_sec"] for t in playlist.tracks) / 60
    avg_popularity = (
        sum(t["popularity"] for t in playlist.tracks) / len(playlist.tracks)
        if playlist.tracks else 0
    )
    top_three = sorted(playlist.tracks, key=lambda t: t["popularity"], reverse=True)[:3]

    lines = [
        f"{len(playlist.tracks)} tracks, "
        f"{total_minutes:.1f} min, "
        f"avg popularity {avg_popularity:.0f}"
    ]
    lines.append("Top picks:")
    for t in top_three:
        lines.append(f"  - {t['title']} by {t['artist']}")
    return "\n".join(lines)


def is_party_appropriate(playlist: Playlist) -> bool:
    # TODO: rewrite using only Playlist's new domain methods.
    # Spec: True iff total duration > 90 min AND avg popularity > 60.
    raise NotImplementedError


if __name__ == "__main__":
    p = Playlist()
    p.add("Bad Guy", "Billie Eilish", 194, 95)
    p.add("Levitating", "Dua Lipa", 203, 88)
    p.add("Blinding Lights", "The Weeknd", 200, 92)
    p.add("Heat Waves", "Glass Animals", 238, 80)
    p.add("As It Was", "Harry Styles", 167, 90)
    print(describe_playlist(p))
    print("party-ready?", is_party_appropriate(p))
3

A Protocol on Familiar Code

Why this matters

Step 2’s swap test worked because Python’s duck typing checks methods at call time. That’s powerful but invisible — nothing in your code says “Playlist and DictBackedPlaylist are interchangeable.” This step introduces the one Python construct that makes that contract visible: typing.Protocol. No new design principle here — just a new way to declare what your Step 2 refactor already accomplished.

Why now, before the next refactor? Steps 4–7 all use Protocol. Pre-loading the syntax on familiar code (your Playlist) means each later step adds only one new design idea at a time, not two or three. That keeps cognitive load on the lesson, not the language.

🎯 You will learn to

  • Apply typing.Protocol to name a contract that multiple classes satisfy structurally — no explicit inheritance needed
  • Distinguish Python’s existing duck typing (runtime, invisible) from a typed Protocol (declared, type-checkable)
  • Recognize that the same construct hides an algorithm in Step 4, a storage backend in Step 5, and an exhaustive set of alternatives in Step 6

Five-minute primer

A Protocol is a class that declares method signatures as a contract. Any class with matching methods satisfies it automatically — no class Foo(Bar): required.

from typing import Protocol

class Counter(Protocol):
    def increment(self) -> None: ...
    def value(self) -> int: ...

class TallyCounter:           # No explicit base class!
    def __init__(self) -> None:
        self._n = 0
    def increment(self) -> None:
        self._n += 1
    def value(self) -> int:
        return self._n

def report(c: Counter) -> str:   # Accepts any Counter-shaped class
    return f"count is {c.value()}"

report(TallyCounter())   # OK — TallyCounter is structurally a Counter

The ... in each method body is Python’s Ellipsis literal, used here as a placeholder: it tells readers (and mypy) “this method is declared, not implemented here.” The Protocol class itself is never instantiated; concrete classes are.
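The primer relies on a static type checker to confirm the structural match. As an optional aside, the standard library's `typing.runtime_checkable` decorator lets you probe the same idea at runtime. This is a sketch that goes slightly beyond what the step asks for; `HalfCounter` is a hypothetical class added to show a failing match.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable  # opt-in: plain Protocols reject isinstance() checks
class Counter(Protocol):
    def increment(self) -> None: ...
    def value(self) -> int: ...


class TallyCounter:  # no explicit base class
    def __init__(self) -> None:
        self._n = 0

    def increment(self) -> None:
        self._n += 1

    def value(self) -> int:
        return self._n


class HalfCounter:  # hypothetical: missing value(), so not Counter-shaped
    def increment(self) -> None:
        pass


tally_matches = isinstance(TallyCounter(), Counter)
half_matches = isinstance(HalfCounter(), Counter)
```

Here `tally_matches` is `True` and `half_matches` is `False`. Note that the runtime check only looks for the presence of the named members; it does not compare parameter or return types, so a type checker remains the stronger tool.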

✏️ Predict before you run

protocol_demo.py has your Step 2 Playlist and a small DictBackedPlaylist. If we add class PlaylistLike(Protocol) with the five methods, will a type checker accept both classes as PlaylistLike?

  • (a) Only Playlist; DictBackedPlaylist doesn’t inherit from PlaylistLike.
  • (b) Both — structural matching cares about method shape, not inheritance.
  • (c) Neither — Playlist doesn’t declare : PlaylistLike either.
  • (d) Only DictBackedPlaylist — it was added later, so it knows about the Protocol.

Commit, then continue.

Reveal (after you've committed) Answer **(b)**. Protocols use *structural* subtyping (PEP 544). Any class with the matching methods satisfies the Protocol — no explicit base class, no order-of-definition concerns, no decorator needed. This is what makes `Protocol` the right Python tool for "swap-this-for-that" designs. The Step 2 swap test was the *runtime* proof; `PlaylistLike` is the *declared* contract.

Your task

Open protocol_demo.py. The Playlist class from Step 2 is there, plus a tiny DictBackedPlaylist (the swap class from your Step 2 test, made permanent so it has a name).

  1. Add from typing import Protocol and define class PlaylistLike(Protocol) at the top, with these five methods, each ending in ...:
    • add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None
    • top_tracks(self, n: int) -> list[Track]
    • total_duration_minutes(self) -> float
    • average_popularity(self) -> float
    • __len__(self) -> int
  2. Change summary(playlist: object) to summary(playlist: PlaylistLike). Do not touch the body — only the annotation.
  3. Do not add (PlaylistLike) to either Playlist or DictBackedPlaylist. The whole point is that they satisfy it without saying so.

The test will call summary(Playlist()) and summary(DictBackedPlaylist()). Both should produce identical-shape strings — proving the same client function accepts two completely different backings, via the declared Protocol rather than runtime luck.

Look back at Step 2’s swap test. It built DictBackedPlaylist inside the test and passed it to your refactored client. That worked because of invisible duck typing — Python found the methods at call time. PlaylistLike is the same fact, now declared. Nothing about Step 2’s runtime behavior changes; what changes is that a future reader can see the contract without running the code.

🪞 Before clicking Next

In ten seconds, finish this aloud:

Without using the word “duck”: what does PlaylistLike make visible about my code that wasn’t visible in Step 2?

The word “duck” is forbidden because “duck typing” is the Python jargon for what’s happening — but the design point is that the contract is now named. The new affordance is reader-visible substitutability. Variation Theory says forcing different language is what makes the concept transferable to the next refactor.

Starter files
protocol_demo.py
"""Step 3 — pre-load Protocol on familiar code.

Playlist and DictBackedPlaylist are both here. Define
``PlaylistLike(Protocol)`` so a single ``summary(p: PlaylistLike)``
function accepts both — by structural matching, no inheritance.
"""

from dataclasses import dataclass
# TODO: from typing import Protocol


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


# TODO 1: Define ``class PlaylistLike(Protocol)`` here with the five
# methods from the instructions. End each declaration with ``...``.


class Playlist:
    def __init__(self) -> None:
        self._tracks: list[Track] = []
    def add(self, title, artist, duration_sec, popularity):
        self._tracks.append(Track(title, artist, duration_sec, popularity))
    def top_tracks(self, n):
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]
    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._tracks) / 60
    def average_popularity(self):
        return (sum(t.popularity for t in self._tracks) / len(self._tracks)) if self._tracks else 0
    def __len__(self):
        return len(self._tracks)


class DictBackedPlaylist:
    """Same operations, dict-backed storage. Structural twin of Playlist."""
    def __init__(self) -> None:
        self._by_title: dict = {}
    def add(self, title, artist, duration_sec, popularity):
        self._by_title[title] = Track(title, artist, duration_sec, popularity)
    def top_tracks(self, n):
        return sorted(self._by_title.values(), key=lambda t: t.popularity, reverse=True)[:n]
    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._by_title.values()) / 60
    def average_popularity(self):
        vs = list(self._by_title.values())
        return (sum(t.popularity for t in vs) / len(vs)) if vs else 0
    def __len__(self):
        return len(self._by_title)


# TODO 2: change the parameter annotation here from ``object`` to ``PlaylistLike``.
def summary(playlist: object) -> str:
    lines = [f"{len(playlist)} tracks, {playlist.total_duration_minutes():.1f} min"]
    for t in playlist.top_tracks(3):
        lines.append(f"  - {t.title}")
    return "\n".join(lines)


if __name__ == "__main__":
    for cls in (Playlist, DictBackedPlaylist):
        p = cls()
        p.add("Bad Guy", "Billie Eilish", 194, 95)
        p.add("Levitating", "Dua Lipa", 203, 88)
        p.add("Blinding Lights", "The Weeknd", 200, 92)
        print(cls.__name__)
        print(summary(p))
        print()
4

An Interface That Tells You Too Much

Why this matters

Step 2 fixed a mutation leak. Step 3 gave you the declared contract, Protocol, for Step 2’s swap test. This step fixes a subtler problem: a contract that looks clean but over-specifies how it computes its answer. Parnas warned about this in 1972 with his KWIC example: an interface that says more than the client needs to know restricts future implementations. A music recommender that returns raw BM25 scores is the modern version. Switch the algorithm from BM25 to embeddings and every numeric threshold in the client breaks.

One new piece of syntax this step: typing.Literal. Protocol you already own from Step 3 — reuse it freely. The new content of this step is design judgment, not mechanics.

🎯 You will learn to

  • Analyze a read-only API for over-specification — which numeric scales, internal IDs, or raw rows are visible that clients did not need
  • Create a typing.Protocol plus a small dataclass so two different ranking strategies satisfy the same contract
  • Apply the Parnas/Clements/Weiss module-guide mini-doc format: secret, likely changes, stable contract, what is not promised

One-minute primer on typing.Literal

typing.Literal lets a type be one of a fixed set of values:

from typing import Literal
Confidence = Literal["low", "medium", "high"]

Now confidence: Confidence means the value must be the string “low”, “medium”, or “high”. A type checker will flag anything else, although Python itself does not enforce the restriction at runtime. It’s the right tool for a small, fixed set of domain-meaningful labels.
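Because `Literal` is only checked statically, strings that arrive from outside the program (a config file, a request body) need an explicit check. One way to do that, sketched under the assumption that you want a single source of truth for the labels, is `typing.get_args`. The helper name `parse_confidence` is hypothetical.

```python
from typing import Literal, cast, get_args

Confidence = Literal["low", "medium", "high"]


def parse_confidence(raw: str) -> Confidence:
    """Hypothetical helper: validate an untyped string at a system boundary."""
    # get_args() returns the allowed values in declaration order.
    if raw not in get_args(Confidence):
        raise ValueError(f"unknown confidence label: {raw!r}")
    # cast() only informs the type checker; it does nothing at runtime.
    return cast(Confidence, raw)


allowed = get_args(Confidence)
parsed = parse_confidence("high")
```

Calling `parse_confidence("HIGH")` raises `ValueError`, since matching is exact and case-sensitive in this sketch.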

The scene

recommender.py ranks songs for a query. The current contract returns list[tuple[int, float, dict]], where each tuple is (bucket_id, similarity_score, raw_row). sidebar.py thresholds at score >= 12.0 to call a hit “strong.” Today’s scorer is BM25-style; scores live in 0..30. Next quarter the team plans to swap in vector embeddings; scores will live in 0..1. Every threshold in every client will silently produce wrong answers.

✏️ Predict before you run

The bad design returns (bucket_id, score, row). If the recommender switches from BM25 to cosine-similarity embeddings, what is the most likely failure mode in the existing sidebar.py?

  • (a) A crash — the new return type won’t match.
  • (b) An empty sidebar — every score will be below the threshold 12.0, so no hits are “strong” anymore.
  • (c) The sidebar shows literally every song — every score will be above 12.0.
  • (d) The sidebar is unchanged — the contract types are the same.

Commit before reading on.

Reveal

Answer **(b)**. Cosine-similarity scores live in `0..1`. The old threshold `12.0` is now larger than the highest possible score, so the strong-hit list is always empty. **The sidebar just goes blank in production** with no exception — the worst kind of bug.

The deep mistake is not in `sidebar.py`. It is in `recommender.py`'s contract, which exposed the numeric score and tied callers to its *scale*. Parnas's term for this in his 1972 paper: the interface "reveals more than is necessary," restricting which future implementations can satisfy it.
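The silent failure can be reproduced with plain numbers. In this sketch the score lists and the cutoffs inside `bm25_confidence` and `cosine_confidence` are made-up values chosen for illustration; they are not taken from recommender.py.

```python
STRONG_THRESHOLD = 12.0  # client-side assumption about a 0..30 scale

bm25_like = [28.5, 13.2, 4.0]      # hypothetical scores on a 0..30 scale
cosine_like = [0.95, 0.44, 0.13]   # hypothetical scores on a 0..1 scale


def strong(scores: list[float]) -> list[float]:
    # What a client does when the raw score is part of the contract.
    return [s for s in scores if s >= STRONG_THRESHOLD]


strong_before = strong(bm25_like)   # two hits clear the bar
strong_after = strong(cosine_like)  # nothing can: the maximum score is 1.0


# Alternative: each scorer owns its scale and publishes only a label.
def bm25_confidence(score: float) -> str:
    return "high" if score >= 12.0 else "low"    # assumed cutoff


def cosine_confidence(score: float) -> str:
    return "high" if score >= 0.4 else "low"     # assumed cutoff


labels_before = [bm25_confidence(s) for s in bm25_like]
labels_after = [cosine_confidence(s) for s in cosine_like]
```

With raw scores, the swap turns two strong hits into none without raising anything. With labels, both scorers report `["high", "high", "low"]`, so a client that filters on the label behaves the same before and after the swap.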

Scaffold: trace the leak before you code

Do this step in four small passes. The goal is to lower the typing load so your attention stays on the design decision.

| Pass | What to decide | What to edit |
| --- | --- | --- |
| 1. Name the leak | sidebar.py knows the score scale, bucket IDs, and raw row shape. Those belong to the recommender implementation. | Do not touch code yet; point at the three leaking facts in the starter files. |
| 2. Replace the contract | Clients need “is this a strong hit?”, not “what was the raw score?” | Add Confidence, SongHit, and Recommender in recommender.py. |
| 3. Move the algorithm decision | Popularity buckets are one implementation’s secret. | Implement PopularityRecommender.recommend(...) behind the Protocol. |
| 4. Clean the client | The sidebar should only ask for hits and read domain fields. | Refactor support_sidebar(query, recommender) to filter on hit.confidence. |

Your task

Refactor recommender.py so the contract exposes only what the client genuinely needs:

  1. Define Confidence = Literal["low", "medium", "high"].
  2. Define @dataclass(frozen=True) class SongHit with track_id: str, title: str, artist: str, confidence: Confidence.
  3. Define class Recommender(Protocol) with def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ....
  4. Provide class PopularityRecommender: whose recommend method satisfies the Protocol. Use the helper _strong_track_table() already in the file to populate a few demo hits — assign confidence based on internal popularity buckets (you choose how).
  5. Refactor sidebar.py so support_sidebar(query, recommender) takes a Recommender and returns titles of hits where hit.confidence == "high". No numeric thresholds anywhere in sidebar.py.

Also: write a module guide comment at the top of recommender.py in this exact format (you can fill in the values):

"""
Module guide:
  Primary secret:   <one sentence — name the volatile decision>
  Likely changes:   <bullets — BM25 -> embeddings, score scale shifts, ...>
  Stable contract:  <one sentence — what callers can rely on>
  Not promised:     <bullets — raw scores, bucket IDs, ranking algorithm, ...>
"""

Test 4 will look for those four labels (Primary secret, Likely changes, Stable contract, Not promised). Parnas, Clements, and Weiss called this artifact the module guide in their 1985 paper. It is the lightest-weight design doc you can write that still records why the boundary exists.
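For a sense of what a filled-in guide can look like, here is one for a hypothetical thumbnail-cache module, unrelated to the recommender, together with a simple label check. How Test 4 actually inspects your file is not shown in this lesson; the substring check in `missing_labels` is only an assumption about one plausible approach.

```python
# A filled-in module guide for a hypothetical thumbnail cache.
MODULE_GUIDE = """
Module guide:
  Primary secret:   How thumbnails are cached and when entries expire.
  Likely changes:   - in-memory dict -> on-disk cache
                    - eviction policy (LRU -> TTL)
  Stable contract:  get(key) returns the cached bytes or None.
  Not promised:     - eviction order
                    - storage location
"""

REQUIRED_LABELS = (
    "Primary secret",
    "Likely changes",
    "Stable contract",
    "Not promised",
)


def missing_labels(doc: str) -> list[str]:
    """Return the required labels that do not appear in the docstring."""
    return [label for label in REQUIRED_LABELS if label not in doc]


complete = missing_labels(MODULE_GUIDE)
partial = missing_labels("Primary secret: storage layout")
```

Here `complete` is an empty list and `partial` names the three labels that are absent. The "Not promised" section tends to be the most useful one to future readers, because it records what they must not build on.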

🪞 Before clicking Next

Once all four tests pass, take 20 seconds and answer in your head:

Without using the words “score” or “BM25”: if a future engineer reads sidebar.py, can they tell which ranking algorithm runs underneath? Why or why not?

The right answer (“no — sidebar only sees hit.confidence, which is a domain label, not an algorithm artifact”) is what you just bought with this refactor. The forbidden words force you to talk about the concept, not just point at the leak.

Starter files
recommender.py
"""STARTING STATE.

Today's design returns ``list[tuple[bucket_id, score, raw_row]]``.
Score scale is 0..30 (BM25-like). Refactor as the instructions ask
so a future swap to embeddings (scale 0..1) does NOT break callers.
"""

# The recommender currently exposes raw scores, bucket IDs, and dict rows.
# That is an over-specified contract. Replace it.

_DEMO_CATALOG = [
    # (track_id, title, artist, internal_popularity_0_to_100)
    ("t1", "Bad Guy",          "Billie Eilish",          95),
    ("t2", "Bury a Friend",    "Billie Eilish",          78),
    ("t3", "Lovely",           "Billie Eilish, Khalid",  62),
    ("t4", "Ocean Eyes",       "Billie Eilish",          55),
    ("t5", "Happier Than Ever","Billie Eilish",          88),
    ("t6", "All The Good Girls","Billie Eilish",         40),
]


def _strong_track_table() -> list[tuple[str, str, str, int]]:
    """Return the demo catalog. Use the int popularity to choose confidence."""
    return list(_DEMO_CATALOG)


def recommend(query: str) -> list[tuple[int, float, dict]]:
    """LEAKY contract — returns (bucket_id, score, raw_row)."""
    raw = _strong_track_table()
    # Pretend BM25 scores in 0..30 derived from the popularity field.
    return [
        (
            i // 3,                       # bucket_id leaks an internal partition
            round(pop * 30 / 100, 2),     # score scale 0..30 leaks the algorithm
            {"track_id": tid, "title": title, "artist": artist, "popularity": pop},
        )
        for i, (tid, title, artist, pop) in enumerate(raw)
    ]


# TODO replace the leaky surface above with:
#   1. ``Confidence = Literal["low", "medium", "high"]``
#   2. ``@dataclass(frozen=True) class SongHit`` with the fields named in the instructions
#   3. ``class Recommender(Protocol)`` with ``recommend(query, *, limit=5) -> list[SongHit]``
#   4. ``class PopularityRecommender`` implementing the Protocol
#
# And add the module-guide docstring at the top of the file.
sidebar.py
"""Client that knows too much.

Refactor ``support_sidebar`` to take a ``Recommender`` and ask for
high-confidence hits — no raw scores, no thresholds.
"""

from recommender import recommend

STRONG_THRESHOLD = 12.0   # BM25 scale assumption — a leak waiting to break.


def support_sidebar(query: str) -> list[str]:
    hits = recommend(query)
    return [row["title"] for (_bucket, score, row) in hits if score >= STRONG_THRESHOLD]


if __name__ == "__main__":
    for title in support_sidebar("billie eilish"):
        print(title)
5

Where Did You Put the Database?

Why this matters

The single most common information-hiding leak in real code is storage. A function that takes a sqlite3.Connection (or a MongoClient, or an S3 handle) and returns rows ties every caller to a specific persistence technology. When the team migrates from SQLite to Postgres, from rows to JSON, from synchronous to async, everything moves. This step is the canonical Parnas case made hands-on. You’ll do the whole routine yourself.

🎯 You will learn to

  • Create a Protocol + dataclass + in-memory implementation from a leaky function — independently, using the five-step routine
  • Apply dependency injection: pass the directory in to the client instead of constructing the storage inside it
  • Evaluate the change-impact radius of a storage migration before and after your refactor
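Dependency injection, as named in the second objective above, can be sketched on an unrelated domain so the exercise stays yours to solve. `RateSource`, `FixedRates`, and `convert` are hypothetical names invented for this example.

```python
from typing import Protocol


class RateSource(Protocol):
    """Hypothetical contract: the only thing the client may know."""

    def rate(self, currency: str) -> float: ...


class FixedRates:
    """In-memory implementation, handy for tests and offline work."""

    def __init__(self, rates: dict[str, float]) -> None:
        self._rates = dict(rates)  # copy, so later caller edits do not leak in

    def rate(self, currency: str) -> float:
        return self._rates[currency]


def convert(source: RateSource, amount: float, currency: str) -> float:
    # The client receives its dependency; it never builds or imports
    # a concrete storage technology itself.
    return round(amount * source.rate(currency), 2)


converted = convert(FixedRates({"EUR": 0.5}), 10.0, "EUR")
```

A version of `convert` that opened its own network connection or database handle would need to be edited for every new backend. With the dependency passed in, a new backend is one new class that offers `rate(...)`.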

The scene

events.py looks up concerts by city. Today’s implementation uses SQLite, and the function signature reveals it: every caller must supply a sqlite3.Connection and a table name. The product manager wants to add a JSON-file-backed test fixture for offline development, and the SRE wants to migrate the production catalog to a remote HTTP service. Today, each of those means rewriting existing files. Your job is to make each one a single new class.

✏️ Predict before you run

Suppose we keep the current events.py signature and just implement a JSON-file fixture. How many files have to be edited to use it from tour_planner.py?

  • (a) 1 — events.py only.
  • (b) 2 — events.py and tour_planner.py.
  • (c) 3+ — events.py, tour_planner.py, every test that constructs the connection, and any module that builds the SQL table string.
  • (d) 0 — duck typing handles it; pass a JSON dict where a connection is expected.

Commit. After your refactor, the same change will require one new class in events.py and zero edits to tour_planner.py — that is your verification.

Scaffold: write the change map first

This step is the most independent refactor so far, but you still get a planning rail. Before touching code, complete this map mentally:

| Question | Answer for this step |
| --- | --- |
| What is likely to change? | SQLite may become Postgres, HTTP, or a JSON fixture. |
| What is the secret? | Persistence technology plus schema/row mapping. |
| Who may know it? | Concrete directory implementations such as SQLiteEventDirectory. |
| Who must not know it? | tour_planner.affordable_shows and tests that only need events. |
| What is the stable contract? | directory.find_in(city) -> list[Event]. |

Then code in passes: define Event, define the EventDirectory Protocol, make the tiny in-memory implementation, make the SQLite implementation, and only then refactor tour_planner.py. If a pass fails, you know which layer to fix.

Your task

Refactor events.py so the persistence decision is hidden:

  1. Define @dataclass(frozen=True) class Event with title: str, venue: str, date_iso: str, city: str, ticket_price_cents: int.
  2. Define class EventDirectory(Protocol) with def find_in(self, city: str) -> list[Event]: ....
  3. Implement class InMemoryEventDirectory: — constructor takes a list[Event], find_in(city) filters by city. This is your test/fixture implementation.
  4. Implement class SQLiteEventDirectory: — constructor takes a sqlite3.Connection and a table name, find_in(city) runs the same SQL the original function ran and maps rows to Event. Apart from the demo wiring under __main__ in tour_planner.py, events.py is the only file that may import sqlite3.

Refactor tour_planner.py so affordable_shows(directory, city, max_price_dollars=50) takes an EventDirectory (not a connection). Filter inside the function using event.ticket_price_cents and return a list[Event].

Add the module guide docstring to events.py using the same four labels you used in Step 4.

You will probably break the implementation-swap test first. The most common cause is forgetting to map raw SQL row tuples back to Event objects in SQLiteEventDirectory.find_in. If the test fails, read its diff carefully — the failure is the lesson, not the verdict.
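The row-mapping step mentioned above is easy to see in isolation. This sketch uses a hypothetical `books` table and `Book` dataclass rather than the exercise's schema, and it relies on the SELECT column order matching the dataclass field order.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:  # hypothetical domain object, not part of the exercise
    title: str
    year: int


def books_since(conn: sqlite3.Connection, year: int) -> list[Book]:
    rows = conn.execute(
        "SELECT title, year FROM books WHERE year >= ? ORDER BY year",
        (year,),
    ).fetchall()
    # Each row is a plain tuple in SELECT column order. Returning `rows`
    # directly would hand tuples to a caller that expects Book objects.
    return [Book(*row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books(title TEXT, year INT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("Dune", 1965), ("Piranesi", 2020)],
)
found = books_since(conn, 2000)
```

`found` is a one-element list holding `Book(title="Piranesi", year=2020)`. A tuple such as `("Piranesi", 2020)` never compares equal to that object, which is why a swap test that compares against domain objects fails when the mapping is skipped.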

🪞 Before clicking Next

Once all four tests pass, answer this in your head before the quiz:

Without using the words “SQL” or “database”: after the refactor, affordable_shows calls one method on its parameter. Name that method and explain why that single call is enough to absorb every plausible storage migration (SQLite → Postgres → HTTP → file).

The forbidden words force you to describe the contract, not the current implementation. If you find yourself reaching for “SQL”, that is your brain telling you the contract still has a database shape in it — which would mean the abstraction is not really hiding storage.

Starter files
events.py
"""Concert directory.

STARTING STATE: leaks sqlite3 and the row dict shape into every caller.
"""

import sqlite3


def find_events_in_city(
    connection: sqlite3.Connection,
    table: str,
    city: str,
) -> list[dict]:
    rows = connection.execute(
        f"SELECT title, venue, date_iso, city, ticket_price_cents "
        f"FROM {table} WHERE city = ?",
        (city,),
    ).fetchall()
    return [
        {
            "title": r[0],
            "venue": r[1],
            "date_iso": r[2],
            "city": r[3],
            "ticket_price_cents": r[4],
        }
        for r in rows
    ]


# TODO Run the five-step routine yourself:
#   1. Name the change. (One coming: SQLite -> Postgres -> HTTP service.)
#   2. Name the secret. (Persistence technology + schema mapping.)
#   3. Minimum client assumptions. (event has title, venue, date, city, price.)
#   4. Remove the leak.
#        - ``@dataclass(frozen=True) class Event``
#        - ``class EventDirectory(Protocol)`` with ``find_in(city) -> list[Event]``
#        - ``class InMemoryEventDirectory`` (constructor takes list[Event])
#        - ``class SQLiteEventDirectory`` (constructor takes connection + table)
#   5. Verify with a swap. (The hidden test will swap implementations.)
tour_planner.py
"""Client of events.py. Currently knows about sqlite3 by transitive coupling.

Refactor ``affordable_shows`` to take an EventDirectory instead.
"""

from events import find_events_in_city


def affordable_shows(connection, table: str, city: str, max_price_dollars: int = 50):
    cents_limit = max_price_dollars * 100
    events = find_events_in_city(connection, table, city)
    return [e for e in events if e["ticket_price_cents"] <= cents_limit]


if __name__ == "__main__":
    # The demo wires SQLite in *this* block. Within tour_planner.py, this is
    # the only place the sqlite3 import is allowed AFTER the refactor.
    import sqlite3
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shows("
        "title TEXT, venue TEXT, date_iso TEXT, city TEXT, ticket_price_cents INT)"
    )
    conn.executemany(
        "INSERT INTO shows VALUES (?, ?, ?, ?, ?)",
        [
            ("Sabrina Carpenter",   "The Forum",        "2026-03-01", "Los Angeles", 11500),
            ("Olivia Rodrigo",      "Crypto.com Arena", "2026-03-05", "Los Angeles",  9800),
            ("Tame Impala",         "Hollywood Bowl",   "2026-04-12", "Los Angeles",  6700),
            ("Local Open Mic",      "Echo Park Bar",    "2026-03-15", "Los Angeles",  1500),
        ],
    )
    conn.commit()

    # After refactor:
    #   from events import SQLiteEventDirectory
    #   directory = SQLiteEventDirectory(conn, "shows")
    #   for ev in affordable_shows(directory, "Los Angeles", max_price_dollars=80):
    #       print(ev)
    for ev in affordable_shows(conn, "shows", "Los Angeles", max_price_dollars=80):
        print(ev)
6

Single Choice: Stop Repeating the Provider List

🧠 Before you read — retrieve from memory

You’re about to do refactor #4. Before another worked example layers on top of the routine you’ve practiced three times already, your brain needs a chance to produce it cold — that’s what makes the next refactor cheaper than the last one, instead of just longer to read.

You’ve now done three refactors that each followed the same five-step routine. Cover the screen and write the five labels of that routine from memory. (A scrap of paper, a comment in your editor, or your head — any form is fine. Just don’t peek.)

Reveal (after you've written your version)

The canonical labels (same five every time, from Parnas's design-for-change discipline):

```text
1. Name the change.            What is about to change, and why is it likely?
2. Name the secret.            Which design decision should one module own?
3. Minimum client assumptions. What does the client *actually* need to know?
4. Remove the leak.            Replace exposed representation with domain operations, a Protocol, dependency injection: whatever names the contract without naming the decision.
5. Verify with a swap.         Same client, different implementation, same output.
```

If yours matched word-for-word, your schema is solidifying; that's exactly what spaced retrieval is supposed to do. If you got 4 out of 5 (most students do by this point), notice *which* you missed: the one most often dropped is **#3** (minimum client assumptions), because it's the only step that asks you to reason about the *client* rather than the module being refactored.

Karpicke & Roediger (2008) found that repeatedly recalling material from memory produces substantially better long-term retention than re-studying the same material. The 30 seconds you just spent writing the routine from memory is the cheapest learning move in this tutorial.

Why this matters

Open any production codebase and search for if provider ==. You’ll find the same alphabetical list of providers in four files. Add a fifth provider and you edit all four — and inevitably miss one, shipping a “feature works on Spotify but silently breaks on Tidal” bug. The SEBook chapter calls this the Single Choice principle: when a system supports several alternatives, only one module should know the exhaustive list. This step makes Single Choice operational. The killer test: you’ll add a fourth provider — invisible to your refactored code — and three client functions will work with it unchanged.

🎯 You will learn to

  • Apply the Single Choice principle by replacing scattered if provider == "..." switches with polymorphism behind a hidden choice point
  • Analyze code for repeated exhaustive lists (the same set of "spotify", "apple_music", "tidal" strings in multiple files is the smell)
  • Create a new provider class that satisfies the StreamingProvider Protocol — and feel that no existing client function had to change to absorb it

The scene

streaming.py has three top-level functions: play_track, share_track, like_track. Each one has the identical if provider == "spotify": ... elif provider == "apple_music": ... elif provider == "tidal": ... ladder. The product manager just said: “Add YouTube Music. Same operations.” The bad design: another edit in every function that repeats the ladder. The good design: one new class. The test enforces the second.

✏️ Predict before you run

Today’s streaming.py repeats the provider list in three functions. If we add YouTube Music in the current design, how many elif branches must be added across the file?

  • (a) 1 — a new branch in one function is enough.
  • (b) 3 — one new branch per function, three total.
  • (c) 4 — three new branches plus a new helper function.
  • (d) 0 — Python’s match statement handles it.

Commit. Then refactor and see the answer for the good design.

Your task

Refactor streaming.py:

  1. Define class StreamingProvider(Protocol) with play(self, track_id) -> str, share(self, track_id, friend) -> str, like(self, track_id) -> str. Each returns the message string that the current code prints.
  2. Define class SpotifyProvider, class AppleMusicProvider, class TidalProvider — each implements all three methods.
  3. Rewrite play_track(provider: StreamingProvider, track_id: str), share_track(...), and like_track(...) so each just delegates to the corresponding method on the passed-in provider — no if/elif/match ladders anywhere.

The hidden test will then construct a fourth provider, YouTubeMusicProvider, which your code has never seen. If your play_track/share_track/like_track functions are properly polymorphic, that fourth provider will Just Work. If any branching on "youtube_music" is needed, the test fails.
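The delegation idea can be sketched with a single operation. This is an illustration under assumptions, not the hidden test: the `YouTubeMusicProvider` body and its message text are invented here, and only `play` is shown to keep the sketch short.

```python
from typing import Protocol


class StreamingProvider(Protocol):
    def play(self, track_id: str) -> str: ...


class SpotifyProvider:
    def play(self, track_id: str) -> str:
        return f"Playing {track_id} on Spotify..."


def play_track(provider: StreamingProvider, track_id: str) -> str:
    # No ladder: the provider object itself carries the choice.
    return provider.play(track_id)


class YouTubeMusicProvider:
    """Hypothetical fourth provider; the message text is an assumption."""

    def play(self, track_id: str) -> str:
        return f"Playing {track_id} on YouTube Music..."


# play_track was written before YouTubeMusicProvider existed and is unchanged.
print(play_track(SpotifyProvider(), "t1"))
print(play_track(YouTubeMusicProvider(), "t1"))
```

The only place that still has to know which providers exist is whatever code constructs the provider object; `play_track` never sees the list.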

🪞 Before clicking Next

Once all three tests pass, do this self-check before the quiz:

Without using the word “Protocol”: search this tutorial mentally across all four refactors (Steps 2, 4, 5, and 6). In each one, you replaced direct exposure of a design decision with what kind of thing? The four answers are different in form but all instances of the same move.

The four are: (Step 2) domain operations on a class, (Step 4) a typed shape + dataclass, (Step 5) dependency injection of a typed shape, (Step 6) polymorphism on a typed shape. Each one is a different way to make a contract not name the volatile decision. (The forbidden word forces you to name what each refactor was for, not the Python construct it used.) The quiz’s last question asks this in MCQ form.

Starter files
streaming.py
"""STARTING STATE.

Three functions, each with the same provider ladder. The "exhaustive
list of providers" is duplicated three times. Refactor with
polymorphism behind a hidden choice point.
"""


def play_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Playing {track_id} on Spotify..."
    elif provider == "apple_music":
        return f"Playing {track_id} on Apple Music..."
    elif provider == "tidal":
        return f"Streaming {track_id} on Tidal hi-fi..."
    else:
        raise ValueError(f"Unknown provider: {provider}")


def share_track(provider: str, track_id: str, friend: str) -> str:
    if provider == "spotify":
        return f"Shared Spotify link {track_id} with {friend}"
    elif provider == "apple_music":
        return f"Sent Apple Music card for {track_id} to {friend}"
    elif provider == "tidal":
        return f"Tidal shared {track_id} to {friend}"
    else:
        raise ValueError(f"Unknown provider: {provider}")


def like_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Liked Spotify track {track_id}"
    elif provider == "apple_music":
        return f"Loved Apple Music track {track_id}"
    elif provider == "tidal":
        return f"Added Tidal track {track_id} to favorites"
    else:
        raise ValueError(f"Unknown provider: {provider}")


# TODO Replace the ladders with:
#   1. ``class StreamingProvider(Protocol)`` (play, share, like)
#   2. ``SpotifyProvider``, ``AppleMusicProvider``, ``TidalProvider``
#   3. Rewrite play_track / share_track / like_track to delegate

if __name__ == "__main__":
    print(play_track("spotify", "t1"))
    print(share_track("apple_music", "t1", "Alex"))
    print(like_track("tidal", "t9"))
7

Sort the Leaks

Why this matters

Steps 2-6 each taught one kind of leak in isolation — that’s blocked practice, and it’s the right shape for building each schema. But real codebases mix leak types, and the skill an engineer actually needs is classification first, fix second: read a snippet, identify which kind of leak it is (or whether it’s a leak at all), and then pick the right routine.

This step is pure judgment — no code to write, no files to refactor. Six short snippets. For each one, you decide what kind of leak (if any) is present and which step’s routine fixes it.

🎯 You will learn to

  • Discriminate between the four leak types you’ve practiced — by attending to deep structure, not surface cues
  • Recognize when a snippet is not a leak, and resist the “always abstract” instinct
  • Match each leak to the step that taught its fix (representation = Step 2, overspecification = Step 4, persistence = Step 5, exhaustive-alternatives = Step 6)

How to read each snippet

Every snippet has a specific design decision visible (or appropriately hidden). The deep-structure cue you’re looking for: what would have to change in clients if the implementation chose differently? If nothing would, it’s not a leak. If many clients would, name the type and pick the routine.
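To make the cue concrete, here is a small contrast that is not one of the six quiz snippets. Both class names and the client expression are hypothetical; only `total_duration_minutes` echoes an operation named elsewhere in this tutorial.

```python
class LeakyPlaylist:
    def __init__(self) -> None:
        self._tracks: list[dict] = []

    def add(self, title: str, seconds: int) -> None:
        self._tracks.append({"title": title, "seconds": seconds})

    def tracks(self) -> list[dict]:
        # Leak: hands out the internal list, so clients learn the
        # list-of-dicts shape and can even mutate it.
        return self._tracks


class HidingPlaylist:
    def __init__(self) -> None:
        # Different storage choice; clients cannot tell.
        self._seconds_by_title: dict[str, int] = {}

    def add(self, title: str, seconds: int) -> None:
        self._seconds_by_title[title] = seconds

    def total_duration_minutes(self) -> float:
        return sum(self._seconds_by_title.values()) / 60


leaky, hiding = LeakyPlaylist(), HidingPlaylist()
for playlist in (leaky, hiding):
    playlist.add("Intro", 120)
    playlist.add("Outro", 180)

# This client breaks if LeakyPlaylist ever stops storing dicts with a "seconds" key.
client_total = sum(t["seconds"] for t in leaky.tracks()) / 60

# This client survives any storage change inside HidingPlaylist.
print(client_total, hiding.total_duration_minutes())

# The leak also allows mutation from outside, without calling add().
leaky.tracks().clear()
print(len(leaky.tracks()))
```

Applying the question to each class: switching `LeakyPlaylist` to a dict keyed by title would force every such client to change, while `HidingPlaylist` already made that switch without any client noticing.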

The same five-step routine you retrieved at the start of Step 6 applies to every fix. This step trains the which routine judgment that comes before applying it.

Research base: Rohrer & Taylor (2007) and Dunlosky et al. (2013) find that interleaved practice produces worse performance during practice but dramatically better transfer afterward, because mixing examples forces attention to the structural feature rather than the surface feature. The questions in this step might feel harder than Steps 2-6 did. That’s the point.

Starter files
SNIPPETS.md
# Step 7 — Sort the Leaks

Six short snippets are in the quiz on the right. Each shows a small
Python module. For each one, decide:

1. Is there a leak?
2. If yes, which *kind* — representation (Step 2), over-specification
   (Step 4), persistence (Step 5), or exhaustive-alternatives (Step 6)?
3. If no, why is the abstraction unnecessary here?

You will not edit code in this step. The skill being trained is
classification — the move that comes *before* picking a fix.
8

Predict the Blast Radius

Why this matters

Information hiding is verified by simulating change — Parnas’s original test, and the one industry calls change impact analysis. A real engineer’s job isn’t to recite that classes should depend on abstractions. It’s to read a system and predict: if this changes, what else changes? This step is your final exam for the tutorial: a fresh, never-seen MusicShare app with five modules, four plausible change requests (one of which has the correct answer “don’t refactor”), one honest-tradeoff question, and one cold-transfer case from a different domain. Plus one short open-text artifact — a module guide for ui.py — to consolidate everything you’ve learned into the lightest-weight design doc Parnas, Clements & Weiss invented.

🎯 You will learn to

  • Predict the change-impact radius of a plausible future change in a small system before attempting the change
  • Evaluate when a layer of information hiding pays for itself — and when it adds cognitive overhead without proportional benefit
  • Apply the five-step routine on a system you’ve never seen before
  • Produce a Parnas/Clements/Weiss module guide for an unfamiliar module under time pressure

The MusicShare app

MusicShare ships a web UI for discovering and sharing music. Its five real modules:

| Module | Public surface (the contract) | Hidden secret |
|---|---|---|
| `recommender.py` | `Recommender(Protocol).recommend(query, *, limit) -> list[SongHit]` | scoring / ranking algorithm |
| `streaming.py` | `StreamingProvider(Protocol)` + `play_track` / `share_track` / `like_track` | which streaming service is used today |
| `playlist.py` | `Playlist` class with `add`, `top_tracks(n)`, `total_duration_minutes()`, `average_popularity()`, `__len__` | internal storage representation |
| `events.py` | `EventDirectory(Protocol).find_in(city) -> list[Event]` | which persistence backend stores concert listings |
| `ui.py` | HTTP handlers for `/search`, `/share`, `/like`, `/concerts/<city>` | how requests are routed / rendered to HTML |

Plus the wiring layer (composition_root.py) that picks today’s concrete Recommender, StreamingProvider, and EventDirectory instances.
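The role of the wiring layer can be sketched with one Protocol. Everything below is a stand-in under stated assumptions: `Ui`, `handle_like`, `build_ui`, and `FakeProvider` are hypothetical names, and the real ui.py and composition_root.py are not shown in this tutorial.

```python
from typing import Protocol


class StreamingProvider(Protocol):
    def like(self, track_id: str) -> str: ...


class TidalProvider:
    def like(self, track_id: str) -> str:
        return f"Added Tidal track {track_id} to favorites"


class FakeProvider:
    """Hypothetical test double, used to show the swap."""

    def like(self, track_id: str) -> str:
        return f"fake-like {track_id}"


class Ui:
    """Stand-in for ui.py: depends only on the Protocol, never a concrete class."""

    def __init__(self, provider: StreamingProvider) -> None:
        self._provider = provider

    def handle_like(self, track_id: str) -> str:
        return self._provider.like(track_id)


def build_ui() -> Ui:
    """Stand-in for composition_root.py: the one place that names a concrete class."""
    return Ui(TidalProvider())


print(build_ui().handle_like("t9"))
print(Ui(FakeProvider()).handle_like("t9"))
```

Under this arrangement, changing today's streaming service means editing `build_ui` and nothing else, which is the kind of prediction the quiz asks you to make.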

Your tasks

  1. Write a module guide for ui.py in the file MODULE_GUIDE.md. Use the same four labels you learned in Steps 4–5: Primary secret, Likely changes, Stable contract, Not promised. One stylistic note: Steps 4–5 wrote the guide inside a """...""" Python docstring at the top of a .py file (because there was a module file to attach it to). Here the artifact stands alone, so it’s a .md file with the four labels as Markdown ## headings instead. Same content, same Parnas/Clements/Weiss-1985 format — just rendered for Markdown instead of Python. The labels still match exactly so a future maintainer can grep for them across both formats.

    One or two lines per label is enough — the artifact’s value is in the content, not the length. The test enforces substantive content under each label and that the Not promised section names at least one specific concrete decision (HTML templating, route paths, response formats, authentication, etc.).

  2. Answer all six quiz questions below. Four are change-impact predictions on MusicShare (one of which has “don’t refactor” as the correct answer); the fifth is the honest-tradeoff question; the sixth is an unscaffolded transfer case on a system you have not seen.

The module guide is the consolidation artifact: producing a four-label document for a module you’ve never edited proves you can apply the discipline on cold material. That is the meaningful capstone for this tutorial.

Starter files
SYSTEM.md
# MusicShare system map

Five modules + wiring:

- recommender.py    — Recommender Protocol; today's concrete is PopularityRecommender.
                      Secret: scoring / ranking algorithm.
- streaming.py      — StreamingProvider Protocol; today's concretes are
                      SpotifyProvider, AppleMusicProvider, TidalProvider.
                      Secret: which streaming service is used.
- playlist.py       — Playlist class; secret: internal storage representation.
- events.py         — EventDirectory Protocol; today's concrete is SQLiteEventDirectory.
                      Secret: which persistence backend stores concert listings.
- ui.py             — HTTP handlers for /search, /share, /like, /concerts/<city>.
                      Calls only the public contracts of the four modules above (never the concrete classes).
- composition_root.py — picks today's concrete implementations and hands them to ui.py.

The quiz on the right asks you to predict, for several plausible future
changes, WHICH modules need to be edited. No code to refactor — just
your judgment, plus one short module guide.
MODULE_GUIDE.md
# Module guide — ui.py

Write the Parnas/Clements/Weiss module guide for `ui.py`. Use the four
labels exactly as below; replace the `<...>` placeholders with one or
two lines of your own reasoning.

## Primary secret

<One sentence: what design decision does ui.py own and hide?>

## Likely changes

<Bullet two or three plausible future changes this module absorbs locally.>

## Stable contract

<One or two sentences: what do callers of ui.py rely on?>

## Not promised

<Bullet at least two concrete decisions that are NOT part of ui.py's
contract — things a future maintainer must NOT depend on. Be specific:
name HTML/JSON, templating engine, exact response shapes, URL paths,
authentication scheme, etc. A generic "implementation details" line
does not count and the test will reject it.>