1

Private Is Not a Secret

Why this matters

Most CS students learn private before they learn what it’s for. So when someone asks “did you hide the information?”, they answer “yes — the field is private.” That answer is wrong often enough that a billion-dollar industry of code reviews exists. The next five minutes are an inoculation: you will hold a class whose fields are “private” and use its own public API to plant a fake entry — proving the secret leaked anyway.

🎯 You will learn to

  • Distinguish information hiding (a design decision about who is allowed to know what) from private (a syntax feature that helps enforce it, when used carefully)
  • Analyze a public method signature for representation leaks — even when every field uses the _underscore convention

✏️ Predict before you run

Open watch_history.py. Every field starts with _ — Python’s convention for “private.” Will an outside caller still be able to make history.recent() return [..., {"title": "Pirated Movie", "year": 1999}] without calling history.add(...)?

  • (a) No — the underscore convention prevents outside code from touching _history.
  • (b) No — the only public modifier is add(), so _history can only change via add().
  • (c) Yes — because recent() returns the actual internal list, the caller can mutate it from outside.
  • (d) Compile error — Python won’t let you index into a _ -prefixed attribute.

Commit to a letter, then try the task.

Reveal (after you've tried it)

Answer **(c)**. The "private" field has not been hidden, only renamed. `recent()` hands the caller a *reference* to `self._history`, which is a `list`, and `list.append(...)` mutates in place. Visibility modifiers stop nothing on their own. The **representation decision** ("storage is a list of dicts you may mutate by reference") has leaked through the public API.

Three other ways this same class would still leak even if `recent()` returned `tuple(self._history)`:

  1. Callers see the element type (`dict[str, ...]`) and start writing code that depends on the keys.
  2. Switching `_history` to a `dict[str, Episode]` keyed by title would change what `recent()` returns and break every caller.
  3. `for entry in history.recent():` quietly depends on iteration order.

Information hiding isn't a marker on a field; it's a decision about *which design decisions clients may depend on*. The next five steps train that judgment.
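A minimal sketch of the aliasing leak next to one possible mitigation. `SnapshotWatchHistory` is a hypothetical variant, not part of the starter files; it assumes that returning copies is acceptable for callers. It stops the mutation, but callers can still see that entries are dicts with `title` and `year` keys, so the representation is not fully hidden.

```python
class LeakyWatchHistory:
    """Mirrors the starter class: recent() returns the live internal list."""

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> list[dict]:
        return self._history  # alias: caller and object share one list


class SnapshotWatchHistory:
    """Hypothetical variant: recent() returns copies, not the live list."""

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> tuple[dict, ...]:
        # A tuple has no append(), and copying each dict keeps callers
        # from editing the stored entries through the returned objects.
        return tuple(dict(entry) for entry in self._history)


leaky = LeakyWatchHistory()
leaky.add("Severance", 2022)
leaky.recent().append({"title": "Pirated Movie", "year": 1999})
leaky_titles = [entry["title"] for entry in leaky.recent()]

safer = SnapshotWatchHistory()
safer.add("Severance", 2022)
snapshot = safer.recent()
snapshot[0]["year"] = 1900  # changes only the caller's copy
stored_year = safer.recent()[0]["year"]
```

In this sketch the leaky object ends up holding the planted entry, while the snapshot variant still reports 2022 for its stored entry.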

Your task — make the leak happen

Open watch_history.py. Below the # TODO marker, write one line that — using only the public API history.recent() — plants a {"title": "Pirated Movie", "year": 1999} entry. Do not call history.add(...). When the test runs, history.recent() must return a list containing that planted entry.

The point isn’t that you should write code like this. The point is that the design lets you, even though the author thought they had hidden the field.

Starter files
watch_history.py
class WatchHistory:
    """Stores what you have watched.

    The author thought ``_history`` was 'hidden' because of the
    underscore convention.
    """

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> list[dict]:
        # The author meant this as a read-only view.
        # It is not. Find out why in the task below.
        return self._history


if __name__ == "__main__":
    history = WatchHistory()
    history.add("Stranger Things", 2016)
    history.add("Severance", 2022)

    # TODO: Without calling ``history.add(...)``, plant the fake entry
    # ``{"title": "Pirated Movie", "year": 1999}`` so it shows up in
    # ``history.recent()``. One line is enough. Use only the public
    # API — meaning the method ``history.recent()``.
    # planted = ...

    print(history.recent())
2

The Playlist’s Secret

Why this matters

Step 1 showed you a leak the size of one line. Real codebases hide leaks the size of an org chart — and the same leak forces three teams to coordinate every time the data shape changes. The fastest way to feel the difference is to do a small refactor on a Playlist class and then run the same client tests against a different internal storage. If the client breaks, the secret was never hidden.

Connection to the chapter’s KWIC example. Parnas (1972) made the same point at the system level using a Key Word In Context indexer: decomposing by processing steps (input → shift → sort → print) spread the line-storage decision across every module, so a change to line storage broke every module. Decomposing by hidden decisions (line storage, shift generation, sort ordering) kept each change local. This step plays out Parnas’s argument at the class level on a Playlist.

🎯 You will learn to

  • Apply the five-step routine for hiding a secret: name the change, name the secret, list the minimum client assumptions, remove the leak, verify with a swap test
  • Analyze which lines of client code depend on the internal representation
  • Create a new domain operation on the hidden module without changing any client

The scene

You are building a music app. Playlist exposes a raw tracks: list[dict]. Three client features reach in: a textual summary, a top-picks list, and a “party-ready?” check. Run app.py and see the current behavior. Then the product manager files a ticket: “Add remove(title) — needs to be O(1).” That requires switching internal storage from list to dict keyed by title. Run the same client code afterward and watch it break.

✏️ Predict before you run

Look at playlist.py and app.py. If we change Playlist.tracks from a list[dict] to a dict[str, Track] (keyed by title, for O(1) remove), how many files will need to change to keep the demo running?

  • (a) Only playlist.py — the field is private-ish; clients should not care.
  • (b) playlist.py and app.py — every loop and indexing operation in app.py will break.
  • (c) Only app.py. Playlist itself is fine.
  • (d) Zero files — Python’s duck typing handles it.

Commit to a letter. The first test enforces this.

The five-step routine

You will use this exact routine in every refactor from here on. Memorize the headings, not the lines:

1. Name the change.        What is about to change, and why is it likely?
2. Name the secret.        Which design decision should one module own?
3. Minimum client assumptions.   What does the client *actually* need to know?
4. Remove the leak.        Replace exposed representation with domain operations.
5. Verify with a swap.     Same client, different implementation, same output.

Filled in for Playlist:

1. Change:    Internal storage may move from list to dict for O(1) remove.
2. Secret:    How tracks are stored and ordered for retrieval.
3. Client needs:  top N by popularity, total minutes, average popularity, count.
4. Remove:    Stop exposing ``tracks``; expose ``top_tracks(n)``,
              ``total_duration_minutes()``, ``average_popularity()``, ``__len__``.
5. Verify:    Hidden test runs the same client against a dict-backed Playlist.
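Step 5 of the routine can be sketched as follows. This is an illustration under simplifying assumptions, not the hidden test: the two classes are hypothetical, store only a title and a duration, and expose just two of the domain operations. The client function `summary` is identical for both backings.

```python
class ListBackedPlaylist:
    """Hypothetical list-backed storage: (title, duration_sec) pairs."""

    def __init__(self) -> None:
        self._tracks: list[tuple[str, int]] = []

    def add(self, title: str, duration_sec: int) -> None:
        self._tracks.append((title, duration_sec))

    def total_duration_minutes(self) -> float:
        return sum(sec for _title, sec in self._tracks) / 60

    def __len__(self) -> int:
        return len(self._tracks)


class DictBackedPlaylist:
    """Hypothetical dict-backed storage keyed by title (titles assumed unique)."""

    def __init__(self) -> None:
        self._by_title: dict[str, int] = {}

    def add(self, title: str, duration_sec: int) -> None:
        self._by_title[title] = duration_sec

    def remove(self, title: str) -> None:
        del self._by_title[title]  # average-case O(1) dict deletion

    def total_duration_minutes(self) -> float:
        return sum(self._by_title.values()) / 60

    def __len__(self) -> int:
        return len(self._by_title)


def summary(playlist) -> str:
    """Client code: uses only domain operations, never the storage."""
    return f"{len(playlist)} tracks, {playlist.total_duration_minutes():.1f} min"


def build(factory):
    playlist = factory()
    playlist.add("Bad Guy", 194)
    playlist.add("Levitating", 203)
    return playlist


list_summary = summary(build(ListBackedPlaylist))
dict_summary = summary(build(DictBackedPlaylist))
```

If the two summaries match, the storage decision did not reach the client, which is the property the swap is meant to verify.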

Your task

Refactor playlist.py so the public API is domain operations, not a raw list:

add(title, artist, duration_sec, popularity)
top_tracks(n) -> list[Track]          # n highest by popularity
total_duration_minutes() -> float
average_popularity() -> float         # 0 if empty
__len__() -> int

Define a Track dataclass with the four attributes (title, artist, duration_sec, popularity) so top_tracks returns domain objects, not dicts.
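A small sketch of what such a dataclass could look like. The `frozen=True` flag and the helper `top_by_popularity` are assumptions made for illustration; the instructions above only require the four attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen is an assumption: it makes returned tracks read-only
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


def top_by_popularity(tracks: list[Track], n: int) -> list[Track]:
    """Hypothetical helper: n most popular tracks, highest first."""
    return sorted(tracks, key=lambda t: t.popularity, reverse=True)[:n]


demo_tracks = [
    Track("Bad Guy", "Billie Eilish", 194, 95),
    Track("Heat Waves", "Glass Animals", 238, 80),
    Track("Blinding Lights", "The Weeknd", 200, 92),
]
top_two_titles = [t.title for t in top_by_popularity(demo_tracks, 2)]
```

Clients read `track.title` instead of `track["title"]`, so a type checker can catch a misspelled attribute that a dict key lookup would only reveal at run time.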

Refactor app.py so describe_playlist and is_party_appropriate use only the four query methods above (top_tracks, total_duration_minutes, average_popularity, and __len__ via len()). They must not touch playlist.tracks, _tracks, or any dict keys.

One function is up to you: in app.py, write is_party_appropriate(playlist), which returns True if total duration > 90 minutes AND average popularity > 60.

🪞 Before clicking Next

Once all four tests pass, complete this sentence — out loud, to a rubber duck, or in your head — before taking the quiz. (The first quiz question reuses this structure.)

“The decision about how tracks are stored used to live in ____. After the refactor it lives in ____. So a switch from list to dict now changes ____ file(s) instead of ____.”

Research on self-explanation (Chi et al., 1994) suggests that producing the sentence yourself, even silently, strengthens what you just learned more than re-reading does. It takes 15 seconds.

Starter files
playlist.py
"""Music playlist.

STARTING STATE: leaks the internal list[dict] through ``tracks``.
Refactor so the only public surface is *domain operations*.
"""


class Playlist:
    def __init__(self) -> None:
        self.tracks: list[dict] = []

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self.tracks.append({
            "title": title,
            "artist": artist,
            "duration_sec": duration_sec,
            "popularity": popularity,
        })

# TODO (Step 4 of the routine — "Remove the leak"):
#   1. Define a ``Track`` dataclass with title, artist, duration_sec, popularity.
#   2. Replace ``self.tracks`` with a hidden ``self._tracks``.
#   3. Add the four domain methods listed in the instructions.
app.py
"""Client of Playlist. Currently reaches into the raw list.

Refactor ``describe_playlist`` and ``is_party_appropriate`` to use
ONLY the new domain methods on Playlist (no ``playlist.tracks``,
no dict indexing).
"""

from playlist import Playlist


def describe_playlist(playlist: Playlist) -> str:
    total_minutes = sum(t["duration_sec"] for t in playlist.tracks) / 60
    avg_popularity = (
        sum(t["popularity"] for t in playlist.tracks) / len(playlist.tracks)
        if playlist.tracks else 0
    )
    top_three = sorted(playlist.tracks, key=lambda t: t["popularity"], reverse=True)[:3]

    lines = [
        f"{len(playlist.tracks)} tracks, "
        f"{total_minutes:.1f} min, "
        f"avg popularity {avg_popularity:.0f}"
    ]
    lines.append("Top picks:")
    for t in top_three:
        lines.append(f"  - {t['title']} by {t['artist']}")
    return "\n".join(lines)


def is_party_appropriate(playlist: Playlist) -> bool:
    # TODO: rewrite using only Playlist's new domain methods.
    # Spec: True iff total duration > 90 min AND avg popularity > 60.
    raise NotImplementedError


if __name__ == "__main__":
    p = Playlist()
    p.add("Bad Guy", "Billie Eilish", 194, 95)
    p.add("Levitating", "Dua Lipa", 203, 88)
    p.add("Blinding Lights", "The Weeknd", 200, 92)
    p.add("Heat Waves", "Glass Animals", 238, 80)
    p.add("As It Was", "Harry Styles", 167, 90)
    print(describe_playlist(p))
    print("party-ready?", is_party_appropriate(p))
3

An Interface That Tells You Too Much

Why this matters

Step 2 fixed a representation leak, where clients reached straight into the raw track list. This step fixes a subtler one: a contract that looks clean but over-specifies how it computes its answer. Parnas warned about this in 1972 with his KWIC example: an interface that says more than the client needs to know restricts future implementations. A music recommender that returns raw BM25 scores is the modern version. Switch the algorithm from BM25 to embeddings and every numeric threshold in the client breaks.

Heads up — this step adds two new Python features at once. That’s deliberate: typing.Protocol and typing.Literal are the cleanest way to express “any class with this shape works.” If either feels foggy, walk to the primer below, then come back. Confusion on the first read is normal — Parnas’s principle takes most engineers a couple of passes to internalize.

🎯 You will learn to

  • Analyze a read-only API for over-specification — which numeric scales, internal IDs, or raw rows are visible that clients did not need
  • Create a typing.Protocol plus a small dataclass so two different ranking strategies satisfy the same contract
  • Apply the Parnas/Clements/Weiss module-guide mini-doc format: secret, likely changes, stable contract, what is not promised

Two-minute primer on Protocol and Literal

You used Python’s “duck typing” implicitly in Step 2 — the swap test defined DictBackedPlaylist and the client just worked, because Python checks methods at call time, not declaration time. typing.Protocol is how you make that duck typing explicit and type-checkable:

from typing import Protocol

class Recommender(Protocol):
    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ...

Any class that has a recommend(query, *, limit) -> list[SongHit] method automatically satisfies Recommender. No class Foo(Recommender): needed; structural matching only. That’s the contract you’ll write below.
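To make the structural matching concrete, here is a small runnable sketch. `EchoRecommender`, `NotARecommender`, and the one-field `SongHit` are hypothetical stand-ins. The `@runtime_checkable` decorator is an addition for demonstration only: it lets `isinstance` work on a Protocol, but it checks only that the method name exists, not its signature.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class SongHit:
    title: str  # trimmed to one field for this sketch


@runtime_checkable
class Recommender(Protocol):
    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ...


class EchoRecommender:
    """Never mentions Recommender, yet has the right shape."""

    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]:
        return [SongHit(title=query)][:limit]


class NotARecommender:
    """Has a differently named method, so it does not match."""

    def suggest(self, query: str) -> list[SongHit]:
        return []


def first_title(recommender: Recommender, query: str) -> str:
    # The client depends only on the Protocol, not on a concrete class.
    return recommender.recommend(query, limit=1)[0].title


matches = isinstance(EchoRecommender(), Recommender)
rejected = isinstance(NotARecommender(), Recommender)
title = first_title(EchoRecommender(), "bad guy")
```

A static type checker such as mypy or pyright performs the fuller check, including parameter and return types, without any decorator.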

typing.Literal lets a type be one of a fixed set of values:

from typing import Literal
Confidence = Literal["low", "medium", "high"]

Now confidence: Confidence means “must be the string low, medium, or high, and your type checker will yell if you try anything else.” It’s the right tool for a small enum of domain-meaningful labels.
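One caveat worth seeing in code: `Literal` is enforced by the type checker, not by the interpreter. The sketch below assumes a hypothetical helper `parse_confidence` that validates untrusted strings at run time, reusing the allowed values through `typing.get_args`.

```python
from typing import Literal, cast, get_args

Confidence = Literal["low", "medium", "high"]

# get_args pulls the allowed values back out of the Literal type.
ALLOWED_CONFIDENCES = get_args(Confidence)


def parse_confidence(raw: str) -> Confidence:
    """Hypothetical helper: reject strings outside the Literal at run time."""
    if raw not in ALLOWED_CONFIDENCES:
        raise ValueError(f"expected one of {ALLOWED_CONFIDENCES}, got {raw!r}")
    return cast(Confidence, raw)


# Without a check like the one above, Python itself accepts any string here;
# only a static type checker would flag this assignment.
unchecked: Confidence = "strong"  # type: ignore[assignment]
```

Keeping the list of values in one place (the `Literal` itself) means the run-time check cannot drift from the type annotation.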

The scene

recommender.py ranks songs for a query. The current contract returns list[tuple[int, float, dict]], where each tuple is (bucket_id, similarity_score, raw_row). sidebar.py thresholds at score >= 12.0 to call a hit “strong.” Today’s scorer is BM25-style; scores live in 0..30. Next quarter the team plans to swap in vector embeddings; scores will live in 0..1. Every threshold in every client will silently produce wrong answers.

✏️ Predict before you run

The bad design returns (bucket_id, score, row). If the recommender switches from BM25 to cosine-similarity embeddings, what is the most likely failure mode in the existing sidebar.py?

  • (a) A crash — the new return type won’t match.
  • (b) An empty sidebar — every score will be below the threshold 12.0, so no hits are “strong” anymore.
  • (c) The sidebar shows literally every song — every score will be above 12.0.
  • (d) The sidebar is unchanged — the contract types are the same.

Commit before reading on.

Reveal

Answer **(b)**. Cosine-similarity scores live in `0..1`. The old threshold `12.0` is now larger than the highest possible score, so the strong-hit list is always empty. **The sidebar just goes blank in production** with no exception, the worst kind of bug.

The deep mistake is not in `sidebar.py`. It is in `recommender.py`'s contract, which exposed the numeric score and tied callers to its *scale*. Parnas's term for this in his 1972 paper: the interface "reveals more than is necessary," restricting which future implementations can satisfy it.
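The silent failure can be reproduced in a few lines. The score values below are made up for illustration, and `strong_titles` is a simplified stand-in for the thresholding in `sidebar.py`; only the two scales (0..30 and 0..1) come from the scenario above.

```python
STRONG_THRESHOLD = 12.0  # tuned for the BM25-like 0..30 scale


def strong_titles(hits: list[tuple[str, float]]) -> list[str]:
    """Simplified client: keep titles whose raw score clears the threshold."""
    return [title for title, score in hits if score >= STRONG_THRESHOLD]


# Illustrative scores only, same songs under two different scorers.
bm25_like_hits = [("Bad Guy", 28.5), ("Lovely", 18.6), ("Ocean Eyes", 9.0)]
embedding_hits = [("Bad Guy", 0.95), ("Lovely", 0.62), ("Ocean Eyes", 0.30)]

before_swap = strong_titles(bm25_like_hits)
after_swap = strong_titles(embedding_hits)
```

Both calls type-check and neither raises. The second simply returns an empty list, which is why this kind of break tends to surface in production rather than in tests.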

Your task

Refactor recommender.py so the contract exposes only what the client genuinely needs:

  1. Define Confidence = Literal["low", "medium", "high"].
  2. Define @dataclass(frozen=True) class SongHit with track_id: str, title: str, artist: str, confidence: Confidence.
  3. Define class Recommender(Protocol) with def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ....
  4. Provide class PopularityRecommender: whose recommend method satisfies the Protocol. Use the helper _strong_track_table() already in the file to populate a few demo hits — assign confidence based on internal popularity buckets (you choose how).
  5. Refactor sidebar.py so support_sidebar(query, recommender) takes a Recommender and returns titles of hits where hit.confidence == "high". No numeric thresholds anywhere in sidebar.py.
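For item 4, the bucketing is left to you. One possible choice, sketched with hypothetical cutoffs of 80 and 60 (these numbers are assumptions, not requirements of the tests), keeps the numeric scale inside the recommender so that clients only ever see the label.

```python
from typing import Literal

Confidence = Literal["low", "medium", "high"]


def confidence_for(
    popularity: int, *, high_at: int = 80, medium_at: int = 60
) -> Confidence:
    """Map an internal 0..100 popularity to a label (cutoffs are assumed)."""
    if popularity >= high_at:
        return "high"
    if popularity >= medium_at:
        return "medium"
    return "low"


# Popularity values taken from the demo catalog in recommender.py.
demo_popularities = [95, 78, 62, 55, 88, 40]
labels = [confidence_for(p) for p in demo_popularities]
```

If the scorer later moves to a 0..1 scale, only this mapping changes; code that compares `hit.confidence == "high"` is unaffected.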

Also: write a module guide comment at the top of recommender.py in this exact format (you can fill in the values):

"""
Module guide:
  Primary secret:   <one sentence — name the volatile decision>
  Likely changes:   <bullets — BM25 -> embeddings, score scale shifts, ...>
  Stable contract:  <one sentence — what callers can rely on>
  Not promised:     <bullets — raw scores, bucket IDs, ranking algorithm, ...>
"""

Test 4 will look for those four labels (Primary secret, Likely changes, Stable contract, Not promised). Parnas, Clements, and Weiss called this artifact the module guide in their 1985 paper. It is the lightest-weight design doc you can write that still records why the boundary exists.
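The actual Test 4 is not shown here. As a rough idea of how such a check could work, the sketch below parses a module's source with the standard `ast` module and reports which labels are missing from its docstring. The function name `missing_labels` and the sample text are hypothetical.

```python
import ast

REQUIRED_LABELS = (
    "Primary secret",
    "Likely changes",
    "Stable contract",
    "Not promised",
)


def missing_labels(source: str) -> list[str]:
    """Return the module-guide labels absent from the module docstring."""
    docstring = ast.get_docstring(ast.parse(source)) or ""
    return [label for label in REQUIRED_LABELS if label not in docstring]


# Sample module source with an illustrative guide (values are examples only).
SAMPLE_SOURCE = '''"""
Module guide:
  Primary secret:   how songs are scored and ranked
  Likely changes:   scoring algorithm, score scale
  Stable contract:  recommend() returns SongHit objects with a confidence label
  Not promised:     raw scores, bucket IDs, ranking algorithm
"""
'''

complete = missing_labels(SAMPLE_SOURCE)
incomplete = missing_labels('"""Recommender module."""\n')
```

Running a check like this locally before submitting is a quick way to confirm the docstring is the first statement in the file, which is what makes it the module docstring.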

🪞 Before clicking Next

Once all four tests pass, take 20 seconds and answer in your head:

“If a future engineer reads sidebar.py, can they tell which ranking algorithm runs underneath? Why or why not?”

The right answer (“no — sidebar only sees hit.confidence”) is what you just bought with this refactor. Saying it out loud once makes the next refactor easier.

Starter files
recommender.py
"""STARTING STATE.

Today's design returns ``list[tuple[bucket_id, score, raw_row]]``.
Score scale is 0..30 (BM25-like). Refactor as the instructions ask
so a future swap to embeddings (scale 0..1) does NOT break callers.
"""

# The recommender currently exposes raw scores, bucket IDs, and dict rows.
# That is an over-specified contract. Replace it.

_DEMO_CATALOG = [
    # (track_id, title, artist, internal_popularity_0_to_100)
    ("t1", "Bad Guy",          "Billie Eilish",          95),
    ("t2", "Bury a Friend",    "Billie Eilish",          78),
    ("t3", "Lovely",           "Billie Eilish, Khalid",  62),
    ("t4", "Ocean Eyes",       "Billie Eilish",          55),
    ("t5", "Happier Than Ever","Billie Eilish",          88),
    ("t6", "All The Good Girls","Billie Eilish",         40),
]


def _strong_track_table() -> list[tuple[str, str, str, int]]:
    """Return the demo catalog. Use the int popularity to choose confidence."""
    return list(_DEMO_CATALOG)


def recommend(query: str) -> list[tuple[int, float, dict]]:
    """LEAKY contract — returns (bucket_id, score, raw_row)."""
    raw = _strong_track_table()
    # Pretend BM25 scores in 0..30 derived from the popularity field.
    return [
        (
            i // 3,                       # bucket_id leaks an internal partition
            round(pop * 30 / 100, 2),     # score scale 0..30 leaks the algorithm
            {"track_id": tid, "title": title, "artist": artist, "popularity": pop},
        )
        for i, (tid, title, artist, pop) in enumerate(raw)
    ]


# TODO replace the leaky surface above with:
#   1. ``Confidence = Literal["low", "medium", "high"]``
#   2. ``@dataclass(frozen=True) class SongHit`` with the fields named in the instructions
#   3. ``class Recommender(Protocol)`` with ``recommend(query, *, limit=5) -> list[SongHit]``
#   4. ``class PopularityRecommender`` implementing the Protocol
#
# And add the module-guide docstring at the top of the file.
sidebar.py
"""Client that knows too much.

Refactor ``support_sidebar`` to take a ``Recommender`` and ask for
high-confidence hits — no raw scores, no thresholds.
"""

from recommender import recommend

STRONG_THRESHOLD = 12.0   # BM25 scale assumption — a leak waiting to break.


def support_sidebar(query: str) -> list[str]:
    hits = recommend(query)
    return [row["title"] for (_bucket, score, row) in hits if score >= STRONG_THRESHOLD]


if __name__ == "__main__":
    for title in support_sidebar("billie eilish"):
        print(title)
4

Where Did You Put the Database?

Why this matters

The single most common information-hiding leak in real code is storage. A function that takes a sqlite3.Connection (or a MongoClient, or an S3 handle) and returns rows ties every caller to a specific persistence technology. When the team migrates from SQLite to Postgres, from rows to JSON, from synchronous to async, everything moves. This step is the canonical Parnas case made hands-on. You’ll do the whole routine yourself.

🎯 You will learn to

  • Create a Protocol + dataclass + in-memory implementation from a leaky function — independently, using the five-step routine
  • Apply dependency injection: pass the directory in to the client instead of constructing the storage inside it
  • Evaluate the change-impact radius of a storage migration before and after your refactor

The scene

events.py looks up concerts by city. Today’s implementation uses SQLite, and the function signature reveals it: every client has to hand over a sqlite3.Connection. The product manager wants to add a JSON-file-backed test fixture for offline development, and the SRE wants to migrate the production catalog to a remote HTTP service. Each of those is a separate file rewrite today. Your job is to make each of them one new class.

✏️ Predict before you run

Suppose we keep the current events.py signature and just implement a JSON-file fixture. How many files have to be edited to use it from tour_planner.py?

  • (a) 1 — events.py only.
  • (b) 2 — events.py and tour_planner.py.
  • (c) 3+ — events.py, tour_planner.py, every test that constructs the connection, and any module that builds the SQL table string.
  • (d) 0 — duck typing handles it; pass a JSON dict where a connection is expected.

Commit. After your refactor, the same change will require one new class in events.py and zero edits to tour_planner.py — that is your verification.

Your task

Refactor events.py so the persistence decision is hidden:

  1. Define @dataclass(frozen=True) class Event with title: str, venue: str, date_iso: str, city: str, ticket_price_cents: int.
  2. Define class EventDirectory(Protocol) with def find_in(self, city: str) -> list[Event]: ....
  3. Implement class InMemoryEventDirectory: — constructor takes a list[Event], find_in(city) filters by city. This is your test/fixture implementation.
  4. Implement class SQLiteEventDirectory. Its constructor takes a sqlite3.Connection and a table name; find_in(city) runs the same SQL the original function ran and maps rows to Event. Apart from the demo wiring at the bottom of tour_planner.py, events.py is the only file that may import sqlite3.

Refactor tour_planner.py so affordable_shows(directory, city, max_price_dollars=50) takes an EventDirectory (not a connection). Filter inside the function using event.ticket_price_cents and return a list[Event].
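The dependency-injection move can be sketched in isolation. To keep it short, this illustration trims `Event` to three fields and uses a hypothetical `FixedDirectory` as the injected implementation; neither is the exact shape the tests expect.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Event:
    title: str  # trimmed: the real task also has venue and date_iso
    city: str
    ticket_price_cents: int


class EventDirectory(Protocol):
    def find_in(self, city: str) -> list[Event]: ...


class FixedDirectory:
    """Hypothetical in-memory implementation used as the injected dependency."""

    def __init__(self, events: list[Event]) -> None:
        self._events = list(events)

    def find_in(self, city: str) -> list[Event]:
        return [e for e in self._events if e.city == city]


def affordable_shows(
    directory: EventDirectory, city: str, max_price_dollars: int = 50
) -> list[Event]:
    # The client receives the directory; it never builds or names the storage.
    cents_limit = max_price_dollars * 100
    return [e for e in directory.find_in(city) if e.ticket_price_cents <= cents_limit]


directory = FixedDirectory([
    Event("Tame Impala", "Los Angeles", 6700),
    Event("Local Open Mic", "Los Angeles", 1500),
    Event("Campus Showcase", "Austin", 1000),
])
cheap_titles = [e.title for e in affordable_shows(directory, "Los Angeles")]
wider_titles = [e.title for e in affordable_shows(directory, "Los Angeles", 80)]
```

Because `affordable_shows` never imports a storage library, a test can pass in an in-memory directory and production can pass in a database-backed one without touching the function.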

Add the module guide docstring to events.py using the same four labels you used in Step 3.

You will probably break the implementation-swap test first. The most common cause is forgetting to map raw SQL row tuples back to Event objects in SQLiteEventDirectory.find_in. If the test fails, read its diff carefully — the failure is the lesson, not the verdict.
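The row-mapping pitfall is easy to reproduce. This sketch assumes the same column order as the original query and maps each row positionally with `Event(*row)`; the table contents are a single made-up row from the demo data.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    title: str
    venue: str
    date_iso: str
    city: str
    ticket_price_cents: int


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shows("
    "title TEXT, venue TEXT, date_iso TEXT, city TEXT, ticket_price_cents INT)"
)
conn.execute(
    "INSERT INTO shows VALUES (?, ?, ?, ?, ?)",
    ("Local Open Mic", "Echo Park Bar", "2026-03-15", "Los Angeles", 1500),
)

rows = conn.execute(
    "SELECT title, venue, date_iso, city, ticket_price_cents "
    "FROM shows WHERE city = ?",
    ("Los Angeles",),
).fetchall()

# Raw rows are plain tuples; unpack them in SELECT order to build Events.
events = [Event(*row) for row in rows]

expected = [Event("Local Open Mic", "Echo Park Bar", "2026-03-15", "Los Angeles", 1500)]
raw_rows_match = rows == expected      # tuples never equal Event objects
mapped_rows_match = events == expected
conn.close()
```

Returning `rows` directly would make a swap test fail even though the data is identical, because a tuple and a dataclass instance never compare equal.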

🪞 Before clicking Next

Once all four tests pass, answer this in your head before the quiz:

“After the refactor, affordable_shows calls one method on its parameter. Name that method and explain why that single call is enough to absorb every plausible storage migration (SQLite → Postgres → HTTP → file).”

Starter files
events.py
"""Concert directory.

STARTING STATE: leaks sqlite3 and the row dict shape into every caller.
"""

import sqlite3


def find_events_in_city(
    connection: sqlite3.Connection,
    table: str,
    city: str,
) -> list[dict]:
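    # NOTE: ``table`` is interpolated into the SQL text because SQLite cannot
    # bind identifiers as parameters; only pass trusted table names here.
    rows = connection.execute(
        "SELECT title, venue, date_iso, city, ticket_price_cents "
        f"FROM {table} WHERE city = ?",
        (city,),
    ).fetchall()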
    return [
        {
            "title": r[0],
            "venue": r[1],
            "date_iso": r[2],
            "city": r[3],
            "ticket_price_cents": r[4],
        }
        for r in rows
    ]


# TODO Run the five-step routine yourself:
#   1. Name the change. (One coming: SQLite -> Postgres -> HTTP service.)
#   2. Name the secret. (Persistence technology + schema mapping.)
#   3. Minimum client assumptions. (event has title, venue, date, city, price.)
#   4. Remove the leak.
#        - ``@dataclass(frozen=True) class Event``
#        - ``class EventDirectory(Protocol)`` with ``find_in(city) -> list[Event]``
#        - ``class InMemoryEventDirectory`` (constructor takes list[Event])
#        - ``class SQLiteEventDirectory`` (constructor takes connection + table)
#   5. Verify with a swap. (The hidden test will swap implementations.)
tour_planner.py
"""Client of events.py. Currently knows about sqlite3 by transitive coupling.

Refactor ``affordable_shows`` to take an EventDirectory instead.
"""

from events import find_events_in_city


def affordable_shows(connection, table: str, city: str, max_price_dollars: int = 50):
    cents_limit = max_price_dollars * 100
    events = find_events_in_city(connection, table, city)
    return [e for e in events if e["ticket_price_cents"] <= cents_limit]


if __name__ == "__main__":
    # The demo wires SQLite in *this* file. Outside events.py, this block
    # is the only place the sqlite3 import is allowed AFTER the refactor.
    import sqlite3
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shows("
        "title TEXT, venue TEXT, date_iso TEXT, city TEXT, ticket_price_cents INT)"
    )
    conn.executemany(
        "INSERT INTO shows VALUES (?, ?, ?, ?, ?)",
        [
            ("Sabrina Carpenter",   "The Forum",        "2026-03-01", "Los Angeles", 11500),
            ("Olivia Rodrigo",      "Crypto.com Arena", "2026-03-05", "Los Angeles",  9800),
            ("Tame Impala",         "Hollywood Bowl",   "2026-04-12", "Los Angeles",  6700),
            ("Local Open Mic",      "Echo Park Bar",    "2026-03-15", "Los Angeles",  1500),
        ],
    )
    conn.commit()

    # After refactor:
    #   from events import SQLiteEventDirectory
    #   directory = SQLiteEventDirectory(conn, "shows")
    #   for ev in affordable_shows(directory, "Los Angeles", max_price_dollars=80):
    #       print(ev)
    for ev in affordable_shows(conn, "shows", "Los Angeles", max_price_dollars=80):
        print(ev)
5

Single Choice: Stop Repeating the Provider List

Why this matters

Open any production codebase and search for if provider ==. You’ll find the same alphabetical list of providers in four files. Add a fifth provider and you edit all four — and inevitably miss one, shipping a “feature works on Spotify but silently breaks on Tidal” bug. The SEBook chapter calls this the Single Choice principle: when a system supports several alternatives, only one module should know the exhaustive list. This step makes Single Choice operational. The killer test: you’ll add a fourth provider — invisible to your refactored code — and three client functions will work with it unchanged.

🎯 You will learn to

  • Apply the Single Choice principle by replacing scattered if provider == "..." switches with polymorphism behind a hidden choice point
  • Analyze code for repeated exhaustive lists (the same set of "spotify", "apple_music", "tidal" strings in multiple files is the smell)
  • Create a new provider class that satisfies the StreamingProvider Protocol — and feel that no existing client function had to change to absorb it

The scene

streaming.py has three top-level functions: play_track, share_track, like_track. Each one has the identical if provider == "spotify": ... elif provider == "apple_music": ... elif provider == "tidal": ... ladder. The product manager just said: “Add YouTube Music. Same operations.” The bad design: a new branch in each of the three functions. The good design: one new class. The test enforces the second.

✏️ Predict before you run

Today’s streaming.py repeats the provider list in three functions. If we add YouTube Music in the current design, how many elif branches must be added across the file?

  • (a) 1 — a new branch in one function is enough.
  • (b) 3 — one new branch per function, three total.
  • (c) 4 — three new branches plus a new helper function.
  • (d) 0 — Python’s match statement handles it.

Commit. Then refactor and see the answer for the good design.

Your task

Refactor streaming.py:

  1. Define class StreamingProvider(Protocol) with play(self, track_id) -> str, share(self, track_id, friend) -> str, like(self, track_id) -> str. Each returns the message string that the current code prints.
  2. Define class SpotifyProvider, class AppleMusicProvider, class TidalProvider — each implements all three methods.
  3. Rewrite play_track(provider: StreamingProvider, track_id: str), share_track(...), and like_track(...) so each just delegates to the corresponding method on the passed-in provider — no if/elif/match ladders anywhere.

The hidden test will then construct a fourth provider, YouTubeMusicProvider, which your code has never seen. If your play_track/share_track/like_track functions are properly polymorphic, that fourth provider will Just Work. If any branching on "youtube_music" is needed, the test fails.
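A minimal sketch of why delegation absorbs a new provider, reduced to the `play` operation only. `CampusRadioProvider` is a hypothetical class invented for this illustration; it stands in for any provider written after the client function.

```python
from typing import Protocol


class StreamingProvider(Protocol):
    def play(self, track_id: str) -> str: ...


class SpotifyProvider:
    def play(self, track_id: str) -> str:
        return f"Playing {track_id} on Spotify..."


def play_track(provider: StreamingProvider, track_id: str) -> str:
    # No provider names appear here, so there is no list to extend.
    return provider.play(track_id)


class CampusRadioProvider:
    """Hypothetical provider defined after play_track was written."""

    def play(self, track_id: str) -> str:
        return f"Queued {track_id} on Campus Radio"


known_message = play_track(SpotifyProvider(), "t1")
new_message = play_track(CampusRadioProvider(), "t1")
```

The choice of provider still has to be made somewhere, but it is made once, where the object is constructed, rather than in every function that uses it.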

🪞 Before clicking Next

Once all three tests pass, do this self-check before the quiz:

“Search this tutorial mentally across all four refactors. In each one, you replaced direct exposure of a design decision with what kind of thing? The four answers differ in form, but all are instances of the same move.”

The four are: (Step 2) domain operations, (Step 3) a Protocol + dataclass, (Step 4) dependency injection of a Protocol, (Step 5) polymorphism on a Protocol. Each one is a different way to make a contract not name the volatile decision. The quiz’s last question asks this in MCQ form.

Starter files
streaming.py
"""STARTING STATE.

Three functions, each with the same provider ladder. The "exhaustive
list of providers" is duplicated three times. Refactor with
polymorphism behind a hidden choice point.
"""


def play_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Playing {track_id} on Spotify..."
    elif provider == "apple_music":
        return f"Playing {track_id} on Apple Music..."
    elif provider == "tidal":
        return f"Streaming {track_id} on Tidal hi-fi..."
    else:
        raise ValueError(f"Unknown provider: {provider}")


def share_track(provider: str, track_id: str, friend: str) -> str:
    if provider == "spotify":
        return f"Shared Spotify link {track_id} with {friend}"
    elif provider == "apple_music":
        return f"Sent Apple Music card for {track_id} to {friend}"
    elif provider == "tidal":
        return f"Tidal shared {track_id} to {friend}"
    else:
        raise ValueError(f"Unknown provider: {provider}")


def like_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Liked Spotify track {track_id}"
    elif provider == "apple_music":
        return f"Loved Apple Music track {track_id}"
    elif provider == "tidal":
        return f"Added Tidal track {track_id} to favorites"
    else:
        raise ValueError(f"Unknown provider: {provider}")


# TODO Replace the ladders with:
#   1. ``class StreamingProvider(Protocol)`` (play, share, like)
#   2. ``SpotifyProvider``, ``AppleMusicProvider``, ``TidalProvider``
#   3. Rewrite play_track / share_track / like_track to delegate

if __name__ == "__main__":
    print(play_track("spotify", "t1"))
    print(share_track("apple_music", "t1", "Alex"))
    print(like_track("tidal", "t9"))
6

Predict the Blast Radius

Why this matters

Information hiding is verified by simulating change — Parnas’s original test, and the one industry calls change impact analysis. A real engineer’s job isn’t to recite that classes should depend on abstractions. It’s to read a system and predict: if this changes, what else changes? This step is your final exam for the tutorial: a fresh, never-seen MusicShare app with five modules, three plausible change requests, and one honest tradeoff. No coding. Pure judgment.

🎯 You will learn to

  • Predict the change-impact radius of a plausible future change in a small system before attempting the change
  • Evaluate when a layer of information hiding pays for itself — and when it adds cognitive overhead without proportional benefit
  • Apply the five-step routine on a system you’ve never seen before

The MusicShare app

MusicShare ships a web UI for discovering and sharing music. Its five real modules:

| Module | Public surface (the contract) | Hidden secret |
|---|---|---|
| recommender.py | Recommender(Protocol).recommend(query, *, limit) -> list[SongHit] | scoring / ranking algorithm |
| streaming.py | StreamingProvider(Protocol) + play_track / share_track / like_track | which streaming service is used today |
| playlist.py | Playlist class with add, top_tracks(n), total_duration_minutes(), average_popularity(), __len__ | internal storage representation |
| events.py | EventDirectory(Protocol).find_in(city) -> list[Event] | which persistence backend stores concert listings |
| ui.py | HTTP handlers for /search, /share, /like, /concerts/<city> | how requests are routed / rendered to HTML |

Plus the wiring layer (composition_root.py) that picks today’s concrete Recommender, StreamingProvider, and EventDirectory instances.
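The contents of composition_root.py are not shown in this tutorial. As a rough sketch of the idea, under the assumption that wiring is a single function, the example below uses hypothetical stub classes and a simplified `find_in` that returns titles only. The point is that concrete class names appear in exactly one place.

```python
from dataclasses import dataclass
from typing import Protocol


class EventDirectory(Protocol):
    def find_in(self, city: str) -> list[str]: ...  # simplified: titles only


class StubSQLiteDirectory:
    """Hypothetical stand-in for a SQLite-backed directory."""

    def find_in(self, city: str) -> list[str]:
        return ["Local Open Mic"] if city == "Los Angeles" else []


class StubHttpDirectory:
    """Hypothetical stand-in for a remote-service-backed directory."""

    def find_in(self, city: str) -> list[str]:
        return ["Local Open Mic"] if city == "Los Angeles" else []


@dataclass(frozen=True)
class Ui:
    """Stand-in for ui.py: knows the Protocol, never a concrete class."""

    events: EventDirectory

    def concerts(self, city: str) -> str:
        return ", ".join(self.events.find_in(city)) or "no shows"


def build_ui(backend: str = "sqlite") -> Ui:
    # The only place that names concrete implementations.
    directories = {"sqlite": StubSQLiteDirectory, "http": StubHttpDirectory}
    return Ui(events=directories[backend]())


sqlite_page = build_ui("sqlite").concerts("Los Angeles")
http_page = build_ui("http").concerts("Los Angeles")
```

When predicting a blast radius, a backend swap in a layout like this would touch the module that defines the new class and the wiring function, and nothing else.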

Work through the four quiz questions below. The first three exercise change-impact reasoning on this system; the fourth tests an honest tradeoff (worth re-reading the When NOT to apply section of the chapter before answering).

Starter files
SYSTEM.md
# MusicShare system map

Five modules + wiring:

- recommender.py    — Recommender Protocol; today's concrete is PopularityRecommender.
                      Secret: scoring / ranking algorithm.
- streaming.py      — StreamingProvider Protocol; today's concretes are
                      SpotifyProvider, AppleMusicProvider, TidalProvider.
                      Secret: which streaming service is used.
- playlist.py       — Playlist class; secret: internal storage representation.
- events.py         — EventDirectory Protocol; today's concrete is SQLiteEventDirectory.
                      Secret: which persistence backend stores concert listings.
- ui.py             — HTTP handlers for /search, /share, /like, /concerts/<city>.
                      Calls only the public surfaces of the four modules above (three
                      Protocols plus the Playlist class), never the concrete
                      implementations behind the Protocols.
- composition_root.py — picks today's concrete implementations and hands them to ui.py.

The quiz on the right asks you to predict, for several plausible future
changes, WHICH modules need to be edited. There is no code to write.