Information Hiding in Python Tutorial

1

Private Is Not a Secret

Why this matters

Most CS students learn private before they learn what it’s for. So when someone asks “did you hide the information?”, they answer “yes, the field is private.” That answer is wrong often enough that experienced reviewers learn to check for it. The next five minutes are an inoculation: you will take a class whose fields are “private” and use its own public API to plant a fake entry, proving the secret leaked anyway.

🎯 You will learn to

Distinguish information hiding (a design decision about who is allowed to know what) from private (a syntax feature that helps enforce it, when used carefully)
Analyze a public method signature for representation leaks — even when every field uses the _underscore convention

✏️ Predict before you run

Open watch_history.py. Every field starts with _ — Python’s convention for “private.” Will an outside caller still be able to make history.recent() return [..., {"title": "Pirated Movie", "year": 1999}] without calling history.add(...)?

(a) No — the underscore convention prevents outside code from touching _history.
(b) No — the only public modifier is add(), so _history can only change via add().
(c) Yes — because recent() returns the actual internal list, the caller can mutate it from outside.
(d) Compile error — Python won’t let you index into a _ -prefixed attribute.

Commit to a letter, then try the task.

Reveal (after you've tried it)

Answer **(c)**. The "private" field has not been hidden, only renamed. `recent()` hands the caller a *reference* to `self._history`, which is a `list`, and `list.append(...)` mutates in place. Visibility modifiers stop nothing on their own. The **representation decision** ("storage is a list of dicts you may mutate by reference") has leaked through the public API.

Three other ways this same class still leaks even if `recent()` returned `tuple(self._history)`:

1. Callers see the element type (`dict[str, ...]`) and start writing code that depends on the keys.
2. Switching `_history` to a `dict[str, Episode]` keyed by title would change what `recent()` returns and break every caller.
3. `for entry in history.recent():` quietly depends on iteration order.

Information hiding isn't a marker on a field; it's a decision about *which design decisions clients may depend on*. The next five steps train that judgment.

Your task — make the leak happen

Open watch_history.py. Below the # TODO marker, write one line that — using only the public API history.recent() — plants a {"title": "Pirated Movie", "year": 1999} entry. Do not call history.add(...). When the test runs, history.recent() must return a list containing that planted entry.

The point isn’t that you should write code like this. The point is that the design lets you, even though the author thought they had hidden the field.

Starter files

watch_history.py

class WatchHistory:
    """Stores what you have watched.

    The author thought ``_history`` was 'hidden' because of the
    underscore convention.
    """

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> list[dict]:
        # The author meant this as a read-only view.
        # It is not. Find out why in the task below.
        return self._history


if __name__ == "__main__":
    history = WatchHistory()
    history.add("Stranger Things", 2016)
    history.add("Severance", 2022)

    # TODO: Without calling ``history.add(...)``, plant the fake entry
    # ``{"title": "Pirated Movie", "year": 1999}`` so it shows up in
    # ``history.recent()``. One line is enough. Use only the public
    # API — meaning the method ``history.recent()``.
    # planted = ...

    print(history.recent())

Solution

watch_history.py

class WatchHistory:
    """Stores what you have watched.

    The author thought ``_history`` was 'hidden' because of the
    underscore convention.
    """

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> list[dict]:
        return self._history


if __name__ == "__main__":
    history = WatchHistory()
    history.add("Stranger Things", 2016)
    history.add("Severance", 2022)

    # The leak: ``recent()`` returns the same list object that
    # ``_history`` points to. Mutating that list mutates the field.
    history.recent().append({"title": "Pirated Movie", "year": 1999})

    print(history.recent())

What you just proved. The author marked _history as private and thought clients could only modify the state via add(). But recent() returns a reference to the same list — and list.append(...) mutates in place. One line of “client code” bypassed the entire intended invariant.
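A minimal sketch of why that one line works, using the same WatchHistory shape as the starter file. The variable names `view`, `same_object`, and `planted` are only for illustration. Python's `is` operator confirms that `recent()` returns the stored list itself rather than a copy.

```python
class WatchHistory:
    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> list[dict]:
        return self._history  # hands out the internal list itself


history = WatchHistory()
history.add("Severance", 2022)

view = history.recent()
same_object = view is history.recent()  # no copy is made on either call
view.append({"title": "Pirated Movie", "year": 1999})

planted = history.recent()[-1]["title"]
print(same_object, len(history.recent()), planted)
```

Because both names refer to one object, any in-place list operation on the return value (append, pop, sort, clear) is also an operation on the field.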

The fix is not “make the underscore double”. Even __history only triggers name-mangling (_WatchHistory__history), reachable from outside if you really want. The fix is structural: recent() should return an immutable view of the data as domain objects, not the internal list of dicts. Something like:

from dataclasses import dataclass

@dataclass(frozen=True)
class WatchedShow:
    title: str
    year: int

class WatchHistory:
    def __init__(self) -> None:
        self._shows: list[WatchedShow] = []

    def add(self, title: str, year: int) -> None:
        self._shows.append(WatchedShow(title, year))

    def recent(self) -> tuple[WatchedShow, ...]:
        return tuple(self._shows)   # immutable view, domain objects

Now: (1) clients cannot mutate _shows through the return value, and (2) the return type is a domain object (WatchedShow) rather than a dict, so the storage decision can change later without breaking callers. You will do exactly this kind of refactor in the next step, on a richer example.
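As a quick check of claim (1), the sketch below repeats the fixed class and tries the two obvious attacks. The flags `append_blocked` and `edit_blocked` are illustrative names, not part of the tutorial's tests.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class WatchedShow:
    title: str
    year: int


class WatchHistory:
    def __init__(self) -> None:
        self._shows: list[WatchedShow] = []

    def add(self, title: str, year: int) -> None:
        self._shows.append(WatchedShow(title, year))

    def recent(self) -> tuple[WatchedShow, ...]:
        return tuple(self._shows)


history = WatchHistory()
history.add("Severance", 2022)
shows = history.recent()

try:
    shows.append(WatchedShow("Pirated Movie", 1999))  # type: ignore[attr-defined]
    append_blocked = False
except AttributeError:  # tuples have no append method
    append_blocked = True

try:
    shows[0].year = 1999  # type: ignore[misc]
    edit_blocked = False
except FrozenInstanceError:  # frozen dataclasses reject attribute assignment
    edit_blocked = True

print(append_blocked, edit_blocked, len(history.recent()))
```

Both attempts fail, and the history still holds exactly one entry.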

Step 1 — Knowledge Check

Min. score: 80%

1. Every field in WatchHistory starts with _. Which statement best describes whether the storage representation is hidden?

Yes — the leading underscore is Python’s convention for private, so external code cannot access _history.
The underscore is a convention, not enforcement. But that’s the small problem — the big problem is what the public API returns. The list itself escapes through recent(), no underscore-bypassing needed.
Yes — the public method recent() is the only way clients see the data, so it is encapsulated.
Having only one public read method is not the same as hiding. If that method exposes the internal representation by reference, clients can still depend on the representation. Encapsulation (one access point) ≠ information hiding (that access point reveals nothing volatile).
No — recent() hands back the actual list, so callers can mutate it and depend on its type.
No — leading underscores have no enforcement in Python at all; everything is public anyway.
Half right — underscores aren’t enforced. But the deeper leak isn’t ‘someone reaches in to _history directly.’ It is that recent() hands out the list, which is the same thing in effect with no underscore-poking required. Read what the public method returns, not just what’s named with _.

The fields are technically marked private by convention, but recent() returns the actual list. Any caller can mutate it (you just did) or write code that assumes “it is a list, I can iterate it in insertion order, I can index it by integer, I can .append() to it” — all of which the author did not intend. Hiding the field name doesn’t hide the design decision (“storage = mutable list of dicts”).

2. Which future change does the current WatchHistory design make expensive — meaning many callers would have to be edited?

Adding a new dict key, like a rating, alongside title and year.
Adding a key affects what’s inside each entry, but callers that read existing keys still work. It is a small ripple at worst.
Renaming the private attribute _history to _titles_watched.
Renaming an underscore-prefixed attribute is a local change. No caller mentions the attribute name.
Switching internal storage from list[dict] to dict[str, Episode].
Adding a brand-new public method like clear().
Adding a new method extends the public API without breaking existing callers. Open/Closed Principle in action.

A switch from list to dict[str, Episode] would change what recent() returns. Every caller that does for entry in wh.recent(): expects each entry to be a dict with a title key; iterating a dict yields its keys (plain strings), so entry["title"] would fail. Every caller that does wh.recent().append(...) (the leak you just exploited) would crash, because a dict has no append. The internal representation has leaked into the contract because the return type of the public method exposes it.
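The ripple can be seen in a few lines. This is a sketch under assumptions: `Episode` is a hypothetical stand-in for the type named in the question, and `titles` represents any client function written against the list-of-dicts return value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:  # hypothetical stand-in for the Episode type in the question
    title: str
    year: int


def titles(entries) -> list[str]:
    """A typical client written against the list-of-dicts return value."""
    return [entry["title"] for entry in entries]


as_list = [{"title": "Severance", "year": 2022}]
as_dict = {"Severance": Episode("Severance", 2022)}

list_result = titles(as_list)  # works with the original representation

try:
    titles(as_dict)
    loop_error = None
except TypeError as exc:
    # Iterating a dict yields its keys, so entry is a str and entry["title"] fails.
    loop_error = type(exc).__name__

try:
    as_dict.append({"title": "Pirated Movie", "year": 1999})  # type: ignore[attr-defined]
    append_error = None
except AttributeError as exc:
    append_error = type(exc).__name__  # dicts have no append method

print(list_result, loop_error, append_error)
```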

3. Which is the most accurate one-line definition of information hiding in Parnas’s sense?

Marking fields and methods private so external code can’t access them.
Access modifiers help enforce information hiding when used well, but they don’t define the principle. You can mark every field private and still leak the secret through return types and method signatures.
Splitting a program into small files and functions.
Splitting code is modularization. Information hiding is the criterion for how to split — by hidden decisions, not by processing steps.
Each likely-to-change decision lives in one module, behind a stable interface.
Returning copies instead of references from getter methods.
Returning copies prevents one leak (mutation). It is a useful tactic, not the principle itself.

Parnas’s 1972 definition is that modules should be organized around design decisions that are difficult or likely to change. Each such decision lives in one module; other modules interact with it through an interface that does not reveal the decision. private, returning copies, small files — all useful tactics that may help. But the principle is about which design decisions get hidden, and from whom.

2

The Playlist's Secret

Why this matters

Step 1 showed you a leak the size of one line. Real codebases hide leaks the size of an org chart — and the same leak forces three teams to coordinate every time the data shape changes. The fastest way to feel the difference is to do a small refactor on a Playlist class and then run the same client tests against a different internal storage. If the client breaks, the secret was never hidden.

Connection to the chapter’s KWIC example. Parnas (1972) made the same point at the system level using a Key Word In Context indexer: decomposing by processing steps (input → shift → sort → print) spread the line-storage decision across every module, so a change to line storage broke every module. Decomposing by hidden decisions (line storage, shift generation, sort ordering) kept each change local. This step plays out Parnas’s argument at the class level on a Playlist.
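To make the contrast concrete, here is a minimal sketch of a decomposition by hidden decisions. It is not Parnas's code: `LineStorage`, `circular_shifts`, and `kwic_index` are hypothetical names chosen for illustration, and storing each line as a tuple of words is an assumption. Only `LineStorage` knows how lines are stored; the other functions go through its query methods.

```python
class LineStorage:
    """Owns one secret: how lines are stored (here, a tuple of words per line)."""

    def __init__(self) -> None:
        self._lines: list[tuple[str, ...]] = []

    def add_line(self, text: str) -> None:
        self._lines.append(tuple(text.split()))

    def line_count(self) -> int:
        return len(self._lines)

    def word_count(self, line: int) -> int:
        return len(self._lines[line])

    def word(self, line: int, index: int) -> str:
        return self._lines[line][index]


def circular_shifts(storage: LineStorage) -> list[str]:
    """Every rotation of every line, built only from the query methods."""
    shifts = []
    for line in range(storage.line_count()):
        n = storage.word_count(line)
        for start in range(n):
            words = [storage.word(line, (start + k) % n) for k in range(n)]
            shifts.append(" ".join(words))
    return shifts


def kwic_index(storage: LineStorage) -> list[str]:
    """Sort ordering is a separate decision, kept out of LineStorage."""
    return sorted(circular_shifts(storage), key=str.lower)


storage = LineStorage()
storage.add_line("Pipes and Filters")
index = kwic_index(storage)
print(index)
```

If `LineStorage` later packs words into one string or moves to a file, `circular_shifts` and `kwic_index` stay untouched as long as the three query methods keep their meaning.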

🎯 You will learn to

Apply the five-step routine for hiding a secret: name the change, name the secret, list the minimum client assumptions, remove the leak, verify with a swap test
Analyze which lines of client code depend on the internal representation
Create a new domain operation on the hidden module without changing any client

The scene

You are building a music app. Playlist exposes a raw tracks: list[dict]. Three client features reach in: a textual summary, a top-picks list, and a “party-ready?” check. Run app.py and see the current behavior. Then the product manager files a ticket: “Add remove(title) — needs to be O(1).” That requires switching internal storage from list to dict keyed by title. Run the same client code afterward and watch it break.

✏️ Predict before you run

Look at playlist.py and app.py. If we change Playlist.tracks from a list[dict] to a dict[str, Track] (keyed by title, for O(1) remove), how many files will need to change to keep the demo running?

(a) Only playlist.py — the field is private-ish; clients should not care.
(b) playlist.py and app.py — every loop and indexing operation in app.py will break.
(c) Only app.py — Playlist itself is fine.
(d) Zero files — Python’s duck typing handles it.

Commit to a letter. The first test enforces this.

The five-step routine

You will use this exact routine in every refactor from here on. Memorize the headings, not the lines:

Name the change.        What is about to change, and why is it likely?
Name the secret.        Which design decision should one module own?
Minimum client assumptions.   What does the client *actually* need to know?
Remove the leak.        Replace exposed representation with domain operations.
Verify with a swap.     Same client, different implementation, same output.

Filled in for Playlist:

Change:    Internal storage may move from list to dict for O(1) remove.
Secret:    How tracks are stored and ordered for retrieval.
Client needs:  top N by popularity, total minutes, average popularity, count.
Remove:    Stop exposing ``tracks``; expose ``top_tracks(n)``,
              ``total_duration_minutes()``, ``average_popularity()``, ``__len__``.
Verify:    Hidden test runs the same client against a dict-backed Playlist.

Your task

Refactor playlist.py so the public API is domain operations, not a raw list:

add(title, artist, duration_sec, popularity)
top_tracks(n) -> list[Track]          # n highest by popularity
total_duration_minutes() -> float
average_popularity() -> float         # 0 if empty
__len__() -> int

Define a Track dataclass with the four attributes (title, artist, duration_sec, popularity) so top_tracks returns domain objects, not dicts.

Refactor app.py so describe_playlist and is_party_appropriate use only the four query methods above (top_tracks, total_duration_minutes, average_popularity, and len()). They must not touch playlist.tracks, _tracks, or any dict keys.

One function is up to you: in app.py, write is_party_appropriate(playlist), which returns True if total duration > 90 minutes AND average popularity > 60.

🪞 Before clicking Next

Once all four tests pass, take 15 seconds and answer in your own words — out loud, to a rubber duck, or in your head:

In ≤25 words, what did this refactor actually buy you? Constraint: don’t use the words “private”, “list”, or “encapsulate” — those are mechanisms, and the answer is about a decision.

Forcing yourself to say it without those words is the point. If your sentence keeps reaching for them, you are still describing the implementation, not the design choice. The first quiz question rewards you for finding the decision-level wording.

The self-explanation effect (Chi et al., 1994) says producing the explanation yourself strengthens learning more than re-reading does. The forbidden-word constraint comes from Variation Theory (Marton): forcing different language for the same concept is what makes the schema transferable.

Starter files

playlist.py

"""Music playlist.

STARTING STATE: leaks the internal list[dict] through ``tracks``.
Refactor so the only public surface is *domain operations*.
"""


class Playlist:
    def __init__(self) -> None:
        self.tracks: list[dict] = []

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self.tracks.append({
            "title": title,
            "artist": artist,
            "duration_sec": duration_sec,
            "popularity": popularity,
        })

# TODO (Step 4 of the routine — "Remove the leak"):
#   1. Define a ``Track`` dataclass with title, artist, duration_sec, popularity.
#   2. Replace ``self.tracks`` with a hidden ``self._tracks``.
#   3. Add the four domain methods listed in the instructions.

app.py

"""Client of Playlist. Currently reaches into the raw list.

Refactor ``describe_playlist`` and ``is_party_appropriate`` to use
ONLY the new domain methods on Playlist (no ``playlist.tracks``,
no dict indexing).
"""

from playlist import Playlist


def describe_playlist(playlist: Playlist) -> str:
    total_minutes = sum(t["duration_sec"] for t in playlist.tracks) / 60
    avg_popularity = (
        sum(t["popularity"] for t in playlist.tracks) / len(playlist.tracks)
        if playlist.tracks else 0
    )
    top_three = sorted(playlist.tracks, key=lambda t: t["popularity"], reverse=True)[:3]

    lines = [
        f"{len(playlist.tracks)} tracks, "
        f"{total_minutes:.1f} min, "
        f"avg popularity {avg_popularity:.0f}"
    ]
    lines.append("Top picks:")
    for t in top_three:
        lines.append(f"  - {t['title']} by {t['artist']}")
    return "\n".join(lines)


def is_party_appropriate(playlist: Playlist) -> bool:
    # TODO: rewrite using only Playlist's new domain methods.
    # Spec: True iff total duration > 90 min AND avg popularity > 60.
    raise NotImplementedError


if __name__ == "__main__":
    p = Playlist()
    p.add("Bad Guy", "Billie Eilish", 194, 95)
    p.add("Levitating", "Dua Lipa", 203, 88)
    p.add("Blinding Lights", "The Weeknd", 200, 92)
    p.add("Heat Waves", "Glass Animals", 238, 80)
    p.add("As It Was", "Harry Styles", 167, 90)
    print(describe_playlist(p))
    print("party-ready?", is_party_appropriate(p))

Solution

playlist.py

"""Music playlist with the storage decision hidden behind domain operations.

The 5-step routine, annotated below as it appears in the code:
  1. Change       : list -> dict for O(1) remove (anticipated next sprint)
  2. Secret       : how tracks are stored AND how they are queried
  3. Client needs : top-N, total minutes, avg popularity, count
  4. Remove leak  : Track dataclass + four domain methods, no exposed list
  5. Verify swap  : DictBackedPlaylist in the test produces identical output
"""

from dataclasses import dataclass


# --- Subgoal 4a: domain object (so callers never see raw dicts) ---
@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


class Playlist:
    # --- Subgoal 4b: hide the representation behind an underscore ---
    def __init__(self) -> None:
        self._tracks: list[Track] = []

    # --- Subgoal 4c: domain operations only, the internal list is never returned ---
    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self._tracks.append(Track(title, artist, duration_sec, popularity))

    def top_tracks(self, n: int) -> list[Track]:
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self) -> float:
        return sum(t.duration_sec for t in self._tracks) / 60

    def average_popularity(self) -> float:
        if not self._tracks:
            return 0.0  # keep the declared float return type
        return sum(t.popularity for t in self._tracks) / len(self._tracks)

    def __len__(self) -> int:
        return len(self._tracks)

app.py

from playlist import Playlist


def describe_playlist(playlist: Playlist) -> str:
    lines = [
        f"{len(playlist)} tracks, "
        f"{playlist.total_duration_minutes():.1f} min, "
        f"avg popularity {playlist.average_popularity():.0f}"
    ]
    lines.append("Top picks:")
    for t in playlist.top_tracks(3):
        lines.append(f"  - {t.title} by {t.artist}")
    return "\n".join(lines)


def is_party_appropriate(playlist: Playlist) -> bool:
    return playlist.total_duration_minutes() > 90 and playlist.average_popularity() > 60


if __name__ == "__main__":
    p = Playlist()
    p.add("Bad Guy", "Billie Eilish", 194, 95)
    p.add("Levitating", "Dua Lipa", 203, 88)
    p.add("Blinding Lights", "The Weeknd", 200, 92)
    p.add("Heat Waves", "Glass Animals", 238, 80)
    p.add("As It Was", "Harry Styles", 167, 90)
    print(describe_playlist(p))
    print("party-ready?", is_party_appropriate(p))

The routine you just ran:

Named the change. “Storage may move from list to dict for O(1) remove.”
Named the secret. “How tracks are stored and queried.”
Listed the minimum client assumptions. Top N, total minutes, avg popularity, count.
Removed the leak. The four domain methods became the entire public surface. _tracks is internal.
Verified with a swap. The hidden test built DictBackedPlaylist with totally different storage. Your app.py produced identical output. That is the operational proof that the secret is hidden.
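The hidden test is not shown in this tutorial, so the following is only a sketch of what such a swap test might look like. `DictBackedPlaylist` follows the description above (dict keyed by title); `SAMPLE` and `build` are hypothetical helpers introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


class Playlist:
    """List-backed storage, as in the solution."""

    def __init__(self) -> None:
        self._tracks: list[Track] = []

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self._tracks.append(Track(title, artist, duration_sec, popularity))

    def top_tracks(self, n: int) -> list[Track]:
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self) -> float:
        return sum(t.duration_sec for t in self._tracks) / 60

    def average_popularity(self) -> float:
        if not self._tracks:
            return 0.0
        return sum(t.popularity for t in self._tracks) / len(self._tracks)

    def __len__(self) -> int:
        return len(self._tracks)


class DictBackedPlaylist:
    """Same operations, storage keyed by title (an assumed swap class)."""

    def __init__(self) -> None:
        self._by_title: dict[str, Track] = {}

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self._by_title[title] = Track(title, artist, duration_sec, popularity)

    def top_tracks(self, n: int) -> list[Track]:
        return sorted(self._by_title.values(), key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self) -> float:
        return sum(t.duration_sec for t in self._by_title.values()) / 60

    def average_popularity(self) -> float:
        if not self._by_title:
            return 0.0
        return sum(t.popularity for t in self._by_title.values()) / len(self._by_title)

    def __len__(self) -> int:
        return len(self._by_title)


def describe_playlist(playlist) -> str:
    """The refactored client: it only calls domain operations."""
    lines = [
        f"{len(playlist)} tracks, "
        f"{playlist.total_duration_minutes():.1f} min, "
        f"avg popularity {playlist.average_popularity():.0f}"
    ]
    lines.append("Top picks:")
    for t in playlist.top_tracks(3):
        lines.append(f"  - {t.title} by {t.artist}")
    return "\n".join(lines)


SAMPLE = [
    ("Bad Guy", "Billie Eilish", 194, 95),
    ("Levitating", "Dua Lipa", 203, 88),
    ("Blinding Lights", "The Weeknd", 200, 92),
    ("Heat Waves", "Glass Animals", 238, 80),
    ("As It Was", "Harry Styles", 167, 90),
]


def build(factory):
    """Fill a fresh playlist of the given class with the sample tracks."""
    playlist = factory()
    for row in SAMPLE:
        playlist.add(*row)
    return playlist


list_output = describe_playlist(build(Playlist))
dict_output = describe_playlist(build(DictBackedPlaylist))
print(list_output == dict_output)
```

One caveat of this particular swap class: keying by title means two tracks with the same title would overwrite each other, so the two backings only agree when titles are unique, as they are in the sample data.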

What you didn’t have to do. You didn’t have to write the dict-backed version. You didn’t have to predict whether it would be a dict, a B-tree, a database, or a remote service. You bought the option to swap any of those in later, at the cost of writing four small methods today.

One trap to remember. Marking Track with @dataclass(frozen=True) was deliberate. If Track were mutable, the objects returned by top_tracks(3) could be modified by callers, re-leaking the data. Frozen dataclasses are the cheapest way to make a domain object both typed and safe to hand out. (Representing each track as a plain tuple would also be immutable, but it loses the named fields.)
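The trap is easy to reproduce. In this sketch, `MutableTrack`, `FrozenTrack`, and the `track_type` constructor argument are hypothetical devices for comparing the two variants side by side; the tutorial's Playlist takes no such argument.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass
class MutableTrack:
    title: str
    popularity: int


@dataclass(frozen=True)
class FrozenTrack:
    title: str
    popularity: int


class Playlist:
    def __init__(self, track_type) -> None:
        self._track_type = track_type  # illustration only: lets us compare both variants
        self._tracks: list = []

    def add(self, title: str, popularity: int) -> None:
        self._tracks.append(self._track_type(title, popularity))

    def top_tracks(self, n: int) -> list:
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]


leaky = Playlist(MutableTrack)
leaky.add("Bad Guy", 95)
leaky.add("Levitating", 88)
leaky.top_tracks(1)[0].popularity = 0   # edits the object stored inside the playlist
new_top = leaky.top_tracks(1)[0].title  # the ranking was changed from outside

safe = Playlist(FrozenTrack)
safe.add("Bad Guy", 95)
try:
    safe.top_tracks(1)[0].popularity = 0  # type: ignore[misc]
    blocked = False
except FrozenInstanceError:
    blocked = True

print(new_top, blocked)
```

The returned list is a fresh copy in both cases; what differs is whether the elements inside it can be edited.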

Step 2 — Knowledge Check

Min. score: 80%

1. In Parnas’s terms, what is Playlist’s secret after your refactor?

The fact that _tracks is named with an underscore.
Naming is a clue, not a secret. The secret is a design decision, not a syntactic marker.
The order in which tracks were added.
Insertion order may be a behavior the public API promises (e.g., ‘top_tracks ties break by insertion order’), but it is not the central secret. The storage decision behind it is.
How tracks are stored and how queries like ‘top N by popularity’ are computed.
The exact value of the popularity threshold for party-readiness (60).
That threshold is a client policy, owned by app.is_party_appropriate. It lives in the client because the client is the one who decides what ‘party-ready’ means. Playlist itself does not see that constant.

A module’s secret is the volatile design decision it owns. Playlist owns how tracks are stored and how they are queried. The four public methods are the stable contract; the storage choice (list, dict, B-tree, eventually a database) is the hidden decision that can change without forcing changes elsewhere. The swap test you just passed is the operational proof.

2. Suppose a teammate proposes a smaller fix for the Step 1 WatchHistory leak: keep storage as list[dict], but have recent() return tuple(self._history) (an immutable copy). They argue: “Now nobody can mutate it — the secret is hidden.” Are they right, in Parnas’s sense?

Yes — fields are private and the return is immutable, so nothing leaks.
A real improvement, but only for mutation leaks. Clients can still write code that depends on the shape of each element. Changing the element type still ripples.
Not necessarily — the element type is still part of the contract.
Yes — immutability is the entire point of information hiding.
Immutability prevents accidental mutation. Information hiding is the broader discipline of choosing which design decisions clients may depend on. Element shape is one of those decisions.
No — Python tuples are not really immutable; you can still mutate them with __setitem__.
Python tuples are immutable in the sense that matters here — t[0] = ... raises TypeError, and tuple has no __setitem__. But this isn’t the issue with the design.

Returning an immutable copy stops callers from adding or removing entries, but it doesn’t hide the element type, and the dicts inside the tuple are still the same mutable objects the class stores. If recent() returns tuple[dict, ...], clients still depend on each entry being a dict with specific keys, so changing it to a WatchedShow dataclass still breaks them. That is exactly why your Playlist refactor did two things, not one: a frozen Track dataclass and a public surface of domain methods. Together they hide both the mutation channel and the element shape. Information hiding is about which decisions are visible, not just which writes are prevented.
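A short sketch of the teammate's proposal shows what it does and does not stop. The class below is the Step 1 WatchHistory with only recent() changed; the variable names are illustrative.

```python
class WatchHistory:
    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> tuple[dict, ...]:
        return tuple(self._history)  # new tuple, same dict objects inside


history = WatchHistory()
history.add("Severance", 2022)
snapshot = history.recent()

try:
    snapshot.append({"title": "Pirated Movie", "year": 1999})  # type: ignore[attr-defined]
    append_blocked = False
except AttributeError:
    append_blocked = True  # the tuple itself cannot grow

# The client still has to know that each entry is a dict with a "title" key...
first_title = snapshot[0]["title"]

# ...and the dicts are shared with the class, so they can still be edited.
snapshot[0]["title"] = "Pirated Movie"
leaked_title = history.recent()[0]["title"]

print(append_blocked, first_title, leaked_title)
```

The copy is shallow: the container is protected, but the element type and the elements themselves are still exposed.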

3. (Select all that apply.) Which of these future changes can your refactored Playlist absorb without forcing any change in app.py?

Switching internal storage from a list to a dict keyed by title.
Adding a new field release_year to Track (the dataclass).
Adding an optional field with a default value is absorbed; clients ignore it. But if you give it no default, every caller of Track(...) must pass it, and Playlist.add calls Track(...). So it’s a ‘sometimes yes, sometimes no’, and we’ve marked it as partial credit.
Adding a new domain method remove(title) to Playlist.
Renaming the public method top_tracks to most_popular.
Renaming a public method on Playlist by definition breaks every client that calls it. That is the opposite of an absorbed change — you have changed the contract.

The shape of “what changes are local” follows directly from the contract you exposed. Internal storage choice is hidden, so (1) is absorbed. New domain methods are additive, so (3) is absorbed. Renaming a method on the contract is a contract change, so (4) ripples. Adding fields can be local — but it depends on whether you preserve the constructor signature (defaults vs. required), which is why (2) is partial credit.
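The defaults-versus-required distinction for option (2) can be sketched as follows. `TrackWithDefault` and `TrackRequired` are hypothetical names for the two ways release_year could be added.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrackWithDefault:
    title: str
    artist: str
    duration_sec: int
    popularity: int
    release_year: Optional[int] = None  # default keeps the old constructor call valid


@dataclass(frozen=True)
class TrackRequired:
    title: str
    artist: str
    duration_sec: int
    popularity: int
    release_year: int  # no default: every existing Track(...) call must change


old_call_ok = TrackWithDefault("Bad Guy", "Billie Eilish", 194, 95)

try:
    TrackRequired("Bad Guy", "Billie Eilish", 194, 95)  # type: ignore[call-arg]
    old_call_breaks = False
except TypeError:  # missing required argument
    old_call_breaks = True

print(old_call_ok.release_year, old_call_breaks)
```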

4. You read a teammate’s class and the only thing wrong is that it exposes get_internal_list() -> list[dict]. They argue: “It’s only used by one client right now, so it can’t be a leak.” What is the strongest single counter?

The method’s name uses the word internal, so it’s a smell.
Naming is a hint, not the argument. A method called current_track() could leak just as badly.
The cost shows up when the storage decision changes — every client written by then is also coupled.
Returning a list is always wrong in Python; tuples are required by best practice.
Lists are fine to return; the question is whether the return type is a leak. Often tuple[DomainObject, ...] is the safer shape, but not always.
Information hiding requires zero exposed methods. Even get_internal_list() is too many.
Information hiding never says ‘expose nothing’. It says ‘expose only the stable contract’. Plenty of legitimate methods exist.

This is the change-localization argument and it is the strongest reason in Parnas’s framing. The leak costs nothing at the moment of writing — it costs everything the moment the storage representation needs to change. By then, every client written between now and then will also depend on the leak, multiplying the eventual edit cost. Studies of professional developers find program comprehension (the activity that becomes painful when a representation change must be traced through many files) consumes around 58% of their working time (Xia et al., IEEE TSE 2017). Information hiding is one of the cheapest ways to keep that number from compounding.

5. Spaced retrieval from Step 1. Your Playlist._tracks is named with one underscore. Does the underscore alone hide the storage decision from app.py?

Yes — Python respects the underscore convention strictly.
Linters and IDEs respect the convention, but it is exactly that — a convention. It does not ‘hide’ anything by itself.
Yes — once a field is _-prefixed, no outside code can read it.
Outside code can still write playlist._tracks and Python will not raise. The convention is social, not enforced.
No — the hiding comes from no public method returning or accepting the list.
No — only __double_underscore names are truly private in Python.
Even __double_underscore names are reachable via obj._ClassName__name (name mangling, not access control). The real hiding always comes from what the public API does or does not expose.

Underscore prefixes are a social signal, not enforcement. The real hiding in your refactored Playlist comes from the fact that the public API exposes only domain operations (top_tracks, total_duration_minutes, etc.) — none of them returns the list or accepts a list parameter. That is what lets you swap the storage. The underscore just tells future maintainers “I meant for this to be local.”
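For reference, here is a small demonstration of how little either underscore style enforces. The class below is a throwaway example, not the tutorial's Playlist.

```python
class Playlist:
    def __init__(self) -> None:
        self._tracks = ["single underscore"]
        self.__tracks = ["double underscore"]  # stored as _Playlist__tracks


p = Playlist()

single = p._tracks  # readable from outside; Python does not object

try:
    p.__tracks  # outside the class body the name is not mangled, so lookup fails
    direct_access = True
except AttributeError:
    direct_access = False

mangled = p._Playlist__tracks  # the mangled name is reachable by anyone who spells it out

print(single, direct_access, mangled)
```

Name mangling exists mainly to avoid accidental clashes in subclasses; it is not an access-control mechanism.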

3

A Protocol on Familiar Code

Why this matters

Step 2’s swap test worked because Python’s duck typing checks methods at call time. That’s powerful but invisible — nothing in your code says “Playlist and DictBackedPlaylist are interchangeable.” This step introduces the one Python construct that makes that contract visible: typing.Protocol. No new design principle here — just a new way to declare what your Step 2 refactor already accomplished.

Why now, before the next refactor? Steps 4–7 all use Protocol. Pre-loading the syntax on familiar code (your Playlist) means each later step adds only one new design idea at a time, not two or three. That keeps cognitive load on the lesson, not the language.

🎯 You will learn to

Apply typing.Protocol to name a contract that multiple classes satisfy structurally — no explicit inheritance needed
Distinguish Python’s existing duck typing (runtime, invisible) from a typed Protocol (declared, type-checkable)
Recognize that the same construct hides an algorithm in Step 4, a storage backend in Step 5, and an exhaustive set of alternatives in Step 6

Five-minute primer

A Protocol is a class that declares method signatures as a contract. Any class with matching methods satisfies it automatically — no class Foo(Bar): required.

from typing import Protocol

class Counter(Protocol):
    def increment(self) -> None: ...
    def value(self) -> int: ...

class TallyCounter:           # No explicit base class!
    def __init__(self) -> None:
        self._n = 0
    def increment(self) -> None:
        self._n += 1
    def value(self) -> int:
        return self._n

def report(c: Counter) -> str:   # Accepts any Counter-shaped class
    return f"count is {c.value()}"

report(TallyCounter())   # OK — TallyCounter is structurally a Counter

The ... after each method’s signature is Python’s Ellipsis literal; it tells readers (and mypy) “this method is declared, not implemented here.” The Protocol class itself is never instantiated; concrete classes are.
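Static checkers such as mypy verify the match without running anything. If you want to see structural matching at runtime, the standard library offers `typing.runtime_checkable`, which this tutorial's tasks do not require. The sketch below reuses the Counter example; `Stopwatch` is a hypothetical class added as a non-matching case.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Counter(Protocol):
    def increment(self) -> None: ...
    def value(self) -> int: ...


class TallyCounter:  # no base class, but both methods are present
    def __init__(self) -> None:
        self._n = 0

    def increment(self) -> None:
        self._n += 1

    def value(self) -> int:
        return self._n


class Stopwatch:  # hypothetical class with neither Counter method
    def start(self) -> None:
        pass


accepts_tally = isinstance(TallyCounter(), Counter)
accepts_stopwatch = isinstance(Stopwatch(), Counter)
print(accepts_tally, accepts_stopwatch)
```

Note the limit of the runtime check: isinstance only confirms that the named members exist. Argument and return types are still verified only by a static type checker.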

✏️ Predict before you run

protocol_demo.py has your Step 2 Playlist and a small DictBackedPlaylist. If we add class PlaylistLike(Protocol) with the five methods, will a type checker accept both classes as PlaylistLike?

(a) Only Playlist — DictBackedPlaylist doesn’t inherit from PlaylistLike.
(b) Both — structural matching cares about method shape, not inheritance.
(c) Neither — Playlist doesn’t declare : PlaylistLike either.
(d) Only DictBackedPlaylist — it was added later, so it knows about the Protocol.

Commit, then continue.

Reveal (after you've committed)

Answer **(b)**. Protocols use *structural* subtyping (PEP 544). Any class with the matching methods satisfies the Protocol — no explicit base class, no order-of-definition concerns, no decorator needed. This is what makes `Protocol` the right Python tool for "swap-this-for-that" designs. The Step 2 swap test was the *runtime* proof; `PlaylistLike` is the *declared* contract.

Your task

Open protocol_demo.py. The Playlist class from Step 2 is there, plus a tiny DictBackedPlaylist (the swap class from your Step 2 test, made permanent so it has a name).

Add from typing import Protocol and define class PlaylistLike(Protocol) at the top, with these five methods, each ending in ...:
- add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None
- top_tracks(self, n: int) -> list[Track]
- total_duration_minutes(self) -> float
- average_popularity(self) -> float
- __len__(self) -> int
Change summary(playlist: object) to summary(playlist: PlaylistLike). Do not touch the body — only the annotation.
Do not add (PlaylistLike) to either Playlist or DictBackedPlaylist. The whole point is that they satisfy it without saying so.

The test will call summary(Playlist()) and summary(DictBackedPlaylist()). Both should produce identical-shape strings — proving the same client function accepts two completely different backings, via the declared Protocol rather than runtime luck.

Look back at Step 2’s swap test. It built DictBackedPlaylist inside the test and passed it to your refactored client. That worked because of invisible duck typing — Python found the methods at call time. PlaylistLike is the same fact, now declared. Nothing about Step 2’s runtime behavior changes; what changes is that a future reader can see the contract without running the code.

🪞 Before clicking Next

In ten seconds, finish this aloud:

Without using the word “duck”: what does PlaylistLike make visible about my code that wasn’t visible in Step 2?

The word “duck” is forbidden because “duck typing” is the Python jargon for what’s happening — but the design point is that the contract is now named. The new affordance is reader-visible substitutability. Variation Theory says forcing different language is what makes the concept transferable to the next refactor.

Starter files

protocol_demo.py

"""Step 3 — pre-load Protocol on familiar code.

Playlist and DictBackedPlaylist are both here. Define
``PlaylistLike(Protocol)`` so a single ``summary(p: PlaylistLike)``
function accepts both — by structural matching, no inheritance.
"""

from dataclasses import dataclass
# TODO: from typing import Protocol


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


# TODO 1: Define ``class PlaylistLike(Protocol)`` here with the five
# methods from the instructions. End each declaration with ``...``.


class Playlist:
    def __init__(self) -> None:
        self._tracks: list[Track] = []
    def add(self, title, artist, duration_sec, popularity):
        self._tracks.append(Track(title, artist, duration_sec, popularity))
    def top_tracks(self, n):
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]
    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._tracks) / 60
    def average_popularity(self):
        return (sum(t.popularity for t in self._tracks) / len(self._tracks)) if self._tracks else 0
    def __len__(self):
        return len(self._tracks)


class DictBackedPlaylist:
    """Same operations, dict-backed storage. Structural twin of Playlist."""
    def __init__(self) -> None:
        self._by_title: dict = {}
    def add(self, title, artist, duration_sec, popularity):
        self._by_title[title] = Track(title, artist, duration_sec, popularity)
    def top_tracks(self, n):
        return sorted(self._by_title.values(), key=lambda t: t.popularity, reverse=True)[:n]
    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._by_title.values()) / 60
    def average_popularity(self):
        vs = list(self._by_title.values())
        return (sum(t.popularity for t in vs) / len(vs)) if vs else 0
    def __len__(self):
        return len(self._by_title)


# TODO 2: change the parameter annotation here from ``object`` to ``PlaylistLike``.
def summary(playlist: object) -> str:
    lines = [f"{len(playlist)} tracks, {playlist.total_duration_minutes():.1f} min"]
    for t in playlist.top_tracks(3):
        lines.append(f"  - {t.title}")
    return "\n".join(lines)


if __name__ == "__main__":
    for cls in (Playlist, DictBackedPlaylist):
        p = cls()
        p.add("Bad Guy", "Billie Eilish", 194, 95)
        p.add("Levitating", "Dua Lipa", 203, 88)
        p.add("Blinding Lights", "The Weeknd", 200, 92)
        print(cls.__name__)
        print(summary(p))
        print()

Solution

protocol_demo.py

"""Step 3 — pre-load Protocol on familiar code."""

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


class PlaylistLike(Protocol):
    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None: ...
    def top_tracks(self, n: int) -> list[Track]: ...
    def total_duration_minutes(self) -> float: ...
    def average_popularity(self) -> float: ...
    def __len__(self) -> int: ...


class Playlist:
    def __init__(self) -> None:
        self._tracks: list[Track] = []
    def add(self, title, artist, duration_sec, popularity):
        self._tracks.append(Track(title, artist, duration_sec, popularity))
    def top_tracks(self, n):
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]
    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._tracks) / 60
    def average_popularity(self):
        return (sum(t.popularity for t in self._tracks) / len(self._tracks)) if self._tracks else 0
    def __len__(self):
        return len(self._tracks)


class DictBackedPlaylist:
    def __init__(self) -> None:
        self._by_title: dict = {}
    def add(self, title, artist, duration_sec, popularity):
        self._by_title[title] = Track(title, artist, duration_sec, popularity)
    def top_tracks(self, n):
        return sorted(self._by_title.values(), key=lambda t: t.popularity, reverse=True)[:n]
    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._by_title.values()) / 60
    def average_popularity(self):
        vs = list(self._by_title.values())
        return (sum(t.popularity for t in vs) / len(vs)) if vs else 0
    def __len__(self):
        return len(self._by_title)


def summary(playlist: PlaylistLike) -> str:
    lines = [f"{len(playlist)} tracks, {playlist.total_duration_minutes():.1f} min"]
    for t in playlist.top_tracks(3):
        lines.append(f"  - {t.title}")
    return "\n".join(lines)


if __name__ == "__main__":
    for cls in (Playlist, DictBackedPlaylist):
        p = cls()
        p.add("Bad Guy", "Billie Eilish", 194, 95)
        p.add("Levitating", "Dua Lipa", 203, 88)
        p.add("Blinding Lights", "The Weeknd", 200, 92)
        print(cls.__name__)
        print(summary(p))
        print()

What you just bought. You named a contract that Step 2 left implicit. Before this step, “the same client works with two storage backends” was a runtime fact you proved with a test. Now it is a declared Protocol — readable in the file, checkable by mypy, and exactly the construct you’ll layer design judgment on top of in the next steps.

One subtle point. Notice that PlaylistLike has zero opinion on what is hidden behind it. It just says “things that look like this.” That is the right shape for an information-hiding contract: a Protocol is a place to put the secret-free part of a module’s public surface. The secret lives in the implementations — exactly where it belongs.

Looking ahead. Steps 4–7 each layer one new design decision on top of the Protocol mechanics you now own.

Step 4 hides a scoring algorithm (the score-scale leak).
Step 5 hides a storage technology.
Step 6 hides an exhaustive list of providers.
Step 7 asks you to classify leaks across all four refactors and decide when not to apply the principle.

Step 3 — Knowledge Check

Min. score: 80%

1. Why does summary(DictBackedPlaylist()) work even though DictBackedPlaylist does not have (PlaylistLike) in its class definition?

Python silently injects PlaylistLike as a base class at runtime.
Python doesn’t inject anything — inheritance has to be declared explicitly. Structural matching is a separate mechanism added by PEP 544.
Protocol uses structural matching — any class with the required methods qualifies.
DictBackedPlaylist inherits from Playlist, and Playlist is a PlaylistLike.
Look at the file — there is no inheritance between them. Each class defines its own methods independently. Their being interchangeable does not come from a shared base class.
Python ignores type annotations entirely at runtime, so the annotation is decorative.
Half-right that Python ignores annotations at runtime — but mypy and Pyright don’t. And the Protocol mechanism is specifically designed for static checkers to enforce structural rules. ‘Annotations are decorative’ isn’t the reason the swap works — the structural rule is.

Protocols (PEP 544) introduced structural subtyping to Python’s type system. A class satisfies a Protocol if it has the right methods and signatures — there is no (PlaylistLike) required in its declaration. This is exactly the right tool for Step 2’s swap test: the swap class didn’t need to know the Protocol existed when it was written. Now you can declare the contract visibly.

2. Spaced retrieval — Step 2. Now that PlaylistLike is declared, what — in Parnas’s sense — is the secret that PlaylistLike allows implementations to keep?

How many tracks are in the playlist.
That count is part of the contract — __len__ exposes it. The secret is what the contract doesn’t say.
How tracks are stored — list vs. dict vs. anything else.
Whether Python uses duck typing or static typing.
That is a language-implementation detail, not a design decision the module owns.
The fact that PlaylistLike is a Protocol.
Being a Protocol is the mechanism. The secret is what the Protocol allows the implementer to keep private.

Same secret as Step 2 — how tracks are stored — but now you have a named artifact for it. The Protocol declares the five public operations as the stable surface; the storage choice (list, dict, or anything else) sits behind it. Naming the contract makes the secret official: any future engineer can read PlaylistLike and know exactly what is promised, and equally importantly, what isn’t.

3. (Select all that apply.) Which of these classes would satisfy class CounterLike(Protocol) with def increment(self) -> None: ... and def value(self) -> int: ...?

class Tally: def increment(self): self.n += 1 def value(self): return self.n
class Total(CounterLike): def increment(self): ... (no value defined)
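Inheriting from CounterLike doesn’t supply the missing behavior. The only value that Total has is the Protocol’s own ... stub, which returns None at runtime, and type checkers such as mypy can flag an inherited stub like that as unimplemented. No real value ⇒ doesn’t satisfy. Inheritance without conformance is the worst of both worlds.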
class HiddenCount: def increment(self): self._n += 1 def value(self): return self._n
class Logger: def log(self, msg): print(msg)
Different methods entirely. The Protocol asks for increment and value; this class has neither.

Two of these satisfy the Protocol — both define matching increment and value methods. The inheritance line in the missing-value class is irrelevant; what matters is the shape of the class. Subtle point: structural matching is a promise from you, enforceable by static checkers — vanilla Python won’t catch a missing method until you actually call it. That is why your Step 2 swap test still matters: it’s the runtime proof, complementary to the declared Protocol.
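
A minimal sketch of what "vanilla Python won't catch it until you call it" looks like in practice. It assumes you opt in to the standard-library typing.runtime_checkable decorator, which this step's files do not use; bump is a hypothetical client function, and Tally is given an __init__ here so the example runs.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable  # opt-in: allows isinstance() checks against the Protocol
class CounterLike(Protocol):
    def increment(self) -> None: ...
    def value(self) -> int: ...


class Tally:
    def __init__(self) -> None:
        self.n = 0

    def increment(self) -> None:
        self.n += 1

    def value(self) -> int:
        return self.n


class Logger:
    def log(self, msg: str) -> None:
        print(msg)


def bump(counter: CounterLike) -> int:
    # The annotation is not enforced when the function is called.
    counter.increment()
    return counter.value()


print(isinstance(Tally(), CounterLike))   # True
print(isinstance(Logger(), CounterLike))  # False
print(bump(Tally()))                      # 1

try:
    bump(Logger())  # a static checker should reject this line before it runs
except AttributeError as exc:
    print("runtime failure only at the call:", exc)
```

The isinstance check only confirms that the member names exist; it does not compare signatures or return types. That gap is another reason to keep a behavioral swap test next to the declared Protocol.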

4. Spaced retrieval — Step 1. A teammate writes class Wallet(Protocol) with one method def balance(self) -> int: ... and a concrete class CryptoWallet that has matching methods plus a public transactions: list[dict] attribute. They say “the Protocol hides the implementation.” Is that true?

Yes — Protocols hide everything that isn’t declared in them.
A Protocol describes a stable surface; it doesn’t prevent a concrete class from having a wider public API. Step 1’s lesson holds: hiding is about what the implementation chooses not to expose, not what the Protocol declares.
No — transactions is still reachable on the concrete class.
Yes — Python’s annotations enforce that only Protocol methods are visible.
Python’s annotations are advisory at runtime. They guide type checkers; they don’t restrict attribute access.
No — Protocols only work for read-only objects.
There is no such restriction. Protocols can declare mutating methods too.

This is Step 1’s lesson, retrieved through the Step 3 lens. A Protocol is a floor, not a ceiling: the concrete class must have at least the declared methods, but it can also have more. The implementer is the one who controls what additional surface to expose. A public transactions attribute on CryptoWallet is reachable by anyone with a CryptoWallet reference — exactly the way WatchHistory._history leaked through recent() in Step 1. The Protocol is a contract for clients-as-Wallet; it isn’t a wall around the concrete class.
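
The "floor, not a ceiling" point can be sketched directly from the question. The class bodies below are assumptions made for illustration (the question only gives names and signatures), and show_balance is a hypothetical client.

```python
from typing import Protocol


class Wallet(Protocol):
    def balance(self) -> int: ...


class CryptoWallet:
    def __init__(self) -> None:
        # Public attribute: nothing in the Wallet Protocol hides it.
        self.transactions: list[dict] = [{"amount": 50}]

    def balance(self) -> int:
        return sum(t["amount"] for t in self.transactions)


def show_balance(wallet: Wallet) -> int:
    # Code written against Wallet only relies on balance().
    return wallet.balance()


wallet = CryptoWallet()
print(show_balance(wallet))  # 50

# Code holding the concrete class can still reach the storage directly.
wallet.transactions.append({"amount": 1_000_000})
print(show_balance(wallet))  # 1000050
```

Closing that leak is the implementer's job: rename the attribute to _transactions and expose only operations that keep the storage choice private.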

4

An Interface That Tells You Too Much

Why this matters

Step 2 fixed a mutation leak. Step 3 gave you the declared contract — Protocol — for Step 2’s swap test. This step fixes a subtler problem: a contract that looks clean but over-specifies how it computes its answer. Parnas warned about this in 1972 with his KWIC example: an interface that says more than the client needs to know restricts future implementations. A music recommender that returns raw BM25 scores is the modern version. Switch the algorithm from BM25 to embeddings and every numeric threshold in the client breaks.

One new piece of syntax this step: typing.Literal. Protocol you already own from Step 3 — reuse it freely. The new content of this step is design judgment, not mechanics.

🎯 You will learn to

Analyze a read-only API for over-specification — which numeric scales, internal IDs, or raw rows are visible that clients did not need
Create a typing.Protocol plus a small dataclass so two different ranking strategies satisfy the same contract
Apply the Parnas/Clements/Weiss module-guide mini-doc format: secret, likely changes, stable contract, what is not promised

One-minute primer on `typing.Literal`

typing.Literal lets a type be one of a fixed set of values:

from typing import Literal
Confidence = Literal["low", "medium", "high"]

Now confidence: Confidence means “must be the string low, medium, or high, and your type checker will yell if you try anything else.” It’s the right tool for a small enum of domain-meaningful labels.
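
One caveat worth seeing in code: the "yelling" comes from the type checker, not from Python itself. The sketch below assumes you want a runtime guard for untrusted strings as well; parse_confidence is a hypothetical helper, while typing.get_args and typing.cast are standard-library functions.

```python
from typing import Literal, cast, get_args

Confidence = Literal["low", "medium", "high"]


def parse_confidence(raw: str) -> Confidence:
    """Hypothetical helper: validate an untrusted string at runtime."""
    allowed = get_args(Confidence)  # ('low', 'medium', 'high')
    if raw not in allowed:
        raise ValueError(f"expected one of {allowed}, got {raw!r}")
    return cast(Confidence, raw)


ok: Confidence = "medium"   # accepted by a type checker
oops: Confidence = "HIGH"   # a type checker should flag this; Python still runs it

print(parse_confidence("high"))  # high

try:
    parse_confidence("very high")
except ValueError as exc:
    print(exc)
```

Inside a module whose values are all produced by your own code, the static check is usually enough. The runtime guard earns its keep at boundaries such as JSON input or command-line arguments.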

The scene

recommender.py ranks songs for a query. The current contract returns list[tuple[int, float, dict]] — (bucket_id, similarity_score, raw_row). sidebar.py thresholds at score >= 12.0 to call a hit “strong.” Today’s scorer is BM25-style; scores live in 0..30. Next quarter the team plans to swap in vector embeddings; scores will live in 0..1. Every threshold in every client will silently produce wrong answers.

✏️ Predict before you run

The bad design returns (bucket_id, score, row). If the recommender switches from BM25 to cosine-similarity embeddings, what is the most likely failure mode in the existing sidebar.py?

(a) A crash — the new return type won’t match.
(b) An empty sidebar — every score will be below the threshold 12.0, so no hits are “strong” anymore.
(c) The sidebar shows literally every song — every score will be above 12.0.
(d) The sidebar is unchanged — the contract types are the same.

Commit before reading on.

Reveal

Answer **(b)**. Cosine-similarity scores live in `0..1`. The old threshold `12.0` is now larger than the highest possible score, so the strong-hit list is always empty. **The sidebar just goes blank in production** with no exception — the worst kind of bug. The deep mistake is not in `sidebar.py`. It is in `recommender.py`'s contract, which exposed the numeric score and tied callers to its *scale*. Parnas's term for this in his 1972 paper: the interface "reveals more than is necessary," restricting which future implementations can satisfy it.
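
The silent failure can be reproduced in a few lines. This is a sketch with made-up scores, not output from the real recommender; strong_hits is a hypothetical stand-in for the filtering rule in sidebar.py.

```python
STRONG_THRESHOLD = 12.0  # chosen when scores lived on the 0..30 scale


def strong_hits(scored: list[tuple[str, float]]) -> list[str]:
    """Apply the same numeric rule the leaky sidebar uses."""
    return [title for title, score in scored if score >= STRONG_THRESHOLD]


# Illustrative numbers only: the same three songs under two scorers.
bm25_style = [("Bad Guy", 28.5), ("Lovely", 18.6), ("Ocean Eyes", 9.1)]
cosine_style = [("Bad Guy", 0.95), ("Lovely", 0.62), ("Ocean Eyes", 0.30)]

print(strong_hits(bm25_style))    # ['Bad Guy', 'Lovely']
print(strong_hits(cosine_style))  # []
```

Both calls type-check and neither raises. The only symptom is an empty list, which is why this kind of leak tends to surface in production rather than in review.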

Scaffold: trace the leak before you code

Do this step in four small passes. The goal is to lower the typing load so your attention stays on the design decision.

| Pass | What to decide | What to edit |
| --- | --- | --- |
| 1. Name the leak | `sidebar.py` knows the score scale, bucket IDs, and raw row shape. Those belong to the recommender implementation. | Do not touch code yet; point at the three leaking facts in the starter files. |
| 2. Replace the contract | Clients need “is this a strong hit?”, not “what was the raw score?” | Add `Confidence`, `SongHit`, and `Recommender` in `recommender.py`. |
| 3. Move the algorithm decision | Popularity buckets are one implementation’s secret. | Implement `PopularityRecommender.recommend(...)` behind the Protocol. |
| 4. Clean the client | The sidebar should only ask for hits and read domain fields. | Refactor `support_sidebar(query, recommender)` to filter on `hit.confidence`. |

Your task

Refactor recommender.py so the contract exposes only what the client genuinely needs:

Define Confidence = Literal["low", "medium", "high"].
Define @dataclass(frozen=True) class SongHit with track_id: str, title: str, artist: str, confidence: Confidence.
Define class Recommender(Protocol) with def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ....
Provide class PopularityRecommender: whose recommend method satisfies the Protocol. Use the helper _strong_track_table() already in the file to populate a few demo hits — assign confidence based on internal popularity buckets (you choose how).
Refactor sidebar.py so support_sidebar(query, recommender) takes a Recommender and returns titles of hits where hit.confidence == "high". No numeric thresholds anywhere in sidebar.py.

Also: write a module guide comment at the top of recommender.py in this exact format (you can fill in the values):

"""
Module guide:
  Primary secret:   <one sentence — name the volatile decision>
  Likely changes:   <bullets — BM25 -> embeddings, score scale shifts, ...>
  Stable contract:  <one sentence — what callers can rely on>
  Not promised:     <bullets — raw scores, bucket IDs, ranking algorithm, ...>
"""

Test 4 will look for those four labels (Primary secret, Likely changes, Stable contract, Not promised). Parnas, Clements, and Weiss called this artifact the module guide in their 1985 paper. It is the lightest-weight design doc you can write that still records why the boundary exists.

🪞 Before clicking Next

Once all four tests pass, take 20 seconds and answer in your head:

Without using the words “score” or “BM25”: if a future engineer reads sidebar.py, can they tell which ranking algorithm runs underneath? Why or why not?

The right answer (“no — sidebar only sees hit.confidence, which is a domain label, not an algorithm artifact”) is what you just bought with this refactor. The forbidden words force you to talk about the concept, not just point at the leak.

Starter files

recommender.py

"""STARTING STATE.

Today's design returns ``list[tuple[bucket_id, score, raw_row]]``.
Score scale is 0..30 (BM25-like). Refactor as the instructions ask
so a future swap to embeddings (scale 0..1) does NOT break callers.
"""

# The recommender currently exposes raw scores, bucket IDs, and dict rows.
# That is an over-specified contract. Replace it.

_DEMO_CATALOG = [
    # (track_id, title, artist, internal_popularity_0_to_100)
    ("t1", "Bad Guy",            "Billie Eilish",         95),
    ("t2", "Bury a Friend",      "Billie Eilish",         78),
    ("t3", "Lovely",             "Billie Eilish, Khalid", 62),
    ("t4", "Ocean Eyes",         "Billie Eilish",         55),
    ("t5", "Happier Than Ever",  "Billie Eilish",         88),
    ("t6", "All The Good Girls", "Billie Eilish",         40),
]


def _strong_track_table() -> list[tuple[str, str, str, int]]:
    """Return the demo catalog. Use the int popularity to choose confidence."""
    return list(_DEMO_CATALOG)


def recommend(query: str) -> list[tuple[int, float, dict]]:
    """LEAKY contract — returns (bucket_id, score, raw_row)."""
    raw = _strong_track_table()
    # Pretend BM25 scores in 0..30 derived from the popularity field.
    return [
        (
            i // 3,                       # bucket_id leaks an internal partition
            round(pop * 30 / 100, 2),     # score scale 0..30 leaks the algorithm
            {"track_id": tid, "title": title, "artist": artist, "popularity": pop},
        )
        for i, (tid, title, artist, pop) in enumerate(raw)
    ]


# TODO replace the leaky surface above with:
#   1. ``Confidence = Literal["low", "medium", "high"]``
#   2. ``@dataclass(frozen=True) class SongHit`` with the fields named in the instructions
#   3. ``class Recommender(Protocol)`` with ``recommend(query, *, limit=5) -> list[SongHit]``
#   4. ``class PopularityRecommender`` implementing the Protocol
#
# And add the module-guide docstring at the top of the file.

sidebar.py

"""Client that knows too much.

Refactor ``support_sidebar`` to take a ``Recommender`` and ask for
high-confidence hits — no raw scores, no thresholds.
"""

from recommender import recommend

STRONG_THRESHOLD = 12.0   # BM25 scale assumption — a leak waiting to break.


def support_sidebar(query: str) -> list[str]:
    hits = recommend(query)
    return [row["title"] for (_bucket, score, row) in hits if score >= STRONG_THRESHOLD]


if __name__ == "__main__":
    for title in support_sidebar("billie eilish"):
        print(title)

Solution

recommender.py

"""Recommender module.

Module guide:
  Primary secret:   how songs are scored and ranked for a query
  Likely changes:
    - BM25 -> embeddings / hybrid retrieval
    - score-scale shifts (0..30 today, 0..1 tomorrow)
    - per-user personalization layer
    - swapping the catalog source (in-memory list -> vector DB)
  Stable contract:  recommend(query, *, limit) -> list[SongHit]
                    with confidence in {"low", "medium", "high"}
                    sorted high -> medium -> low
  Not promised:
    - raw scores or score scale
    - bucket IDs or index-partition keys
    - tie-breaking order within a confidence band
"""

from dataclasses import dataclass
from typing import Literal, Protocol


Confidence = Literal["low", "medium", "high"]


@dataclass(frozen=True)
class SongHit:
    track_id: str
    title: str
    artist: str
    confidence: Confidence


class Recommender(Protocol):
    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ...


_DEMO_CATALOG = [
    ("t1", "Bad Guy",            "Billie Eilish",         95),
    ("t2", "Bury a Friend",      "Billie Eilish",         78),
    ("t3", "Lovely",             "Billie Eilish, Khalid", 62),
    ("t4", "Ocean Eyes",         "Billie Eilish",         55),
    ("t5", "Happier Than Ever",  "Billie Eilish",         88),
    ("t6", "All The Good Girls", "Billie Eilish",         40),
]


def _strong_track_table() -> list[tuple[str, str, str, int]]:
    return list(_DEMO_CATALOG)


class PopularityRecommender:
    """Hidden secret: popularity-bucket ranking from an in-memory list."""

    _HIGH = 80
    _MEDIUM = 60

    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]:
        rows = _strong_track_table()
        hits = [
            SongHit(tid, title, artist, self._confidence_for(pop))
            for (tid, title, artist, pop) in rows
        ]
        order = {"high": 0, "medium": 1, "low": 2}
        hits.sort(key=lambda h: order[h.confidence])
        return hits[:limit]

    def _confidence_for(self, popularity: int) -> Confidence:
        if popularity >= self._HIGH:
            return "high"
        if popularity >= self._MEDIUM:
            return "medium"
        return "low"

sidebar.py

"""Client that depends only on the Recommender Protocol."""

from recommender import Recommender


def support_sidebar(query: str, recommender: Recommender) -> list[str]:
    page = recommender.recommend(query, limit=5)
    return [hit.title for hit in page if hit.confidence == "high"]


if __name__ == "__main__":
    from recommender import PopularityRecommender
    for title in support_sidebar("billie", PopularityRecommender()):
        print(title)

What you just bought. The sidebar now depends only on the Recommender Protocol — a one-method shape. The popularity-based and embedding-based recommenders both satisfy it. When the team eventually swaps in embeddings, the sidebar code is untouched and the swap test proves it. Parnas in 1972: the right interface “specifies no more information than the client needs to use the module correctly.”
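
A sketch of the swap this paragraph describes. EmbeddingRecommender here is a hypothetical stand-in with made-up similarity values on a 0..1 scale; it is not the class used by the course tests. SongHit, Recommender, and support_sidebar are repeated from the solution so the block runs on its own.

```python
from dataclasses import dataclass
from typing import Literal, Protocol

Confidence = Literal["low", "medium", "high"]


@dataclass(frozen=True)
class SongHit:
    track_id: str
    title: str
    artist: str
    confidence: Confidence


class Recommender(Protocol):
    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ...


class EmbeddingRecommender:
    """Hypothetical second implementation; the similarities are invented."""

    # Pretend cosine similarities, precomputed for the demo.
    _SIMILARITY = [
        ("t1", "Bad Guy", "Billie Eilish", 0.93),
        ("t3", "Lovely", "Billie Eilish, Khalid", 0.58),
        ("t5", "Happier Than Ever", "Billie Eilish", 0.81),
    ]

    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]:
        hits = [
            SongHit(tid, title, artist, self._band(sim))
            for tid, title, artist, sim in self._SIMILARITY
        ]
        order = {"high": 0, "medium": 1, "low": 2}
        hits.sort(key=lambda h: order[h.confidence])
        return hits[:limit]

    @staticmethod
    def _band(similarity: float) -> Confidence:
        # The 0..1 cut-offs stay private to this class.
        if similarity >= 0.8:
            return "high"
        if similarity >= 0.5:
            return "medium"
        return "low"


def support_sidebar(query: str, recommender: Recommender) -> list[str]:
    # Same client body as the solution: no numeric threshold anywhere.
    page = recommender.recommend(query, limit=5)
    return [hit.title for hit in page if hit.confidence == "high"]


print(support_sidebar("billie", EmbeddingRecommender()))
# ['Bad Guy', 'Happier Than Ever']
```

The score scale changed from 0..30 to 0..1 and support_sidebar did not change at all. The cut-offs 0.8 and 0.5 are this implementation's private decision, just as _HIGH and _MEDIUM are private to PopularityRecommender.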

Why the module guide matters. Six months from now, when a teammate asks “can we add raw_score?”, the docstring at the top of recommender.py answers: that field is on the “Not promised” list, for these specific reasons. The guide costs only four short fields. Parnas, Clements, and Weiss described it in their 1985 paper on the A-7E flight-software project, where it was the artifact maintainers consulted first to find which module to edit.

One subtle move. You also sorted the returned hits by confidence band (high → medium → low). That ordering is part of the stable contract — clients can rely on it. But notice that within a band the order is unspecified. That preserves the option to tie-break by recency, by user signal, by random shuffle for A/B testing — all future decisions you have not yet made.
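
A contract test should check only what is promised. The sketch below assumes a hypothetical helper, is_sorted_by_band, that verifies the band ordering and deliberately says nothing about the order of titles inside a band.

```python
from typing import Iterable

_BAND_RANK = {"high": 0, "medium": 1, "low": 2}


def is_sorted_by_band(confidences: Iterable[str]) -> bool:
    """True when labels never move from a weaker band back to a stronger one."""
    ranks = [_BAND_RANK[c] for c in confidences]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


print(is_sorted_by_band(["high", "high", "medium", "low"]))  # True
print(is_sorted_by_band(["high", "low", "medium"]))          # False
```

In a test you would pass it the confidence of each hit returned by recommend(...). Asserting an exact list of titles instead would quietly promote the tie-break order into the contract.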

Step 4 — Knowledge Check

Min. score: 80%

1. Parnas’s 1972 paper pointed out that even his “good” KWIC decomposition had a leak: the circular-shift module exposed an ordering its clients did not need. Which best matches the same leak in this step’s original recommend() function?

The fact that recommend is a free function instead of a method on a class.
Free vs. method is style. The leak is structural — what the contract says.
The exposure of the raw score and its 0..30 scale, which locks in BM25-shaped scoring.
The fact that recommend returns more than one hit at a time.
Returning many hits is exactly what the client needs. Not a leak.
The use of tuples instead of dicts as the return type.
Tuples vs. dicts is a shape choice; both can be over-specified. The deeper leak is the score scale, which would also leak whether you wrapped it in a dict.

Parnas’s point in the KWIC discussion, paraphrased: by prescribing more than was necessary, an interface needlessly narrows the set of implementations that can satisfy it. The raw score and its scale are a representation detail clients did not need to know. Once clients can write score > 12.0, any new algorithm must produce numbers on the same scale. Confidence labels ("high", "medium", "low") reveal what the client needs (is this hit strong enough to surface?) without naming the algorithm.

2. (Select all that apply.) Some of these are legitimate facts for the contract to promise to clients; others are leaks. Mark every item that is a LEAK the contract should hide.

Each hit’s confidence is one of three labels: low, medium, or high.
Legitimate. The Confidence labels are part of the stable contract: clients need them to filter.
Hits are returned in descending confidence order (high before medium before low).
Legitimate. Ordering by confidence is a domain promise the client genuinely needs. Note we don’t promise the secondary ordering within a confidence band — that lets the implementation tie-break however it wants.
The exact numeric BM25 score for each hit.
The internal bucket_id used to partition the search index.
The query string can be empty (returns empty list).
Legitimate. Edge-case behavior on empty input is part of the contract.

A clean module guide for Recommender says exactly this: Confidence is in the contract because clients can’t reason without it; descending-by-confidence is in the contract because it’s a domain-level promise; raw scores and bucket IDs are not in the contract because they tie the implementation to one algorithm and one index layout. The skill is naming what is allowed to vary later.

3. Spaced retrieval from Step 2. Suppose your PopularityRecommender internally stores its catalog as a list[tuple]. Which class is responsible for the decision “use a list of tuples”?

sidebar.py — it consumes the data, so the storage is its responsibility.
The client never sees the storage. If you made the client responsible, you’d be back in Step 1’s leak.
Recommender — the Protocol owns all algorithmic decisions.
A Protocol describes the contract, not the storage. It cannot ‘own’ a storage choice.
PopularityRecommender — it is the one implementation that makes this storage choice, behind the shared contract.
Both PopularityRecommender and EmbeddingRecommender — implementations of the same Protocol must agree on storage.
The whole point of the Protocol is that each implementation makes its own storage choice. EmbeddingRecommender might use a NumPy array; PopularityRecommender might use a list. They never compare notes.

Each implementation behind a Protocol owns its own storage secret. The Protocol’s job is to make sure the clients never have to care. That is what lets EmbeddingRecommender use a vector store and PopularityRecommender use a list of tuples without forcing either choice on the other.

4. You’re reviewing a teammate’s PR. They added a new method to Recommender: def raw_score(self, hit: SongHit) -> float. Should you approve?

Yes — extra methods are additive, so they can’t break clients.
Additive changes are usually safe. This one isn’t, because it re-adds the very decision we worked to hide. Now EmbeddingRecommender must also expose a ‘score’ — even though embeddings don’t have a meaningful single score in the same scale.
No — it puts the raw score back in the contract, re-leaking the decision.
Yes, if they add a docstring saying clients shouldn’t use it.
‘Please don’t use this method’ is documentation as a substitute for hiding. It always loses to the next intern who really needs that number.
Yes, if they mark it as a static method.
Static vs. instance doesn’t affect contract leakage. The method is still on the Protocol.

A subtle anti-pattern: adding to the contract can re-leak a decision you previously hid. Confidence labels were chosen precisely so the raw score and its scale wouldn’t be visible. The right move is to ask the teammate what client problem raw_score solves, and address that problem at the right level: for example, add an is_top_match(hit) predicate, expose a new confidence band like "very_high", or expose an explanation token. Never re-expose the score itself.

5. Spaced retrieval — Step 3 (Protocol mechanics). Your PopularityRecommender was defined before Recommender(Protocol) appeared in the same file. A second class, EmbeddingRecommender, is defined in a test file — not even imported by recommender.py. Both satisfy the Recommender Protocol. Why does that work?

Python silently rewrites both classes at import time to inherit from Recommender.
Python doesn’t rewrite anything. There is no runtime injection of base classes.
Structural subtyping — any class with matching methods qualifies.
Only the second class works; the first one was defined before Recommender, so it can’t satisfy it.
Definition order doesn’t matter for structural matching. A class satisfies a Protocol when its shape matches — when it was defined relative to the Protocol is irrelevant.
It works because both classes happen to have recommend in their name.
Method names matter — recommend matching is part of the structural match — but it’s the signature and existence that count, not the class name.

Step 3’s central lesson, retrieved through this step’s lens. Protocols use structural subtyping, so neither order of definition nor explicit inheritance is needed. The Step 3 swap test (PlaylistLike) and this step’s swap test (Recommender + EmbeddingRecommender) are the same mechanism applied to two different domains. That is what makes the Protocol the right tool for “swap-this-for-that” designs: new implementations can be added later, anywhere, with no edit to the Protocol or its existing implementers.

5

Where Did You Put the Database?

Why this matters

The single most common information-hiding leak in real code is storage. A function that takes a sqlite3.Connection (or a MongoClient, or an S3 handle) and returns rows ties every caller to a specific persistence technology. When the team migrates from SQLite to Postgres, from rows to JSON, from synchronous to async, everything moves. This step is the canonical Parnas case made hands-on. You’ll do the whole routine yourself.

🎯 You will learn to

Create a Protocol + dataclass + in-memory implementation from a leaky function — independently, using the five-step routine
Apply dependency injection: pass the directory in to the client instead of constructing the storage inside it
Evaluate the change-impact radius of a storage migration before and after your refactor
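
Dependency injection can be sketched in a different domain so it does not pre-empt the refactor you are about to do. Every name below (PriceSource, FixedPrices, cart_total) is hypothetical and unrelated to events.py.

```python
from typing import Protocol


class PriceSource(Protocol):
    """Hypothetical contract: look up a price in cents by item name."""

    def price_cents(self, item: str) -> int: ...


class FixedPrices:
    """In-memory implementation, handy for tests."""

    def __init__(self, prices: dict[str, int]) -> None:
        self._prices = dict(prices)

    def price_cents(self, item: str) -> int:
        return self._prices[item]


def cart_total(items: list[str], prices: PriceSource) -> int:
    # The client receives its dependency; it never constructs or imports storage.
    return sum(prices.price_cents(item) for item in items)


print(cart_total(["tee", "poster"], FixedPrices({"tee": 2500, "poster": 1200})))  # 3700
```

Because cart_total is handed a PriceSource rather than building one, a database-backed or HTTP-backed source can be added later as one new class, with no edit to the client.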

The scene

events.py looks up concerts by city. Today’s implementation uses SQLite, and the function signature reveals it: every client has to create and pass around sqlite3 objects. The product manager wants to add a JSON-file-backed test fixture for offline development, and the SRE wants to migrate the production catalog to a remote HTTP service. Today each of those changes means rewriting existing files. Your job is to turn each one into a single new class.

✏️ Predict before you run

Suppose we keep the current events.py signature and just implement a JSON-file fixture. How many files have to be edited to use it from tour_planner.py?

(a) 1 — events.py only.
(b) 2 — events.py and tour_planner.py.
(c) 3+ — events.py, tour_planner.py, every test that constructs the connection, and any module that builds the SQL table string.
(d) 0 — duck typing handles it; pass a JSON dict where a connection is expected.

Commit. After your refactor, the same change will require one new class in events.py and zero edits to tour_planner.py — that is your verification.

Scaffold: write the change map first

This step is the most independent refactor so far, but you still get a planning rail. Before touching code, complete this map mentally:

Question	Answer for this step
What is likely to change?	SQLite may become Postgres, HTTP, or a JSON fixture.
What is the secret?	Persistence technology plus schema/row mapping.
Who may know it?	Concrete directory implementations such as `SQLiteEventDirectory`.
Who must not know it?	`tour_planner.affordable_shows` and tests that only need events.
What is the stable contract?	`directory.find_in(city) -> list[Event]`.

Then code in passes: define Event, define the EventDirectory Protocol, make the tiny in-memory implementation, make the SQLite implementation, and only then refactor tour_planner.py. If a pass fails, you know which layer to fix.

Your task

Refactor events.py so the persistence decision is hidden:

Define @dataclass(frozen=True) class Event with title: str, venue: str, date_iso: str, city: str, ticket_price_cents: int.
Define class EventDirectory(Protocol) with def find_in(self, city: str) -> list[Event]: ....
Implement class InMemoryEventDirectory: — constructor takes a list[Event], find_in(city) filters by city. This is your test/fixture implementation.
Implement class SQLiteEventDirectory: the constructor takes a sqlite3.Connection and a table name, and find_in(city) runs the same SQL the original function ran and maps rows to Event. events.py is the only module whose logic may import sqlite3 (the wiring code in tour_planner.py’s __main__ block is the one exception).

Refactor tour_planner.py so affordable_shows(directory, city, max_price_dollars=50) takes an EventDirectory (not a connection). Filter inside the function using event.ticket_price_cents and return a list[Event].

Add the module guide docstring to events.py using the same four labels you used in Step 4.

You will probably break the implementation-swap test first. The most common cause is forgetting to map raw SQL row tuples back to Event objects in SQLiteEventDirectory.find_in. If the test fails, read its diff carefully — the failure is the lesson, not the verdict.

🪞 Before clicking Next

Once all four tests pass, answer this in your head before the quiz:

Without using the words “SQL” or “database”: after the refactor, affordable_shows calls one method on its parameter. Name that method and explain why that single call is enough to absorb every plausible storage migration (SQLite → Postgres → HTTP → file).

The forbidden words force you to describe the contract, not the current implementation. If you find yourself reaching for “SQL”, that is your brain telling you the contract still has a database shape in it — which would mean the abstraction is not really hiding storage.

Starter files

events.py

"""Concert directory.

STARTING STATE: leaks sqlite3 and the row dict shape into every caller.
"""

import sqlite3


def find_events_in_city(
    connection: sqlite3.Connection,
    table: str,
    city: str,
) -> list[dict]:
    rows = connection.execute(
        f"SELECT title, venue, date_iso, city, ticket_price_cents "
        f"FROM {table} WHERE city = ?",
        (city,),
    ).fetchall()
    return [
        {
            "title": r[0],
            "venue": r[1],
            "date_iso": r[2],
            "city": r[3],
            "ticket_price_cents": r[4],
        }
        for r in rows
    ]


# TODO Run the five-step routine yourself:
#   1. Name the change. (One coming: SQLite -> Postgres -> HTTP service.)
#   2. Name the secret. (Persistence technology + schema mapping.)
#   3. Minimum client assumptions. (event has title, venue, date, city, price.)
#   4. Remove the leak.
#        - ``@dataclass(frozen=True) class Event``
#        - ``class EventDirectory(Protocol)`` with ``find_in(city) -> list[Event]``
#        - ``class InMemoryEventDirectory`` (constructor takes list[Event])
#        - ``class SQLiteEventDirectory`` (constructor takes connection + table)
#   5. Verify with a swap. (The hidden test will swap implementations.)

tour_planner.py

"""Client of events.py. Currently knows about sqlite3 by transitive coupling.

Refactor ``affordable_shows`` to take an EventDirectory instead.
"""

from events import find_events_in_city


def affordable_shows(connection, table: str, city: str, max_price_dollars: int = 50):
    cents_limit = max_price_dollars * 100
    events = find_events_in_city(connection, table, city)
    return [e for e in events if e["ticket_price_cents"] <= cents_limit]


if __name__ == "__main__":
    # The demo wires SQLite in *this* block. Outside events.py, that is the
    # only place the sqlite3 import is allowed AFTER the refactor.
    import sqlite3
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shows("
        "title TEXT, venue TEXT, date_iso TEXT, city TEXT, ticket_price_cents INT)"
    )
    conn.executemany(
        "INSERT INTO shows VALUES (?, ?, ?, ?, ?)",
        [
            ("Sabrina Carpenter",   "The Forum",        "2026-03-01", "Los Angeles", 11500),
            ("Olivia Rodrigo",      "Crypto.com Arena", "2026-03-05", "Los Angeles",  9800),
            ("Tame Impala",         "Hollywood Bowl",   "2026-04-12", "Los Angeles",  6700),
            ("Local Open Mic",      "Echo Park Bar",    "2026-03-15", "Los Angeles",  1500),
        ],
    )
    conn.commit()

    # After refactor:
    #   from events import SQLiteEventDirectory
    #   directory = SQLiteEventDirectory(conn, "shows")
    #   for ev in affordable_shows(directory, "Los Angeles", max_price_dollars=80):
    #       print(ev)
    for ev in affordable_shows(conn, "shows", "Los Angeles", max_price_dollars=80):
        print(ev)

Solution

events.py

"""Concert directory.

Module guide:
  Primary secret:   how events are persisted and looked up
  Likely changes:
    - SQLite -> Postgres / remote HTTP service
    - column / schema renames
    - addition of caching or read replicas
  Stable contract:  EventDirectory.find_in(city) -> list[Event]
                    Event is a frozen dataclass of domain fields
  Not promised:
    - the storage technology, connection object, or table name
    - SQL column names or row encoding
    - whether results are cached, paginated, or streamed
"""

from __future__ import annotations

import sqlite3
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Event:
    title: str
    venue: str
    date_iso: str
    city: str
    ticket_price_cents: int


class EventDirectory(Protocol):
    def find_in(self, city: str) -> list[Event]: ...


class InMemoryEventDirectory:
    """Test/fixture implementation. Constructor takes the events directly."""

    def __init__(self, events: list[Event]) -> None:
        self._events = list(events)

    def find_in(self, city: str) -> list[Event]:
        return [e for e in self._events if e.city == city]


class SQLiteEventDirectory:
    """Production implementation. This is the ONLY file that knows SQLite."""

    _COLUMNS = "title, venue, date_iso, city, ticket_price_cents"

    def __init__(self, connection: sqlite3.Connection, table: str) -> None:
        self._conn = connection
        self._table = table

    def find_in(self, city: str) -> list[Event]:
        rows = self._conn.execute(
            f"SELECT {self._COLUMNS} FROM {self._table} WHERE city = ?",
            (city,),
        ).fetchall()
        return [Event(*r) for r in rows]

tour_planner.py

from events import EventDirectory, Event


def affordable_shows(
    directory: EventDirectory,
    city: str,
    max_price_dollars: int = 50,
) -> list[Event]:
    cents_limit = max_price_dollars * 100
    return [e for e in directory.find_in(city) if e.ticket_price_cents <= cents_limit]


if __name__ == "__main__":
    import sqlite3
    from events import SQLiteEventDirectory

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shows("
        "title TEXT, venue TEXT, date_iso TEXT, city TEXT, ticket_price_cents INT)"
    )
    conn.executemany(
        "INSERT INTO shows VALUES (?, ?, ?, ?, ?)",
        [
            ("Sabrina Carpenter",   "The Forum",        "2026-03-01", "Los Angeles", 11500),
            ("Olivia Rodrigo",      "Crypto.com Arena", "2026-03-05", "Los Angeles",  9800),
            ("Tame Impala",         "Hollywood Bowl",   "2026-04-12", "Los Angeles",  6700),
            ("Local Open Mic",      "Echo Park Bar",    "2026-03-15", "Los Angeles",  1500),
        ],
    )
    conn.commit()

    directory = SQLiteEventDirectory(conn, "shows")
    for ev in affordable_shows(directory, "Los Angeles", max_price_dollars=80):
        print(ev)

The Parnas case in one tutorial. events.py is now the only file that knows the persistence decision. tour_planner.py knows only the EventDirectory Protocol and the Event dataclass. Migrating to Postgres or an HTTP service is one new class.
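To make "one new class" concrete, here is a hedged sketch of the JSON-file fixture the product manager asked for. `JsonFileEventDirectory` and the file layout (a JSON array of objects whose keys match the Event fields) are assumptions for illustration; `Event` and `affordable_shows` are repeated from the solution only so the block runs on its own.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Event:  # same fields as the solution's Event
    title: str
    venue: str
    date_iso: str
    city: str
    ticket_price_cents: int


class JsonFileEventDirectory:
    """Hypothetical fixture: reads a JSON array of event objects from a file."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def find_in(self, city: str) -> list[Event]:
        records = json.loads(self._path.read_text(encoding="utf-8"))
        return [Event(**rec) for rec in records if rec["city"] == city]


def affordable_shows(directory, city: str, max_price_dollars: int = 50) -> list[Event]:
    # Same body as the solution's tour_planner.py; type hint dropped for brevity.
    cents_limit = max_price_dollars * 100
    return [e for e in directory.find_in(city) if e.ticket_price_cents <= cents_limit]


# Write a tiny fixture file, then query it through the unchanged client.
fixture = Path(tempfile.mkdtemp()) / "events.json"
fixture.write_text(json.dumps([
    {"title": "Tame Impala", "venue": "Hollywood Bowl", "date_iso": "2026-04-12",
     "city": "Los Angeles", "ticket_price_cents": 6700},
    {"title": "Local Open Mic", "venue": "Echo Park Bar", "date_iso": "2026-03-15",
     "city": "Los Angeles", "ticket_price_cents": 1500},
]), encoding="utf-8")

cheap = affordable_shows(JsonFileEventDirectory(fixture), "Los Angeles")
```

Nothing in `affordable_shows` had to change: the new class only needs a `find_in(city)` that returns Event objects.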

What you proved with the swap test. When the same affordable_shows function ran against InMemoryEventDirectory and SQLiteEventDirectory and returned the same set of titles, you proved the function couldn’t be reaching into storage internals. That’s the operational definition of “the secret is hidden” — the test passing is the evidence.
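The hidden test itself is not shown in this tutorial, so the following is only a plausible sketch of what a swap test could look like. The classes are condensed copies of the solution; the `EVENTS` seed data and the comparison by title set are assumptions.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    title: str
    venue: str
    date_iso: str
    city: str
    ticket_price_cents: int


class InMemoryEventDirectory:
    def __init__(self, events: list[Event]) -> None:
        self._events = list(events)

    def find_in(self, city: str) -> list[Event]:
        return [e for e in self._events if e.city == city]


class SQLiteEventDirectory:
    def __init__(self, connection: sqlite3.Connection, table: str) -> None:
        self._conn = connection
        self._table = table

    def find_in(self, city: str) -> list[Event]:
        rows = self._conn.execute(
            "SELECT title, venue, date_iso, city, ticket_price_cents "
            f"FROM {self._table} WHERE city = ?",
            (city,),
        ).fetchall()
        return [Event(*row) for row in rows]  # map raw tuples back to Event


def affordable_shows(directory, city: str, max_price_dollars: int = 50) -> list[Event]:
    cents_limit = max_price_dollars * 100
    return [e for e in directory.find_in(city) if e.ticket_price_cents <= cents_limit]


EVENTS = [
    Event("Tame Impala", "Hollywood Bowl", "2026-04-12", "Los Angeles", 6700),
    Event("Local Open Mic", "Echo Park Bar", "2026-03-15", "Los Angeles", 1500),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shows("
    "title TEXT, venue TEXT, date_iso TEXT, city TEXT, ticket_price_cents INT)"
)
conn.executemany(
    "INSERT INTO shows VALUES (?, ?, ?, ?, ?)",
    [(e.title, e.venue, e.date_iso, e.city, e.ticket_price_cents) for e in EVENTS],
)

# The swap: same client function, two different directories, same answer.
from_memory = affordable_shows(InMemoryEventDirectory(EVENTS), "Los Angeles", 80)
from_sqlite = affordable_shows(SQLiteEventDirectory(conn, "shows"), "Los Angeles", 80)
same_titles = {e.title for e in from_memory} == {e.title for e in from_sqlite}
```

If `SQLiteEventDirectory.find_in` returned raw tuples instead of Event objects, the `e.ticket_price_cents` access inside `affordable_shows` would fail for the SQLite run only, which is the kind of asymmetry a swap test is meant to expose.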

One pedagogically important note about your __main__ demo. The import sqlite3 in tour_planner.py’s __main__ is fine: that’s the wiring layer (sometimes called the composition root or bootstrap), where you finally pick which concrete implementation to use. The rule isn’t “nobody outside events.py may say sqlite3”; the rule is “nobody outside events.py may depend on sqlite3 in their business logic.” Wiring code is exempt because the choice has to live somewhere.

Step 5 — Knowledge Check

Min. score: 80%

1. After the refactor, which of these changes touches only one file (the file that owns the storage secret)?

Switching the production catalog from SQLite to a remote HTTP service.
Renaming the ticket_price_cents field on Event to cents.
Renaming a field on Event is a contract change — Event is shared by every client and every implementation. That ripples.
Changing affordable_shows to return JSON instead of Event objects.
affordable_shows returns to its caller. Changing the return type changes that contract.
Adding a new field age_restriction to Event.
Adding a field touches Event (contract) AND every implementation that constructs Events (every directory). That’s 2+ files.

Storage is the secret. An HttpEventDirectory is a new class behind the same EventDirectory Protocol: zero edits to tour_planner.py, zero edits to Event, zero edits to InMemoryEventDirectory. That is what you bought with the refactor.

2. Which of these are good reasons your InMemoryEventDirectory is worth writing, even though production uses SQLite? (Select all that apply.)

Tests don’t need a real database — they construct events in Python and run instantly.
Two engineers can work in parallel: one builds the SQLite implementation, one builds the client features, both against InMemoryEventDirectory.
Future readers see two implementations and learn the Protocol is what the contract really is — not whatever SQLite happens to do.
It lets you ship the product to customers who don’t have SQLite installed.
SQLite ships with Python’s standard library — no install needed. This isn’t a real win and you shouldn’t reach for it as the justification. The other three reasons are the real ones.

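Information hiding pays in three currencies, all visible here: testability (1), parallel work (2), and comprehensibility (3). Parallel work and comprehensibility are two of the three benefits Parnas listed in his 1972 paper (the third is changeability, which the refactor itself demonstrates); testability follows from the same independence. The fourth option sounds plausible but isn’t actually true.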

3. A teammate writes a new client function that takes an EventDirectory parameter. But for “convenience,” they also add a second parameter connection: sqlite3.Connection, because they need to do a one-off transaction-level operation. What is the problem?

Two parameters is too many; pick one.
Two parameters is fine. The number isn’t the issue.
It re-introduces the sqlite3 coupling the Protocol was meant to remove.
Nothing — EventDirectory is just an interface, you can take other parameters too.
True, EventDirectory is a Protocol, and you can take other parameters alongside it. But if the other parameter re-introduces the very coupling the Protocol was meant to remove, you have undone the abstraction.
You shouldn’t use sqlite3 in Python; use SQLAlchemy.
The library choice isn’t what’s at stake. Even an SQLAlchemy session parameter would re-couple the same way.

This is a real, subtle anti-pattern: “leaky abstraction by parameter creep.” With a connection parameter, the client now works only with SQLite. The Protocol promises you don’t need a connection to ask about events, and that promise breaks the moment a function also requires a connection. The fix is to push the transactional operation behind the Protocol (e.g., a new EventDirectory.archive(event) method) so the connection stays local to the directory implementation.
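A minimal sketch of that fix, under stated assumptions: `archive` is interpreted here as "remove the event from the directory", `Event` is trimmed to two fields, and `retire_city` is a hypothetical client. None of these come from the tutorial's solution files.

```python
import sqlite3
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Event:  # trimmed to two fields for the sketch
    title: str
    city: str


class EventDirectory(Protocol):
    def find_in(self, city: str) -> list[Event]: ...
    def archive(self, event: Event) -> None: ...  # hypothetical new operation


class InMemoryEventDirectory:
    def __init__(self, events: list[Event]) -> None:
        self._events = list(events)

    def find_in(self, city: str) -> list[Event]:
        return [e for e in self._events if e.city == city]

    def archive(self, event: Event) -> None:
        self._events.remove(event)


class SQLiteEventDirectory:
    def __init__(self, connection: sqlite3.Connection, table: str) -> None:
        self._conn = connection
        self._table = table  # assumed to be a trusted, hard-coded name

    def find_in(self, city: str) -> list[Event]:
        rows = self._conn.execute(
            f"SELECT title, city FROM {self._table} WHERE city = ?", (city,)
        ).fetchall()
        return [Event(*row) for row in rows]

    def archive(self, event: Event) -> None:
        # The transaction lives here, next to the connection that needs it.
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                f"DELETE FROM {self._table} WHERE title = ? AND city = ?",
                (event.title, event.city),
            )


def retire_city(directory: EventDirectory, city: str) -> int:
    """Hypothetical client: archives every event in a city. No connection in sight."""
    events = directory.find_in(city)
    for event in events:
        directory.archive(event)
    return len(events)
```

The client signature stays `(directory, city)`; only the SQLite-backed class ever touches a connection or a transaction.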

4. Spaced retrieval — Step 4 (overspecification). Your EventDirectory.find_in(city) returns list[Event]. The team is asked to add pagination. Two design proposals:

A: find_in(self, city: str, *, page: int, page_size: int) -> list[Event]
B: find_in(self, city: str, *, cursor: str | None = None, limit: int = 50) -> EventPage where EventPage is @dataclass(frozen=True) with events: list[Event] and next_cursor: str | None.

Which design hides more, in Parnas’s sense?

A — page numbers are universal and familiar.
Page numbers are familiar but they’re an implementation detail. They imply numeric offset and stable ordering between calls — two assumptions that break the moment the directory is backed by an event stream or a sharded service.
B — cursors hide whether pagination is offset-, keyset-, or token-based.
Neither — they hide the same amount of implementation detail.
They look similar but B abstracts more: any pagination strategy can return some opaque cursor; only offset pagination can return a page number.
A — EventPage adds a useless extra class for no gain.
Wrapping the result in a small dataclass costs almost nothing and lets you evolve the response shape without breaking callers. (Useless? Add total_known: int later and no caller breaks.)

Clients only care that there is a next_cursor (or none) — same lesson as the recommender’s Confidence enum. Naming the domain concept (next_cursor) instead of the algorithm detail (page: int) hides one more design decision. The reader who studied Step 4 should now find this pattern instinctive — a faded transfer of the exact same idea, applied to a different domain.
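Design B can be sketched as follows. This is an illustration, not the tutorial's code: `Event` is trimmed to two fields, `all_titles` is a hypothetical client, and the in-memory directory happens to encode an offset inside the cursor string. `Optional[str]` is used as the equivalent of `str | None` so the block also runs on Python 3.9.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:  # trimmed for the sketch
    title: str
    city: str


@dataclass(frozen=True)
class EventPage:
    events: list[Event]
    next_cursor: Optional[str]


class InMemoryEventDirectory:
    def __init__(self, events: list[Event]) -> None:
        self._events = list(events)

    def find_in(
        self, city: str, *, cursor: Optional[str] = None, limit: int = 50
    ) -> EventPage:
        matches = [e for e in self._events if e.city == city]
        # This implementation encodes an offset in the cursor. Callers must
        # treat the string as opaque: another directory could encode a keyset
        # position or a server-issued token instead.
        start = int(cursor) if cursor is not None else 0
        end = start + limit
        next_cursor = str(end) if end < len(matches) else None
        return EventPage(events=matches[start:end], next_cursor=next_cursor)


def all_titles(directory, city: str, limit: int = 2) -> list[str]:
    """Client loop: only knows 'pass next_cursor back until it is None'."""
    titles: list[str] = []
    cursor = None
    while True:
        page = directory.find_in(city, cursor=cursor, limit=limit)
        titles.extend(e.title for e in page.events)
        if page.next_cursor is None:
            return titles
        cursor = page.next_cursor
```

The client never parses the cursor, so swapping the offset encoding for a token issued by a remote service would not change `all_titles`.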

5. Spaced retrieval — Step 2 (representation). Your Event dataclass is declared @dataclass(frozen=True). Why frozen, specifically, and not just @dataclass?

Frozen dataclasses are faster at construction.
Performance is not the reason. If anything, frozen dataclasses are marginally slower to construct, because the generated __init__ has to assign fields through object.__setattr__. The reason to choose frozen is semantic (mutation safety).
Callers can’t mutate an Event they received from find_in().
Frozen is required for dataclasses to work with Protocol.
Frozen and Protocol are independent features. A non-frozen dataclass still satisfies a Protocol if the methods match.
Frozen automatically makes all fields private.
Frozen has nothing to do with field privacy. All dataclass fields are public regardless of frozenness.

Step 2’s lesson, retrieved through the Step 5 lens. Step 2’s Track was frozen so top_tracks(3) could safely return references to internal tracks without callers mutating them. Step 5’s Event is frozen for the same reason: an in-memory directory may cache Event instances, and a caller who mutates one would corrupt that cache without warning. Frozen dataclasses are the cheapest way to make a domain object both typed and safe to hand out — the same one-line move you used in Step 2 generalizes here.
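A small demonstration of what frozen buys, using a two-field stand-in for Event and a hypothetical non-frozen twin (`MutableEvent`) for contrast:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Event:  # trimmed to two fields for the sketch
    title: str
    ticket_price_cents: int


@dataclass
class MutableEvent:  # same shape, not frozen (hypothetical, for contrast)
    title: str
    ticket_price_cents: int


cached = Event("Local Open Mic", 1500)
try:
    cached.ticket_price_cents = 0  # a caller trying to edit a shared instance
    blocked = False
except FrozenInstanceError:
    blocked = True

leaky = MutableEvent("Local Open Mic", 1500)
leaky.ticket_price_cents = 0  # succeeds silently: any cache holding it is now wrong
```

Note that frozen is shallow: it blocks rebinding attributes, but a field that holds a mutable object (a list, say) could still be mutated in place. Event avoids the issue because all of its fields are strings and ints.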

6

Single Choice: Stop Repeating the Provider List

🧠 Before you read — retrieve from memory

You’re about to do refactor #4. Before another worked example layers on top of the routine you’ve practiced three times already, your brain needs a chance to produce it cold — that’s what makes the next refactor cheaper than the last one, instead of just longer to read.

You’ve now done three refactors that each followed the same five-step routine. Cover the screen and write the five labels of that routine from memory. (A scrap of paper, a comment in your editor, or your head — any form is fine. Just don’t peek.)

Reveal (after you've written your version)

The canonical labels, the same five every time, from Parnas's design-for-change discipline:

```text
1. Name the change.            What is about to change, and why is it likely?
2. Name the secret.            Which design decision should one module own?
3. Minimum client assumptions. What does the client *actually* need to know?
4. Remove the leak.            Replace exposed representation with domain operations,
                               a Protocol, dependency injection: whatever names the
                               contract without naming the decision.
5. Verify with a swap.         Same client, different implementation, same output.
```

If yours matched word-for-word, your schema is solidifying; that's exactly what spaced retrieval is supposed to do. If you got 4 out of 5 (most students do by this point), notice *which* you missed: the one most often dropped is **#3** (minimum client assumptions), because it's the only step that asks you to reason about the *client* rather than the module being refactored.

Karpicke & Roediger (2008) found that repeatedly retrieving material from memory produced substantially better long-term retention than additional study of the same material. The 30 seconds you just spent writing the routine from memory is the cheapest learning move in this tutorial.

Why this matters

Open any production codebase and search for if provider ==. You’ll find the same alphabetical list of providers in four files. Add a fifth provider and you edit all four — and inevitably miss one, shipping a “feature works on Spotify but silently breaks on Tidal” bug. The SEBook chapter calls this the Single Choice principle: when a system supports several alternatives, only one module should know the exhaustive list. This step makes Single Choice operational. The killer test: you’ll add a fourth provider — invisible to your refactored code — and three client functions will work with it unchanged.

🎯 You will learn to

Apply the Single Choice principle by replacing scattered if provider == "..." switches with polymorphism behind a hidden choice point
Analyze code for repeated exhaustive lists (the same set of "spotify", "apple_music", "tidal" strings in multiple files is the smell)
Create a new provider class that satisfies the StreamingProvider Protocol — and feel that no existing client function had to change to absorb it

The scene

streaming.py has three top-level functions: play_track, share_track, like_track. Each one has the identical if provider == "spotify": ... elif provider == "apple_music": ... elif provider == "tidal": ... ladder. The product manager just said: “Add YouTube Music. Same operations.” The bad design: the same edit repeated in all three functions. The good design: one new class. The test enforces the second.

✏️ Predict before you run

Today’s streaming.py repeats the provider list in three functions. If we add YouTube Music in the current design, how many elif branches must be added across the file?

(a) 1 — a new branch in one function is enough.
(b) 3 — one new branch per function, three total.
(c) 4 — three new branches plus a new helper function.
(d) 0 — Python’s match statement handles it.

Commit. Then refactor and see the answer for the good design.

Your task

Refactor streaming.py:

Define class StreamingProvider(Protocol) with play(self, track_id) -> str, share(self, track_id, friend) -> str, like(self, track_id) -> str. Each returns the message string that the current code prints.
Define class SpotifyProvider, class AppleMusicProvider, class TidalProvider — each implements all three methods.
Rewrite play_track(provider: StreamingProvider, track_id: str), share_track(...), and like_track(...) so each just delegates to the corresponding method on the passed-in provider — no if/elif/match ladders anywhere.

The hidden test will then construct a fourth provider — YouTubeMusicProvider — which your code has never seen. If your play_track/share_track/like_track functions are properly polymorphic, that fourth provider will Just Work. If any branching on "youtube_music" is needed, the test fails.

🪞 Before clicking Next

Once all three tests pass, do this self-check before the quiz:

Without using the word “Protocol”: search this tutorial mentally across all four refactors (Steps 2, 4, 5, and 6). In each one, you replaced direct exposure of a design decision with what kind of thing? The four answers are different in form but all instances of the same move.

The four are: (Step 2) domain operations on a class, (Step 4) a typed shape + dataclass, (Step 5) dependency injection of a typed shape, (Step 6) polymorphism on a typed shape. Each one is a different way to make a contract not name the volatile decision. (The forbidden word forces you to name what each refactor was for, not the Python construct it used.) The quiz’s last question asks this in MCQ form.

Starter files

streaming.py

"""STARTING STATE.

Three functions, each with the same provider ladder. The "exhaustive
list of providers" is duplicated three times. Refactor with
polymorphism behind a hidden choice point.
"""


def play_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Playing {track_id} on Spotify..."
    elif provider == "apple_music":
        return f"Playing {track_id} on Apple Music..."
    elif provider == "tidal":
        return f"Streaming {track_id} on Tidal hi-fi..."
    else:
        raise ValueError(f"Unknown provider: {provider}")


def share_track(provider: str, track_id: str, friend: str) -> str:
    if provider == "spotify":
        return f"Shared Spotify link {track_id} with {friend}"
    elif provider == "apple_music":
        return f"Sent Apple Music card for {track_id} to {friend}"
    elif provider == "tidal":
        return f"Tidal shared {track_id} to {friend}"
    else:
        raise ValueError(f"Unknown provider: {provider}")


def like_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Liked Spotify track {track_id}"
    elif provider == "apple_music":
        return f"Loved Apple Music track {track_id}"
    elif provider == "tidal":
        return f"Added Tidal track {track_id} to favorites"
    else:
        raise ValueError(f"Unknown provider: {provider}")


# TODO Replace the ladders with:
#   1. ``class StreamingProvider(Protocol)`` (play, share, like)
#   2. ``SpotifyProvider``, ``AppleMusicProvider``, ``TidalProvider``
#   3. Rewrite play_track / share_track / like_track to delegate

if __name__ == "__main__":
    print(play_track("spotify", "t1"))
    print(share_track("apple_music", "t1", "Alex"))
    print(like_track("tidal", "t9"))

Solution

streaming.py

"""Polymorphism behind a hidden choice point — Single Choice in one file."""

from typing import Protocol


class StreamingProvider(Protocol):
    def play(self, track_id: str) -> str: ...
    def share(self, track_id: str, friend: str) -> str: ...
    def like(self, track_id: str) -> str: ...


class SpotifyProvider:
    def play(self, track_id):
        return f"Playing {track_id} on Spotify..."
    def share(self, track_id, friend):
        return f"Shared Spotify link {track_id} with {friend}"
    def like(self, track_id):
        return f"Liked Spotify track {track_id}"


class AppleMusicProvider:
    def play(self, track_id):
        return f"Playing {track_id} on Apple Music..."
    def share(self, track_id, friend):
        return f"Sent Apple Music card for {track_id} to {friend}"
    def like(self, track_id):
        return f"Loved Apple Music track {track_id}"


class TidalProvider:
    def play(self, track_id):
        return f"Streaming {track_id} on Tidal hi-fi..."
    def share(self, track_id, friend):
        return f"Tidal shared {track_id} to {friend}"
    def like(self, track_id):
        return f"Added Tidal track {track_id} to favorites"


def play_track(provider: StreamingProvider, track_id: str) -> str:
    return provider.play(track_id)


def share_track(provider: StreamingProvider, track_id: str, friend: str) -> str:
    return provider.share(track_id, friend)


def like_track(provider: StreamingProvider, track_id: str) -> str:
    return provider.like(track_id)


# ---- Wiring (the ONE place that knows the exhaustive list of providers) ----
_REGISTRY: dict[str, type[StreamingProvider]] = {
    "spotify":      SpotifyProvider,
    "apple_music":  AppleMusicProvider,
    "tidal":        TidalProvider,
}


def provider_for(name: str) -> StreamingProvider:
    """Composition-root helper: pick a provider by name from a config string."""
    if name not in _REGISTRY:
        raise ValueError(f"Unknown provider: {name}")
    return _REGISTRY[name]()


if __name__ == "__main__":
    print(play_track (provider_for("spotify"),     "t1"))
    print(share_track(provider_for("apple_music"), "t1", "Alex"))
    print(like_track (provider_for("tidal"),       "t9"))

The Single Choice payoff, in one sentence. Adding a fourth provider is one new class and one new entry in the wiring registry. The client functions play_track, share_track, like_track do not change. The test proved it by constructing YouTubeMusicProvider outside your code and passing it through — zero edits required.
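What the hidden test does can be approximated like this. The Protocol and the three client functions are copied from the solution; the body of `YouTubeMusicProvider` and its message strings are invented for illustration, since the real test's class is not shown.

```python
from typing import Protocol


class StreamingProvider(Protocol):
    def play(self, track_id: str) -> str: ...
    def share(self, track_id: str, friend: str) -> str: ...
    def like(self, track_id: str) -> str: ...


# Client functions, identical to the solution: no provider names anywhere.
def play_track(provider: StreamingProvider, track_id: str) -> str:
    return provider.play(track_id)


def share_track(provider: StreamingProvider, track_id: str, friend: str) -> str:
    return provider.share(track_id, friend)


def like_track(provider: StreamingProvider, track_id: str) -> str:
    return provider.like(track_id)


class YouTubeMusicProvider:
    """Hypothetical fourth provider, written after the client functions."""

    def play(self, track_id: str) -> str:
        return f"Playing {track_id} on YouTube Music..."

    def share(self, track_id: str, friend: str) -> str:
        return f"Shared YouTube Music link {track_id} with {friend}"

    def like(self, track_id: str) -> str:
        return f"Liked YouTube Music track {track_id}"


youtube = YouTubeMusicProvider()
played = play_track(youtube, "t1")
shared = share_track(youtube, "t1", "Alex")
liked = like_track(youtube, "t9")
```

The new class never inherits from `StreamingProvider` and is never mentioned by the three functions; having the three methods is enough.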

Where the choice still lives. Notice that the exhaustive list of provider names does still exist — in _REGISTRY. That’s deliberate. Single Choice doesn’t say “no module knows the list.” It says “only one module knows the list.” The wiring layer is that module. Every other module sees a StreamingProvider and forgets which one it is.

The general pattern this step taught. When you find the same exhaustive list (provider, payment_method, tax_jurisdiction, auth_strategy, etc.) appearing in if ladders in multiple files, the fix is always the same shape:

Define a Protocol for the operation set.
Make each alternative a class implementing the Protocol.
Have client code call the operation on an injected instance.
Put the exhaustive list in one wiring/registry module.

This is the chapter’s Single Choice principle, made operational. Now, when you encounter the same shape at work (or in your CS130 group project), you have a routine — not just a name.

Step 6 — Knowledge Check

Min. score: 80%

1. The Single Choice principle says: if a system supports several alternatives, only one module should know the exhaustive list. In your refactored code, where does the list of supported providers actually live?

Inside play_track, share_track, and like_track — they each list which providers they support.
That’s where it lived before the refactor. The three functions are now provider-agnostic. They never list providers.
Inside the StreamingProvider Protocol — it lists all valid implementers.
A Protocol describes the shape providers must satisfy. It doesn’t list which providers exist — that’s a runtime, not a type, concern.
In the wiring code (composition root) that constructs the provider and hands it to the client functions.
Nowhere — Python’s duck typing means no module knows.
Duck typing tells you how the dispatch happens. The list of supported providers still has to be assembled somewhere — at the wiring layer. The win is that the list is in one place.

The polymorphic dispatch (provider.play(track_id)) replaces the if/elif ladder, but the list still has to exist somewhere — when the app starts up, someone decides “today we’re using YouTube Music.” That somewhere is the wiring code (composition root) — and now it’s the only place that knows the full list. Adding a fifth provider is one new class + one new wiring entry. That’s Single Choice.

2. Before the refactor, the same if provider == "spotify": ... elif "apple_music" ... ladder appeared in three functions. What kind of coupling connected those three functions?

Syntactic coupling — they all imported the same module.
They all lived in the same file but didn’t import each other. The coupling was deeper than syntax.
Semantic coupling — all three shared an assumption about the provider list.
Inheritance coupling — they all derived from a common base.
They were functions, not classes. No inheritance involved.
There was no coupling — three separate functions are independent.
They were dangerously coupled — through a shared assumption, not through a call. That’s the worst kind: invisible until something silently breaks.

Semantic coupling is the SEBook chapter’s term for “two modules share the same assumption without saying so.” Change the list in one place and the others silently disagree. The provider list scattered across three functions was a textbook case. Static tools such as type checkers couldn’t flag it, because each ladder was valid code on its own. The polymorphism refactor removes the shared assumption from those three functions: each now knows only the one provider method it calls. The assumption now lives in one place: the wiring.

3. Which of these is now CHEAP to do, after your Single Choice refactor? (Select all that apply.)

Add a fourth provider, e.g. YouTube Music — one new class, no edits to play_track/share_track/like_track.
A/B-test two implementations of the Spotify provider at the same time (e.g., legacy SDK vs. new SDK) by passing different SpotifyProvider instances to different users.
Drop one provider entirely (e.g. drop Tidal): delete the TidalProvider class and remove it from the wiring map.
Rename the method like to favorite across all providers — that’s one keyword change in one file.
Renaming a method on the Protocol is a contract change — every implementation must update, and every caller of provider.like(...) must update. That ripples. It’s the same lesson as Step 2’s quiz: renaming a public method on the contract is the opposite of an absorbed change.

Three of these are absorbed by the refactor; the fourth (renaming like to favorite) is a contract change and ripples. A/B-testing two implementations of the same provider at once is option (2), and it’s only possible because the wiring layer hands out provider instances, not provider strings. That kind of flexibility is one of the quietest big wins of polymorphism-behind-a-Protocol.
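One way the A/B case could look, as a sketch only: the two Spotify classes, the `NEW_SDK_USERS` set, and `provider_for_user` are all hypothetical names, and a real rollout would likely bucket users differently.

```python
class LegacySpotifyProvider:  # hypothetical: wraps the old SDK
    def play(self, track_id: str) -> str:
        return f"Playing {track_id} on Spotify (legacy SDK)..."


class NewSdkSpotifyProvider:  # hypothetical: wraps the new SDK
    def play(self, track_id: str) -> str:
        return f"Playing {track_id} on Spotify (new SDK)..."


def play_track(provider, track_id: str) -> str:
    # Same one-line delegation as the solution; unaware of the experiment.
    return provider.play(track_id)


# Wiring: the only code that knows an experiment is running.
NEW_SDK_USERS = frozenset({"u2"})  # assumed experiment group


def provider_for_user(user_id: str):
    if user_id in NEW_SDK_USERS:
        return NewSdkSpotifyProvider()
    return LegacySpotifyProvider()


control = play_track(provider_for_user("u1"), "t1")
treatment = play_track(provider_for_user("u2"), "t1")
```

With a `provider: str` parameter both users would have passed `"spotify"`, and the two variants would have been indistinguishable to the client code.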

4. Spaced retrieval — Step 1 (private isn’t enough). A teammate asks: “Could we have solved the original provider-coupling problem just by making provider a private field on a single shared module, instead of three top-level functions?” What’s the cleanest objection?

Private fields would make the code uncompilable in Python.
Python compiles fine with private (or even pseudo-private) attributes.
No — the if/elif ladder would still appear in every method.
Single Choice requires every variable to be public.
Single Choice says nothing about field visibility — it’s about where the list of alternatives lives.
Private fields are inherently slower at runtime.
Speed is irrelevant to the question.

This is Step 1’s lesson, retrieved through Step 6’s lens. Visibility modifiers are not the unit of information hiding. Hiding the provider: str field as self._provider would still leave the same if self._provider == "spotify": ... elif ... ladder in every method — the shape of the leak doesn’t change. The Single Choice violation lives in the branching ladder, not the variable name. The fix is structural (polymorphism on a Protocol), not lexical (more underscores). Same lesson as Step 1’s WatchHistory._history: the underscore did not hide the design decision.
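The teammate's proposal can be sketched to see why it fails. `Player` and `try_new_provider` are hypothetical names; the message strings are taken from the starter file.

```python
class Player:
    """Hypothetical 'fix': the provider string is now a private field."""

    def __init__(self, provider: str) -> None:
        self._provider = provider  # renamed, not hidden

    def play(self, track_id: str) -> str:
        if self._provider == "spotify":
            return f"Playing {track_id} on Spotify..."
        elif self._provider == "apple_music":
            return f"Playing {track_id} on Apple Music..."
        elif self._provider == "tidal":
            return f"Streaming {track_id} on Tidal hi-fi..."
        raise ValueError(f"Unknown provider: {self._provider}")

    def like(self, track_id: str) -> str:
        # The same exhaustive list again: the ladder moved, it did not go away.
        if self._provider == "spotify":
            return f"Liked Spotify track {track_id}"
        elif self._provider == "apple_music":
            return f"Loved Apple Music track {track_id}"
        elif self._provider == "tidal":
            return f"Added Tidal track {track_id} to favorites"
        raise ValueError(f"Unknown provider: {self._provider}")


def try_new_provider() -> list[str]:
    """Return the names of methods that reject a provider nobody added a branch for."""
    player = Player("youtube_music")
    failures = []
    for method in (player.play, player.like):
        try:
            method("t1")
        except ValueError:
            failures.append(method.__name__)
    return failures
```

Every method still needs its own new branch for YouTube Music, so the number of edits per new provider is unchanged.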

5. Spaced retrieval across the tutorial. Which of the following best describes the single common move that connects Steps 2, 4, 5, and 6?

Add a @dataclass(frozen=True) to every input the SUT receives.
Frozen dataclasses are a tool you used in three of the four refactors, and only for domain objects, not for every input. Confusing the tool with the goal misses the point.
Wrap each free function in a class with a single run() method.
Wrapping things in classes was the visible artifact. The principle was always ‘hide the volatile decision’, not ‘wrap in class’.
Replace direct exposure of a volatile decision with a stable contract.
Add type hints to every parameter so the linter can find leaks.
Type hints helped the linter and the reader. They didn’t hide anything by themselves.

Every refactor step did the same operation at a different level: identify the design decision that is likely to change, then replace its direct exposure with a stable contract that does not name it. Storage (Step 2). Algorithm + score scale (Step 4). Persistence technology (Step 5). Exhaustive list of providers (Step 6). The form changes per step (domain methods, then Protocol + dataclass, then dependency injection, then polymorphism), but the principle is one move repeated. That’s the entire skill this tutorial trains. Step 3 gave you the Python construct that all of them used; Step 7 will ask you to recognize the type of leak before you fix it.

7

Sort the Leaks

Why this matters

Steps 2-6 each taught one kind of leak in isolation — that’s blocked practice, and it’s the right shape for building each schema. But real codebases mix leak types, and the skill an engineer actually needs is classification first, fix second: read a snippet, identify which kind of leak it is (or whether it’s a leak at all), and then pick the right routine.

This step is pure judgment — no code to write, no files to refactor. Six short snippets. For each one, you decide what kind of leak (if any) is present and which step’s routine fixes it.

🎯 You will learn to

Discriminate between the four leak types you’ve practiced — by attending to deep structure, not surface cues
Recognize when a snippet is not a leak, and resist the “always abstract” instinct
Match each leak to the step that taught its fix (representation = Step 2, overspecification = Step 4, persistence = Step 5, exhaustive-alternatives = Step 6)

How to read each snippet

Every snippet has a specific design decision visible (or appropriately hidden). The deep-structure cue you’re looking for: what would have to change in clients if the implementation chose differently? If nothing would, it’s not a leak. If many clients would, name the type and pick the routine.

The same five-step routine you retrieved at the start of Step 6 applies to every fix. This step trains the “which routine?” judgment that comes before applying it.

Research base: Rohrer & Taylor (2007) and Dunlosky et al. (2013) find that interleaved practice produces worse performance during practice but dramatically better transfer afterward, because mixing examples forces attention to the structural feature rather than the surface feature. The six questions in this step might feel harder than Steps 2-6 did. That’s the point.

Starter files

SNIPPETS.md

# Step 7 — Sort the Leaks

Six short snippets are in the quiz on the right. Each shows a small
Python module. For each one, decide:

1. Is there a leak?
2. If yes, which *kind* — representation (Step 2), over-specification
   (Step 4), persistence (Step 5), or exhaustive-alternatives (Step 6)?
3. If no, why is the abstraction unnecessary here?

You will not edit code in this step. The skill being trained is
classification — the move that comes *before* picking a fix.

Solution

SNIPPETS.md

# Step 7 — answer key

| Snippet | Leak type                       | Routine to apply |
|---------|---------------------------------|-------------------|
| 1       | Representation                  | Step 2 — frozen dataclass + domain methods |
| 2       | Over-specification              | Step 4 — Protocol + domain-level labels |
| 3       | Persistence                     | Step 5 — Protocol over storage, dependency injection |
| 4       | Exhaustive alternatives         | Step 6 — polymorphism behind a Protocol |
| 5       | **Not a leak** — scope too small | Don't refactor; revisit if growth makes it plausible |
| 6       | Representation **and** Exhaustive alternatives | Steps 2 + 6, combined |

What you just trained. Each of Steps 2-6 taught one kind of leak in isolation — blocked practice, useful for building each schema. This step mixed them — interleaved practice, useful for building discrimination. Research (Rohrer & Taylor 2007; Dunlosky et al. 2013) finds that interleaved practice feels harder during practice but produces dramatically better transfer afterward, because it forces attention to the structural feature (the design decision being exposed) rather than the surface feature (the language construct or domain vocabulary).

The honest one (Snippet 5). The correct answer was “don’t refactor”. If the entire takeaway of this tutorial were “always hide”, you’d over-apply the principle and produce abstractions that never pay for themselves. The principle is a bet on future change. Bet where change is plausible; abstain where it isn’t. The next step, change-impact prediction on a whole system, uses this same calibration.

The two-leak one (Snippet 6). Real codebases stack leaks. The fix is to apply both routines, in either order. The fact that you can name which routine applies to which leak is the operational form of the skill the tutorial trains.

Step 7 — Knowledge Check

Min. score: 80%

1. Snippet 1.

class ConcertCalendar:
    def __init__(self):
        self._dates: list[dict] = []
    def add(self, date_iso: str, venue: str):
        self._dates.append({"date": date_iso, "venue": venue})
    def all(self) -> list[dict]:
        return self._dates

What kind of leak — if any — does all() introduce?

Representation leak — clients can mutate the internal list and depend on its keys.
Over-specification leak — the contract reveals a scoring/ordering detail.
There’s no algorithmic ordering being revealed — no scores, no scale. The leak is shape, not algorithm.
Persistence leak — a storage technology is exposed in the contract.
No storage technology mentioned. The list is in-memory and an implementation detail. The leak is shape, not persistence.
Exhaustive-alternatives leak — an if x == ladder appears in multiple places.
No exhaustive list of alternatives appears.
Not a leak — this contract is fine.
Returning the internal list[dict] by reference is the exact anti-pattern Step 2 refuted (and Step 1’s WatchHistory.recent() before it). Clients can mutate it and depend on dict keys. That’s a leak.

Same shape as Step 2’s Playlist before refactoring, and Step 1’s WatchHistory.recent() before its fix. Leaked decisions: “storage is a list”, “items are dicts with these keys”, “iteration order is insertion order”, “clients can mutate it.” The fix is the representation refactor (Step 2 routine): a frozen dataclass + domain methods, with all() returning a domain-typed sequence.
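One way the Step 2 routine could look on this snippet, as a sketch only. The `ConcertDate` type and the tuple return are assumptions; the tutorial's own solution may name things differently:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConcertDate:
    """Hypothetical domain type replacing the raw dict."""
    date_iso: str
    venue: str


class ConcertCalendar:
    def __init__(self) -> None:
        self._dates: list[ConcertDate] = []

    def add(self, date_iso: str, venue: str) -> None:
        self._dates.append(ConcertDate(date_iso, venue))

    def all(self) -> tuple[ConcertDate, ...]:
        # Return an immutable snapshot, never the internal list itself.
        return tuple(self._dates)


calendar = ConcertCalendar()
calendar.add("2025-06-01", "Town Hall")
snapshot = calendar.all()
print(snapshot[0].venue)
```

Callers now depend on `ConcertDate` attributes rather than dict keys, and nothing they do to `snapshot` reaches `_dates`.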

2. Snippet 2.

def rank_articles(query: str) -> list[tuple[int, float, dict]]:
    """Returns (shard_id, tfidf_score, raw_row) tuples."""
    ...

What kind of leak — if any?

Representation leak — list and dict show internal storage.
Yes, the return type uses list and dict, but the deeper problem is what those values mean: tfidf_score and its scale lock the contract to one algorithm. The shape is a symptom; the algorithm leak is the root.
Over-specification leak — the contract names the score scale and algorithm.
Persistence leak — a database schema is exposed.
No database object is in the signature.
Exhaustive-alternatives leak — a list of providers is repeated.
No ladder of alternatives appears.
Not a leak — the contract is precise, which is good.
Precision and over-specification are different things. The contract specifies more than the client needs — the score scale and shard_id are implementation choices, not domain concepts. Step 4’s exact lesson.

This is Step 4’s leak: a read-only API that looks clean but exposes the ranking algorithm (tfidf_score) and its numeric scale. A future swap to embeddings (scores in 0..1) would silently break every client’s threshold. The Step 4 routine: hide it behind a Confidence enum or similar domain-level label, then expose list[ArticleHit] instead of the raw tuple.

3. Snippet 3.

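import sqlite3

def list_attendees(
    conn: sqlite3.Connection,
    table: str,
    event_id: int,
) -> list[dict]:
    rows = conn.execute(
        f"SELECT name, email FROM {table} WHERE event_id = ?",
        (event_id,),
    ).fetchall()
    # fetchall() yields tuples by default, so build the dicts the annotation promises.
    return [{"name": name, "email": email} for name, email in rows]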

What kind of leak — if any?

Representation leak — list[dict] exposes element shape.
list[dict] is a secondary issue — fixing it without fixing the Connection parameter still leaves you tied to SQLite. The deeper leak is which storage technology is named in the signature.
Over-specification leak — event_id: int ties clients to integer IDs.
Integer IDs are usually a fine domain primitive. They’re not the leak here.
Persistence leak — the signature names sqlite3.Connection and a SQL table.
Exhaustive-alternatives leak — multiple event types are switched on.
No alternatives ladder; this is a single function.
Not a leak — type hints make this safe.
Type hints describe shape; a sqlite3.Connection parameter is the leak the type hint names. Step 5’s lesson exactly — annotations don’t hide what they name.

Step 5’s canonical leak. sqlite3.Connection and a SQL table name in the public signature mean every caller compiles against SQLite — and the SRE migrating to Postgres or an HTTP service has to chase the type through every file. Step 5 routine: hide it behind an AttendeeDirectory(Protocol) with find_for_event(event_id) -> list[Attendee].
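The Step 5 routine might be sketched like this. `Attendee`, the two concrete directory classes, the `attendees` table name, and `guest_list` are assumptions introduced for illustration:

```python
import sqlite3
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Attendee:
    name: str
    email: str


class AttendeeDirectory(Protocol):
    def find_for_event(self, event_id: int) -> list[Attendee]: ...


class SQLiteAttendeeDirectory:
    """One possible backend; the table name is an assumption."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def find_for_event(self, event_id: int) -> list[Attendee]:
        rows = self._conn.execute(
            "SELECT name, email FROM attendees WHERE event_id = ?",
            (event_id,),
        ).fetchall()
        return [Attendee(name, email) for name, email in rows]


class InMemoryAttendeeDirectory:
    def __init__(self, by_event: dict[int, list[Attendee]]) -> None:
        self._by_event = by_event

    def find_for_event(self, event_id: int) -> list[Attendee]:
        return list(self._by_event.get(event_id, []))


def guest_list(directory: AttendeeDirectory, event_id: int) -> list[str]:
    # Client code names only the Protocol, never sqlite3.
    return [attendee.name for attendee in directory.find_for_event(event_id)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendees (event_id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO attendees VALUES (1, 'Ada', 'ada@example.com')")
sqlite_names = guest_list(SQLiteAttendeeDirectory(conn), 1)
memory_names = guest_list(
    InMemoryAttendeeDirectory({1: [Attendee("Ada", "ada@example.com")]}), 1
)
print(sqlite_names, memory_names)
```

Both backends satisfy the same caller, which is the implementation-swap evidence that the storage decision is actually hidden.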

4. Snippet 4.

def send_notification(channel: str, recipient: str, body: str) -> None:
    if channel == "email":
        send_email(recipient, body)
    elif channel == "sms":
        send_sms(recipient, body)
    elif channel == "push":
        send_push(recipient, body)
    else:
        raise ValueError(channel)

def queue_notification(channel: str, recipient: str, body: str) -> str:
    if channel == "email":
        return f"queued email job to {recipient}"
    elif channel == "sms":
        return f"queued sms job to {recipient}"
    elif channel == "push":
        return f"queued push job to {recipient}"
    else:
        raise ValueError(channel)

What kind of leak — if any?

Representation leak — str exposes the channel identifier.
Strings as IDs are fine; that’s not the leak. The leak is repetition of the same allowed-strings list.
Over-specification leak — channel choice reveals delivery strategy.
The channels (email/sms/push) are the domain concepts at this layer — a UI must show which ones exist. The leak is that the list is repeated, not that it’s visible.
Persistence leak — queue_notification mentions a queue.
The word ‘queue’ is a domain concept here, not a storage technology.
Exhaustive-alternatives leak — the channel list is repeated in two functions.
Not a leak — clear if/elif is good practice.
Repeating the same if x == ladder in multiple functions is the exact Single Choice violation Step 6 refuted. Adding a fourth channel requires editing both functions; missing either ships a half-broken feature.

Step 6’s leak. The exhaustive list (email/sms/push) appears in two functions; nothing forces them to stay in sync. Step 6 routine: polymorphism behind a NotificationChannel(Protocol) with send and queue methods. One module — the wiring/composition root — owns the list of supported channels.
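A sketch of the Step 6 routine on this snippet. The concrete channel classes, the `CHANNELS` registry, and `channel_for` are hypothetical names, and `send` returns a string here (instead of calling `send_email` or `send_sms`) only so the example is self-contained:

```python
from typing import Protocol


class NotificationChannel(Protocol):
    def send(self, recipient: str, body: str) -> str: ...
    def queue(self, recipient: str, body: str) -> str: ...


class EmailChannel:
    def send(self, recipient: str, body: str) -> str:
        return f"emailed {recipient}: {body}"  # stand-in for send_email(...)

    def queue(self, recipient: str, body: str) -> str:
        return f"queued email job to {recipient}"


class SmsChannel:
    def send(self, recipient: str, body: str) -> str:
        return f"texted {recipient}: {body}"  # stand-in for send_sms(...)

    def queue(self, recipient: str, body: str) -> str:
        return f"queued sms job to {recipient}"


# The single place that lists the supported channels.
CHANNELS: dict[str, NotificationChannel] = {
    "email": EmailChannel(),
    "sms": SmsChannel(),
}


def channel_for(name: str) -> NotificationChannel:
    try:
        return CHANNELS[name]
    except KeyError:
        raise ValueError(name) from None


def queue_notification(channel: NotificationChannel, recipient: str, body: str) -> str:
    # No ladder: the channel object carries its own behavior.
    return channel.queue(recipient, body)


print(queue_notification(channel_for("sms"), "bob", "hi"))
```

Adding a push channel in this sketch means one new class and one new registry entry; neither `queue_notification` nor any other caller changes.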

5. Snippet 5 — the honest one.

# cleanup_old_drafts.py
# Run weekly via cron. Deletes draft files older than 30 days.

import time
from pathlib import Path

DRAFT_DIR = Path("/var/app/drafts")
CUTOFF_SECONDS = 30 * 24 * 3600

for path in DRAFT_DIR.glob("*.draft"):
    if time.time() - path.stat().st_mtime > CUTOFF_SECONDS:
        path.unlink()

A teammate proposes: “Put the file glob behind a DraftSource(Protocol) so we could swap to S3 later.” Should you?

Yes — every filesystem operation should be hidden behind a Protocol.
Information hiding is a bet on future change. A 10-line script with no other callers is the wrong place to bet. The chapter’s ‘When NOT to apply’ section exists for exactly this reason.
No — a 10-line cron script with no second caller doesn’t earn the indirection.
Yes — Step 5 said to always hide persistence.
Step 5 said to hide persistence when the future change is plausible — production catalogs with multiple readers, yes; a 10-line cron job with no second caller, no. Always-hide is the over-application failure mode this step calibrates against.
No — but only because Python’s pathlib is already an abstraction.
pathlib.Path is a thin wrapper around filesystem syscalls; it doesn’t hide which filesystem you’re touching. The reason to skip the abstraction is the script’s small scope, not pathlib’s design.

The honest answer: not every leak should be hidden. Information hiding pays in maintenance, and a 10-line cron script with no plausible second caller has no maintenance to amortize against. The layer taxes every future reader for an S3 migration that may never happen. If this script grew (multiple draft sources, multiple deletion policies, multiple environments), then the abstraction would earn its place. Until then, the indirection is pure cognitive tax. The skill is choosing when to apply, not just how — Step 8 puts a number on this with the “blast radius” exercise.

6. Snippet 6 — interleaved final.

# tournament.py
class TournamentBracket:
    def __init__(self):
        self._matches: list[dict] = []
    def add_match(self, team_a: str, team_b: str, court: str):
        self._matches.append({"a": team_a, "b": team_b, "court": court})

    def assign_court(self, match_index: int, court_provider: str) -> str:
        if court_provider == "stadium":
            return f"Stadium court for match {match_index}"
        elif court_provider == "outdoor":
            return f"Outdoor court for match {match_index}"
        elif court_provider == "indoor":
            return f"Indoor court for match {match_index}"
        raise ValueError(court_provider)

    def matches(self) -> list[dict]:
        return self._matches

Which leak types are present? (Select all that apply.)

Representation leak — matches() returns the internal list[dict] by reference.
Over-specification leak — assign_court reveals a score scale.
No score or numeric scale is exposed. The leaks are shape and exhaustive-list — not algorithmic.
Persistence leak — a database is exposed.
No storage technology appears.
Exhaustive-alternatives leak — assign_court enumerates the court-provider list inline.

Two leaks at once — the realistic case. matches() is Step 2’s representation leak; assign_court is Step 6’s Single Choice violation. The fix needs both routines: a frozen Match dataclass with domain methods and a CourtProvider(Protocol) injected at construction. Real codebases mix leak types. Classification matters as much as the fix because each leak type has a different routine attached to it.
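Both routines combined might look roughly like this. `StadiumCourts` and the `assign` method name are assumptions; only `Match` and `CourtProvider` come from the explanation above:

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Match:
    team_a: str
    team_b: str
    court: str


class CourtProvider(Protocol):
    def assign(self, match_index: int) -> str: ...


class StadiumCourts:
    def assign(self, match_index: int) -> str:
        return f"Stadium court for match {match_index}"


class TournamentBracket:
    def __init__(self, courts: CourtProvider) -> None:
        self._courts = courts  # injected, so no string ladder is needed
        self._matches: list[Match] = []

    def add_match(self, team_a: str, team_b: str, court: str) -> None:
        self._matches.append(Match(team_a, team_b, court))

    def assign_court(self, match_index: int) -> str:
        return self._courts.assign(match_index)

    def matches(self) -> tuple[Match, ...]:
        # Immutable snapshot of frozen values: the representation fix.
        return tuple(self._matches)


bracket = TournamentBracket(StadiumCourts())
bracket.add_match("Lions", "Tigers", "A")
print(bracket.assign_court(0))
print(bracket.matches())
```

The two fixes are independent in this sketch: the `Match` change touches only `add_match` and `matches`, and the provider change touches only the constructor and `assign_court`.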

8

Predict the Blast Radius

Why this matters

Information hiding is verified by simulating change — Parnas’s original test, and the one industry calls change impact analysis. A real engineer’s job isn’t to recite that classes should depend on abstractions. It’s to read a system and predict: if this changes, what else changes? This step is your final exam for the tutorial: a fresh, never-seen MusicShare app with five modules, four plausible change requests (one of which has the correct answer “don’t refactor”), one honest-tradeoff question, and one cold-transfer case from a different domain. Plus one short open-text artifact — a module guide for ui.py — to consolidate everything you’ve learned into the lightest-weight design doc Parnas, Clements & Weiss invented.

🎯 You will learn to

Predict the change-impact radius of a plausible future change in a small system before attempting the change
Evaluate when a layer of information hiding pays for itself — and when it adds cognitive overhead without proportional benefit
Apply the five-step routine on a system you’ve never seen before
Produce a Parnas/Clements/Weiss module guide for an unfamiliar module under time pressure

The MusicShare app

MusicShare ships a web UI for discovering and sharing music. Its five real modules:

| Module | Public surface (the contract) | Hidden secret |
|--------|-------------------------------|---------------|
| `recommender.py` | `Recommender(Protocol).recommend(query, *, limit) -> list[SongHit]` | scoring / ranking algorithm |
| `streaming.py` | `StreamingProvider(Protocol)` + `play_track` / `share_track` / `like_track` | which streaming service is used today |
| `playlist.py` | `Playlist` class with `add`, `top_tracks(n)`, `total_duration_minutes()`, `average_popularity()`, `__len__` | internal storage representation |
| `events.py` | `EventDirectory(Protocol).find_in(city) -> list[Event]` | which persistence backend stores concert listings |
| `ui.py` | HTTP handlers for `/search`, `/share`, `/like`, `/concerts/<city>` | how requests are routed / rendered to HTML |

Plus the wiring layer (composition_root.py) that picks today’s concrete Recommender, StreamingProvider, and EventDirectory instances.

Your tasks

Write a module guide for ui.py in the file MODULE_GUIDE.md. Use the same four labels you learned in Steps 4–5: Primary secret, Likely changes, Stable contract, Not promised. One stylistic note: Steps 4–5 wrote the guide inside a """...""" Python docstring at the top of a .py file (because there was a module file to attach it to). Here the artifact stands alone, so it’s a .md file with the four labels as Markdown ## headings instead. Same content, same Parnas/Clements/Weiss-1985 format — just rendered for Markdown instead of Python. The labels still match exactly so a future maintainer can grep for them across both formats.

One or two lines per label is enough — the artifact’s value is in the content, not the length. The test enforces substantive content under each label and that the Not promised section names at least one specific concrete decision (HTML templating, route paths, response formats, authentication, etc.).
Answer all six quiz questions below. Four are change-impact predictions on MusicShare (one of which has “don’t refactor” as the correct answer); the fifth is the honest-tradeoff question; the sixth is an unscaffolded transfer case on a system you have not seen.

The module guide is the consolidation artifact: producing a four-label document for a module you’ve never edited proves you can apply the discipline on cold material. That is the meaningful capstone for this tutorial.

Starter files

SYSTEM.md

# MusicShare system map

Five modules + wiring:

- recommender.py    — Recommender Protocol; today's concrete is PopularityRecommender.
                      Secret: scoring / ranking algorithm.
- streaming.py      — StreamingProvider Protocol; today's concretes are
                      SpotifyProvider, AppleMusicProvider, TidalProvider.
                      Secret: which streaming service is used.
- playlist.py       — Playlist class; secret: internal storage representation.
- events.py         — EventDirectory Protocol; today's concrete is SQLiteEventDirectory.
                      Secret: which persistence backend stores concert listings.
- ui.py             — HTTP handlers for /search, /share, /like, /concerts/<city>.
                      Calls only the four public contracts above (never the concrete implementations behind the Protocols).
- composition_root.py — picks today's concrete implementations and hands them to ui.py.

The quiz on the right asks you to predict, for several plausible future
changes, WHICH modules need to be edited. No code to refactor — just
your judgment, plus one short module guide.

MODULE_GUIDE.md

# Module guide — ui.py

Write the Parnas/Clements/Weiss module guide for `ui.py`. Use the four
labels exactly as below; replace the `<...>` placeholders with one or
two lines of your own reasoning.

## Primary secret

<One sentence: what design decision does ui.py own and hide?>

## Likely changes

<Bullet two or three plausible future changes this module absorbs locally.>

## Stable contract

<One or two sentences: what do callers of ui.py rely on?>

## Not promised

<Bullet at least two concrete decisions that are NOT part of ui.py's
contract — things a future maintainer must NOT depend on. Be specific:
name HTML/JSON, templating engine, exact response shapes, URL paths,
authentication scheme, etc. A generic "implementation details" line
does not count and the test will reject it.>

Solution

SYSTEM.md

# MusicShare system map — answer key

## Change 1: Add YouTube Music
Edits: streaming.py (new YouTubeMusicProvider), composition_root.py (registry entry).
That's it. ui.py and playlist.py untouched. Single Choice payoff (Step 6).

## Change 2: Migrate events to a remote HTTP service
Edits: events.py (new HttpEventDirectory), composition_root.py (swap wiring).
ui.py untouched. The 200 tests that use InMemoryEventDirectory still pass.
Canonical Parnas storage case (Step 5).

## Change 3: "Humanize" track durations in the UI
Edits: playlist.py (humanize helper on Track), ui.py (call it at render).
Honest: this IS a multi-file change. A new FEATURE is not a hidden DECISION changing.

## Change 4: nightly_health_check.py one-off cron script
Edits: NONE — let it use sqlite3 directly. A 25-line script with no
second caller doesn't pay for an abstraction layer. Step 7's Snippet 5
and the chapter's "When NOT to apply" section both warn against
over-application.

## The tradeoff
Information hiding helps modification, not first-read clarity. Bet on it
where change is plausible; skip it where it isn't. The right number of
abstractions is the smallest number that lets the system change gracefully.

## Cold transfer: CampusRide REST -> GraphQL
Edits: scooters.py (new or replaced concrete gateway), composition_root.py
(wiring). trip_planner.py, pricing.py, and ui.py stay on the
ScooterGateway domain contract. If any of them need GraphQL details,
the vendor protocol leaked.

MODULE_GUIDE.md

# Module guide — ui.py (sample answer)

## Primary secret

How HTTP requests map to domain operations on the four module contracts
(the Recommender, StreamingProvider, and EventDirectory Protocols plus
the Playlist class), and how their return values get rendered to clients.

## Likely changes

- Switch server-rendered HTML to JSON for an SPA frontend.
- Swap the templating engine (Jinja → htpy → none-at-all-just-strings).
- Add new routes (/playlists, /recommendations/<id>, etc.).
- Change the auth scheme (session cookies → JWT, etc.).

## Stable contract

Each HTTP route accepts validated input and returns a response that
the browser can render. Domain operations are reached only through
the injected Protocols and the Playlist public API.

## Not promised

- Exact HTML structure or CSS class names.
- The templating engine (Jinja, Mako, htpy, plain strings — all valid).
- URL path format (`/concerts/<city>` could become `/cities/<city>/concerts`).
- Response status codes beyond 200/4xx categories.
- Cookie-based auth specifically — could become Bearer tokens, OAuth, etc.
- The order or shape of the underlying Protocol calls inside a handler.

Your training is complete. In Step 1 you proved that private is not a secret. You then ran the five-step routine on representation (Playlist, Step 2), declared the contract via Protocol (Step 3), attacked over-specification (Recommender, Step 4), hid persistence (EventDirectory, Step 5), and applied Single Choice (StreamingProvider, Step 6). Step 7 trained the “which kind of leak is this?” discrimination across all four leak types, including the “don’t refactor” calibration. This step applied everything to an unfamiliar whole system.

The same routine, repeated four times across very different domains, is the operational form of David Parnas’s 1972 criterion. And the module guide you just wrote for ui.py is the artifact Parnas, Clements, & Weiss (1985) called the lightest-weight design doc that records why. Four labels, a few lines each, and you have something a future maintainer can read in 30 seconds to decide whether their change belongs in this module.

What to take with you. When you next find yourself reading or writing Python code, run this five-question audit on any module:

1. What is this module's secret? (A volatile decision, one sentence.)
2. What does its public API let clients see beyond that secret?
3. Could two different implementations both satisfy this contract?
4. If the secret changed, how many files would I edit?
5. Is the cost of the abstraction less than the cost of the future change?

If the answer to (1) is nothing, the module is shallow — merge it upward. If (2) reveals a leak, narrow the contract. If (3) is no, the secret has not been hidden. If (4) is many, redesign. If (5) is no, do not abstract — your reader pays for the layer every time, future change or not. The last item is what Step 7’s Snippet 5 and this step’s Change 4 trained: knowing when not to apply the principle is part of applying it well.
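The third audit question can be turned into a small executable check: write the contract test once and run it against two implementations with different storage. The sketch below covers only part of a playlist contract, and `ListPlaylist`, `DictPlaylist`, `check_contract`, and the `Track` fields are hypothetical (the dict version also assumes titles are unique):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    title: str
    popularity: int


class ListPlaylist:
    def __init__(self) -> None:
        self._tracks: list[Track] = []

    def add(self, track: Track) -> None:
        self._tracks.append(track)

    def top_tracks(self, n: int) -> tuple[Track, ...]:
        ranked = sorted(self._tracks, key=lambda t: t.popularity, reverse=True)
        return tuple(ranked[:n])

    def __len__(self) -> int:
        return len(self._tracks)


class DictPlaylist:
    """Different storage (keyed by title), same public contract."""

    def __init__(self) -> None:
        self._by_title: dict[str, Track] = {}

    def add(self, track: Track) -> None:
        self._by_title[track.title] = track

    def top_tracks(self, n: int) -> tuple[Track, ...]:
        ranked = sorted(self._by_title.values(), key=lambda t: t.popularity, reverse=True)
        return tuple(ranked[:n])

    def __len__(self) -> int:
        return len(self._by_title)


def check_contract(playlist_cls: type) -> bool:
    # The same assertions for every implementation: the swap test.
    playlist = playlist_cls()
    playlist.add(Track("Quiet", 10))
    playlist.add(Track("Loud", 90))
    return len(playlist) == 2 and playlist.top_tracks(1) == (Track("Loud", 90),)


results = [check_contract(cls) for cls in (ListPlaylist, DictPlaylist)]
print(results)
```

If a second implementation cannot pass the same check without the test reaching into private state, the contract is still naming the secret.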

Now go fix some real code.

Step 8 — Knowledge Check

Min. score: 80%

1. Change 1: Add YouTube Music as a fourth streaming service. Users should be able to play, share, and like tracks on it just like the other three. Which files need to be edited? (Select all that apply.)

streaming.py — add a YouTubeMusicProvider class implementing the Protocol.
composition_root.py — add "youtube_music": YouTubeMusicProvider to the wiring registry.
ui.py — add a branch for the new provider in /share and /like handlers.
If ui.py needed a branch for each provider, you’d be back in the same Single Choice violation Step 6 was about. The whole point of polymorphism behind the StreamingProvider Protocol is that ui.py calls provider.play(...) without caring which provider it is.
playlist.py — playlists need to track which provider each track came from.
Tempting because YouTube Music tracks feel different, but the Track dataclass already has all the fields it needs. Provider is a concern of streaming.py, not playlist.py.

This is the Single Choice payoff. One new class in streaming.py and one new entry in the wiring registry. Zero edits to ui.py, zero edits to playlist.py. Compare with the pre-refactor design from Step 6, which would have required edits in three functions; miss any one of them and your “Add YouTube Music” feature ships half-broken.
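A compressed sketch of what that two-file edit could look like, collapsed into one listing. The `PROVIDERS` registry name, the `handle_play` handler, and the returned strings are assumptions; only the class name and the `"youtube_music"` key come from the options above:

```python
from typing import Protocol


class StreamingProvider(Protocol):
    def play(self, track_id: str) -> str: ...


class SpotifyProvider:
    def play(self, track_id: str) -> str:
        return f"Spotify playing {track_id}"


class YouTubeMusicProvider:
    """The one new class the change requires (streaming.py)."""

    def play(self, track_id: str) -> str:
        return f"YouTube Music playing {track_id}"


# composition_root.py: the only place that lists providers.
PROVIDERS: dict[str, type] = {
    "spotify": SpotifyProvider,
    "youtube_music": YouTubeMusicProvider,  # the one new registry entry
}


def handle_play(provider: StreamingProvider, track_id: str) -> str:
    # Stand-in for a ui.py handler: unchanged by the new provider.
    return provider.play(track_id)


print(handle_play(PROVIDERS["youtube_music"](), "t42"))
```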

2. Change 2: The SRE team migrates the production concert catalog from SQLite to a remote HTTP service (an internal REST API). The data shape is the same; only the storage moves. Which files need to be edited? (Select all that apply.)

events.py — add an HttpEventDirectory class implementing the EventDirectory Protocol.
composition_root.py — swap in HttpEventDirectory(...) when constructing the directory.
ui.py — change the /concerts/<city> handler to make HTTP requests instead of database queries.
ui.py only knows the EventDirectory Protocol — it calls directory.find_in(city). The storage technology is hidden from it, by design.
Every test that builds an in-memory event directory — they all need to add an HTTP mock.
Tests that build InMemoryEventDirectory are unaffected. That’s exactly why you wrote that in-memory class in Step 5 — fast, hermetic tests that don’t care which production backend the app uses today.

The canonical Parnas case, made concrete. One new class in events.py. One line in the wiring. The 200 tests that build InMemoryEventDirectory still pass without changes — that’s the testability benefit. The ui.py handlers compile against the Protocol and don’t know HTTP from SQL — that’s the comprehensibility benefit. The SRE migration ships in a week, not a quarter — that’s the change-locality benefit.

3. Change 3: Product wants the UI to show “about 3 minutes” instead of “194 seconds” everywhere a track duration appears. A new humanized-duration string is needed on each Track. Which files need to be edited? (Select all that apply.)

playlist.py — add a humanized_duration property to Track, or a helper next to the data.
ui.py — call the new property/helper where tracks render.
recommender.py — Recommender returns song data, so it should carry the new humanized duration too.
Tempting because the recommender returns song info — but check SongHit’s actual fields: track_id, title, artist, confidence. No duration_sec. The new humanized field isn’t part of this contract, so the file doesn’t need editing. Read the actual contract before predicting impact.
streaming.py — StreamingProvider.play() should report the duration as part of its confirmation message.
Tempting because the streaming provider plays tracks — but check what play() actually returns: a confirmation string, not a duration. The humanized field doesn’t belong in streaming.py’s contract.

This is the honest change in the set — a real, multi-file edit. Adding a new piece of presentation logic does mean changes in playlist.py (where Track is defined) and ui.py (where it renders). That’s normal and fine. Information hiding does not promise every change is local — it promises that change-prone decisions stay local. Adding a new field is a new feature, not a change-prone decision leaking. The takeaway: don’t over-claim what hiding buys you. Sometimes a two-file edit is the right answer.
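The two edits could be as small as the following sketch. The `Track` fields shown, the rounding rule, and `render_track` are assumptions; the real `Track` in playlist.py may carry more fields:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    title: str
    duration_sec: int

    @property
    def humanized_duration(self) -> str:
        # Hypothetical rule: round to whole minutes, never below one.
        minutes = max(1, round(self.duration_sec / 60))
        unit = "minute" if minutes == 1 else "minutes"
        return f"about {minutes} {unit}"


def render_track(track: Track) -> str:
    # Stand-in for the ui.py call site.
    return f"{track.title} ({track.humanized_duration})"


print(render_track(Track("Song", 194)))
```

Keeping the helper next to the data means ui.py asks for a domain value instead of re-deriving it from `duration_sec` in every template.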

4. Change 4 — the “don’t refactor” calibration. A new teammate proposes: “I want to write a one-off script, nightly_health_check.py, that connects directly to SQLite and prints a count of events per city to a log file. It runs once a night via cron, doesn’t share code with anything else, and the whole thing is ~25 lines.” They ask: “Should I make it use the EventDirectory Protocol instead of sqlite3 directly?” What is the right call?

Yes — every SQLite usage in the codebase should go through EventDirectory.
Information hiding is a bet on future change. The bet’s cost is the layer of indirection every reader pays. For a 25-line one-off script, that cost is paid forever with no payoff. Step 7’s Snippet 5 was exactly this lesson.
No — a 25-line cron script with no second caller doesn’t earn the indirection.
Yes — Step 5 said to always hide persistence behind a Protocol.
Step 5 said to hide persistence when the change is plausible — production catalogs with many readers, yes; one-off scripts, no. Always-hide is the over-application failure mode you trained against in Step 7.
No — nightly_health_check.py should use a completely different ORM to be safe.
Switching ORMs adds more coupling, not less, and doesn’t address the question. The honest answer is: this script doesn’t need an abstraction layer at all.

The honest answer: not every direct dependency should be hidden. Information hiding is a maintenance investment that pays back when (a) there are multiple callers, and (b) the hidden decision is plausibly volatile. A 25-line cron script has neither property. Forcing it through EventDirectory would mean:

The future reader of the cron script has to chase a Protocol they don’t need.
If EventDirectory ever grows new methods, this script breaks even though its needs are unchanged.
The Protocol stops being “things that look like a directory” and starts being “things that satisfy the union of every consumer’s needs.”

The Step 7 calibration (Snippet 5) and the chapter’s When NOT to apply section are exactly about this. The right number of abstractions is the smallest number that lets the system change gracefully. Below that number, you’re under-engineered; above it, you tax every reader. Both extremes are bugs.
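For scale, the direct version of such a script is roughly this. The `events` table, its columns, and the sample rows are assumptions, and an in-memory database stands in for the production file so the sketch runs on its own:

```python
import sqlite3

# Stand-in connection; the real script would open the app's database file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (city TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("Oslo", "Jazz Night"), ("Oslo", "Indie Fest"), ("Lima", "Salsa Live")],
)

# The whole "health check": one query, one loop, no Protocol.
counts = conn.execute(
    "SELECT city, COUNT(*) FROM events GROUP BY city ORDER BY city"
).fetchall()

for city, total in counts:
    print(f"{city}: {total} events")
```

A reader can verify this top to bottom in under a minute, which is the property an extra layer of indirection would cost.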

5. The honest tradeoff. Tempero, Blincoe, and Lottridge (2023) found that more modular code helped students complete modification tasks but did not consistently make code easier to understand on first encounter. What is the right takeaway?

Modularity is overrated; skip it.
The study showed modularity helps modification — the very task most engineers spend most of their time on. Skipping it is the wrong call.
Apply information hiding aggressively everywhere, including throwaway scripts and one-off demos.
Aggressive hiding adds layers without payoff in places where the decision will never change. That’s the trap this option warns against, and the exact lesson of Step 7’s Snippet 5 and Change 4 above.
Apply hiding where future change is plausible — the payoff is in maintenance, not first-read clarity.
Hide everything — every method should be private, every class should have a Protocol.
‘Hide everything’ is the over-applied version of the principle that the chapter’s When NOT to apply section warns about. Indirection has a real reading cost. Pay it where change is plausible; skip it where it isn’t.

Information hiding is a bet on future change. It’s a great bet where the design decision is plausibly volatile — vendors, storage, algorithms, regulatory rules. It’s a bad bet on decisions that will never change. A 50-line cron job does not need a PaymentGateway Protocol; a payments codebase does. The SE maxim from the chapter: the right number of abstractions is the smallest number that lets the system change gracefully. Beyond that number, every extra layer is a tax on every reader.

What you’ve actually learned in this tutorial. You can now (a) name a module’s secret, (b) spot the contract leaks, (c) refactor a leaky module behind a Protocol or domain methods, (d) verify the secret is hidden with an implementation-swap test, (e) apply Single Choice when alternatives are exhaustive, (f) classify an unfamiliar leak before fixing it (Step 7), (g) abstain when no change is plausible, and (h) predict the change-impact radius before you start editing. That’s the operational form of Parnas’s principle — and it survives the move from a tutorial to a real codebase.

6. Cold transfer — no MusicShare scaffolding this time. CampusRide has these modules:

scooters.py owns a ScooterGateway(Protocol) with nearby(location) and reserve(scooter_id).
trip_planner.py chooses a route from available scooters and campus buildings.
pricing.py computes student discounts and surge pricing.
ui.py renders the map and reservation button.
composition_root.py wires today’s concrete gateway into the app.

The vendor replaces its REST API with GraphQL. The domain fields returned by ScooterGateway stay the same. Which files need to be edited? (Select all that apply.)

scooters.py — add or replace the concrete gateway implementation that knows the vendor protocol.
composition_root.py — wire the app to the new concrete gateway.
trip_planner.py — route choice must change because GraphQL is a different query language.
trip_planner.py depends on domain scooter data, not the vendor wire protocol. If it must change for REST -> GraphQL, the gateway leaked.
pricing.py — student discounts must change because vendor transport changed.
Pricing policy is a separate secret. A vendor transport change should not rewrite discount rules.
ui.py — the map must render GraphQL responses directly.
ui.py should render domain objects from the gateway, not raw vendor responses. Rendering GraphQL payloads directly would be the same leak you fixed in Steps 4 and 5.

This is the same shape as Step 5, but without the MusicShare table. The volatile decision is the vendor protocol. It belongs in scooters.py behind ScooterGateway; composition_root.py chooses the concrete implementation. Route planning, pricing, and UI rendering should stay on the stable domain contract.
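The layout described above can be sketched in a single file. This is an illustration under stated assumptions, not CampusRide's actual code: the `Scooter` dataclass, its fields, both concrete gateway classes, the canned payloads, and the helper names `closest_scooter` and `build_gateway` are hypothetical. Only `ScooterGateway`, `nearby(location)`, and `reserve(scooter_id)` come from the question.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Scooter:
    # Hypothetical domain object; the field names are assumptions.
    scooter_id: str
    distance_m: int


class ScooterGateway(Protocol):
    def nearby(self, location: str) -> list[Scooter]: ...
    def reserve(self, scooter_id: str) -> bool: ...


class RestScooterGateway:
    """Stand-in for the old vendor client; a real one would make HTTP calls."""

    def nearby(self, location: str) -> list[Scooter]:
        # Canned data shaped like a flat REST response.
        payload = [{"id": "s1", "dist": 120}, {"id": "s2", "dist": 40}]
        return [Scooter(p["id"], p["dist"]) for p in payload]

    def reserve(self, scooter_id: str) -> bool:
        return True


class GraphQLScooterGateway:
    """Stand-in for the new vendor client; the payload shape differs."""

    def nearby(self, location: str) -> list[Scooter]:
        # Canned data shaped like a nested GraphQL response.
        payload = {"data": {"scooters": [
            {"scooterId": "s1", "distanceMeters": 120},
            {"scooterId": "s2", "distanceMeters": 40},
        ]}}
        return [Scooter(s["scooterId"], s["distanceMeters"])
                for s in payload["data"]["scooters"]]

    def reserve(self, scooter_id: str) -> bool:
        return True


def closest_scooter(gateway: ScooterGateway, location: str) -> Scooter:
    """Stand-in for trip_planner.py: it sees only domain objects."""
    return min(gateway.nearby(location), key=lambda s: s.distance_m)


def build_gateway() -> ScooterGateway:
    """Stand-in for composition_root.py: the one place naming a concrete class."""
    return GraphQLScooterGateway()
```

Calling `closest_scooter(RestScooterGateway(), "library")` and `closest_scooter(GraphQLScooterGateway(), "library")` yields the same `Scooter`, which is the implementation-swap check in miniature. The vendor change touches one new class and one line in `build_gateway`.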
