Information Hiding in Python: Hide the Decision, Not Just the Field
Refactor leaky Python modules so a plausible future change touches one file, not twenty. You will spot leaked design decisions in code that technically uses private fields, hide a storage representation behind domain operations, hide an algorithm choice behind a Protocol, swap implementations to prove the secret is really hidden, and apply Single Choice so adding a new streaming provider is one new class — not five if/else edits. **Expect to be stuck once or twice — Parnas's principle takes most engineers a couple of passes to internalize, and the gap between 'I followed it' and 'I can do it from scratch' is exactly where the learning is.**
Private Is Not a Secret
Why this matters
Most CS students learn private before they learn what it’s for. So when someone asks “did you hide the information?”, they answer “yes, the field is private.” That answer is wrong often enough that catching it is routine work in code review. The next five minutes are an inoculation: you will hold a class whose fields are “private” and use its own public API to plant a fake entry, proving the secret leaked anyway.
🎯 You will learn to
- Distinguish information hiding (a design decision about who is allowed to know what) from private (a syntax feature that helps enforce it, when used carefully)
- Analyze a public method signature for representation leaks, even when every field uses the _underscore convention
✏️ Predict before you run
Open watch_history.py. Every field starts with _ — Python’s convention for “private.” Will an outside caller still be able to make history.recent() return [..., {"title": "Pirated Movie", "year": 1999}] without calling history.add(...)?
- (a) No: the underscore convention prevents outside code from touching _history.
- (b) No: the only public modifier is add(), so _history can only change via add().
- (c) Yes: because recent() returns the actual internal list, the caller can mutate it from outside.
- (d) Compile error: Python won’t let you index into a _-prefixed attribute.
Commit to a letter, then try the task.
Reveal (after you've tried it)
Answer **(c)**. The "private" field has not been hidden, only renamed. `recent()` hands the caller a *reference* to `self._history`, which is a `list`, and `list.append(...)` mutates in place. Visibility modifiers stop nothing on their own. The **representation decision** ("storage is a list of dicts you may mutate by reference") has leaked through the public API.

Three other ways this same class still leaks even if `recent()` returned `tuple(self._history)`:

1. Callers see the element type (`dict[str, ...]`) and start writing code that depends on the keys.
2. Switching `_history` to a `dict[str, Episode]` keyed by title would change what `recent()` returns and break every caller.
3. `for entry in history.recent():` quietly depends on iteration order.

Information hiding isn't a marker on a field; it's a decision about *which design decisions clients may depend on*. The next five steps train that judgment.

Your task: make the leak happen
Open watch_history.py. Below the # TODO marker, write one line that — using only the public API history.recent() — plants a {"title": "Pirated Movie", "year": 1999} entry. Do not call history.add(...). When the test runs, history.recent() must return a list containing that planted entry.
The point isn’t that you should write code like this. The point is that the design lets you, even though the author thought they had hidden the field.
class WatchHistory:
    """Stores what you have watched.

    The author thought ``_history`` was 'hidden' because of the
    underscore convention.
    """

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> list[dict]:
        # The author meant this as a read-only view.
        # It is not. Find out why in the task below.
        return self._history


if __name__ == "__main__":
    history = WatchHistory()
    history.add("Stranger Things", 2016)
    history.add("Severance", 2022)

    # TODO: Without calling ``history.add(...)``, plant the fake entry
    # ``{"title": "Pirated Movie", "year": 1999}`` so it shows up in
    # ``history.recent()``. One line is enough. Use only the public
    # API, meaning the method ``history.recent()``.
    # planted = ...

    print(history.recent())
Solution
class WatchHistory:
    """Stores what you have watched.

    The author thought ``_history`` was 'hidden' because of the
    underscore convention.
    """

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> list[dict]:
        return self._history


if __name__ == "__main__":
    history = WatchHistory()
    history.add("Stranger Things", 2016)
    history.add("Severance", 2022)

    # The leak: ``recent()`` returns the same list object that
    # ``_history`` points to. Mutating that list mutates the field.
    history.recent().append({"title": "Pirated Movie", "year": 1999})

    print(history.recent())
What you just proved. The author marked _history as private and
thought clients could only modify the state via add(). But recent()
returns a reference to the same list — and list.append(...) mutates
in place. One line of “client code” bypassed the entire intended
invariant.
The fix is not “make the underscore double”. Even __history only
triggers name-mangling (_WatchHistory__history), reachable from
outside if you really want. The fix is structural: recent() should
return an immutable view of the data as domain objects, not the
internal list of dicts. Something like:
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchedShow:
    title: str
    year: int


class WatchHistory:
    def __init__(self) -> None:
        self._shows: list[WatchedShow] = []

    def add(self, title: str, year: int) -> None:
        self._shows.append(WatchedShow(title, year))

    def recent(self) -> tuple[WatchedShow, ...]:
        # A new tuple on every call: an immutable snapshot of domain objects.
        return tuple(self._shows)
Now: (1) clients cannot mutate _shows through the return value, and
(2) the return type is a domain object (WatchedShow) rather than
a dict, so the storage decision can change later without breaking
callers. You will do exactly this kind of refactor in the next step,
on a richer example.
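To see why the frozen dataclass matters and the tuple alone would not be enough, here is a minimal sketch. `TupleHistory` is a hypothetical variant (not one of the exercise files) that keeps dict storage and only wraps the return value in a tuple.

```python
from dataclasses import dataclass, FrozenInstanceError


class TupleHistory:
    """Hypothetical variant: dict storage, tuple return value."""

    def __init__(self) -> None:
        self._history: list[dict] = []

    def add(self, title: str, year: int) -> None:
        self._history.append({"title": title, "year": year})

    def recent(self) -> tuple[dict, ...]:
        return tuple(self._history)  # new tuple, but the same dict objects


@dataclass(frozen=True)
class WatchedShow:
    title: str
    year: int


history = TupleHistory()
history.add("Severance", 2022)

# The tuple blocks append(), but each entry is still the internal dict.
history.recent()[0]["title"] = "Pirated Movie"
leaked_title = history.recent()[0]["title"]

# A frozen dataclass refuses the same kind of write.
show = WatchedShow("Severance", 2022)
try:
    show.title = "Pirated Movie"
    frozen_blocked_write = False
except FrozenInstanceError:
    frozen_blocked_write = True

print(leaked_title, frozen_blocked_write)
```

The tuple protects the container; the frozen dataclass protects the elements. The refactor above needs both.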
Step 1 — Knowledge Check
Min. score: 80%
1. Every field in WatchHistory starts with _. Which statement best
describes whether the storage representation is hidden?
The fields are technically marked private by convention, but recent() returns the actual list. Any caller can mutate it (you just did) or write code that assumes “it is a list, I can iterate it in insertion order, I can index it by integer, I can .append() to it” — all of which the author did not intend. Hiding the field name doesn’t hide the design decision (“storage = mutable list of dicts”).
2. Which future change does the current WatchHistory design make
expensive — meaning many callers would have to be edited?
A switch from list to dict[str, Episode] would change what recent() returns. Every caller that does for entry in wh.recent(): expects insertion-order iteration. Every caller that does wh.recent().append(...) (the leak you just exploited) would crash. The internal representation has leaked into the contract because the return type of the public method exposes it.
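A small sketch of that breakage, under simplifying assumptions: `titles` stands in for any client written against the leaked shape, and the dict variant stores plain dicts as values rather than a real `Episode` class.

```python
def titles(entries) -> list[str]:
    """Client code written against the leaked list-of-dicts shape."""
    return [entry["title"] for entry in entries]


as_list = [{"title": "Severance", "year": 2022}]
as_dict = {"Severance": {"title": "Severance", "year": 2022}}

print(titles(as_list))

try:
    titles(as_dict)  # iterating a dict yields its keys, which are strings
    client_survived = True
except TypeError:
    # "Severance"["title"] fails: string indices must be integers
    client_survived = False

# The in-place mutation from Step 1 breaks too: a dict has no append().
has_append = hasattr(as_dict, "append")
print(client_survived, has_append)
```

Neither failure is reported where the storage changed. Both surface in the client, which is the cost the question is asking about.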
3. Which is the most accurate one-line definition of information hiding in Parnas’s sense?
Parnas’s 1972 definition is that modules should be organized around design decisions that are difficult or likely to change. Each such decision lives in one module; other modules interact with it through an interface that does not reveal the decision. private, returning copies, small files — all useful tactics that may help. But the principle is about which design decisions get hidden, and from whom.
The Playlist's Secret
Why this matters
Step 1 showed you a leak the size of one line. Real codebases hide leaks the size of an org chart — and the same leak forces three teams to coordinate every time the data shape changes. The fastest way to feel the difference is to do a small refactor on a Playlist class and then run the same client tests against a different internal storage. If the client breaks, the secret was never hidden.
Connection to the chapter’s KWIC example. Parnas (1972) made the same point at the system level using a Key Word In Context indexer: decomposing by processing steps (input → shift → sort → print) spread the line-storage decision across every module, so a change to line storage broke every module. Decomposing by hidden decisions (line storage, shift generation, sort ordering) kept each change local. This step plays out Parnas’s argument at the class level on a Playlist.
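The chapter's KWIC code is not reproduced here, so the following is only a rough illustration of the second decomposition. The names `LineStorage`, `circular_shifts`, and `kwic_index` are illustrative assumptions, not Parnas's original module names or interfaces.

```python
class LineStorage:
    """Owns one decision: how lines and their words are stored."""

    def __init__(self) -> None:
        self._lines: list[list[str]] = []  # the secret; free to change later

    def add_line(self, text: str) -> None:
        self._lines.append(text.split())

    def line_count(self) -> int:
        return len(self._lines)

    def words(self, line_no: int) -> tuple[str, ...]:
        return tuple(self._lines[line_no])


def circular_shifts(storage: LineStorage) -> list[str]:
    """Every rotation of every line, built only from public operations."""
    shifts = []
    for line_no in range(storage.line_count()):
        words = storage.words(line_no)
        for start in range(len(words)):
            shifts.append(" ".join(words[start:] + words[:start]))
    return shifts


def kwic_index(storage: LineStorage) -> list[str]:
    """Alphabetized shifts; knows nothing about how lines are stored."""
    return sorted(circular_shifts(storage), key=str.lower)


storage = LineStorage()
storage.add_line("Hide the Decision")
index = kwic_index(storage)
print(index)
```

Because the shift and sort functions only call `line_count()` and `words()`, replacing the list of word lists inside `LineStorage` would not touch them.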
🎯 You will learn to
- Apply the five-step routine for hiding a secret: name the change, name the secret, list the minimum client assumptions, remove the leak, verify with a swap test
- Analyze which lines of client code depend on the internal representation
- Create a new domain operation on the hidden module without changing any client
The scene
You are building a music app. Playlist exposes a raw tracks: list[dict]. Three client features reach in: a textual summary, a top-picks list, and a “party-ready?” check. Run app.py and see the current behavior. Then the product manager files a ticket: “Add remove(title) — needs to be O(1).” That requires switching internal storage from list to dict keyed by title. Run the same client code afterward and watch it break.
✏️ Predict before you run
Look at playlist.py and app.py. If we change Playlist.tracks from a list[dict] to a dict[str, Track] (keyed by title, for O(1) remove), how many files will need to change to keep the demo running?
- (a) Only playlist.py: the field is private-ish; clients should not care.
- (b) playlist.py and app.py: every loop and indexing operation in app.py will break.
- (c) Only app.py: Playlist itself is fine.
- (d) Zero files: Python’s duck typing handles it.
Commit to a letter. The first test enforces this.
The five-step routine
You will use this exact routine in every refactor from here on. Memorize the headings, not the lines:
1. Name the change. What is about to change, and why is it likely?
2. Name the secret. Which design decision should one module own?
3. Minimum client assumptions. What does the client *actually* need to know?
4. Remove the leak. Replace exposed representation with domain operations.
5. Verify with a swap. Same client, different implementation, same output.
Filled in for Playlist:
1. Change: Internal storage may move from list to dict for O(1) remove.
2. Secret: How tracks are stored and ordered for retrieval.
3. Client needs: top N by popularity, total minutes, average popularity, count.
4. Remove: Stop exposing ``tracks``; expose ``top_tracks(n)``,
``total_duration_minutes()``, ``average_popularity()``, ``__len__``.
5. Verify: Hidden test runs the same client against a dict-backed Playlist.
Your task
Refactor playlist.py so the public API is domain operations, not a raw list:
add(title, artist, duration_sec, popularity)
top_tracks(n) -> list[Track] # n highest by popularity
total_duration_minutes() -> float
average_popularity() -> float # 0 if empty
__len__() -> int
Define a Track dataclass with the four attributes (title, artist, duration_sec, popularity) so top_tracks returns domain objects, not dicts.
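Refactor app.py so describe_playlist and is_party_appropriate use only the domain methods above (in practice the four read operations: top_tracks, total_duration_minutes, average_popularity, and len(playlist)). They must not touch playlist.tracks, _tracks, or any dict keys.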
One method is up to you: in app.py, write is_party_appropriate(playlist) — returns True if total duration > 90 minutes AND average popularity > 60.
🪞 Before clicking Next
Once all four tests pass, take 15 seconds and answer in your own words — out loud, to a rubber duck, or in your head:
In ≤25 words, what did this refactor actually buy you? Constraint: don’t use the words “private”, “list”, or “encapsulate” — those are mechanisms, and the answer is about a decision.
Forcing yourself to say it without those words is the point. If your sentence keeps reaching for them, you are still describing the implementation, not the design choice. The first quiz question rewards you for finding the decision-level wording.
The self-explanation effect (Chi et al., 1994) says producing the explanation yourself strengthens learning more than re-reading does. The forbidden-word constraint comes from Variation Theory (Marton): forcing different language for the same concept is what makes the schema transferable.
"""Music playlist.
STARTING STATE: leaks the internal list[dict] through ``tracks``.
Refactor so the only public surface is *domain operations*.
"""
class Playlist:
    def __init__(self) -> None:
        self.tracks: list[dict] = []

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self.tracks.append({
            "title": title,
            "artist": artist,
            "duration_sec": duration_sec,
            "popularity": popularity,
        })

    # TODO (Step 4 of the routine: "Remove the leak"):
    #   1. Define a ``Track`` dataclass with title, artist, duration_sec, popularity.
    #   2. Replace ``self.tracks`` with a hidden ``self._tracks``.
    #   3. Add the four domain methods listed in the instructions.
"""Client of Playlist. Currently reaches into the raw list.
Refactor ``describe_playlist`` and ``is_party_appropriate`` to use
ONLY the new domain methods on Playlist (no ``playlist.tracks``,
no dict indexing).
"""
from playlist import Playlist


def describe_playlist(playlist: Playlist) -> str:
    total_minutes = sum(t["duration_sec"] for t in playlist.tracks) / 60
    avg_popularity = (
        sum(t["popularity"] for t in playlist.tracks) / len(playlist.tracks)
        if playlist.tracks else 0
    )
    top_three = sorted(playlist.tracks, key=lambda t: t["popularity"], reverse=True)[:3]
    lines = [
        f"{len(playlist.tracks)} tracks, "
        f"{total_minutes:.1f} min, "
        f"avg popularity {avg_popularity:.0f}"
    ]
    lines.append("Top picks:")
    for t in top_three:
        lines.append(f" - {t['title']} by {t['artist']}")
    return "\n".join(lines)


def is_party_appropriate(playlist: Playlist) -> bool:
    # TODO: rewrite using only Playlist's new domain methods.
    # Spec: True iff total duration > 90 min AND avg popularity > 60.
    raise NotImplementedError


if __name__ == "__main__":
    p = Playlist()
    p.add("Bad Guy", "Billie Eilish", 194, 95)
    p.add("Levitating", "Dua Lipa", 203, 88)
    p.add("Blinding Lights", "The Weeknd", 200, 92)
    p.add("Heat Waves", "Glass Animals", 238, 80)
    p.add("As It Was", "Harry Styles", 167, 90)
    print(describe_playlist(p))
    print("party-ready?", is_party_appropriate(p))
Solution
"""Music playlist with the storage decision hidden behind domain operations.
The 5-step routine, annotated below as it appears in the code:
1. Change : list -> dict for O(1) remove (anticipated next sprint)
2. Secret : how tracks are stored AND how they are queried
3. Client needs : top-N, total minutes, avg popularity, count
4. Remove leak : Track dataclass + four domain methods, no exposed list
5. Verify swap : DictBackedPlaylist in the test produces identical output
"""
from dataclasses import dataclass


# --- Subgoal 4a: domain object (so callers never see raw dicts) ---
@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


class Playlist:
    # --- Subgoal 4b: hide the representation behind an underscore ---
    def __init__(self) -> None:
        self._tracks: list[Track] = []

    # --- Subgoal 4c: domain operations only, no list returned ---
    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self._tracks.append(Track(title, artist, duration_sec, popularity))

    def top_tracks(self, n: int) -> list[Track]:
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self) -> float:
        return sum(t.duration_sec for t in self._tracks) / 60

    def average_popularity(self) -> float:
        if not self._tracks:
            return 0
        return sum(t.popularity for t in self._tracks) / len(self._tracks)

    def __len__(self) -> int:
        return len(self._tracks)
from playlist import Playlist


def describe_playlist(playlist: Playlist) -> str:
    lines = [
        f"{len(playlist)} tracks, "
        f"{playlist.total_duration_minutes():.1f} min, "
        f"avg popularity {playlist.average_popularity():.0f}"
    ]
    lines.append("Top picks:")
    for t in playlist.top_tracks(3):
        lines.append(f" - {t.title} by {t.artist}")
    return "\n".join(lines)


def is_party_appropriate(playlist: Playlist) -> bool:
    return playlist.total_duration_minutes() > 90 and playlist.average_popularity() > 60


if __name__ == "__main__":
    p = Playlist()
    p.add("Bad Guy", "Billie Eilish", 194, 95)
    p.add("Levitating", "Dua Lipa", 203, 88)
    p.add("Blinding Lights", "The Weeknd", 200, 92)
    p.add("Heat Waves", "Glass Animals", 238, 80)
    p.add("As It Was", "Harry Styles", 167, 90)
    print(describe_playlist(p))
    print("party-ready?", is_party_appropriate(p))
The routine you just ran:
- Named the change. “Storage may move from list to dict for O(1) remove.”
- Named the secret. “How tracks are stored and queried.”
- Listed the minimum client assumptions. Top N, total minutes, avg popularity, count.
- Removed the leak. The four domain methods became the entire public surface. _tracks is internal.
- Verified with a swap. The hidden test built DictBackedPlaylist with totally different storage. Your app.py produced identical output. That is the operational proof that the secret is hidden.
What you didn’t have to do. You didn’t have to write the dict-backed version. You didn’t have to predict whether it would be a dict, a B-tree, a database, or a remote service. You bought the option to swap any of those in later, at the cost of writing four small methods today.
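The hidden test is not shown in this step, so the following is only a guess at its shape. It assumes a `DictBackedPlaylist` with a `remove(title)` method (the operation from the product ticket) and a simplified client called `describe`; `ListBackedPlaylist`, `describe`, and `build` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


class ListBackedPlaylist:
    def __init__(self) -> None:
        self._tracks: list[Track] = []

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self._tracks.append(Track(title, artist, duration_sec, popularity))

    def top_tracks(self, n: int) -> list[Track]:
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self) -> float:
        return sum(t.duration_sec for t in self._tracks) / 60

    def __len__(self) -> int:
        return len(self._tracks)


class DictBackedPlaylist:
    def __init__(self) -> None:
        self._by_title: dict[str, Track] = {}

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self._by_title[title] = Track(title, artist, duration_sec, popularity)

    def remove(self, title: str) -> None:
        self._by_title.pop(title, None)  # average-case O(1)

    def top_tracks(self, n: int) -> list[Track]:
        return sorted(self._by_title.values(), key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self) -> float:
        return sum(t.duration_sec for t in self._by_title.values()) / 60

    def __len__(self) -> int:
        return len(self._by_title)


def describe(playlist) -> str:
    """Client: uses domain operations only, never the storage."""
    top = ", ".join(t.title for t in playlist.top_tracks(2))
    return f"{len(playlist)} tracks, {playlist.total_duration_minutes():.1f} min, top: {top}"


def build(cls):
    p = cls()
    p.add("Bad Guy", "Billie Eilish", 194, 95)
    p.add("Levitating", "Dua Lipa", 203, 88)
    p.add("Blinding Lights", "The Weeknd", 200, 92)
    return p


# The swap test: same client, two storage decisions, same output.
list_output = describe(build(ListBackedPlaylist))
dict_output = describe(build(DictBackedPlaylist))
print(list_output == dict_output)
```

One caveat of keying by title: adding two tracks with the same title overwrites the first, so the swap only preserves behavior when titles are unique. A real migration would need to decide whether that is acceptable.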
One trap to remember. The @dataclass(frozen=True) on Track was deliberate. If Track were mutable, callers could modify the Track objects that top_tracks(3) hands out, re-leaking the data. Frozen dataclasses are the cheapest way to make a domain object both typed and safe to hand out. (Representing each track as a plain tuple would also be immutable, but it loses the named fields.)
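A minimal sketch of both protections, using a trimmed-down `Playlist` with only the methods the demonstration needs. `dataclasses.replace` is shown as one way a caller can get a modified copy without touching the original.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


class Playlist:
    def __init__(self) -> None:
        self._tracks: list[Track] = []

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        self._tracks.append(Track(title, artist, duration_sec, popularity))

    def top_tracks(self, n: int) -> list[Track]:
        # sorted() builds a new list, so callers never hold self._tracks.
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]

    def __len__(self) -> int:
        return len(self._tracks)


p = Playlist()
p.add("Bad Guy", "Billie Eilish", 194, 95)

# Protection 1: the returned list is a copy, so emptying it is harmless.
top = p.top_tracks(3)
top.clear()
still_there = len(p)

# Protection 2: the elements are frozen, so in-place edits are refused.
try:
    p.top_tracks(1)[0].popularity = 0
    write_blocked = False
except FrozenInstanceError:
    write_blocked = True

# A caller who wants a changed Track builds a new one instead.
boosted = replace(p.top_tracks(1)[0], popularity=100)
print(still_there, write_blocked, boosted.popularity)
```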
Step 2 — Knowledge Check
Min. score: 80%
1. In Parnas’s terms, what is Playlist’s secret after your refactor?
A module’s secret is the volatile design decision it owns. Playlist owns how tracks are stored and how they are queried. The four public methods are the stable contract; the storage choice (list, dict, B-tree, eventually a database) is the hidden decision that can change without forcing changes elsewhere. The swap test you just passed is the operational proof.
2. Suppose a teammate proposes a smaller fix for the Step 1 WatchHistory
leak: keep storage as list[dict], but have recent() return
tuple(self._history) (an immutable copy). They argue: “Now nobody
can mutate it — the secret is hidden.” Are they right, in Parnas’s sense?
Returning an immutable copy plugs the mutation leak, but it doesn’t hide the element type. If recent() returns tuple[dict, ...], clients still depend on each entry being a dict with specific keys — changing it to a WatchedShow dataclass still breaks them. That is exactly why your Playlist refactor did two things, not one: a frozen Track dataclass and a public surface of domain methods. Together they hide both the mutation channel and the element shape. Information hiding is about which decisions are visible, not just which writes are prevented.
3. (Select all that apply.) Which of these future changes can your refactored Playlist absorb without forcing any change in app.py?
(select all that apply)
The shape of “what changes are local” follows directly from the contract you exposed. Internal storage choice is hidden, so (1) is absorbed. New domain methods are additive, so (3) is absorbed. Renaming a method on the contract is a contract change, so (4) ripples. Adding fields can be local — but it depends on whether you preserve the constructor signature (defaults vs. required), which is why (2) is partial credit.
4. You read a teammate’s class and the only thing wrong is that it
exposes get_internal_list() -> list[dict]. They argue: “It’s
only used by one client right now, so it can’t be a leak.”
What is the strongest single counter?
This is the change-localization argument and it is the strongest reason in Parnas’s framing. The leak costs nothing at the moment of writing — it costs everything the moment the storage representation needs to change. By then, every client written between now and then will also depend on the leak, multiplying the eventual edit cost. Studies of professional developers find program comprehension (the activity that becomes painful when a representation change must be traced through many files) consumes around 58% of their working time (Xia et al., IEEE TSE 2017). Information hiding is one of the cheapest ways to keep that number from compounding.
5. Spaced retrieval from Step 1. Your Playlist._tracks is named
with one underscore. Does the underscore alone hide the storage
decision from app.py?
Underscore prefixes are a social signal, not enforcement. The real hiding in your refactored Playlist comes from the fact that the public API exposes only domain operations (top_tracks, total_duration_minutes, etc.) — none of them returns the list or accepts a list parameter. That is what lets you swap the storage. The underscore just tells future maintainers “I meant for this to be local.”
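A short sketch of how little the naming conventions enforce at runtime. The class here is a stripped-down stand-in, and the `__play_count` field is hypothetical, added only to show name mangling.

```python
class Playlist:
    def __init__(self) -> None:
        self._tracks: list[str] = []  # single underscore: convention only
        self.__play_count = 0         # double underscore: name-mangled


p = Playlist()

# Nothing stops outside code from touching a single-underscore attribute.
p._tracks.append("Pirated Track")

# The mangled attribute is not visible under its plain name...
plain_name_visible = hasattr(p, "__play_count")

# ...but it is still reachable under the mangled one.
p._Playlist__play_count = 999
print(p._tracks, plain_name_visible, p._Playlist__play_count)
```

Neither spelling prevents access. What protects the storage decision is that no public method hands the storage out.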
A Protocol on Familiar Code
Why this matters
Step 2’s swap test worked because Python’s duck typing checks methods at call time. That’s powerful but invisible — nothing in your code says “Playlist and DictBackedPlaylist are interchangeable.” This step introduces the one Python construct that makes that contract visible: typing.Protocol. No new design principle here — just a new way to declare what your Step 2 refactor already accomplished.
Why now, before the next refactor? Steps 4–7 all use Protocol. Pre-loading the syntax on familiar code (your Playlist) means each later step adds only one new design idea at a time, not two or three. That keeps cognitive load on the lesson, not the language.
🎯 You will learn to
- Apply typing.Protocol to name a contract that multiple classes satisfy structurally, with no explicit inheritance needed
- Distinguish Python’s existing duck typing (runtime, invisible) from a typed Protocol (declared, type-checkable)
- Recognize that the same construct hides an algorithm in Step 4, a storage backend in Step 5, and an exhaustive set of alternatives in Step 6
Five-minute primer
A Protocol is a class that declares method signatures as a contract. Any class with matching methods satisfies it automatically — no class Foo(Bar): required.
from typing import Protocol


class Counter(Protocol):
    def increment(self) -> None: ...
    def value(self) -> int: ...


class TallyCounter:  # No explicit base class!
    def __init__(self) -> None:
        self._n = 0

    def increment(self) -> None:
        self._n += 1

    def value(self) -> int:
        return self._n


def report(c: Counter) -> str:  # Accepts any Counter-shaped class
    return f"count is {c.value()}"


report(TallyCounter())  # OK: TallyCounter is structurally a Counter
The ... after each method’s signature is literally Python’s ellipsis literal — it tells readers (and mypy) “this method is declared, not implemented here.” The Protocol class itself is never instantiated; concrete classes are.
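Two of these claims can be checked at runtime. The sketch below adds `typing.runtime_checkable`, which the primer does not use, so that `isinstance` works against the Protocol; `NotACounter` is a hypothetical class for contrast. Note that the runtime check only looks for method names, not signatures.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Counter(Protocol):
    def increment(self) -> None: ...
    def value(self) -> int: ...


class TallyCounter:
    def __init__(self) -> None:
        self._n = 0

    def increment(self) -> None:
        self._n += 1

    def value(self) -> int:
        return self._n


class NotACounter:
    def increment(self) -> None:
        pass  # no value() method


# The Protocol itself cannot be instantiated.
try:
    Counter()
    protocol_instantiable = True
except TypeError:
    protocol_instantiable = False

# Structural matching: no inheritance involved in either result.
tally_matches = isinstance(TallyCounter(), Counter)
other_matches = isinstance(NotACounter(), Counter)
print(protocol_instantiable, tally_matches, other_matches)
```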
✏️ Predict before you run
protocol_demo.py has your Step 2 Playlist and a small DictBackedPlaylist. If we add class PlaylistLike(Protocol) with the five methods, will a type checker accept both classes as PlaylistLike?
- (a) Only Playlist: DictBackedPlaylist doesn’t inherit from PlaylistLike.
- (b) Both: structural matching cares about method shape, not inheritance.
- (c) Neither: Playlist doesn’t declare `: PlaylistLike` either.
- (d) Only DictBackedPlaylist: it was added later, so it knows about the Protocol.
Commit, then continue.
Reveal (after you've committed)
Answer **(b)**. Protocols use *structural* subtyping (PEP 544). Any class with the matching methods satisfies the Protocol: no explicit base class, no order-of-definition concerns, no decorator needed. This is what makes `Protocol` the right Python tool for "swap-this-for-that" designs. The Step 2 swap test was the *runtime* proof; `PlaylistLike` is the *declared* contract.

Your task
Open protocol_demo.py. The Playlist class from Step 2 is there, plus a tiny DictBackedPlaylist (the swap class from your Step 2 test, made permanent so it has a name).
- Add from typing import Protocol and define class PlaylistLike(Protocol) at the top, with these five methods, each ending in ...:
  - add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None
  - top_tracks(self, n: int) -> list[Track]
  - total_duration_minutes(self) -> float
  - average_popularity(self) -> float
  - __len__(self) -> int
- Change summary(playlist: object) to summary(playlist: PlaylistLike). Do not touch the body, only the annotation.
- Do not add (PlaylistLike) to either Playlist or DictBackedPlaylist. The whole point is that they satisfy it without saying so.
The test will call summary(Playlist()) and summary(DictBackedPlaylist()). Both should produce identical-shape strings — proving the same client function accepts two completely different backings, via the declared Protocol rather than runtime luck.
Look back at Step 2’s swap test. It built DictBackedPlaylist inside the test and passed it to your refactored client. That worked because of invisible duck typing: Python found the methods at call time. PlaylistLike is the same fact, now declared. Nothing about Step 2’s runtime behavior changes; what changes is that a future reader can see the contract without running the code.
🪞 Before clicking Next
In ten seconds, finish this aloud:
Without using the word “duck”: what does PlaylistLike make visible about my code that wasn’t visible in Step 2?
The word “duck” is forbidden because “duck typing” is the Python jargon for what’s happening — but the design point is that the contract is now named. The new affordance is reader-visible substitutability. Variation Theory says forcing different language is what makes the concept transferable to the next refactor.
"""Step 3 — pre-load Protocol on familiar code.
Playlist and DictBackedPlaylist are both here. Define
``PlaylistLike(Protocol)`` so a single ``summary(p: PlaylistLike)``
function accepts both — by structural matching, no inheritance.
"""
from dataclasses import dataclass
# TODO: from typing import Protocol


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


# TODO 1: Define ``class PlaylistLike(Protocol)`` here with the five
# methods from the instructions. End each declaration with ``...``.


class Playlist:
    def __init__(self) -> None:
        self._tracks: list[Track] = []

    def add(self, title, artist, duration_sec, popularity):
        self._tracks.append(Track(title, artist, duration_sec, popularity))

    def top_tracks(self, n):
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._tracks) / 60

    def average_popularity(self):
        return (sum(t.popularity for t in self._tracks) / len(self._tracks)) if self._tracks else 0

    def __len__(self):
        return len(self._tracks)


class DictBackedPlaylist:
    """Same operations, dict-backed storage. Structural twin of Playlist."""

    def __init__(self) -> None:
        self._by_title: dict = {}

    def add(self, title, artist, duration_sec, popularity):
        self._by_title[title] = Track(title, artist, duration_sec, popularity)

    def top_tracks(self, n):
        return sorted(self._by_title.values(), key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._by_title.values()) / 60

    def average_popularity(self):
        vs = list(self._by_title.values())
        return (sum(t.popularity for t in vs) / len(vs)) if vs else 0

    def __len__(self):
        return len(self._by_title)


# TODO 2: change the parameter annotation here from ``object`` to ``PlaylistLike``.
def summary(playlist: object) -> str:
    lines = [f"{len(playlist)} tracks, {playlist.total_duration_minutes():.1f} min"]
    for t in playlist.top_tracks(3):
        lines.append(f" - {t.title}")
    return "\n".join(lines)


if __name__ == "__main__":
    for cls in (Playlist, DictBackedPlaylist):
        p = cls()
        p.add("Bad Guy", "Billie Eilish", 194, 95)
        p.add("Levitating", "Dua Lipa", 203, 88)
        p.add("Blinding Lights", "The Weeknd", 200, 92)
        print(cls.__name__)
        print(summary(p))
        print()
Solution
"""Step 3 — pre-load Protocol on familiar code."""
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


class PlaylistLike(Protocol):
    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None: ...
    def top_tracks(self, n: int) -> list[Track]: ...
    def total_duration_minutes(self) -> float: ...
    def average_popularity(self) -> float: ...
    def __len__(self) -> int: ...


class Playlist:
    def __init__(self) -> None:
        self._tracks: list[Track] = []

    def add(self, title, artist, duration_sec, popularity):
        self._tracks.append(Track(title, artist, duration_sec, popularity))

    def top_tracks(self, n):
        return sorted(self._tracks, key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._tracks) / 60

    def average_popularity(self):
        return (sum(t.popularity for t in self._tracks) / len(self._tracks)) if self._tracks else 0

    def __len__(self):
        return len(self._tracks)


class DictBackedPlaylist:
    def __init__(self) -> None:
        self._by_title: dict = {}

    def add(self, title, artist, duration_sec, popularity):
        self._by_title[title] = Track(title, artist, duration_sec, popularity)

    def top_tracks(self, n):
        return sorted(self._by_title.values(), key=lambda t: t.popularity, reverse=True)[:n]

    def total_duration_minutes(self):
        return sum(t.duration_sec for t in self._by_title.values()) / 60

    def average_popularity(self):
        vs = list(self._by_title.values())
        return (sum(t.popularity for t in vs) / len(vs)) if vs else 0

    def __len__(self):
        return len(self._by_title)


def summary(playlist: PlaylistLike) -> str:
    lines = [f"{len(playlist)} tracks, {playlist.total_duration_minutes():.1f} min"]
    for t in playlist.top_tracks(3):
        lines.append(f" - {t.title}")
    return "\n".join(lines)


if __name__ == "__main__":
    for cls in (Playlist, DictBackedPlaylist):
        p = cls()
        p.add("Bad Guy", "Billie Eilish", 194, 95)
        p.add("Levitating", "Dua Lipa", 203, 88)
        p.add("Blinding Lights", "The Weeknd", 200, 92)
        print(cls.__name__)
        print(summary(p))
        print()
What you just bought. You named a contract that Step 2 left implicit. Before this step, “the same client works with two storage backends” was a runtime fact you proved with a test. Now it is a declared Protocol — readable in the file, checkable by mypy, and exactly the construct you’ll layer design judgment on top of in the next steps.
One subtle point. Notice that PlaylistLike has zero opinion on what is hidden behind it. It just says “things that look like this.” That is the right shape for an information-hiding contract: a Protocol is a place to put the secret-free part of a module’s public surface. The secret lives in the implementations — exactly where it belongs.
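To make "the secret lives in the implementations" concrete, here is a sketch of a third backing that the lesson does not include. `RankedPlaylist` is a hypothetical class that keeps its entries sorted at insertion time using the standard `bisect` module, so `top_tracks` becomes a slice instead of a sort. The client `summary` is copied unchanged from this step.

```python
import bisect
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    duration_sec: int
    popularity: int


class PlaylistLike(Protocol):
    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None: ...
    def top_tracks(self, n: int) -> list[Track]: ...
    def total_duration_minutes(self) -> float: ...
    def average_popularity(self) -> float: ...
    def __len__(self) -> int: ...


class RankedPlaylist:
    """Hypothetical backing: entries kept sorted by popularity on insert."""

    def __init__(self) -> None:
        # (negated popularity, insertion number, track). The insertion
        # number is unique, so Track objects are never compared.
        self._ranked: list[tuple[int, int, Track]] = []

    def add(self, title: str, artist: str, duration_sec: int, popularity: int) -> None:
        track = Track(title, artist, duration_sec, popularity)
        bisect.insort(self._ranked, (-popularity, len(self._ranked), track))

    def top_tracks(self, n: int) -> list[Track]:
        return [track for _, _, track in self._ranked[:n]]

    def total_duration_minutes(self) -> float:
        return sum(track.duration_sec for _, _, track in self._ranked) / 60

    def average_popularity(self) -> float:
        if not self._ranked:
            return 0.0
        return sum(track.popularity for _, _, track in self._ranked) / len(self._ranked)

    def __len__(self) -> int:
        return len(self._ranked)


def summary(playlist: PlaylistLike) -> str:
    lines = [f"{len(playlist)} tracks, {playlist.total_duration_minutes():.1f} min"]
    for t in playlist.top_tracks(3):
        lines.append(f" - {t.title}")
    return "\n".join(lines)


ranked = RankedPlaylist()
ranked.add("Bad Guy", "Billie Eilish", 194, 95)
ranked.add("Levitating", "Dua Lipa", 203, 88)
ranked.add("Blinding Lights", "The Weeknd", 200, 92)
print(summary(ranked))
```

The trade-off (slower inserts, cheaper reads) is exactly the kind of decision `PlaylistLike` says nothing about, which is why `summary` needs no change.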
Looking ahead. Steps 4–7 each layer one new design decision on top of the Protocol mechanics you now own.
- Step 4 hides a scoring algorithm (the score-scale leak).
- Step 5 hides a storage technology.
- Step 6 hides an exhaustive list of providers.
- Step 7 asks you to classify leaks across all four refactors and decide when not to apply the principle.
Step 3 — Knowledge Check
Min. score: 80%
1. Why does summary(DictBackedPlaylist()) work even though
DictBackedPlaylist does not have (PlaylistLike) in its
class definition?
Protocols (PEP 544) introduced structural subtyping to Python’s type system. A class satisfies a Protocol if it has the right methods and signatures — there is no (PlaylistLike) required in its declaration. This is exactly the right tool for Step 2’s swap test: the swap class didn’t need to know the Protocol existed when it was written. Now you can declare the contract visibly.
2. Spaced retrieval — Step 2. Now that PlaylistLike is declared,
what — in Parnas’s sense — is the secret that PlaylistLike
allows implementations to keep?
Same secret as Step 2 — how tracks are stored — but now you have a named artifact for it. The Protocol declares the five public operations as the stable surface; the storage choice (list, dict, or anything else) sits behind it. Naming the contract makes the secret official: any future engineer can read PlaylistLike and know exactly what is promised, and equally importantly, what isn’t.
3. (Select all that apply.) Which of these classes would satisfy
class CounterLike(Protocol) with def increment(self) -> None: ...
and def value(self) -> int: ...?
Two of these satisfy the Protocol — both define matching increment and value methods. The inheritance line in the missing-value class is irrelevant; what matters is the shape of the class. Subtle point: structural matching is a promise from you, enforceable by static checkers — vanilla Python won’t catch a missing method until you actually call it. That is why your Step 2 swap test still matters: it’s the runtime proof, complementary to the declared Protocol.
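The gap between the static promise and the runtime behaviour can be seen in a small sketch. The names TallyCounter, BrokenCounter, and bump_twice are hypothetical, invented for this illustration. @runtime_checkable is added only so that isinstance can be demonstrated; it checks that the methods exist, not that their signatures match.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable  # only needed for the isinstance() calls below
class CounterLike(Protocol):
    def increment(self) -> None: ...
    def value(self) -> int: ...


class TallyCounter:  # hypothetical; note there is no (CounterLike) base class
    def __init__(self) -> None:
        self._count = 0

    def increment(self) -> None:
        self._count += 1

    def value(self) -> int:
        return self._count


class BrokenCounter:  # hypothetical; value() is missing
    def increment(self) -> None:
        pass


def bump_twice(counter: CounterLike) -> int:
    counter.increment()
    counter.increment()
    return counter.value()


print(bump_twice(TallyCounter()))                # 2
print(isinstance(TallyCounter(), CounterLike))   # True
print(isinstance(BrokenCounter(), CounterLike))  # False

try:
    bump_twice(BrokenCounter())  # a static checker would flag this argument
except AttributeError as exc:
    # Plain Python only notices once the missing method is actually called.
    print("noticed at call time:", exc)
```

A static checker such as mypy reports the BrokenCounter call before the program runs; without one, the error surfaces only when value() is reached.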
4. Spaced retrieval — Step 1. A teammate writes
class Wallet(Protocol) with one method def balance(self) -> int: ...
and a concrete class CryptoWallet that has matching methods plus a
public transactions: list[dict] attribute. They say “the
Protocol hides the implementation.” Is that true?
This is Step 1’s lesson, retrieved through the Step 3 lens. A Protocol is a floor, not a ceiling: the concrete class must have at least the declared methods, but it can also have more. The implementer is the one who controls what additional surface to expose. A public transactions attribute on CryptoWallet is reachable by anyone with a CryptoWallet reference — exactly the way WatchHistory._history leaked through recent() in Step 1. The Protocol is a contract for clients-as-Wallet; it isn’t a wall around the concrete class.
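A minimal sketch of the "floor, not a ceiling" point. The shape of the transaction dicts (an "amount" key) and the helper show_balance are assumptions made up for this example; only Wallet, CryptoWallet, balance, and transactions come from the question.

```python
from typing import Protocol


class Wallet(Protocol):
    def balance(self) -> int: ...


class CryptoWallet:
    def __init__(self) -> None:
        # Public attribute: extra surface beyond what Wallet declares.
        # The {"amount": ...} shape is an assumption for this sketch.
        self.transactions: list[dict] = [{"amount": 50}]

    def balance(self) -> int:
        return sum(t["amount"] for t in self.transactions)


def show_balance(wallet: Wallet) -> int:
    # A client typed against Wallet can only see balance().
    return wallet.balance()


wallet = CryptoWallet()
print(show_balance(wallet))  # 50

# Anyone holding the concrete object can still reach past the Protocol.
wallet.transactions.append({"amount": 1_000_000})
print(show_balance(wallet))  # 1000050
```

The Protocol constrained show_balance, not CryptoWallet. Closing the leak is the implementer's job, for example by storing the list in a leading-underscore attribute and exposing domain operations instead.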
An Interface That Tells You Too Much
Why this matters
Step 2 fixed a mutation leak. Step 3 gave you the declared contract — Protocol — for Step 2’s swap test. This step fixes a subtler problem: a contract that looks clean but over-specifies how it computes its answer. Parnas warned about this in 1972 with his KWIC example: an interface that says more than the client needs to know restricts future implementations. A music recommender that returns raw BM25 scores is the modern version. Switch the algorithm from BM25 to embeddings and every numeric threshold in the client breaks.
One new piece of syntax this step: typing.Literal. You already own Protocol from Step 3, so reuse it freely. The new content of this step is design judgment, not mechanics.
🎯 You will learn to
- Analyze a read-only API for over-specification — which numeric scales, internal IDs, or raw rows are visible that clients did not need
- Create a typing.Protocol plus a small dataclass so two different ranking strategies satisfy the same contract
- Apply the Parnas/Clements/Weiss module-guide mini-doc format: secret, likely changes, stable contract, what is not promised
One-minute primer on typing.Literal
typing.Literal lets a type be one of a fixed set of values:
from typing import Literal
Confidence = Literal["low", "medium", "high"]
Now confidence: Confidence means “must be the string low, medium, or high, and your type checker will yell if you try anything else.” It’s the right tool for a small enum of domain-meaningful labels.
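One caveat worth seeing in code: Literal is enforced by the type checker, not by the interpreter. If labels can arrive from outside (a config file, a request body), a runtime check is still needed. The helper parse_confidence below is hypothetical; typing.get_args is a standard-library function that returns the allowed values of a Literal.

```python
from typing import Literal, get_args

Confidence = Literal["low", "medium", "high"]


def parse_confidence(raw: str) -> Confidence:
    """Hypothetical helper: validate an untrusted string at runtime."""
    if raw not in get_args(Confidence):
        raise ValueError(f"not a confidence label: {raw!r}")
    return raw  # type: ignore[return-value]  # validated just above


print(get_args(Confidence))      # ('low', 'medium', 'high')
print(parse_confidence("high"))  # high

try:
    parse_confidence("urgent")
except ValueError as exc:
    print(exc)
```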
The scene
recommender.py ranks songs for a query. The current contract returns list[tuple[int, float, dict]] — (bucket_id, similarity_score, raw_row). sidebar.py thresholds at score >= 12.0 to call a hit “strong.” Today’s scorer is BM25-style; scores live in 0..30. Next quarter the team plans to swap in vector embeddings; scores will live in 0..1. Every threshold in every client will silently produce wrong answers.
✏️ Predict before you run
The bad design returns (bucket_id, score, row). If the recommender switches from BM25 to cosine-similarity embeddings, what is the most likely failure mode in the existing sidebar.py?
- (a) A crash — the new return type won’t match.
- (b) An empty sidebar — every score will be below the threshold 12.0, so no hits are “strong” anymore.
- (c) The sidebar shows literally every song — every score will be above 12.0.
- (d) The sidebar is unchanged — the contract types are the same.
Commit before reading on.
Reveal
Answer **(b)**. Cosine-similarity scores live in `0..1`. The old threshold `12.0` is now larger than the highest possible score, so the strong-hit list is always empty. **The sidebar just goes blank in production** with no exception, which is the worst kind of bug. The deep mistake is not in `sidebar.py`. It is in `recommender.py`'s contract, which exposed the numeric score and tied callers to its *scale*. In the terms of Parnas's 1972 paper, the interface gives more information than necessary, and that restricts which future implementations can satisfy it.
Scaffold: trace the leak before you code
Do this step in four small passes. The goal is to lower the typing load so your attention stays on the design decision.
| Pass | What to decide | What to edit |
|---|---|---|
| 1. Name the leak | sidebar.py knows the score scale, bucket IDs, and raw row shape. Those belong to the recommender implementation. | Do not touch code yet; point at the three leaking facts in the starter files. |
| 2. Replace the contract | Clients need “is this a strong hit?”, not “what was the raw score?” | Add Confidence, SongHit, and Recommender in recommender.py. |
| 3. Move the algorithm decision | Popularity buckets are one implementation’s secret. | Implement PopularityRecommender.recommend(...) behind the Protocol. |
| 4. Clean the client | The sidebar should only ask for hits and read domain fields. | Refactor support_sidebar(query, recommender) to filter on hit.confidence. |
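To make pass 1 concrete, the sketch below replays the sidebar's threshold against two score scales. The function strong_titles and every score value are made up for illustration; only the threshold 12.0 and the two scales (0..30 and 0..1) come from the scene above.

```python
STRONG_THRESHOLD = 12.0  # written with a 0..30 score scale in mind


def strong_titles(hits: list[tuple[str, float]]) -> list[str]:
    """Hypothetical stand-in for the sidebar's filter."""
    return [title for title, score in hits if score >= STRONG_THRESHOLD]


# Illustrative (title, score) pairs; the numbers are invented.
bm25_style = [("Bad Guy", 28.5), ("Lovely", 18.6), ("Ocean Eyes", 9.0)]
cosine_style = [("Bad Guy", 0.95), ("Lovely", 0.62), ("Ocean Eyes", 0.30)]

print(strong_titles(bm25_style))    # ['Bad Guy', 'Lovely']
print(strong_titles(cosine_style))  # []
```

Nothing raises and the return type is unchanged; the list is simply empty. That is why the fix belongs in the contract rather than in the threshold.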
Your task
Refactor recommender.py so the contract exposes only what the client genuinely needs:
- Define Confidence = Literal["low", "medium", "high"].
- Define @dataclass(frozen=True) class SongHit with track_id: str, title: str, artist: str, confidence: Confidence.
- Define class Recommender(Protocol) with def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ....
- Provide class PopularityRecommender: whose recommend method satisfies the Protocol. Use the helper _strong_track_table() already in the file to populate a few demo hits, and assign confidence based on internal popularity buckets (you choose how).
- Refactor sidebar.py so support_sidebar(query, recommender) takes a Recommender and returns titles of hits where hit.confidence == "high". No numeric thresholds anywhere in sidebar.py.
Also: write a module guide comment at the top of recommender.py in this exact format (you can fill in the values):
"""
Module guide:
Primary secret: <one sentence — name the volatile decision>
Likely changes: <bullets — BM25 -> embeddings, score scale shifts, ...>
Stable contract: <one sentence — what callers can rely on>
Not promised: <bullets — raw scores, bucket IDs, ranking algorithm, ...>
"""
Test 4 will look for those four labels (Primary secret, Likely changes, Stable contract, Not promised). Parnas, Clements, and Weiss called this artifact the module guide in their 1985 paper. It is the lightest-weight design doc you can write that still records why the boundary exists.
🪞 Before clicking Next
Once all four tests pass, take 20 seconds and answer in your head:
Without using the words “score” or “BM25”: if a future engineer reads sidebar.py, can they tell which ranking algorithm runs underneath? Why or why not?
The right answer (“no — sidebar only sees hit.confidence, which is a domain label, not an algorithm artifact”) is what you just bought with this refactor. The forbidden words force you to talk about the concept, not just point at the leak.
"""STARTING STATE.

Today's design returns ``list[tuple[bucket_id, score, raw_row]]``.
Score scale is 0..30 (BM25-like). Refactor as the instructions ask
so a future swap to embeddings (scale 0..1) does NOT break callers.
"""

# The recommender currently exposes raw scores, bucket IDs, and dict rows.
# That is an over-specified contract. Replace it.

_DEMO_CATALOG = [
    # (track_id, title, artist, internal_popularity_0_to_100)
    ("t1", "Bad Guy", "Billie Eilish", 95),
    ("t2", "Bury a Friend", "Billie Eilish", 78),
    ("t3", "Lovely", "Billie Eilish, Khalid", 62),
    ("t4", "Ocean Eyes", "Billie Eilish", 55),
    ("t5", "Happier Than Ever", "Billie Eilish", 88),
    ("t6", "All The Good Girls", "Billie Eilish", 40),
]


def _strong_track_table() -> list[tuple[str, str, str, int]]:
    """Return the demo catalog. Use the int popularity to choose confidence."""
    return list(_DEMO_CATALOG)


def recommend(query: str) -> list[tuple[int, float, dict]]:
    """LEAKY contract — returns (bucket_id, score, raw_row)."""
    raw = _strong_track_table()
    # Pretend BM25 scores in 0..30 derived from the popularity field.
    return [
        (
            i // 3,  # bucket_id leaks an internal partition
            round(pop * 30 / 100, 2),  # score scale 0..30 leaks the algorithm
            {"track_id": tid, "title": title, "artist": artist, "popularity": pop},
        )
        for i, (tid, title, artist, pop) in enumerate(raw)
    ]


# TODO replace the leaky surface above with:
#   1. ``Confidence = Literal["low", "medium", "high"]``
#   2. ``@dataclass(frozen=True) class SongHit`` with the fields named in the instructions
#   3. ``class Recommender(Protocol)`` with ``recommend(query, *, limit=5) -> list[SongHit]``
#   4. ``class PopularityRecommender`` implementing the Protocol
#
# And add the module-guide docstring at the top of the file.

# --- sidebar.py ---
"""Client that knows too much.

Refactor ``support_sidebar`` to take a ``Recommender`` and ask for
high-confidence hits — no raw scores, no thresholds.
"""
from recommender import recommend

STRONG_THRESHOLD = 12.0  # BM25 scale assumption — a leak waiting to break.


def support_sidebar(query: str) -> list[str]:
    hits = recommend(query)
    return [row["title"] for (_bucket, score, row) in hits if score >= STRONG_THRESHOLD]


if __name__ == "__main__":
    for title in support_sidebar("billie eilish"):
        print(title)
Solution
"""Recommender module.

Module guide:
    Primary secret: how songs are scored and ranked for a query
    Likely changes:
        - BM25 -> embeddings / hybrid retrieval
        - score-scale shifts (0..30 today, 0..1 tomorrow)
        - per-user personalization layer
        - swapping the catalog source (in-memory list -> vector DB)
    Stable contract: recommend(query, *, limit) -> list[SongHit]
                     with confidence in {"low", "medium", "high"}
                     sorted high -> medium -> low
    Not promised:
        - raw scores or score scale
        - bucket IDs or index-partition keys
        - tie-breaking order within a confidence band
"""
from dataclasses import dataclass
from typing import Literal, Protocol

Confidence = Literal["low", "medium", "high"]


@dataclass(frozen=True)
class SongHit:
    track_id: str
    title: str
    artist: str
    confidence: Confidence


class Recommender(Protocol):
    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ...


_DEMO_CATALOG = [
    ("t1", "Bad Guy", "Billie Eilish", 95),
    ("t2", "Bury a Friend", "Billie Eilish", 78),
    ("t3", "Lovely", "Billie Eilish, Khalid", 62),
    ("t4", "Ocean Eyes", "Billie Eilish", 55),
    ("t5", "Happier Than Ever", "Billie Eilish", 88),
    ("t6", "All The Good Girls", "Billie Eilish", 40),
]


def _strong_track_table() -> list[tuple[str, str, str, int]]:
    return list(_DEMO_CATALOG)


class PopularityRecommender:
    """Hidden secret: popularity-bucket ranking from an in-memory list."""

    _HIGH = 80
    _MEDIUM = 60

    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]:
        rows = _strong_track_table()
        hits = [
            SongHit(tid, title, artist, self._confidence_for(pop))
            for (tid, title, artist, pop) in rows
        ]
        # Stable sort: high -> medium -> low, catalog order within a band.
        order = {"high": 0, "medium": 1, "low": 2}
        hits.sort(key=lambda h: order[h.confidence])
        return hits[:limit]

    def _confidence_for(self, popularity: int) -> Confidence:
        if popularity >= self._HIGH:
            return "high"
        if popularity >= self._MEDIUM:
            return "medium"
        return "low"

# --- sidebar.py ---
"""Client that depends only on the Recommender Protocol."""
from recommender import Recommender


def support_sidebar(query: str, recommender: Recommender) -> list[str]:
    page = recommender.recommend(query, limit=5)
    return [hit.title for hit in page if hit.confidence == "high"]


if __name__ == "__main__":
    from recommender import PopularityRecommender

    for title in support_sidebar("billie", PopularityRecommender()):
        print(title)
What you just bought. The sidebar now depends only on the Recommender Protocol, a one-method shape. The popularity-based and embedding-based recommenders both satisfy it. When the team eventually swaps in embeddings, the sidebar code is untouched and the swap test proves it. Parnas's 1972 advice on specifications applies directly: give the user all the information needed to use the module correctly, and nothing more.
Why the module guide matters. Six months from now, when a teammate
asks “can we add raw_score?”, the docstring at the top of
recommender.py answers: that field is on the “Not promised” list,
for these specific reasons. The module guide is the lightest-weight
design doc you can write, and it costs four lines. Parnas, Clements,
and Weiss called this the module guide in their 1985 paper on the
A-7E flight-software project, where it was the artifact maintainers
consulted first to find which module to edit.
One subtle move. You also sorted the returned hits by confidence band (high → medium → low). That ordering is part of the stable contract — clients can rely on it. But notice that within a band the order is unspecified. That preserves the option to tie-break by recency, by user signal, by random shuffle for A/B testing — all future decisions you have not yet made.
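The swap can be sketched end to end in one file. Here EmbeddingRecommender is only a stand-in: its similarity values are hard-coded numbers on a 0..1 scale, not real embeddings, and the cutoffs and the helper _band are assumptions chosen for the demo. The point is that support_sidebar is identical for both backends even though their internal scales differ.

```python
from dataclasses import dataclass
from typing import Literal, Protocol

Confidence = Literal["low", "medium", "high"]


@dataclass(frozen=True)
class SongHit:
    track_id: str
    title: str
    artist: str
    confidence: Confidence


class Recommender(Protocol):
    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]: ...


def support_sidebar(query: str, recommender: Recommender) -> list[str]:
    page = recommender.recommend(query, limit=5)
    return [hit.title for hit in page if hit.confidence == "high"]


# Demo rows: (track_id, title, artist, popularity 0..100, similarity 0..1).
# The similarity column is invented for this sketch.
_ROWS = [
    ("t1", "Bad Guy", "Billie Eilish", 95, 0.91),
    ("t5", "Happier Than Ever", "Billie Eilish", 88, 0.84),
    ("t3", "Lovely", "Billie Eilish, Khalid", 62, 0.55),
]
_ORDER = {"high": 0, "medium": 1, "low": 2}


def _band(value: float, high: float, medium: float) -> Confidence:
    """Map a backend-specific number onto the shared domain labels."""
    if value >= high:
        return "high"
    if value >= medium:
        return "medium"
    return "low"


class PopularityRecommender:
    """Secret: integer popularity on a 0..100 scale."""

    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]:
        hits = [SongHit(tid, title, artist, _band(pop, 80, 60))
                for tid, title, artist, pop, _sim in _ROWS]
        return sorted(hits, key=lambda h: _ORDER[h.confidence])[:limit]


class EmbeddingRecommender:
    """Stand-in secret: similarity on a 0..1 scale (hard-coded here)."""

    def recommend(self, query: str, *, limit: int = 5) -> list[SongHit]:
        hits = [SongHit(tid, title, artist, _band(sim, 0.8, 0.5))
                for tid, title, artist, _pop, sim in _ROWS]
        return sorted(hits, key=lambda h: _ORDER[h.confidence])[:limit]


for backend in (PopularityRecommender(), EmbeddingRecommender()):
    print(type(backend).__name__, support_sidebar("billie", backend))
```

Both lines print the same two titles. Each backend owns its own cutoffs, so a change of scale stays inside the class that introduced it.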
Step 4 — Knowledge Check
Min. score: 80%
1. Parnas’s 1972 paper pointed out that even his “good” KWIC
decomposition had a leak: the circular-shift module exposed an
ordering its clients did not need. Which best matches the same
leak in this step’s original recommend() function?
Parnas’s own verdict in the KWIC discussion was that prescribing the order of the shifts gave more information than necessary and so unnecessarily restricted the class of systems that could be built without changing the definitions. The raw score and its scale are the same kind of representation detail: clients did not need to know them. Once clients can write score > 12.0, any new algorithm must produce numbers on the same scale. Confidence buckets ("high", "medium", "low") reveal what the client needs (is this hit strong enough to surface?) without naming the algorithm.
2. (Select all that apply.) Which of these are legitimate facts for the contract to promise to clients, and which would be leaks? Mark all the items that are LEAKS the contract should hide.
A clean module guide for Recommender says exactly this: Confidence is in the contract because clients can’t reason without it; descending-by-confidence is in the contract because it’s a domain-level promise; raw scores and bucket IDs are not in the contract because they tie the implementation to one algorithm and one index layout. The skill is naming what is allowed to vary later.
3. Spaced retrieval from Step 2. Suppose your PopularityRecommender
internally stores its catalog as a list[tuple]. Which class is
responsible for the decision “use a list of tuples”?
Each implementation behind a Protocol owns its own storage secret. The Protocol’s job is to make sure the clients never have to care. That is what lets EmbeddingRecommender use a vector store and PopularityRecommender use a list of tuples without forcing either choice on the other.
4. You’re reviewing a teammate’s PR. They added a new method to
Recommender: def raw_score(self, hit: SongHit) -> float.
Should you approve?
A subtle anti-pattern: adding to the contract can re-leak a decision you previously hid. Confidence labels were chosen precisely so the raw score and its scale wouldn’t be visible. The right move is to ask the teammate what client problem raw_score solves, and address that problem at the right level: for example, add an is_top_match(hit) predicate, expose a new confidence band like "very_high", or expose an explanation token. Never re-expose the score itself.
5. Spaced retrieval — Step 3 (Protocol mechanics). Your
PopularityRecommender was defined before Recommender(Protocol)
appeared in the same file. A second class, EmbeddingRecommender,
is defined in a test file — not even imported by recommender.py.
Both satisfy the Recommender Protocol. Why does that work?
Step 3’s central lesson, retrieved through this step’s lens. Protocols use structural subtyping, so neither order of definition nor explicit inheritance is needed. The Step 3 swap test (PlaylistLike) and this step’s swap test (Recommender + EmbeddingRecommender) are the same mechanism applied to two different domains. That is what makes the Protocol the right tool for “swap-this-for-that” designs: new implementations can be added later, anywhere, with no edit to the Protocol or its existing implementers.
Where Did You Put the Database?
Why this matters
The single most common information-hiding leak in real code is storage. A function that takes a sqlite3.Connection (or a MongoClient, or an S3 handle) and returns rows ties every caller to a specific persistence technology. When the team migrates from SQLite to Postgres, from rows to JSON, from synchronous to async, everything moves. This step is the canonical Parnas case made hands-on. You’ll do the whole routine yourself.
🎯 You will learn to
- Create a Protocol + dataclass + in-memory implementation from a leaky function — independently, using the five-step routine
- Apply dependency injection: pass the directory in to the client instead of constructing the storage inside it
- Evaluate the change-impact radius of a storage migration before and after your refactor
The scene
events.py looks up concerts by city. Today’s implementation uses SQLite. The function signature reveals it, so every client is coupled to sqlite3. The product manager wants to add a JSON-file-backed test fixture for offline development, and the SRE wants to migrate the production catalog to a remote HTTP service. Each of those is a separate file rewrite today. Your job is to make each of them one new class.
✏️ Predict before you run
Suppose we keep the current events.py signature and just implement a JSON-file fixture. How many files have to be edited to use it from tour_planner.py?
- (a) 1 — events.py only.
- (b) 2 — events.py and tour_planner.py.
- (c) 3+ — events.py, tour_planner.py, every test that constructs the connection, and any module that builds the SQL table string.
- (d) 0 — duck typing handles it; pass a JSON dict where a connection is expected.
Commit. After your refactor, the same change will require one new class in events.py and zero edits to tour_planner.py — that is your verification.
Scaffold: write the change map first
This step is the most independent refactor so far, but you still get a planning rail. Before touching code, complete this map mentally:
| Question | Answer for this step |
|---|---|
| What is likely to change? | SQLite may become Postgres, HTTP, or a JSON fixture. |
| What is the secret? | Persistence technology plus schema/row mapping. |
| Who may know it? | Concrete directory implementations such as SQLiteEventDirectory. |
| Who must not know it? | tour_planner.affordable_shows and tests that only need events. |
| What is the stable contract? | directory.find_in(city) -> list[Event]. |
Then code in passes: define Event, define the EventDirectory Protocol, make the tiny in-memory implementation, make the SQLite implementation, and only then refactor tour_planner.py. If a pass fails, you know which layer to fix.
Your task
Refactor events.py so the persistence decision is hidden:
- Define @dataclass(frozen=True) class Event with title: str, venue: str, date_iso: str, city: str, ticket_price_cents: int.
- Define class EventDirectory(Protocol) with def find_in(self, city: str) -> list[Event]: ....
- Implement class InMemoryEventDirectory: constructor takes a list[Event], find_in(city) filters by city. This is your test/fixture implementation.
- Implement class SQLiteEventDirectory: constructor takes a sqlite3.Connection and a table name, find_in(city) runs the same SQL the original function ran and maps rows to Event. Apart from wiring code, events.py is the only file that may import sqlite3.
Refactor tour_planner.py so affordable_shows(directory, city, max_price_dollars=50) takes an EventDirectory (not a connection). Filter inside the function using event.ticket_price_cents and return a list[Event].
Add the module guide docstring to events.py using the same four labels you used in Step 4.
You will probably break the implementation-swap test first. The most common cause is forgetting to map raw SQL row tuples back to Event objects in SQLiteEventDirectory.find_in. If the test fails, read its diff carefully — the failure is the lesson, not the verdict.
🪞 Before clicking Next
Once all four tests pass, answer this in your head before the quiz:
Without using the words “SQL” or “database”: after the refactor, affordable_shows calls one method on its parameter. Name that method and explain why that single call is enough to absorb every plausible storage migration (SQLite → Postgres → HTTP → file).
The forbidden words force you to describe the contract, not the current implementation. If you find yourself reaching for “SQL”, that is your brain telling you the contract still has a database shape in it — which would mean the abstraction is not really hiding storage.
"""Concert directory.

STARTING STATE: leaks sqlite3 and the row dict shape into every caller.
"""
import sqlite3


def find_events_in_city(
    connection: sqlite3.Connection,
    table: str,
    city: str,
) -> list[dict]:
    rows = connection.execute(
        f"SELECT title, venue, date_iso, city, ticket_price_cents "
        f"FROM {table} WHERE city = ?",
        (city,),
    ).fetchall()
    return [
        {
            "title": r[0],
            "venue": r[1],
            "date_iso": r[2],
            "city": r[3],
            "ticket_price_cents": r[4],
        }
        for r in rows
    ]


# TODO Run the five-step routine yourself:
#   1. Name the change. (One coming: SQLite -> Postgres -> HTTP service.)
#   2. Name the secret. (Persistence technology + schema mapping.)
#   3. Minimum client assumptions. (event has title, venue, date, city, price.)
#   4. Remove the leak.
#      - ``@dataclass(frozen=True) class Event``
#      - ``class EventDirectory(Protocol)`` with ``find_in(city) -> list[Event]``
#      - ``class InMemoryEventDirectory`` (constructor takes list[Event])
#      - ``class SQLiteEventDirectory`` (constructor takes connection + table)
#   5. Verify with a swap. (The hidden test will swap implementations.)

# --- tour_planner.py ---
"""Client of events.py. Currently knows about sqlite3 by transitive coupling.

Refactor ``affordable_shows`` to take an EventDirectory instead.
"""
from events import find_events_in_city


def affordable_shows(connection, table: str, city: str, max_price_dollars: int = 50):
    cents_limit = max_price_dollars * 100
    events = find_events_in_city(connection, table, city)
    return [e for e in events if e["ticket_price_cents"] <= cents_limit]


if __name__ == "__main__":
    # The demo wires SQLite in *this* file. That is the only place
    # the sqlite3 import is allowed AFTER the refactor.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shows("
        "title TEXT, venue TEXT, date_iso TEXT, city TEXT, ticket_price_cents INT)"
    )
    conn.executemany(
        "INSERT INTO shows VALUES (?, ?, ?, ?, ?)",
        [
            ("Sabrina Carpenter", "The Forum", "2026-03-01", "Los Angeles", 11500),
            ("Olivia Rodrigo", "Crypto.com Arena", "2026-03-05", "Los Angeles", 9800),
            ("Tame Impala", "Hollywood Bowl", "2026-04-12", "Los Angeles", 6700),
            ("Local Open Mic", "Echo Park Bar", "2026-03-15", "Los Angeles", 1500),
        ],
    )
    conn.commit()
    # After refactor:
    #   from events import SQLiteEventDirectory
    #   directory = SQLiteEventDirectory(conn, "shows")
    #   for ev in affordable_shows(directory, "Los Angeles", max_price_dollars=80):
    #       print(ev)
    for ev in affordable_shows(conn, "shows", "Los Angeles", max_price_dollars=80):
        print(ev)
Solution
"""Concert directory.

Module guide:
    Primary secret: how events are persisted and looked up
    Likely changes:
        - SQLite -> Postgres / remote HTTP service
        - column / schema renames
        - addition of caching or read replicas
    Stable contract: EventDirectory.find_in(city) -> list[Event]
                     Event is a frozen dataclass of domain fields
    Not promised:
        - the storage technology, connection object, or table name
        - SQL column names or row encoding
        - whether results are cached, paginated, or streamed
"""
from __future__ import annotations

import sqlite3
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Event:
    title: str
    venue: str
    date_iso: str
    city: str
    ticket_price_cents: int


class EventDirectory(Protocol):
    def find_in(self, city: str) -> list[Event]: ...


class InMemoryEventDirectory:
    """Test/fixture implementation. Constructor takes the events directly."""

    def __init__(self, events: list[Event]) -> None:
        self._events = list(events)  # defensive copy of the caller's list

    def find_in(self, city: str) -> list[Event]:
        return [e for e in self._events if e.city == city]


class SQLiteEventDirectory:
    """Production implementation. This is the ONLY file that knows SQLite."""

    _COLUMNS = "title, venue, date_iso, city, ticket_price_cents"

    def __init__(self, connection: sqlite3.Connection, table: str) -> None:
        self._conn = connection
        self._table = table

    def find_in(self, city: str) -> list[Event]:
        # The table name is interpolated because SQL identifiers cannot be
        # bound as parameters; it must come from trusted wiring code, never
        # from user input. The city value is bound with "?" as usual.
        rows = self._conn.execute(
            f"SELECT {self._COLUMNS} FROM {self._table} WHERE city = ?",
            (city,),
        ).fetchall()
        # Column order in _COLUMNS matches the field order of Event.
        return [Event(*r) for r in rows]

# --- tour_planner.py ---
from events import EventDirectory, Event


def affordable_shows(
    directory: EventDirectory,
    city: str,
    max_price_dollars: int = 50,
) -> list[Event]:
    cents_limit = max_price_dollars * 100
    return [e for e in directory.find_in(city) if e.ticket_price_cents <= cents_limit]


if __name__ == "__main__":
    import sqlite3

    from events import SQLiteEventDirectory

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shows("
        "title TEXT, venue TEXT, date_iso TEXT, city TEXT, ticket_price_cents INT)"
    )
    conn.executemany(
        "INSERT INTO shows VALUES (?, ?, ?, ?, ?)",
        [
            ("Sabrina Carpenter", "The Forum", "2026-03-01", "Los Angeles", 11500),
            ("Olivia Rodrigo", "Crypto.com Arena", "2026-03-05", "Los Angeles", 9800),
            ("Tame Impala", "Hollywood Bowl", "2026-04-12", "Los Angeles", 6700),
            ("Local Open Mic", "Echo Park Bar", "2026-03-15", "Los Angeles", 1500),
        ],
    )
    conn.commit()
    directory = SQLiteEventDirectory(conn, "shows")
    for ev in affordable_shows(directory, "Los Angeles", max_price_dollars=80):
        print(ev)
The Parnas case in one tutorial. events.py is now the only
file that knows the persistence decision. tour_planner.py knows
only the EventDirectory Protocol and the Event dataclass.
Migrating to Postgres or an HTTP service is one new class.
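As one illustration of "one new class", here is a sketch of the JSON-file fixture the product manager asked for. JsonFileEventDirectory and the file layout (a single JSON array of objects keyed by the Event field names) are assumptions made for this example, not part of the exercise. Event, EventDirectory, and affordable_shows are repeated from the solution so the block runs on its own.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Protocol


@dataclass(frozen=True)
class Event:
    title: str
    venue: str
    date_iso: str
    city: str
    ticket_price_cents: int


class EventDirectory(Protocol):
    def find_in(self, city: str) -> list[Event]: ...


class JsonFileEventDirectory:
    """Hypothetical fixture backend: one JSON array of event objects."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def find_in(self, city: str) -> list[Event]:
        records = json.loads(self._path.read_text(encoding="utf-8"))
        return [Event(**r) for r in records if r["city"] == city]


def affordable_shows(
    directory: EventDirectory, city: str, max_price_dollars: int = 50
) -> list[Event]:
    # Same body as the solution's tour_planner.affordable_shows.
    cents_limit = max_price_dollars * 100
    return [e for e in directory.find_in(city) if e.ticket_price_cents <= cents_limit]


with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "shows.json"
    fixture.write_text(
        json.dumps([
            asdict(Event("Tame Impala", "Hollywood Bowl", "2026-04-12", "Los Angeles", 6700)),
            asdict(Event("Local Open Mic", "Echo Park Bar", "2026-03-15", "Los Angeles", 1500)),
        ]),
        encoding="utf-8",
    )
    cheap = affordable_shows(JsonFileEventDirectory(fixture), "Los Angeles")

print([e.title for e in cheap])  # ['Local Open Mic']
```

The client function is byte-for-byte the one from the solution; only the object passed in changed.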
What you proved with the swap test. When the same affordable_shows
function ran against InMemoryEventDirectory and SQLiteEventDirectory
and returned the same set of titles, you proved the function couldn’t
be reaching into storage internals. That’s the operational definition
of “the secret is hidden” — the test passing is the evidence.
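The hidden test itself is not shown in this tutorial, but a swap test of this kind might look roughly like the sketch below. The helper names make_sqlite_directory and affordable_titles are hypothetical, and the Berkeley row is made-up data added so the city filter is exercised. The classes are repeated from the solution to keep the block self-contained.

```python
import sqlite3
from dataclasses import astuple, dataclass
from typing import Protocol


@dataclass(frozen=True)
class Event:
    title: str
    venue: str
    date_iso: str
    city: str
    ticket_price_cents: int


class EventDirectory(Protocol):
    def find_in(self, city: str) -> list[Event]: ...


class InMemoryEventDirectory:
    def __init__(self, events: list[Event]) -> None:
        self._events = list(events)

    def find_in(self, city: str) -> list[Event]:
        return [e for e in self._events if e.city == city]


class SQLiteEventDirectory:
    _COLUMNS = "title, venue, date_iso, city, ticket_price_cents"

    def __init__(self, connection: sqlite3.Connection, table: str) -> None:
        self._conn = connection
        self._table = table

    def find_in(self, city: str) -> list[Event]:
        rows = self._conn.execute(
            f"SELECT {self._COLUMNS} FROM {self._table} WHERE city = ?", (city,)
        ).fetchall()
        return [Event(*r) for r in rows]


def affordable_shows(directory: EventDirectory, city: str, max_price_dollars: int = 50):
    cents_limit = max_price_dollars * 100
    return [e for e in directory.find_in(city) if e.ticket_price_cents <= cents_limit]


EVENTS = [
    Event("Sabrina Carpenter", "The Forum", "2026-03-01", "Los Angeles", 11500),
    Event("Tame Impala", "Hollywood Bowl", "2026-04-12", "Los Angeles", 6700),
    Event("Local Open Mic", "Echo Park Bar", "2026-03-15", "Los Angeles", 1500),
    Event("Campus Showcase", "Student Union", "2026-05-02", "Berkeley", 1000),  # made up
]


def make_sqlite_directory(events: list[Event]) -> SQLiteEventDirectory:
    """Hypothetical test helper: load the same events into in-memory SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shows("
        "title TEXT, venue TEXT, date_iso TEXT, city TEXT, ticket_price_cents INT)"
    )
    conn.executemany("INSERT INTO shows VALUES (?, ?, ?, ?, ?)", [astuple(e) for e in events])
    return SQLiteEventDirectory(conn, "shows")


def affordable_titles(directory: EventDirectory) -> set[str]:
    return {e.title for e in affordable_shows(directory, "Los Angeles", 80)}


in_memory = affordable_titles(InMemoryEventDirectory(EVENTS))
sqlite_backed = affordable_titles(make_sqlite_directory(EVENTS))
assert in_memory == sqlite_backed
print(sorted(in_memory))  # ['Local Open Mic', 'Tame Impala']
```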
One pedagogically important note about your __main__ demo. The import sqlite3 in tour_planner.py’s __main__ is fine: that’s the wiring layer (sometimes called the composition root or bootstrap). Wiring is where you finally pick which concrete implementation to use. The rule isn’t “nobody outside events.py may say sqlite3”; the rule is “nobody outside events.py may depend on sqlite3 in their business logic.” Wiring code is allowed, because the choice has to live somewhere.
Step 5 — Knowledge Check
Min. score: 80%
1. After the refactor, which of these changes touches only one file (the file that owns the storage secret)?
Storage is the secret. An HttpEventDirectory is a new class behind the same EventDirectory Protocol: zero edits to tour_planner.py, zero edits to Event, zero edits to InMemoryEventDirectory. That is what you bought with the refactor.
2. (Select all that apply.) Which of these are good reasons your
InMemoryEventDirectory is worth writing, even though production
uses SQLite?
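Information hiding pays in three currencies, all visible here: testability (1), parallel work (2), and comprehensibility (3). Parallel development and comprehensibility are two of the three benefits Parnas lists in the 1972 paper; the third, the flexibility to change one module without touching the others, is what makes a test double like this possible. The fourth option sounds plausible but isn’t actually true.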
3. A teammate writes a new client function that takes an
EventDirectory parameter. But for “convenience,” they also
add a second parameter connection: sqlite3.Connection,
because they need to do a one-off transaction-level operation.
What is the problem?
This is a real, subtle anti-pattern: “leaky abstraction by parameter creep.” With a connection parameter, the client now works only with SQLite. The Protocol promises you don’t need a connection to ask about events, and that promise breaks the moment a function also requires a connection. The fix is to push the transactional operation behind the Protocol (e.g., a new EventDirectory.archive(event) method) so the connection stays local to the directory implementation.
4. Spaced retrieval — Step 4 (overspecification). Your EventDirectory.find_in(city) returns
list[Event]. The team is asked to add pagination. Two design proposals:
- A: find_in(self, city: str, *, page: int, page_size: int) -> list[Event]
- B: find_in(self, city: str, *, cursor: str | None = None, limit: int = 50) -> EventPage where EventPage is @dataclass(frozen=True) with events: list[Event] and next_cursor: str | None.
Clients only care that there is a next_cursor (or none) — same lesson as the recommender’s Confidence enum. Naming the domain concept (next_cursor) instead of the algorithm detail (page: int) hides one more design decision. The reader who studied Step 4 should now find this pattern instinctive — a faded transfer of the exact same idea, applied to a different domain.
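Proposal B might be sketched as follows. Event is trimmed to two fields to keep the example short, and the helper all_titles is hypothetical. The cursor encoding used here (a stringified offset) is this one backend's private choice; a database-backed directory could encode a row key instead, and the client loop would not change.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Event:  # trimmed to two fields for this sketch
    title: str
    city: str


@dataclass(frozen=True)
class EventPage:
    events: list[Event]
    next_cursor: str | None


class InMemoryEventDirectory:
    def __init__(self, events: list[Event]) -> None:
        self._events = list(events)

    def find_in(self, city: str, *, cursor: str | None = None, limit: int = 50) -> EventPage:
        matches = [e for e in self._events if e.city == city]
        # Backend secret: the cursor happens to be an offset in string form.
        start = int(cursor) if cursor is not None else 0
        end = start + limit
        next_cursor = str(end) if end < len(matches) else None
        return EventPage(matches[start:end], next_cursor)


def all_titles(directory: InMemoryEventDirectory, city: str) -> list[str]:
    """Hypothetical client: treats the cursor as an opaque token."""
    titles: list[str] = []
    cursor = None
    while True:
        page = directory.find_in(city, cursor=cursor, limit=2)
        titles.extend(e.title for e in page.events)
        if page.next_cursor is None:
            return titles
        cursor = page.next_cursor


directory = InMemoryEventDirectory(
    [Event(title, "Los Angeles") for title in "abcde"] + [Event("z", "Berkeley")]
)
print(all_titles(directory, "Los Angeles"))  # ['a', 'b', 'c', 'd', 'e']
```

The client never does arithmetic on the cursor; it only passes back what it was given.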
5. Spaced retrieval — Step 2 (representation). Your Event
dataclass is declared @dataclass(frozen=True). Why frozen,
specifically, and not just @dataclass?
Step 2’s lesson, retrieved through the Step 5 lens. Step 2’s Track was frozen so top_tracks(3) could safely return references to internal tracks without callers mutating them. Step 5’s Event is frozen for the same reason: an in-memory directory may cache Event instances, and a caller who mutates one would corrupt that cache without warning. Frozen dataclasses are the cheapest way to make a domain object both typed and safe to hand out — the same one-line move you used in Step 2 generalizes here.
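A short demonstration of what frozen=True buys. The variable names are illustrative; FrozenInstanceError and replace are both part of the standard dataclasses module.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Event:
    title: str
    venue: str
    date_iso: str
    city: str
    ticket_price_cents: int


cached = Event("Tame Impala", "Hollywood Bowl", "2026-04-12", "Los Angeles", 6700)

try:
    cached.ticket_price_cents = 0  # a caller trying to edit a shared instance
except FrozenInstanceError:
    print("mutation rejected")

# Callers that need a variant build a new object; the cached one is untouched.
discounted = replace(cached, ticket_price_cents=5000)
print(cached.ticket_price_cents, discounted.ticket_price_cents)  # 6700 5000
```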
Single Choice: Stop Repeating the Provider List
🧠 Before you read — retrieve from memory
You’re about to do refactor #4. Before another worked example layers on top of the routine you’ve practiced three times already, your brain needs a chance to produce it cold — that’s what makes the next refactor cheaper than the last one, instead of just longer to read.
You’ve now done three refactors that each followed the same five-step routine. Cover the screen and write the five labels of that routine from memory. (A scrap of paper, a comment in your editor, or your head — any form is fine. Just don’t peek.)
Reveal (after you've written your version)
The canonical labels (the same five every time, from Parnas's design-for-change discipline):
```text
1. Name the change. What is about to change, and why is it likely?
2. Name the secret. Which design decision should one module own?
3. Minimum client assumptions. What does the client *actually* need to know?
4. Remove the leak. Replace exposed representation with domain operations, a Protocol, dependency injection — whatever names the contract without naming the decision.
5. Verify with a swap. Same client, different implementation, same output.
```
If yours matched word-for-word, your schema is solidifying, which is exactly what spaced retrieval is supposed to do. If you got 4 out of 5 (most students do by this point), notice *which* you missed: the one most often dropped is **#3** (minimum client assumptions), because it's the only step that asks you to reason about the *client* rather than the module being refactored. Karpicke & Roediger (2008) found that repeated retrieval practice produced markedly better delayed recall than repeated study of the same material. The 30 seconds you just spent writing the routine from memory is the cheapest learning move in this tutorial.
Why this matters
Open any production codebase and search for if provider ==. You’ll find the same alphabetical list of providers in four files. Add a fifth provider and you edit all four — and inevitably miss one, shipping a “feature works on Spotify but silently breaks on Tidal” bug. The SEBook chapter calls this the Single Choice principle: when a system supports several alternatives, only one module should know the exhaustive list. This step makes Single Choice operational. The killer test: you’ll add a fourth provider — invisible to your refactored code — and three client functions will work with it unchanged.
🎯 You will learn to
- Apply the Single Choice principle by replacing scattered if provider == "..." switches with polymorphism behind a hidden choice point
- Analyze code for repeated exhaustive lists (the same set of "spotify", "apple_music", "tidal" strings in multiple files is the smell)
- Create a new provider class that satisfies the StreamingProvider Protocol — and feel that no existing client function had to change to absorb it
The scene
streaming.py has three top-level functions: play_track, share_track, like_track. Each one has the identical if provider == "spotify": ... elif provider == "apple_music": ... elif provider == "tidal": ... ladder. The product manager just said: “Add YouTube Music. Same operations.” The bad design: three edits, one in each function. The good design: one new class. The test enforces the second.
✏️ Predict before you run
Today’s streaming.py repeats the provider list in three functions. If we add YouTube Music in the current design, how many elif branches must be added across the file?
- (a) 1 — a new branch in one function is enough.
- (b) 3 — one new branch per function, three total.
- (c) 4 — three new branches plus a new helper function.
- (d) 0 — Python’s match statement handles it.
Commit. Then refactor and see the answer for the good design.
Your task
Refactor streaming.py:
- Define class StreamingProvider(Protocol) with play(self, track_id) -> str, share(self, track_id, friend) -> str, like(self, track_id) -> str. Each returns the message string that the current code prints.
- Define class SpotifyProvider, class AppleMusicProvider, class TidalProvider — each implements all three methods.
- Rewrite play_track(provider: StreamingProvider, track_id: str), share_track(...), and like_track(...) so each just delegates to the corresponding method on the passed-in provider — no if/elif/match ladders anywhere.
The hidden test will then construct a fourth provider — YouTubeMusicProvider — which your code has never seen. If your play_track/share_track/like_track functions are properly polymorphic, that fourth provider will Just Work. If any branching on "youtube_music" is needed, the test fails.
🪞 Before clicking Next
Once all three tests pass, do this self-check before the quiz:
Without using the word “Protocol”: search this tutorial mentally across all four refactors (Steps 2, 4, 5, and 6). In each one, you replaced direct exposure of a design decision with what kind of thing? The four answers are different in form but all instances of the same move.
The four are: (Step 2) domain operations on a class, (Step 4) a typed shape + dataclass, (Step 5) dependency injection of a typed shape, (Step 6) polymorphism on a typed shape. Each one is a different way to make a contract not name the volatile decision. (The forbidden word forces you to name what each refactor was for, not the Python construct it used.) The quiz’s last question asks this in MCQ form.
"""STARTING STATE.

Three functions, each with the same provider ladder. The "exhaustive
list of providers" is duplicated three times. Refactor with
polymorphism behind a hidden choice point.
"""


def play_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Playing {track_id} on Spotify..."
    elif provider == "apple_music":
        return f"Playing {track_id} on Apple Music..."
    elif provider == "tidal":
        return f"Streaming {track_id} on Tidal hi-fi..."
    else:
        raise ValueError(f"Unknown provider: {provider}")


def share_track(provider: str, track_id: str, friend: str) -> str:
    if provider == "spotify":
        return f"Shared Spotify link {track_id} with {friend}"
    elif provider == "apple_music":
        return f"Sent Apple Music card for {track_id} to {friend}"
    elif provider == "tidal":
        return f"Tidal shared {track_id} to {friend}"
    else:
        raise ValueError(f"Unknown provider: {provider}")


def like_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Liked Spotify track {track_id}"
    elif provider == "apple_music":
        return f"Loved Apple Music track {track_id}"
    elif provider == "tidal":
        return f"Added Tidal track {track_id} to favorites"
    else:
        raise ValueError(f"Unknown provider: {provider}")


# TODO Replace the ladders with:
#   1. ``class StreamingProvider(Protocol)`` (play, share, like)
#   2. ``SpotifyProvider``, ``AppleMusicProvider``, ``TidalProvider``
#   3. Rewrite play_track / share_track / like_track to delegate


if __name__ == "__main__":
    print(play_track("spotify", "t1"))
    print(share_track("apple_music", "t1", "Alex"))
    print(like_track("tidal", "t9"))
Solution
"""Polymorphism behind a hidden choice point — Single Choice in one file."""
from typing import Protocol


class StreamingProvider(Protocol):
    def play(self, track_id: str) -> str: ...
    def share(self, track_id: str, friend: str) -> str: ...
    def like(self, track_id: str) -> str: ...


class SpotifyProvider:
    def play(self, track_id):
        return f"Playing {track_id} on Spotify..."

    def share(self, track_id, friend):
        return f"Shared Spotify link {track_id} with {friend}"

    def like(self, track_id):
        return f"Liked Spotify track {track_id}"


class AppleMusicProvider:
    def play(self, track_id):
        return f"Playing {track_id} on Apple Music..."

    def share(self, track_id, friend):
        return f"Sent Apple Music card for {track_id} to {friend}"

    def like(self, track_id):
        return f"Loved Apple Music track {track_id}"


class TidalProvider:
    def play(self, track_id):
        return f"Streaming {track_id} on Tidal hi-fi..."

    def share(self, track_id, friend):
        return f"Tidal shared {track_id} to {friend}"

    def like(self, track_id):
        return f"Added Tidal track {track_id} to favorites"


def play_track(provider: StreamingProvider, track_id: str) -> str:
    return provider.play(track_id)


def share_track(provider: StreamingProvider, track_id: str, friend: str) -> str:
    return provider.share(track_id, friend)


def like_track(provider: StreamingProvider, track_id: str) -> str:
    return provider.like(track_id)


# ---- Wiring (the ONE place that knows the exhaustive list of providers) ----
_REGISTRY: dict[str, type[StreamingProvider]] = {
    "spotify": SpotifyProvider,
    "apple_music": AppleMusicProvider,
    "tidal": TidalProvider,
}


def provider_for(name: str) -> StreamingProvider:
    """Composition-root helper: pick a provider by name from a config string."""
    if name not in _REGISTRY:
        raise ValueError(f"Unknown provider: {name}")
    return _REGISTRY[name]()


if __name__ == "__main__":
    print(play_track(provider_for("spotify"), "t1"))
    print(share_track(provider_for("apple_music"), "t1", "Alex"))
    print(like_track(provider_for("tidal"), "t9"))
The Single Choice payoff, in one sentence. Adding a fourth provider
is one new class and one new entry in the wiring registry. The
client functions play_track, share_track, like_track do not change.
The test proved it by constructing YouTubeMusicProvider outside your
code and passing it through — zero edits required.
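The hidden test itself is not shown in this tutorial, but a swap check of that kind could look roughly like the sketch below. The `YouTubeMusicProvider` message strings are invented for illustration; the Protocol and the three delegating functions repeat the solution so the block runs on its own.

```python
from typing import Protocol


class StreamingProvider(Protocol):
    def play(self, track_id: str) -> str: ...
    def share(self, track_id: str, friend: str) -> str: ...
    def like(self, track_id: str) -> str: ...


def play_track(provider: StreamingProvider, track_id: str) -> str:
    return provider.play(track_id)


def share_track(provider: StreamingProvider, track_id: str, friend: str) -> str:
    return provider.share(track_id, friend)


def like_track(provider: StreamingProvider, track_id: str) -> str:
    return provider.like(track_id)


class YouTubeMusicProvider:
    """Hypothetical fourth provider, defined outside the refactored module."""

    def play(self, track_id: str) -> str:
        return f"Playing {track_id} on YouTube Music..."

    def share(self, track_id: str, friend: str) -> str:
        return f"Shared YouTube Music link {track_id} with {friend}"

    def like(self, track_id: str) -> str:
        return f"Liked YouTube Music track {track_id}"


# Same client functions, new implementation, no branching on a provider name.
yt = YouTubeMusicProvider()
results = [
    play_track(yt, "t1"),
    share_track(yt, "t1", "Alex"),
    like_track(yt, "t9"),
]
```

Because `Protocol` uses structural typing, `YouTubeMusicProvider` does not need to inherit from `StreamingProvider`; having the three methods with matching signatures is enough.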
Where the choice still lives. Notice that the exhaustive list of
provider names does still exist — in _REGISTRY. That’s deliberate.
Single Choice doesn’t say “no module knows the list.” It says “only
one module knows the list.” The wiring layer is that module. Every
other module sees a StreamingProvider and forgets which one it is.
The general pattern this step taught. When you find the same
exhaustive list (provider, payment_method, tax_jurisdiction,
auth_strategy, etc.) appearing in if ladders in multiple files,
the fix is always the same shape:
- Define a Protocol for the operation set.
- Make each alternative a class implementing the Protocol.
- Have client code call the operation on an injected instance.
- Put the exhaustive list in one wiring/registry module.
This is the chapter’s Single Choice principle, made operational. Now, when you encounter the same shape at work (or in your CS130 group project), you have a routine — not just a name.
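To see that the shape transfers, here is the same four-part pattern sketched for payment methods. Every name in it (`PaymentMethod`, `CardPayment`, `BankTransferPayment`, `checkout`, `method_for`) is a hypothetical stand-in, not part of the tutorial's code.

```python
from typing import Protocol


class PaymentMethod(Protocol):
    # 1. A Protocol for the operation set.
    def charge(self, amount_cents: int) -> str: ...


class CardPayment:
    # 2. Each alternative is a class that satisfies the Protocol.
    def charge(self, amount_cents: int) -> str:
        return f"Charged {amount_cents} cents to card"


class BankTransferPayment:
    def charge(self, amount_cents: int) -> str:
        return f"Requested bank transfer of {amount_cents} cents"


def checkout(method: PaymentMethod, amount_cents: int) -> str:
    # 3. Client code calls the operation on whatever instance was injected.
    return method.charge(amount_cents)


# 4. The one place that knows the exhaustive list of alternatives.
_METHODS: dict[str, type[PaymentMethod]] = {
    "card": CardPayment,
    "bank_transfer": BankTransferPayment,
}


def method_for(name: str) -> PaymentMethod:
    try:
        return _METHODS[name]()
    except KeyError:
        raise ValueError(f"Unknown payment method: {name}") from None


receipt = checkout(method_for("card"), 1299)
```

In a larger codebase the registry and `method_for` would probably live in a separate wiring module; they share a file here only to keep the sketch self-contained.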
Step 6 — Knowledge Check
Min. score: 80%
1. The Single Choice principle says: if a system supports several alternatives, only one module should know the exhaustive list. In your refactored code, where does the list of supported providers actually live?
The polymorphic dispatch (provider.play(track_id)) replaces the if/elif ladder, but the list still has to exist somewhere — when the app starts up, someone decides “today we’re using YouTube Music.” That somewhere is the wiring code (composition root) — and now it’s the only place that knows the full list. Adding a fifth provider is one new class + one new wiring entry. That’s Single Choice.
2. Before the refactor, the same if provider == "spotify": ... elif "apple_music" ...
ladder appeared in three functions. What kind of coupling
connected those three functions?
Semantic coupling is the SEBook chapter’s term for “two modules share the same assumption without saying so.” Change the list in one place and the others silently disagree. The provider list scattered across three functions was a textbook case. Static tools could not reliably catch it: a type checker sees three valid string comparisons, and grep only finds the ladders you already know to search for. The polymorphism refactor removes the shared assumption from those three functions, which now know only the provider's methods (provider.play(...) and so on). The assumption now lives in one place: the wiring.
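A small sketch of that silent disagreement, using trimmed-down versions of the pre-refactor ladders. The scenario (one ladder updated for a hypothetical `"youtube_music"` string, the other forgotten) is invented for illustration:

```python
def play_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Playing {track_id} on Spotify..."
    elif provider == "youtube_music":  # someone remembered to update this ladder
        return f"Playing {track_id} on YouTube Music..."
    raise ValueError(f"Unknown provider: {provider}")


def like_track(provider: str, track_id: str) -> str:
    if provider == "spotify":
        return f"Liked Spotify track {track_id}"
    # ...but nobody updated this one, and no tool complained.
    raise ValueError(f"Unknown provider: {provider}")


play_ok = play_track("youtube_music", "t1")

try:
    like_track("youtube_music", "t1")
    like_failed = False
except ValueError:
    like_failed = True
```

Both functions are individually correct and type-check cleanly; the bug exists only in the relationship between them, which is what makes this kind of coupling hard to see.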
3. (Select all that apply.) Which of these is now CHEAP to do, after your Single Choice refactor?
Three of these are absorbed by the refactor; the fourth (renaming like to favorite) is a contract change and ripples. A/B-testing two implementations of the same provider at once is the test (2) — and it’s only possible because the wiring layer hands out provider instances, not provider strings. That kind of flexibility is one of the quietest big wins of polymorphism-behind-a-Protocol.
4. Spaced retrieval — Step 1 (private isn’t enough). A teammate
asks: “Could we have solved the original provider-coupling problem
just by making provider a private field on a single shared
module, instead of three top-level functions?”
What’s the cleanest objection?
This is Step 1’s lesson, retrieved through Step 6’s lens. Visibility modifiers are not the unit of information hiding. Hiding the provider: str field as self._provider would still leave the same if self._provider == "spotify": ... elif ... ladder in every method — the shape of the leak doesn’t change. The Single Choice violation lives in the branching ladder, not the variable name. The fix is structural (polymorphism on a Protocol), not lexical (more underscores). Same lesson as Step 1’s WatchHistory._history: the underscore did not hide the design decision.
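A sketch of what the teammate's proposal might look like in practice. `StreamingService` is a hypothetical class name; the point is that the ladder survives intact in every method even though the field is now underscore-private.

```python
class StreamingService:
    def __init__(self, provider: str) -> None:
        self._provider = provider  # "private", as the teammate proposed

    def play(self, track_id: str) -> str:
        if self._provider == "spotify":
            return f"Playing {track_id} on Spotify..."
        elif self._provider == "tidal":
            return f"Streaming {track_id} on Tidal hi-fi..."
        raise ValueError(f"Unknown provider: {self._provider}")

    def like(self, track_id: str) -> str:
        # The same exhaustive list again: the field is hidden, the decision is not.
        if self._provider == "spotify":
            return f"Liked Spotify track {track_id}"
        elif self._provider == "tidal":
            return f"Added Tidal track {track_id} to favorites"
        raise ValueError(f"Unknown provider: {self._provider}")


service = StreamingService("youtube_music")
try:
    service.play("t1")
    supported = True
except ValueError:
    supported = False
```

Supporting a new provider here still means editing every method of the class, which is the same cost as before the underscore was added.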
5. Spaced retrieval across the tutorial. Which of the following best describes the single common move that connects Steps 2, 4, 5, and 6?
Every refactor step did the same operation at a different level: identify the design decision that is likely to change, then replace its direct exposure with a stable contract that does not name it. Storage (Step 2). Algorithm + score scale (Step 4). Persistence technology (Step 5). Exhaustive list of providers (Step 6). The form changes per step (domain methods, then Protocol + dataclass, then dependency injection, then polymorphism), but the principle is one move repeated. That’s the entire skill this tutorial trains. Step 3 gave you the Python construct that all of them used; Step 7 will ask you to recognize the type of leak before you fix it.
Sort the Leaks
Why this matters
Steps 2-6 each taught one kind of leak in isolation — that’s blocked practice, and it’s the right shape for building each schema. But real codebases mix leak types, and the skill an engineer actually needs is classification first, fix second: read a snippet, identify which kind of leak it is (or whether it’s a leak at all), and then pick the right routine.
This step is pure judgment — no code to write, no files to refactor. Six short snippets. For each one, you decide what kind of leak (if any) is present and which step’s routine fixes it.
🎯 You will learn to
- Discriminate between the four leak types you’ve practiced — by attending to deep structure, not surface cues
- Recognize when a snippet is not a leak, and resist the “always abstract” instinct
- Match each leak to the step that taught its fix (representation = Step 2, overspecification = Step 4, persistence = Step 5, exhaustive-alternatives = Step 6)
How to read each snippet
Every snippet has a specific design decision visible (or appropriately hidden). The deep-structure cue you’re looking for: what would have to change in clients if the implementation chose differently? If nothing would, it’s not a leak. If many clients would, name the type and pick the routine.
The same five-step routine you retrieved at the start of Step 6 applies to every fix. This step trains the which routine judgment that comes before applying it.
Research base: Rohrer & Taylor (2007) and Dunlosky et al. (2013) find that interleaved practice produces worse performance during practice but dramatically better transfer afterward, because mixing examples forces attention to the structural feature rather than the surface feature. The six snippets in this step might feel harder than Steps 2-6 did. That’s the point.
# Step 7 — Sort the Leaks
Six short snippets are in the quiz on the right. Each shows a small
Python module. For each one, decide:
1. Is there a leak?
2. If yes, which *kind* — representation (Step 2), over-specification
(Step 4), persistence (Step 5), or exhaustive-alternatives (Step 6)?
3. If no, why is the abstraction unnecessary here?
You will not edit code in this step. The skill being trained is
classification — the move that comes *before* picking a fix.
Solution
# Step 7 — answer key
| Snippet | Leak type | Routine to apply |
|---------|---------------------------------|-------------------|
| 1 | Representation | Step 2 — frozen dataclass + domain methods |
| 2 | Over-specification | Step 4 — Protocol + domain-level labels |
| 3 | Persistence | Step 5 — Protocol over storage, dependency injection |
| 4 | Exhaustive alternatives | Step 6 — polymorphism behind a Protocol |
| 5 | **Not a leak** — scope too small | Don't refactor; revisit if growth makes it plausible |
| 6 | Representation **and** Exhaustive alternatives | Steps 2 + 6, combined |
What you just trained. Each of Steps 2-6 taught one kind of leak in isolation — blocked practice, useful for building each schema. This step mixed them — interleaved practice, useful for building discrimination. Research (Rohrer & Taylor 2007; Dunlosky et al. 2013) finds that interleaved practice feels harder during practice but produces dramatically better transfer afterward, because it forces attention to the structural feature (the design decision being exposed) rather than the surface feature (the language construct or domain vocabulary).
The honest one (Snippet 5). The correct answer was “don’t refactor”. If the entire takeaway of this tutorial were “always hide”, you’d over-apply the principle and produce abstractions nobody pays for. The principle is a bet on future change. Bet where change is plausible; abstain where it isn’t. The next step — change-impact prediction on a whole system — uses this same calibration.
The two-leak one (Snippet 6). Real codebases stack leaks. The fix is to apply both routines, in either order. The fact that you can name which routine applies to which leak is the operational form of the skill the tutorial trains.
Step 7 — Knowledge Check
Min. score: 80%
1. Snippet 1.
class ConcertCalendar:
    def __init__(self):
        self._dates: list[dict] = []

    def add(self, date_iso: str, venue: str):
        self._dates.append({"date": date_iso, "venue": venue})

    def all(self) -> list[dict]:
        return self._dates

What kind of leak does all() introduce?
Same shape as Step 2’s Playlist before refactoring, and Step 1’s WatchHistory.recent() before its fix. Leaked decisions: “storage is a list”, “items are dicts with these keys”, “iteration order is insertion order”, “clients can mutate it.” The fix is the representation refactor (Step 2 routine): a frozen dataclass + domain methods, with all() returning a domain-typed sequence.
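One possible shape for that fix, as a sketch rather than the tutorial's official answer. `ConcertDate` and `at_venue` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConcertDate:
    date_iso: str
    venue: str


class ConcertCalendar:
    def __init__(self) -> None:
        self._dates: list[ConcertDate] = []

    def add(self, date_iso: str, venue: str) -> None:
        self._dates.append(ConcertDate(date_iso, venue))

    def all(self) -> tuple[ConcertDate, ...]:
        # A fresh tuple of frozen items: callers cannot append to, reorder,
        # or edit the calendar's internal storage through the return value.
        return tuple(self._dates)

    def at_venue(self, venue: str) -> tuple[ConcertDate, ...]:
        # A domain operation clients would otherwise write as a loop over dicts.
        return tuple(d for d in self._dates if d.venue == venue)


calendar = ConcertCalendar()
calendar.add("2025-06-01", "Town Hall")
snapshot = calendar.all()
calendar.add("2025-06-08", "Arena")
```

The earlier `snapshot` stays at one item after the second `add`, which shows it is a copy rather than a window onto `_dates`.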
2. Snippet 2.
def rank_articles(query: str) -> list[tuple[int, float, dict]]:
    """Returns (shard_id, tfidf_score, raw_row) tuples."""
    ...
This is Step 4’s leak: a read-only API that looks clean but exposes the ranking algorithm (tfidf_score) and its numeric scale. A future swap to embeddings (scores in 0..1) would silently break every client’s threshold. The Step 4 routine: hide it behind a Confidence enum or similar domain-level label, then expose list[ArticleHit] instead of the raw tuple.
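A sketch of the narrowed contract. The toy scorer, its corpus, and the thresholds in `_label` are all assumptions made up for this example; only the idea (clients see `ArticleHit` and `Confidence`, never a raw score) comes from the explanation above.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class ArticleHit:
    title: str
    confidence: Confidence


def _label(raw_score: float) -> Confidence:
    # Thresholds belong to this scorer; swapping the algorithm changes them
    # here and nowhere else.
    if raw_score >= 2.0:
        return Confidence.HIGH
    if raw_score >= 1.0:
        return Confidence.MEDIUM
    return Confidence.LOW


def rank_articles(query: str) -> list[ArticleHit]:
    # Toy scorer (an assumption): count how often the query word appears.
    corpus = [
        "python protocols in practice",
        "python python everywhere",
        "gardening basics",
    ]
    scored = [(float(title.split().count(query)), title) for title in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ArticleHit(title, _label(score)) for score, title in scored if score > 0]


hits = rank_articles("python")
```

A client that previously wrote `if score > 1.5` now writes `if hit.confidence is Confidence.HIGH`, which keeps working if the scorer later returns values on a different scale.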
3. Snippet 3.
def list_attendees(
    conn: sqlite3.Connection,
    table: str,
    event_id: int,
) -> list[dict]:
    return conn.execute(
        f"SELECT name, email FROM {table} WHERE event_id = ?",
        (event_id,),
    ).fetchall()
Step 5’s canonical leak. sqlite3.Connection and a SQL table name in the public signature mean every caller compiles against SQLite — and the SRE migrating to Postgres or an HTTP service has to chase the type through every file. Step 5 routine: hide it behind an AttendeeDirectory(Protocol) with find_for_event(event_id) -> list[Attendee].
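A sketch of that routine applied here, verified with a swap. The `Attendee` fields, the `attendees` table name, and the `attendee_emails` client function are assumptions for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Attendee:
    name: str
    email: str


class AttendeeDirectory(Protocol):
    def find_for_event(self, event_id: int) -> list[Attendee]: ...


class SQLiteAttendeeDirectory:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn  # the connection and table name stay inside this class

    def find_for_event(self, event_id: int) -> list[Attendee]:
        rows = self._conn.execute(
            "SELECT name, email FROM attendees WHERE event_id = ?", (event_id,)
        ).fetchall()
        return [Attendee(name, email) for name, email in rows]


class InMemoryAttendeeDirectory:
    def __init__(self, by_event: dict[int, list[Attendee]]) -> None:
        self._by_event = by_event

    def find_for_event(self, event_id: int) -> list[Attendee]:
        return list(self._by_event.get(event_id, []))


def attendee_emails(directory: AttendeeDirectory, event_id: int) -> list[str]:
    # Client code: no sqlite3 import, no table name, no SQL.
    return [a.email for a in directory.find_for_event(event_id)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendees (event_id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO attendees VALUES (1, 'Ada', 'ada@example.com')")

sqlite_dir = SQLiteAttendeeDirectory(conn)
memory_dir = InMemoryAttendeeDirectory({1: [Attendee("Ada", "ada@example.com")]})
```

As a side benefit, fixing the table name inside the class removes the f-string interpolation of `table` into SQL that the original signature invited.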
4. Snippet 4.
def send_notification(channel: str, recipient: str, body: str) -> None:
    if channel == "email":
        send_email(recipient, body)
    elif channel == "sms":
        send_sms(recipient, body)
    elif channel == "push":
        send_push(recipient, body)
    else:
        raise ValueError(channel)


def queue_notification(channel: str, recipient: str, body: str) -> str:
    if channel == "email":
        return f"queued email job to {recipient}"
    elif channel == "sms":
        return f"queued sms job to {recipient}"
    elif channel == "push":
        return f"queued push job to {recipient}"
    else:
        raise ValueError(channel)
Step 6’s leak. The exhaustive list (email/sms/push) appears in two functions; nothing forces them to stay in sync. Step 6 routine: polymorphism behind a NotificationChannel(Protocol) with send and queue methods. One module — the wiring/composition root — owns the list of supported channels.
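A sketch of the refactored shape with two of the three channels. Because `send_email` and `send_sms` are not defined in the snippet, the classes below record outgoing messages in a list as a stand-in; that recording behaviour and the `channel_for` helper are assumptions.

```python
from typing import Protocol


class NotificationChannel(Protocol):
    def send(self, recipient: str, body: str) -> None: ...
    def queue(self, recipient: str, body: str) -> str: ...


class EmailChannel:
    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, body: str) -> None:
        self.sent.append((recipient, body))  # stand-in for send_email(...)

    def queue(self, recipient: str, body: str) -> str:
        return f"queued email job to {recipient}"


class SmsChannel:
    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, body: str) -> None:
        self.sent.append((recipient, body))  # stand-in for send_sms(...)

    def queue(self, recipient: str, body: str) -> str:
        return f"queued sms job to {recipient}"


def send_notification(channel: NotificationChannel, recipient: str, body: str) -> None:
    channel.send(recipient, body)


def queue_notification(channel: NotificationChannel, recipient: str, body: str) -> str:
    return channel.queue(recipient, body)


# Wiring: the only place that lists the supported channels.
_CHANNELS: dict[str, type[NotificationChannel]] = {
    "email": EmailChannel,
    "sms": SmsChannel,
}


def channel_for(name: str) -> NotificationChannel:
    if name not in _CHANNELS:
        raise ValueError(name)
    return _CHANNELS[name]()


email = EmailChannel()
send_notification(email, "alex@example.com", "Doors open at 7")
job = queue_notification(channel_for("sms"), "+15550100", "Doors open at 7")
```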
5. Snippet 5 — the honest one.
# cleanup_old_drafts.py
# Run weekly via cron. Deletes draft files older than 30 days.
import time
from pathlib import Path

DRAFT_DIR = Path("/var/app/drafts")
CUTOFF_SECONDS = 30 * 24 * 3600

for path in DRAFT_DIR.glob("*.draft"):
    if time.time() - path.stat().st_mtime > CUTOFF_SECONDS:
        path.unlink()

A teammate suggests: “Wrap this in a DraftSource(Protocol) so we could swap to S3 later.” Should you?
The honest answer: not every leak should be hidden. Information hiding pays in maintenance, and a 10-line cron script with no plausible second caller has no maintenance to amortize against. The layer taxes every future reader for an S3 migration that may never happen. If this script grew (multiple draft sources, multiple deletion policies, multiple environments), then the abstraction would earn its place. Until then, the indirection is pure cognitive tax. The skill is choosing when to apply, not just how — Step 8 puts a number on this with the “blast radius” exercise.
6. Snippet 6 — interleaved final.
# tournament.py
class TournamentBracket:
    def __init__(self):
        self._matches: list[dict] = []

    def add_match(self, team_a: str, team_b: str, court: str):
        self._matches.append({"a": team_a, "b": team_b, "court": court})

    def assign_court(self, match_index: int, court_provider: str) -> str:
        if court_provider == "stadium":
            return f"Stadium court for match {match_index}"
        elif court_provider == "outdoor":
            return f"Outdoor court for match {match_index}"
        elif court_provider == "indoor":
            return f"Indoor court for match {match_index}"
        raise ValueError(court_provider)

    def matches(self) -> list[dict]:
        return self._matches
Two leaks at once — the realistic case. matches() is Step 2’s representation leak; assign_court is Step 6’s Single Choice violation. The fix needs both routines: a frozen Match dataclass with domain methods and a CourtProvider(Protocol) injected at construction. Real codebases mix leak types. Classification matters as much as the fix because each leak type has a different routine attached to it.
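A sketch of both routines applied together. The class names `StadiumCourts` and `IndoorCourts` and the single `assign` method on the Protocol are assumptions; the explanation above names only `Match` and `CourtProvider(Protocol)`.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Match:
    team_a: str
    team_b: str
    court: str


class CourtProvider(Protocol):
    def assign(self, match_index: int) -> str: ...


class StadiumCourts:
    def assign(self, match_index: int) -> str:
        return f"Stadium court for match {match_index}"


class IndoorCourts:
    def assign(self, match_index: int) -> str:
        return f"Indoor court for match {match_index}"


class TournamentBracket:
    def __init__(self, courts: CourtProvider) -> None:
        self._courts = courts  # Step 6: injected, no ladder of provider names
        self._matches: list[Match] = []

    def add_match(self, team_a: str, team_b: str, court: str) -> None:
        self._matches.append(Match(team_a, team_b, court))

    def assign_court(self, match_index: int) -> str:
        return self._courts.assign(match_index)

    def matches(self) -> tuple[Match, ...]:
        # Step 2: frozen items in a fresh tuple, not the internal list of dicts.
        return tuple(self._matches)


bracket = TournamentBracket(IndoorCourts())
bracket.add_match("Owls", "Hawks", "Court 1")
assignment = bracket.assign_court(0)
```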
Predict the Blast Radius
Why this matters
Information hiding is verified by simulating change — Parnas’s original test, and the one industry calls change impact analysis. A real engineer’s job isn’t to recite that classes should depend on abstractions. It’s to read a system and predict: if this changes, what else changes? This step is your final exam for the tutorial: a fresh, never-seen MusicShare app with five modules, four plausible change requests (one of which has the correct answer “don’t refactor”), one honest-tradeoff question, and one cold-transfer case from a different domain. Plus one short open-text artifact — a module guide for ui.py — to consolidate everything you’ve learned into the lightest-weight design doc Parnas, Clements & Weiss invented.
🎯 You will learn to
- Predict the change-impact radius of a plausible future change in a small system before attempting the change
- Evaluate when a layer of information hiding pays for itself — and when it adds cognitive overhead without proportional benefit
- Apply the five-step routine on a system you’ve never seen before
- Produce a Parnas/Clements/Weiss module guide for an unfamiliar module under time pressure
The MusicShare app
MusicShare ships a web UI for discovering and sharing music. Its five real modules:
| Module | Public surface (the contract) | Hidden secret |
|---|---|---|
| recommender.py | Recommender(Protocol).recommend(query, *, limit) -> list[SongHit] | scoring / ranking algorithm |
| streaming.py | StreamingProvider(Protocol) + play_track / share_track / like_track | which streaming service is used today |
| playlist.py | Playlist class with add, top_tracks(n), total_duration_minutes(), average_popularity(), __len__ | internal storage representation |
| events.py | EventDirectory(Protocol).find_in(city) -> list[Event] | which persistence backend stores concert listings |
| ui.py | HTTP handlers for /search, /share, /like, /concerts/<city> | how requests are routed / rendered to HTML |
Plus the wiring layer (composition_root.py) that picks today’s concrete Recommender, StreamingProvider, and EventDirectory instances.
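The real composition_root.py is not shown, so the following is only a guess at its general shape. The `Dependencies` container, `build_dependencies`, the toy catalog, and `search_handler` are all hypothetical, and `StreamingProvider` is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SongHit:
    title: str


@dataclass(frozen=True)
class Event:
    city: str
    title: str


class Recommender(Protocol):
    def recommend(self, query: str, *, limit: int) -> list[SongHit]: ...


class EventDirectory(Protocol):
    def find_in(self, city: str) -> list[Event]: ...


class PopularityRecommender:
    def recommend(self, query: str, *, limit: int) -> list[SongHit]:
        catalog = ["Blue Train", "Blue in Green", "So What"]  # toy data
        matches = [t for t in catalog if query.lower() in t.lower()]
        return [SongHit(t) for t in matches[:limit]]


class InMemoryEventDirectory:
    def __init__(self, events: list[Event]) -> None:
        self._events = list(events)

    def find_in(self, city: str) -> list[Event]:
        return [e for e in self._events if e.city == city]


@dataclass(frozen=True)
class Dependencies:
    recommender: Recommender
    events: EventDirectory


def build_dependencies() -> Dependencies:
    # The only function that names concrete classes.
    return Dependencies(
        recommender=PopularityRecommender(),
        events=InMemoryEventDirectory([Event("Berlin", "Jazz Night")]),
    )


def search_handler(deps: Dependencies, query: str) -> list[str]:
    # Stand-in for a ui.py handler: it sees only Protocol-typed fields.
    return [hit.title for hit in deps.recommender.recommend(query, limit=2)]


deps = build_dependencies()
titles = search_handler(deps, "blue")
```

Swapping the recommender or the event backend means editing `build_dependencies` and nothing else, which is the property the change-impact questions probe.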
Your tasks
- Write a module guide for ui.py in the file MODULE_GUIDE.md. Use the same four labels you learned in Steps 4–5: Primary secret, Likely changes, Stable contract, Not promised. One stylistic note: Steps 4–5 wrote the guide inside a """...""" Python docstring at the top of a .py file (because there was a module file to attach it to). Here the artifact stands alone, so it’s a .md file with the four labels as Markdown ## headings instead. Same content, same Parnas/Clements/Weiss-1985 format, just rendered for Markdown instead of Python. The labels still match exactly so a future maintainer can grep for them across both formats.

  One or two lines per label is enough: the artifact’s value is in the content, not the length. The test enforces substantive content under each label and that the Not promised section names at least one specific concrete decision (HTML templating, route paths, response formats, authentication, etc.).
- Answer all six quiz questions below. Four are change-impact predictions on MusicShare (one of which has “don’t refactor” as the correct answer); the fifth is the honest-tradeoff question; the sixth is an unscaffolded transfer case on a system you have not seen.
The module guide is the consolidation artifact: producing a four-label document for a module you’ve never edited proves you can apply the discipline on cold material. That is the meaningful capstone for this tutorial.
# MusicShare system map
Five modules + wiring:
- recommender.py — Recommender Protocol; today's concrete is PopularityRecommender.
Secret: scoring / ranking algorithm.
- streaming.py — StreamingProvider Protocol; today's concretes are
SpotifyProvider, AppleMusicProvider, TidalProvider.
Secret: which streaming service is used.
- playlist.py — Playlist class; secret: internal storage representation.
- events.py — EventDirectory Protocol; today's concrete is SQLiteEventDirectory.
Secret: which persistence backend stores concert listings.
- ui.py — HTTP handlers for /search, /share, /like, /concerts/<city>.
Calls only the four contracts above: the three Protocols and the Playlist class (never the concrete implementations).
- composition_root.py — picks today's concrete implementations and hands them to ui.py.
The quiz on the right asks you to predict, for several plausible future
changes, WHICH modules need to be edited. No code to refactor — just
your judgment, plus one short module guide.
# Module guide — ui.py
Write the Parnas/Clements/Weiss module guide for `ui.py`. Use the four
labels exactly as below; replace the `<...>` placeholders with one or
two lines of your own reasoning.
## Primary secret
<One sentence: what design decision does ui.py own and hide?>
## Likely changes
<Bullet two or three plausible future changes this module absorbs locally.>
## Stable contract
<One or two sentences: what do callers of ui.py rely on?>
## Not promised
<Bullet at least two concrete decisions that are NOT part of ui.py's
contract — things a future maintainer must NOT depend on. Be specific:
name HTML/JSON, templating engine, exact response shapes, URL paths,
authentication scheme, etc. A generic "implementation details" line
does not count and the test will reject it.>
Solution
# MusicShare system map — answer key
## Change 1: Add YouTube Music
Edits: streaming.py (new YouTubeMusicProvider), composition_root.py (registry entry).
That's it. ui.py and playlist.py untouched. Single Choice payoff (Step 6).
## Change 2: Migrate events to a remote HTTP service
Edits: events.py (new HttpEventDirectory), composition_root.py (swap wiring).
ui.py untouched. The 200 tests that use InMemoryEventDirectory still pass.
Canonical Parnas storage case (Step 5).
## Change 3: "Humanize" track durations in the UI
Edits: playlist.py (humanize helper on Track), ui.py (call it at render).
Honest: this IS a multi-file change. A new FEATURE is not a hidden DECISION changing.
## Change 4: nightly_health_check.py one-off cron script
Edits: NONE — let it use sqlite3 directly. A 25-line script with no
second caller doesn't pay for an abstraction layer. Step 7's Snippet 5
and the chapter's "When NOT to apply" section both warn against
over-application.
## The tradeoff
Information hiding helps modification, not first-read clarity. Bet on it
where change is plausible; skip it where it isn't. The right number of
abstractions is the smallest number that lets the system change gracefully.
## Cold transfer: CampusRide REST -> GraphQL
Edits: scooters.py (new or replaced concrete gateway), composition_root.py
(wiring). trip_planner.py, pricing.py, and ui.py stay on the
ScooterGateway domain contract. If any of them need GraphQL details,
the vendor protocol leaked.
# Module guide — ui.py (sample answer)
## Primary secret
How HTTP requests map to domain operations on the four domain contracts
(the Recommender, StreamingProvider, and EventDirectory Protocols plus
the Playlist class), and how their return values get rendered to clients.
## Likely changes
- Switch server-rendered HTML to JSON for an SPA frontend.
- Swap the templating engine (Jinja → htpy → none-at-all-just-strings).
- Add new routes (/playlists, /recommendations/<id>, etc.).
- Change the auth scheme (session cookies → JWT, etc.).
## Stable contract
Each HTTP route accepts validated input and returns a response that
the browser can render. Domain operations are reached only through
the four domain contracts (three injected Protocols plus Playlist).
## Not promised
- Exact HTML structure or CSS class names.
- The templating engine (Jinja, Mako, htpy, plain strings — all valid).
- URL path format (`/concerts/<city>` could become `/cities/<city>/concerts`).
- Response status codes beyond 200/4xx categories.
- Cookie-based auth specifically — could become Bearer tokens, OAuth, etc.
- The order or shape of the underlying Protocol calls inside a handler.
Your training is complete. Back in Step 1 you proved that private
is not a secret. You then ran the five-step routine on representation
(Playlist, Step 2), declared the contract via Protocol (Step 3),
attacked over-specification (Recommender, Step 4), hid persistence
(EventDirectory, Step 5), and applied Single Choice (StreamingProvider,
Step 6). Step 7 trained the “which kind of leak is this?” discrimination
across all four — including the “don’t refactor” calibration. This step
applied everything to an unfamiliar whole system.
The same routine, repeated four times across very different domains,
is the operational form of David Parnas’s 1972 criterion. And the
module guide you just wrote for ui.py is the artifact Parnas, Clements,
& Weiss (1985) called the lightest-weight design doc that records
why. Four labels, a few lines each, and you have something a future
maintainer can read in 30 seconds to decide whether their change
belongs in this module.
What to take with you. When you next find yourself reading or writing Python code, run this five-line audit on any module:
1. What is this module's secret? (A volatile decision, one sentence.)
2. What does its public API let clients see beyond that secret?
3. Could two different implementations both satisfy this contract?
4. If the secret changed, how many files would I edit?
5. Is the cost of the abstraction less than the cost of the future change?
If the answer to (1) is nothing, the module is shallow — merge it upward. If (2) reveals a leak, narrow the contract. If (3) is no, the secret has not been hidden. If (4) is many, redesign. If (5) is no, do not abstract — your reader pays for the layer every time, future change or not. The last item is what Step 7’s Snippet 5 and this step’s Change 4 trained: knowing when not to apply the principle is part of applying it well.
Now go fix some real code.
Step 8 — Knowledge Check
Min. score: 80%
1. Change 1: Add YouTube Music as a fourth streaming service. Users should be able to play, share, and like tracks on it just like the other three. Which files need to be edited? (Select all that apply.)
This is the Single Choice payoff. One new class in streaming.py and one new entry in the wiring registry. Zero edits to ui.py, zero edits to playlist.py. Compare with the pre-refactor design from Step 6, which would have required edits in three functions; miss any one of them and your “Add YouTube Music” feature ships half-broken.
2. Change 2: The SRE team migrates the production concert catalog from SQLite to a remote HTTP service (an internal REST API). The data shape is the same; only the storage moves. Which files need to be edited? (Select all that apply.)
The canonical Parnas case, made concrete. One new class in events.py. One line in the wiring. The 200 tests that build InMemoryEventDirectory still pass without changes — that’s the testability benefit. The ui.py handlers compile against the Protocol and don’t know HTTP from SQL — that’s the comprehensibility benefit. The SRE migration ships in a week, not a quarter — that’s the change-locality benefit.
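A sketch of what the one new class might look like. `HttpEventDirectory`'s constructor, the injected `fetch_json` callable, the `/events?city=...` endpoint, and the JSON row shape are all assumptions; injecting the fetch function keeps the example runnable without a network.

```python
from dataclasses import dataclass
from typing import Callable, Protocol
from urllib.parse import urlencode


@dataclass(frozen=True)
class Event:
    city: str
    title: str


class EventDirectory(Protocol):
    def find_in(self, city: str) -> list[Event]: ...


class HttpEventDirectory:
    """Hypothetical HTTP-backed directory; endpoint and payload are invented."""

    def __init__(self, base_url: str, fetch_json: Callable[[str], list[dict]]) -> None:
        self._base_url = base_url
        self._fetch_json = fetch_json

    def find_in(self, city: str) -> list[Event]:
        url = f"{self._base_url}/events?{urlencode({'city': city})}"
        rows = self._fetch_json(url)
        return [Event(city=row["city"], title=row["title"]) for row in rows]


def concerts_handler(directory: EventDirectory, city: str) -> list[str]:
    # Stand-in for the ui.py handler: unchanged by the migration.
    return [event.title for event in directory.find_in(city)]


requested_urls: list[str] = []


def fake_fetch(url: str) -> list[dict]:
    # Test double for the real HTTP call.
    requested_urls.append(url)
    return [{"city": "Berlin", "title": "Jazz Night"}]


directory = HttpEventDirectory("https://events.internal.example", fake_fetch)
titles = concerts_handler(directory, "Berlin")
```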
3. Change 3: Product wants the UI to show “about 3 minutes” instead
of “194 seconds” everywhere a track duration appears. A new
humanized-duration string is needed on each Track.
Which files need to be edited? (Select all that apply.)
This is the honest change in the set — a real, multi-file edit. Adding a new piece of presentation logic does mean changes in playlist.py (where Track is defined) and ui.py (where it renders). That’s normal and fine. Information hiding does not promise every change is local — it promises that change-prone decisions stay local. Adding a new field is a new feature, not a change-prone decision leaking. The takeaway: don’t over-claim what hiding buys you. Sometimes a two-file edit is the right answer.
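A sketch of the two-file edit, collapsed into one block. The `Track` fields, the method name `humanized_duration`, the rounding rule, and `render_track` are assumptions; the answer key only says "humanize helper on Track" and "call it at render".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    # playlist.py side: field names are illustrative.
    title: str
    duration_seconds: int

    def humanized_duration(self) -> str:
        # Round to the nearest minute, never showing "about 0 minutes".
        minutes = max(1, round(self.duration_seconds / 60))
        unit = "minute" if minutes == 1 else "minutes"
        return f"about {minutes} {unit}"


def render_track(track: Track) -> str:
    # ui.py side: the renderer calls the new helper instead of formatting seconds.
    return f"{track.title} ({track.humanized_duration()})"


line = render_track(Track("So What", 194))
```

Both halves have to change because the feature adds something new to the contract between `Track` and the UI; no hidden decision was swapped.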
4. Change 4: the “don’t refactor” calibration. A new teammate proposes: “I want to write a one-off script, nightly_health_check.py, that connects directly to SQLite and prints a count of events per city to a log file. It runs once a night via cron, doesn’t share code with anything else, and the whole thing is ~25 lines.” They ask: “Should I make it use the EventDirectory Protocol instead of sqlite3 directly?” What is the right call?
The honest answer: not every direct dependency should be hidden. Information hiding is a maintenance investment that pays back when (a) there are multiple callers, and (b) the hidden decision is plausibly volatile. A 25-line cron script has neither property. Forcing it through EventDirectory would mean:
- The future reader of the cron script has to chase a Protocol they don’t need.
- If EventDirectory ever grows new methods, this script breaks even though its needs are unchanged.
- The Protocol stops being “things that look like a directory” and starts being “things that satisfy the union of every consumer’s needs.”
The Step 7 calibration (Snippet 5) and the chapter’s When NOT to apply section are exactly about this. The right number of abstractions is the smallest number that lets the system change gracefully. Below that number, you’re under-engineered; above it, you tax every reader. Both extremes are bugs.
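For scale, here is roughly what the direct version of nightly_health_check.py might look like. The table name events, the column city, and the function names are assumptions, since the tutorial does not show the schema.

```python
import sqlite3


def count_events_per_city(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Return (city, event_count) pairs, sorted by city.

    Assumes a table named `events` with a `city` column.
    """
    rows = conn.execute(
        "SELECT city, COUNT(*) FROM events GROUP BY city ORDER BY city"
    )
    return [(city, count) for city, count in rows]


def write_report(db_path: str, log_path: str) -> None:
    """Append one line per city to the log file; cron would call this."""
    conn = sqlite3.connect(db_path)
    try:
        with open(log_path, "a", encoding="utf-8") as log:
            for city, count in count_events_per_city(conn):
                log.write(f"{city}: {count}\n")
    finally:
        conn.close()
```

The whole script fits on one screen and has one caller, so a reader can verify it without opening any other file.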
5. The honest tradeoff. Tempero, Blincoe, and Lottridge (2023) found that more modular code helped students complete modification tasks but did not consistently make code easier to understand on first encounter. What is the right takeaway?
Information hiding is a bet on future change. It’s a great bet where the design decision is plausibly volatile — vendors, storage, algorithms, regulatory rules. It’s a bad bet on decisions that will never change. A 50-line cron job does not need a PaymentGateway Protocol; a payments codebase does. The SE maxim from the chapter: the right number of abstractions is the smallest number that lets the system change gracefully. Beyond that number, every extra layer is a tax on every reader.
What you’ve actually learned in this tutorial. You can now (a) name a module’s secret, (b) spot the contract leaks, (c) refactor a leaky module behind a Protocol or domain methods, (d) verify the secret is hidden with an implementation-swap test, (e) apply Single Choice when alternatives are exhaustive, (f) classify an unfamiliar leak before fixing it (Step 7), (g) abstain when no change is plausible, and (h) predict the change-impact radius before you start editing. That’s the operational form of Parnas’s principle — and it survives the move from a tutorial to a real codebase.
6. Cold transfer — no MusicShare scaffolding this time. CampusRide has these modules:
- scooters.py owns a ScooterGateway(Protocol) with nearby(location) and reserve(scooter_id).
- trip_planner.py chooses a route from available scooters and campus buildings.
- pricing.py computes student discounts and surge pricing.
- ui.py renders the map and reservation button.
- composition_root.py wires today’s concrete gateway into the app.
ScooterGateway stay the same.
Which files need to be edited? (Select all that apply.)
This is the same shape as Step 5, but without the MusicShare table. The volatile decision is the vendor protocol. It belongs in scooters.py behind ScooterGateway; composition_root.py chooses the concrete implementation. Route planning, pricing, and UI rendering should stay on the stable domain contract.
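A rough sketch of that layout, assuming the change replaces the vendor-specific code behind ScooterGateway. Only ScooterGateway, nearby(location), and reserve(scooter_id) come from the exercise; the parameter and return types, the two vendor classes, build_gateway, and first_available are invented for illustration.

```python
from typing import Optional, Protocol


class ScooterGateway(Protocol):
    """Stable domain contract from scooters.py (types are assumed)."""

    def nearby(self, location: str) -> list[str]: ...
    def reserve(self, scooter_id: str) -> bool: ...


class OldVendorGateway:
    """Hypothetical adapter for the vendor in use today."""

    def nearby(self, location: str) -> list[str]:
        return [f"old-{location}-1"]

    def reserve(self, scooter_id: str) -> bool:
        return scooter_id.startswith("old-")


class NewVendorGateway:
    """Hypothetical adapter added to scooters.py for the new vendor."""

    def nearby(self, location: str) -> list[str]:
        return [f"new-{location}-1", f"new-{location}-2"]

    def reserve(self, scooter_id: str) -> bool:
        return scooter_id.startswith("new-")


def build_gateway() -> ScooterGateway:
    """Stand-in for composition_root.py: the one line that picks a vendor."""
    return NewVendorGateway()


def first_available(gateway: ScooterGateway, location: str) -> Optional[str]:
    """Stand-in for trip_planner.py: written against the contract only."""
    for scooter_id in gateway.nearby(location):
        if gateway.reserve(scooter_id):
            return scooter_id
    return None
```

Switching build_gateway between the two adapters changes which scooters come back, while first_available stays byte-for-byte the same.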