Code Smells & Refactoring Tutorial

1

The Cost of Duplication You Didn't Notice

Welcome — what this tutorial trains

You already know Python and pytest. You haven’t yet learned the discipline of changing existing code without breaking it. That discipline has a name — refactoring — and a lot of structure to it. Over the next ten steps you’ll learn:

How to recognize a handful of high-impact code smells in real Python.
How to apply named refactorings that fix each smell safely.
How tests give you the safety net to change structure without changing behavior.
How to judge when a refactoring is worth doing, and when it isn’t.

The codebase grows over the tutorial: you start with three small functions and end with a small music streaming app. Most of the typing is done by Monaco’s refactoring tools — you’ll select code, pick a refactoring, and judge the diff. The thinking is yours.

Prerequisite: Testing Foundations — pytest discovery, assert, and @pytest.mark.parametrize. If those feel new, do that one first.

Why this matters

Duplication isn’t just a style problem — it multiplies the cost of every future bug. When the same logic lives in three places, one fix becomes three fixes, and missing one means the bug ships in production. Before you can refactor duplication away, you need to feel what it costs. This step plants that schema: a single bug, fixed in N places, where N is the number of duplicates.

🎯 You will learn to

Apply Fowler’s definition of refactoring as behavior-preserving structural change.
Analyze a duplicated method to see how a single bug propagates through every caller.
Evaluate when visual similarity is real duplication vs. an accidental coincidence.

Open royalty.py. The RoyaltyCalculator class has three methods that calculate the creator’s share of streaming royalties for three different track types — songs, podcasts, audiobooks. They all use the same formula: plays × rate × 0.7 (the platform takes 30%, creators get 70%).

class RoyaltyCalculator:
    def royalty_song(self, plays: int, rate: float) -> float:
        return plays * rate * 0.7

    def royalty_podcast(self, plays: int, rate: float) -> float:
        return plays + rate * 0.7   # ← bug

    def royalty_audiobook(self, plays: int, rate: float) -> float:
        return plays + rate * 0.7   # ← bug

Two of the three have the same bug — + where there should be *. The bug ships unnoticed because the test suite only covers royalty_song. (You can already see the class in the UML diagram in the bottom-left — three methods, one box.)

Your task

Run the existing tests in test_royalty.py. test_royalty_song passes — the song formula is correct. But test_monthly_payouts_sums_across_track_types already fails: that’s an integration test summing royalties across all three track types via the MonthlyPayouts caller, and the buggy podcast / audiobook formulas produce a wildly wrong total. The bug propagates upward through every caller of the broken methods.
Extend the parametrize table in test_royalty.py to cover royalty_podcast and royalty_audiobook with at least two (plays, rate) cases each. Use the formula plays * rate * 0.7 to compute the expected values. Two new tests will turn red — the bug is now visible at the unit level too, not just at the integration level.
Fix the bug in royalty.py. You have to fix it in two separate places because the logic is duplicated. After the fix, all three test layers pass: the per-method unit tests, the integration test, and (in production) any future caller.

You will not yet refactor the duplication away. That waits until Step 4. The point of this step is to feel the cost: a single bug, fixed in N places, where N is the number of duplicates.

Refactoring, defined

Throughout this tutorial we use Martin Fowler’s definition (Refactoring, 2018):

Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.

The two halves are equally important. Structural change without behavior preservation is a rewrite; behavior preservation without structural change is unnecessary churn. A refactoring is both at once.

Not every triplet is a smell

Visual similarity isn’t the diagnostic — bug-coupling is. Look at this counter-example:

def round_currency_usd(amount: float) -> float:
    return round(amount, 2)            # cents: 2 decimal places

def round_currency_jpy(amount: float) -> float:
    return round(amount, 0)            # yen: no fractional unit

def round_currency_kwd(amount: float) -> float:
    return round(amount, 3)            # Kuwaiti dinar: 3 decimal places

Three near-identical functions, but each captures a real domain rule (ISO 4217 minor-unit precision per currency). Consolidating these into one function with a precision parameter would not remove a bug coupling — there are no shared bugs to fix in three places, because each precision is independent. The visual similarity is a coincidence of the API shape, not duplicated knowledge.

The rule of thumb: if changing one of the functions would also force you to change the others (in lockstep), that’s duplication. If they evolve independently, leave them alone. Step 4 returns to this trade-off — for now, hold the distinction in mind.

Starter files

royalty.py

"""Streaming royalty calculations.

The platform takes 30% commission; creators get 70% of plays * rate.
"""


class RoyaltyCalculator:
    """Computes creator royalty payouts per track type."""

    def royalty_song(self, plays: int, rate: float) -> float:
        return plays * rate * 0.7

    def royalty_podcast(self, plays: int, rate: float) -> float:
        return plays + rate * 0.7

    def royalty_audiobook(self, plays: int, rate: float) -> float:
        return plays + rate * 0.7

payouts.py

"""Monthly creator-payout aggregator — uses RoyaltyCalculator across all track types.

This is the *production caller* that exercises the three royalty methods.
The bug in royalty_podcast / royalty_audiobook propagates through here:
MonthlyPayouts.total_creator_earnings will return wildly wrong totals
until the underlying calculator is fixed.
"""
from typing import List, Tuple
from royalty import RoyaltyCalculator


class MonthlyPayouts:
    """Aggregates monthly creator earnings across songs, podcasts, audiobooks."""

    def __init__(self, calculator: RoyaltyCalculator) -> None:
        self.calculator: RoyaltyCalculator = calculator

    def total_creator_earnings(
        self,
        song_plays: List[Tuple[int, float]],
        podcast_plays: List[Tuple[int, float]],
        audiobook_plays: List[Tuple[int, float]],
    ) -> float:
        """Sum royalties across all three track types for one month."""
        total: float = 0.0
        for plays, rate in song_plays:
            total += self.calculator.royalty_song(plays, rate)
        for plays, rate in podcast_plays:
            total += self.calculator.royalty_podcast(plays, rate)
        for plays, rate in audiobook_plays:
            total += self.calculator.royalty_audiobook(plays, rate)
        return total

test_royalty.py

"""Tests for streaming royalty calculations and the MonthlyPayouts caller."""
import pytest
from royalty import RoyaltyCalculator
from payouts import MonthlyPayouts


@pytest.fixture
def calc() -> RoyaltyCalculator:
    return RoyaltyCalculator()


@pytest.mark.parametrize("plays,rate,expected", [
    (100, 0.01, 0.7),
    (1000, 0.005, 3.5),
])
def test_royalty_song(calc: RoyaltyCalculator, plays: int, rate: float, expected: float) -> None:
    assert calc.royalty_song(plays, rate) == pytest.approx(expected)


# TODO: extend the parametrize table to cover royalty_podcast and
# royalty_audiobook with at least two cases each. Use the formula
# plays * rate * 0.7 to compute the expected values.


def test_monthly_payouts_sums_across_track_types(calc: RoyaltyCalculator) -> None:
    """Integration test: the bug in any royalty_* method also breaks this aggregate."""
    payouts: MonthlyPayouts = MonthlyPayouts(calc)
    total: float = payouts.total_creator_earnings(
        song_plays=[(100, 0.01), (1000, 0.005)],   # 0.7 + 3.5  = 4.2
        podcast_plays=[(100, 0.02)],               # 1.4        = 1.4
        audiobook_plays=[(200, 0.02)],             # 2.8        = 2.8
    )
    assert total == pytest.approx(8.4)

Solution

royalty.py

"""Streaming royalty calculations.

The platform takes 30% commission; creators get 70% of plays * rate.
"""


class RoyaltyCalculator:
    """Computes creator royalty payouts per track type."""

    def royalty_song(self, plays: int, rate: float) -> float:
        return plays * rate * 0.7

    def royalty_podcast(self, plays: int, rate: float) -> float:
        return plays * rate * 0.7

    def royalty_audiobook(self, plays: int, rate: float) -> float:
        return plays * rate * 0.7

test_royalty.py

"""Tests for streaming royalty calculations and the MonthlyPayouts caller."""
import pytest
from royalty import RoyaltyCalculator
from payouts import MonthlyPayouts


@pytest.fixture
def calc() -> RoyaltyCalculator:
    return RoyaltyCalculator()


@pytest.mark.parametrize("plays,rate,expected", [
    (100, 0.01, 0.7),
    (1000, 0.005, 3.5),
])
def test_royalty_song(calc: RoyaltyCalculator, plays: int, rate: float, expected: float) -> None:
    assert calc.royalty_song(plays, rate) == pytest.approx(expected)


@pytest.mark.parametrize("plays,rate,expected", [
    (100, 0.02, 1.4),
    (500, 0.01, 3.5),
])
def test_royalty_podcast(calc: RoyaltyCalculator, plays: int, rate: float, expected: float) -> None:
    assert calc.royalty_podcast(plays, rate) == pytest.approx(expected)


@pytest.mark.parametrize("plays,rate,expected", [
    (100, 0.05, 3.5),
    (200, 0.02, 2.8),
])
def test_royalty_audiobook(calc: RoyaltyCalculator, plays: int, rate: float, expected: float) -> None:
    assert calc.royalty_audiobook(plays, rate) == pytest.approx(expected)


def test_monthly_payouts_sums_across_track_types(calc: RoyaltyCalculator) -> None:
    """Integration test: the bug in any royalty_* method also breaks this aggregate."""
    payouts: MonthlyPayouts = MonthlyPayouts(calc)
    total: float = payouts.total_creator_earnings(
        song_plays=[(100, 0.01), (1000, 0.005)],
        podcast_plays=[(100, 0.02)],
        audiobook_plays=[(200, 0.02)],
    )
    assert total == pytest.approx(8.4)

Two halves of the lesson.

The bug fix is the easy part — change + to * in two places. Notice the number of places: with three duplicated functions, one bug becomes three fixes. With ten duplicates, one bug becomes ten fixes — and any one of those ten places might be missed.
The new test cases are the discipline part. The original suite passed because it only exercised one of the three functions. Coverage gaps don’t announce themselves; you find them by widening your test cases until every public function is exercised.

What you did NOT do. You didn’t extract the common formula into a helper. That’s the obvious next move, but it isn’t this step’s lesson — and doing it now, before fixing the bug, would have propagated the buggy formula into the helper. You’d then have one buggy place instead of three, but the bug would still be there. Fix first, refactor second is the rule. Step 4 will do the extraction.

Step 1 — Knowledge Check

Min. score: 80%

1. Retrieval. Which operator was wrong in two of the three royalty functions, and what should it have been?

+ was wrong; should have been *
* was wrong; should have been +
The formula in the docstring is plays × rate × 0.7 — multiplication is correct. The bug was the reverse: two functions had + where the formula calls for *.
/ was wrong; should have been *
- was wrong; should have been +

The docstring stated the formula as plays × rate × 0.7. Two of the three functions had plays + rate * 0.7, which silently returns garbage values for any non-trivial input. Naming the bug operator from memory anchors the smell — when you next see “three near-identical functions,” the very next thought should be “what bug might be hiding in just two of them?”

2. Which of the following best matches Fowler’s definition of refactoring?

Rewriting code from scratch in a cleaner style
A rewrite typically does change behavior in subtle ways and starts from a blank slate. Refactoring is incremental and behavior-preserving — the external contract is unchanged.
Restructuring existing code without changing its external behavior
Adding new features while improving the existing code
Adding features mixes refactoring with feature work. Discipline says do one or the other — refactor under the safety net of tests, then add the feature on the cleaner foundation.
Removing unused code paths and dead branches
Removing dead code can be part of a refactor, but the term refers more broadly to any structural improvement that preserves behavior, not specifically to deletion.

Refactoring has two halves: structural change AND behavior preservation. Without the second half, it’s a rewrite. Without the first, it’s churn. Both halves are non-negotiable.

3. You discover the same bug in three near-identical copies of a function. Which order do you apply?

Refactor first to remove the duplication, then fix the bug in one place
Tempting, but dangerous: extracting the common code with the bug intact would propagate the buggy formula into the helper. The duplication would be gone, but the bug would remain. Fix-then-refactor preserves the safety property: every step has a passing test suite.
Fix the bug in all three places, then refactor to remove the duplication
Fix in one place; let the duplication remain — the others are unused
Either order is equivalent; the result is identical code
Order matters because of the test safety net. Fix-then-refactor verifies the fix in each location with a green test before changing structure. Refactor-then-fix can hide the original bug inside the new abstraction.

Fix-then-refactor. Fix the bug in each duplicate first, with tests passing for each location. Only then refactor to consolidate — the consolidated helper now contains the corrected formula. This is one of the most useful sequencing rules in the refactoring toolkit.

4. Transfer — operational cost of duplication. Next month the team adds two more royalty methods, both copying the same buggy formula from the existing duplicates. No refactor happens in between. When a developer finally notices the bug after that, how many code lines must they touch to fix every broken method?

1 — the bug lives in shared code, so one fix propagates everywhere
There is no shared helper yet — that is exactly what Step 4’s Extract Function introduces. Until then, the formula lives independently in each method and N copies means N edits.
2 — only the methods that were already broken today need fixing; the new ones stay wrong
All four methods contain the same broken formula and all four affect production payouts. Leaving the new ones uncorrected ships a known regression.
4 — every method that contains the bug must be edited separately
0 — the test suite reverts the duplicates automatically
Tests run code; they don’t rewrite it. The N-edit cost falls on the developer every time.

Cost-of-fix scales linearly with the number of buggy duplicates (N). Today N is 2; two more copies makes N = 4. Step 4’s Extract Function collapses N to 1 — one helper, one fix, no matter how many callers depend on the formula. That gap — N edits before, 1 edit after, summed across every future bug in the shared logic — is the operational reason duplication multiplies the cost of every future bug.

2

Long Method → Extract Function (by hand)

Why this matters

A method that does five things needs five names — but it’s been given one. Long Methods are the most common code smell in real Python, and the cure (Extract Function) is the most important refactoring you’ll learn. You’ll do this one by hand, slowly and deliberately, so when the tool drives later extractions you recognize what it’s doing under the hood. The safety dance — green tests → change → green tests — is the discipline that turns refactoring from a coin flip into a reliable craft.

🎯 You will learn to

Analyze a Long Method by identifying its sub-goal structure.
Apply Extract Function by hand while keeping tests green.
Evaluate the safety dance — the test rhythm that makes structural change reliable.

Open player.py. The original Streaming.process_play_event(user, track) was about 35 lines doing five different things. Sub-goal 1 (the subscription check) has already been extracted as a worked example — the file you opened starts from there. The method body still inlines four more sub-goals:

Verify the user’s subscription
Check that the track is available in the user’s region
Compute royalty for the play
Update the play count
Append to the user’s history

The comments label these sub-goals — that’s a hint that the method is doing too much. A coherent method does one thing and is named for it. A method that does five things needs five names, which means five separate methods.

The safety dance

Refactoring without tests is a coin flip. Refactoring with tests is reliable. The discipline:

Step	Action
1	Run the tests. They should pass. Now you have a baseline.
2	Make one structural change. Do not change behavior.
3	Run the tests again. They must still pass. If they fail, the change broke something — undo and try again.
4	Repeat from (2) until the structure is clean.

The rhythm is green → change → green → change → green. You are never more than one undo away from a known-good state.

Predict before reading

The first sub-goal of process_play_event is the subscription check (lines tagged # 1.). Before you read the worked example below, write down the function name you would choose for that block. One word or short phrase. Hold your choice in mind. After you see the worked example, compare to the name we chose — and ask yourself whether yours is clearer, equivalent, or worse.

Naming is the load-bearing micro-decision in every Extract Function. Generating a name yourself first — even if you change it after — anchors the schema better than reading our name first.

Worked example — extracting the first sub-goal

Look at the first comment block in process_play_event:

# 1. Verify the user's subscription
if user.subscription_tier == "free":
    if user.daily_plays >= 5:
        raise PermissionError("Free tier daily limit reached")
elif user.subscription_tier == "premium":
    pass  # unlimited
else:
    raise ValueError(f"Unknown tier: {user.subscription_tier}")

Five lines doing one thing. That’s a perfect Extract Method candidate.

The mechanics, narrated.

Cut the five lines.

Write a new helper method on the Streaming class:

def _check_subscription(self, user: User) -> None:
    """Raise an exception if the user is not allowed to play."""
    if user.subscription_tier == "free":
        if user.daily_plays >= 5:
            raise PermissionError("Free tier daily limit reached")
    elif user.subscription_tier == "premium":
        pass  # unlimited
    else:
        raise ValueError(f"Unknown tier: {user.subscription_tier}")

Replace the cut block in process_play_event with a call:
```
self._check_subscription(user)
```
Run the tests. Green.

Three things to notice:

The helper has a type-annotated signature. user: User and the -> None return tell future readers what it consumes and what it produces. The annotation is part of the refactoring, not optional polish.
The helper has a leading underscore in its name. Convention: _name means “internal — not part of the public API.” process_play_event is public; helpers it depends on are internal.
The helper’s name describes what it does, not how. _check_subscription is a coherent sub-goal; _block_one would not be.

Your task

Two more sub-goals are still inline in process_play_event. Extract them as methods on Streaming. The file already has the worked-example extraction applied; you continue from there.

Sub-goal 2: geo-restriction. Extract _check_geo_restriction(self, user: User, track: Track) -> None (raises on failure).
Sub-goal 3: royalty calculation. Extract _compute_royalty(self, track: Track) -> float returning the royalty amount.

Run the tests after each extraction. They should stay green throughout. Watch the UML diagram in the bottom-left grow with each extraction — Streaming gains methods, process_play_event’s body shrinks.

Why we do this manually one time. Steps 4–8 will use Monaco’s tool to do extractions for you with two clicks. Doing one slow extraction by hand right now anchors the mechanics in your fingers. When the tool acts later, you’ll recognize what it did because you’ll have done it yourself.

Starter files

player.py

"""The streaming player — orchestrates a single play event."""
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    user_id: str
    subscription_tier: str   # "free" or "premium"
    region: str
    daily_plays: int = 0
    history: List[str] = field(default_factory=list)


@dataclass
class Track:
    track_id: str
    title: str
    duration_sec: int
    available_regions: List[str]
    rate: float
    play_count: int = 0


class Streaming:
    """Orchestrates streaming play events for users and tracks."""

    def _check_subscription(self, user: User) -> None:
        """Raise if the user is not allowed to play right now."""
        if user.subscription_tier == "free":
            if user.daily_plays >= 5:
                raise PermissionError("Free tier daily limit reached")
        elif user.subscription_tier == "premium":
            pass  # unlimited
        else:
            raise ValueError(f"Unknown tier: {user.subscription_tier}")

    def process_play_event(self, user: User, track: Track) -> float:
        """Run one play: returns the royalty paid to the creator."""
        # 1. Verify the user's subscription
        self._check_subscription(user)

        # 2. Geo-restriction check
        if track.available_regions and user.region not in track.available_regions:
            raise PermissionError(f"Track not available in {user.region}")

        # 3. Compute royalty (creator gets 70%)
        royalty: float = 1 * track.rate * 0.7

        # 4. Update play count
        track.play_count += 1
        user.daily_plays += 1

        # 5. Append to user's history
        user.history.append(track.track_id)

        return royalty

playback_session.py

"""A playback session runs multiple plays for one user.

Production caller for Streaming.process_play_event. Behavior must be
preserved across the Long-Method extraction in Step 2 — the body of
process_play_event changes shape internally, but its contract (royalty
per play, raises on subscription/geo violation) stays identical.
"""
from typing import List
from player import Streaming, User, Track


class PlaybackSession:
    """Plays a sequence of tracks and accumulates the total royalty paid."""

    def __init__(self, streaming: Streaming, user: User) -> None:
        self.streaming: Streaming = streaming
        self.user: User = user
        self.total_royalty: float = 0.0

    def run_session(self, tracks: List[Track]) -> float:
        """Play each track in order; return cumulative royalty."""
        for track in tracks:
            self.total_royalty += self.streaming.process_play_event(self.user, track)
        return self.total_royalty

test_player.py

"""Behavior tests — lock the contract before refactoring."""
import pytest
from typing import List
from player import User, Track, Streaming
from playback_session import PlaybackSession


@pytest.fixture
def streaming() -> Streaming:
    return Streaming()


@pytest.fixture
def free_user() -> User:
    return User(user_id="u1", subscription_tier="free", region="US", daily_plays=2)


@pytest.fixture
def premium_user() -> User:
    return User(user_id="u2", subscription_tier="premium", region="EU")


@pytest.fixture
def track_us_only() -> Track:
    return Track(track_id="t1", title="Song A", duration_sec=180,
                 available_regions=["US"], rate=0.01)


@pytest.fixture
def track_global() -> Track:
    return Track(track_id="t2", title="Song B", duration_sec=240,
                 available_regions=[], rate=0.02)


def test_premium_user_global_track_pays_royalty(streaming: Streaming,
                                                premium_user: User,
                                                track_global: Track) -> None:
    # Premium has no daily limit; global track has no geo restriction.
    royalty: float = streaming.process_play_event(premium_user, track_global)
    assert royalty == pytest.approx(0.014)
    assert track_global.play_count == 1
    assert premium_user.daily_plays == 1
    assert premium_user.history == ["t2"]


def test_free_user_under_limit_pays_royalty(streaming: Streaming,
                                            free_user: User,
                                            track_us_only: Track) -> None:
    royalty: float = streaming.process_play_event(free_user, track_us_only)
    assert royalty == pytest.approx(0.007)
    assert track_us_only.play_count == 1
    assert free_user.daily_plays == 3


def test_free_user_at_daily_limit_blocked(streaming: Streaming,
                                          free_user: User,
                                          track_us_only: Track) -> None:
    free_user.daily_plays = 5
    with pytest.raises(PermissionError, match="daily limit"):
        streaming.process_play_event(free_user, track_us_only)


def test_geo_restriction_blocks_play(streaming: Streaming,
                                     premium_user: User,
                                     track_us_only: Track) -> None:
    with pytest.raises(PermissionError, match="not available"):
        streaming.process_play_event(premium_user, track_us_only)


# ---- Caller test: PlaybackSession exercises process_play_event ----

def test_playback_session_accumulates_royalty(
    streaming: Streaming, premium_user: User, track_global: Track
) -> None:
    session: PlaybackSession = PlaybackSession(streaming, premium_user)
    tracks: List[Track] = [track_global, track_global, track_global]
    total: float = session.run_session(tracks)
    # 3 plays × 0.014 royalty per play
    assert total == pytest.approx(0.014 * 3)
    assert track_global.play_count == 3
    assert premium_user.daily_plays == 3

Solution

player.py

"""The streaming player — orchestrates a single play event."""
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    user_id: str
    subscription_tier: str
    region: str
    daily_plays: int = 0
    history: List[str] = field(default_factory=list)


@dataclass
class Track:
    track_id: str
    title: str
    duration_sec: int
    available_regions: List[str]
    rate: float
    play_count: int = 0


class Streaming:
    """Orchestrates streaming play events for users and tracks."""

    def _check_subscription(self, user: User) -> None:
        """Raise if the user is not allowed to play right now."""
        if user.subscription_tier == "free":
            if user.daily_plays >= 5:
                raise PermissionError("Free tier daily limit reached")
        elif user.subscription_tier == "premium":
            pass
        else:
            raise ValueError(f"Unknown tier: {user.subscription_tier}")

    def _check_geo_restriction(self, user: User, track: Track) -> None:
        """Raise if the track isn't licensed for the user's region."""
        if track.available_regions and user.region not in track.available_regions:
            raise PermissionError(f"Track not available in {user.region}")

    def _compute_royalty(self, track: Track) -> float:
        """Per-play royalty for a single play (creator gets 70%)."""
        return 1 * track.rate * 0.7

    def process_play_event(self, user: User, track: Track) -> float:
        """Run one play: returns the royalty paid to the creator."""
        self._check_subscription(user)
        self._check_geo_restriction(user, track)
        royalty: float = self._compute_royalty(track)
        track.play_count += 1
        user.daily_plays += 1
        user.history.append(track.track_id)
        return royalty

Three new helpers, one shorter process_play_event.

Each helper has:

A type-annotated signature that documents what it consumes and produces.
A docstring naming the sub-goal in human terms.
A leading underscore signalling “internal.”

process_play_event is now seven lines of orchestration. A reader can scan it top-to-bottom and understand the play sequence without diving into any of the helpers. The helpers are there if a reader needs the details.

The safety dance held throughout. Tests passed before the first extraction, after each individual extraction, and at the end. If any one extraction had broken behavior, you’d have caught it within seconds and reverted only the bad change.

What you didn’t do. You didn’t extract _update_play_count(user, track) or _record_history(user, track) — those are one-liners. Extract Function pays back when the extracted block has a coherent name; extracting individual lines with weak names just trades one smell for another (call-site ping-pong). The “Rule of Three” (Step 4) and the “comments-as-deodorant” trap (this step’s quiz) cover when extraction stops paying.

Step 2 — Knowledge Check

Min. score: 80%

1. Retrieval. What’s the minimum test discipline a refactoring requires?

Run the tests once at the end to confirm nothing broke
Running tests only at the end means a single failure indicates some change broke behavior — but not which one. Refactoring relies on small steps, each individually verified.
Run the tests between every two structural changes — green, change, green
Write new tests after each refactoring to cover the new structure
Refactoring is by definition behavior-preserving — the existing tests should still cover the unchanged behavior. New tests follow new behavior, not refactorings.
No tests are needed if the IDE supports the refactoring
Even tool-driven refactorings can fail in edge cases (missed call sites, dynamic references). The test suite is the safety net regardless of how the structural change happens.

Green → change → green is the rhythm. The “green” after one change is the “green” before the next, so the cost is one test run per change, not two. If the second green ever turns red, you know exactly which change to revert.

2. Which of the following is the best sign that a function is too long?

It exceeds 50 lines
Line count is a heuristic, not a rule. A 60-line function with one coherent purpose is fine; a 20-line function with five sub-goals is too long.
It contains nested loops or conditionals
Nested control flow is its own smell (Complex Class), but a function can be long and flat — many sequential blocks rather than nesting.
Its body has multiple sub-goals labeled with comments
It uses no helper functions of its own
Many short functions delegate to helpers; many short functions don’t. The presence of helpers is independent of length.

Comments-as-section-headers are the diagnostic. When a reader needs # 1. Validate, # 2. Compute, # 3. Format to navigate a function, the function is doing those three things and should be three functions. The comment names become the new function names.

3. You add a comment to a long method explaining what each section does. Have you fixed the smell?

Yes — the function is now self-documenting
Comments help readers; they don’t change the structural problem. The function still has five sub-goals jammed together; the next change still risks breaking unrelated logic.
Yes, as long as each comment is one line
Length doesn’t change the lesson. A one-line comment can label a sub-goal, but the sub-goal is still inline. Extract the function — let the function’s name be the comment.
No — comments document a smell that should be fixed structurally
Only if the comments use a standard format like reStructuredText

Comments-as-deodorant. Section-header comments are a sign the code wants to be split. The fix isn’t tighter comments; it’s named functions whose call sites read like an outline. After Extract Function, the comment in the call site is the helper’s name.

4. After extracting one sub-goal into a helper, you run the tests. They pass. What do you do next?

Commit the change and continue with the next sub-goal
Run the tests again to be sure
Re-running the same test command rarely surfaces new information. The first green run is the signal.
Manually inspect the helper for any subtle differences from the original
If the test passed, the extraction preserved behavior. Suspicion that the test is too weak points to a separate problem (oracle strength) — manually inspecting every change papers over the gap rather than addressing it.
Add a new test for the helper before extracting more sub-goals

Trust the tests, then move on. The whole point of behavior preservation tests is that they make manual inspection unnecessary. Each green run is a small commit point. Adding new tests for the helper is optional — the existing tests cover the public behavior, which is what matters.

3

Boolean Anti-Patterns — and the trap that costs you everything

Why this matters

Boolean code looks innocent and breaks loudly. About 30% of conditional anti-patterns in one large empirical study come from a single shape — wrapping a boolean expression in if/else: return True/False. Worse, some “obvious” boolean simplifications silently drop a branch, and a happy-path test will let the bug ship. Truth tables are the safety net that catches what your eyes miss.

🎯 You will learn to

Apply three boolean simplifications to remove pointless conditional wrappers.
Analyze a nested-if collapse to recognize the IfsMerged trap.
Create a @pytest.mark.parametrize truth table that covers all input combinations.

Open gates.py. The PlaybackGates class has three small methods that guard playback. All three have boolean smells. Two are safe to simplify; one looks safe but isn’t.

Sub-task A — boolean return

def is_trial_expired(self, free_trial_days_left: int) -> bool:
    if free_trial_days_left <= 0:
        return True
    else:
        return False

The body is if condition: return True else: return False. The condition is already a boolean — wrapping it in an if/else does nothing. Simplify to:

def is_trial_expired(self, free_trial_days_left: int) -> bool:
    return free_trial_days_left <= 0

This is the most common conditional anti-pattern in novice Python — about 30% of conditional anti-patterns in one large empirical study (Naude et al., 2024).

Sub-task B — confusing else

def stream_quality(self, is_premium: bool) -> str:
    if is_premium:
        return "high"
    if not is_premium:
        return "low"
    return "low"  # unreachable, but the type checker insists

Two ifs with mutually-exclusive conditions and a “just in case” return at the end. The intent is if/else. Simplify to:

def stream_quality(self, is_premium: bool) -> str:
    if is_premium:
        return "high"
    else:
        return "low"

Sub-task C — the IfsMerged trap

Here’s the method:

def play_decision(self, is_logged_in: bool, is_offline: bool) -> str:
    if is_logged_in:
        if not is_offline:
            return "stream"
        else:
            return "play_cached"
    return "error_login_required"

A colleague reviewing the code proposes this simplification:

def play_decision(self, is_logged_in: bool, is_offline: bool) -> str:
    if is_logged_in and not is_offline:
        return "stream"
    return "error_login_required"

It compiles. The existing happy-path test (assert gates.play_decision(True, False) == "stream") passes. Before reading on, decide for yourself — is this simplification behavior-preserving? Take 30 seconds. Sketch a 4-row truth table over (is_logged_in, is_offline) covering all four combinations. Walk both versions through each row. Find a row where they disagree — or convince yourself they never do.

(Don’t peek. The reveal is in the next paragraph.)

The colleague’s simplification is wrong. The original returns "play_cached" for (is_logged_in=True, is_offline=True). The simplified version returns "error_login_required" for the same input. The branch where you’re logged in and offline got silently dropped. That’s the IfsMerged trap: collapsing nested ifs into a single and is safe only when the inner else returns the same value as the outer fall-through. Here it doesn’t — the inner else returns "play_cached", but the outer fall-through returns "error_login_required".

If you missed it during the prediction: that’s the point. Empirical studies of code review show this is one of the most common classes of regression introduced during simplification — exactly because the change looks obvious. The truth table is the safety net.

Why a happy-path test misses this

A single test like assert gates.play_decision(True, False) == "stream" exercises one of four possible truth-value combinations of (is_logged_in, is_offline). Three remain unchecked, and the bug lives in one of those three.

The fix is a truth table — a parametrize over all four combinations:

@pytest.mark.parametrize("is_logged_in,is_offline,expected", [
    (False, False, "error_login_required"),
    (False, True,  "error_login_required"),
    (True,  False, "stream"),
    (True,  True,  "play_cached"),
])
def test_play_decision(gates: PlaybackGates, is_logged_in: bool, is_offline: bool, expected: str) -> None:
    assert gates.play_decision(is_logged_in, is_offline) == expected

Four rows. Two columns of input. One expected output per row. Any behavior change in the function shows up somewhere in the table.

Mini warmup — parametrize in 5 lines. If @pytest.mark.parametrize is unfamiliar, the syntax is ("name1,name2", [(value1a, value2a), (value1b, value2b), ...]). Each tuple becomes one test case; the test function receives the tuple’s values as named parameters. That’s it.

Your task

Sub-task A: simplify is_trial_expired.
Sub-task B: rewrite stream_quality as a clean if/else.
Sub-task C — fill the truth table FIRST, then refactor. Open truth_table.md. Two of the four rows are pre-filled. Fill the other two by reading the original nested play_decision (in gates.py). Then circle (with a ← BREAKS comment) the row where the naive if is_logged_in and not is_offline: return "stream" simplification would diverge from the original. Only after the table is complete, write the correct simplified version of play_decision and extend the parametrize table in test_play_decision to cover all four rows.

The lesson is not “the original was unsimplifiable” — it can be simplified, but only if the simplification preserves the four-row truth table. The elegant version that does:

def play_decision(self, is_logged_in: bool, is_offline: bool) -> str:
    if not is_logged_in:
        return "error_login_required"
    return "play_cached" if is_offline else "stream"

Externalizing the truth table on paper before touching the function is the load-bearing move: it converts a 6-element working-memory task (4 truth combinations × 2 versions to compare) into a written artifact you can stare at.

Starter files

gates.py

"""Boolean playback gates — one class, three method-level smells."""


class PlaybackGates:
    """Encapsulates the boolean decisions a player makes per playback event."""

    def is_trial_expired(self, free_trial_days_left: int) -> bool:
        # Sub-task A: simplify
        if free_trial_days_left <= 0:
            return True
        else:
            return False

    def stream_quality(self, is_premium: bool) -> str:
        # Sub-task B: rewrite as if/else
        if is_premium:
            return "high"
        if not is_premium:
            return "low"
        return "low"

    def play_decision(self, is_logged_in: bool, is_offline: bool) -> str:
        # Sub-task C: don't simplify naively — fill the truth table first.
        if is_logged_in:
            if not is_offline:
                return "stream"
            else:
                return "play_cached"
        return "error_login_required"

play_controller.py

"""Routes play decisions through all three boolean gates.

Production caller for PlaybackGates. The gate methods get simplified
internally during this step (boolean-return collapse, if/else cleanup,
IfsMerged-safe nested-if rewrite) but their *behavior* must stay
identical — this caller's tests prove it.
"""
from dataclasses import dataclass
from gates import PlaybackGates


@dataclass
class UserState:
    free_trial_days_left: int
    is_premium: bool
    is_logged_in: bool
    is_offline: bool


class PlayController:
    """Decides what action to take for a user about to play."""

    def __init__(self, gates: PlaybackGates) -> None:
        self.gates: PlaybackGates = gates

    def decide(self, state: UserState) -> str:
        """Return one of: trial_expired, error_login_required, play_cached, stream:high, stream:low."""
        if self.gates.is_trial_expired(state.free_trial_days_left):
            return "trial_expired"
        decision: str = self.gates.play_decision(state.is_logged_in, state.is_offline)
        if decision == "stream":
            quality: str = self.gates.stream_quality(state.is_premium)
            return f"stream:{quality}"
        return decision

test_gates.py

"""Truth-table-driven tests for boolean gates and the PlayController caller."""
import pytest
from gates import PlaybackGates
from play_controller import PlayController, UserState


@pytest.fixture
def gates() -> PlaybackGates:
    return PlaybackGates()


@pytest.mark.parametrize("days,expected", [
    (5, False),
    (1, False),
    (0, True),
    (-1, True),
])
def test_is_trial_expired(gates: PlaybackGates, days: int, expected: bool) -> None:
    assert gates.is_trial_expired(days) == expected


@pytest.mark.parametrize("is_premium,expected", [
    (True, "high"),
    (False, "low"),
])
def test_stream_quality(gates: PlaybackGates, is_premium: bool, expected: str) -> None:
    assert gates.stream_quality(is_premium) == expected


# TODO: extend this table to cover ALL FOUR truth combinations of
# (is_logged_in, is_offline). The current table is happy-path only and
# will not catch the IfsMerged trap.
@pytest.mark.parametrize("is_logged_in,is_offline,expected", [
    (True, False, "stream"),
])
def test_play_decision(gates: PlaybackGates, is_logged_in: bool, is_offline: bool, expected: str) -> None:
    assert gates.play_decision(is_logged_in, is_offline) == expected


# ---- Caller tests: PlayController routes through all three gates ----

@pytest.mark.parametrize("state,expected", [
    (UserState(free_trial_days_left=0, is_premium=True,  is_logged_in=True,  is_offline=False), "trial_expired"),
    (UserState(free_trial_days_left=5, is_premium=False, is_logged_in=False, is_offline=False), "error_login_required"),
    (UserState(free_trial_days_left=5, is_premium=True,  is_logged_in=True,  is_offline=True),  "play_cached"),
    (UserState(free_trial_days_left=5, is_premium=True,  is_logged_in=True,  is_offline=False), "stream:high"),
    (UserState(free_trial_days_left=5, is_premium=False, is_logged_in=True,  is_offline=False), "stream:low"),
])
def test_play_controller_decide(gates: PlaybackGates, state: UserState, expected: str) -> None:
    controller: PlayController = PlayController(gates)
    assert controller.decide(state) == expected

truth_table.md

# Truth table for `play_decision`

Read the **original nested** `play_decision` in `gates.py`. Fill the two
remaining rows by simulating what the original would return for each
combination. Then put a `← BREAKS` comment on the row where the naive
`if is_logged_in and not is_offline: return "stream"` collapse diverges.

| is_logged_in | is_offline | original returns      | naive `A and not B` returns |
|--------------|------------|-----------------------|------------------------------|
| False        | False      | error_login_required  | error_login_required         |
| False        | True       | error_login_required  | error_login_required         |
| True         | False      | TODO                  | TODO                         |
| True         | True       | TODO                  | TODO                         |

Solution

gates.py

"""Boolean playback gates — one class, three method-level smells."""


class PlaybackGates:
    """Encapsulates the boolean decisions a player makes per playback event."""

    def is_trial_expired(self, free_trial_days_left: int) -> bool:
        return free_trial_days_left <= 0

    def stream_quality(self, is_premium: bool) -> str:
        if is_premium:
            return "high"
        else:
            return "low"

    def play_decision(self, is_logged_in: bool, is_offline: bool) -> str:
        if not is_logged_in:
            return "error_login_required"
        return "play_cached" if is_offline else "stream"

test_gates.py

"""Truth-table-driven tests for boolean gates and the PlayController caller."""
import pytest
from gates import PlaybackGates
from play_controller import PlayController, UserState


@pytest.fixture
def gates() -> PlaybackGates:
    return PlaybackGates()


@pytest.mark.parametrize("days,expected", [
    (5, False),
    (1, False),
    (0, True),
    (-1, True),
])
def test_is_trial_expired(gates: PlaybackGates, days: int, expected: bool) -> None:
    assert gates.is_trial_expired(days) == expected


@pytest.mark.parametrize("is_premium,expected", [
    (True, "high"),
    (False, "low"),
])
def test_stream_quality(gates: PlaybackGates, is_premium: bool, expected: str) -> None:
    assert gates.stream_quality(is_premium) == expected


@pytest.mark.parametrize("is_logged_in,is_offline,expected", [
    (False, False, "error_login_required"),
    (False, True,  "error_login_required"),
    (True,  False, "stream"),
    (True,  True,  "play_cached"),
])
def test_play_decision(gates: PlaybackGates, is_logged_in: bool, is_offline: bool, expected: str) -> None:
    assert gates.play_decision(is_logged_in, is_offline) == expected


# ---- Caller tests: PlayController must keep working post-simplification ----

@pytest.mark.parametrize("state,expected", [
    (UserState(free_trial_days_left=0, is_premium=True,  is_logged_in=True,  is_offline=False), "trial_expired"),
    (UserState(free_trial_days_left=5, is_premium=False, is_logged_in=False, is_offline=False), "error_login_required"),
    (UserState(free_trial_days_left=5, is_premium=True,  is_logged_in=True,  is_offline=True),  "play_cached"),
    (UserState(free_trial_days_left=5, is_premium=True,  is_logged_in=True,  is_offline=False), "stream:high"),
    (UserState(free_trial_days_left=5, is_premium=False, is_logged_in=True,  is_offline=False), "stream:low"),
])
def test_play_controller_decide(gates: PlaybackGates, state: UserState, expected: str) -> None:
    controller: PlayController = PlayController(gates)
    assert controller.decide(state) == expected

truth_table.md

# Truth table for `play_decision`

| is_logged_in | is_offline | original returns      | naive `A and not B` returns |
|--------------|------------|-----------------------|------------------------------|
| False        | False      | error_login_required  | error_login_required         |
| False        | True       | error_login_required  | error_login_required         |
| True         | False      | stream                | stream                       |
| True         | True       | play_cached           | error_login_required ← BREAKS |

Three smells, three rewrites.

is_trial_expired returns the comparison directly. The if/else around a boolean comparison was pure ceremony.
stream_quality uses a single if/else. The duplicate-condition pattern was an artifact of “let me handle the negation just to be safe” — a cousin of the redundancy-for-safety smell.
play_decision was correctly simplified using a guard clause + ternary. The naive if A and not B collapse would have dropped the is_offline=True branch entirely. The four-row parametrize table catches it.

The general principle. Any “simplification” of nested boolean control flow has to be checked against the full truth table of its inputs. Two boolean inputs → 4 rows. Three booleans → 8 rows. If your test table has fewer rows than the input space, the simplification is unverified.

Step 3 — Knowledge Check

Min. score: 80%

1. Retrieval / prediction. In a truth table over (is_logged_in, is_offline), which row does the naive if is_logged_in and not is_offline collapse get wrong?

(False, False)
(False, False) returns ‘error_login_required’ in both the original and the naive version — the outer if not is_logged_in guard handles it identically.
(False, True)
(False, True) is also handled identically — when is_logged_in is False, both versions return ‘error_login_required’ regardless of is_offline.
(True, False)
(True, False) returns ‘stream’ in both versions. This is the happy-path row — and the only one a single-test suite would check.
(True, True)

(True, True) is the IfsMerged row. Original code returns "play_cached"; the naive if A and not B collapse returns "error_login_required". The bug is invisible to any test that doesn’t include this exact input. Truth-table tests are not optional for boolean logic — they’re load-bearing.

2. Which of the following is the simplest correct simplification of if cond: return True else: return False?

return cond is True
is True checks identity, not truthiness. For a comparison like x <= 0, identity-comparison can fail surprisingly when the result is a numpy bool or similar non-True-singleton.
return bool(cond)
bool(cond) is correct but redundant when cond is already a comparison — comparisons return bool. It’s useful only when the input might be non-boolean (a list, an int) and explicit coercion is desired.
return cond
if cond: return True; return False
Removing the else keeps the same anti-pattern. The redundancy is the if/else plus the True/False literals, not just the else.

The condition IS the answer. Comparisons (<=, ==, in, etc.) already return booleans. Wrapping them in if/return True/return False adds no information.

3. You see this code:

if is_active:
    process()
if not is_active:
    skip()

What’s the smell, and what’s the fix?

Speculative generality — fix is to inline the helpers
Speculative generality is about unused abstractions, not about conditional structure.
Confusing else — fix is to use a single if/else
Long Method — fix is Extract Function
These four lines are too short to be a Long Method. The smell is in the shape of the conditional, not the length.
There’s no smell; this is fine because the conditions are exclusive
Logically the conditions are exclusive, but the structure invites two separate evaluations of is_active — and silently breaks if a third branch is later added (e.g., is_active is None). The if/else makes the exclusivity explicit and atomic.

Confusing else is the official name (cf. anti-patterns catalogs). Two ifs with mutually-exclusive conditions are an if/else in disguise. The fix removes the duplicate condition evaluation and makes the structure more change-resistant.

4. Which of these is the safest way to verify a boolean refactoring?

Run the existing happy-path test once
Happy-path tests pass on naive simplifications that drop branches — the IfsMerged trap demonstrates this exactly.
Add one new test for the simplified version
Adding one test catches one bug (if it happens to test the right input). Adding a full table catches every behavior change.
Run a parametrize table covering all input combinations
Manually inspect the diff for syntactic similarity
Syntactic similarity is not behavioral equivalence. The IfsMerged collapse looks similar to the original but evaluates differently on one of four rows.

For boolean code, a truth-table parametrize is the only sound oracle. Two-input booleans need 4 rows; three-input need 8; n-input need 2^n. If the table has fewer rows than the input space, the verification is incomplete.

4

Duplicated Code → Extract Function (with the tool)

Why this matters

Step 1 made you feel duplication’s cost. Now you remove it — and from this step forward, the tool does the typing. Your job is the three decisions a tool can’t make: where the boundary is, what to name the result, and whether the post-state is better than the pre-state. Get those three right and parameterised Extract Function is the highest-leverage refactoring in the toolkit.

🎯 You will learn to

Analyze near-duplicate methods to identify the one thing that varies between them.
Apply Monaco’s Refactor: Extract Function/Method to consolidate duplicates with a callable parameter.
Evaluate when to extract using the Rule of Three.

The tool will do the typing. You are doing three things: choosing the boundary, choosing the name, and judging whether the result is better. Those decisions are yours.

Open filters.py. The MusicLibrary class has two filter methods:

class MusicLibrary:
    def __init__(self, tracks: List[Dict]) -> None:
        self.tracks: List[Dict] = tracks

    def apply_genre_filter(self, genre: str) -> List[Dict]:
        result: List[Dict] = [t for t in self.tracks if t["genre"] == genre]
        result.sort(key=lambda t: t["title"])
        return result[:50]

    def apply_artist_filter(self, artist: str) -> List[Dict]:
        result: List[Dict] = [t for t in self.tracks if t["artist"] == artist]
        result.sort(key=lambda t: t["title"])
        return result[:50]

The structure is identical — filter, sort, paginate. The variation is exactly one thing: the predicate that decides which tracks to keep. That variation is a candidate parameter.

Initial state

Detailed description

UML class diagram with 1 class (MusicLibrary).

Classes

MusicLibrary — Attributes: tracks — Operations: apply_genre_filter(genre); apply_artist_filter(artist)

(One class with two near-duplicate methods. After the refactor a private _apply_filter helper joins them, and each public method becomes a one-line delegation.)

Draft the Issue line FIRST

Before you click anything, open memo.md and write a one-sentence Issue line naming the smell in your own words. Doing this before tool invocation calibrates whether the refactoring you’re about to apply matches the smell you diagnosed. If you can’t articulate the smell in a sentence, the tool will gladly produce a “clean” diff that doesn’t actually fix anything.

💡 Expert’s note. When you see two near-duplicates, the question is what varies between them. If the variation is a constant, it’s a parameter. If the variation is a piece of behavior — like a predicate — it’s a callable parameter. Naming the variation is the load-bearing decision.

How the tool works

Select the lines you want to extract. (Highlight them in the editor.)
Right-click to open the context menu. You’ll see a group of Refactor: actions:
- Refactor: Rename Symbol…
- Refactor: Extract Function/Method…
- Refactor: Introduce Parameter Object…
- Refactor: Move Method…
- Refactor: Move Field…
Click the action you want. A dialog appears asking for the new function’s name.
Type the name. The tool shows you a diff preview of the change.
Accept the diff if it looks right. Reject if it doesn’t.

The tool handles the mechanics — cutting the lines, generating the function definition, writing the call site, updating any references it can find. It does not handle judgment. You decide which lines to extract, what to name them, and whether the post-state is better. Those are the load-bearing decisions; the typing is not.

Your task

In apply_genre_filter, select the body (list comprehension, sort, slice).
Invoke Refactor: Extract Function/Method. Name the new method _apply_filter. Accept the diff.

The tool may have made the predicate a literal string. Edit the extracted method so that the predicate is a callable parameter with this signature:

from typing import Callable, Dict, List
def _apply_filter(self, predicate: Callable[[Dict], bool]) -> List[Dict]:
    result: List[Dict] = [t for t in self.tracks if predicate(t)]
    result.sort(key=lambda t: t["title"])
    return result[:50]

And rewrite apply_genre_filter to call it:

def apply_genre_filter(self, genre: str) -> List[Dict]:
    return self._apply_filter(lambda t: t["genre"] == genre)

Replace the body of apply_artist_filter with a similar call:

def apply_artist_filter(self, artist: str) -> List[Dict]:
    return self._apply_filter(lambda t: t["artist"] == artist)

Run the tests. They should still pass. The UML diagram now shows a third method _apply_filter on MusicLibrary — visible proof of the extraction.

The Memo template

From this step on, every refactoring gets a memo — a structured note that captures the design decision. The four fields:

Field	What it captures
Issue	What smell is present in the original code?
Rationale	Why is this refactoring the right fix?
Invariant	What property of behavior is preserved?
Tests	Which tests confirm the invariant?

Open memo.md. Two fields are pre-filled (Rationale, Invariant). You write the Issue and Tests fields. Don’t skip this — the memo is part of the deliverable for this step.

When NOT to extract — the Rule of Three

Two duplicates is the minimum for extraction. Some teams use the Rule of Three: don’t extract until you see the same pattern three times. The reason: the third occurrence reveals what’s truly common vs. what’s accidental similarity. Extracting on the second occurrence sometimes produces an abstraction that breaks when you find the third.

For this step, two is enough — the variation (predicate) is clearly the only difference. If you’re unsure, wait for the third instance.

Starter files

filters.py

"""Filters for the music library."""
from typing import Dict, List


class MusicLibrary:
    """An in-memory music library with genre and artist filters."""

    def __init__(self, tracks: List[Dict]) -> None:
        self.tracks: List[Dict] = tracks

    def apply_genre_filter(self, genre: str) -> List[Dict]:
        result: List[Dict] = [t for t in self.tracks if t["genre"] == genre]
        result.sort(key=lambda t: t["title"])
        return result[:50]

    def apply_artist_filter(self, artist: str) -> List[Dict]:
        result: List[Dict] = [t for t in self.tracks if t["artist"] == artist]
        result.sort(key=lambda t: t["title"])
        return result[:50]

search_handler.py

"""Dispatches user search queries to the right MusicLibrary filter.

Production caller for MusicLibrary's filter methods. The student
extracts a private `_apply_filter` helper inside MusicLibrary; the
PUBLIC methods (`apply_genre_filter`, `apply_artist_filter`) keep
their signatures, so this caller doesn't need to change. That's the
point of refactoring — internal structure changes, external contract
stays identical.
"""
from typing import Dict, List
from filters import MusicLibrary


class SearchHandler:
    """Routes a (attribute, value) query to the right filter method."""

    def __init__(self, library: MusicLibrary) -> None:
        self.library: MusicLibrary = library

    def search(self, attribute: str, value: str) -> List[Dict]:
        """Dispatch to the genre or artist filter based on attribute name."""
        if attribute == "genre":
            return self.library.apply_genre_filter(value)
        elif attribute == "artist":
            return self.library.apply_artist_filter(value)
        return []

test_filters.py

"""Behavior tests for the filters and the SearchHandler caller."""
import pytest
from typing import Dict, List
from filters import MusicLibrary
from search_handler import SearchHandler


@pytest.fixture
def library() -> MusicLibrary:
    tracks: List[Dict] = [
        {"title": "B", "artist": "Alice", "genre": "rock"},
        {"title": "A", "artist": "Alice", "genre": "jazz"},
        {"title": "C", "artist": "Bob",   "genre": "rock"},
        {"title": "D", "artist": "Carol", "genre": "rock"},
    ]
    return MusicLibrary(tracks)


def test_genre_filter_returns_sorted_matches(library: MusicLibrary) -> None:
    result: List[Dict] = library.apply_genre_filter("rock")
    assert [t["title"] for t in result] == ["B", "C", "D"]


def test_artist_filter_returns_sorted_matches(library: MusicLibrary) -> None:
    result: List[Dict] = library.apply_artist_filter("Alice")
    assert [t["title"] for t in result] == ["A", "B"]


def test_genre_filter_paginates_at_50() -> None:
    tracks: List[Dict] = [
        {"title": f"T{i:03d}", "artist": "X", "genre": "rock"} for i in range(100)
    ]
    big_library = MusicLibrary(tracks)
    result: List[Dict] = big_library.apply_genre_filter("rock")
    assert len(result) == 50


# ---- Caller tests: SearchHandler dispatches both filter kinds ----

def test_search_handler_dispatches_to_genre(library: MusicLibrary) -> None:
    handler: SearchHandler = SearchHandler(library)
    result: List[Dict] = handler.search("genre", "rock")
    assert [t["title"] for t in result] == ["B", "C", "D"]


def test_search_handler_dispatches_to_artist(library: MusicLibrary) -> None:
    handler: SearchHandler = SearchHandler(library)
    result: List[Dict] = handler.search("artist", "Alice")
    assert [t["title"] for t in result] == ["A", "B"]


def test_search_handler_unknown_attribute_returns_empty(library: MusicLibrary) -> None:
    handler: SearchHandler = SearchHandler(library)
    assert handler.search("year", "2020") == []

memo.md

# Refactoring memo — Step 4

## Issue
<!-- TODO: name the smell in one sentence. What's wrong with the original code? -->


## Rationale
The variation between `apply_genre_filter` and `apply_artist_filter` is the predicate
(what makes a track a "match"). Everything else — the list comprehension, the sort,
the pagination — is identical. Extracting the common structure into a helper that
accepts a callable predicate captures the duplication exactly, with no over-generalization.

## Invariant
For any library and any field/value pair, calling the original duplicated function
and the new factored-through-`_apply_filter` version produces identical results
(same list, same order, same length). External behavior is unchanged.

## Tests
<!-- TODO: which tests in test_filters.py confirm the invariant? List them by name. -->

Solution

filters.py

"""Filters for the music library."""
from typing import Callable, Dict, List


class MusicLibrary:
    """An in-memory music library with genre and artist filters."""

    def __init__(self, tracks: List[Dict]) -> None:
        self.tracks: List[Dict] = tracks

    def _apply_filter(self, predicate: Callable[[Dict], bool]) -> List[Dict]:
        result: List[Dict] = [t for t in self.tracks if predicate(t)]
        result.sort(key=lambda t: t["title"])
        return result[:50]

    def apply_genre_filter(self, genre: str) -> List[Dict]:
        return self._apply_filter(lambda t: t["genre"] == genre)

    def apply_artist_filter(self, artist: str) -> List[Dict]:
        return self._apply_filter(lambda t: t["artist"] == artist)

memo.md

# Refactoring memo — Step 4

## Issue
Two functions (`apply_genre_filter`, `apply_artist_filter`) share 70% of their
code — the only variation is the predicate that selects matches. This is
Duplicated Code: any future change (e.g., raising the page size from 50 to 100)
requires editing both places, and any bug in the shared logic exists twice.

## Rationale
The variation between the two functions is the predicate (what makes a track a
"match"). Everything else — the list comprehension, the sort, the pagination —
is identical. Extracting the common structure into a helper that accepts a
callable predicate captures the duplication exactly, with no over-generalization.

## Invariant
For any library and any field/value pair, calling the original duplicated function
and the new factored-through-`_apply_filter` version produces identical results
(same list, same order, same length). External behavior is unchanged.

## Tests
`test_genre_filter_returns_sorted_matches`, `test_artist_filter_returns_sorted_matches`,
and `test_genre_filter_paginates_at_50` confirm the invariant — same inputs produce
the same outputs after the extraction.

What the tool did, what you did.

The tool extracted the body and inserted the call site. Mechanical work.
You named it _apply_filter. The leading underscore signals “module-internal.” The name describes what it does (filters), not how (uses list comprehension).
You decided that the variation was a callable predicate, not a string field name. That’s the design decision. A string-based extraction would have worked for these two cases but broken if a future call needed lambda t: t["year"] > 2000.

The Rule of Three at work. With only two duplicates, the predicate-as-callable might be over-generalizing. With three duplicates, the pattern is clearly stable. We extracted on two here because the variation was obvious — but a more defensible default is to wait for the third instance.

Step 4 — Knowledge Check

Min. score: 80%

1. Retrieval / prediction. After Extract Function, which four regions of filters.py will have changed compared to the starter?

Function definitions, return statements, imports, the test file
‘Function definitions and return statements’ is not a useful taxonomy of regions — every Python file has those. The test file shouldn’t change at all (that’s the behavior-preservation property). The four concrete regions are the new import, the helper, and the two delegating bodies.
The Callable import, the _apply_filter helper, and both filter bodies
Imports, the helper, the test file, and the __init__
The test file shouldn’t change — that’s the behavior-preservation property. __init__ is unchanged because the helper is a method, not a new field.
Only one filter body — the tool propagates the rest automatically
The tool extracts the first region into a helper, but the second duplicate still has the old body until it gets replaced with a delegating call. That second replacement is a human task.

One new import, one new helper, two delegators. Each duplicate body shrinks to a single-line return self._apply_filter(lambda t: ...). The Callable import is needed because the helper’s signature gains a callable parameter. The test file should not change — if it did, you changed behavior, not just structure.

2. Spacing back to Step 1. Suppose Step 1’s royalty_song, royalty_podcast, royalty_audiobook (with the bug + in two of them) had been refactored before the bug was discovered — extracted into a single _royalty(plays, rate) helper. What would have happened?

The bug would have been caught immediately
Extraction doesn’t run the tests automatically. It moves code; it doesn’t validate it.
The bug would have moved into the helper — one buggy place instead of three
The refactoring would have failed because the duplicates differ
The duplicates were near-identical — same structure, same operator (or at least, same intended operator). The tool would have happily extracted them; whether the result was correct depends on which of the three the helper inherits its body from.
The tests would have caught it because of the extraction
Existing tests covered only royalty_song. Without coverage of the other two, no extraction or non-extraction would surface the bug.

Fix-then-refactor. This is exactly why Step 1’s lesson is “fix first.” Extracting buggy duplicates into a helper consolidates the bug, which feels like progress but isn’t. The bug is still there — it’s just in a single, harder-to-spot location.

3. You see a piece of code that’s been duplicated once (two copies). Should you extract it?

Yes, immediately — duplication is always a smell
Eager extraction at two duplicates is a real pattern, but it backfires when the third occurrence reveals a different variation than the first two implied (the wrong-abstraction trap).
It depends — if the variation is obvious, extract; if not, wait
No — the Rule of Three says wait until you see three copies
The Rule of Three is a default, not an absolute. When the variation is obvious and stable (like a predicate), extracting at two is fine.
No — extraction always over-generalizes
Extraction can over-generalize, but it doesn’t have to. The risk is real but manageable with judgment.

Judgment beats rules. Sandi Metz: “Duplication is far cheaper than the wrong abstraction.” When unsure, wait for the third occurrence. When the variation is one obvious dimension, extract early.

4. Apply. Imagine MusicLibrary grows a third filter, apply_album_filter(album), also written as a sort-and-truncate over a single-field equality predicate. Using the helper you extracted in this step, what’s the smallest body the new filter needs?

Re-extract _apply_filter again — the existing helper only handles two predicates
The helper accepts any Callable[[Dict], bool] predicate — it already handles the third case. Re-extracting would duplicate the helper, which is the smell you just removed.
return self._apply_filter(lambda t: t['album'] == album) — one delegating line
Copy the body of apply_genre_filter and edit the predicate — the helper isn’t reusable
Copy-paste reintroduces the duplication you just killed. Each new filter is a one-liner because the variation (the predicate) is captured by a parameter.
Override _apply_filter in a subclass that knows about albums
Subclassing would couple the album filter to a hierarchy that doesn’t exist. The lambda parameter is the right seam — it’s why the helper accepts a Callable, not a string.

The Rule of Three’s payoff. The third filter is now a one-liner because the helper captured the variation (the predicate) as a parameter. That’s the lifetime savings of Extract Function: every future filter that fits the shape costs one line, not seven. If the new filter didn’t fit the shape — say it needed to truncate at 100 instead of 50 — that would be a signal the abstraction is too narrow, and the new filter would push the helper toward a second parameter.

5

Long Parameter List → Introduce Parameter Object (with the tool)

Why this matters

A method that takes eight parameters is not just long — it’s hiding a relationship the code refuses to name. When four of the eight always travel together, that’s a data clump, and every call site that touches one is forced to touch all four. Introduce Parameter Object names the clump as a real type, so the relationship becomes a compile-time fact rather than a convention every reader has to discover.

🎯 You will learn to

Analyze a long parameter list to spot the data clump hiding inside it.
Apply Monaco’s Refactor: Introduce Parameter Object to consolidate a clump into a @dataclass.
Evaluate why type-annotated parameter objects beat anonymous dict payloads.

Open track.py. The class TrackCatalog has an add_track method that takes eight parameters:

class TrackCatalog:
    def add_track(
        self,
        title: str,
        artist: str,
        album: str,
        duration_sec: int,
        genre: str,
        release_year: int,
        bpm: int,
        isrc: str,
    ) -> Dict:
        ...

Eight is a lot. But the smell isn’t just the count — it’s that four of the eight always travel together: artist, album, genre, release_year describe the album a track belongs to. Every call site that passes one of them passes all four; every call site that mutates one mutates them as a unit.

That’s a data clump. The fix is Introduce Parameter Object — a small class (here a @dataclass) that names the clump and makes the relationship explicit.

Initial state

Detailed description

UML class diagram with 1 class (TrackCatalog).

Classes

TrackCatalog — Attributes: library — Operations: add_track(title, artist, album, duration_sec, genre, release_year, bpm, isrc)

(One class, one method with eight flat parameters. After the refactor, four of these will be replaced by an AlbumInfo parameter object — and you’ll see a new AlbumInfo box appear in the live UML next to TrackCatalog.)

A 60-second `@dataclass` primer (predict, then peek)

Python’s @dataclass decorator generates __init__, __repr__, and __eq__ for you from a list of typed field declarations:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

That’s the entire class. Before reading further, take 30 seconds and predict: what auto-generated __init__ does Python write for you? What does print(Point(3, 4)) print? Does Point(3, 4) == Point(3, 4) return True or False? Write your answers down — the desugared form is in the next paragraph.

Behind the scenes, @dataclass rewrites the class to roughly:

class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y
    def __repr__(self) -> str:
        return f"Point(x={self.x}, y={self.y})"
    def __eq__(self, other) -> bool:
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

So Point(x=3, y=4) works, print(Point(3, 4)) reads Point(x=3, y=4), and Point(3, 4) == Point(3, 4) returns True — all without you typing any boilerplate. The tool will produce exactly this shape for AlbumInfo in a moment. Recognize it, don’t try to derive it.

Sketch `AlbumInfo` BEFORE invoking the tool

Take 60 seconds. In a comment at the top of track.py, write what you’d put in an AlbumInfo dataclass if you were writing it by hand: field names + Python types. Don’t peek at the tool’s output yet. Then invoke the tool and compare. The compare is the lesson — your sketch usually matches the tool’s output, which builds your trust that the tool is doing what you’d do, just faster.

Your tasks (in order)

Refactor the signature. Place your cursor in TrackCatalog.add_track’s parameter list. Right-click → Refactor: Introduce Parameter Object… The tool asks for the class name — type AlbumInfo. Select artist, album, genre, release_year. Inspect the diff. The new add_track should accept (self, title, album_info, duration_sec, bpm, isrc) — five parameters (plus self), one of which is structured. Watch the live UML in the bottom-left: a new AlbumInfo box appears with the four clumped fields, and TrackCatalog.add_track’s signature shrinks.
Update helpers.py. Open it. seed_library calls add_track with positional arguments — tools rewrite named call sites more reliably than positional ones, so the tool likely missed this. Replace each call with the new AlbumInfo form.
Update test_track.py. Open it. test_add_track_inserts_record calls add_track with the old eight-keyword form — that signature no longer exists. Replace the call with the new AlbumInfo form. (The other test, test_seed_library_populates_two_tracks, doesn’t need to change because it goes through seed_library.)
Run the tests. All three should now be green.

Pause and reckon: what did this refactor cost?

Count the files you had to edit by hand: helpers.py and test_track.py — two files, even though the smell was in one line of track.py. That count is the concrete cost of changing a public signature: every call site has to follow, and tests are call sites too.

This is what makes parameter objects valuable from the start. If you bundle related fields into a @dataclass when you first design a function, the signature stays stable as you add new fields — AlbumInfo can grow a label or producer field without changing any call site. Eight flat parameters can’t.

The lesson generalizes: stable interfaces are a design choice that pays off across every future signature change. When tests have to be updated by a refactor, that’s a signal — the test was reaching past the public interface, OR the public interface itself was the wrong shape. In Step 5’s case, it’s the second: the original add_track shape didn’t reflect that four of its parameters always travel together.

Look out for this in later steps. Steps 6 and 7 will demonstrate refactorings where the public interface stays stable and tests do not change — that’s the contrast.

Why a `@dataclass`, not a `dict`?

You could also pass {"artist": ..., "album": ..., "genre": ..., "release_year": ...} as a dict. That’s worse for three reasons:

Property	`@dataclass AlbumInfo`	`dict`
Type-checked at definition	yes (mypy / IDE)	no
Auto-completion in editor	yes	partial
Typos caught early	yes (`album_info.titel` is an error)	no (`d["titel"]` returns `KeyError` at runtime)

The dict version “works” but defers every type error to runtime. The dataclass moves the same errors to definition time, which is where bugs are cheap to fix.

Starter files

track.py

"""The track catalog: add a new track to the library."""
from typing import Dict, List


class TrackCatalog:
    """Stores and inserts tracks. Pretend the library list is a database."""

    def __init__(self) -> None:
        self.library: List[Dict] = []

    def add_track(
        self,
        title: str,
        artist: str,
        album: str,
        duration_sec: int,
        genre: str,
        release_year: int,
        bpm: int,
        isrc: str,
    ) -> Dict:
        """Insert a new track into the library."""
        record: Dict = {
            "title": title,
            "artist": artist,
            "album": album,
            "duration_sec": duration_sec,
            "genre": genre,
            "release_year": release_year,
            "bpm": bpm,
            "isrc": isrc,
        }
        self.library.append(record)
        return record

helpers.py

"""Helpers for seeding default tracks."""
from track import TrackCatalog


def seed_library(catalog: TrackCatalog) -> None:
    """Populate a catalog with some default tracks."""
    catalog.add_track("Lullaby", "Alice", "Bedtime", 180, "ambient", 2020, 60, "ISRC001")
    catalog.add_track("Sprint",  "Bob",   "Workout", 120, "electro", 2021, 140, "ISRC002")

memo.md

# Refactoring memo — Step 5

## Issue
`add_track` takes eight parameters, four of which (`artist`, `album`, `genre`,
`release_year`) always travel together as album-level metadata. This is a
**Long Parameter List with a data clump** — the four fields describe the album
a track belongs to, and they always co-vary in call sites.

## Rationale
<!-- TODO: explain WHY introducing AlbumInfo is the right fix.
     Why a @dataclass, not a dict? Why these four parameters and not others? -->


## Invariant
<!-- TODO: which property of behavior must be preserved across the refactor?
     (Hint: same input values → same record dict in LIBRARY.) -->


## Tests
The three `test_seed_library_*` tests in `test_track.py` lock in the observable
behavior: after `seed_library(catalog)`, the catalog contains two records with the
right field values. They go through the *stable* `seed_library(catalog) -> None`
interface, not through `add_track` directly — so the refactor changes `add_track`'s
signature without forcing the tests to change.

test_track.py

"""Tests for TrackCatalog. Exercise `add_track` directly AND through `seed_library`.

IMPORTANT — this step deliberately tests `add_track` *directly*. Why? Because the
smell is in `add_track`'s signature, and Introduce Parameter Object **changes that
signature**. When a public signature changes, every call site has to change too —
*including tests*. That cost is part of what you're learning here. Bundle parameters
eagerly (cheap) and you pay later when call sites multiply; bundle them as a
`@dataclass` and the signature is stable from the start.

After the refactor you will rewrite the kwargs in this file to pass an `AlbumInfo`.
The number of files that need to change (this file + `helpers.py`) is the *concrete
measurement* of the cost.
"""
import pytest
from typing import Dict
from track import TrackCatalog
from helpers import seed_library


@pytest.fixture
def catalog() -> TrackCatalog:
    return TrackCatalog()


def test_add_track_inserts_record(catalog: TrackCatalog) -> None:
    # After the refactor, this kwarg call will need to pass an `AlbumInfo`
    # for the album-metadata clump (artist / album / genre / release_year).
    record: Dict = catalog.add_track(
        title="Echo",
        artist="Carol",
        album="Reflections",
        duration_sec=200,
        genre="indie",
        release_year=2022,
        bpm=110,
        isrc="ISRC003",
    )
    assert record["title"] == "Echo"
    assert record["artist"] == "Carol"
    assert record["album"] == "Reflections"
    assert record["genre"] == "indie"
    assert record["release_year"] == 2022
    assert len(catalog.library) == 1


def test_seed_library_populates_two_tracks(catalog: TrackCatalog) -> None:
    # `seed_library`'s `(catalog) -> None` signature is stable across the refactor.
    # This test does NOT need to change — only the internal call inside helpers.py does.
    seed_library(catalog)
    assert len(catalog.library) == 2
    assert catalog.library[0]["title"] == "Lullaby"
    assert catalog.library[1]["bpm"] == 140

Solution

track.py

"""The track catalog: add a new track to the library."""
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AlbumInfo:
    artist: str
    album: str
    genre: str
    release_year: int


class TrackCatalog:
    """Stores and inserts tracks."""

    def __init__(self) -> None:
        self.library: List[Dict] = []

    def add_track(self, title: str, album_info: AlbumInfo, duration_sec: int,
                  bpm: int, isrc: str) -> Dict:
        """Insert a new track into the library."""
        record: Dict = {
            "title": title,
            "artist": album_info.artist,
            "album": album_info.album,
            "duration_sec": duration_sec,
            "genre": album_info.genre,
            "release_year": album_info.release_year,
            "bpm": bpm,
            "isrc": isrc,
        }
        self.library.append(record)
        return record

helpers.py

"""Helpers — also call add_track."""
from track import TrackCatalog, AlbumInfo


def seed_library(catalog: TrackCatalog) -> None:
    """Populate a catalog with some default tracks."""
    catalog.add_track(
        "Lullaby",
        AlbumInfo(artist="Alice", album="Bedtime", genre="ambient", release_year=2020),
        180, 60, "ISRC001",
    )
    catalog.add_track(
        "Sprint",
        AlbumInfo(artist="Bob", album="Workout", genre="electro", release_year=2021),
        120, 140, "ISRC002",
    )

memo.md

# Refactoring memo — Step 5

## Issue
`add_track` takes eight parameters, four of which (`artist`, `album`, `genre`,
`release_year`) always travel together as album-level metadata. This is a
**Long Parameter List with a data clump** — the four fields describe the album
a track belongs to, and they always co-vary in call sites.

## Rationale
Introducing `AlbumInfo` as a `@dataclass` captures the co-variation: the four
fields are now a single conceptual unit, named, type-annotated, and IDE-checkable.
A dict would also bundle them but would lose static type checking and tab-completion;
a hand-written class would force boilerplate (`__init__`, `__eq__`, `__repr__`)
that `@dataclass` generates for free. The other four `add_track` parameters
(`title`, `duration_sec`, `bpm`, `isrc`) vary independently of each other and of
the album fields, so they stay flat.

## Invariant
For any combination of input values, the dict record stored in `catalog.library`
and returned from `catalog.add_track` is unchanged: same keys, same values, same
order of keys. External behavior — what callers observe — is preserved.

## Tests
The three `test_seed_library_*` tests in `test_track.py` lock in the observable
behavior: after `seed_library(catalog)`, the catalog contains two records with the
right field values. They go through the *stable* `seed_library(catalog) -> None`
interface, not through `add_track` directly — so the refactor changes `add_track`'s
signature without forcing the tests to change.

test_track.py

"""Tests for TrackCatalog after Introduce Parameter Object.

Note: `test_add_track_inserts_record` was rewritten as part of this step's task —
its `add_track` call now passes an `AlbumInfo` instead of four flat album fields.
That update is the *concrete cost* of changing a public signature; this is the
file you'd be unable to avoid editing in any real codebase. `test_seed_library_*`
did not need to change because it goes through a stable wrapper.
"""
import pytest
from typing import Dict
from track import TrackCatalog, AlbumInfo
from helpers import seed_library


@pytest.fixture
def catalog() -> TrackCatalog:
    return TrackCatalog()


def test_add_track_inserts_record(catalog: TrackCatalog) -> None:
    record: Dict = catalog.add_track(
        title="Echo",
        album_info=AlbumInfo(
            artist="Carol", album="Reflections",
            genre="indie", release_year=2022,
        ),
        duration_sec=200,
        bpm=110,
        isrc="ISRC003",
    )
    assert record["title"] == "Echo"
    assert record["artist"] == "Carol"
    assert record["album"] == "Reflections"
    assert record["genre"] == "indie"
    assert record["release_year"] == 2022
    assert len(catalog.library) == 1


def test_seed_library_populates_two_tracks(catalog: TrackCatalog) -> None:
    seed_library(catalog)
    assert len(catalog.library) == 2
    assert catalog.library[0]["title"] == "Lullaby"
    assert catalog.library[1]["bpm"] == 140

The clump becomes a class. AlbumInfo is a small @dataclass that names the relationship between the four album-related fields. add_track’s signature shrinks from 8 parameters to 5 — and the 5 are now all different (no clump remains).

The tool rewrote add_track. You probably had to fix seed_library by hand — it called add_track with positional arguments, and the tool’s call-site rewriter sometimes misses positional calls in helper modules. Tests caught the miss; you fixed it; behavior is now preserved.

Compare before/after. The inline UML at the top of the instructions showed one box with 8 fields. The live UML in the bottom-left now shows two boxes — add_track (with 5 parameters) and AlbumInfo (with 4 fields). Same behavior, different structure. The tests still pass, which is the proof.

Step 5 — Knowledge Check

Min. score: 80%

1. Retrieval. Which of add_track’s eight parameters travel together as a clump? (select all that apply)

title
title is a track-level property — every track has its own. It varies independently of the album fields.
artist
album
duration_sec
duration_sec is track-level. Two tracks on the same album can have different durations.
genre
release_year
bpm
bpm is track-level. Even within a single album, tempo varies song to song.
isrc
isrc (International Standard Recording Code) is unique per recording, not per album. It’s the most track-specific identifier.

The clump is artist, album, genre, release_year — these describe the album, not the individual track. Two tracks on the same album always pass these four together with the same values. That co-variation is the diagnostic for a parameter object.

2. Why prefer @dataclass over a plain dict for the parameter object?

Runtime speed — dataclass attribute access is faster than dict lookup
Performance is roughly equivalent. Dataclass attribute access compiles to a normal __dict__ lookup; dict access is a hash lookup. Both are fast.
Type-checked field access — d.titel is a static error, d['titel'] is a runtime KeyError
Heterogeneous types — dicts can’t mix str, int, and list values
Dicts hold heterogeneous types just fine — a dict can have str, int, and list values.
Type-checker requirement — mypy and pyright refuse plain dict parameter objects
Type checkers don’t require dataclasses; they just produce better diagnostics with them than with dicts.

Type-checked field access is the load-bearing benefit. album_info.titel (typo) is a static error; d["titel"] is a runtime KeyError. Moving errors earlier — from runtime to edit time — is the same theme as test-driven development.

3. The tool rewrites call sites for you. After applying Introduce Parameter Object, your tests fail. What’s the most likely cause?

The tool corrupted the AlbumInfo definition
Tool-generated dataclass definitions are deterministic — they don’t ‘corrupt’. If the dataclass body looks wrong, that’s an undo-and-retry, not a tool bug.
The tool missed a positional call site in another file
The tests need updating because the refactor changed behavior
If tests need updating to pass, the refactor changed behavior — that’s not a refactor, that’s a rewrite. The whole point of the safety dance is that tests don’t change.
The dataclass decorator is malformed
If the dataclass were malformed, the import would fail at collection time, not in a specific test.

The tool sees what it can statically analyze. Positional calls in helper modules sometimes get missed — particularly when the helper imports add_track indirectly. The tests catch the miss; you fix it manually. Behavior preservation remains your responsibility, even with a tool.

4. What’s the difference between using @dataclass and writing the class by hand?

@dataclass is slower at runtime than a hand-written class
Dataclasses are essentially the same as hand-written classes at runtime. The decorator runs once at definition; instances behave normally.
@dataclass auto-generates init, repr, and eq
@dataclass classes can’t have methods
Dataclasses can have methods. The decorator handles __init__ etc.; anything else can still be added normally.
@dataclass is only for inheritance hierarchies
Dataclasses can be used in inheritance, but they’re more commonly used as flat parameter objects — exactly what we did with AlbumInfo.

Boilerplate generation. @dataclass writes the constructor, equality check, and repr from your field annotations. You get the same behavior as a hand-written class with one-third the code.

6

Feature Envy → Move Method (with the tool)

Why this matters

A method is a name plus a body, and a body that touches only another class’s data is misnamed. Feature Envy is the diagnostic for this: a method that lives on one class but uses zero state of it. The cure is Move Method — relocate the code to the class whose data it actually uses. Get this right and the dependency arrows in your UML start pointing the way the code actually flows.

🎯 You will learn to

Analyze a method body to recognize Feature Envy by the “zero self-state” rule.
Apply Monaco’s Refactor: Move Method to relocate the method to its rightful host.
Evaluate why type-annotated signatures make automated moves safer than dynamic ones.

Open media.py. There are two classes, Player and Track, and one suspicious method:

class Player:
    def __init__(self, volume: int) -> None:
        self.volume: int = volume

    def compute_remaining_seconds(self, track: "Track") -> int:
        return track.duration_sec - track.current_position

compute_remaining_seconds lives on Player but uses zero state of Player. It only touches fields of Track. That’s Feature Envy — the method “envies” the data of another class.

Initial state

Detailed description

UML class diagram with 2 classes (Player, Track). Player references Track labeled "uses".

Classes

Player — Attributes: volume — Operations: compute_remaining_seconds(track)
Track — Attributes: track_id; duration_sec; current_position — Operations: is_finished()

Relationships

Player references Track labeled "uses"

(compute_remaining_seconds lives on Player, but its body only reads Track fields. The arrow points the wrong way.)

💡 Diagnostic. Read the body of any method. Does it touch self at all? If not — or if it touches self only as a delegator — the method probably belongs on the other object whose data it does touch.

Not every cross-class access is Feature Envy

A common confusion: cross-class field access alone is not the diagnostic. It’s the one-sidedness that matters. Compare:

# Feature Envy — single method, zero self-state
class Player:
    def compute_remaining_seconds(self, track: Track) -> int:
        return track.duration_sec - track.current_position    # only Track fields

# Inappropriate Intimacy — two classes reaching deep into each other
class Player:
    def adjust_for_track(self, track: Track) -> None:
        if track.duration_sec > 600 and track.current_position == 0:
            self.volume = max(self.volume - 5, 0)
            track.last_play_volume = self.volume               # writes Track field

class Track:
    def adjust_for_player(self, player: Player) -> None:
        if player.volume < 10:
            self.playback_speed = 0.9                          # mutates self
            player.eq_preset = "soft"                          # writes Player field

Feature Envy has one asymmetric arrow — the fix is Move Method, and that’s what this step trains. Inappropriate Intimacy has arrows in both directions — moving one method just relocates the problem; the structural fix is to introduce a mediating object (or, often, to merge the two classes if they’re really one concept). Recognize the difference before you reach for Move Method, because the wrong fix on Inappropriate Intimacy makes things worse.

Run the move

Place your cursor inside compute_remaining_seconds, then Refactor: Move Method… with target class Track. Same flow as Step 4’s Extract Function — preview, accept. Watch the live UML in the bottom-left: the method migrates from Player to Track, and the Player → Track “uses” arrow disappears because Player no longer reaches into Track for this calculation.

Spacing callback — apply Step 3’s lesson here

Track has a sibling method is_finished that suffers from the boolean-return anti-pattern from Step 3:

def is_finished(self) -> bool:
    if self.current_position >= self.duration_sec:
        return True
    else:
        return False

After moving compute_remaining_seconds onto Track, simplify is_finished while you’re there. Same pattern, new context — the smell doesn’t care which class it lives on.

The Memo

Open memo.md. Three of four fields are blank — the Tests field is pre-filled. You write Issue, Rationale, and Invariant based on what you observe in the move.

Starter files

media.py

"""The streaming media classes."""
from dataclasses import dataclass


@dataclass
class Track:
    track_id: str
    duration_sec: int
    current_position: int = 0

    def is_finished(self) -> bool:
        # Sub-task: simplify this boolean (Step 3 callback)
        if self.current_position >= self.duration_sec:
            return True
        else:
            return False


@dataclass
class Player:
    volume: int = 50

    def compute_remaining_seconds(self, track: Track) -> int:
        # Feature Envy: uses no Player state, only Track state.
        return track.duration_sec - track.current_position

playback_ui.py

"""UI helper that formats the player's current state for display.

This caller currently goes through `Player.compute_remaining_seconds(track)`,
which is Feature Envy on Track. After the Move Method refactor, the call
site here MUST be updated from `self.player.compute_remaining_seconds(track)`
to `track.compute_remaining_seconds()`. The tool may rewrite this for you;
if it doesn't, the tests will catch the missed call site.

The tests in `test_media.py` go *only* through `PlaybackUI.format_remaining`,
whose `(track) -> str` signature is stable across the refactor — so the tests
themselves don't change. The cost of the refactor lives entirely inside this
file.
"""
from media import Player, Track


class PlaybackUI:
    """Renders the player display."""

    def __init__(self, player: Player) -> None:
        self.player: Player = player

    def format_remaining(self, track: Track) -> str:
        """Return a `M:SS remaining` string, or `Finished` when done."""
        if track.is_finished():
            return "Finished"
        seconds: int = self.player.compute_remaining_seconds(track)
        minutes: int = seconds // 60
        secs: int = seconds % 60
        return f"{minutes}:{secs:02d} remaining"

test_media.py

"""Behavior tests for the playback UI.

DESIGN — *contrast with Step 5*. Step 5 changed a public signature, and the tests
had to follow. This step does NOT change a public signature: `compute_remaining_seconds`
is an *internal* helper the UI calls; moving it from `Player` to `Track` is a private
rearrangement that callers don't observe. The `PlaybackUI.format_remaining(track) -> str`
surface is stable, so every test below is stable too. After the refactor, the tests
DO NOT CHANGE — only the internal call inside `playback_ui.py` does.

That stability is the *payoff* of refactoring through stable interfaces. Compare it
to Step 5's two-file edit: here the cost is exactly one file (`playback_ui.py`).
"""
import pytest
from media import Player, Track
from playback_ui import PlaybackUI


@pytest.fixture
def ui() -> PlaybackUI:
    return PlaybackUI(Player())


def test_format_remaining_at_start(ui: PlaybackUI) -> None:
    # 300s remaining → "5:00 remaining"
    track: Track = Track(track_id="t1", duration_sec=300, current_position=0)
    assert ui.format_remaining(track) == "5:00 remaining"


def test_format_remaining_partway(ui: PlaybackUI) -> None:
    # 300 - 120 = 180s → "3:00 remaining"
    track: Track = Track(track_id="t1", duration_sec=300, current_position=120)
    assert ui.format_remaining(track) == "3:00 remaining"


def test_format_remaining_pads_seconds(ui: PlaybackUI) -> None:
    # 125 - 60 = 65s → "1:05 remaining" (zero-padded seconds)
    track: Track = Track(track_id="t1", duration_sec=125, current_position=60)
    assert ui.format_remaining(track) == "1:05 remaining"


def test_format_remaining_at_end(ui: PlaybackUI) -> None:
    # When position == duration, is_finished() is True → "Finished"
    track: Track = Track(track_id="t1", duration_sec=100, current_position=100)
    assert ui.format_remaining(track) == "Finished"


def test_format_remaining_one_second_left(ui: PlaybackUI) -> None:
    # 1s remaining (not finished) → "0:01 remaining"
    track: Track = Track(track_id="t1", duration_sec=100, current_position=99)
    assert ui.format_remaining(track) == "0:01 remaining"

memo.md

# Refactoring memo — Step 6

## Issue
<!-- TODO: name the smell. Why does compute_remaining_seconds belong on Track instead of Player? -->


## Rationale
<!-- TODO: explain WHY moving the method is the right fix. What's the heuristic? -->


## Invariant
<!-- TODO: which property of behavior is preserved? Be specific about the contract. -->


The five `test_format_remaining_*` tests in `test_media.py` confirm the
observable behavior of `PlaybackUI.format_remaining(track) -> str`: the right
formatted string for each remaining-time scenario, including the `"Finished"`
terminal case. Because the tests go through `format_remaining`'s stable
signature, **the test file does not change across the refactor** — only the
internal call inside `playback_ui.py` does (from `self.player.compute_remaining_seconds(track)`
to `track.compute_remaining_seconds()`). Contrast this with Step 5, where
changing a public signature forced the tests to follow.

Solution

media.py

"""The streaming media classes."""
from dataclasses import dataclass


@dataclass
class Track:
    track_id: str
    duration_sec: int
    current_position: int = 0

    def is_finished(self) -> bool:
        return self.current_position >= self.duration_sec

    def compute_remaining_seconds(self) -> int:
        return self.duration_sec - self.current_position


@dataclass
class Player:
    volume: int = 50

playback_ui.py

"""UI helper — post-refactor: reaches into Track directly.

The only change from the starter is the internal call inside `format_remaining`:
`self.player.compute_remaining_seconds(track)` becomes `track.compute_remaining_seconds()`.
The public method signature (`format_remaining(track) -> str`) is identical.
"""
from media import Player, Track


class PlaybackUI:
    """Renders the player display."""

    def __init__(self, player: Player) -> None:
        self.player: Player = player

    def format_remaining(self, track: Track) -> str:
        """Return a `M:SS remaining` string, or `Finished` when done."""
        if track.is_finished():
            return "Finished"
        seconds: int = track.compute_remaining_seconds()
        minutes: int = seconds // 60
        secs: int = seconds % 60
        return f"{minutes}:{secs:02d} remaining"

test_media.py

"""Behavior tests for the playback UI — IDENTICAL to the starter file.

Notice: this file is byte-for-byte unchanged from the starter. The Move Method
refactor moved `compute_remaining_seconds` from Player to Track, but the
*public* interface — `PlaybackUI.format_remaining(track) -> str` — stayed the
same. Tests that go through stable interfaces don't need to change across a
refactor. Compare to Step 5, where the test file did need to change.
"""
import pytest
from media import Player, Track
from playback_ui import PlaybackUI


@pytest.fixture
def ui() -> PlaybackUI:
    return PlaybackUI(Player())


def test_format_remaining_at_start(ui: PlaybackUI) -> None:
    track: Track = Track(track_id="t1", duration_sec=300, current_position=0)
    assert ui.format_remaining(track) == "5:00 remaining"


def test_format_remaining_partway(ui: PlaybackUI) -> None:
    track: Track = Track(track_id="t1", duration_sec=300, current_position=120)
    assert ui.format_remaining(track) == "3:00 remaining"


def test_format_remaining_pads_seconds(ui: PlaybackUI) -> None:
    track: Track = Track(track_id="t1", duration_sec=125, current_position=60)
    assert ui.format_remaining(track) == "1:05 remaining"


def test_format_remaining_at_end(ui: PlaybackUI) -> None:
    track: Track = Track(track_id="t1", duration_sec=100, current_position=100)
    assert ui.format_remaining(track) == "Finished"


def test_format_remaining_one_second_left(ui: PlaybackUI) -> None:
    track: Track = Track(track_id="t1", duration_sec=100, current_position=99)
    assert ui.format_remaining(track) == "0:01 remaining"

memo.md

# Refactoring memo — Step 6

## Issue
`compute_remaining_seconds` lives on `Player` but uses zero state of `Player` —
its body reads only `track.duration_sec` and `track.current_position`. This is
Feature Envy: the method envies the data of another class.

## Rationale
Methods should live with the data they use. Moving `compute_remaining_seconds`
onto `Track` makes the method a normal accessor of its host's state, removes
the unnecessary `Track` parameter, and eliminates the misleading `Player → Track`
dependency arrow. Calls become `track.compute_remaining_seconds()` — the noun
(`track`) is the subject of the verb (`compute remaining seconds`).

## Invariant
For any `Track` with given `duration_sec` and `current_position`, the value
returned by the method is unchanged before and after the move. The four
existing tests confirm this — they need only the call-site rewrite, not any
change in expected values.

The five `test_format_remaining_*` tests confirm `PlaybackUI.format_remaining(track) -> str`
is unchanged. Because these tests go through `format_remaining`'s stable
signature, the test file is byte-for-byte identical before and after the
refactor. Only `playback_ui.py`'s internal call site changes. Compare to
Step 5, where the test file *had* to be edited because the public signature
of `add_track` changed — that's the cost a public-API change pays; this
refactor avoids it entirely.

Player got smaller. Track got one method richer. The Player → Track “uses” arrow disappeared from the diagram because Player no longer reaches into Track for that calculation. The method is now where the data lives.

Two refactorings in one step. Move Method on compute_remaining_seconds, and the Step-3 boolean simplification on is_finished. The point of the second is spacing — applying a previous step’s lesson in a new context. Step 3 simplified booleans in standalone functions; here you simplified one inside a class method. Same anti-pattern, same fix.

Compare before/after. The inline UML at the top showed Player containing compute_remaining_seconds with an arrow pointing to Track. The live UML now shows the method inside Track’s box and no arrow. The tests still pass — that’s the proof of behavior preservation.

Pause and reckon: how many files did you edit?

Count again. One file: playback_ui.py (plus media.py itself — the source of the move). The test_media.py file is byte-for-byte identical before and after the refactor.

Compare to Step 5, where you had to edit helpers.py AND test_track.py. The difference isn’t that this refactoring was easier — it’s that this refactoring went through a stable interface. PlaybackUI.format_remaining(track) -> str is the public contract; moving an internal helper from one class to another doesn’t break it. Moving a public signature does.

The takeaway, in one sentence: when a refactoring needs you to edit tests, that’s a signal — either the public surface changed (Step 5) or the tests were reaching past the public surface (a test-design problem). When neither is true, tests stay green for free.

Step 6 — Knowledge Check

Min. score: 80%

1. Prediction (before reading the body). Player.compute_remaining_seconds(track) returns a number of seconds. Without reading the body, which class is most likely to be the right home for this method?

Player — the method already lives there
Existing location is the least trustworthy heuristic — it shows where the method has been, not where it should be. Feature Envy is the smell of methods being in the wrong place.
Track — methods that compute a track’s properties belong on Track
User — playback decisions ultimately serve a user
User-level decisions belong on User-level methods. Computing remaining seconds of a Track is at the Track level.
It depends on the body; you can’t predict from the name
Method names are usually a strong signal. compute_remaining_seconds of a track is a property of the track. Read the body to confirm, but the name pre-commits.

The noun is the receiver. A method named “compute X of Y” is usually a method on Y. Reading the body confirms — uses zero Player state, all Track state. Move Method.

2. What’s the rule that diagnoses Feature Envy?

The method is more than 10 lines long
Length is a different smell (Long Method). Feature Envy can affect very short methods.
The method uses zero state of its host class (only foreign state)
The method has too many parameters
Parameter count is the Long Parameter List smell. Feature Envy is independent of parameter count.
The method’s name doesn’t match its containing class
Name mismatch is a heuristic but not the diagnostic. The diagnostic is body inspection — what state does the body actually touch?

“Uses zero state of host class” is the precise diagnostic. A method that doesn’t touch self (other than to read it) is suspicious; a method that touches self only to delegate to another object is suspicious; a method that uses only foreign state is unambiguously Feature Envy.

3. Step 3 callback. Why simplify is_finished from if cond: return True else: return False to return cond?

Because it makes the file shorter
Shorter for its own sake isn’t a goal. The simplification removes redundant cognitive structure, not lines.
Because the if/else around the comparison adds no information.
Because Python’s bytecode runs faster on return cond
Bytecode performance is roughly identical. The reason to simplify is readability, not speed.
Because the original form is a syntax error in newer Python versions
The original form is valid Python at every version. It’s just bad style.

The condition (self.current_position >= self.duration_sec) is already a bool. Wrapping it in if/else with literal True and False returns is pure ceremony — same anti-pattern from Step 3, now in a method context.

4. Could you instead move the data (Track’s fields) into Player, leaving the method where it is?

Yes — moving data to the method is symmetric to moving the method to the data
Logically symmetric, practically not. Data tends to have many users; methods tend to have few. Moving data breaks more code.
Yes — but only if Track has no other methods that use those fields
Even if Track had no other users right now, designing for the move would create future-coupling that’s hard to undo. Move the method, not the data.
No — other Track methods rely on those fields; moving the data scatters them.
No — Python’s class system doesn’t allow moving fields between classes
Fields can be moved technically. The reason not to is structural, not syntactic.

Methods are easier to move than data. A method has one home and a few callers; a field tends to have many readers and writers. Moving fields shotgun-surgeries the codebase. Moving methods tightens it.

5. Variation Theory — non-example. A Catalog.build_default_track() method on a Catalog class uses no self state and only returns a new Track from a few constants. Is this Feature Envy?

Yes — it uses no Catalog state, so it should be moved to Track
The diagnostic is necessary but not sufficient. A method that uses zero host state is a candidate for Feature Envy, but factories, validators, and pure-rule synthesizers can legitimately live on the class that exposes them as API.
No — factories that synthesize new objects belong with the exposing API.
Yes — and the fix is to delete the method since it does nothing
The method does plenty — it produces a Track. It’s a factory, not dead code.
Only if the method takes any parameters
Parameter count is unrelated. Factories with parameters, factories with none — both can legitimately live on the calling class.

The diagnostic catches Feature Envy candidates, but judgment decides. “Uses zero host state” makes a method a candidate for Move Method, not a guaranteed instance of the smell. Factories, default constructors, and pure-policy synthesizers live where the API surface is, not where the data is.

6. Variation Theory — Feature Envy vs Inappropriate Intimacy. Player.adjust_for_track(track) reads several track fields and writes track.last_play_volume. Track.adjust_for_player(player) reads player.volume and writes player.eq_preset. What’s the smell, and what’s the right fix?

Feature Envy on both sides — apply Move Method twice
Move Method is the fix for one-sided envy. With two-sided coupling, moving each method to the other class produces two envious methods on swapped homes. The smell isn’t where the methods live — it’s that two classes know each other’s internals.
Inappropriate Intimacy — Move Method just relocates it; the fix is structural
Neither — mutual writes between domain objects are normal in OO design
Tightly coupled mutual writes are a maintenance hazard: every change to one class can ripple into the other unpredictably. OO designs that survive long-term keep cross-class writes thin and explicit.
Long Parameter List — introduce a PlaybackContext parameter object
Parameter clumps are a different smell. Here the issue isn’t the call shape; it’s that the bodies of both methods reach across the class boundary in opposite directions.

One arrow vs two. Feature Envy is one-sided — a method on the wrong class. Move Method fixes it. Inappropriate Intimacy has arrows in both directions — moving one method just makes the other class envious. The structural fix is different: introduce a mediator that owns the cross-cutting logic, or, when the two classes are really one concept, merge them. Diagnose before you reach for the tool.

7. Mid-tutorial mix — Step 1 callback. You’ve just discovered that two functions have the same arithmetic bug. According to what you practiced in Step 1, what’s the correct order of operations?

Refactor first to remove the duplication, then fix the bug in the helper
Fix-then-refactor. Extracting with the bug intact would consolidate the bug into the helper — same bug, harder-to-spot location.
Fix the bug in both places, then refactor to remove the duplication
Fix in one place; the second one isn’t actually called
If a duplicate is genuinely uncalled, that’s dead code, a different smell. The Step 1 lesson assumes both copies are reachable.
Run the tests; if they pass, the duplication isn’t a problem
Running tests is the right discipline, but passing tests don’t mean the duplication isn’t a maintenance cost. The bug count will equal the duplicate count when the next bug arrives.

Fix-then-refactor. The Step 1 lesson generalizes: never refactor over a known bug. Fix in each location with green tests, then consolidate. This is the highest-leverage sequencing rule in the toolkit.

8. Mid-tutorial mix — Step 4 callback. Two near-duplicate functions were just written this morning. The third occurrence isn’t on the roadmap. The variation between the two copies is subtle — you’d have to study them to be sure what’s common. What does the Rule of Three suggest as the default?

Extract immediately — duplication is always a smell
Eager extraction at two duplicates is a real pattern, but it backfires when the third occurrence reveals a different variation than the first two implied (the wrong-abstraction trap). Step 4 noted that the default is to wait.
Wait for the third occurrence — it reveals what’s truly common.
Refuse to extract until tests are added for both copies
Tests-first is correct discipline before any refactor, but it’s not what Rule of Three says. Rule of Three is about the third occurrence, not the third test.
Extract only if the two functions live in the same file
Co-location is unrelated to Rule of Three.

The Rule of Three is the default, not an absolute. Step 4 made the nuance explicit: when the variation is obvious and stable (like a predicate), extracting at two duplicates is fine. When the variation is subtle (this question’s setup), the rule’s default — wait for three — saves you from a wrong-abstraction trap.

7

God Class → Extract Class (with the tool)

Why this matters

A God Class is a class that grew responsibilities until none of them are clearly its responsibility. Renaming won’t help — fields and methods are still coupled to the same self. Decomposition does help, because it shrinks change locality: the number of places that have to change when a feature lands. Extract Class is the named refactoring that turns “one class doing two jobs” into “two classes each doing one job,” and the change-locality count is how you tell whether you actually decomposed or just renamed.

🎯 You will learn to

Analyze a class to identify multiple responsibility clusters by their field-set overlap.
Apply Monaco’s Refactor: Extract Class to migrate one cluster into a new class.
Evaluate the refactor by comparing change locality before and after.

Open streaming_app.py. The StreamingApp class has two distinct responsibilities mixed into one body:

Catalog — track index, search history, recommendation cache, search & recommend methods.
Billing — subscription tier, payment method, invoice list, charging methods.

The comments in the source file label the two clusters explicitly. That’s a hint: when responsibility clusters need labels to stay legible, the class is doing too many things at once.

Initial state

Detailed description

UML class diagram with 1 class (StreamingApp).

Classes

StreamingApp — Attributes: track_index; search_history; recommendation_cache; subscription_tier; payment_method; invoice_list — Operations: search(); record_recommendation(); charge_monthly(); charge_annual(); send_invoice()

Why split? — predict the change-locality, then verify

Before you do any moves, predict: if a new payment method like PayPal needed to be added today, how many places in streaming_app.py would have to change? Skim the file, count, and write your number in a comment at the top of streaming_app.py like # BEFORE: N places change for a new payment method.

After the refactor, count again — how many files / classes change for the same feature. Write # AFTER: M places change.

The point of writing both numbers down is that change locality is the only metric that distinguishes “decomposed” from “renamed.” A renamed class still has the same change footprint. A decomposed class shrinks it.

💡 Expert’s note — recognizing seams. The seam between two responsibility clusters isn’t visible from how methods are named — it’s visible from which self.X fields each method touches. Methods that touch only the catalog field-set and methods that touch only the billing field-set form two clusters. A method that touches both is the boundary case and goes last. Read the methods one at a time, mark the field-sets they touch, and the seam will be obvious.

Use Extract Class

Place your cursor anywhere inside StreamingApp, then choose Refactor: Extract Class…. In the dialog:

Name the new class BillingManager.
Name the delegate field billing.
Select the billing fields: subscription_tier, payment_method, invoice_list.
Select the billing methods: charge_monthly, charge_annual, send_invoice, and the unlabeled notify_payment_due.
Preview the diff, then apply it.

The tool creates BillingManager, moves the selected field/method cluster, replaces the old fields with self.billing = BillingManager(...), and rewrites straightforward typed call sites from streaming_app.charge_monthly() to streaming_app.billing.charge_monthly().

💡 Expert’s note. Extract Class still follows the field-then-method safety rule internally. Fields become the new class’s state first; then methods that use only that field-set can move without leaving dangling self.X references behind.

Spacing callback — Rule of Three from Step 4

Look at charge_monthly and charge_annual carefully. They share most of their logic. Should you Extract Function on the shared body before moving them?

The Step 4 lesson said: extract when the variation is one obvious dimension. Here the variation is the multiplier (12 vs. 1). The bodies are otherwise identical.

Recommended order: Move first, then Extract second. Keeps the extracted helper inside BillingManager, exactly where billing logic should live. (If you Extract first, the helper is on StreamingApp and then has to be moved too — twice the work for the same result.)

The unlabeled stray method

One method, notify_payment_due, isn’t labeled with a comment. Read its body. Which cluster does it belong in? The answer is in the body — this is a small recovery exercise in seam recognition.

Starter files

streaming_app.py

"""The streaming app — currently a God Class.

Two responsibility clusters share one class body. Your task is to
extract the billing cluster into a new BillingManager class.
"""
from dataclasses import dataclass, field
from typing import Dict, List


class StreamingApp:
    def __init__(self, user_id: str) -> None:
        self.user_id: str = user_id

        # ----- Catalog cluster -----
        self.track_index: Dict[str, dict] = {}
        self.search_history: List[str] = []
        self.recommendation_cache: Dict[str, List[str]] = {}

        # ----- Billing cluster -----
        self.subscription_tier: str = "free"
        self.payment_method: str = ""
        self.invoice_list: List[Dict] = []

    # ----- Catalog methods -----
    def search(self, query: str) -> List[str]:
        self.search_history.append(query)
        return [tid for tid, info in self.track_index.items()
                if query.lower() in info.get("title", "").lower()]

    def record_recommendation(self, seed: str, results: List[str]) -> None:
        self.recommendation_cache[seed] = results

    # ----- Billing methods -----
    def charge_monthly(self) -> Dict:
        invoice: Dict = {"period": "monthly", "amount": 9.99 * 1, "method": self.payment_method}
        self.invoice_list.append(invoice)
        return invoice

    def charge_annual(self) -> Dict:
        invoice: Dict = {"period": "annual", "amount": 9.99 * 12, "method": self.payment_method}
        self.invoice_list.append(invoice)
        return invoice

    def send_invoice(self, invoice_index: int) -> bool:
        if invoice_index >= len(self.invoice_list):
            return False
        # Pretend to email the invoice.
        return True

    # ----- Which cluster does this belong in? -----
    def notify_payment_due(self) -> str:
        return f"Payment of $9.99 due on {self.payment_method}"

memo.md

# Refactoring memo — Step 7

## Issue
<!-- TODO: name the smell. What makes StreamingApp a God Class? -->


## Rationale
<!-- TODO: why Extract Class? Why this seam? Why does the field-set define the boundary? -->


## Invariant
<!-- TODO: what behavior must be preserved? What does the public API
     of StreamingApp guarantee that the refactor must not break? -->


## Tests
The five `test_*_runner_*` tests in `test_streaming_app.py` go entirely through
`AppRunner` — `configure_payment`, `run_searches`, `search_history`, `charge_monthly`,
`charge_annual`, and `invoices()`. Those are the *stable* signatures the refactor
preserves. Internally, `app_runner.py` is updated when billing migrates (its
`self.app.charge_monthly()` becomes `self.app.billing.charge_monthly()`, etc.),
but the test file itself does not change.

app_runner.py

"""The high-level driver tests interact with.

Every method below has a stable signature. Internally, several call sites
will need to be updated after the Extract Class refactor — from
`self.app.charge_monthly()` to `self.app.billing.charge_monthly()`, and
from `self.app.invoice_list` to `self.app.billing.invoice_list`. Those are
the *internal* edits the refactor forces. The methods on `AppRunner` keep
their signatures, so `test_streaming_app.py` doesn't change.
"""
from typing import Dict, List
from streaming_app import StreamingApp


class AppRunner:
    """Drives daily flows over StreamingApp. Tests interact only with this surface."""

    def __init__(self, app: StreamingApp) -> None:
        self.app: StreamingApp = app

    def configure_payment(self, method: str) -> None:
        """Set the payment method used for subsequent charges."""
        # Pre-refactor: payment_method is a field on StreamingApp.
        # Post-refactor: it lives on app.billing.
        self.app.payment_method = method

    def run_searches(self, queries: List[str]) -> List[List[str]]:
        """Run a batch of catalog searches; return one hit list per query."""
        return [self.app.search(q) for q in queries]

    def search_history(self) -> List[str]:
        """The list of queries the user has run, in order."""
        return list(self.app.search_history)

    def charge_monthly(self) -> Dict:
        """Process this month's charge; return the resulting invoice."""
        # Pre-refactor: charge_monthly is on StreamingApp.
        # Post-refactor: it lives on app.billing.
        return self.app.charge_monthly()

    def charge_annual(self) -> Dict:
        """Process the annual charge; return the resulting invoice."""
        return self.app.charge_annual()

    def invoices(self) -> List[Dict]:
        """All invoices accumulated so far."""
        # Pre-refactor: invoice_list is on StreamingApp.
        # Post-refactor: it lives on app.billing.
        return list(self.app.invoice_list)

test_streaming_app.py

"""Behavior tests for the streaming app, exercised entirely through `AppRunner`.

DESIGN — *every* test below goes through `AppRunner`. None of them touch
`app.charge_monthly()` or `app.billing.charge_monthly()` directly. Why?

The Extract Class refactor is going to relocate billing fields and methods
from `StreamingApp` to a new `BillingManager`. If a test poked at
`app.charge_monthly()` it would break the moment that method moves; if it
switched on `hasattr(app, 'billing')` to handle both shapes, the test would
be coupled to *which step of the refactor we're in*. Both options are bad
test design.

The right answer is to test through a stable surface. `AppRunner` exposes
the operations callers actually care about (`charge_monthly`, `invoices()`,
etc.) and keeps those signatures fixed. Internally, `AppRunner` is updated
when billing moves. Tests aren't.
"""
import pytest
from typing import Dict, List
from streaming_app import StreamingApp
from app_runner import AppRunner


@pytest.fixture
def app() -> StreamingApp:
    a: StreamingApp = StreamingApp(user_id="u42")
    a.track_index = {
        "t1": {"title": "Echoes"},
        "t2": {"title": "Echo Chamber"},
        "t3": {"title": "Bright Lights"},
    }
    return a


@pytest.fixture
def runner(app: StreamingApp) -> AppRunner:
    return AppRunner(app)


# ---- Catalog: search behavior ----

def test_run_searches_returns_matching_titles(runner: AppRunner) -> None:
    results: List[List[str]] = runner.run_searches(["echo"])
    assert sorted(results[0]) == ["t1", "t2"]


def test_search_history_records_each_query(runner: AppRunner) -> None:
    runner.run_searches(["echo", "bright"])
    assert runner.search_history() == ["echo", "bright"]


# ---- Billing: charge behavior ----

def test_charge_monthly_creates_invoice_with_method(runner: AppRunner) -> None:
    runner.configure_payment("visa-1234")
    invoice: Dict = runner.charge_monthly()
    assert invoice["amount"] == pytest.approx(9.99)
    assert invoice["method"] == "visa-1234"
    assert invoice["period"] == "monthly"


def test_charge_annual_creates_invoice_with_twelve_months(runner: AppRunner) -> None:
    runner.configure_payment("visa-1234")
    invoice: Dict = runner.charge_annual()
    assert invoice["amount"] == pytest.approx(9.99 * 12)
    assert invoice["period"] == "annual"


def test_invoices_grow_per_charge(runner: AppRunner) -> None:
    runner.configure_payment("visa-1234")
    runner.charge_monthly()
    runner.charge_monthly()
    assert len(runner.invoices()) == 2

Solution

streaming_app.py

"""The streaming app, after extracting BillingManager."""
from dataclasses import dataclass, field
from typing import Dict, List


class BillingManager:
    def __init__(self, subscription_tier: str = "free",
                 payment_method: str = "") -> None:
        self.subscription_tier = subscription_tier
        self.payment_method = payment_method
        self.invoice_list: List[Dict] = []

    def _create_invoice(self, period: str, multiplier: int) -> Dict:
        invoice = {"period": period, "amount": 9.99 * multiplier,
                   "method": self.payment_method}
        self.invoice_list.append(invoice)
        return invoice

    def charge_monthly(self) -> Dict:
        return self._create_invoice("monthly", 1)

    def charge_annual(self) -> Dict:
        return self._create_invoice("annual", 12)

    def send_invoice(self, invoice_index: int) -> bool:
        if invoice_index >= len(self.invoice_list):
            return False
        return True

    def notify_payment_due(self) -> str:
        return f"Payment of $9.99 due on {self.payment_method}"


class StreamingApp:
    def __init__(self, user_id: str) -> None:
        self.user_id = user_id
        self.track_index: Dict[str, dict] = {}
        self.search_history: List[str] = []
        self.recommendation_cache: Dict[str, List[str]] = {}
        self.billing = BillingManager()

    def search(self, query: str) -> List[str]:
        self.search_history.append(query)
        return [tid for tid, info in self.track_index.items()
                if query.lower() in info.get("title", "").lower()]

    def record_recommendation(self, seed: str, results: List[str]) -> None:
        self.recommendation_cache[seed] = results

app_runner.py

"""Drives daily flows over StreamingApp — post-refactor: billing reached via app.billing.X.

Notice: every public method below has the same signature as the starter file.
Only the *internal* call sites changed (to `self.app.billing.X`). That's why
`test_streaming_app.py` is byte-for-byte identical across the refactor.
"""
from typing import Dict, List
from streaming_app import StreamingApp


class AppRunner:
    """Drives daily flows over StreamingApp. Tests interact only with this surface."""

    def __init__(self, app: StreamingApp) -> None:
        self.app: StreamingApp = app

    def configure_payment(self, method: str) -> None:
        """Set the payment method used for subsequent charges."""
        self.app.billing.payment_method = method

    def run_searches(self, queries: List[str]) -> List[List[str]]:
        """Run a batch of catalog searches; return one hit list per query."""
        return [self.app.search(q) for q in queries]

    def search_history(self) -> List[str]:
        """The list of queries the user has run, in order."""
        return list(self.app.search_history)

    def charge_monthly(self) -> Dict:
        """Process this month's charge; return the resulting invoice."""
        return self.app.billing.charge_monthly()

    def charge_annual(self) -> Dict:
        """Process the annual charge; return the resulting invoice."""
        return self.app.billing.charge_annual()

    def invoices(self) -> List[Dict]:
        """All invoices accumulated so far."""
        return list(self.app.billing.invoice_list)

test_streaming_app.py

"""Behavior tests — IDENTICAL to the starter file. Tests do not change across
the Extract Class refactor because they go through `AppRunner`'s stable surface.
"""
import pytest
from typing import Dict, List
from streaming_app import StreamingApp
from app_runner import AppRunner


@pytest.fixture
def app() -> StreamingApp:
    a: StreamingApp = StreamingApp(user_id="u42")
    a.track_index = {
        "t1": {"title": "Echoes"},
        "t2": {"title": "Echo Chamber"},
        "t3": {"title": "Bright Lights"},
    }
    return a


@pytest.fixture
def runner(app: StreamingApp) -> AppRunner:
    return AppRunner(app)


def test_run_searches_returns_matching_titles(runner: AppRunner) -> None:
    results: List[List[str]] = runner.run_searches(["echo"])
    assert sorted(results[0]) == ["t1", "t2"]


def test_search_history_records_each_query(runner: AppRunner) -> None:
    runner.run_searches(["echo", "bright"])
    assert runner.search_history() == ["echo", "bright"]


def test_charge_monthly_creates_invoice_with_method(runner: AppRunner) -> None:
    runner.configure_payment("visa-1234")
    invoice: Dict = runner.charge_monthly()
    assert invoice["amount"] == pytest.approx(9.99)
    assert invoice["method"] == "visa-1234"
    assert invoice["period"] == "monthly"


def test_charge_annual_creates_invoice_with_twelve_months(runner: AppRunner) -> None:
    runner.configure_payment("visa-1234")
    invoice: Dict = runner.charge_annual()
    assert invoice["amount"] == pytest.approx(9.99 * 12)
    assert invoice["period"] == "annual"


def test_invoices_grow_per_charge(runner: AppRunner) -> None:
    runner.configure_payment("visa-1234")
    runner.charge_monthly()
    runner.charge_monthly()
    assert len(runner.invoices()) == 2

memo.md

# Refactoring memo — Step 7

## Issue
`StreamingApp` is a God Class — it mixes two distinct responsibility clusters
(catalog: track index, search history, recommendation cache; billing:
subscription tier, payment method, invoice list). The clusters share no fields
and almost no methods. Each cluster's methods touch only its own field-set.
A reader looking for "where is payment handled?" has to wade through unrelated
catalog state.

## Rationale
Extract Class along the responsibility seam: a new `BillingManager` owns the
billing fields and methods; `StreamingApp` keeps the catalog cluster and gains
a `self.billing` reference. The seam is recognizable from the field-sets each
method touches, not from method names. Field-then-method ordering keeps each
intermediate state runnable (methods that depend on fields are moved only after
their fields have moved).

## Invariant
The public API of `StreamingApp` (its `search`, `record_recommendation` methods
and the catalog state) is unchanged. The billing API has migrated under
`app.billing.X`. The `AppRunner` surface in `app_runner.py` mediates this:
its public methods keep the same signatures, so test code continues to call
`runner.charge_monthly()` rather than reaching into the migrating internals.

## Tests
The five `test_*_runner_*` tests in `test_streaming_app.py` go entirely through
`AppRunner`. Those signatures are stable across the refactor, so the test file
is byte-for-byte identical before and after. The cost of the Extract Class
refactor lives in `streaming_app.py` (the source of the move) and `app_runner.py`
(its internal call sites), not in the tests.

One fat box, two collaborating boxes. StreamingApp now has only catalog responsibilities. BillingManager owns subscription, payment, invoices, and the notify_payment_due method. Anything billing-related is one place.

Spacing callback paid off. Inside BillingManager, charge_monthly and charge_annual were extracted to share _create_invoice(period, multiplier). This used the Step-4 Rule of Three, but applied after the move so the helper lives where billing logic lives.

The unlabeled notify_payment_due belonged with billing — it reads payment_method, which is now a billing field. It moved to BillingManager.

Change locality scorecard.

Before: add a new payment method (e.g., “PayPal”) → edit StreamingApp (mixed with catalog code).
After: add a new payment method → edit BillingManager only. Catalog code is untouched.

Compare before/after. The inline UML showed one box with five methods and six fields. The live UML now shows two boxes connected by a --has--> relationship.

Pause and reckon: how many files did you edit?

Count: streaming_app.py (the source — fields and methods migrated to BillingManager) and app_runner.py (its internal call sites updated from self.app.X to self.app.billing.X). test_streaming_app.py is byte-for-byte identical before and after.

Why? Tests go through AppRunner’s stable methods — configure_payment, charge_monthly, invoices(). Those signatures haven’t changed; only their internal implementations have. This is exactly the same dynamic as Step 6: a stable test surface absorbs the refactor’s churn so tests don’t need to.

Compare the cost: Step 5 (changed public signature) → 2 file edits plus test edits. Step 6 + Step 7 (preserved public signatures) → 1 caller edit each, no test edits. The pattern is consistent: the cost of a refactor is the cost of fixing every place that depended on what changed. Stable interfaces minimize “every place.”

Step 7 — Knowledge Check

Min. score: 80%

1. Up-front classification. Which of the following are billing-only fields of the original StreamingApp? (select all that apply)

track_index
track_index is the catalog’s data — what tracks the app knows about. It stays on StreamingApp.
search_history
search_history is queried by the search method, also catalog. Stays.
subscription_tier
recommendation_cache
recommendation_cache is the recommendation engine’s data. Catalog cluster.
payment_method
invoice_list
user_id
user_id identifies the app instance overall (catalog and billing both reference it). Stays at the StreamingApp level.

Three billing fields: subscription_tier, payment_method, invoice_list. Identifying these before using the tool is the responsibility-recognition gate. Without it, you risk moving fields that don’t actually belong to the billing cluster.

2. If you were doing Extract Class manually (one Move Method/Field at a time, instead of with the all-at-once tool), what’s the safe order?

Move methods first, then fields
Methods that reference fields would break when their fields are still elsewhere. Method-first leaves a window where references are wrong.
Move fields first, then methods
Move them simultaneously to keep the class consistent
Simultaneous moves aren’t an option in stepwise refactoring — each move is a discrete edit.
The order doesn’t matter — the tool handles dependencies
An all-at-once Extract Class tool does handle ordering automatically. The question is about the discipline that applies when sequencing moves manually — which is what ‘safe order’ refers to.

Fields first, methods second. Methods depend on fields; reverse order breaks references during the intermediate state. Whether you use a single Extract Class action (which sequences internally) or chain Move Field + Move Method by hand, the underlying ordering rule is the same.

3. Spacing callback to Step 4. charge_monthly and charge_annual share most of their body. Should you Extract Function on them before moving them to BillingManager?

Yes — fewer methods to move = simpler refactoring
Extracting before moving creates the helper on StreamingApp. Then that helper has to be moved too. Same end state, twice the moves.
No — move first; extract after, so the helper lives in BillingManager
Yes — extracting first is required by the Rule of Three
Rule of Three doesn’t dictate ordering — it dictates whether to extract. Ordering is about minimizing rework.
No — the helper would have to be on StreamingApp anyway
Extracting first puts the helper on StreamingApp; extracting after moving puts it on BillingManager. Where the helper lives matters.

Move first, extract second. Same code shape, but the extracted helper ends up where billing logic actually belongs. This is a sequencing optimization — both orderings produce correct code, but one of them avoids redundant work.

4. What does NOT count as fixing the God Class smell, even if it makes the file look cleaner?

Extracting a BillingManager class with the billing fields and methods
This is the fix — that’s what this step did.
Renaming the methods so their names sort by cluster (bill_charge_monthly, etc.)
Moving the billing logic to a separate file
Moving to a separate file is a real structural change (it forces an explicit import and decouples module compilation). Often paired with extraction.
Splitting the class into a Catalog and a Billing class via Move Method/Field
Splitting into two classes is structural — the same fix, just with a different starting class kept.

Renaming is local cleanup, not decomposition. A class with 14 methods named with prefixes is still one class with 14 methods. The fields and methods remain coupled to the same self. Decomposition requires a new class to receive the migrated members. Watch for this confusion — “I cleaned it up by renaming” is a frequent self-deception about responsibility decomposition.

8

Replace Conditional with Polymorphism (with the tool)

Why this matters

An if track_kind == ... chain is a class hierarchy in disguise: each branch corresponds to a type, and every new type adds an elif. Polymorphism inverts this — each subclass owns its own behavior, and the dispatch becomes the language’s job, not yours. This step is also the first refactoring that runs through red: the safety dance changes shape when you’re declaring an interface before its bodies exist. Knowing when polymorphism is the right hammer (and when a small match is fine) is the senior judgment we’re after.

🎯 You will learn to

Analyze an if/elif chain over a type tag to recognize a polymorphism opportunity.
Apply Replace Conditional with Polymorphism by migrating each branch to a subclass.
Evaluate when not to use polymorphism (small, stable match statements).

🟥 Heads-up — the safety dance changes shape for this one step. Steps 1–7 ran green → change → green: tests stayed passing throughout. Step 8 deliberately starts from red. The starter declares Track as @abstractmethod and three empty subclasses; every subclass-specific test fails at construction because none of the subclasses override play yet. You’ll fill in each subclass’s play body, watching the failures clear one at a time. This is interface-first refactoring: declare the target shape, let tests fail loudly until the shape is real, finish green. The external behavior of the procedural play(track_kind, ...) wrapper is identical at the start and at the end — that’s the behavior-preserving promise — but the internal migration runs through red. The discipline is the same: tiny steps, tests as your map. The starting color is the only thing that’s different.

Open tracks.py. The play function has an if/elif chain over track_kind:

def play(track_kind: str, track: Track, user: User) -> str:
    if track_kind == "song":
        # Songs respect repeat mode
        ...
    elif track_kind == "podcast":
        # Podcasts auto-resume from last position
        ...
    elif track_kind == "audiobook":
        # Audiobooks respect playback speed
        ...
    else:
        raise ValueError(f"Unknown track kind: {track_kind}")

Three branches, three different rules, one flat Track data class. This is type-conditional dispatch — the function asks “what kind of thing am I?” and chooses behavior. Every time you add a new track kind, this function grows by one elif.

Initial state

Detailed description

UML class diagram with 1 class (Track). play_fn depends on Track labeled "reads".

Classes

Track — Attributes: track_id; duration_sec; repeat_mode; last_position; playback_speed — Operations: none declared

Relationships

play_fn depends on Track labeled "reads"

(One flat Track class with all the fields any kind of track might need; one play function with three branches dispatching on a string. After the refactor, Track becomes abstract and three subclasses each own their play behavior.)

A 30-second `abstractmethod` primer

Python’s abc module lets you declare a class as abstract — it can’t be instantiated directly. Subclasses must override the abstract methods to be instantiable.

from abc import ABC, abstractmethod

class Animal(ABC):
    @abstractmethod
    def speak(self) -> str:
        ...

class Dog(Animal):
    def speak(self) -> str:
        return "woof"

Animal()  # TypeError: Can't instantiate abstract class
Dog()     # OK

ABC (Abstract Base Class) is the marker. @abstractmethod says “every subclass MUST override this.” Together they enforce a contract at instantiation time, before any logic runs.

💡 We’re giving you the hierarchy shape on purpose. Designing class hierarchies from scratch is a separate skill (single-responsibility, Liskov substitution, type variance). For this step, we provide the abstract Track base + three subclass skeletons. Your task is the dispatch inversion — moving each play branch into the right subclass — not the hierarchy design.

How to apply this

The starter tracks.py already declares Track, Song, Podcast, Audiobook as a hierarchy with Track.play(user) as @abstractmethod. Run the test suite first — most tests will fail. The shape of those failures is your map for what to do next:

test_track_cannot_be_instantiated passes — the abstract base contract is wired correctly. Track(...) raises TypeError. ✓
test_subclass_dispatch_calls_subclass_play and every per-subclass behavior test fail at construction — none of Song, Podcast, Audiobook override play, so each SubclassName(...) raises TypeError: Can't instantiate abstract class … with abstract method 'play'. The error message names exactly what’s missing.

That uniform failure is @abstractmethod enforcement working as designed: the hierarchy refuses to instantiate any subclass that hasn’t fulfilled the contract. Each test you fix is one subclass acquiring its play body.

Now you have a punch list. For each branch of the original play(track_kind, ...) function, add a def play(self, user: User) -> str: to the corresponding subclass and move the body in (replacing track.X with self.X). Finally, replace the original play function body with return track.play(user).

Tool-wise, this is just Refactor: Move Method done three times — same flow you used in Steps 6 and 7. Some branches may be easier to copy/paste because the conditional pattern doesn’t always select cleanly for tool-driven moves; that’s fine.

Spacing callback — Step 6’s Feature Envy in disguise

Each if track_kind == "X" branch in the original play function is a bit of code that uses only track-kind-specific data. By Step 6’s diagnostic (“uses zero state of host class”), each branch is Feature Envy on the Track-of-that-kind. The polymorphism refactoring is a special case of Move Method: instead of moving one method to one class, we move N methods to N subclasses dispatched by type.

Comparison — change locality (round two)

How many files change to add a new Live track type?

Before: edit tracks.py (add an elif), edit any caller, possibly edit tests for completeness.
After: add a new Live(Track) subclass with its own play. The play function and its callers don’t change.

That’s the Open/Closed Principle — open for extension (new subclasses), closed for modification (existing dispatch unchanged).

When polymorphism is not the right answer

Three branches dispatching on a string-valued type can be a small match statement instead. Decide based on:

Does the variation extend? If the type set is closed (e.g., HTTP status code categories — there are exactly four), polymorphism is over-engineered. A match works.
Does each branch carry its own state? If yes, polymorphism wins (state goes into the subclass). If branches are stateless calculations, match is simpler.
Will subclasses share more behavior over time? If the kinds will accumulate shared methods, the hierarchy pays back. If they’ll stay independent, polymorphism is bookkeeping.

Starter files

tracks.py

"""The track hierarchy — to be built up via Replace Conditional with Polymorphism."""
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class User:
    user_id: str


@dataclass
class Track(ABC):
    track_id: str
    duration_sec: int
    # All possible per-kind state (used by various subclasses)
    repeat_mode: bool = False
    last_position: int = 0
    playback_speed: float = 1.0

    @abstractmethod
    def play(self, user: User) -> str:
        """Each subclass owns its play behavior."""
        ...


@dataclass
class Song(Track):
    # TODO: override play(self, user: User) -> str — songs respect repeat_mode.
    # If self.repeat_mode is True, return f"playing song {self.track_id} on repeat"
    # otherwise return f"playing song {self.track_id}"
    pass


@dataclass
class Podcast(Track):
    # TODO: override play(self, user: User) -> str — podcasts auto-resume from last_position.
    # Return f"resuming podcast {self.track_id} at {self.last_position}s"
    pass


@dataclass
class Audiobook(Track):
    # TODO: override play(self, user: User) -> str — audiobooks adjust playback speed.
    # Return f"playing audiobook {self.track_id} at {self.playback_speed}x speed"
    pass


# The original procedural dispatcher — to be collapsed.
def play(track_kind: str, track: Track, user: User) -> str:
    """Procedural dispatch. After polymorphism, this collapses to one line."""
    if track_kind == "song":
        if track.repeat_mode:
            return f"playing song {track.track_id} on repeat"
        else:
            return f"playing song {track.track_id}"
    elif track_kind == "podcast":
        return f"resuming podcast {track.track_id} at {track.last_position}s"
    elif track_kind == "audiobook":
        return f"playing audiobook {track.track_id} at {track.playback_speed}x speed"
    else:
        raise ValueError(f"Unknown track kind: {track_kind}")

memo.md

# Refactoring memo — Step 8

## Issue
<!-- TODO: name the smell. Why is the if/elif chain over track_kind a smell? -->


## Rationale
<!-- TODO: why does Replace Conditional with Polymorphism fix it?
     What does the post-state achieve that the pre-state can't? -->


## Invariant
<!-- TODO: behavior preservation — what does play(...) still guarantee
     for each track kind after the refactor? -->


## Tests
<!-- TODO: which tests confirm the invariant?
     Hint: dispatch micro-tests + per-subclass behavior tests. -->

playback_loop.py

"""Plays a queue of tracks one after another.

Currently each call site has to pass the kind string explicitly:
`play("song", song_instance, user)`. After Replace Conditional with
Polymorphism, every call here can collapse to `track.play(user)` —
one polymorphic call regardless of subclass. Watch the lines shrink.
"""
from typing import List
from tracks import Track, Song, Podcast, Audiobook, User, play


class PlaybackLoop:
    """Drives a sequence of plays for a given user."""

    def __init__(self, user: User) -> None:
        self.user: User = user

    def play_one_of_each(
        self, song: Song, podcast: Podcast, audiobook: Audiobook
    ) -> List[str]:
        """Play one of each kind, returning the result strings.

        After polymorphism, this method can shrink to a single list
        comprehension over `[song, podcast, audiobook]` calling
        `t.play(self.user)` directly.
        """
        return [
            play("song", song, self.user),
            play("podcast", podcast, self.user),
            play("audiobook", audiobook, self.user),
        ]

test_tracks.py

"""Behavior tests for the polymorphic Track hierarchy and PlaybackLoop caller."""
import pytest
from typing import List
from tracks import Track, Song, Podcast, Audiobook, User, play
from playback_loop import PlaybackLoop


# ----- Dispatch micro-tests (run first) -----

def test_track_cannot_be_instantiated() -> None:
    """Track is abstract; instantiation should fail."""
    with pytest.raises(TypeError):
        Track(track_id="t0", duration_sec=10)


def test_subclass_dispatch_calls_subclass_play() -> None:
    """Calling play on a subclass instance dispatches to that subclass's method."""
    s: Song = Song(track_id="s1", duration_sec=180)
    p: Podcast = Podcast(track_id="p1", duration_sec=2400, last_position=300)
    # If the subclass overrides correctly, these don't error and don't return None.
    assert s.play(User("u1")) is not None
    assert p.play(User("u1")) is not None


# ----- Per-subclass behavior tests -----

def test_song_default_returns_plain_play_message() -> None:
    s: Song = Song(track_id="s1", duration_sec=180)
    assert s.play(User("u1")) == "playing song s1"


def test_song_with_repeat_mode_returns_repeat_message() -> None:
    s: Song = Song(track_id="s2", duration_sec=200, repeat_mode=True)
    assert s.play(User("u1")) == "playing song s2 on repeat"


def test_podcast_resumes_from_last_position() -> None:
    p: Podcast = Podcast(track_id="p1", duration_sec=2400, last_position=300)
    assert p.play(User("u1")) == "resuming podcast p1 at 300s"


def test_audiobook_uses_playback_speed() -> None:
    a: Audiobook = Audiobook(track_id="a1", duration_sec=3600, playback_speed=1.5)
    assert a.play(User("u1")) == "playing audiobook a1 at 1.5x speed"


# ----- The dispatcher collapses to a one-liner -----

def test_play_function_delegates_to_subclass() -> None:
    s: Song = Song(track_id="s1", duration_sec=180)
    # The procedural play() should now just call track.play(user).
    # Both call styles must agree.
    assert play("song", s, User("u1")) == s.play(User("u1"))


# ---- Caller test: PlaybackLoop must keep working across the polymorphism refactor ----

def test_playback_loop_runs_each_track_kind() -> None:
    loop: PlaybackLoop = PlaybackLoop(User("u1"))
    s: Song = Song(track_id="s1", duration_sec=180)
    p: Podcast = Podcast(track_id="p1", duration_sec=2400, last_position=300)
    a: Audiobook = Audiobook(track_id="a1", duration_sec=3600, playback_speed=1.5)
    results: List[str] = loop.play_one_of_each(s, p, a)
    assert results == [
        "playing song s1",
        "resuming podcast p1 at 300s",
        "playing audiobook a1 at 1.5x speed",
    ]

Solution

tracks.py

"""The track hierarchy — polymorphic play."""
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class User:
    user_id: str


@dataclass
class Track(ABC):
    track_id: str
    duration_sec: int
    repeat_mode: bool = False
    last_position: int = 0
    playback_speed: float = 1.0

    @abstractmethod
    def play(self, user: User) -> str:
        ...


@dataclass
class Song(Track):
    def play(self, user: User) -> str:
        if self.repeat_mode:
            return f"playing song {self.track_id} on repeat"
        return f"playing song {self.track_id}"


@dataclass
class Podcast(Track):
    def play(self, user: User) -> str:
        return f"resuming podcast {self.track_id} at {self.last_position}s"


@dataclass
class Audiobook(Track):
    def play(self, user: User) -> str:
        return f"playing audiobook {self.track_id} at {self.playback_speed}x speed"


def play(track_kind: str, track: Track, user: User) -> str:
    """The conditional collapses to one polymorphic call."""
    return track.play(user)

playback_loop.py

"""Plays a queue of tracks — post-polymorphism: callers shrink to one line."""
from typing import List
from tracks import Track, Song, Podcast, Audiobook, User


class PlaybackLoop:
    """Drives a sequence of plays for a given user."""

    def __init__(self, user: User) -> None:
        self.user: User = user

    def play_one_of_each(
        self, song: Song, podcast: Podcast, audiobook: Audiobook
    ) -> List[str]:
        """One polymorphic call per track — no kind dispatch needed."""
        return [t.play(self.user) for t in (song, podcast, audiobook)]

memo.md

# Refactoring memo — Step 8

## Issue
The procedural `play(track_kind, track, user)` function dispatches on a string
type field via an `if/elif` chain. Every new track kind requires editing the
same function (Open/Closed violation). Each branch reads/writes only fields
of its corresponding kind — Feature Envy generalized across three subtypes.

## Rationale
Replace Conditional with Polymorphism: define a `Track` abstract base with
an abstract `play(self, user)` method, then move each branch's logic into the
corresponding subclass (`Song`, `Podcast`, `Audiobook`). The conditional
collapses to `track.play(user)` — one polymorphic call. Adding a new track
kind now means adding a new subclass, not editing the dispatcher.

## Invariant
For every existing combination of `(track_kind, track, user)`, the return
value of `play(track_kind, track, user)` is unchanged after the refactor.
Songs respect repeat_mode, podcasts auto-resume from last_position, audiobooks
adjust playback_speed — all the same observable behavior, dispatched through
polymorphism instead of conditional.

## Tests
`test_song_default_returns_plain_play_message`, `test_song_with_repeat_mode_returns_repeat_message`,
`test_podcast_resumes_from_last_position`, `test_audiobook_uses_playback_speed`
confirm per-subclass behavior. `test_play_function_delegates_to_subclass` confirms
the procedural-to-polymorphic delegation. `test_track_cannot_be_instantiated`
and `test_subclass_dispatch_calls_subclass_play` confirm the contract is enforced.

Three branches, three subclasses, one polymorphic call. The play function dropped from 13 lines of conditional dispatch to a single line. Each branch’s logic now lives on the subclass that owns the relevant state.

The productive failure paid off. Audiobook had no play method initially. Python raised TypeError: Can't instantiate abstract class Audiobook with abstract method play. The error message told you exactly what to add. That’s the value of @abstractmethod — it makes the missing implementation a startup error, not a runtime error somewhere later.

The dispatch contract was enforced from the start by @abstractmethod. test_track_cannot_be_instantiated passed immediately — the abstract base refused construction. The other dispatch tests failed in informative ways: stub bodies returned None, and the missing Audiobook.play raised TypeError at instantiation. Each failure pointed at the exact next step. By the time the migrations completed, all dispatch tests went green together.

Compare before/after. The inline UML showed one flat Track and a procedural play function with three branches. The live UML now shows Track (abstract) at the top with three concrete subclasses below, each with its own play. Adding a Live track type later requires a new subclass — the play function and existing tests don’t change. That’s Open/Closed.

Pause and reckon: the test file again.

Look at test_tracks.py. The per-subclass tests like test_song_default_returns_plain_play_message call s.play(User("u1")) directly. The dispatcher test calls the procedural play("song", s, User("u1")). Both shapes existed before the refactor; both still exist after. The refactor moved logic into the subclass methods, but the methods themselves were already declared.

That’s why the tests didn’t need to change: their interfaces (Song.play, Podcast.play, Audiobook.play, the procedural shim play) are the target of the refactor, not casualties of it. We declared the surface up front (the abstract base + concrete stubs), wrote tests against it, then filled in the bodies. Tests went red while bodies were stubs, then green as bodies arrived.

This is the interface-first approach to refactoring: design the public surface you want, write tests against it, then refactor the implementation to match. Tests stay green throughout because they were written against the destination, not the origin. Step 5’s lesson said “stable interfaces save you from editing tests across signature changes.” Step 8’s lesson is the same idea spelled differently: design the destination interface first, and the tests automatically become a refactor safety net rather than a migration burden.

Step 8 — Knowledge Check

Min. score: 80%

1. Retrieval / dispatch. When play(user) is called on a Podcast instance, which method runs?

Track.play (the abstract base method)
The abstract method’s body never runs — it’s a contract, not an implementation. Subclasses must override; the override is what runs.
Podcast.play (the override)
Both — Track.play runs first, then Podcast.play
MRO doesn’t ‘chain’ calls. Without an explicit super().play(user) in the override, only the override runs.
Neither — abstract methods raise NotImplementedError
Abstract methods raise TypeError at instantiation, not NotImplementedError at call. And only if no override exists.

The override runs. Podcast.play overrides Track.play, so the dynamic dispatch from track.play(user) resolves to Podcast.play. This is the same mechanism C++ calls “virtual”; Python’s methods are virtual by default.

2. Is polymorphism always the right fix for if/elif chains over a type field?

Yes — type-conditionals are always a smell
Always-rules in design are usually wrong. Type-conditionals over closed, stable, stateless type sets (e.g., HTTP status code categories: informational/success/redirect/error) are fine as match.
No — for a small closed type set with no shared state, match is simpler.
Yes — Python’s match statement is just syntactic sugar for polymorphism
match and polymorphism are different mechanisms. match is structural pattern matching at the call site; polymorphism is dynamic dispatch via inheritance. They have different change-locality properties.
Only if the conditional is more than 5 branches long
Length is a heuristic, not a rule. Three branches that each carry state can be polymorphism; ten branches that each return one value can be match.

It depends on extension and state. Polymorphism wins when the type set will grow and each type carries its own state. match wins when the type set is closed and the branches are pure dispatch. Both can be valid; choose based on the change vector.

3. What does @abstractmethod actually enforce?

Subclasses must override it; otherwise instantiation raises TypeError.
The method body must be empty
The body can have any content (often a docstring or ...). The body is sometimes used as a default implementation called from a subclass via super().
Subclasses cannot call super() on an abstract method
Subclasses can call super().method() on an abstract method to access a default implementation, if the abstract base provided one.
The method can only be called via the abstract base class
Once a subclass overrides the abstract method, calling on the subclass instance dispatches normally.

Enforcement is at instantiation. @abstractmethod makes the abstract base un-instantiable; subclasses without all required overrides are also un-instantiable. The error message names the missing methods, which is far better than discovering at call time that the method does nothing.

4. Spacing back to Step 6. Why is moving each conditional branch into its corresponding subclass a special case of Move Method?

It’s not — polymorphism is a different refactoring entirely
It is the same shape: a method body that uses only foreign data is moved closer to that data. Polymorphism just generalizes ‘foreign class’ to ‘subtype.’
Each branch uses only data of one kind of track — Feature Envy on the subclass
Move Method requires concrete classes; abstract subclasses don’t qualify
Move Method works fine across an inheritance hierarchy. The receiving classes are concrete subclasses (Song, Podcast, Audiobook).
Polymorphism is a structural change; Move Method is a behavioral change
Both refactorings are structural — they change where code lives without changing what it does. Behavior preservation is the rule for both.

Same diagnostic, different host. Step 6’s Feature Envy (“uses zero state of host class”) generalized: each branch of the conditional uses only state of its respective subclass. Polymorphism is Move Method dispatched by type.

9

Hotspots & The Boy Scout Rule

Why this matters

With six refactorings under your belt, every line of code starts to look like a nail. It isn’t. “Always refactor” produces speculative generality; “never refactor” lets hotspots compound debt until the code becomes write-only. This step is the calibration: where investment of refactoring effort earns its keep, and where it’s a tax on every future reader. The empirical data make the trade-off concrete — smelly hotspots have 4–5× the change-proneness of clean code, but a smelly cold file is just noise.

🎯 You will learn to

Evaluate a piece of code to decide whether refactoring is worth the cost now.
Analyze a candidate refactoring for speculative generality or premature abstraction.
Apply the hotspot rule and the Boy Scout Rule as complementary heuristics.

Eight steps in, you’ve learned six refactorings (Extract Function, Boolean simplification, parameterised Extract, Introduce Parameter Object, Move Method, Extract Class) and one big idea (Replace Conditional with Polymorphism). With this much hammering, every line of code starts to look like a nail.

It isn’t.

Two anti-rules

Speculative generality is the smell of refactoring for an extension that never comes. A class StripePaymentPlugin(PaymentPlugin) with no second implementation, no plan for one, and no tests demonstrating multi-plugin behavior is just complexity. The abstraction might be useful if you ever need a second plugin. Most of the time, you don’t — and the code is harder to read while you wait for that future use.

Premature abstraction (a sibling smell) is over-eagerly extracting on the second occurrence of a pattern instead of the third. The Step-4 quiz hit this: with two duplicates the variation might be obvious or might be misleading. The Rule of Three exists because the third occurrence reveals what’s truly common vs. accidentally similar.

Both anti-rules are reactions to the same human pattern: developers like to feel “future-proof,” and refactoring tools make abstraction cheap. The cost of unused abstraction is paid daily by every reader who has to mentally instantiate the abstract types in their head before understanding the concrete code.

The hotspot rule

A hotspot is code that is already changing — high churn, frequent bug fixes, recent feature work. Refactoring buys you the most when applied to hotspots:

You’re going to read the code anyway (because you’re changing it).
The cleanup is amortized over many future changes (because the code keeps changing).
The behavior preservation tests are most likely to exist on hotspot code (because someone wrote them recently).

Refactoring cold code — files nobody has touched for two years — is rarely worth the time. The code “looks ugly,” but nobody’s reading it; nobody’s changing it; nobody benefits from the cleanup. The cost is paid; the benefit isn’t realized.

The opposite extreme is also wrong. “Never refactor” means every hotspot accrues debt that compounds with every change — bug fixes take longer, new features ship slower, the team’s tolerance for the code degrades. Eventually the codebase becomes the kind of thing you can only rewrite, not edit. That’s not a virtue; it’s a failure mode.

The Boy Scout Rule

Robert C. Martin’s formulation: “Always leave the campground cleaner than you found it.” Applied to code: small, incremental, concurrent improvements bundled with whatever change is already happening — one Extract Method, one renamed variable, one collapsed boolean. The Boy Scout rule (clean while changing) and the hotspot rule (clean where changing) are the same principle from two angles, and together they reject “always” and “never” in favor of opportunistic improvement.

A misconception to avoid: “any cleaner loop is interchangeable”

One refactoring that looks innocuous but routinely breaks behavior: replacing an indexed for loop with a for x in iterable loop without checking what the index was doing. The classic example:

# Original: sums every other element starting at index 0
total = 0
for i in range(0, len(values), 2):
    total += values[i]

# "Cleaner" but broken: now sums every element
total = 0
for v in values:
    total += v

The two loops look similar, the linter might even prefer the second, and the tests on small inputs may pass. But the index-stride semantics is gone. Documented in Oliveira/Keuning/Jeuring (2023) as one of the most common refactoring misconceptions among CS students: students replace for i in range(...) with for v in iterable because the latter is “more Pythonic,” without checking whether the original index pattern was load-bearing.

The rule: before you simplify a loop, ask what the original was doing with the index. If the answer is “nothing” — the index was unused or only used to access iterable[i] sequentially — the simplification is safe. If the index was strided, reversed, paired with another sequence, or used for enumerate-like positional logic, the simplification is a behavior change.

Putting numbers on it

One large empirical study (Palomba et al., 2018) tracked 395 releases of 30 projects. Smelly classes had a median change-proneness of 32 vs. 12 for non-smelly ones; fault-proneness of 9 vs. 3. When three smells co-occurred in a class, change-proneness rose to a median of 54 and faults to 12. The data say: smells are real, smells in hotspots are expensive, and the strongest ROI is fixing co-occurring smells in churning code.

No code task this step — read, internalize, then quiz. Step 10 (the next one) puts the judgment to work on a snippet with multiple smells coexisting.

Starter files

Solution

The summary.

“Always refactor” produces speculative generality and premature abstraction. The cost (reader load, complexity) is paid daily; the benefit (extension capacity) often never arrives.
“Never refactor” lets hotspots accrue debt that compounds with every future change. Eventually the codebase becomes write-only — fixable only by rewrite.
The middle path — refactor in hotspots, while you’re already changing the file — captures most of the benefit and almost none of the cost.
When in doubt: extract on the third occurrence, not the second. Wait for evidence before generalizing.

Step 9 — Knowledge Check

Min. score: 80%

1. Retrieval. Which of these scenarios would you defer refactoring for? (select all that apply)

Code in a file untouched for two years that isn’t on the roadmap
Code you’re about to add a feature to next sprint
Code about to be changed is the textbook hotspot — refactor while editing it. Boy Scout rule.
Code with frequent bug fixes in the last six months
Frequent bug fixes are a strong hotspot signal. The smelly structure is causing the bugs; refactoring reduces future bug rates.
Code with no tests
Lack of tests is a concern before refactoring (write characterization tests first), but not a reason to defer indefinitely. Address the test gap and then refactor.
A class with three different smells that co-occur
Co-occurring smells in one class are exactly what the empirical research flags as the highest-cost configuration. Don’t defer; address.

Cold code is the only one to defer. Hotspots, soon-to-be-touched code, and co-occurring-smell hotspots all justify refactoring effort. Lack of tests is a side-quest (write tests first), not a deferral.

2. What’s wrong with the rule “always refactor when you see a smell”?

It produces too much code churn
Churn is a consequence of the deeper problem. The deeper problem is that not every smell warrants a fix in every context.
It causes speculative generality — abstractions for absent extensions
It’s too lenient — the rule should be stricter
Stricter rules don’t help — strict + wrong produces speculative generality faster.
Smells are subjective; you can’t always agree what’s a smell
Smell subjectivity is real but it’s not the failure mode of ‘always refactor.’ The failure mode is that some smells are not actually problems in their context.

Speculative generality and premature abstraction. “Always refactor” produces abstractions that anticipate extension that never arrives. The reader pays the cost of the abstraction every day; the imagined benefit may never come.

3. What’s wrong with the rule “never refactor working code”?

It produces ugly code that’s hard to read
True, but partial — ugliness alone isn’t the structural problem.
It lets hotspots accrue debt that compounds with every change
True — and the most actionable version. Hotspot code keeps changing; without cleanup, each change becomes harder than the last.
It violates the Boy Scout Rule
True — Robert Martin’s rule says leave each campground (any file being touched) cleaner. ‘Never refactor’ is the opposite of that discipline.
All of the above

All three are facets of the same failure. Pure “never refactor” produces compounding debt in exactly the places where reducing it pays back the most.

4. The Boy Scout Rule, applied to code, says:

Refactor every file you touch, completely, before committing
Complete refactors are big risky changes. The Boy Scout rule is small improvements bundled with whatever work was already happening.
Leave each file you touch a little cleaner than you found it
Never commit code that lints with warnings
Linter compliance is a different discipline (style, not structure). The Boy Scout rule is about structural quality, not surface style.
Always rewrite code that you don’t understand
Rewriting unfamiliar code is dangerous — there’s no safety net of a working baseline. Read, write tests, then refactor incrementally.

Small, incremental, concurrent. The Boy Scout rule trades coordinated big-bang refactors for many small per-task improvements. Combined with the hotspot rule, it covers most of the value of refactoring at almost none of the coordination cost.

5. Loop-replacement misconception. Consider:

# Original
total = 0
for i in range(0, len(values), 2):
    total += values[i]

A reviewer suggests replacing it with:

# Suggested
total = 0
for v in values:
    total += v

Is the suggested replacement a safe refactoring?

Yes — it’s the same loop with cleaner syntax
The two loops look similar but the original uses range(0, len(values), 2) — the third argument is the stride. The replacement loses that, summing every element instead of every other one.
Yes — the type checker would catch any difference
Type checkers detect type errors, not semantic ones. Both loops have the same types in and out; the difference is in which elements get summed.
No — the original strides by 2; the rewrite sums all elements
Only if values is a list, not a tuple
The container type is unrelated to the stride semantics. The bug is the same whether values is a list, tuple, or generator.

Strided loops aren’t interchangeable with foreach. Before simplifying a loop, ask what the index was doing. Strided, reversed, paired-with-another-sequence, or used-for-position — all four patterns break under naive for x in iterable replacement. This is a documented misconception (Oliveira/Keuning/Jeuring 2023) that survives all the way into professional code.

10

The Refactoring Memo (Synthesis)

Why this matters

Real code never gives you one smell at a time. The synthesis test is whether you can spot multiple smells coexisting, choose which refactoring to apply first based on the dependencies between them, and defend the choice in writing before any code moves. That last part — defending the choice in a memo — is the discipline that separates a senior engineer from a tool-driven button-pusher. The tool will let you refactor anything; the memo is what makes you ask whether you should.

🎯 You will learn to

Analyze unfamiliar code to diagnose multiple coexisting smells.
Evaluate which refactoring to apply first based on smell dependencies.
Create a four-field refactoring memo (Issue, Cure, Risk, Confidence) before touching any code.

Open playlist.py. The class PlaylistManager has multiple smells stacked together. Your job is to (a) diagnose them, (b) choose one refactoring to apply, (c) write the memo, and only then (d) execute the refactoring with the tool.

Warm-up — interleaved diagnosis

Before you look at playlist.py, name the smell and refactoring that fits each of the four snippets below. No code changes — just diagnose. Each snippet has exactly one dominant smell from earlier steps.

Snippet A

def shuffle_score(
    plays: int,
    age: int,
    last_played: int,
    mood: str,
    weather: str,
    time_of_day: str,
) -> float:
    base: float = 0.0
    if mood == "energetic" and weather == "sunny":
        base = plays / age
    elif mood == "calm":
        base = plays / (age + last_played)
    # ... 8 more elifs ...
    return base

Snippet B

def update_metadata(
    track: Track,
    title: str,
    artist: str,
    album: str,
    year: int,
    genre: str,
    label: str,
) -> None:
    track.title = title
    track.artist = artist
    ...

Snippet C

class Statistics:
    def avg_track_length(self, tracks: List[Track]) -> float:
        total: int = sum(t.duration_sec for t in tracks)
        return total / len(tracks)

Snippet D

def is_hi_res(track: Track) -> bool:
    if track.bitrate >= 1411:
        return True
    else:
        return False

Snippet E (the foil)

class StripeProcessor:
    def charge(self, amount_cents: int, currency: str) -> bool:
        return self._gateway.charge(amount_cents, currency)

class PayPalProcessor:
    def charge(self, amount_cents: int, currency: str) -> bool:
        return self._gateway.charge(amount_cents, currency)

Two near-identical classes — same method shape, same body. Tempting to extract a base class or polymorphism, but this might be fine. Each processor has its own SDK, error handling, and configuration that isn’t shown here. The visual similarity is shallow.

Hold all five diagnoses in your head as you read on. The quiz will check whether you can discriminate real smells from looks-similar-but-fine.

The synthesis snippet — `playlist.py`

PlaylistManager.add_to_playlist(...) is the function we’ll refactor. Smells stacked in it (in plain English):

Long Parameter List — eight parameters, several of which travel as a clump.
Long Method — three sub-goals (validation, deduplication, persistence) jammed together.
Duplicated Code — the validation block is duplicated in add_to_queue.

Two plausible refactorings — pick ONE

Option	What it tackles first	Trade-off
A: Introduce Parameter Object on the eight parameters first	Long Parameter List	Cleaner signatures, but the duplicated validation block is still duplicated
B: Extract Function on the validation block first	Long Method + Duplication	Smaller methods, both `add_to_playlist` and `add_to_queue` deduped, but the eight-parameter signature is still long

Either choice is defensible. The tutorial doesn’t grade which choice you make — it grades the memo you write to defend it.

The memo — fully blank this time

Open memo.md. All four sections are empty. You write all four:

Issue — name the smell(s) in plain English.
Rationale — explain why your chosen refactoring is the right first step. (You can mention follow-up refactorings, but commit to one first move.)
Invariant — what behavior must be preserved across the refactor?
Tests — which tests in test_playlist.py confirm the invariant?

A reference card at the bottom of memo.md defines each field in one sentence in case you need to refresh.

Then execute

After writing the memo, invoke the appropriate tool action:

For Option A: place cursor in the parameter list, Refactor: Introduce Parameter Object, name it TrackInfo, bundle the four album-related parameters.
For Option B: select the validation block in add_to_playlist, Refactor: Extract Function/Method, name it _validate_track_data. Then replace the duplicated block in add_to_queue with a call to the new helper.

Run the tests after the refactor. They must still pass.

Pause and reckon: which option costs more, and why?

Before you finish this step, look at the test file. Notice that none of the tests call pm.add_to_playlist(...) or pm.add_to_queue(...) directly. They all go through TrackImporter. The tests don’t depend on which option you picked.

Now think about what did have to change for each option:

Option B (Extract Function) preserves the public signatures of add_to_playlist and add_to_queue. The internal validation logic moves into a private _validate_track_data helper. No callers change. importer.py is untouched. test_playlist.py is untouched. This is the same dynamic you saw in Step 6 (Move Method) and Step 7 (Extract Class) — the refactor stayed inside the class’s existing public surface.
Option A (Introduce Parameter Object) changes the public signatures of add_to_playlist and add_to_queue. importer.py’s _add_to_*_one helpers MUST be updated to construct a TrackInfo for the new shape. This is the same dynamic as Step 5 — bundling parameters means every caller has to follow.

In Step 5 the cost landed on the test file. Here, because the tests already go through a stable wrapper, the cost lands on the wrapper instead. That’s the lesson restated one more time: a refactor’s cost is paid wherever the changed signature is referenced. Tests are exempt only when they reference some other stable interface that absorbs the change.

Take 30 seconds before moving on: which option did you pick, and which file(s) did that choice force you to edit beyond playlist.py? Add one line to your memo’s Rationale naming those files.

What’s next after this tutorial

You’ve covered method-level smells (Long Method, Duplication, Boolean anti-patterns), data-level smells (Long Parameter List, Feature Envy), and class-level smells (God Class, type-conditional dispatch). The next layer is design principles — the rules of thumb that make smells less likely to appear in the first place. Two suggested follow-ups:

SOLID design principles tutorial — single-responsibility, open/closed, Liskov substitution, interface segregation, dependency inversion. The vocabulary the smells in this tutorial implicitly invoked.
Observer pattern tutorial — one of many design patterns; an example of how design principles solidify into reusable structural recipes.

Starter files

playlist.py

"""The playlist manager — multiple smells stacked together."""
from typing import Dict, List


class PlaylistManager:
    def __init__(self) -> None:
        self.playlist: List[Dict] = []
        self.queue: List[Dict] = []

    def add_to_playlist(
        self,
        title: str,
        artist: str,
        album: str,
        year: int,
        genre: str,
        duration_sec: int,
        bpm: int,
        isrc: str,
    ) -> Dict:
        # Validation
        if not title or not isinstance(title, str):
            raise ValueError("title is required")
        if duration_sec <= 0:
            raise ValueError("duration_sec must be positive")
        if bpm <= 0:
            raise ValueError("bpm must be positive")

        # Deduplication
        for existing in self.playlist:
            if existing["isrc"] == isrc:
                return existing

        # Persistence
        record: Dict = {
            "title": title, "artist": artist, "album": album,
            "year": year, "genre": genre, "duration_sec": duration_sec,
            "bpm": bpm, "isrc": isrc,
        }
        self.playlist.append(record)
        return record

    def add_to_queue(
        self,
        title: str,
        artist: str,
        album: str,
        year: int,
        genre: str,
        duration_sec: int,
        bpm: int,
        isrc: str,
    ) -> Dict:
        # Validation (DUPLICATED from add_to_playlist)
        if not title or not isinstance(title, str):
            raise ValueError("title is required")
        if duration_sec <= 0:
            raise ValueError("duration_sec must be positive")
        if bpm <= 0:
            raise ValueError("bpm must be positive")

        record: Dict = {
            "title": title, "artist": artist, "album": album,
            "year": year, "genre": genre, "duration_sec": duration_sec,
            "bpm": bpm, "isrc": isrc,
        }
        self.queue.append(record)
        return record

importer.py

"""Bulk-imports tracks into a PlaylistManager.

If the chosen refactoring is Option A (Introduce Parameter Object), every
call site here will need to pass a `TrackInfo` instead of four flat album
fields — exactly the same kind of cost you paid in Step 5 with `AlbumInfo`.
If the chosen refactoring is Option B (Extract Function on validation),
the call shape here stays the same and only `playlist.py` changes.

Tests in `test_playlist.py` go entirely through `TrackImporter`'s
`import_to_playlist` and `import_to_queue` methods. Their signatures are
stable across either option.
"""
from typing import Any, Dict, List
from playlist import PlaylistManager


class TrackImporter:
    """Bulk-imports a list of track records into a playlist or queue."""

    def __init__(self, manager: PlaylistManager) -> None:
        self.manager: PlaylistManager = manager

    def import_to_playlist(self, tracks: List[Dict[str, Any]]) -> List[Dict]:
        """Add each record to the underlying playlist; return the inserted records."""
        return [self._add_to_playlist_one(t) for t in tracks]

    def import_to_queue(self, tracks: List[Dict[str, Any]]) -> List[Dict]:
        """Add each record to the underlying queue; return the inserted records."""
        return [self._add_to_queue_one(t) for t in tracks]

    # ---- Internal call sites: these may need to change after the refactor ----

    def _add_to_playlist_one(self, t: Dict[str, Any]) -> Dict:
        return self.manager.add_to_playlist(
            title=t["title"],
            artist=t["artist"],
            album=t["album"],
            year=t["year"],
            genre=t["genre"],
            duration_sec=t["duration_sec"],
            bpm=t["bpm"],
            isrc=t["isrc"],
        )

    def _add_to_queue_one(self, t: Dict[str, Any]) -> Dict:
        return self.manager.add_to_queue(
            title=t["title"],
            artist=t["artist"],
            album=t["album"],
            year=t["year"],
            genre=t["genre"],
            duration_sec=t["duration_sec"],
            bpm=t["bpm"],
            isrc=t["isrc"],
        )

test_playlist.py

"""Synthesis tests — exercised entirely through `TrackImporter`'s stable surface.

DESIGN NOTE — *every* test below goes through `importer.import_to_playlist`
or `importer.import_to_queue`. None of them call `pm.add_to_playlist(...)`
or `pm.add_to_queue(...)` directly, and none of them branch on which
option you picked. The whole point of the synthesis is to apply what
Steps 5–7 taught:

  - Step 5: when a public signature changes, ALL callers (including tests)
    have to change with it. That cost is real.
  - Steps 6, 7: when a refactor stays *behind* a stable interface, tests
    don't need to change.

For Step 10:
  - Option A (Introduce Parameter Object) IS a public-signature change
    on `add_to_playlist` / `add_to_queue`. The test surface — `TrackImporter`'s
    public methods — stays stable, so the *tests* don't change. But
    `importer.py`'s internal `_add_to_*_one` helpers DO need updating
    (they construct the new TrackInfo). Same cost as Step 5, paid inside
    the importer instead of inside the tests.
  - Option B (Extract Function on validation) keeps `add_to_playlist` /
    `add_to_queue` signatures stable. Both `importer.py` and the tests
    are byte-for-byte unchanged.
"""
import pytest
from typing import Any, Dict, List
from playlist import PlaylistManager
from importer import TrackImporter


@pytest.fixture
def pm() -> PlaylistManager:
    return PlaylistManager()


@pytest.fixture
def importer(pm: PlaylistManager) -> TrackImporter:
    return TrackImporter(pm)


def _track_record(**overrides: Any) -> Dict[str, Any]:
    base: Dict[str, Any] = dict(
        title="Echoes", artist="Alice", album="Reflections", year=2022,
        genre="indie", duration_sec=200, bpm=110, isrc="ISRC001",
    )
    base.update(overrides)
    return base


def test_import_to_playlist_returns_inserted_records(importer: TrackImporter) -> None:
    records: List[Dict[str, Any]] = [_track_record(isrc="A"), _track_record(isrc="B")]
    inserted: List[Dict] = importer.import_to_playlist(records)
    assert len(inserted) == 2
    assert inserted[0]["title"] == "Echoes"
    assert inserted[0]["isrc"] == "A"
    assert inserted[1]["isrc"] == "B"


def test_import_to_playlist_appends_to_underlying_list(
    importer: TrackImporter, pm: PlaylistManager,
) -> None:
    importer.import_to_playlist([_track_record(isrc="A"), _track_record(isrc="B")])
    assert len(pm.playlist) == 2


def test_import_to_playlist_validates_title(importer: TrackImporter) -> None:
    with pytest.raises(ValueError, match="title"):
        importer.import_to_playlist([_track_record(title="")])


def test_import_to_playlist_validates_duration(importer: TrackImporter) -> None:
    with pytest.raises(ValueError, match="duration"):
        importer.import_to_playlist([_track_record(duration_sec=0)])


def test_import_to_playlist_dedupes_by_isrc(
    importer: TrackImporter, pm: PlaylistManager,
) -> None:
    importer.import_to_playlist([_track_record(isrc="X"), _track_record(isrc="X")])
    assert len(pm.playlist) == 1


def test_import_to_queue_validates_title(importer: TrackImporter) -> None:
    with pytest.raises(ValueError, match="title"):
        importer.import_to_queue([_track_record(title="")])

memo.md

# Refactoring memo — Step 10

## Issue
<!-- Name the smell(s) you see in playlist.py. There are at least three. -->


## Rationale
<!-- Which refactoring will you apply FIRST? Why is that the right first step
     given the smells you named? You may mention follow-up refactorings. -->


## Invariant
<!-- What property of behavior must be preserved across the refactor?
     (Hint: external API contract — same inputs produce same outputs.) -->


## Tests
<!-- List which tests in test_playlist.py confirm the invariant. -->


---

### Reference card

| Field | One-sentence definition |
|---|---|
| **Issue** | The smell present in the original code. |
| **Rationale** | Why this refactoring is the right fix, given the smell. |
| **Invariant** | The behavior property preserved across the refactor. |
| **Tests** | The tests that confirm the invariant. |

Solution

playlist.py

"""Playlist manager — Option B: Extract Function on the validation block."""
from typing import List, Dict


class PlaylistManager:
    def __init__(self) -> None:
        self.playlist: List[Dict] = []
        self.queue: List[Dict] = []

    def _validate_track_data(self, title: str, duration_sec: int, bpm: int) -> None:
        if not title or not isinstance(title, str):
            raise ValueError("title is required")
        if duration_sec <= 0:
            raise ValueError("duration_sec must be positive")
        if bpm <= 0:
            raise ValueError("bpm must be positive")

    def _make_record(
        self,
        title: str,
        artist: str,
        album: str,
        year: int,
        genre: str,
        duration_sec: int,
        bpm: int,
        isrc: str,
    ) -> Dict:
        return {
            "title": title, "artist": artist, "album": album,
            "year": year, "genre": genre, "duration_sec": duration_sec,
            "bpm": bpm, "isrc": isrc,
        }

    def add_to_playlist(
        self,
        title: str,
        artist: str,
        album: str,
        year: int,
        genre: str,
        duration_sec: int,
        bpm: int,
        isrc: str,
    ) -> Dict:
        self._validate_track_data(title, duration_sec, bpm)
        for existing in self.playlist:
            if existing["isrc"] == isrc:
                return existing
        record: Dict = self._make_record(
            title, artist, album, year, genre, duration_sec, bpm, isrc
        )
        self.playlist.append(record)
        return record

    def add_to_queue(
        self,
        title: str,
        artist: str,
        album: str,
        year: int,
        genre: str,
        duration_sec: int,
        bpm: int,
        isrc: str,
    ) -> Dict:
        self._validate_track_data(title, duration_sec, bpm)
        record: Dict = self._make_record(
            title, artist, album, year, genre, duration_sec, bpm, isrc
        )
        self.queue.append(record)
        return record

memo.md

# Refactoring memo — Step 10

## Issue
`PlaylistManager` has three coexisting smells:

1. **Duplicated Code** — the validation block (title / duration_sec / bpm checks)
   is identical in `add_to_playlist` and `add_to_queue`. Any future validation
   rule must be added in two places.
2. **Long Method** — `add_to_playlist` does validation, deduplication, and
   record construction in one body, with no internal structure.
3. **Long Parameter List** — eight parameters, four of which (artist, album,
   year, genre) travel together as album metadata.

## Rationale
**First refactoring: Extract Function on the validation block.** The duplication
is the most actionable smell — it's already identical text in two methods, and
the variation is zero (the predicates are pure value-checks of three named fields).
Extracting `_validate_track_data(title, duration_sec, bpm)` collapses two copies
into one and makes any future validation rule a single-place change.

Follow-ups (not done in this step):
- Introduce Parameter Object on the album-metadata clump (artist, album, year, genre).
- Possibly Extract `_make_record` so both add methods share record construction too.

## Invariant
For any valid input, `add_to_playlist` and `add_to_queue` produce the same dict
records as before, append them to the same lists, raise the same ValueErrors on
invalid input, and dedupe in `add_to_playlist` by ISRC. External API and behavior
are unchanged.

## Tests
The six `test_import_to_*` tests in `test_playlist.py` go through
`TrackImporter.import_to_playlist` and `import_to_queue` — stable surfaces
across either Option A or Option B. With Option B (this solution),
`playlist.py` adds the helper methods and the public signatures stay the
same, so `importer.py` and `test_playlist.py` are both byte-for-byte
unchanged. Compare to Step 5: there, choosing the parameter-object route
forced edits to two callers (one of which was the test file). Here Option
B's stability avoids that cost entirely; Option A would have shifted it
into `importer.py`'s internal helpers.

One sample memo, but yours may differ. This solution chose Option B (Extract Function on validation) because the duplication was the highest-impact smell — present in two methods, identical text, zero variation. Option A (Introduce Parameter Object) is also defensible: it directly addresses Long Parameter List, and the duplication can be tackled second.

Either order produces clean code eventually. The exam isn’t which refactoring is “right” — it’s whether your memo articulates a clear reason for your choice. The memo is the assessment.

What you’ve completed. Across ten steps you’ve practiced six refactorings on increasingly complex code, with tests preserving behavior at every step, and UML diagrams making structural changes visible. You’ve used Monaco’s tool support to skip the typing and focus on judgment. You’ve internalized that refactoring is a discipline, not a bag of tricks — every move comes with a justification, an invariant, and a test that confirms the invariant.

Next steps. SOLID design principles → why these smells appear in the first place. Observer pattern → how design principles crystallize into reusable patterns.

Step 10 — Knowledge Check

Min. score: 80%

1. Cumulative classification. Which smells appear in the original PlaylistManager.add_to_playlist? (select all that apply)

Long Parameter List
Long Method
Duplicated Code
Feature Envy
Feature Envy is when a method uses the data of another class. add_to_playlist uses self.playlist — its own state.
God Class
PlaylistManager has two responsibilities (playlist + queue), but they’re closely related and the class is small. Not a God Class — yet.
Speculative Generality
Speculative generality is over-eager abstraction. add_to_playlist has the opposite problem (under-abstracted).
Boolean anti-pattern (return True / return False)
No if cond: return True else: return False pattern in this snippet.

Three smells stacked: Long Parameter List + Long Method + Duplicated Code. Recognizing them simultaneously is the synthesis skill. Each previous step taught one in isolation; here they coexist.

2. Interleaved warmup retrieval. Snippet A (the shuffle_score function with if mood == "energetic" and weather == "sunny" branches) most resembles which smell from this tutorial?

Long Method (Step 2)
Long Method is the secondary smell, but the dominant pattern is the conditional dispatch — multiple branches choosing behavior.
Boolean anti-pattern (Step 3)
The if/and/elif structure isn’t simplifiable to a boolean return — the branches do different things, not just return different booleans.
Long Parameter List (Step 5)
Six parameters is borderline-long but not the dominant smell. The dominant pattern is the dispatch.
Type-conditional dispatch (Step 8)

Type-conditional dispatch. When if mood == "X" chooses behavior, you have a polymorphism candidate. Each mood/weather pair is dispatching to different scoring logic — same shape as the track_kind example in Step 8.

3. Interleaved warmup retrieval. Snippet B (update_metadata with seven parameters that travel together) most resembles which smell?

Long Parameter List + clump (Step 5)
Feature Envy (Step 6)
Feature Envy would require the method to use only foreign data. update_metadata uses both track (foreign) and the parameters; it’s the parameters’ shape that’s the smell, not the method’s location.
Long Method (Step 2)
The method is short — it’s just assignments. Length isn’t the issue.
Speculative Generality (Step 9)
Speculative generality is about unused abstractions. Here the abstraction is missing entirely.

Long Parameter List with a data clump. Six of the seven parameters describe album metadata — they always travel together. The fix is Introduce Parameter Object.

4. Interleaved warmup retrieval. Snippet D (is_hi_res with if track.bitrate >= 1411: return True else: return False) most resembles which smell?

Long Method (Step 2)
Three lines isn’t long.
Boolean-return anti-pattern (Step 3)
Feature Envy (Step 6)
Feature Envy might be relevant — is_hi_res uses only track data — but the dominant smell here is the boolean wrapping.
Long Parameter List (Step 5)
One parameter isn’t long.

Classic boolean-return anti-pattern. The condition is already a boolean; wrapping it in if/else with return True / return False adds nothing. Simplify to return track.bitrate >= 1411. (And maybe also Move Method onto Track, but that’s secondary — Snippet D’s primary smell is boolean.)

5. Snippet C (Statistics class) retrieval. Statistics.avg_track_length(tracks) uses zero Statistics state — it iterates over the parameter tracks and computes a sum. Which earlier-step smell does this most resemble?

Long Method (Step 2)
The body is short — three lines of computation. Length isn’t the issue.
Long Parameter List (Step 5)
One parameter — the opposite of long parameter list.
Feature Envy (Step 6)
God Class (Step 7)
A class with one method isn’t a God Class; it’s the opposite (small, focused, possibly Lazy).

Feature Envy. The method uses zero state of its host (Statistics) and only data from the parameter tracks. Step 6’s diagnostic — “uses zero host state” — fires cleanly here. The likely fix is Move Method onto whichever class owns the track collection.

6. Snippet E (the foil) — discrimination check. Two near-identical processor classes (StripeProcessor, PayPalProcessor) with the same charge signature and similar body. Should you extract a common base class or apply Replace Conditional with Polymorphism?

Yes — duplicated code is always a smell
Visual similarity isn’t bug-coupling. Step 1’s currency-rounding non-example showed the same pattern: three look-alike functions with independent domain rules. Extract only when the variation is one stable dimension and a real change forces lockstep edits in both copies.
Yes — the Open/Closed Principle requires polymorphism here
Open/Closed says be open to extension; it doesn’t require every type-similar pair to share a base class. If the processors evolve independently, a shared base is over-coupling, not abstraction.
Probably not — each processor has different SDK/error/config concerns
Yes — this is exactly the case Step 8 trained for
Step 8 trained for cases where each branch reads/writes different fields and the dispatch on a string-typed kind is the smell. Two classes that already exist, with their own state and SDK, aren’t the same shape.

Variation Theory in action. “Looks duplicated” can mean three different things: (1) actual duplication that should be extracted; (2) visual similarity over independent domain rules (currency rounding from Step 1); or (3) parallel infrastructure with hidden differences (this snippet). Discriminating between them is the judgment Step 9’s hotspot/Boy-Scout framing was for. When in doubt, wait for the third occurrence and a real bug-coupling signal before extracting.

7. Step 1 callback — fix-then-refactor. During the synthesis you spot a bug in one of the duplicated validation blocks of playlist.py. What’s the right discipline?

Refactor first to extract the validation; the helper now needs only one fix
Extracting with the bug intact consolidates the bug into the helper — same bug, harder-to-spot location. The helper now contains corrupt logic that future readers will trust.
Fix the bug in both duplicated locations first; then refactor to extract
Fix one location and leave the other; it’s about to be deleted by the refactor anyway
Tempting reasoning, but the moment between ‘one fix done, refactor in progress’ and ‘refactor complete’ is exactly when a teammate might pull the branch and run the buggy duplicate.
Run the tests; if they pass, the bug isn’t real
Passing tests don’t mean the bug isn’t real — they mean the bug isn’t in the test cases yet. Step 1’s lesson was that test coverage and bug presence are independent.

Fix-then-refactor. Step 1’s load-bearing rule generalizes across the entire toolkit: never refactor over a known bug. Fix in each location with green tests, then consolidate. This sequencing is one of the strongest rules in the discipline.

8. Memo synthesis. Why is the memo important — couldn’t you just refactor and let the diff speak for itself?

The memo is required for the assessment in this tutorial
True for this step, but it’s not the deeper reason. The memo is also valuable in industry settings — code reviewers, future maintainers, and a future self all benefit.
The memo captures the reasoning behind the change — diffs cannot.
Without the memo, the tool refuses to apply the refactoring
The tool doesn’t read memos. The memo is for humans.
Memos are a CS convention; every PR must include one
Memo conventions vary by team. The discipline of naming the invariant is what generalizes.

The memo answers questions a diff can’t. What smell? Why this refactoring? What invariant? Which tests confirm? A reviewer reading the diff sees what changed; the memo tells them why the change is correct. That’s the difference between a reviewable refactor and a trust-me one.

Code Smells & Refactoring — A Music Streaming App

The Cost of Duplication You Didn't Notice

Welcome — what this tutorial trains

Why this matters

🎯 You will learn to

Your task

Refactoring, defined

Not every triplet is a smell

Solution

Step 1 — Knowledge Check

Long Method → Extract Function (by hand)

Why this matters

🎯 You will learn to

The safety dance

Predict before reading

Worked example — extracting the first sub-goal

Your task

Solution

Step 2 — Knowledge Check

Boolean Anti-Patterns — and the trap that costs you everything

Why this matters

🎯 You will learn to

Sub-task A — boolean return

Sub-task B — confusing else

Sub-task C — the IfsMerged trap

Why a happy-path test misses this

Your task

Solution

Step 3 — Knowledge Check

Duplicated Code → Extract Function (with the tool)

Why this matters

🎯 You will learn to

Initial state

Draft the Issue line FIRST

How the tool works

Your task

The Memo template

When NOT to extract — the Rule of Three

Solution

Step 4 — Knowledge Check

Long Parameter List → Introduce Parameter Object (with the tool)

Why this matters

🎯 You will learn to

Initial state

A 60-second @dataclass primer (predict, then peek)

Sketch AlbumInfo BEFORE invoking the tool

Your tasks (in order)

Pause and reckon: what did this refactor cost?

Why a @dataclass, not a dict?

Solution

Step 5 — Knowledge Check

Feature Envy → Move Method (with the tool)

Why this matters

🎯 You will learn to

Initial state

Not every cross-class access is Feature Envy

Run the move

Spacing callback — apply Step 3’s lesson here

The Memo

Solution

Pause and reckon: how many files did you edit?

Step 6 — Knowledge Check

God Class → Extract Class (with the tool)

Why this matters

🎯 You will learn to

Initial state

Why split? — predict the change-locality, then verify

Use Extract Class

Spacing callback — Rule of Three from Step 4

The unlabeled stray method

Solution

Pause and reckon: how many files did you edit?

Step 7 — Knowledge Check

Replace Conditional with Polymorphism (with the tool)

Why this matters

🎯 You will learn to

Initial state

A 30-second abstractmethod primer

How to apply this

Spacing callback — Step 6’s Feature Envy in disguise

A 60-second `@dataclass` primer (predict, then peek)

Sketch `AlbumInfo` BEFORE invoking the tool

Why a `@dataclass`, not a `dict`?

A 30-second `abstractmethod` primer

The synthesis snippet — `playlist.py`