Learn Test-Driven Development hands-on: write failing tests first, make them pass with minimal code, then refactor. Build the TDD mindset — testing as a design tool, not just verification.
Learning objective: After this step you will be able to explain why automated tests are valuable and use assert to verify function behavior.
You write Python code that works… until someone changes it and breaks something you did not expect. Testing is the safety net that catches those breaks before they reach users.
Many students think testing is about finding bugs after you write code. That is only half the story. Tests also document intended behavior, catch regressions when someone changes the code, and give you the confidence to refactor safely.
Think of tests as a ratchet: once a test passes, your progress is locked in. You can never accidentally slip backward.
TDD asks you to write a test before the code exists. This feels backward at first — and that is completely normal. Every developer who learns TDD goes through an uncomfortable adjustment period. If TDD feels strange or slow, you are doing it right. Stick with the process; it gets natural with practice.
The function below is supposed to convert a numeric score (0–100) to a letter grade. Before running it, read the code carefully and predict: what will calculate_grade(85) return? What about calculate_grade(90)?
Write your predictions down, then click Run to see if you were right.
Run grades.py and observe the output. Was your prediction correct? Then read calculate_grade() and add three assert statements at the bottom of the file to verify:

- calculate_grade(95) returns "A"
- calculate_grade(85) returns "B"
- calculate_grade(72) returns "C"

An assert works like this:
assert calculate_grade(95) == "A", "95 should be an A"
If the condition is False, Python raises an AssertionError with the message.
def calculate_grade(score):
    """Convert a numeric score (0-100) to a letter grade."""
    if score > 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"
# Try it out — predict the output before running!
print("grade(95) =", calculate_grade(95))
print("grade(85) =", calculate_grade(85))
print("grade(90) =", calculate_grade(90))
print("grade(45) =", calculate_grade(45))
# TODO: Add three assert statements below to verify:
# calculate_grade(95) == "A"
# calculate_grade(85) == "B"
# calculate_grade(72) == "C"
1. A teammate says: “I only write tests after I finish all my code, to check for bugs.” What is the main limitation of this approach?
Post-hoc tests verify your implementation rather than intended behavior. They also cannot serve as a safety net during development because they don’t exist yet. TDD flips this: tests are written first to specify behavior, then code is written to satisfy them.
2. What does this code do?
assert len(result) == 3, "Expected 3 items"
assert condition, message evaluates the condition. If True, nothing happens and
execution continues. If False, Python raises an AssertionError with the message.
This is the simplest form of automated testing in Python.
3. Which of the following is not a benefit of automated tests?
Tests reduce bugs and catch regressions, but they cannot guarantee zero bugs. A test suite is only as good as the scenarios it covers. Dijkstra famously said: “Testing can show the presence of bugs, but never their absence.”
4. The “ratchet metaphor” for testing means:
The ratchet metaphor (from Harry Percival’s “Obey the Testing Goat”) captures the key psychological benefit of TDD: each passing test is a permanent checkpoint. Even if you are stuck on a new feature, you know everything that passed before still works.
Learning objective: After this step you will be able to write pytest test functions and interpret pytest output.
From assert to pytest: raw assert statements work, but they give minimal feedback when things fail. pytest is Python’s most popular testing framework. It discovers and runs your tests automatically, gives clear failure output, and scales from tiny scripts to enormous projects.
pytest uses simple naming conventions — no classes, no boilerplate:
| Convention | Rule | Example |
|---|---|---|
| File name | Must start with test_ | test_temperature.py |
| Function name | Must start with test_ | def test_freezing_point(): |
| Assertions | Plain assert — same as before | assert celsius_to_fahrenheit(0) == 32 |
That is it. No import unittest, no class TestSomething(unittest.TestCase), no self.assertEqual(). Just functions and assert.
In this tutorial, clicking Run on a test file will execute pytest automatically. You will see output like:
test_temperature.py::test_freezing_point PASSED
test_temperature.py::test_boiling_point PASSED
A PASSED test means the assertion held. A FAILED test shows you exactly what went wrong.
Before writing any code, predict: what will pytest print if a test function only has pass and no assert? Will it PASS or FAIL? Click Run on test_temperature.py now — before changing anything — and see if your prediction is correct.
The temperature.py module is provided and working. Your job is to complete test_temperature.py:
1. Complete test_freezing_point — assert that celsius_to_fahrenheit(0) returns 32.0
2. Complete test_boiling_point — assert that celsius_to_fahrenheit(100) returns 212.0
3. Write test_negative_forty from scratch — assert that celsius_to_fahrenheit(-40) returns -40.0 (the unique crossover point where Celsius equals Fahrenheit!)

Click Run on test_temperature.py after each change to see your tests execute.
"""Temperature conversion utilities."""

def celsius_to_fahrenheit(celsius):
    """Convert Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

def fahrenheit_to_celsius(fahrenheit):
    """Convert Fahrenheit to Celsius."""
    return (fahrenheit - 32) * 5 / 9
"""Tests for the temperature module."""

import pytest
from temperature import celsius_to_fahrenheit, fahrenheit_to_celsius

def test_freezing_point():
    # TODO: assert that celsius_to_fahrenheit(0) returns 32.0
    pass

def test_boiling_point():
    # TODO: assert that celsius_to_fahrenheit(100) returns 212.0
    pass

# TODO: Write a test_negative_forty function from scratch.
# Assert that celsius_to_fahrenheit(-40) returns -40.0

if __name__ == "__main__":
    pytest.main(["-v"])
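One caveat worth knowing: 32.0, 212.0, and -40.0 are exactly representable in binary floating point, so plain == comparison is safe in this exercise. For conversions whose results are not exact, compare with a tolerance instead, using math.isclose from the standard library (or pytest.approx inside a test file). A sketch, reusing the conversion formula from temperature.py:

```python
import math

def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32

# 37 C may not convert to exactly the literal 98.6 in floating point,
# so compare with a tolerance rather than ==
assert math.isclose(celsius_to_fahrenheit(37), 98.6)

# In a pytest file, the idiomatic equivalent is:
#   assert celsius_to_fahrenheit(37) == pytest.approx(98.6)
```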
1. Which file name will pytest automatically discover as a test file?
pytest discovers files whose names start with test_ or end with _test.py.
test_temperature.py matches the test_ prefix. The others do not match either pattern.
Functions inside must also start with test_.
2. What does this pytest output mean?
test_math.py::test_addition PASSED
test_math.py::test_division FAILED
PASSED means the assertions in that test function all held (no AssertionError).
FAILED means at least one assertion in test_division was False. pytest will
show the exact line and values that caused the failure.
3. Which is a valid pytest test function?
pytest uses plain functions (no class needed) that start with test_ and use assert.
Option A is the older unittest style — it works but is verbose. Option C won’t be
discovered (wrong prefix). Option D returns a boolean but never asserts — it would
always “pass” even if the result is wrong.
4. (Spaced review — Step 1) A developer finishes a feature and deletes the tests because “the feature is done and working.” What principle does this violate?
The ratchet metaphor means tests lock in progress permanently. Deleting passing tests removes the safety net: if someone later modifies the code and introduces a regression, there is no test to catch it. Tests are living documentation, not temporary scaffolding.
Learning objective: After this step you will be able to execute a complete Red-Green-Refactor TDD cycle and explain why TDD is a design technique, not just testing.
Here is the single most important idea in this tutorial:
TDD is a design technique, not a testing technique.
When you write the test first, you are forced to decide: what the function is called, what parameters it takes, and what it returns for each input.

You answer all these design questions before writing a single line of implementation. The tests are a byproduct — the real value is the design thinking.
TDD follows a strict three-phase cycle, repeated in short iterations:
| Phase | Color | What You Do | Key Rule |
|---|---|---|---|
| 1. RED | :red_circle: | Write ONE failing test | The test describes desired behavior that does not exist yet |
| 2. GREEN | :green_circle: | Write the MINIMUM code to pass | Do not optimize. Do not add features. Just make the test pass |
| 3. REFACTOR | :large_blue_circle: | Clean up the code | Remove duplication, improve names — but don’t change behavior. All tests must still pass |
Critical misconception: You do NOT write all your tests first. You write one test, make it pass, refactor, then write the next test. One cycle at a time.
Let’s build a WordCounter class using TDD. We will walk through one full cycle together, then you will continue.
RED — Write a test that describes the behavior we want:
# Sub-goal: Define WHAT we want (the interface + expected behavior)
def test_count_words_simple():
    counter = WordCounter()
    result = counter.count("hello world")
    assert result == {"hello": 1, "world": 1}
Run this test — it fails because WordCounter does not exist yet. This is the RED phase. The failure is expected and correct.
GREEN — Write the minimum code to make it pass:
# Sub-goal: Write just enough code to pass the test
class WordCounter:
    def count(self, text):
        words = text.split()
        result = {}
        for word in words:
            result[word] = result.get(word, 0) + 1
        return result
Run the test again — it passes. This is the GREEN phase.
REFACTOR — The code is clean enough for now, so no refactoring needed. All tests still pass.
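When a refactor does happen later (the exercise below suggests one), collections.Counter can replace the manual loop without changing behavior, which is exactly what keeps the tests green:

```python
from collections import Counter

class WordCounter:
    def count(self, text):
        # Counter tallies each whitespace-separated word in one step;
        # dict(...) converts the Counter back to a plain dictionary
        return dict(Counter(text.split()))
```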
Open test_word_counter.py. Cycle 1’s test is already provided.
1. Run test_word_counter.py before writing any code. You should see the test FAIL with an ImportError — this IS the RED phase. The test describes behavior that does not exist yet. This failure is expected and correct.
2. Write the WordCounter class in word_counter.py to make the test pass. Run again — you should see GREEN.
3. Complete test_count_repeated_words (partially scaffolded). Run the tests — the new test should FAIL (RED). Then update your implementation if needed to pass it (GREEN).
4. Look at your count() method. If you used a manual dictionary loop, try refactoring to use Python’s collections.Counter — which does the same thing in one line. Run the tests after refactoring to confirm they still pass. This is the REFACTOR phase: improve the code without changing behavior.
5. Write test_count_empty_string entirely on your own. What should count("") return? Decide, write the test, see it fail, then make it pass.

Important: Always run the tests and see RED before writing implementation. If you skip this step, you cannot be sure your test is actually testing something.
# TODO: Implement the WordCounter class here.
# It should have a count(text) method that returns
# a dictionary of {word: count} for each word in the text.
"""TDD for WordCounter — practice the Red-Green-Refactor rhythm."""
import pytest
from word_counter import WordCounter
# --- Cycle 1 (test provided — see RED first, then write implementation to go GREEN) ---
def test_count_words_simple():
    """A simple sentence with no repeated words."""
    counter = WordCounter()
    result = counter.count("hello world")
    assert result == {"hello": 1, "world": 1}

# --- Cycle 2 (partially scaffolded — complete the assertion) ---
def test_count_repeated_words():
    """A sentence with repeated words should count each occurrence."""
    counter = WordCounter()
    result = counter.count("the cat sat on the mat")
    # TODO: assert that 'the' appears 2 times and 'cat' appears 1 time
    # Hint: assert result["the"] == ??? and result["cat"] == ???
    pass

# --- Cycle 3 (write this test entirely on your own) ---
# TODO: Write test_count_empty_string
# What should counter.count("") return? Decide, then test it.

if __name__ == "__main__":
    pytest.main(["-v"])
1. A student writes five test functions, then starts implementing the code to make them all pass at once. What is wrong with this approach?
TDD’s power comes from the rapid feedback loop. Each cycle (one test at a time) tells you something about your design. If you write all tests upfront, you lose this feedback and often end up with tests that don’t match the final design.
2. In the GREEN phase, a developer writes optimized, production-ready code. Is this correct?
The GREEN phase is about making the test pass as quickly and simply as possible. Resist the urge to optimize or add features. The REFACTOR phase is where you clean up, remove duplication, and improve the design — with all tests still passing.
3. Analyze: A teammate says “TDD is just a way to write more tests.” How would you correct this?
This is the threshold concept in TDD: it is a design technique, not a testing technique. The test forces you to think about the public interface (how will I call this?) and expected behavior (what should it return?) before you write any implementation. This produces better-designed, more modular code as a natural consequence.
4. What is the correct order of the TDD cycle?
RED (write a failing test) → GREEN (write minimum code to pass) → REFACTOR (clean up). The order matters: you must see the test fail first to confirm it is testing something real. If a test passes immediately, it is not testing new behavior — this is the “Success Against All Odds” anti-pattern.
Learning objective: After this step you will be able to apply the Red-Green-Refactor cycle independently across multiple iterations with fading scaffolding.
A “kata” is a small, focused exercise designed to practice the TDD rhythm. We will build the classic FizzBuzz function using four TDD cycles, with the scaffolding fading at each step:
FizzBuzz rules:
- Divisible by 3 → "Fizz"
- Divisible by 5 → "Buzz"
- Divisible by both 3 and 5 → "FizzBuzz"
- Otherwise → the number as a string (e.g., "1", "2")

The first test and implementation are provided for you. Study how the cycle works:
RED: test_returns_number_as_string checks that fizzbuzz(1) returns "1" and fizzbuzz(2) returns "2".
GREEN: The simplest implementation is return str(number).
REFACTOR: Nothing to refactor yet.
Run the tests to confirm Cycle 1 passes (GREEN).
Self-check: Why does the Cycle 1 implementation use return str(number) instead of a more complete solution? What TDD principle does this follow? (Answer: the GREEN phase requires writing the minimum code to pass the current test. We do not add logic for requirements that have no test yet.)
The test test_fizz_for_multiples_of_3 is written for you. Your job:
- Run the tests and see the new test FAIL (RED)
- Update fizzbuzz() in fizzbuzz.py to make it pass (GREEN)

Write test_buzz_for_multiples_of_5 yourself. The function signature is provided — fill in the assertions. Run the tests and see RED first, then write the implementation to go GREEN.
Write test_fizzbuzz_for_multiples_of_15 and the corresponding implementation entirely on your own. Full TDD cycle, no scaffolding. Start with the test. See RED. Then go GREEN.
Remember the rhythm: RED (see it fail) → GREEN (make it pass) → REFACTOR (clean up).
def fizzbuzz(number):
    """Return FizzBuzz result for the given number.

    Rules:
    - Divisible by 3 -> "Fizz"
    - Divisible by 5 -> "Buzz"
    - Divisible by both 3 and 5 -> "FizzBuzz"
    - Otherwise -> the number as a string
    """
    # TODO: Add Fizz/Buzz/FizzBuzz logic ABOVE the return below.
    # Each cycle should add a new condition.
    return str(number)
"""TDD Kata: FizzBuzz — practice the rhythm with fading scaffolding."""
import pytest
from fizzbuzz import fizzbuzz
# --- Cycle 1 (FULL WORKED EXAMPLE — provided) ---
def test_returns_number_as_string():
    assert fizzbuzz(1) == "1"
    assert fizzbuzz(2) == "2"
    assert fizzbuzz(4) == "4"

# --- Cycle 2 (PARTIAL SCAFFOLD — test provided, write the implementation) ---
def test_fizz_for_multiples_of_3():
    assert fizzbuzz(3) == "Fizz"
    assert fizzbuzz(6) == "Fizz"
    assert fizzbuzz(9) == "Fizz"

# --- Cycle 3 (HINT ONLY — write the assertions and implementation) ---
def test_buzz_for_multiples_of_5():
    # TODO: assert that fizzbuzz(5), fizzbuzz(10), fizzbuzz(20)
    # all return "Buzz"
    pass

# --- Cycle 4 (INDEPENDENT — write this test from scratch) ---
# TODO: Write test_fizzbuzz_for_multiples_of_15
# Test that fizzbuzz(15), fizzbuzz(30), fizzbuzz(45) return "FizzBuzz"
# Then update fizzbuzz.py to make it pass.
# Hint: If using if/elif, check divisibility by 15 BEFORE 3 or 5. Why?

if __name__ == "__main__":
    pytest.main(["-v"])
1. Why do we add one test at a time instead of writing all four FizzBuzz tests upfront?
Incremental test-writing is the heart of TDD. Each test cycle gives you feedback: “Does this design still work? Do I need to refactor?” Writing all tests first is essentially waterfall planning disguised as TDD.
2. You write a new test and run it. It passes immediately without you changing any code. What should you do?
If a test passes immediately, it means one of two things: (1) the behavior was already implemented by a previous cycle (which is fine — but verify), or (2) the test is not actually testing what you think. Always see RED first to confirm the test is meaningful.
3. If you use if/elif to implement FizzBuzz, why should you check divisibility by 15 before checking 3 or 5?
With if/elif, the first matching branch executes and the rest are skipped. Since
15 is divisible by both 3 and 5, checking if number % 3 == 0 first would return
“Fizz” and never reach the 15 check. The most specific condition must come first.
(Note: the string-concatenation approach from Step 5 avoids this issue entirely
because it uses two separate if statements instead of elif.)
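One way the if/elif ordering can look, with the most specific condition first (a sketch of one valid solution, not the only one):

```python
def fizzbuzz(number):
    if number % 15 == 0:    # divisible by both 3 and 5: must be checked first
        return "FizzBuzz"
    elif number % 3 == 0:
        return "Fizz"
    elif number % 5 == 0:
        return "Buzz"
    return str(number)
```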
4. (Spaced review — Step 2) A test function is named def check_fizzbuzz():. Will pytest discover and run it?
pytest auto-discovers functions starting with test_. A function named
check_fizzbuzz would be silently ignored — a common mistake that leads
to tests never running without the developer realizing it.
Learning objective: After this step you will be able to distinguish between brittle (implementation-coupled) and robust (behavior-focused) tests, and explain why this matters for refactoring.
Test what a function DOES, not how it does it.
A brittle test breaks when you change the internal implementation, even if the behavior is unchanged. A robust test only breaks when the actual behavior changes.
Consider testing a function that sorts a list of students by GPA:
# BRITTLE — tests implementation details
def test_sort_brittle():
    sorter = StudentSorter()
    sorter.sort(students)
    # Checks internal state — breaks if we rename the field
    assert sorter._sorted_list[0].name == "Alice"
    assert sorter._comparison_count == 3  # Why do we care?

# ROBUST — tests behavior
def test_sort_robust():
    sorter = StudentSorter()
    result = sorter.sort(students)
    # Checks what the function RETURNS — the observable behavior
    assert result[0].gpa >= result[1].gpa  # Sorted descending
    assert len(result) == len(students)  # No items lost
The brittle test will break if you rename _sorted_list or change the sort algorithm (even to a faster one). The robust test only cares about the result.
Here is a simple rule: if you refactor the internals of a function and all tests still pass, your tests are robust. If tests break after a pure refactoring (no behavior change), they are testing implementation.
Open test_brittle_audit.py. It contains four tests for a simple ShoppingCart class. Two tests are brittle (they test implementation details) and two are robust (they test behavior). Your job: identify the two brittle tests, then rewrite them to assert on the cart’s public methods instead of its private _items attribute. All four tests should still pass afterward.
Now refactor your FizzBuzz from Step 4 using a different approach — for example, string concatenation instead of if/elif:
def fizzbuzz(number):
    result = ""
    if number % 3 == 0:
        result += "Fizz"
    if number % 5 == 0:
        result += "Buzz"
    return result or str(number)
Run test_fizzbuzz.py — all tests should still pass, proving they test behavior, not implementation.
class ShoppingCart:
    def __init__(self):
        self._items = []

    def add(self, name, price):
        self._items.append({"name": name, "price": price})

    def total(self):
        return sum(item["price"] for item in self._items)

    def item_count(self):
        return len(self._items)

    def items(self):
        return [item["name"] for item in self._items]
"""AUDIT: Two of these tests are brittle. Find and fix them."""
import pytest
from shopping_cart import ShoppingCart
# Test 1
def test_add_item_updates_count():
    cart = ShoppingCart()
    cart.add("Apple", 1.50)
    assert cart.item_count() == 1

# Test 2 — BRITTLE? Check if this tests behavior or implementation.
def test_add_item_internal_list():
    cart = ShoppingCart()
    cart.add("Apple", 1.50)
    # Accesses private attribute directly
    assert cart._items[0]["name"] == "Apple"
    assert cart._items[0]["price"] == 1.50

# Test 3
def test_total_sums_prices():
    cart = ShoppingCart()
    cart.add("Apple", 1.50)
    cart.add("Bread", 2.00)
    assert cart.total() == 3.50

# Test 4 — BRITTLE? Check if this tests behavior or implementation.
def test_internal_list_length():
    cart = ShoppingCart()
    cart.add("Apple", 1.50)
    cart.add("Bread", 2.00)
    # Accesses private attribute directly
    assert len(cart._items) == 2

if __name__ == "__main__":
    pytest.main(["-v"])
1. Which test is more robust?

Test A:

def test_sort_a():
    result = sort_names(["Charlie", "Alice", "Bob"])
    assert result == ["Alice", "Bob", "Charlie"]

Test B:

def test_sort_b():
    result = sort_names(["Charlie", "Alice", "Bob"])
    assert result[0] < result[1] < result[2]
    assert len(result) == 3
Test A is fine for this specific input, but Test B expresses the properties that matter: the output is in sorted order and no items are lost. If the expected ordering ever changes (say the sort becomes case-insensitive), Test A’s hard-coded list must be rewritten, while Test B’s sortedness property still holds. Robust tests verify behavioral properties.
2. You refactor a function’s internal algorithm (from bubble sort to quicksort) without changing its return value. Two of your tests break. What does this tell you?
If the function’s behavior (inputs → outputs) is unchanged but tests break, those tests are coupled to the implementation. This is the “Nitpicker” anti-pattern. The fix is to rewrite the tests to assert on observable behavior, not internal details.
3. (Spaced review — Step 3) A developer writes a test, sees it fail, then writes code to make it pass. Now she renames a variable for clarity and re-runs the tests. Which TDD phase is she in?
Renaming a variable for clarity without changing behavior is a textbook REFACTOR. The key: all tests still pass before and after. RED (failing test) → GREEN (pass) → REFACTOR (clean up, tests still pass). She completed GREEN and is now in REFACTOR.
Learning objective: After this step you will be able to independently apply the full TDD cycle — writing tests first, then implementation — for a new problem.
This is the real test of your TDD skills. You will receive requirements and a minimal scaffold — a stub function and one example test. The rest is up to you.
validate_password(password)

Build a function that validates passwords. It should return a tuple: (is_valid, message).
| Rule | Valid Example | Invalid Example |
|---|---|---|
| At least 8 characters | "SecureP1" | "Short1" |
| Contains at least one uppercase letter | "Password1" | "password1" |
| Contains at least one digit | "Password1" | "Password" |
Return values:

- Valid password: (True, "Valid")
- Invalid password: (False, "&lt;reason&gt;") — e.g., (False, "Password must be at least 8 characters")

The first test (test_valid_password) is provided as scaffolding. Your job:
1. Write at least three more test functions in test_password.py — one for each rule (too short, missing uppercase, missing digit). The function signatures are suggested in the comments.
2. Run the tests and see them FAIL (RED) — password.py only has a stub.
3. Implement validate_password() in password.py to make the tests pass (GREEN).

Feeling stuck? That is completely normal. Writing tests for something that does not exist yet feels backward at first. Every TDD practitioner felt this way initially. Focus on what the function should return for each input — that is all a test needs to express.
Remember: Write the tests before the implementation. This forces you to design the interface (function name, parameters, return type) before writing any logic.
def validate_password(password):
    """Validate a password against security rules.

    Returns:
        (True, "Valid") if all rules pass.
        (False, "reason") if a rule is violated.
    """
    # TODO: Replace this stub with real validation logic.
    # The stub returns a failing tuple so tests show clean RED.
    return (False, "Not implemented")
"""TDD Kata: Password Validator — write your tests FIRST."""
import pytest
from password import validate_password
# Here is the first test as an example (the "happy path"):
def test_valid_password():
    is_valid, message = validate_password("SecureP1")
    assert is_valid is True
    assert message == "Valid"

# TODO: Write at least 3 more test functions below.
# Each test should check ONE validation rule:
#
# def test_rejects_short_password():
#     ...
#
# def test_rejects_missing_uppercase():
#     ...
#
# def test_rejects_missing_digit():
#     ...

if __name__ == "__main__":
    pytest.main(["-v"])
1. A student’s test for the password validator looks like this:
def test_password():
    assert validate_password("SecureP1") == (True, "Valid")
    assert validate_password("short") == (False, "Password must be at least 8 characters")
    assert validate_password("nouppercase1") == (False, "Password must contain an uppercase letter")
    assert validate_password("NoDigits") == (False, "Password must contain a digit")
Each assert should be in its own test function so you can see exactly which
rule failed. If test_password() fails on line 2, you don’t know if lines 3 and 4
would also fail. Separate test functions give you precise, independent feedback.
2. You are writing a test for a new feature, but you find it very difficult to set up the test — you need to create five objects and configure them just to call one function. What does this difficulty tell you?
This is the “listen to the test” principle. When a test is hard to write, the function’s design is often the problem. A function that requires complex setup usually has too many responsibilities or hidden dependencies. TDD uses this pain as architectural feedback — simplify the design, and the test becomes easy.
3. (Spaced review — Step 1) A PM says: “We have 95% test coverage, so we can skip manual QA for this release.” What is wrong with this reasoning?
High coverage does not mean high quality. A test that executes a function but
only asserts True covers the code but verifies nothing. Coverage is a
necessary but not sufficient condition for quality. This connects to Step 1:
tests show the presence of bugs, never their absence.
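To make that concrete, here is a hypothetical test (average() and the test are invented for illustration) that achieves 100% line coverage of the function while verifying nothing:

```python
def average(values):
    return sum(values) / len(values)

def test_average_covered_but_useless():
    average([1, 2, 3])  # every line of average() executes, so coverage is 100%
    assert True         # but the result is never checked

# This test keeps "passing" even if average() is changed to return the wrong value.
```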
Learning objective: After this step you will be able to evaluate when TDD is appropriate, identify common TDD anti-patterns, and articulate the key principles you learned.
TDD is most valuable when:

- The logic is complex and correctness matters
- The code will live a long time and be modified by multiple developers
- The requirements are clear enough to express as concrete examples

TDD is less useful for:

- Exploratory prototypes you expect to throw away
- Visual layout and UI styling, where "correct" is judged by eye
- One-off scripts
Even in these cases, some tests are almost always valuable. The question is whether to write them first.
Research shows that students who learn TDD properly continue to write significantly more tests in future projects, even when not explicitly asked to. TDD doesn’t just teach testing — it permanently shifts how you think about code quality.
This tutorial covered the fundamentals. In more advanced TDD work, you will encounter:
- @pytest.mark.parametrize — test multiple inputs without duplicate test functions
- @pytest.fixture — shared setup code that pytest injects into your tests

Open test_diagnose.py. It contains a test suite written by a junior developer. Read each test and identify which TDD principle it violates (if any). Add a # VERDICT: comment to each test function explaining your analysis. This is not graded — it is a reflection exercise.
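As a brief taste of those two features, a sketch (the test names and sample data are illustrative, not from the exercises):

```python
import pytest

@pytest.mark.parametrize("celsius, fahrenheit", [
    (0, 32.0),
    (100, 212.0),
    (-40, -40.0),
])
def test_conversion(celsius, fahrenheit):
    # pytest runs this one function once per (celsius, fahrenheit) pair
    assert celsius * 9 / 5 + 32 == fahrenheit

@pytest.fixture
def sample_passwords():
    # Shared setup: any test that names sample_passwords as a parameter
    # receives this list, freshly created for each test
    return ["SecureP1", "short", "password1"]

def test_all_samples_are_strings(sample_passwords):
    assert all(isinstance(p, str) for p in sample_passwords)
```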
"""Diagnose: Which principles does each test violate (if any)?"""
import pytest
# --- Test 1 ---
def test_login_system():
    """Tests login, logout, session, and password reset all in one."""
    user = {"username": "alice", "password": "Secret1"}
    assert authenticate(user["username"], user["password"]) == True
    assert create_session(user["username"]) is not None
    assert logout(user["username"]) == True
    assert reset_password(user["username"], "NewSecret1") == True

# VERDICT: (your analysis here)

# --- Test 2 ---
def test_add_returns_sum():
    assert add(2, 3) == 5

# VERDICT: (your analysis here)

# --- Test 3 ---
def test_user_internals():
    user = User("Bob")
    assert user._internal_id is not None
    assert user._created_at is not None

# VERDICT: (your analysis here)

# --- Test 4 ---
def test_divide_by_zero():
    result = safe_divide(10, 0)
    assert result is None

# VERDICT: (your analysis here)
1. Evaluate: For which scenario is TDD most beneficial?
Payment processing has complex logic, strict correctness requirements, and is modified by multiple developers over time. TDD excels here because the tests specify exact behavior and prevent regressions. The other scenarios are either too exploratory or too visual for TDD to add significant value.
2. A student’s test file contains a single function with 20 assertions testing five different behaviors. Another student has five functions with four assertions each. Which approach is better, and why?
Test isolation is a core TDD principle. One test function = one behavior = one
potential failure point. When test_valid_password fails but test_missing_digit
passes, you know exactly where the problem is. A monolithic test function hides
this information behind the first failure.
3. (Anti-pattern identification) A developer writes this test:
def test_user_creation():
    user = User("Alice", "alice@example.com")
    assert user._name == "Alice"
    assert user._email == "alice@example.com"
    assert user._id is not None
This is the “Nitpicker” anti-pattern. The test is coupled to the internal
representation (private attributes) rather than the public interface. A robust
test would check user.name (public property) or str(user) or whatever the
class exposes publicly. Internal details should be free to change without breaking tests.
4. (Spaced review — Step 3) Two developers are arguing. Dev A says: “I wrote the function first, then added tests — same result.” Dev B says: “No, writing the test first changes the design.” Who is right, and why?
Dev B is correct. Writing the test first means you must decide the function’s name, parameters, and return value BEFORE writing any implementation. This forces you to think as a caller (API consumer) rather than an implementer. Research shows TDD code tends to be more modular with smaller, more cohesive functions.