1

Anatomy of a Playwright Test: Navigate, Interact, Assert

Why this matters

Every Playwright test you ever write — at work, on capstones, debugging at 11pm — is a variation on three lines: navigate to the page, interact with the UI, assert what the user sees. Lock that rhythm in now and the rest of the tutorial becomes pattern-matching against it. Skip it, and every later step feels like memorization.

🎯 You will learn to

  • Analyze a basic Playwright test and identify how each line maps onto the Arrange / Act / Assert pattern from Testing Foundations
  • Apply the navigate-interact-assert rhythm to read unfamiliar Playwright tests at a glance

In Testing Foundations you wrote tests like this:

def test_valid_name_accepted():
    assert squad_name_valid("epic") is True

That test verifies one function in isolation. A Playwright test verifies a whole React app through a real browser, the way a user experiences it. Same AAA bones, different organism.

🔄 Concept bridge

Testing Foundations (pytest)      Playwright (e2e)
------------------------------------------------------------------------------
Arrange / Act / Assert            Navigate / Interact / Assert
Function inputs                   User actions through the UI
Direct return value               Observable outcome on the page
Synchronous                       Async (await everywhere)
Strong oracle: == exact match     Strong oracle: toHaveText, toHaveCount, …

The discipline is the same. The mechanics differ.

🌳 Primer: what getByRole actually queries

Before you read the test, lock in this concept — every locator in the test below depends on it.

Most semantic HTML elements have an implicit role that the browser exposes to assistive technology (screen readers, voice control, etc.). The browser maintains a parallel tree, the accessibility tree, that mirrors the DOM but contains only semantically meaningful elements with their roles, names, and states.

HTML                               Implicit role               Accessible name source
------------------------------------------------------------------------------------------------
<button>Save</button>              button                      the visible text “Save”
<input type="text">                textbox                     a <label for=...> or aria-label
<a href="...">Home</a>             link                        the visible link text
<ul><li>X</li></ul>                list containing listitem    (none; structural)
<h2>Settings</h2>                  heading                     the visible heading text
<div onclick=...>Click me</div>    (no role)                   (no name); invisible to screen readers

page.getByRole('button', { name: /add todo/i }) queries this tree, not the DOM. It says: “find the element with accessible role button whose accessible name matches the regex /add todo/i.” The query doesn’t care whether the button is <button class="primary">, <button data-print-id="add">, or wrapped in five <div>s — only the role and name.

Why this matters:

  • Locators stay stable across CSS refactors — change the class, change the layout, the locator still works.
  • Locators break when accessibility breaks — if a teammate replaces <button> with <div onclick="...">, the locator stops finding it. That’s a feature, not a bug: the change made the page worse for screen-reader users, and the test failure surfaces that regression.
  • You’re testing the same thing the user (and their assistive tech) sees — not the same thing the React renderer happens to emit on a given day.

With that primer in mind, every getByRole(...) call below is a query against the accessibility tree.
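To make the primer concrete, here is a toy model in plain JavaScript. It is a sketch under loud assumptions: the `accessibilityTree` array and the `queryByRole` helper are hypothetical stand-ins, not Playwright's implementation. The point is only that the query looks at role and name, so CSS classes never enter the picture.

```javascript
// Toy model of an accessibility tree: NOT Playwright's implementation.
// Each node carries only what assistive tech would see: a role and a name.
const accessibilityTree = [
  { role: 'heading', name: 'Todo Lab' },
  { role: 'textbox', name: 'Todo item' },
  { role: 'button', name: 'Add todo' },
  // A <div onclick=...> contributes no role and no name, so it never matches.
  { role: null, name: null },
];

// Hypothetical helper: return every node whose role equals `role`
// and whose accessible name matches the regular expression `name`.
function queryByRole(tree, role, name) {
  return tree.filter((node) => node.role === role && name.test(node.name ?? ''));
}

const matches = queryByRole(accessibilityTree, 'button', /add todo/i);
console.log(matches.length, matches[0].name); // 1 Add todo
```

Because the model stores no class names or DOM ancestry, renaming a class or wrapping the button in extra containers cannot change the result, which is the stability property the primer describes.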

Read this test (don’t run yet)

import { test, expect } from '@playwright/test';

test('user can add a todo', async ({ page }) => {
  await page.goto('/');                                                  // Navigate
  await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');  // Interact
  await page.getByRole('button', { name: /add todo/i }).click();         // Interact
  await expect(page.getByRole('listitem')).toHaveText('Milk');           // Assert
});

Annotations that matter:

  • async ({ page }) => { … } — every Playwright test is async. page is your handle to the browser tab.
  • await on every line — the browser is asynchronous. Without await, JavaScript races past the click before React’s state has updated.
  • getByRole('button', { name: /add todo/i }) — queries the accessibility tree (per the primer above) for a button with the accessible name “Add todo”.
  • await expect(...).toHaveText('Milk') is a web-first assertion: it auto-waits and retries until the condition holds (or the timeout expires). Web-first assertions are the right tool for asynchronous UI.

⚠️ Negative-transfer trap: this is not React Testing Library or Jest

If you’ve used React Testing Library (RTL) with Jest, the API looks deceptively similar — getByRole, getByText, expect(...).toBeVisible(). The methods have the same names but different machinery underneath:

Comparison point           React Testing Library + Jest          Playwright
----------------------------------------------------------------------------------------------------------------------------
What runs the test         jsdom (a fake DOM in Node)            a real browser (Chromium, Firefox, or WebKit)
Render                     React’s renderer alone                the full app + bundler + browser
getByRole(...)             synchronous, returns immediately      returns a locator; actions on it are async and retry
expect(x).toBeVisible()    synchronous Jest matcher              await expect(locator).toBeVisible(): async, auto-retries
A failing assertion        shows the rendered DOM                shows the failing accessibility tree + screenshot
Snapshot tests             common (toMatchSnapshot)              strongly discouraged for e2e; they break on every render change
Deep render assertions     “the component received prop X”       not even possible; Playwright sees only what the user sees

Three habits to retire before continuing:

  1. Never write expect(await locator.isVisible()).toBe(true). That looks like Jest, but it runs once and races. Always await expect(locator).toBeVisible() — Playwright’s web-first form retries.
  2. Don’t reach for snapshot matchers. toMatchSnapshot works in Playwright but is the wrong tool for e2e — every refactor breaks the snapshot, even when the user-visible behavior is unchanged. Use toHaveText, toHaveCount, toHaveURL — assertions that mirror what the user would notice.
  3. Don’t probe component internals. “Was prop X passed?” “Is useState set to Y?” — those are unit-test concerns. Playwright sees what the browser renders. If a behavior isn’t observable through the UI, it’s not Playwright’s job to verify.
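The difference behind habit 1 can be sketched without a browser. The following is a simplified simulation, not Playwright code: `createFakePage`, `checkOnce`, and `checkWithRetry` are hypothetical names, and a poll counter stands in for real time and timeouts.

```javascript
// Simulated page: the list item only "renders" on the Nth poll,
// standing in for a UI that updates a moment after the click.
function createFakePage(readyOnPoll) {
  let polls = 0;
  return {
    isItemVisible() {
      polls += 1;
      return polls >= readyOnPoll;
    },
  };
}

// One-shot check: reads the state once, like expect(await locator.isVisible()).
function checkOnce(page) {
  return page.isItemVisible();
}

// Retrying check: polls until the condition holds or the attempts run out.
// The attempt limit stands in for the assertion timeout.
function checkWithRetry(page, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    if (page.isItemVisible()) return true;
  }
  return false;
}

const oneShotResult = checkOnce(createFakePage(3));
const retryResult = checkWithRetry(createFakePage(3), 10);
console.log(oneShotResult, retryResult); // false true
```

The one-shot check reports a failure for a page that would have been correct two polls later, which is the race the first habit warns about.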

🎬 Predict — commit to a letter, then click reveal

Read the test above and pick one answer for each question. Commit (out loud, on paper, or in your head) before opening the reveal. Making a prediction is what primes the encoding; skimming straight to the reveal teaches little.

Q1. If we changed name: /add todo/i to name: /save/i, what happens?

  • (a) The test still passes — getByRole matches buttons by role, not name.
  • (b) The test fails fast — Playwright throws “no such button” on the next line.
  • (c) The test fails on a 30-second timeout — the locator silently retries waiting for a “Save” button that never appears.
  • (d) Compile error — name: requires a string literal, not a regex.
Reveal — pick first, then click

(c). The locator is lazy, and the click auto-waits for a matching element, retrying as it goes (that’s the whole point of web-first locators). With no matching button, Playwright keeps retrying until a timeout expires, which by default is the 30-second test timeout. The result is a slow-failing test, not a fast crash. (a) is the wrong direction: when you pass name, it is a filter, not a hint. (b) is the React Testing Library mental model leaking in: RTL’s getByRole throws synchronously; Playwright’s doesn’t. (d) is wrong because a regex is allowed (and idiomatic).

Q2. Which line is the Assert step?

  • (a) await page.goto('/')
  • (b) await page.getByRole('textbox', ...).fill('Milk')
  • (c) await page.getByRole('button', ...).click()
  • (d) await expect(page.getByRole('listitem')).toHaveText('Milk')
Reveal

(d). Only expect(...) calls are assertions — they check an outcome. goto, fill, click are commands that do things to the page. If you can’t point to which line is the assertion, the test isn’t proving what you think.
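As a reading aid, the rhythm can be turned into a tiny classifier. This is a hypothetical helper for practice, assuming the simple rule that `expect(` marks an assertion and `.goto(` marks navigation; it is not a Playwright feature.

```javascript
// Hypothetical helper for reading practice: label one line of a Playwright test.
function classifyStep(line) {
  if (line.includes('expect(')) return 'assert';
  if (line.includes('.goto(')) return 'navigate';
  return 'interact';
}

// The four lines of the test body from this step, as plain strings.
const testBody = [
  "await page.goto('/');",
  "await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');",
  "await page.getByRole('button', { name: /add todo/i }).click();",
  "await expect(page.getByRole('listitem')).toHaveText('Milk');",
];

const rhythm = testBody.map(classifyStep);
console.log(rhythm.join(' -> ')); // navigate -> interact -> interact -> assert
```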

▶ Run

Click Test in the Live Preview toolbar. The test passes against the demo Todo app.

🔍 Investigate

Why is await on every line? The browser is asynchronous: clicking a button doesn’t instantly produce the result. await says “wait for this to finish before moving on.” Without await, the assertion would race past the click before React re-rendered, and the test would either fail or — worse — pass for the wrong reason.

✏️ Modify — predict the failure shape, then run

Change the assertion to look for 'Bread' instead of 'Milk'. Before you click Test, commit to one of these:

  • (a) Locator-not-found timeout (no element matched).
  • (b) Text mismatch — the failure message names both the expected (Bread) and actual (Milk) text.
  • (c) Both — Playwright reports two failures.
  • (d) The test passes — toHaveText does a substring match.

Run, then check your prediction.

Reveal

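(b). The locator finds the listitem (it exists); the assertion fails on the text comparison, and the failure message includes both expected and actual. Because web-first assertions retry, the failure appears only after the assertion timeout (5 seconds by default) has elapsed. (d) fails twice over: toHaveText with a string compares the element’s full text, and “Bread” is not a substring of “Milk” in any case. Building the habit of predicting the failure message shape is the difference between debugging by reading and debugging by guessing.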
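Option (d) hinges on how a string expectation is compared. The sketch below is a simplified model, not Playwright's source: `hasText` mimics the documented full-text behavior of toHaveText with a string, and `containsText` mimics the substring behavior of toContainText. Both helper names are hypothetical.

```javascript
// Simplified model of string matching, NOT Playwright's source code.
// Collapse runs of whitespace and trim before comparing.
function normalizeWhitespace(text) {
  return text.replace(/\s+/g, ' ').trim();
}

// Models toHaveText with a string: the whole text must match.
function hasText(actual, expected) {
  return normalizeWhitespace(actual) === normalizeWhitespace(expected);
}

// Models toContainText with a string: a substring is enough.
function containsText(actual, expected) {
  return normalizeWhitespace(actual).includes(normalizeWhitespace(expected));
}

console.log(hasText('Milk', 'Bread'));          // false
console.log(hasText('  Milk  ', 'Milk'));       // true
console.log(hasText('Oat Milk', 'Milk'));       // false
console.log(containsText('Oat Milk', 'Milk'));  // true
```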

📝 House rule (carry it forward)

A Playwright test reads navigate → interact → assert. The test title is the spec — what user-visible promise we’re proving — not a description of clicks.

Starter files
src/App.jsx
function App() {
  const [items, setItems] = React.useState([]);
  const [text, setText] = React.useState('');

  function addTodo() {
    const trimmed = text.trim();
    if (!trimmed) return;

    setItems([...items, trimmed]);
    setText('');
  }

  return (
    <main className="todo-shell">
      <section className="todo-panel">
        <p className="eyebrow">Playwright tutorial</p>
        <h1>Todo Lab</h1>

        <div className="todo-form">
          <label htmlFor="todo-input">Todo item</label>
          <div className="todo-row">
            <input
              id="todo-input"
              value={text}
              onChange={(event) => setText(event.target.value)}
              placeholder="Buy milk"
            />
            <button onClick={addTodo}>Add todo</button>
          </div>
        </div>

        <ul aria-label="Todo list" className="todo-list">
          {items.map((item, index) => (
            <li key={index}>{item}</li>
          ))}
        </ul>
      </section>
    </main>
  );
}
src/main.jsx
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(<App />);
src/styles.css
body {
  margin: 0;
  font-family: system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif;
  background: #f6f7fb;
  color: #1f2937;
}

.todo-shell {
  min-height: 100vh;
  display: grid;
  place-items: center;
  padding: 32px;
}

.todo-panel {
  width: min(100%, 560px);
  background: white;
  border: 1px solid #d9dee8;
  border-radius: 8px;
  padding: 28px;
  box-shadow: 0 18px 40px rgba(31, 41, 55, 0.08);
}

.eyebrow {
  margin: 0 0 8px;
  color: #4b5563;
  font-size: 0.85rem;
  font-weight: 700;
  text-transform: uppercase;
  letter-spacing: 0.04em;
}

h1 { margin: 0 0 24px; font-size: 2rem; }
label { display: block; margin-bottom: 8px; font-weight: 700; }
.todo-row { display: flex; gap: 10px; }

input {
  flex: 1;
  min-width: 0;
  background: white;
  color: #1f2937;
  border: 1px solid #b8c0cc;
  border-radius: 6px;
  padding: 10px 12px;
  font: inherit;
}

button {
  border: 0;
  border-radius: 6px;
  padding: 10px 14px;
  background: #2563eb;
  color: white;
  font: inherit;
  font-weight: 700;
  cursor: pointer;
}

.todo-list { margin: 24px 0 0; padding-left: 24px; }
.todo-list:empty { display: none; }
.todo-list li { margin: 8px 0; }

/* Dark mode — the iframe inherits the host page's theme via
   [data-bs-theme="dark"] on <html>. Mirror the site's dark palette
   so the Todo app preview stays legible when students switch themes. */
[data-bs-theme="dark"] body { background: #1c2533; color: #e6edf3; }
[data-bs-theme="dark"] .todo-panel {
  background: #232a36;
  border-color: #2a323e;
  box-shadow: 0 18px 40px rgba(0, 0, 0, 0.4);
}
[data-bs-theme="dark"] .eyebrow { color: #9ca3af; }
[data-bs-theme="dark"] input {
  background: #2a323e;
  color: #e6edf3;
  border-color: #3a4351;
}
[data-bs-theme="dark"] input::placeholder { color: #6b7280; }
[data-bs-theme="dark"] button { background: #2563eb; }
tests/todo.spec.js
import { test, expect } from '@playwright/test';

test('user can add a todo', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');
  await page.getByRole('button', { name: /add todo/i }).click();
  await expect(page.getByRole('listitem')).toHaveText('Milk');
});
2

The Spec Card: Choosing What User Paths Deserve a Test

Why this matters

The hardest part of e2e testing isn’t writing the test — it’s deciding which tests to write. Without a deliberate selection method, you end up testing whatever came to mind first, missing the partitions that actually catch bugs. The Spec Card is the artifact that forces the question what about this feature is the stable contract? before you commit code that pins the wrong thing.

🎯 You will learn to

  • Apply input-space partitioning from Testing Foundations to user-path partitioning in e2e
  • Create a Spec Card that names a feature’s stable contract before writing the test
  • Evaluate which user paths deserve an e2e test versus a lower test layer

🧠 Quick recall — commit before reading on

Q. Why does Playwright need await in front of expect(locator).toBeVisible()?

  • (a) JavaScript requires await on every line in async functions.
  • (b) Web-first assertions auto-wait and retry; without await, the test moves on before the assertion has settled.
  • (c) await makes the test go faster.
  • (d) Without await, the test won’t compile.
Reveal

(b). The matcher returns a Promise that settles once the condition holds or the timeout expires. Drop the await and the test doesn’t wait for that Promise: JavaScript moves on, and the test can finish before the assertion has passed or failed. That is silent flakiness, the worst kind of failure.

From foundations partitions to user-path partitions

In Testing Foundations, you partitioned the input space of a function and picked one representative input per partition. In e2e, you partition the user-path space — the different user behaviors a feature has to support — and pick one representative test per partition.

Same discipline. Different domain.

📋 Introducing the Spec Card

Before you write an e2e test, write down the spec it’s verifying. Five fields, fits on screen:

Spec Card: User can add a todo

✓ Behavior:        User types a name, clicks Add, sees it in the list.
✓ Should pass when: CSS classes change. The Add button is restyled.
                    The input becomes a `<textarea>`. The list becomes
                    a table.
✗ Should fail when: Adding silently drops items. Empty inputs are
                    accepted. The input doesn't clear after add.
🎯 Locator contract: A textbox labeled "Todo item"; a button named
                    "Add todo"; a list of items.
✅ Oracle:          The new item is visible in the list.

The Spec Card is the artifact you carry through the rest of the tutorial. It forces the question what about this UI is the stable contract? before you write code that can pin the wrong thing.
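If it helps to treat the card as data, the five fields could be sketched as a plain object with a completeness check. Everything here is an assumption for illustration: the field names, the `missingFields` helper, and the idea of checking the card in code are not part of the tutorial's tooling.

```javascript
// Hypothetical shape: the five Spec Card fields as keys of a plain object.
const REQUIRED_FIELDS = ['behavior', 'shouldPassWhen', 'shouldFailWhen', 'locatorContract', 'oracle'];

// Return the names of fields that are missing or blank.
function missingFields(card) {
  return REQUIRED_FIELDS.filter((field) => {
    const value = card[field];
    return typeof value !== 'string' || value.trim() === '';
  });
}

// The example card from above, with the oracle deliberately left blank.
const addTodoCard = {
  behavior: 'User types a name, clicks Add, sees it in the list.',
  shouldPassWhen: 'CSS classes change. The Add button is restyled.',
  shouldFailWhen: 'Adding silently drops items. Empty inputs are accepted.',
  locatorContract: 'A textbox labeled "Todo item"; a button named "Add todo"; a list of items.',
  oracle: '',
};

console.log(missingFields(addTodoCard)); // [ 'oracle' ]
```

A card with a blank oracle is a hint that the test has no strong assertion yet.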

Notice the “Should pass when” line: it lists implementation changes that should not break the test. That’s your defense against brittleness later.

✏️ Fill in your own Spec Card — pick one of two ways

Two equally good options. Pick whichever fits how you think:

  1. In-editor template — Open notes/spec-card.md in the file tree on the left. It’s a fillable Markdown template (auto-saved alongside your code). Fill it in for the whitespace-only input test you’re about to write below.
  2. Standalone tool — Open the Spec Card tool in a new tab. Same five fields, but as a structured form with auto-save, Export-as-Markdown, and Copy-to-clipboard. The tool persists across tutorials so you can build a portfolio of Spec Cards as you write tests at school and at work.

Either way, fill the card in before you touch the test code below. The whole point of the Spec Card is that the decisions get made upstream of typing.

🎬 Predict — which user-path partitions are missing?

Three tests are pre-written in tests/add-todo.spec.js. They cover:

  1. Happy path: "Milk" is accepted.
  2. Empty input: "" is rejected.
  3. Very long input: a 200-character string is accepted.

Read the spec under App.jsx: the app trims input before deciding. Which partition is missing from the tests?

(In your head, before reading on…)

Reveal

The missing partition is whitespace-only input (" "). After trimming, it equals "", so the spec says it should be rejected. From the partition perspective it behaves exactly like the empty-string case, but with a different surface input.

▶ Run

Click Test. The three pre-written tests pass; the fourth is a stub marked // TODO that you’ll fill in next.

✏️ Modify — write the missing partition test

In tests/add-todo.spec.js, find the whitespace-only input is rejected test. The Arrange / Act / Assert comments are placeholders — fill them in, following the pattern of the three tests above.

Hints will appear on test failure — work through them in layers if you get stuck.

🔍 Investigate

You now have four tests for one feature, each covering a different partition. Why not write a test for every possible input?

The foundations answer applies: representative coverage with low cost. We don’t need separate tests for " ", "   ", "\t", "\n", and so on. They’re all in the same partition (whitespace-only), and the trimming logic processes them identically. One representative test per partition is enough.
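The partition argument can be checked directly against the app's rule. The sketch below mirrors the trim-then-reject logic in addTodo(); the partition labels and the `partitionOf` helper are hypothetical, and input length is not modeled because the app has no length rule.

```javascript
// Mirrors the app's rule from addTodo(): trim first, then reject empty text.
function isAccepted(text) {
  return text.trim() !== '';
}

// Hypothetical partition labels for the add-todo input space.
function partitionOf(text) {
  if (text === '') return 'empty';
  if (text.trim() === '') return 'whitespace-only';
  return 'non-empty';
}

// Several surface inputs; the whitespace variants should share one partition.
const samples = ['', ' ', '   ', '\t', '\n', 'Milk'];
for (const sample of samples) {
  console.log(JSON.stringify(sample), partitionOf(sample), isAccepted(sample));
}
```

Every whitespace variant lands in the same partition with the same outcome, so one representative is enough for the test suite.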

📝 House rules added

  • Use partitions to choose user paths. You don’t need a test for every string. You need one test per behaviorally-distinct partition.
  • Not every test belongs in e2e. Many edge cases live more cheaply in unit tests. Reserve e2e tests for behaviors that need full-stack browser confidence.
Starter files
src/App.jsx
function App() {
  const [items, setItems] = React.useState([]);
  const [text, setText] = React.useState('');

  function addTodo() {
    const trimmed = text.trim();
    if (!trimmed) return;

    setItems([...items, trimmed]);
    setText('');
  }

  return (
    <main className="todo-shell">
      <section className="todo-panel">
        <p className="eyebrow">Playwright tutorial</p>
        <h1>Todo Lab</h1>

        <div className="todo-form">
          <label htmlFor="todo-input">Todo item</label>
          <div className="todo-row">
            <input
              id="todo-input"
              value={text}
              onChange={(event) => setText(event.target.value)}
              placeholder="Buy milk"
            />
            <button onClick={addTodo}>Add todo</button>
          </div>
        </div>

        <ul aria-label="Todo list" className="todo-list">
          {items.map((item, index) => (
            <li key={index}>{item}</li>
          ))}
        </ul>
      </section>
    </main>
  );
}
src/main.jsx
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(<App />);
src/styles.css
body { margin: 0; font-family: system-ui, -apple-system, sans-serif; background: #f6f7fb; color: #1f2937; }
.todo-shell { min-height: 100vh; display: grid; place-items: center; padding: 32px; }
.todo-panel { width: min(100%, 560px); background: white; border: 1px solid #d9dee8; border-radius: 8px; padding: 28px; box-shadow: 0 18px 40px rgba(31, 41, 55, 0.08); }
.eyebrow { margin: 0 0 8px; color: #4b5563; font-size: 0.85rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.04em; }
h1 { margin: 0 0 24px; font-size: 2rem; }
label { display: block; margin-bottom: 8px; font-weight: 700; }
.todo-row { display: flex; gap: 10px; }
input { flex: 1; min-width: 0; background: white; color: #1f2937; border: 1px solid #b8c0cc; border-radius: 6px; padding: 10px 12px; font: inherit; }
button { border: 0; border-radius: 6px; padding: 10px 14px; background: #2563eb; color: white; font: inherit; font-weight: 700; cursor: pointer; }
.todo-list { margin: 24px 0 0; padding-left: 24px; }
.todo-list:empty { display: none; }
.todo-list li { margin: 8px 0; }
/* Dark mode (iframe sets [data-bs-theme="dark"] on <html>) */
[data-bs-theme="dark"] body { background: #1c2533; color: #e6edf3; }
[data-bs-theme="dark"] .todo-panel { background: #232a36; border-color: #2a323e; box-shadow: 0 18px 40px rgba(0, 0, 0, 0.4); }
[data-bs-theme="dark"] .eyebrow { color: #9ca3af; }
[data-bs-theme="dark"] input { background: #2a323e; color: #e6edf3; border-color: #3a4351; }
[data-bs-theme="dark"] input::placeholder { color: #6b7280; }
[data-bs-theme="dark"] button { background: #2563eb; }
tests/add-todo.spec.js
import { test, expect } from '@playwright/test';

test('user can add a todo (happy path)', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');
  await page.getByRole('button', { name: /add todo/i }).click();
  await expect(page.getByRole('listitem')).toHaveText('Milk');
});

test('empty input is rejected', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('button', { name: /add todo/i }).click();
  await expect(page.getByRole('listitem')).toHaveCount(0);
});

test('very long todo is accepted', async ({ page }) => {
  await page.goto('/');
  const long = 'x'.repeat(200);
  await page.getByRole('textbox', { name: /todo item/i }).fill(long);
  await page.getByRole('button', { name: /add todo/i }).click();
  await expect(page.getByRole('listitem')).toHaveText(long);
});

// TODO: write the missing partition test here.
// The spec trims input before deciding whether to accept it,
// so whitespace-only input is in the same partition as empty input.
test('whitespace-only input is rejected', async ({ page }) => {
  // Arrange: navigate to the page.
  // Act: fill the input with whitespace, click Add todo.
  // Assert: no list item was added.
});
notes/spec-card.md
# Spec Card: User can add a todo (whitespace-only rejected)

Fill this in BEFORE writing the test. The decisions made here
determine which assertions and locators you'll commit to below.

## ✓ Behavior
<!-- One sentence: what user-visible behavior are you proving? -->


## ✓ Should pass when
<!-- Implementation changes the test must SURVIVE.
     Examples: CSS class renames, button restyles, layout shifts. -->


## ✗ Should fail when
<!-- Regressions the test must CATCH.
     Examples: whitespace input is accepted, the input doesn't
     clear after submit, the list silently drops items. -->


## 🎯 Locator contract
<!-- Which semantic queries identify each element?
     Prefer role + accessible name, label, or semantic test ID.
     Avoid CSS classes and DOM positions. -->


## ✅ Oracle
<!-- Observable outcome that confirms success.
     What would the user see? -->


---
Prefer a structured form? Open the standalone Spec Card tool at
/SEBook/tools/spec-card (auto-saves, exports as Markdown).
3

The Locator Ladder: Stable Contracts vs Incidental UI

Why this matters

The locator you choose is the contract between your test and the UI — it decides which UI changes will (correctly) break the test and which will (incorrectly) break it. Pick the wrong rung of the ladder and your test either fails on every CSS rename (false alarms that erode trust) or stays green when accessibility regresses (silent failures). The locator ladder is how you make that choice deliberately, not by accident.

🎯 You will learn to

  • Analyze five locator strategies and identify what each one depends on (semantics vs implementation)
  • Apply the locator ladder to choose the highest rung the UI actually supports
  • Evaluate locator durability against three classes of refactor (CSS rename, text change, DOM restructure)

🧠 Quick recall — commit before reading on

Q. From your Spec Card in Step 2, what does the “Locator contract” field name?

  • (a) The exact CSS selectors the test should use.
  • (b) The semantic queries (role + accessible name, label, test ID) that identify each element the test interacts with — the stable part of the UI surface.
  • (c) The list of test cases the test should cover.
  • (d) The CI pipeline that runs the test.
Reveal

(b). “Locator contract” names what about each element is stable — the role and accessible name, the label association, or the semantic test ID. CSS selectors (a) are the brittle rung. Test cases (c) belong in the test code, not the Spec Card.

🎯 The locator ladder

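There are five common locator strategies in Playwright. Four of the examples below target the same “Add todo” button; Rung 2 targets the input, because labels apply to form controls. Each rung depends on something different about the UI.

// Five locator strategies, from most to least preferred: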

// Rung 1 — Role + accessible name. Mirrors how assistive tech finds it.
page.getByRole('button', { name: /add todo/i });

// Rung 2 — Label association (best for form controls).
page.getByLabel(/todo item/i);   // (this would find the input, not the button)

// Rung 3 — Visible text content.
page.getByText('Add todo');

// Rung 4 — Author-supplied stable test ID.
page.getByTestId('add-todo');

// Rung 5 — Raw CSS/DOM selector (last resort).
page.locator('.add-todo-btn');

What each rung depends on:

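Rung   Locator                   Depends on
----------------------------------------------------------------------------------------------------------------
1      getByRole + name:         The button has an accessible name (HTML semantics)
2      getByLabel                A <label for="…"> connection (forms)
3      getByText                 The visible text (a string matches as a case-insensitive substring unless exact: true)
4      getByTestId               An author-added data-testid attribute
5      .locator('.css-class')    The DOM/CSS structure (implementation detail)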

Higher rungs depend on accessible / user-visible facts. Lower rungs depend on implementation decisions (CSS classes, DOM positions). The official Playwright docs put it bluntly: “Your DOM can easily change … Prefer user-facing attributes to XPath or CSS selectors.”

🎬 Predict — commit to a letter, then click reveal

The team is about to ship three independent changes to the Add button: a CSS-class rename (.add-todo-btn → .primary-btn), a button-text change ("Add todo" → "Add"), and a DOM restructure (the button moves into a different parent element). The user-visible behavior (clicking it adds a todo) doesn’t change.

Q. Which two locators would survive all three changes without a single edit? Assume Rung 1 is written with the looser regex /add/i, as shown in option (a).

  • (a) Rungs 1 and 4 — getByRole('button', { name: /add/i }) and getByTestId('add-todo').
  • (b) Rungs 1 and 3 — both query user-visible text in some form.
  • (c) Rungs 2 and 5 — both target form-control specifics.
  • (d) None — every locator breaks on at least one change.
Reveal — pick first, then click

(a). getByRole('button', { name: /add/i }) survives all three: regex tolerance covers the text change (“Add” still matches /add/i), and the role-based query is independent of CSS classes and DOM ancestry. Note the looser regex: the /add todo/i form from the ladder listing would not match “Add”. getByTestId('add-todo') survives because the data-testid is author-controlled and travels with the element wherever it moves. Of the others, Rung 3 breaks on the text change and Rung 5 on the class rename; Rung 2 targets the input, not the button. The investigate-table below shows the per-cell answer if you want the full breakdown, but the lesson lands in those two rows.
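The reveal can be replayed as a small simulation. This is a toy model, not Playwright: the button is a plain object, each locator is a predicate, and the new parent name `todo-form` is an assumption made for illustration.

```javascript
// Toy model of the Add button, NOT a real DOM node.
const original = {
  role: 'button',
  name: 'Add todo',
  testId: 'add-todo',
  className: 'add-todo-btn',
  parent: 'todo-row',
};

// The three independent changes from the scenario.
const refactors = {
  cssRename: (el) => ({ ...el, className: 'primary-btn' }),
  textChange: (el) => ({ ...el, name: 'Add' }),
  domRestructure: (el) => ({ ...el, parent: 'todo-form' }),
};

// Each locator is modeled as a predicate over the element.
const locators = {
  'role /add/i': (el) => el.role === 'button' && /add/i.test(el.name),
  'text "Add todo"': (el) => el.name.toLowerCase().includes('add todo'),
  'testId add-todo': (el) => el.testId === 'add-todo',
  'css .add-todo-btn': (el) => el.className.split(' ').includes('add-todo-btn'),
};

// A locator survives if it still matches after every refactor, applied one at a time.
function survivors() {
  return Object.keys(locators).filter((key) =>
    Object.values(refactors).every((change) => locators[key](change(original)))
  );
}

console.log(survivors()); // [ 'role /add/i', 'testId add-todo' ]
```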

▶ Run

Click Test. All five locators currently work against the Todo app — the file tests/locator-ladder.spec.js has one test per rung, all passing.

🔍 Investigate — reveal the answer table

                               CSS rename    Text change    DOM restructure
--------------------------------------------------------------------------
1. getByRole({name:/add/i})    ✓             ✓ (a)          ✓
2. getByLabel                  ✓             ✓ (b)          ✓
3. getByText('Add todo')       ✓             ✗              ✓
4. getByTestId('add-todo')     ✓             ✓              ✓
5. .locator('.add-todo-btn')   ✗             ✓              ✓ (c)

Notes:

  • (a) With the regex /add/i, the role locator survives “Add todo” → “Add” (the regex still matches). With /add todo/i, or with the string name: 'Add todo', it would break. Regex tolerance is a deliberate design choice.
  • (b) getByLabel finds form controls via their <label>, so it targets the input rather than the button. It passes trivially here because none of the three changes touch the input. Listed for completeness.
  • (c) The bare class selector survives a move on its own, but CSS selectors that encode ancestry (for example .todo-row > .add-todo-btn) break when the surrounding markup changes. Either way the rung is brittle: the class rename alone is enough to break it.

The pattern: getByTestId is the only button locator that survives a text change regardless of the new wording; getByRole survives only while its regex still matches the new name. But getByTestId requires the author to have added the test ID, which is a code-level decision. And test IDs done badly (<button data-testid="blue-btn-right-col">) are just CSS coupling under another name.

✏️ Modify

Open tests/locator-ladder.spec.js. The fifth test uses the brittle .locator('.add-todo-btn') form. Rewrite it as a role-based locator (Rung 1). Run again — your refactored test should still pass.

📝 House rule

Pick the locator that matches the stable contract of this UI element. If the button label is part of the user-visible promise, use getByRole with a sensible regex. If the wording will change but the action is permanent, use getByTestId with a semantically named test ID. Use raw CSS only when nothing else will do — and write a comment explaining why.
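One way to read the house rule is as a small decision procedure. The sketch below is hypothetical throughout: the `chooseLocator` function and the fields `accessibleName`, `wordingIsStable`, and `testId` are invented for illustration and encode only the rule as stated above.

```javascript
// Hypothetical description of an element; every field name is made up for illustration.
function chooseLocator(element) {
  const hasRoleAndName = Boolean(element.role && element.accessibleName);
  // Wording is part of the promise: query the way a user would find it.
  if (hasRoleAndName && element.wordingIsStable) return 'getByRole';
  // Wording may change but the action is permanent: use a semantic test ID.
  if (element.testId) return 'getByTestId';
  // No test ID available: role + name is still better than CSS.
  if (hasRoleAndName) return 'getByRole';
  // Last resort; leave a comment in the test explaining why.
  return 'locator(css)';
}

const choices = [
  chooseLocator({ role: 'button', accessibleName: 'Add todo', wordingIsStable: true, testId: 'add-todo' }),
  chooseLocator({ role: 'button', accessibleName: 'Add todo', wordingIsStable: false, testId: 'add-todo' }),
  chooseLocator({}),
];
console.log(choices); // [ 'getByRole', 'getByTestId', 'locator(css)' ]
```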

Starter files
src/App.jsx
function App() {
  const [items, setItems] = React.useState([]);
  const [text, setText] = React.useState('');

  function addTodo() {
    const trimmed = text.trim();
    if (!trimmed) return;

    setItems([...items, trimmed]);
    setText('');
  }

  return (
    <main className="todo-shell">
      <section className="todo-panel">
        <p className="eyebrow">Playwright tutorial</p>
        <h1>Todo Lab</h1>

        <div className="todo-form">
          <label htmlFor="todo-input">Todo item</label>
          <div className="todo-row">
            <input
              id="todo-input"
              value={text}
              onChange={(event) => setText(event.target.value)}
              placeholder="Buy milk"
            />
            <button
              className="add-todo-btn"
              data-testid="add-todo"
              onClick={addTodo}
            >
              Add todo
            </button>
          </div>
        </div>

        <ul aria-label="Todo list" className="todo-list">
          {items.map((item, index) => (
            <li key={index}>{item}</li>
          ))}
        </ul>
      </section>
    </main>
  );
}
src/main.jsx
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(<App />);
src/styles.css
body { margin: 0; font-family: system-ui, -apple-system, sans-serif; background: #f6f7fb; color: #1f2937; }
.todo-shell { min-height: 100vh; display: grid; place-items: center; padding: 32px; }
.todo-panel { width: min(100%, 560px); background: white; border: 1px solid #d9dee8; border-radius: 8px; padding: 28px; box-shadow: 0 18px 40px rgba(31, 41, 55, 0.08); }
.eyebrow { margin: 0 0 8px; color: #4b5563; font-size: 0.85rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.04em; }
h1 { margin: 0 0 24px; font-size: 2rem; }
label { display: block; margin-bottom: 8px; font-weight: 700; }
.todo-row { display: flex; gap: 10px; }
input { flex: 1; min-width: 0; background: white; color: #1f2937; border: 1px solid #b8c0cc; border-radius: 6px; padding: 10px 12px; font: inherit; }
.add-todo-btn,
button { border: 0; border-radius: 6px; padding: 10px 14px; background: #2563eb; color: white; font: inherit; font-weight: 700; cursor: pointer; }
.todo-list { margin: 24px 0 0; padding-left: 24px; }
.todo-list:empty { display: none; }
.todo-list li { margin: 8px 0; }
/* Dark mode */
[data-bs-theme="dark"] body { background: #1c2533; color: #e6edf3; }
[data-bs-theme="dark"] .todo-panel { background: #232a36; border-color: #2a323e; box-shadow: 0 18px 40px rgba(0, 0, 0, 0.4); }
[data-bs-theme="dark"] .eyebrow { color: #9ca3af; }
[data-bs-theme="dark"] input { background: #2a323e; color: #e6edf3; border-color: #3a4351; }
[data-bs-theme="dark"] input::placeholder { color: #6b7280; }
[data-bs-theme="dark"] .add-todo-btn,
[data-bs-theme="dark"] button { background: #2563eb; }
tests/locator-ladder.spec.js
import { test, expect } from '@playwright/test';

// Rung 1 — Role + accessible name (regex-tolerant).
test('rung 1: getByRole finds the Add todo button', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');
  await page.getByRole('button', { name: /add todo/i }).click();
  await expect(page.getByRole('listitem')).toHaveText('Milk');
});

// Rung 2: getByLabel (best for form inputs; finds the input through its <label>).
test('rung 2: getByLabel finds the input via its label', async ({ page }) => {
  await page.goto('/');
  await page.getByLabel(/todo item/i).fill('Bread');
  await page.getByRole('button', { name: /add todo/i }).click();
  await expect(page.getByRole('listitem')).toHaveText('Bread');
});

// Rung 3 — getByText (couples to exact wording).
test('rung 3: getByText finds the button by visible text', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Eggs');
  await page.getByText('Add todo').click();
  await expect(page.getByRole('listitem')).toHaveText('Eggs');
});

// Rung 4 — getByTestId (semantic test ID).
test('rung 4: getByTestId finds the button via data-testid', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Cheese');
  await page.getByTestId('add-todo').click();
  await expect(page.getByRole('listitem')).toHaveText('Cheese');
});

// Rung 5 — Raw CSS class (the brittle rung — REWRITE this one!).
// TODO: rewrite this test to use page.getByRole instead of CSS.
test('rung 5: brittle CSS locator (rewrite me)', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Butter');
  await page.locator('.add-todo-btn').click();
  await expect(page.getByRole('listitem')).toHaveText('Butter');
});
4

Strong Assertions: The Liar Test in the Browser

Why this matters

A green test you can’t trust is worse than no test at all — it gives false confidence while the bug ships. Liar tests are the most dangerous failure mode in an e2e suite because the test visibly clicks buttons, which makes it feel like real verification. This step makes that lie tactile: you’ll watch a buggy app pass a weak assertion, then strengthen it until it tells the truth.

🎯 You will learn to

  • Analyze a passing Playwright test and recognize when its oracle is too weak to catch the spec violation
  • Apply web-first assertions (await expect(...)) instead of the one-shot, non-retrying expect(await locator.isVisible()).toBe(true) anti-pattern
  • Evaluate three weak assertion patterns and rewrite them to verify the user-visible promise

🧠 Quick recall — commit before reading on

Q. From Testing Foundations: a liar test has a PASS result that doesn’t prove the spec. What’s the defining feature?

  • (a) The test runs slowly and times out before completing.
  • (b) The test’s oracle is too weak — the assertion is true for both a correct implementation and a buggy one.
  • (c) The test only runs on some platforms.
  • (d) The test asserts on the wrong element entirely.
Reveal

(b). A liar test passes against a correct implementation and against a broken one — the assertion can’t distinguish them. The same pattern exists in e2e, and it’s sneakier here because the test visibly clicks buttons, which makes it feel “more real” than it is.

🎬 Predict — commit to a letter, then click reveal

Read this test. The Todo app you’ll run it against has a bug somewhere in addTodo — predict-and-investigate, don’t peek at the source first.

test('adding a todo shows it in the list', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');
  await page.getByRole('button', { name: /add todo/i }).click();
  await expect(page.getByRole('listitem')).toHaveCount(1);
});

Q. Against a buggy app where addTodo somehow drops the user’s text, what does this test do?

  • (a) Fail — Playwright detects the empty list item and raises.
  • (b) Pass: toHaveCount(1) only counts list items; it never reads their text.
  • (c) Error: toHaveCount requires non-empty content.
  • (d) Flaky — sometimes passes, sometimes fails depending on render order.
Reveal — pick first, then click

(b). The assertion only counts. It says nothing about what’s inside the items. The test will be a liar: green check, broken feature.

▶ Run

Click Test.

The test passes. Surprise.

🔍 Investigate — open src/App.jsx and find the bug

Now (and only now) open src/App.jsx. The bug: addTodo stores '' instead of trimmed, so the user’s text is discarded at the state update and every <li> renders empty.

What did toHaveCount(1) actually verify? Just that one list item exists. It said nothing about what’s inside the item. The bug — empty text — is invisible to this assertion.

The assertion is a liar: PASS result, broken feature.
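
To make the "passes for both" property concrete, here is a minimal plain-Node sketch. It does not use Playwright or a browser: the two addTodo functions and the two oracle functions are hypothetical stand-ins for the app and for toHaveCount(1) / toHaveText('Milk'), written only to show how the verdicts line up.

```javascript
// Plain-Node model of the liar test. No browser or Playwright involved.
// Each "app" is a pure function: (items, text) -> new items.
function addTodoCorrect(items, text) {
  const trimmed = text.trim();
  return trimmed ? [...items, trimmed] : items;
}

// Mirrors the buggy starter app: the guard uses trimmed, but '' is stored.
function addTodoBuggy(items, text) {
  const trimmed = text.trim();
  return trimmed ? [...items, ''] : items;
}

// Weak oracle: the rough equivalent of toHaveCount(1).
const weakOracle = (items) => items.length === 1;

// Strong oracle: also reads the text, roughly like toHaveText('Milk').
const strongOracle = (items) => items.length === 1 && items[0] === 'Milk';

const verdicts = {};
const apps = { correct: addTodoCorrect, buggy: addTodoBuggy };
for (const [name, addTodo] of Object.entries(apps)) {
  const items = addTodo([], 'Milk');
  verdicts[name] = { weak: weakOracle(items), strong: strongOracle(items) };
}

console.log(verdicts);
```

The weak oracle returns true for both implementations, so its verdict carries no information about the bug. Only the strong oracle gives different answers for the correct and the buggy app.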

Three weak assertion patterns to recognize

Weak assertion                                             Why it lies
───────────────────────────────────────────────────────    ────────────────────────────────
await expect(page.getByRole('list')).toBeVisible()         An empty <ul> is still “visible”
await expect(page.getByText('')).toBeVisible()             Always true
await expect(page.getByRole('listitem')).toHaveCount(1)    Doesn’t verify item content

And one Playwright-specific anti-pattern from the official docs:

// ❌ Anti-pattern — non-retrying, no auto-wait:
expect(await page.getByText('Milk').isVisible()).toBe(true);

// ✓ Web-first form — auto-waits and retries:
await expect(page.getByText('Milk')).toBeVisible();
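
The difference between the two forms can be illustrated without a browser. The sketch below is a synchronous simulation under stated assumptions: makeFakePage, checkOnce, and checkWithRetry are hypothetical helpers, and "ticks" stand in for the time a real UI needs to render. Real Playwright retries on a timer until a timeout rather than counting attempts.

```javascript
// A fake "page" whose item becomes visible only after a few render ticks,
// standing in for a UI that updates shortly after a click.
function makeFakePage(ticksUntilVisible) {
  let tick = 0;
  return {
    advance() { tick += 1; },
    isVisible() { return tick >= ticksUntilVisible; },
  };
}

// One-shot check: reads the state exactly once,
// like expect(await locator.isVisible()).toBe(true).
function checkOnce(page) {
  return page.isVisible();
}

// Retrying check: re-reads the state until it holds or the budget runs out,
// which is the idea behind web-first assertions.
function checkWithRetry(page, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    if (page.isVisible()) return true;
    page.advance(); // stand-in for "wait a little, then look again"
  }
  return false;
}

const oneShotResult = checkOnce(makeFakePage(3));
const retryResult = checkWithRetry(makeFakePage(3), 10);
const neverVisibleResult = checkWithRetry(makeFakePage(Infinity), 10);

console.log({ oneShotResult, retryResult, neverVisibleResult });
```

The one-shot check fails on a page that would have been correct a moment later, which is how this anti-pattern produces flaky tests. The retrying check still fails when the element never appears, so retrying does not weaken the assertion.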

✏️ Modify

In tests/todo.spec.js, strengthen the assertion to verify the item’s text, not just the count. Predict the new failure message before re-running.

Hints will appear on test failure — work through them in layers if you get stuck.

📝 House rule

Assert the promise, not the plumbing.

The promise is what the spec said the user would see. The plumbing is which DOM nodes exist, what CSS class they have, what their internal state is. A strong assertion verifies the promise; a weak assertion verifies the plumbing without verifying what the user actually gets.

Starter files
src/App.jsx
// 🐛 BUGGY APP — there's a bug somewhere in addTodo that makes the
// weak assertion lie. Predict + run the test BEFORE you hunt for it
// in the source. The Investigate phase reveals where the bug lives
// (and why the count assertion missed it).
function App() {
  const [items, setItems] = React.useState([]);
  const [text, setText] = React.useState('');

  function addTodo() {
    const trimmed = text.trim();
    if (!trimmed) return;
    setItems([...items, '']);
    setText('');
  }

  return (
    <main className="todo-shell">
      <section className="todo-panel">
        <p className="eyebrow">Buggy Todo Lab</p>
        <h1>Todo Lab</h1>

        <div className="todo-form">
          <label htmlFor="todo-input">Todo item</label>
          <div className="todo-row">
            <input
              id="todo-input"
              value={text}
              onChange={(event) => setText(event.target.value)}
              placeholder="Buy milk"
            />
            <button onClick={addTodo}>Add todo</button>
          </div>
        </div>

        <ul aria-label="Todo list" className="todo-list">
          {items.map((item, index) => (
            <li key={index}>{item}</li>
          ))}
        </ul>
      </section>
    </main>
  );
}
src/main.jsx
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(<App />);
src/styles.css
body { margin: 0; font-family: system-ui, -apple-system, sans-serif; background: #f6f7fb; color: #1f2937; }
.todo-shell { min-height: 100vh; display: grid; place-items: center; padding: 32px; }
.todo-panel { width: min(100%, 560px); background: white; border: 1px solid #d9dee8; border-radius: 8px; padding: 28px; box-shadow: 0 18px 40px rgba(31, 41, 55, 0.08); }
.eyebrow { margin: 0 0 8px; color: #4b5563; font-size: 0.85rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.04em; }
h1 { margin: 0 0 24px; font-size: 2rem; }
label { display: block; margin-bottom: 8px; font-weight: 700; }
.todo-row { display: flex; gap: 10px; }
input { flex: 1; min-width: 0; background: white; color: #1f2937; border: 1px solid #b8c0cc; border-radius: 6px; padding: 10px 12px; font: inherit; }
button { border: 0; border-radius: 6px; padding: 10px 14px; background: #2563eb; color: white; font: inherit; font-weight: 700; cursor: pointer; }
.todo-list { margin: 24px 0 0; padding-left: 24px; min-height: 24px; }
.todo-list li { margin: 8px 0; min-height: 1.2em; }
/* Dark mode */
[data-bs-theme="dark"] body { background: #1c2533; color: #e6edf3; }
[data-bs-theme="dark"] .todo-panel { background: #232a36; border-color: #2a323e; box-shadow: 0 18px 40px rgba(0, 0, 0, 0.4); }
[data-bs-theme="dark"] .eyebrow { color: #9ca3af; }
[data-bs-theme="dark"] input { background: #2a323e; color: #e6edf3; border-color: #3a4351; }
[data-bs-theme="dark"] input::placeholder { color: #6b7280; }
[data-bs-theme="dark"] button { background: #2563eb; }
tests/todo.spec.js
import { test, expect } from '@playwright/test';

// The weak assertion below passes against the buggy app.
// Strengthen it so the test fails — that's the bug-catching version.
test('adding a todo shows it in the list', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');
  await page.getByRole('button', { name: /add todo/i }).click();

  // ❌ Weak assertion: only checks the count.
  await expect(page.getByRole('listitem')).toHaveCount(1);

  // TODO: replace or extend the assertion above so the test
  // catches the empty-text bug. Hint: assert the item's text.
});
5

Behavior, Not Implementation: The Brittleness Gauntlet

Why this matters

Every brittle test on a real codebase trains the team to ignore the suite — and once trust is gone, the suite’s value collapses. The fix is not to write more tests; it’s to make sure each test breaks for the right reason. This step makes that distinction tactile by having you edit the app yourself and watch one locator survive a refactor while another shatters.

🎯 You will learn to

  • Analyze a failing test and classify the break as a real regression or a false alarm
  • Apply the locator ladder under pressure: predict which tests survive each refactor before running them
  • Evaluate a brittle locator and rewrite it into one coupled to behavior, not styling

🧠 Quick recall — commit before reading on

Q. From Step 3 — which two locator strategies survive a CSS class rename without modification?

  • (a) getByText and getByLabel
  • (b) getByRole and getByTestId
  • (c) getByPlaceholder and .locator('.css-class')
  • (d) Only getByRole survives — every other rung breaks.
Reveal

(b). Both getByRole and getByTestId query non-CSS properties — the accessibility tree and an author-supplied data attribute, respectively. They survive any change to className. CSS-class locators (.locator('.css-class')) explicitly couple to the class.

Now we’re going to make the brittleness tactile. You’ll edit the app yourself and watch tests break.

Two tests, same behavior, two locator strategies

You have two test files in tests/:

  • tests/css-locator.spec.js — uses page.locator('.add-todo-btn') (Rung 5)
  • tests/role-locator.spec.js — uses page.getByRole('button', { name: /add/i }) (Rung 1)

Both verify the same behavior: clicking Add adds a todo. Both pass against the current App.jsx.

🎬 Predict — Round 1: CSS class rename. Commit to a letter, then click reveal.

Imagine the design team does a styling pass and renames the button’s CSS class:

- <button className="add-todo-btn" onClick={addTodo}>Add todo</button>
+ <button className="primary-btn"  onClick={addTodo}>Add todo</button>

The user-visible behavior is identical — the button still says “Add todo” and still adds a todo.

Q. After the rename, what happens when you re-run both test files?

  • (a) Both pass — the behavior didn’t change, so neither test should break.
  • (b) Both fail — Playwright reloads the file and gets confused by the rename.
  • (c) css-locator fails (false alarm — broke for a styling change), role-locator passes (correctly indifferent to CSS).
  • (d) role-locator fails (real regression — the role changed), css-locator passes.
Reveal — pick first, then make the edit yourself

(c). This is the entire lesson of the gauntlet. The role-based locator queries the accessibility tree (role + accessible name “Add todo”) — both unchanged. The CSS locator queries the class — which IS what changed. The behavior is identical, so the role test correctly stays green; the CSS test fails for a false alarm. You’re about to watch this happen in real time.

✏️ Edit App.jsx (one line)

Open src/App.jsx. Find the line:

<button className="add-todo-btn" onClick={addTodo}>Add todo</button>

Change add-todo-btn to primary-btn. Just that one identifier. Save the file.

▶ Run

Click Test. You will see one ❌ red and one ✓ green — that’s the design of this step. Do not “fix” the red one by reverting the rename; the red is the lesson. If you see two greens, the rename didn’t take effect (recheck App.jsx); if you see two reds, you broke something else (revert other changes and try again).

The gate below specifically asserts that tests/css-locator.spec.js is failing — passing the gate requires the css-locator test to be in its broken state.

🔍 Investigate

Test                          Result       What it tells us
──────────────────────────    ─────────    ──────────────────────────────────────────────
tests/css-locator.spec.js     ❌ Fails     The test was coupled to a styling decision. The
                                           user-facing behavior didn’t change, but the test
                                           broke. This is a false alarm: wasted CI time and
                                           eroded trust in the suite.
tests/role-locator.spec.js    ✓ Passes     The test was coupled to the user-visible role +
                                           name. Styling changed; behavior didn’t; the test
                                           correctly didn’t notice.

The role-based test honors what’s stable about the UI: the button has an accessible name “Add todo.” Styling is incidental. The CSS-based test pinned the incidental thing.
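
The same contrast can be reproduced in a few lines of plain Node. This is a simplified model, not Playwright's implementation: renderPage, findByClass, and findByRole are hypothetical helpers, and the "page" is a flat array of nodes carrying only a role, an accessible name, and a className.

```javascript
// A flat stand-in for the rendered page: each node carries what the
// accessibility tree exposes (role, name) plus its className.
function renderPage(buttonClassName) {
  return [
    { role: 'textbox', name: 'Todo item', className: '' },
    { role: 'button', name: 'Add todo', className: buttonClassName },
  ];
}

// Hypothetical finders, loosely modelled on page.locator('.x')
// and page.getByRole(role, { name }).
const findByClass = (nodes, cls) =>
  nodes.filter((node) => node.className.split(/\s+/).includes(cls));
const findByRole = (nodes, role, nameRe) =>
  nodes.filter((node) => node.role === role && nameRe.test(node.name));

const beforeRename = renderPage('add-todo-btn');
const afterRename = renderPage('primary-btn');

const results = {
  cssBefore: findByClass(beforeRename, 'add-todo-btn').length,
  cssAfter: findByClass(afterRename, 'add-todo-btn').length,
  roleBefore: findByRole(beforeRename, 'button', /add/i).length,
  roleAfter: findByRole(afterRename, 'button', /add/i).length,
};

console.log(results);
```

After the rename the class-based finder matches nothing, while the role-based finder still matches exactly one node, because nothing it reads has changed.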

🔄 Mini-gauntlet, Round 2 (preview)

What if Marketing renames "Add todo" → "Add"? The role-locator’s regex /add/i matches both, so it survives. A name: 'Add todo' (exact) wouldn’t have. Whether that survival is right depends on whether the exact wording is part of the spec, and that ambiguity is exactly the trade-off Step 6 makes explicit.
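
A small sketch of that trade-off, in plain Node. The nameMatches function is a simplified model and an assumption on my part: it treats a string name as a case-insensitive substring unless exact is set, treats a RegExp as-is, and ignores whitespace normalization. The "Add tag" label is a hypothetical second button added only to show over-matching.

```javascript
// Simplified model of comparing a getByRole `name` option with an accessible name.
// Assumption: strings match as case-insensitive substrings unless `exact` is set;
// a RegExp is tested as-is. Whitespace normalization is ignored here.
function nameMatches(accessibleName, name, { exact = false } = {}) {
  if (name instanceof RegExp) return name.test(accessibleName);
  if (exact) return accessibleName === name;
  return accessibleName.toLowerCase().includes(name.toLowerCase());
}

// 'Add tag' is a hypothetical extra button, not part of the Todo app.
const labels = ['Add todo', 'Add', 'Add tag'];

const matchers = {
  regexAdd: (label) => nameMatches(label, /add/i),
  regexAnchored: (label) => nameMatches(label, /^add( todo)?$/i),
  exactAddTodo: (label) => nameMatches(label, 'Add todo', { exact: true }),
};

const matchTable = {};
for (const [matcherName, matches] of Object.entries(matchers)) {
  matchTable[matcherName] = labels.filter(matches);
}

console.log(matchTable);
```

The loose regex survives the rename but would also match an unrelated "Add tag" button. The exact string breaks on the rename. The anchored regex sits in between: it names the wordings the test is meant to tolerate and nothing else.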

📝 House rule

A test that breaks under a refactor it shouldn’t have broken under is brittle. Brittleness is the cost of coupling tests to implementation details. The Spec Card’s “Should pass when” field is your defense — write down the changes the test should survive before you write the test, then make sure your locators honor it.

Starter files
src/App.jsx
// 🛠 Edit this file as instructed: rename the CSS class
// on the Add todo button from "add-todo-btn" to "primary-btn".
function App() {
  const [items, setItems] = React.useState([]);
  const [text, setText] = React.useState('');

  function addTodo() {
    const trimmed = text.trim();
    if (!trimmed) return;

    setItems([...items, trimmed]);
    setText('');
  }

  return (
    <main className="todo-shell">
      <section className="todo-panel">
        <p className="eyebrow">Brittleness gauntlet</p>
        <h1>Todo Lab</h1>

        <div className="todo-form">
          <label htmlFor="todo-input">Todo item</label>
          <div className="todo-row">
            <input
              id="todo-input"
              value={text}
              onChange={(event) => setText(event.target.value)}
              placeholder="Buy milk"
            />
            <button className="add-todo-btn" onClick={addTodo}>
              Add todo
            </button>
          </div>
        </div>

        <ul aria-label="Todo list" className="todo-list">
          {items.map((item, index) => (
            <li key={index}>{item}</li>
          ))}
        </ul>
      </section>
    </main>
  );
}
src/main.jsx
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(<App />);
src/styles.css
body { margin: 0; font-family: system-ui, -apple-system, sans-serif; background: #f6f7fb; color: #1f2937; }
.todo-shell { min-height: 100vh; display: grid; place-items: center; padding: 32px; }
.todo-panel { width: min(100%, 560px); background: white; border: 1px solid #d9dee8; border-radius: 8px; padding: 28px; box-shadow: 0 18px 40px rgba(31, 41, 55, 0.08); }
.eyebrow { margin: 0 0 8px; color: #4b5563; font-size: 0.85rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.04em; }
h1 { margin: 0 0 24px; font-size: 2rem; }
label { display: block; margin-bottom: 8px; font-weight: 700; }
.todo-row { display: flex; gap: 10px; }
input { flex: 1; min-width: 0; background: white; color: #1f2937; border: 1px solid #b8c0cc; border-radius: 6px; padding: 10px 12px; font: inherit; }
.add-todo-btn,
.primary-btn,
button { border: 0; border-radius: 6px; padding: 10px 14px; background: #2563eb; color: white; font: inherit; font-weight: 700; cursor: pointer; }
.todo-list { margin: 24px 0 0; padding-left: 24px; }
.todo-list:empty { display: none; }
.todo-list li { margin: 8px 0; }
/* Dark mode */
[data-bs-theme="dark"] body { background: #1c2533; color: #e6edf3; }
[data-bs-theme="dark"] .todo-panel { background: #232a36; border-color: #2a323e; box-shadow: 0 18px 40px rgba(0, 0, 0, 0.4); }
[data-bs-theme="dark"] .eyebrow { color: #9ca3af; }
[data-bs-theme="dark"] input { background: #2a323e; color: #e6edf3; border-color: #3a4351; }
[data-bs-theme="dark"] input::placeholder { color: #6b7280; }
[data-bs-theme="dark"] .add-todo-btn,
[data-bs-theme="dark"] .primary-btn,
[data-bs-theme="dark"] button { background: #2563eb; }
tests/css-locator.spec.js
import { test, expect } from '@playwright/test';

// CSS-class locator — pins .add-todo-btn (an implementation detail).
test('css-locator: user can add a todo', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');
  await page.locator('.add-todo-btn').click();
  await expect(page.getByRole('listitem')).toHaveText('Milk');
});
tests/role-locator.spec.js
import { test, expect } from '@playwright/test';

// Role-based locator — pins the button's accessible name.
test('role-locator: user can add a todo', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');
  await page.getByRole('button', { name: /add/i }).click();
  await expect(page.getByRole('listitem')).toHaveText('Milk');
});
6

The Maintenance Trade-off: Pin the Spec, No More, No Less

Why this matters

Step 4 said stronger assertions catch more bugs. Step 5 said brittle locators waste team time. Both are true — and they pull in opposite directions. The skill that separates a maintainable suite from a brittle one is knowing how to reconcile them: pin exactly what the spec promises, no more, no less. Get this calibration wrong and you either over-specify (false alarms on every refactor) or under-specify (the count is broken and the test is green).

🎯 You will learn to

  • Apply the principle match assertion specificity to spec specificity to a single-promise feature
  • Analyze a 3 × 2 grid of assertion strength × scenario and predict which results are correct vs misleading
  • Evaluate a goldilocks assertion against brittle and loose alternatives

🧠 Quick recall — commit before reading on

Q. A test fails. Which of these is the false alarm?

  • (a) The behavior under test changed — the user can no longer place an order.
  • (b) The test asserts on a CSS class that the design team renamed; the user-visible behavior is unchanged.
  • (c) The test discovered a regression in the checkout flow.
  • (d) The test caught an off-by-one in the cart count.
Reveal

(b). A false alarm is a test failure that doesn’t correspond to a behavior change — the test was coupled to implementation (CSS class) instead of to the user-visible promise. (a), (c), and (d) are real regressions worth catching. Both Step 4 (liar tests = false passes) and Step 5 (brittle tests = false fails) point at the same underlying issue: a test’s value depends on what it actually verifies. Step 6 puts the principle into one sentence.

🎯 The principle

Match assertion specificity to spec specificity. Pin exactly what the spec promises — no more, no less.

A stronger assertion is not always a better assertion. We’ll see this on a deliberately simple feature first. (Step 7 generalizes it to features with multiple promises.)

The feature

The Todo app has a new remaining-count display: a <p role="status"> showing “3 items remaining”. The spec is one sentence:

“Show the user how many items are still pending.”

That’s it. One promise: surface the count. Notice what’s not in the spec:

  • the exact wording (“items remaining” vs “todos pending”)
  • plurality grammar (“1 item” vs “1 items”)
  • the surrounding sentence (“You have 3…” vs just “3…”)
  • color, position, animation

Three candidate assertions

// Brittle (over-specified): pins exact wording, plurality, surrounding copy.
await expect(page.getByRole('status'))
  .toHaveText('3 items remaining');

// Goldilocks (spec-aligned): pins exactly what the spec promises.
await expect(page.getByRole('status')).toContainText('3');
await expect(page.getByRole('status')).toContainText(/item/i);

// Loose (under-specified): the status region exists; nothing more.
await expect(page.getByRole('status')).toBeVisible();

🎬 Predict — Scenario A: marketing changes wording. Commit, then click reveal.

Imagine the team rewrites the status text from "3 items remaining" to "3 items left". The spec is still satisfied: the count is still shown.

Q. Which assertion correctly survives the wording change (i.e., passes — and the pass is the right answer)?

  • (a) Brittle only — exact text is the contract.
  • (b) Goldilocks only — pins the count and the noun, both still present.
  • (c) Loose only — toBeVisible() doesn’t care about content.
  • (d) Goldilocks and Loose — both still pass; only Goldilocks’s pass is informative.
Reveal

(d). Brittle fails (false alarm — wording changed, spec didn’t). Goldilocks and Loose both pass — but Goldilocks’s pass is meaningful (it verified the count and the noun) while Loose’s pass is trivially true (it never checked the count anyway). A “passing” test that proves nothing isn’t doing its job.

🎬 Predict — Scenario B: an off-by-one regression. Commit, then click reveal.

Now imagine a different change: the count logic has a bug. Where the page should say “3 items remaining,” it says “4 items remaining” instead.

Q. Which assertion catches this regression (i.e., fails — and the fail is the right answer)?

  • (a) Brittle and Goldilocks both fail; Loose passes (misses the bug).
  • (b) Only Brittle fails; Goldilocks misses it because it doesn’t pin the exact number.
  • (c) Only Loose fails — it’s the only one that runs against the count region.
  • (d) All three pass — toContainText and toHaveText both ignore numeric content.
Reveal

(a). Brittle fails because '3 items remaining' ≠ '4 items remaining'. Goldilocks fails because toContainText('3') doesn’t match '4 items remaining' (no '3' in that string). Loose passes because the status region is still visible; it never checked the count, so it can’t catch a count regression. That last “pass” is the under-specification trap.

▶ Run

Click Test. All three tests pass against the base app. (The base app shows "3 items remaining" correctly.)

✏️ Edit App.jsx — introduce the off-by-one bug

In src/App.jsx, find the line:

const remainingCount = items.length;

Change it to:

const remainingCount = items.length + 1;

That’s the bug — the count is now wrong by one. Predict which tests catch it before re-running.

▶ Run again

🔍 Investigate — Scenario B results

Assertion     Result       Was the result useful?
──────────    ─────────    ─────────────────────────────────
Brittle       ❌ Fails     ✓ Yes, it caught the regression
Goldilocks    ❌ Fails     ✓ Yes, it caught the regression
Loose         ✓ Passes     ✗ No, it missed the bug entirely

Now think back to Scenario A (the wording change). Reset the bug — change items.length + 1 back to items.length. Then imagine the wording change happening:

Assertion     Result under wording change    Was the result useful?
──────────    ───────────────────────────    ─────────────────────────────────────────────────
Brittle       ❌ Fails                       ✗ No, false alarm; spec still satisfied
Goldilocks    ✓ Passes                       ✓ Yes, wording isn’t part of the spec
Loose         ✓ Passes                       (Trivially, but it never checked the count anyway)
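
Both result tables can be recomputed with a plain-Node sketch. It makes several assumptions: the status text is a plain string, toHaveText with a string is modelled as whole-string equality, toContainText as a substring check, and toBeVisible as a check that never reads the text. The reworded and "thirteen" texts are hypothetical inputs, and goldilocksBounded is an illustrative variant that is not in the starter files.

```javascript
// Stand-ins for the assertions, applied to the status text as a plain string.
const assertions = {
  brittle: (text) => text === '3 items remaining',
  goldilocks: (text) => text.includes('3') && /item/i.test(text),
  // Illustrative variant: the count must be a standalone number.
  goldilocksBounded: (text) => /\b3\b/.test(text) && /item/i.test(text),
  // toBeVisible never reads the text, so the content cannot affect it.
  loose: () => true,
};

const scenarios = {
  base: '3 items remaining',
  reworded: '3 items left',        // hypothetical copy change, count unchanged
  offByOne: '4 items remaining',   // the regression from Scenario B
  thirteen: '13 items remaining',  // hypothetical: contains a '3' but is not 3
};

const matrix = {};
for (const [scenario, text] of Object.entries(scenarios)) {
  matrix[scenario] = {};
  for (const [name, passes] of Object.entries(assertions)) {
    matrix[scenario][name] = passes(text);
  }
}

console.table(matrix);
```

One caveat shows up in the last row: a substring check for '3' also accepts '13 items remaining'. If the spec promises the exact count, a word-boundary pattern like the one in goldilocksBounded may be a closer fit, at the cost of a slightly more elaborate assertion.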

The 2×2 grid that crystallizes the lesson

Assertion ↓ / Spec →    Spec is loose            Spec is tight
                        (“show the count”)       (“show ‘3 items remaining’”)
────────────────────    ─────────────────────    ────────────────────────────
Loose assertion         ✓ aligned                ✗ misses regressions
Tight assertion         ✗ false alarms           ✓ aligned

Strength (LO3) and spec-fidelity (LO4) are different axes. The best assertion lives on the diagonal — its specificity matches the spec’s specificity.

  • Loose spec + loose assertion = good. (You’re pinning what’s promised.)
  • Loose spec + tight assertion = false alarms. (You’re pinning more than promised.)
  • Tight spec + loose assertion = misses regressions. (You’re pinning less than promised.)
  • Tight spec + tight assertion = good. (You’re pinning the exact contract.)

The Goldilocks assertion above is on the diagonal: a loose spec, met with a loose-but-targeted assertion that still verifies the count. Brittle is off the diagonal in one direction; loose is off in the other.

📝 House rule

Pin exactly what the spec promises. No more, no less.

Don’t default to maximum strictness “just in case.” Strictness is not free — every pin is a future false alarm waiting to happen. Don’t default to minimum strictness either — every un-pinned promise is a regression waiting to slip through.

Read the spec. Decide what’s promised. Pin that.

Starter files
src/App.jsx
// 🛠 You'll edit one line in this file to introduce the off-by-one bug.
function App() {
  const [items, setItems] = React.useState([]);
  const [text, setText] = React.useState('');

  function addTodo() {
    const trimmed = text.trim();
    if (!trimmed) return;
    setItems([...items, trimmed]);
    setText('');
  }

  const remainingCount = items.length;

  return (
    <main className="todo-shell">
      <section className="todo-panel">
        <p className="eyebrow">Todo Lab</p>
        <h1>Todo Lab</h1>

        <div className="todo-form">
          <label htmlFor="todo-input">Todo item</label>
          <div className="todo-row">
            <input
              id="todo-input"
              value={text}
              onChange={(event) => setText(event.target.value)}
              placeholder="Buy milk"
            />
            <button onClick={addTodo}>Add todo</button>
          </div>
        </div>

        <p role="status" className="status-line">
          {remainingCount} items remaining
        </p>

        <ul aria-label="Todo list" className="todo-list">
          {items.map((item, index) => (
            <li key={index}>{item}</li>
          ))}
        </ul>
      </section>
    </main>
  );
}
src/main.jsx
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(<App />);
src/styles.css
body { margin: 0; font-family: system-ui, -apple-system, sans-serif; background: #f6f7fb; color: #1f2937; }
.todo-shell { min-height: 100vh; display: grid; place-items: center; padding: 32px; }
.todo-panel { width: min(100%, 560px); background: white; border: 1px solid #d9dee8; border-radius: 8px; padding: 28px; box-shadow: 0 18px 40px rgba(31, 41, 55, 0.08); }
.eyebrow { margin: 0 0 8px; color: #4b5563; font-size: 0.85rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.04em; }
h1 { margin: 0 0 24px; font-size: 2rem; }
label { display: block; margin-bottom: 8px; font-weight: 700; }
.todo-row { display: flex; gap: 10px; }
input { flex: 1; min-width: 0; background: white; color: #1f2937; border: 1px solid #b8c0cc; border-radius: 6px; padding: 10px 12px; font: inherit; }
button { border: 0; border-radius: 6px; padding: 10px 14px; background: #2563eb; color: white; font: inherit; font-weight: 700; cursor: pointer; }
.status-line { margin: 18px 0 0; color: #4b5563; font-weight: 600; }
.todo-list { margin: 12px 0 0; padding-left: 24px; }
.todo-list li { margin: 8px 0; }
/* Dark mode */
[data-bs-theme="dark"] body { background: #1c2533; color: #e6edf3; }
[data-bs-theme="dark"] .todo-panel { background: #232a36; border-color: #2a323e; box-shadow: 0 18px 40px rgba(0, 0, 0, 0.4); }
[data-bs-theme="dark"] .eyebrow { color: #9ca3af; }
[data-bs-theme="dark"] input { background: #2a323e; color: #e6edf3; border-color: #3a4351; }
[data-bs-theme="dark"] input::placeholder { color: #6b7280; }
[data-bs-theme="dark"] button { background: #2563eb; }
[data-bs-theme="dark"] .status-line { color: #9ca3af; }
tests/brittle.spec.js
import { test, expect } from '@playwright/test';

// BRITTLE: pins exact wording, plurality, surrounding copy.
test('brittle: counter shows pinned exact text', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('A');
  await page.getByRole('button', { name: /add/i }).click();
  await page.getByRole('textbox', { name: /todo item/i }).fill('B');
  await page.getByRole('button', { name: /add/i }).click();
  await page.getByRole('textbox', { name: /todo item/i }).fill('C');
  await page.getByRole('button', { name: /add/i }).click();
  await expect(page.getByRole('status')).toHaveText('3 items remaining');
});
tests/goldilocks.spec.js
import { test, expect } from '@playwright/test';

// GOLDILOCKS: pins exactly what the spec promises (the count + the noun).
test('goldilocks: counter shows the right count of items', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('A');
  await page.getByRole('button', { name: /add/i }).click();
  await page.getByRole('textbox', { name: /todo item/i }).fill('B');
  await page.getByRole('button', { name: /add/i }).click();
  await page.getByRole('textbox', { name: /todo item/i }).fill('C');
  await page.getByRole('button', { name: /add/i }).click();
  await expect(page.getByRole('status')).toContainText('3');
  await expect(page.getByRole('status')).toContainText(/item/i);
});
tests/loose.spec.js
import { test, expect } from '@playwright/test';

// LOOSE: the status region exists; nothing more.
// This misses the actual count!
test('loose: status region is visible', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('A');
  await page.getByRole('button', { name: /add/i }).click();
  await expect(page.getByRole('status')).toBeVisible();
});
7

Multi-Promise Features and the Capstone

Why this matters

Real features rarely have a single promise. The “Mark as done” toggle has three: state changes, count decrements, item stays visible. Each promise has its own specificity sweet spot — and treating them as one big assertion either over-pins (brittle on harmless changes) or under-pins (misses bugs in two-thirds of the contract). This step is the real-world skill: per-promise specificity decisions, made independently.

🎯 You will learn to

  • Apply the specificity-matching principle to features with multiple independent promises
  • Analyze each promise separately and choose its locator + assertion shape
  • Create a complete multi-promise Playwright test from a Spec Card and a partial test stub

🧠 Quick recall — commit before reading on

Q. From Step 6: a stronger assertion is sometimes worse. When?

  • (a) When the SUT is slow — strong assertions time out before the page renders.
  • (b) When the spec is loose — pinning more than the spec promises creates false alarms on every harmless wording / styling change.
  • (c) Never — stricter is always safer.
  • (d) When the test runs on Firefox — strong assertions don’t work cross-browser.
Reveal

(b). This is Step 6’s principle: the best assertion lives on the diagonal of the (spec specificity × assertion specificity) grid. If the spec is loose (“show the count”) but the assertion is tight (toHaveText('3 items remaining')), every wording change becomes a false alarm — a test failure that doesn’t correspond to a behavior break.

Step 6 had a single promise (the count). Real features usually have multiple promises — and you have to make a separate specificity decision for each one. That’s the skill that distinguishes a maintainable test suite from a brittle one.

🎯 The feature: “Mark as done” toggle

The Todo app now supports marking items as done. Click a todo’s button to toggle its done state. Done items are visually distinct (the starter app strikes them through), and the remaining-count display counts only items that are not done.

The spec is three promises:

  1. Toggle state. Clicking a todo toggles its done state.
  2. Count decrements. The remaining-count display reflects only un-done items.
  3. Item stays visible. Marked-done items remain in the list (not deleted).

For each promise, we make a specificity decision independently. Read this table — you’ll fill in a similar one for the capstone:

Promise                 Brittle option                   Goldilocks option                        Loose option
──────────────────────  ───────────────────────────────  ───────────────────────────────────────  ──────────────────────────────
1. Toggle state         toHaveClass(/todo-done/)         toHaveAttribute('aria-pressed', 'true')  (skip, but then how do you
                        (pins a CSS class, an            (pins semantic ARIA contract)            know the toggle worked?)
                        implementation detail)
2. Count decrements     toHaveText('2 items remaining')  getByRole('status')                      toBeVisible() on the status
                        (over-pins wording)              .toContainText('2')                      (misses the count regression)
                                                         (pins the number itself)
3. Item stays visible   (Goldilocks IS the target:       getByRole('listitem')                    (you can't loose-spec a
                        count + visible)                 .filter({ hasText: 'Milk' })             deletion check; this
                                                         .toBeVisible()                           promise is binary)

Notice the asymmetry.

  • Promise 2 is the same shape as Step 6: pin the count, not the wording.
  • Promise 1 introduces a new dimension: there’s a right tool (aria-pressed, the semantic contract) and a wrong tool (.todo-done CSS class). Using the wrong tool isn’t more strict — it’s coupled to implementation in a different way.
  • Promise 3 is binary — the item either stays visible or it doesn’t. Loose-spec doesn’t apply when the contract is yes/no.
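To see why the three promises are independent, it can help to model the state outside the browser. The sketch below copies the toggle logic and the remaining-count filter from the starter App.jsx into plain functions; `addTodo` is a simplified stand-in (no trimming, no React state), and the `promiseChecks` object is a hypothetical name used only here.

```javascript
// Pure-JavaScript model of the todo state, with the browser left out.

function addTodo(items, text) {
  return [...items, { text, done: false }];
}

// Same mapping as toggleDone in the starter App.jsx.
function toggleDone(items, idx) {
  return items.map((item, i) =>
    i === idx ? { ...item, done: !item.done } : item
  );
}

// Same filter as remainingCount in the starter App.jsx.
function remainingCount(items) {
  return items.filter((item) => !item.done).length;
}

let items = [];
for (const t of ['Milk', 'Bread', 'Eggs']) {
  items = addTodo(items, t);
}
items = toggleDone(items, 0); // mark "Milk" as done

// One independent check per promise.
const promiseChecks = {
  toggleState: items[0].done === true,                          // Promise 1
  countDecrements: remainingCount(items) === 2,                 // Promise 2
  itemStaysVisible: items.some((item) => item.text === 'Milk'), // Promise 3
};

console.log(promiseChecks);
```

Each check reads a different part of the state, so a bug can break one while the other two still hold. That is the reason each promise gets its own assertion in the Playwright test.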

Worked example: one fully written test

Read this carefully — it applies the table above:

test('marking a todo as done decrements the count and keeps it visible', async ({ page }) => {
  // Arrange: three todos.
  await page.goto('/');
  for (const t of ['Milk', 'Bread', 'Eggs']) {
    await page.getByRole('textbox', { name: /todo item/i }).fill(t);
    await page.getByRole('button', { name: /add todo/i }).click();
  }

  // Act: mark "Milk" as done.
  const milkToggle = page.getByRole('button', { name: 'Milk' });
  await milkToggle.click();

  // Assert all three promises:
  // Promise 1 — toggle state is "done" (semantic ARIA contract).
  await expect(milkToggle).toHaveAttribute('aria-pressed', 'true');

  // Promise 2 — count decrements (pin the number, not wording).
  await expect(page.getByRole('status')).toContainText('2');

  // Promise 3 — Milk is still in the list (not deleted).
  await expect(
    page.getByRole('listitem').filter({ hasText: 'Milk' })
  ).toBeVisible();
});

Each assertion is on the diagonal of its own 2×2 grid. Promise 1 uses the semantic ARIA attribute (not the CSS class). Promise 2 pins the count number (not the wording). Promise 3 verifies presence (the binary contract).

🎓 Capstone — write the next two tests

You’re given a complete Spec Card and two test stubs. Your job: fill in Act + Assert.

Spec Card: Mark a todo as done

✓ Behavior:        Clicking a todo toggles its "done" state. Done todos
                    are visually distinct. The remaining count decrements.
                    Marked-done todos remain in the list.
✓ Should pass when: Visual styling of done items changes (color, icon,
                    font-weight). The CSS class on done items is renamed.
                    The confirmation animation changes.
✗ Should fail when: Marking doesn't persist between renders. Count doesn't
                    decrement. Done items disappear from the list.
🎯 Locator contract: Each todo is a listitem. The toggle button has the
                    item's text as its accessible name. The status region
                    exposes a count.
✅ Oracle:          The status count reflects the number of un-done items.

Your two tests:

test('marking and unmarking a todo restores the count', async ({ page }) => {
  // Arrange: one todo "Milk".
  // Act: mark it done, then unmark it.
  // Assert: aria-pressed is back to false; count is back to 1.
});

test('marking one of two todos shows count of 1', async ({ page }) => {
  // Arrange: two todos "Milk" and "Bread".
  // Act: mark "Milk" as done.
  // Assert: count shows "1"; "Bread" is still un-done; "Milk" is done.
});

Use the worked example as your template. Apply per-promise specificity decisions (semantic locators, pin the count, verify the toggle state).

🤔 Metacognitive close

Before you submit:

  • Rate your confidence on each LO from Step 1 to now. Anything still fuzzy?
  • For your two capstone tests, ask: what’s the smallest change to App.jsx that should make my test fail? What’s the smallest change that should NOT make my test fail?

That second question is the real test of whether you’ve internalized the principle. If your test would fail for every change you can think of, harmless ones included, it’s brittle. If it would still pass after a real regression you can think of, it’s loose. Aim for the diagonal.
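One way to practice the "smallest change" question is to write the changes down as variants and run the oracle against each. The sketch below is a plain-JavaScript model under stated assumptions: the four variants are hypothetical edits to the app, reduced to what a user can observe, and `testPasses` restates the worked example's three assertions over that model rather than over a real page.

```javascript
// Shared building blocks, mirroring the starter App.jsx logic.
const baseToggle = (items, idx) =>
  items.map((item, i) => (i === idx ? { ...item, done: !item.done } : item));
const remaining = (items) => items.filter((item) => !item.done).length;

// Hypothetical app variants: one baseline, one harmless change, two regressions.
const variants = {
  baseline: {
    statusText: (items) => `${remaining(items)} items remaining`,
    toggle: baseToggle,
  },
  rewordedStatus: { // harmless: wording only
    statusText: (items) => `${remaining(items)} left to do`,
    toggle: baseToggle,
  },
  countIgnoresDone: { // regression: count never decrements
    statusText: (items) => `${items.length} items remaining`,
    toggle: baseToggle,
  },
  doneItemsDeleted: { // regression: done items vanish from the list
    statusText: (items) => `${remaining(items)} items remaining`,
    toggle: (items, idx) => items.filter((_, i) => i !== idx),
  },
};

// The worked example's oracle, restated over the model.
function testPasses(app) {
  const start = ['Milk', 'Bread', 'Eggs'].map((text) => ({ text, done: false }));
  const after = app.toggle(start, 0);
  const milk = after.find((item) => item.text === 'Milk');
  return (
    milk !== undefined &&               // Promise 3: still in the list
    milk.done === true &&               // Promise 1: toggle state
    /\b2\b/.test(app.statusText(after)) // Promise 2: the number, not the wording
  );
}

const verdicts = Object.fromEntries(
  Object.entries(variants).map(([name, app]) => [name, testPasses(app)])
);
console.log(verdicts);
```

A well-placed oracle passes the baseline and the reworded variant and fails both regressions. If your own list of variants produces a different pattern, that points at which promise is over-pinned or under-pinned.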

📝 Final house rule

A durable e2e test isn’t a script of clicks. It’s an executable behavioral spec with a thin adapter that maps user intent onto the current UI.

Next steps beyond this tutorial

The in-browser sandbox here doesn’t host every Playwright feature. In a real Playwright project you’d also use:

  • Network mocking (page.route) — mock API responses for deterministic tests.
  • Storage state auth — sign in once, reuse the session across tests.
  • Fixtures — share setup logic without hiding business intent.
  • Trace viewer — inspect failed CI runs frame-by-frame.

The official Playwright docs are the next learning artifact. Everything you’ve built here transfers — only the plumbing differs.

Starter files
src/App.jsx
function App() {
  const [items, setItems] = React.useState([]);
  const [text, setText] = React.useState('');

  function addTodo() {
    const trimmed = text.trim();
    if (!trimmed) return;
    setItems([...items, { text: trimmed, done: false }]);
    setText('');
  }

  function toggleDone(idx) {
    setItems(items.map((item, i) =>
      i === idx ? { ...item, done: !item.done } : item
    ));
  }

  const remainingCount = items.filter((item) => !item.done).length;

  return (
    <main className="todo-shell">
      <section className="todo-panel">
        <p className="eyebrow">Todo Lab  Capstone</p>
        <h1>Todo Lab</h1>

        <div className="todo-form">
          <label htmlFor="todo-input">Todo item</label>
          <div className="todo-row">
            <input
              id="todo-input"
              value={text}
              onChange={(event) => setText(event.target.value)}
              placeholder="Buy milk"
            />
            <button onClick={addTodo}>Add todo</button>
          </div>
        </div>

        <p role="status" className="status-line">
          {remainingCount} items remaining
        </p>

        <ul aria-label="Todo list" className="todo-list">
          {items.map((item, idx) => (
            <li key={idx} className={item.done ? 'todo-done' : ''}>
              <button
                className="todo-toggle"
                onClick={() => toggleDone(idx)}
                aria-pressed={item.done}
              >
                {item.text}
              </button>
            </li>
          ))}
        </ul>
      </section>
    </main>
  );
}
src/main.jsx
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(<App />);
src/styles.css
body { margin: 0; font-family: system-ui, -apple-system, sans-serif; background: #f6f7fb; color: #1f2937; }
.todo-shell { min-height: 100vh; display: grid; place-items: center; padding: 32px; }
.todo-panel { width: min(100%, 560px); background: white; border: 1px solid #d9dee8; border-radius: 8px; padding: 28px; box-shadow: 0 18px 40px rgba(31, 41, 55, 0.08); }
.eyebrow { margin: 0 0 8px; color: #4b5563; font-size: 0.85rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.04em; }
h1 { margin: 0 0 24px; font-size: 2rem; }
label { display: block; margin-bottom: 8px; font-weight: 700; }
.todo-row { display: flex; gap: 10px; }
input { flex: 1; min-width: 0; background: white; color: #1f2937; border: 1px solid #b8c0cc; border-radius: 6px; padding: 10px 12px; font: inherit; }
.todo-row > button { border: 0; border-radius: 6px; padding: 10px 14px; background: #2563eb; color: white; font: inherit; font-weight: 700; cursor: pointer; }
.status-line { margin: 18px 0 0; color: #4b5563; font-weight: 600; }
.todo-list { margin: 12px 0 0; padding-left: 0; list-style: none; }
.todo-list li { margin: 8px 0; }
.todo-toggle { display: block; width: 100%; text-align: left; color: #1f2937; border: 1px solid #d9dee8; border-radius: 6px; padding: 10px 12px; background: white; font: inherit; cursor: pointer; }
.todo-done .todo-toggle { color: #9ca3af; text-decoration: line-through; }
/* Dark mode */
[data-bs-theme="dark"] body { background: #1c2533; color: #e6edf3; }
[data-bs-theme="dark"] .todo-panel { background: #232a36; border-color: #2a323e; box-shadow: 0 18px 40px rgba(0, 0, 0, 0.4); }
[data-bs-theme="dark"] .eyebrow { color: #9ca3af; }
[data-bs-theme="dark"] input { background: #2a323e; color: #e6edf3; border-color: #3a4351; }
[data-bs-theme="dark"] input::placeholder { color: #6b7280; }
[data-bs-theme="dark"] .todo-row > button { background: #2563eb; }
[data-bs-theme="dark"] .status-line { color: #9ca3af; }
[data-bs-theme="dark"] .todo-toggle { background: #2a323e; color: #e6edf3; border-color: #3a4351; }
[data-bs-theme="dark"] .todo-done .todo-toggle { color: #6b7280; }
tests/mark-done.spec.js
import { test, expect } from '@playwright/test';

// Worked example — read this carefully before writing the next two.
test('marking a todo as done decrements the count and keeps it visible', async ({ page }) => {
  await page.goto('/');
  for (const t of ['Milk', 'Bread', 'Eggs']) {
    await page.getByRole('textbox', { name: /todo item/i }).fill(t);
    await page.getByRole('button', { name: /add todo/i }).click();
  }

  const milkToggle = page.getByRole('button', { name: 'Milk' });
  await milkToggle.click();

  // Promise 1 — toggle state (semantic ARIA contract).
  await expect(milkToggle).toHaveAttribute('aria-pressed', 'true');
  // Promise 2 — count decrements (pin the number).
  await expect(page.getByRole('status')).toContainText('2');
  // Promise 3 — item stays visible (binary contract).
  await expect(
    page.getByRole('listitem').filter({ hasText: 'Milk' })
  ).toBeVisible();
});

// Your turn: fill in Act + Assert.
test('marking and unmarking a todo restores the count', async ({ page }) => {
  // Arrange: navigate and add one todo "Milk".
  await page.goto('/');
  await page.getByRole('textbox', { name: /todo item/i }).fill('Milk');
  await page.getByRole('button', { name: /add todo/i }).click();

  // TODO: Act — mark Milk as done, then unmark it.
  // TODO: Assert — Milk's aria-pressed is "false"; the status shows "1".
});

test('marking one of two todos shows count of 1', async ({ page }) => {
  // Arrange: navigate and add two todos "Milk" and "Bread".
  await page.goto('/');
  for (const t of ['Milk', 'Bread']) {
    await page.getByRole('textbox', { name: /todo item/i }).fill(t);
    await page.getByRole('button', { name: /add todo/i }).click();
  }

  // TODO: Act — mark "Milk" as done.
  // TODO: Assert — status shows "1"; "Milk" is done; "Bread" is not done.
});
8

From-Scratch Capstone: Write a Test From a Spec Card Alone

Why this matters

Filling in a TODO inside a tutorial scaffold is not the skill you’ll need at work. At work you get a behavior, an empty file, and a deadline. The gap between “I can finish the test someone started” and “I can write the test from a blank buffer” is enormous — and most Playwright tutorials never close it. This step does. It’s the moment the training wheels come off.

🎯 You will learn to

  • Create a complete Playwright test — from import to closing }); — given only a behavior spec
  • Apply every prior step’s discipline (Spec Card, locator ladder, web-first assertions, per-promise specificity) without a stub to lean on
  • Evaluate your own test against the gates: does it survive harmless refactors and catch real regressions?

🪜 The training wheels come off

Every previous step gave you something to start with: a stub, a TODO, a worked example sitting just above the box where you typed. This step gives you nothing. An empty file. A spec. Your judgment.

That’s how it works at work — and that’s the gap most Playwright tutorials never close. We’re closing it here.

📋 The spec — read carefully, don’t skim

The Todo app from Step 7 supports marking items as done. The team has just added a small new spec promise:

Promise. When every todo in the list is marked done, the remaining-count display shows 0 (the current UI renders it as "0 items remaining"), and all the original todos remain visible (done items are not deleted from the list).

Two specific user paths the team wants covered:

  1. Mark-all-then-check. Add three todos. Mark all three as done. The count should read 0; all three items should still be in the list.
  2. Toggle-back-restores. Add two todos. Mark both done. Then unmark one. The count should be 1; both items still in the list.

🃏 Your Spec Card (write this BEFORE you write code — on paper or as a comment)

Fill in the five fields:

Field              Example shape
─────────────────  ──────────────────────────────────────────────────────────
Behavior           One sentence: what user-visible behavior are you proving?
Should pass when   List the implementation changes the test must survive
                   (CSS class renames, button text tweaks, etc.)
Required failures  List the regressions the test must catch (count not
                   decrementing, items deleted on done, etc.)
Locator contract   Which semantic queries (getByRole, getByLabel, etc.), and
                   why each one
Oracle             Per-promise: what assertion shape pins each promise at the
                   right specificity?

Once your Spec Card has all five fields, then open tests/all-done.spec.js and start typing. You will see only the import line and a spec-recap comment; everything else is yours.

✏️ Write the test

Open tests/all-done.spec.js (currently it has only the import line and a spec-recap comment). Write two tests covering the two user paths above. Both must:

  • Use getByRole / getByLabel for every locator (no CSS classes, no XPath).
  • Use await expect(...) for every assertion (no synchronous expect(await locator.isVisible()).toBe(true)).
  • Match assertion specificity to spec specificity: the count number IS the contract, but the wording around it (“0 items remaining” vs “0 left to do”) is not.
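The second rule above (web-first assertions over one-shot reads) is easier to trust once you have seen the difference in miniature. The sketch below is a synchronous plain-JavaScript model, not Playwright: `makeSlowElement`, `checkOnce`, and `checkWithRetry` are hypothetical names, and counting reads stands in for the passage of time while the UI renders.

```javascript
// A fake element that only reports visible from the Nth read onward,
// standing in for a UI that is still rendering.
function makeSlowElement(readsUntilVisible) {
  let reads = 0;
  return {
    isVisible() {
      reads += 1;
      return reads >= readsUntilVisible;
    },
  };
}

// One-shot check: reads the state once, in the spirit of
// expect(await locator.isVisible()).toBe(true).
function checkOnce(element) {
  return element.isVisible();
}

// Retrying check: keeps re-reading until it passes or runs out of attempts,
// roughly the behavior of await expect(locator).toBeVisible() within its timeout.
function checkWithRetry(element, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    if (element.isVisible()) return true;
  }
  return false;
}

const oneShotResult = checkOnce(makeSlowElement(3));              // reads too early
const retryResult = checkWithRetry(makeSlowElement(3), 10);       // waits it out
const neverAppears = checkWithRetry(makeSlowElement(Infinity), 10); // real failure

console.log({ oneShotResult, retryResult, neverAppears });
```

The retrying check does not hide real failures: an element that never appears still fails once the attempts run out. It only removes the failures caused by reading too early.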

📋 What the gates check

The gates below verify you wrote the test from scratch — the file will have:

  • An import line for test, expect.
  • Two test('...', async ({ page }) => { … }); blocks.
  • At least one await page.goto(...) per test.
  • At least one await expect(...) per test.
  • At least one getByRole(...) locator (proving you used the accessibility tree).
  • And of course: both tests must actually pass against the running app.
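If you want a rough self-check before submitting, the shape requirements above can be approximated with pattern counts. The sketch below is an assumption about what such a check might look like, not the tutorial's actual gates: `checkGates` is a hypothetical helper, it compares whole-file counts rather than inspecting each test separately, and the `sample` source is a deliberately minimal placeholder.

```javascript
// Hypothetical self-check mirroring the listed gates with simple pattern
// counts. The real gates are not shown in this step and may work differently.
function checkGates(source) {
  const count = (pattern) => (source.match(pattern) || []).length;
  const tests = count(/\btest\(\s*['"]/g);
  return {
    hasImport:
      /import\s*\{[^}]*\btest\b[^}]*\bexpect\b[^}]*\}\s*from\s*['"]@playwright\/test['"]/
        .test(source),
    twoTests: tests >= 2,
    // Approximation: at least as many calls as tests, counted across the file.
    gotoPerTest: tests > 0 && count(/await\s+page\.goto\(/g) >= tests,
    expectPerTest: tests > 0 && count(/await\s+expect\(/g) >= tests,
    usesGetByRole: count(/getByRole\(/g) >= 1,
  };
}

// Placeholder source with the right shape but no meaningful oracle.
const sample = `
import { test, expect } from '@playwright/test';

test('first path', async ({ page }) => {
  await page.goto('/');
  await expect(page.getByRole('status')).toBeVisible();
});

test('second path', async ({ page }) => {
  await page.goto('/');
  await expect(page.getByRole('status')).toBeVisible();
});
`;

const gateReport = checkGates(sample);
console.log(gateReport);
```

Note that the placeholder passes every shape check while asserting almost nothing about the spec. Shape checks confirm that a test exists; only the final requirement, passing against the running app with assertions matched to the promises, says anything about whether the test is a good one.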

Don’t peek at Step 7’s solution mid-task. The point of this step is not the answer; it’s the typing-from-blank habit.

Starter files
src/App.jsx
function App() {
  const [items, setItems] = React.useState([]);
  const [text, setText] = React.useState('');

  function addTodo() {
    const trimmed = text.trim();
    if (!trimmed) return;
    setItems([...items, { text: trimmed, done: false }]);
    setText('');
  }

  function toggleDone(idx) {
    setItems(items.map((item, i) =>
      i === idx ? { ...item, done: !item.done } : item
    ));
  }

  const remainingCount = items.filter((item) => !item.done).length;

  return (
    <main className="todo-shell">
      <section className="todo-panel">
        <p className="eyebrow">Todo Lab  From-Scratch Capstone</p>
        <h1>Todo Lab</h1>

        <div className="todo-form">
          <label htmlFor="todo-input">Todo item</label>
          <div className="todo-row">
            <input
              id="todo-input"
              value={text}
              onChange={(event) => setText(event.target.value)}
              placeholder="Buy milk"
            />
            <button onClick={addTodo}>Add todo</button>
          </div>
        </div>

        <p role="status" className="status-line">
          {remainingCount} items remaining
        </p>

        <ul aria-label="Todo list" className="todo-list">
          {items.map((item, idx) => (
            <li key={idx} className={item.done ? 'todo-done' : ''}>
              <button
                className="todo-toggle"
                onClick={() => toggleDone(idx)}
                aria-pressed={item.done}
              >
                {item.text}
              </button>
            </li>
          ))}
        </ul>
      </section>
    </main>
  );
}
src/main.jsx
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(<App />);
src/styles.css
body { margin: 0; font-family: system-ui, -apple-system, sans-serif; background: #f6f7fb; color: #1f2937; }
.todo-shell { min-height: 100vh; display: grid; place-items: center; padding: 32px; }
.todo-panel { width: min(100%, 560px); background: white; border: 1px solid #d9dee8; border-radius: 8px; padding: 28px; box-shadow: 0 18px 40px rgba(31, 41, 55, 0.08); }
.eyebrow { margin: 0 0 8px; color: #4b5563; font-size: 0.85rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.04em; }
h1 { margin: 0 0 24px; font-size: 2rem; }
label { display: block; margin-bottom: 8px; font-weight: 700; }
.todo-row { display: flex; gap: 10px; }
input { flex: 1; min-width: 0; background: white; color: #1f2937; border: 1px solid #b8c0cc; border-radius: 6px; padding: 10px 12px; font: inherit; }
.todo-row > button { border: 0; border-radius: 6px; padding: 10px 14px; background: #2563eb; color: white; font: inherit; font-weight: 700; cursor: pointer; }
.status-line { margin: 18px 0 0; color: #4b5563; font-weight: 600; }
.todo-list { margin: 12px 0 0; padding-left: 0; list-style: none; }
.todo-list li { margin: 8px 0; }
.todo-toggle { width: 100%; text-align: left; background: transparent; border: 1px solid #d9dee8; border-radius: 6px; padding: 10px 12px; font: inherit; cursor: pointer; }
.todo-toggle[aria-pressed="true"] { background: #ecfdf5; border-color: #10b981; }
.todo-done .todo-toggle { text-decoration: line-through; color: #6b7280; }
[data-bs-theme="dark"] body { background: #1c2533; color: #e6edf3; }
[data-bs-theme="dark"] .todo-panel { background: #232a36; border-color: #2a323e; box-shadow: 0 18px 40px rgba(0, 0, 0, 0.4); }
[data-bs-theme="dark"] .eyebrow { color: #9ca3af; }
[data-bs-theme="dark"] input { background: #2a323e; color: #e6edf3; border-color: #3a4351; }
[data-bs-theme="dark"] input::placeholder { color: #6b7280; }
[data-bs-theme="dark"] .todo-row > button { background: #2563eb; }
[data-bs-theme="dark"] .todo-toggle { background: transparent; color: #e6edf3; border-color: #3a4351; }
[data-bs-theme="dark"] .todo-toggle[aria-pressed="true"] { background: #064e3b; border-color: #10b981; }
tests/all-done.spec.js
import { test, expect } from '@playwright/test';

// ─────────────────────────────────────────────────────────────
// From-scratch capstone. Two tests, both written by you, both
// following the spec at the top of the step. No TODOs, no stubs.
//
// Spec recap (write this as a comment block before each test):
//   Promise: marking all todos done makes the count read 0,
//            and all items remain visible.
//   Path 1:  add 3 todos, mark all 3 done, expect count = 0
//            and 3 listitems still visible.
//   Path 2:  add 2 todos, mark both done, unmark one,
//            expect count = 1, both listitems visible.
// ─────────────────────────────────────────────────────────────