1

Branches, HEAD, and Detached HEAD

🎯 You will learn to

  • Explain why branch creation is O(1) — no files get copied.
  • Tell attached from detached HEAD by reading .git/HEAD.
  • Anticipate where orphaned commits come from, setting up the reflog rescue.
📚 The 15-step arc

Phase | Steps | What you build
Foundations | 1–3 | Mental model: branches are pointers; commits are immutable hashed snapshots
Daily tools | 4–7 | Stash, cherry-pick, blame, bisect: used weekly on real teams
History rewriting | 8–11 | Rebase, interactive rebase, squash-merge, revert, and when to use each
Submodules | 12–14 | Nested repos, the gitlink, six-step publish ceremony
Capstone | 15 | Compose 5+ tools under pressure with no hand-holding

Steps 1–3 are foundational — every later step refers back. After Step 7, take a break before Step 8 (spacing helps consolidation).

Why this tutorial exists

You already know init, add, commit, branch, merge, remotes. This tutorial lifts the hood — object database, refs, HEAD — so every “scary” command becomes a safe, predictable pointer move.

Two antipatterns to retire on sight:

Antipattern | What it looks like
Blind-testing | Typing random add/commit/push/pull permutations until errors stop
Burning down the repo | Deleting the folder, copying files out, re-cloning, force-pushing

Both come from an inaccurate mental model. Each step fixes one piece.

Prerequisite self-check

Answer from memory. Any shaky? Revisit the basic tutorial.

  1. New file is red in git status. State name? Command to green?
  2. After a commit + one more edit, what does bare git diff compare?
  3. main and feature have diverged. Can merge feature fast-forward?
  4. Teammate pushed a buggy commit to shared main. reset --hard + force-push, or revert?
  5. Staged a .env with secrets. Does adding to .gitignore now help?
Expected answers
  1. Untracked. git add stages it.
  2. Working tree vs. index. Index matches HEAD (nothing staged), so you see unstaged edits.
  3. No — diverged branches need a merge commit with two parents.
  4. git revert. Additive; doesn’t break teammates’ clones.
  5. No. .gitignore only blocks future tracking. Use git rm --cached + rotate the secret.

Task 1: Prove a branch is a 41-byte pointer

Predict first: what’s in .git/refs/heads/main? A commit list? A snapshot?

cd /tutorial/myproject
cat .git/refs/heads/main
cat .git/refs/heads/feature-divide
cat .git/HEAD

Each branch file is one line — a commit SHA. HEAD is ref: refs/heads/main — a pointer to a pointer.

@startuml
branch main:
  A "Initial commit"
  B "Add add function"
head main
@enduml

That indirection lets commit advance the branch pointer while HEAD auto-follows — no HEAD rewrite needed.

Task 2: Detach HEAD and feel the difference

git switch --detach HEAD~1
cat .git/HEAD        # now a raw 40-char SHA, not a ref

Detached HEAD = HEAD pinned to a commit, not a branch. Watch the graph: HEAD floats on the commit node itself.
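
The attached/detached distinction comes down to what the one-line HEAD file contains. Below is a minimal Python sketch, assuming only the two formats described above (a "ref: " line or a raw 40-character SHA-1). The function name describe_head and the sample SHA are made up for illustration.

```python
import re

def describe_head(head_text: str) -> tuple[str, str]:
    """Classify the contents of a .git/HEAD file as attached or detached."""
    line = head_text.strip()
    if line.startswith("ref: "):
        # Attached: HEAD names a branch ref, which in turn names a commit.
        return ("attached", line[len("ref: "):])
    if re.fullmatch(r"[0-9a-f]{40}", line):
        # Detached: HEAD names a commit directly.
        return ("detached", line)
    raise ValueError(f"unrecognized HEAD contents: {line!r}")

print(describe_head("ref: refs/heads/main\n"))
print(describe_head("0123456789abcdef0123456789abcdef01234567\n"))  # hypothetical SHA
```

To try it on the tutorial repo, pass in the text of .git/HEAD before and after the detach.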

Museum-archive analogy. You can read any document, but notes left without a label have nowhere to go when you leave. git switch -c <name> is that label.

Any commit you make here is anchored to nothing. Switching away with git switch orphans it. The next step shows how to rescue orphans.

Cleanup

git switch main

✍️ Before moving on (30-second self-test)

Without scrolling up, answer:

  1. How many bytes is a branch?
  2. What’s the physical difference between attached and detached HEAD?

Got both? You’ve internalized the schema this whole tutorial rests on.

2

Rescuing Lost Work with git reflog

🎯 You will learn to

  • Recover commits lost to bad rebases, hard resets, and detached-HEAD orphans.
  • Tell what git log --all can see from what git reflog can see.
  • Know reflog’s limits — it’s local, and disappears with the clone.

🤔 Predict first

You make an experimental commit in detached HEAD, then run git switch main without creating a branch first. Can git log --all find that commit? Can anything?

log --all vs reflog — the load-bearing distinction

Aspect | git log --all | git reflog
Walks | Commits reachable from refs | Every position HEAD occupied
Sees orphans? | No (unreachable = invisible) | Yes (reachability irrelevant)
Shared across clones? | Yes | No, local only

Task 1: Deliberately lose work

cd /tutorial/myproject
git switch --detach HEAD
echo "# experimental note" >> calculator.py
git add calculator.py && git commit -m "Experimental: add note in detached HEAD"
git switch main
git log --all --oneline      # the Experimental commit is GONE from this view

It’s orphaned — no ref reaches it, so log --all walks right past.

Task 2: Find the orphan

git reflog

Each line: <sha> HEAD@{n}: <action>: <description>.

Expression | Meaning
HEAD@{0} | where HEAD is now
HEAD@{1} | where HEAD was one move ago
HEAD@{n} | n moves ago

The detached-HEAD commit is at HEAD@{1}.

Task 3: Anchor it with a branch

git branch rescued-work HEAD@{1}
git log rescued-work --oneline

The universal recipe: git reflog → note the SHA or HEAD@{n} → git branch <name> <sha> anchors it as reachable. Works for dropped commits after interactive rebase, botched resets, failed rebases: any “lost” commit that’s still in .git/objects.
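
The whole rescue can be modelled in a few lines. This is a toy Python sketch, not Git's implementation: commits are single letters, parents is a hand-written map, and head_positions stands in for the reflog. All names are hypothetical.

```python
# Toy history; every name here is made up for illustration.
parents = {"A": [], "B": ["A"], "X": ["B"]}   # X was committed on a detached HEAD
refs = {"main": "B"}                          # no ref points at X
head_positions = ["B", "X", "B"]              # where HEAD has been, oldest first

def reachable(refs, parents):
    """Commits found by walking parent links from every ref (what log --all shows)."""
    seen, stack = set(), list(refs.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

print(sorted(reachable(refs, parents)))   # ['A', 'B']  -> X is invisible
print("X" in head_positions)              # True        -> the reflog still lists it
refs["rescued-work"] = "X"                # the equivalent of: git branch rescued-work <sha>
print(sorted(reachable(refs, parents)))   # ['A', 'B', 'X']
```

Adding the ref changes nothing about commit X itself. It only gives the reachability walk a starting point that leads to it.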

3

Relative Commit Addresses & Git's Object Database

🎯 You will learn to

  • Name any commit without a SHA using HEAD~n, BRANCH^, and rev-parse.
  • Prove Git’s history model is snapshot-based — commits point to trees that point to blobs holding full file bytes — by hashing content directly.
  • Predict that a single trailing space changes the entire SHA chain — and say why that matters for blame later.

🚪 This is the threshold step

Step 3 is the conceptual hinge of the whole tutorial. Every later step (rebase, cherry-pick, bisect, submodules) becomes obvious or mysterious depending on whether the object model clicks here.

If it doesn’t click on the first read, that’s expected — threshold concepts (Meyer & Land) are transformative (they reframe the whole domain) and troublesome (they resist quick mastery). Re-read, re-run the hashing experiment, sleep on it. Most learners need two passes. The recall prompt at the bottom is your self-check.

Relative references

Expression | Meaning
HEAD~n | n commits back along first-parent chain
BRANCH^ | shorthand for BRANCH~1
BRANCH^2 | second parent of a merge commit

@startuml
branch main:
  A "Oldest commit"
  B "main~2"
  C "HEAD~1"
  D "HEAD / main"
head main
@enduml
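
The ~ and ^ operators are just walks over parent links. The Python sketch below uses a made-up five-commit graph with one merge (M) to show the difference; the helper names tilde and caret are illustrative, not Git APIs.

```python
# Hypothetical graph: commit -> ordered list of parents.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "F": ["B"],        # side-branch commit
    "M": ["C", "F"],   # merge: first parent C, second parent F
}

def tilde(commit, n):
    """commit~n: follow the first parent n times."""
    for _ in range(n):
        commit = parents[commit][0]
    return commit

def caret(commit, n=1):
    """commit^n: the n-th parent (1-based) of a single commit."""
    return parents[commit][n - 1]

print(tilde("M", 2))   # B
print(caret("M"))      # C
print(caret("M", 2))   # F
```

Note that ~2 goes two generations back, while ^2 stays one generation back and picks the other parent.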

Task 1: Practice

cd /tutorial/myproject
git rev-parse HEAD        # current SHA
git rev-parse HEAD~1      # parent
git rev-parse main        # same as HEAD

Task 2: Prove content-addressability

Every object in .git/objects/ is addressed by the SHA-1 of its content (prefixed with a short type-and-size header). Three object kinds matter here:

Object | Stores
blob | Raw file bytes (no filename)
tree | Directory: filename → blob/tree SHA
commit | Tree SHA + parent SHAs + author + message

Hash the same bytes in two unrelated repos:

echo "hello world" | git hash-object --stdin
cd /tmp && git init -q bob-repo && cd bob-repo
echo "hello world" | git hash-object --stdin
cd /tutorial/myproject

Identical 40-char SHA. Same bytes → same hash, always, everywhere. That’s why Git deduplicates across branches and history for free.
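
The hash needs no repository at all, which is why two unrelated repos agree. A minimal Python sketch, assuming a SHA-1 repository (Git's default object format): a blob's ID is the SHA-1 of the header "blob <size>" plus a NUL byte, followed by the content. The function name git_blob_sha1 is made up for illustration.

```python
import hashlib

def git_blob_sha1(data: bytes) -> str:
    """Compute the ID Git assigns to a blob holding exactly these bytes."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

# echo adds a trailing newline, so the hashed bytes are b"hello world\n".
print(git_blob_sha1(b"hello world\n"))   # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

The printed value should match what git hash-object --stdin reported in both repos.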

Task 3: Byte-exact means byte-exact

Predict: hashing "hello world " with one trailing space — same SHA?

printf 'hello world \n' | git hash-object --stdin

Different. One whitespace byte → new blob SHA → new tree SHA → new commit SHA. That’s why reformatter commits (Step 6) mask real authorship: every whitespace tweak rewrites the entire hash chain.
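
The propagation from blob to tree can be sketched the same way. This assumes SHA-1 objects and a directory holding a single regular file; the file name greeting.txt and both helper names are hypothetical. A tree entry stores the mode, the name, and the raw 20-byte blob ID, so a new blob ID necessarily produces a new tree ID.

```python
import hashlib

def obj_sha1(kind: bytes, body: bytes) -> str:
    """Hash a Git object: '<kind> <size>' + NUL + body."""
    return hashlib.sha1(kind + b" " + str(len(body)).encode() + b"\0" + body).hexdigest()

def blob_and_tree(content: bytes, name: bytes = b"greeting.txt"):
    """Return (blob id, id of a tree containing only that blob)."""
    blob = obj_sha1(b"blob", content)
    # Tree entry layout: "<mode> <name>" + NUL + raw 20-byte object id.
    entry = b"100644 " + name + b"\0" + bytes.fromhex(blob)
    return blob, obj_sha1(b"tree", entry)

for content in (b"hello world\n", b"hello world \n"):
    blob, tree = blob_and_tree(content)
    print(blob[:8], tree[:8])
```

The commit object embeds the tree ID in the same fashion, which is how one changed byte reaches the commit SHA.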

✍️ Before moving on (the unifying invariant)

Close this and answer from memory:

“What’s the one property of existing commit objects that lets every later step in this tutorial work?”

The invariant (peek only after attempting)

Existing commit objects are immutable. Git changes history by creating new objects and/or moving references — never by editing old commits in place.

Every Git command falls into one of these operation categories:

Operation type | Examples | What changes
Create immutable objects | hash-object, commit, stash, cherry-pick, revert | New blob / tree / commit objects
Move refs | branch, reset, fast-forward merge, finalizing a successful rebase | Branch / ref points to a different commit
Update index | add, conflict-resolution staging, merge --squash | Staging area changes
Update working tree | switch, restore, checkout, stash pop, submodule update | Files on disk change
Transfer objects/refs | fetch, push, pull | Local/remote object/ref sets change

Most everyday commands combine categories (e.g., commit creates a commit object, moves a branch ref, and leaves the index matching the new HEAD). The point isn’t that operations are pure; it’s that no operation rewrites existing commit objects. Whenever a later step feels confusing, ask: what objects is this creating? what refs is it moving? what’s still in .git/objects that I could recover?

4

Saving Work Temporarily with git stash

🎯 You will learn to

  • Context-switch cleanly mid-feature without polluting history with WIP commits.
  • Pick pop vs. apply correctly.
  • Diagnose the classic “stash missed my new file” footgun.

Scenario

You’re mid-feature when your lead yells “hotfix on main, now!” Your options without stash are all bad: WIP commit (pollutes history), git restore (destroys work), or stay put (can’t isolate the fix).

git stash is the escape hatch.

🤔 Predict first

After git stash, where does your in-progress work end up — in the index, in the working tree, in a private commit, or deleted? And what will git status say about your working tree?

Task 1: See the dirty tree

A half-finished power function is already sitting in calculator.py:

def power(a, b):
    # TODO: add input validation
    return a ** b

cd /tutorial/myproject
git status
git diff

Task 2: Stash it

git stash
git status           # clean!
git stash list       # your WIP is here

💡 How stash works internally (Step 3 callback)

A stash is a merge commit at refs/stash — first parent is HEAD at stash time, second parent records the index (and a third parent records untracked files when you use -u). Same object model as every other commit, which is why git stash apply <sha> works on any historical stash.

Task 3: Do the hotfix on a dedicated branch

git switch -c hotfix-divide-zero

In the editor, append to calculator.py:

def safe_divide(a, b):
    """Divide a by b, raising ValueError on zero denominator."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

git add calculator.py
git commit -m "Hotfix: add safe_divide to prevent zero-division errors"
git switch main
git merge hotfix-divide-zero --no-edit
git branch -d hotfix-divide-zero

Task 4: Restore your WIP

git stash pop
git stash list       # empty — pop removed it

pop = apply + drop. Use apply instead if you want to keep the stash (e.g. to apply it on multiple branches).

📋 Full stash cheat sheet (other flags)
Command | Effect
git stash | Save tracked mods + staged; clean tree
git stash pop | Restore and drop the top stash
git stash apply | Restore but keep the stash
git stash drop | Delete without applying
git stash push -m "msg" | Save with a message
git stash -u | Also include untracked files

Gotcha: plain git stash skips untracked (never-add-ed) files. Use -u to include them — the most common stash footgun.

Task 5: Finish the feature

Edit calculator.py so power has real validation, then commit (message must include “power”):

def power(a, b):
    """Return a raised to the power of b."""
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError("Arguments must be numbers")
    return a ** b

5

Cherry-Pick: Copy One Specific Commit

🎯 You will learn to

  • Pick cherry-pick for one-commit backports; reject it for many-commit integration.
  • Resolve a cherry-pick conflict end-to-end (same marker dance as merge — different final verb).
  • Explain why the copied commit has a new SHA (apply Step 3’s object model).

Scenario

Lead: “The absolute helper on experimental is useful on main too. Bring that one commit over — leave the half-baked multiply behind.”

🤔 Predict first

Cherry-pick produces a new commit with a new SHA. What happens to the original commit on experimental — does it move, get rewritten, vanish, or stay put unchanged?

cherry-pick <sha> replays one commit’s patch on top of HEAD as a new commit (new parent → new SHA, same message + diff).

Task 1: Inspect

The pre-built experimental has two commits: a half-baked experimental_multiply, and a reusable absolute.

cd /tutorial/myproject
git log experimental --oneline

You only want the second commit.

Task 2: Cherry-pick the tip

A branch name resolves to its tip commit — no SHA copy needed:

git switch main
git cherry-pick experimental
git log --oneline

A new commit Add absolute value function sits on main with a different SHA from the original. Same patch, new parent → new SHA.
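
Why a new parent forces a new SHA: the parent's ID is part of the bytes that get hashed. The Python sketch below builds two simplified commit bodies that differ only in their parent line. The tree ID, parent IDs, identity, and timestamp are all placeholders, and a real cherry-pick usually changes the tree and committer date as well.

```python
import hashlib

def commit_sha1(tree: str, parent: str, message: str) -> str:
    """Hash a simplified commit object (SHA-1 format, single parent)."""
    who = "Dev <dev@example.com> 1700000000 +0000"   # hypothetical identity and time
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {who}\n"
        f"committer {who}\n"
        f"\n{message}\n"
    ).encode()
    header = f"commit {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

tree = "a" * 40                                    # placeholder tree id
on_experimental = commit_sha1(tree, "1" * 40, "Add absolute value function")
on_main = commit_sha1(tree, "2" * 40, "Add absolute value function")
print(on_experimental != on_main)   # True: same message, different parent
```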

💡 Schema check (Step 3 callback). Cherry-pick creates a new immutable object and moves the branch pointer to it. The original commit on experimental is untouched — Git never edits commits in place. This pattern repeats in every step from here on.

🔍 Contrast — what’s not like cherry-pick. git branch foo at the same commit creates zero new objects (just a 41-byte ref file). Both move pointers; only cherry-pick also creates a new commit. That’s why branch creation is instant and cherry-pick can fail with a conflict.

Task 3: Produce and resolve a conflict

Make the same line differ on both branches:

On main, edit calculator.py so def add(a, b): return a + b becomes:

def add(a, b):
    """Return the sum of two numbers."""
    return a + b

git add calculator.py && git commit -m "Document add function"

On experimental, change the same line differently:

git switch experimental

Edit to:

def add(a, b): return a + b  # simple addition

git add calculator.py && git commit -m "Inline comment on add"
git switch main
git cherry-pick experimental      # CONFLICT
git status

You’ll see <<<<<<< / ======= / >>>>>>> in the file. Conflicts are not failures — Git is asking a human to combine two valid changes.

Edit the block to keep both sides:

def add(a, b):
    """Return the sum of two numbers."""
    return a + b  # simple addition

git add calculator.py
git cherry-pick --continue     # NOT `git commit`; use the cherry-pick verb

🆘 Stuck on the conflict?
  1. Open calculator.py and find the <<<<<<< / ======= / >>>>>>> block.
  2. The block has two halves: above ======= is what you have (HEAD), below is what’s coming in (the cherry-picked commit).
  3. Edit so the result keeps the docstring and the inline comment, then delete all three marker lines.
  4. git add calculator.py → git cherry-pick --continue.
  5. To bail at any point: git cherry-pick --abort resets cleanly.

6

git blame: Who Last Changed This Line (and Why)?

🎯 You will learn to

  • Answer “why does this line exist?” by chaining blame -L → show <sha>.
  • Predict when plain blame lies — reformatter commits mask real authors.
  • Defuse the lie with -w or blame.ignoreRevsFile.
  • Recognize blame’s blind spot: it can only see existing lines.

The two-command forensic workflow

  1. git blame -L <start>,<end> <file> → find the SHA that last touched the line.
  2. git show <sha> → read the commit message and diff — the why lives here.

Blame is for context, not accusation.

Task 1: Why does this line exist?

git blame -L 7,7 calculator.py
# Copy the SHA from the first column, then:
git show <that-sha>

Who, when, why — covered. That chain is 90% of real blame use.

Task 2: The reformatter-masked authorship case

Setup planted: Bob wrote clip. CI-Bot later ran whitespace normalization (no logic change).

Predict: who will plain blame name as the last author of def clip?

git blame -L 1,$(wc -l < calculator.py) calculator.py | grep -i 'clip'

Last-toucher wins — blame names CI-Bot, masking Bob. Inspect:

git show <ci-bot-sha>     # pure whitespace diff

Add -w to skip whitespace-only changes:

git blame -w -L 1,$(wc -l < calculator.py) calculator.py | grep -i 'clip'

Now the author is Bob — the real logic author. For recurring formatters, persist this:

echo "<ci-bot-sha>" >> .git-blame-ignore-revs     # use the full 40-character SHA
git config blame.ignoreRevsFile .git-blame-ignore-revs

GitHub’s web blame UI honors this file too.

Task 3: Default blame vs. blame HEAD --

Predict first: if your working tree has uncommitted edits to a file, will plain git blame <file> show those uncommitted lines or hide them?

echo "# uncommitted note" >> calculator.py
git blame calculator.py | tail        # the uncommitted line is shown — with a zero SHA "Not Committed Yet"
git blame HEAD -- calculator.py | tail # only what's committed at HEAD
git restore calculator.py             # discard the experimental edit

The distinction. Default git blame <file> annotates the file as it currently is on disk — uncommitted lines included, marked with the zero SHA 00000000 and the author “Not Committed Yet”. git blame HEAD -- <file> instead asks “who last touched this line in the version recorded at HEAD?” Different question, different answer when the working tree is dirty.

Still a real blind spot, though. Blame can only attribute existing lines (in either mode). A bug caused by a deleted line is invisible. For deletions, reach for git log -p, git log -S (pickaxe search), or git bisect (next step) — the official Git docs are explicit that deleted/replaced lines require diff- or pickaxe-style history search.

📋 Full flag cheat sheet (-C, -M, ignoreRevsFile)
Flag | Use when
-L start,end | You know which lines matter (avoid scanning 1000 lines)
-w | A reformatter was the last toucher
-M / -C | A line moved within the file (-M) or was moved or copied from another file (-C)
blame.ignoreRevsFile | Permanently skip known reformat commits
💡 Sanity check: when -w is a no-op (try it)

git blame -L 1,$(wc -l < calculator.py) calculator.py | grep -i 'def add'

Plain blame already shows the real author — -w is identical here. Rule: -w matters only when a reformatter was the last toucher.

7

git bisect: Binary Search for the Commit That Broke Things

🎯 You will learn to

  • Decide when bisect is worth reaching for (rule: ≥ ~5 commits or slow tests).
  • Run an automated bisect end-to-end and always reset afterward.
  • Spot regressions blame cannot find — deletions, behavioral changes, and anything involving missing lines.

🤔 Predict first

A regression appeared somewhere in the last 1000 commits. Roughly how many tests would git bisect need to find the exact breaking commit? Pick one before reading on: 1000, 500, 100, or ~10.

Why bisect beats every alternative

Reading 30 diffs by hand is slow. blame can’t see missing lines. log --grep="fix" is wishful thinking.

Bisect runs binary search on history: log₂(30) ≈ 5 tests to pin the exact culprit. 1000 commits → ~10 tests. Scales forever.
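
The ~10 figure can be checked with a plain binary search over a linear history. This Python sketch is a simplification: real bisect works on a commit graph and picks midpoints differently, and the function name find_first_bad is made up. It counts tests for every possible culprit position among 1000 commits and reports the worst case.

```python
def find_first_bad(n_commits, is_bad):
    """Binary-search commits 0..n_commits-1, where the last one is known bad.

    Returns (index of the first bad commit, number of tests run).
    """
    lo, hi = 0, n_commits - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(mid):
            hi = mid          # culprit is mid or earlier
        else:
            lo = mid + 1      # culprit is after mid
    return lo, tests

# Try every possible culprit position c and keep the largest test count.
worst = max(find_first_bad(1000, lambda i, c=c: i >= c)[1] for c in range(1000))
print(worst)   # 10
```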

Task 1: See the regression

Setup planted 5 commits; one of them broke absolute(-4) == 4.

cd /tutorial/myproject
git log --oneline -7
python3 test_calculator.py      # AssertionError

Task 2: Manual bisect (feel the motion)

git bisect start
git bisect bad HEAD
git bisect good HEAD~5
# Git checks out a midpoint. Test it:
python3 test_calculator.py
# exit 0 → git bisect good ;  exit ≠ 0 → git bisect bad
# Repeat until Git prints "<sha> is the first bad commit"
git bisect reset

Task 3: Automated bisect (the real-world default)

git bisect start HEAD HEAD~5
git bisect run python3 test_calculator.py
git bisect reset

bisect run uses the script’s exit code to drive the search: 0 = good, 1–127 (except 125) = bad, 125 = skip this commit. Always finish with reset; otherwise HEAD stays on the last midpoint.
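
For a custom test script, the exit-code contract documented for git bisect run can be written out as a small lookup. The function name bisect_run_verdict is hypothetical; only the code ranges come from the Git documentation.

```python
def bisect_run_verdict(exit_code: int) -> str:
    """Map a test script's exit code to how git bisect run treats it."""
    if exit_code == 0:
        return "good"
    if exit_code == 125:
        return "skip"      # this commit cannot be tested
    if 1 <= exit_code <= 127:
        return "bad"
    return "abort"         # any other code stops the bisect

for code in (0, 1, 125, 128):
    print(code, bisect_run_verdict(code))
```

A failing Python assertion exits with code 1, which is why python3 test_calculator.py works as a bisect script unchanged.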

Task 4: Fix the bug

Bisect points at Simplify absolute (BUG: removes negation!). In the editor, restore the body of absolute:

def absolute(x):
    """Return |x| — handles negatives, zero, and positives."""
    return x if x >= 0 else -x

git commit -am "Fix: restore negation in absolute"
python3 test_calculator.py      # all tests pass

⚠️ Test-portability caveat (real-world bisects)

Bisect runs the test at every historical commit in range. If the test itself was added mid-range, older commits won’t have it and bisect breaks. Restore the modern test each iteration:

git bisect run bash -c 'cp /tmp/test.py . && python3 test.py'

🌙 Halftime: take a break before Step 8

You’ve finished the daily tools phase (stash, cherry-pick, blame, bisect). Steps 8–11 are history rewriting — denser and structurally riskier.

Walk away for at least 30 minutes (overnight is better) before continuing. Spaced practice is one of the most replicated findings in cognitive science: a 30-minute break before harder material produces measurably better retention than pushing straight through. Your hippocampus consolidates while you’re not studying.

When you come back, predict from memory: what does git stash actually save? Why does cherry-pick create a new SHA? If those don’t come fast, re-do the step. If they do, Step 8 awaits.

8

Rebase: Integrate Changes Without a Merge Commit

🎯 You will learn to

  • Pick rebase for short local branches, merge for shared/long-lived ones — and say why.
  • Produce linear history with rebase + fast-forward merge (no diamond).
  • Resolve a rebase conflict — same marker dance as merge, but finish with rebase --continue.
  • Recover from a bad rebase using reflog (Step 2’s safety net applied).

Mental model: the video-editor timeline cut

Select the clips (commits) unique to your feature, cut, move playhead to main’s tip, paste. Each paste is a new commit object — same patch, new parent, new SHA. Originals stay in .git/objects (reflog recovers).

💡 Schema check (Step 3 callback). Rebase = “cherry-pick a series” under the hood. New objects, branch pointer moved. Same mechanic Step 5 used on one commit; Step 8 just iterates.

🔍 Contrast — what’s not like rebase. A fast-forward merge on a strict-extension branch creates zero new commits — main’s pointer just slides forward to the feature tip. Rebase + ff-merge together produce linear history because rebase did all the new-commit-creation up front; the merge has nothing left to do.

Task 1: Inspect the divergence

Pre-built: feature-sqrt has square_root; main later got Bump version notes + Add identity helper.

cd /tutorial/myproject
git log --all --oneline --graph --decorate

Task 2: Rebase and fast-forward

Predict before running: how many parents will the feature tip have after rebase?

git switch feature-sqrt
git rebase main
git switch main
git merge feature-sqrt        # fast-forward, no merge commit
git branch -d feature-sqrt

Result: one linear line on the graph. No diamond.

Task 3: Rebase through a conflict (desirable difficulty)

Real rebases conflict when upstream touched the same lines. Produce one deliberately:

git switch -c feature-trailer main~1
echo '# end-of-module trailer' >> calculator.py
git commit -am 'Add trailer comment at end of file'
git rebase main       # CONFLICT — both sides appended at EOF
git status

Conflicts aren’t failures — they’re “two valid changes touched the same lines; a human must combine them.” Edit calculator.py so the bottom keeps both the identity helper and your trailer comment, removing the <<< / === / >>> markers.

git add calculator.py
git rebase --continue         # NOT `git commit` — use the rebase verb
git switch main
git branch -D feature-trailer

Remember: rebase conflict = merge conflict mechanics, but finalize with git rebase --continue. Bail with git rebase --abort.

When to rebase vs merge

Situation | Prefer
Short feature branch (hours–days), only you | Rebase
Long-lived or already-pushed branch used by teammates | Merge
Cardinal rule | Never rebase shared history

9

Interactive Rebase: Edit, Squash, Reorder, Drop

🎯 You will learn to

  • Squash messy WIP commits into one clean commit before opening a PR.
  • Drop an accidentally-committed secret (and recover it from reflog if needed).
  • Reword a commit message retroactively without changing its diff.
  • Pick the right verb (pick/reword/squash/fixup/drop/edit) for the rewriting goal.

🚪 This is the second threshold step

Step 9 is the densest step in the tutorial — eight verbs, several edge cases, and the most “wait, what?” moments in real Git. That’s not a bug; it’s where most engineers’ command of Git plateaus. Crossing this threshold is what separates “I use Git” from “I shape Git history.” Plan two passes. Don’t worry if Task 4 needs a re-read.

⚠️ Safe zone only

Interactive rebase rewrites history (Step 3: new parents → new SHAs). Run it only on commits that (a) are unpushed, or (b) live on a feature branch only you use. For public history, use git revert (Step 11).

🤔 Predict first

After rebase -i collapses four messy commits into one clean commit, do the original four still exist anywhere — and could you recover one of them with git reflog?

💡 Schema check. Same pattern as Steps 5 & 8: every rewriting verb here (squash, drop, reword, edit) creates new commit objects and moves the branch pointer. The “old” commits don’t disappear — they’re just unreferenced. Reflog finds them.

The four verbs you’ll use here

Verb | Effect
pick | Use commit as-is (default)
squash | Meld into previous; combine messages
drop | Remove commit
reword | Edit message only

📋 All six core verbs (fixup, edit)
Verb | Effect
pick | Use commit as-is (default)
reword | Edit message only
edit | Pause so you can commit --amend or add fixes / split
squash | Meld into previous; combine messages
fixup | Like squash, drop this commit’s message
drop | Remove commit

Two more verbs exist for advanced workflows: break (pause mid-rebase so you can poke around, then git rebase --continue) and exec <cmd> (run a shell command at that point in the todo list, e.g. exec pytest after a commit you want tested). See git help rebase if you need them.

🛠 Why this VM uses scripted sed instead of $EDITOR

Real workflow: git rebase -i HEAD~N opens your $EDITOR, you hand-edit action words, save-and-close. This browser VM can’t host an interactive editor, so we script it via GIT_SEQUENCE_EDITOR="sed -i …".

The skill is knowing what to change, not typing the sed. For each task: (1) predict the edit on paper, (2) run the scripted version, (3) verify the log matches your prediction.

Task 1: Inspect the messy branch

cd /tutorial/myproject
git log --oneline -5           # 4 ugly commits on refactor-power

Task 2: Squash four commits into one

Predict: which lines get squash, and why must line 1 stay pick?

GIT_SEQUENCE_EDITOR="sed -i '2,4s/^pick/squash/'" git rebase -i HEAD~4
git commit --amend -m "Refactor: cleanup notes in calculator.py"
git log --oneline -3

Task 3: Drop a secret-leaking commit

Append to calculator.py: SECRET_API_KEY=oops. Commit: git commit -am "Accidentally add secret (should be dropped)".

Then append def placeholder(): pass and commit: git commit -am "Add placeholder function".

Drop the secret:

GIT_SEQUENCE_EDITOR="sed -i '1s/^pick/drop/'" git rebase -i HEAD~2
grep SECRET_API_KEY calculator.py || echo "secret is gone from branch"

Task 3b: Prove reflog rescues the “dropped” commit

Dropped ≠ deleted (Step 3 again).

git reflog -n 10
SECRET_SHA=$(git reflog | grep -m1 'Accidentally add secret' | awk '{print $1}')
git branch secret-backup $SECRET_SHA
git log secret-backup --oneline

⚠️ For real secrets: drop+rescue is the wrong workflow

Drop + rescue leaves more copies of the secret, not fewer. For an actual leaked credential:

  1. Rotate the credential immediately (the only step that truly mitigates).
  2. Scrub with git filter-repo or BFG.
  3. Ask collaborators to re-clone.

Use drop only for non-sensitive cleanup (debug prints, experiments).

Task 4: Reword a message

GIT_SEQUENCE_EDITOR="sed -i '1s/^pick/reword/'" \
  GIT_EDITOR="sed -i '1s/.*/Refactor: cleanup notes and placeholder/'" \
  git rebase -i HEAD~2
git log --oneline -3

Two env vars = two editors (todo list + message editor). In real life you’d hand-edit both.

Wrap-up: rule of thumb

  • Local, unpushed history → rebase -i (any verb).
  • Shared, pushed history → git revert only (Step 11).

Rewriting public history forces every collaborator to reconcile.

10

Squash Merge: Collapse a Feature Into a Single Commit

🎯 You will learn to

  • Pick squash vs. rebase vs. merge based on how main’s log should read.
  • Anticipate the trade-off: clean main, lost intra-feature bisect precision.
  • Recover individual feature commits if a regression needs fine-grained blame.

git merge --squash <branch> collapses a multi-commit feature into one new commit on main. The feature branch is untouched.

🤔 Predict first

After git merge --squash feature followed by git commit, how many parents does the new commit on main have — one, two, or three? And what does that imply for git bisect later?

📋 Three merge strategies side by side (Steps 8 + 10 unified)
Method | main’s graph | Use when
git merge feature | Merge commit, 2 parents (diamond) | Long-lived branch; preserve merge context
rebase + merge (ff) | Linear, each commit preserved | Short feature; keep individual commits
git merge --squash | One new commit, branch untouched | Want main to read as one commit per feature

Task 1: Inspect the feature

cd /tutorial/myproject
git log feature-stats --oneline -5     # three focused commits

Task 2: Squash-merge

git switch main
git merge --squash feature-stats
git status       # staged changes, but NO commit yet — squash stops here
git commit -m "Add descriptive statistics module (mean, variance, stddev)"

Task 3: Confirm + clean up

git log --oneline main            # one new commit for the feature
git branch -D feature-stats       # -D because Git does not see a squash-merged branch as merged

⚠️ The cost: bisect granularity

bisect on main can only narrow to the whole feature commit, not one of its three internal commits. Keeping the feature branch around (or its reflog) preserves fine-grained recovery — the strongest argument against deleting merged feature branches the same day they merge.

11

Revert: Safely Undo a Pushed Commit

🎯 You will learn to

  • Reach for revert — not reset --hard — whenever a bad commit is already on a shared branch.
  • Read the anti-matter pattern in the graph: the original stays; a new commit negates it.
  • Decide between revert (public safety) and rebase-drop (private cleanup) by asking one question: has this been pushed?

Scenario

You pushed Refactor: rename divide → div to main. Ten teammates already pulled. Then CI discovers every import of divide now breaks.

🤔 Predict first

You have two options on the table:

  • A. git reset --hard HEAD~1 + git push --force
  • B. git revert HEAD + git push

Which one breaks every teammate’s clone? Why? (Step 3’s schema is the key — what changes existing SHAs?)

The answer

reset --hard + push --force would fix your clone but break every teammate’s: their local main still contains the commit you erased from the remote, so their history no longer matches it. Not acceptable.

git revert <sha> is the additive, public-safe undo. It computes the inverse patch of the target commit and commits that as a new commit. No existing SHAs change; no force-push; no collaborator pain.
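
The "inverse patch" idea fits in a few lines. This is a toy Python model, assuming a patch is a single (old line, new line) replacement; Git's real patches are hunks of added and removed lines, and the helper names apply and invert are hypothetical.

```python
def apply(patch, lines):
    """Apply a one-line replacement patch to a list of lines."""
    old, new = patch
    return [new if line == old else line for line in lines]

def invert(patch):
    """The inverse patch swaps what is removed and what is added."""
    old, new = patch
    return (new, old)

original = ["def add(a, b):", "def divide(a, b):"]
rename = ("def divide(a, b):", "def div(a, b):")   # the bad commit, simplified

after_bad = apply(rename, original)
after_revert = apply(invert(rename), after_bad)

history = [original, after_bad, after_revert]   # revert appends; nothing is removed
print(after_revert == original)   # True
print(len(history))               # 3
```

The file content returns to its earlier state while the history list keeps growing, which is the property that makes revert safe on shared branches.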

Task 1: See the bad commit

Setup planted a “pushed” refactor that broke callers.

cd /tutorial/myproject
git log --oneline -5
grep -c 'def divide\|def div' calculator.py

Task 2: Revert it

git revert HEAD --no-edit
git log --oneline -5

Two commits visible: the bad one and its revert. git log is now a truthful record of what happened.

Task 3: Prove that history only grew

Predict: did revert delete anything? (Answer: no — history grew by 1.)

git rev-list --count HEAD
git cat-file -p HEAD          # examine the revert commit object
git cat-file -p HEAD^         # the original bad commit, still reachable

The single rule

If anyone else has it, revert. If only you have it, rebase is fair game.

📋 Revert vs. reset vs. rebase-drop, side by side
Goal | Pushed? | Tool
Remove a bad commit from shared history | Yes | git revert <sha> (additive)
Clean up a local WIP branch before PR | No | rebase -i with drop
Nuke local branch to a prior state | No | reset --hard <sha>

💡 Reverting a merge commit (-m 1)

Merge commits have two parents; revert needs to know which side is the “mainline” (the side you want to keep). git revert -m 1 <merge-sha> keeps the first-parent side and undoes the merged-in branch. Get the number wrong and you revert the wrong side.

12

Git Submodules: Add & Clone

🎯 You will learn to

  • Add a submodule to an existing repo with one command.
  • Clone a submodule-using repo correctly (--recursive) — or recover after forgetting.
  • Recognize the gitlink (mode 160000) + .gitmodules as the two structural differences from a regular file.
  • Pick submodules vs. package manager vs. monorepo based on the actual problem.

🤔 Predict first

When you git submodule add a 200-MB repo, how much storage does the outer repo’s tracked tree gain — a few hundred megabytes, or a few hundred bytes?

📖 Three core terms (open before reading further)
| Term | What it is |
|---|---|
| Submodule | A nested Git repo inside an outer Git repo |
| .gitmodules | Plain-text config file in the outer repo listing each submodule’s path + URL |
| Gitlink | A tree entry with mode 160000 whose “content” is a 40-char commit SHA (instead of file bytes) |

Two more terms (Pinned SHA, --recursive) are introduced inline as they come up; the full glossary is at the bottom of this step.

Mental model: library subscription

A submodule is a subscription to a specific edition of a library:

  • No photocopy — no file duplication.
  • You record the book title + edition number (.gitmodules URL + pinned SHA).
  • Anyone with your note fetches the same edition.
  • Upgrade by changing the edition number.

Edition number = commit SHA. Book = the submodule’s Git repo hosted elsewhere.

On-disk layout

@startuml
main-repo/
  .git/
    modules/
      vendor/
        math-utils/  ← submodule's actual git data (objects, refs, HEAD…); named after the submodule path
  .gitmodules      ← where Git should fetch each submodule
  src/
  vendor/
    math-utils/    ← nested Git repo (the working tree)
      .git         ← gitfile: "gitdir: ../../.git/modules/vendor/math-utils"
      utils.py
@enduml

Task 1: Inspect the “upstream” library

Pre-built: /tutorial/math-utils-src/ (working repo, double+triple) and /tutorial/math-utils.git (bare clone acting as the remote URL).

cat /tutorial/math-utils-src/utils.py

Task 2: Add the submodule

cd /tutorial/myproject
git switch main
git submodule add /tutorial/math-utils.git vendor/math-utils
git status                            # TWO new entries

Open .gitmodules in the editor. Predict before scrolling to the answers:

  1. How many lines per submodule?
  2. Is the pinned SHA stored here?
  3. What breaks if the file is deleted?
Answers
  1. 3 lines (header + path + url). Tiny by design.
  2. URL yes, SHA no. The SHA is the gitlink in the tree (see below). Two independent facts: where to fetch vs. which commit to check out.
  3. Teammates can’t clone the submodule. .gitmodules is the subscription directory; without it, clone --recursive has no URL.

Inspect the gitlink:

git ls-files -s vendor/math-utils    # mode 160000 = submodule
git commit -m "Add math-utils submodule at v0.1.0"
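The two structural pieces can be inspected side by side in a sandbox. This is a sketch under stated assumptions: lib, lib.git, and outer are made-up stand-ins for the tutorial's fixtures, HOME is redirected so the demo's global config stays out of your real one, and protocol.file.allow is set because newer Git releases refuse local-path submodule clones by default.

```shell
set -eu
sandbox=$(mktemp -d)                                       # hypothetical scratch area
export HOME="$sandbox" XDG_CONFIG_HOME="$sandbox/.config"  # isolate "global" config for the demo
git config --global user.name "Demo"
git config --global user.email "demo@example.com"
git config --global protocol.file.allow always             # assumption: needed for local-path submodules on newer Git
cd "$sandbox"

git init -q lib                                            # stand-in "upstream" library
echo "def double(x): return 2 * x" > lib/utils.py
git -C lib add utils.py && git -C lib commit -q -m "v0.1.0"
git clone -q --bare lib lib.git                            # bare clone plays the remote URL

git init -q outer && cd outer
git submodule --quiet add "$sandbox/lib.git" vendor/math-utils

cat .gitmodules                                            # fact 1: where to fetch (path + url)
entry=$(git ls-files -s vendor/math-utils)                 # fact 2: which commit (the gitlink)
echo "$entry"
pinned=$(echo "$entry" | awk '{print $2}')
lib_head=$(git -C "$sandbox/lib" rev-parse HEAD)
[ "$pinned" = "$lib_head" ] && echo "gitlink pins the library's HEAD commit"
```

The index entry is one mode-160000 line no matter how large the library is, which answers the storage question from the prediction prompt.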

Task 3: Clone with --recursive

cd /tutorial
git clone --recursive myproject colleague-clone
ls colleague-clone/vendor/math-utils

Without --recursive, the folder exists but stays empty until the teammate runs git submodule update --init --recursive.
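The forgotten-flag case and its recovery, sketched in a sandbox. The repo names (lib, outer, plain-clone) are hypothetical, and the HOME redirect plus protocol.file.allow setting are assumptions that keep the demo isolated and working on newer Git.

```shell
set -eu
sandbox=$(mktemp -d)                                       # hypothetical scratch area
export HOME="$sandbox" XDG_CONFIG_HOME="$sandbox/.config"  # isolate "global" config for the demo
git config --global user.name "Demo"
git config --global user.email "demo@example.com"
git config --global protocol.file.allow always             # assumption: needed for local-path submodules on newer Git
cd "$sandbox"

git init -q lib
echo "def double(x): return 2 * x" > lib/utils.py
git -C lib add utils.py && git -C lib commit -q -m "v0.1.0"
git clone -q --bare lib lib.git
git init -q outer
git -C outer submodule --quiet add "$sandbox/lib.git" vendor/math-utils
git -C outer commit -q -m "Add math-utils submodule"

git clone -q outer plain-clone                             # note: no --recursive
status_before=$(git -C plain-clone submodule status)
echo "before: $status_before"                              # leading '-' = not initialized
git -C plain-clone submodule --quiet update --init --recursive
status_after=$(git -C plain-clone submodule status)
echo "after:  $status_after"
ls plain-clone/vendor/math-utils                           # utils.py is now on disk
```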

💡 When submodules are the *right* tool

Yes: versioned code you own shared across several repos.

No: third-party deps (use a package manager — npm, pip, cargo), or single config files (use config management).

📋 Submodule glossary (full)
| Term | What it is |
|---|---|
| Submodule | A nested Git repo inside an outer Git repo |
| .gitmodules | Plain-text config file in the outer repo listing each submodule’s path + URL |
| Gitlink | A tree entry with mode 160000 whose “content” is a 40-char commit SHA (instead of file bytes) |
| Pinned SHA | The exact commit of the submodule the outer repo wants checked out at the gitlink path |
| --recursive | Clone flag that fetches submodules at clone-time (otherwise the folder is empty) |

13

Updating Submodules: Upstream Bumps & Resync

🎯 You will learn to

  • Upgrade a submodule to new upstream work via the two-step dance (fetch/checkout inside, add/commit outside).
  • Diagnose and fix the “teammate forgot submodule update” trap — muscle memory for post-pull.
  • Force-resync any drifted submodule back to the pinned SHA with one deterministic command.

🤔 Predict first

Upstream publishes new commits. After you git pull the outer repo, will your local submodule’s working directory show the new content automatically — or do you have to do something extra?

Task 1: Upstream publishes v0.2

/tutorial/publish-math-utils-v0.2.sh
git --git-dir=/tutorial/math-utils.git log --oneline --all
cd /tutorial/myproject
git status            # nothing changed here — push doesn't propagate

Task 2: Fetch + checkout inside the submodule

A submodule is a nested repo. Use normal git inside it:

cd /tutorial/myproject/vendor/math-utils
git fetch
git checkout origin/HEAD
cd /tutorial/myproject
git status            # vendor/math-utils (new commits)
git diff vendor/math-utils

The outer diff is exactly one changed line: -Subproject commit <old> / +Subproject commit <new>. Line-level diffs live in the submodule’s own object database.

Task 3: Bump the pinned SHA in the outer repo

git add vendor/math-utils
git commit -m "Bump math-utils to v0.2.0 (adds quadruple)"
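The two-step bump, end to end, as a sandbox sketch. Everything named here (lib, lib.git, outer, the quadruple one-liner) is a hypothetical stand-in for the tutorial's pre-built fixtures; the HOME redirect and protocol.file.allow setting are assumptions for isolation and newer-Git compatibility.

```shell
set -eu
sandbox=$(mktemp -d)                                       # hypothetical scratch area
export HOME="$sandbox" XDG_CONFIG_HOME="$sandbox/.config"  # isolate "global" config for the demo
git config --global user.name "Demo"
git config --global user.email "demo@example.com"
git config --global protocol.file.allow always             # assumption: needed for local-path submodules on newer Git
cd "$sandbox"

git init -q lib
echo "def double(x): return 2 * x" > lib/utils.py
git -C lib add utils.py && git -C lib commit -q -m "v0.1.0"
git clone -q --bare lib lib.git
git init -q outer
git -C outer submodule --quiet add "$sandbox/lib.git" vendor/math-utils
git -C outer commit -q -m "Add math-utils submodule"

# "Upstream" publishes a second commit.
echo "def quadruple(x): return 4 * x" >> lib/utils.py
git -C lib commit -q -am "v0.2.0"
git -C lib push -q "$sandbox/lib.git" HEAD

cd outer
old=$(git rev-parse HEAD:vendor/math-utils)                # pin recorded in the last outer commit
git -C vendor/math-utils fetch -q                          # step 1: move the inner repo
git -C vendor/math-utils checkout -q origin/HEAD
git diff vendor/math-utils | grep '^[-+]Subproject'        # the whole outer diff is this pair
git add vendor/math-utils                                  # step 2: record the new pin outside
git commit -q -m "Bump math-utils to v0.2.0"
new=$(git rev-parse HEAD:vendor/math-utils)
echo "pin moved: $old -> $new"
```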

Task 4: The teammate trap

cd /tutorial/colleague-clone
git pull
cat vendor/math-utils/utils.py     # still v0.1 on disk!

pull updated the pinned SHA in the tree, but did not touch their submodule working directory. Code that imports quadruple now fails. Fix:

git submodule update --init --recursive
cat vendor/math-utils/utils.py     # now has quadruple
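The trap can be measured rather than eyeballed by comparing the pinned SHA with the SHA actually checked out. A sandbox sketch, assuming hypothetical repos lib, outer, and colleague, an isolated HOME, and protocol.file.allow enabled for newer Git:

```shell
set -eu
sandbox=$(mktemp -d)                                       # hypothetical scratch area
export HOME="$sandbox" XDG_CONFIG_HOME="$sandbox/.config"  # isolate "global" config for the demo
git config --global user.name "Demo"
git config --global user.email "demo@example.com"
git config --global protocol.file.allow always             # assumption: needed for local-path submodules on newer Git
cd "$sandbox"

git init -q lib
echo "def double(x): return 2 * x" > lib/utils.py
git -C lib add utils.py && git -C lib commit -q -m "v0.1.0"
git clone -q --bare lib lib.git
git init -q outer
git -C outer submodule --quiet add "$sandbox/lib.git" vendor/math-utils
git -C outer commit -q -m "Add math-utils submodule"
git clone -q --recursive outer colleague                   # teammate starts in sync

# Upstream publishes; the outer repo bumps its pin.
echo "def quadruple(x): return 4 * x" >> lib/utils.py
git -C lib commit -q -am "v0.2.0"
git -C lib push -q "$sandbox/lib.git" HEAD
git -C outer/vendor/math-utils fetch -q
git -C outer/vendor/math-utils checkout -q origin/HEAD
git -C outer add vendor/math-utils
git -C outer commit -q -m "Bump math-utils to v0.2.0"

cd colleague
git pull -q --ff-only                                      # moves the pin, not the files
pinned=$(git rev-parse HEAD:vendor/math-utils)
on_disk_before=$(git -C vendor/math-utils rev-parse HEAD)
git submodule --quiet update --init --recursive            # the fix
on_disk_after=$(git -C vendor/math-utils rev-parse HEAD)
echo "pinned:        $pinned"
echo "before update: $on_disk_before"
echo "after update:  $on_disk_after"
```

Between the pull and the update, git submodule status would show the + prefix, because the checked-out commit and the pin disagree.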
💡 Make this a habit (one-time config)

After every pull that might touch submodule paths, run git submodule update --init --recursive. Or, one-time setup:

git config --global submodule.recurse true

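Now pull and checkout update already-initialized submodules automatically. A submodule that was never initialized still needs one run of git submodule update --init --recursive.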

Task 5: Force-resync a drifted submodule

Simulate drift:

cd /tutorial/colleague-clone/vendor/math-utils
git checkout HEAD~1
cd /tutorial/colleague-clone
git status            # modified: vendor/math-utils (new commits)
git submodule update --init --recursive
git status            # clean — pinned SHA restored

Same command works for never-initialized, partially-fetched, or drifted submodules.

14

Submodule Internals: What 'Content Changed' Means

🎯 You will learn to

  • Read modified content vs. new commits straight from git status and pick the right fix.
  • Execute the six-step publish ceremony without falling into the detached-HEAD trap.
  • Resync any weird submodule state deterministically with one command.
  • Reason from first principles — outer repo tracks one SHA; inner repo is a full Git repo; they’re independent.

🤔 Predict first

You edit vendor/math-utils/utils.py directly without cd-ing into the submodule. What does the outer repo’s git status say about vendor/math-utils: modified content, new commits, both, or nothing?

The mental model

The outer repo stores exactly one thing per submodule (besides .gitmodules): the pinned commit SHA. On every git status, Git compares:

SHA the outer tree pins   vs    SHA at the submodule's current HEAD
    (gitlink, mode 160000)         (what's actually checked out)
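| Condition | Message |
|---|---|
| SHAs match, submodule working tree clean | clean (not listed) |
| Submodule HEAD moved to a different SHA | new commits |
| Submodule has uncommitted edits to tracked files | modified content |
| Submodule has untracked files | untracked content |
| Several at once | every applicable message |

These are the only ways a submodule shows up as “modified”.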
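Each status message can be triggered on purpose in a sandbox. This is a sketch, not the tutorial's setup: lib, outer, notes.txt, and the halve one-liner are hypothetical, HOME is redirected for isolation, LC_ALL=C pins the message language, and protocol.file.allow is an assumption for newer Git.

```shell
set -eu
sandbox=$(mktemp -d)                                       # hypothetical scratch area
export HOME="$sandbox" XDG_CONFIG_HOME="$sandbox/.config" LC_ALL=C
git config --global user.name "Demo"
git config --global user.email "demo@example.com"
git config --global protocol.file.allow always             # assumption: needed for local-path submodules on newer Git
cd "$sandbox"

git init -q lib
echo "def double(x): return 2 * x" > lib/utils.py
git -C lib add utils.py && git -C lib commit -q -m "v0.1.0"
git clone -q --bare lib lib.git
git init -q outer
git -C outer submodule --quiet add "$sandbox/lib.git" vendor/math-utils
git -C outer commit -q -m "Add math-utils submodule"
cd outer

echo "# scratch" > vendor/math-utils/notes.txt             # untracked file inside the submodule
untracked=$(git status | grep 'vendor/math-utils')
rm vendor/math-utils/notes.txt

echo "def halve(x): return x / 2" >> vendor/math-utils/utils.py   # edit a tracked file
dirty=$(git status | grep 'vendor/math-utils')

git -C vendor/math-utils commit -q -am "Add halve helper"  # commit inside: HEAD now differs from the pin
ahead=$(git status | grep 'vendor/math-utils')

printf '%s\n' "$untracked" "$dirty" "$ahead"
```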

Task 1: Clean starting state

cd /tutorial/myproject
git submodule status

Prefix: ` ` clean, + HEAD ≠ pinned, - not initialized.

Task 2: Dirty the submodule working tree

Open vendor/math-utils/utils.py. Append:

def halve(x):
    return x / 2

Save. Back in outer:

cd /tutorial/myproject
git status                      # modified content
git diff vendor/math-utils      # no real line diff — just a summary
cd vendor/math-utils && git diff   # the real diff lives here

Task 3: Commit inside the submodule — then try to push

# inside vendor/math-utils
git add utils.py
git commit -m "Add halve helper"
git push                        # FAILS — predict the error

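Likely: fatal: You are not currently on a branch. HEAD is detached, which is the state git submodule update and git checkout origin/HEAD leave behind; on a brand-new branch the complaint is a missing upstream instead. This is the top submodule footgun: Step 1’s detached-HEAD concept, encountered here.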

Fix:

git switch -c update-halve 2>/dev/null || git switch update-halve
git log --oneline -2
# git push -u origin update-halve   # real push would succeed now

Back in outer:

cd /tutorial/myproject
git status                      # now: new commits (not modified content)
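The failure and the fix, reproduced in a sandbox where the push can really run. A sketch under assumptions: lib, lib.git, and outer are hypothetical repos, git checkout --detach stands in for whatever detached the submodule's HEAD, HOME is redirected for isolation, and protocol.file.allow is set for newer Git.

```shell
set -eu
sandbox=$(mktemp -d)                                       # hypothetical scratch area
export HOME="$sandbox" XDG_CONFIG_HOME="$sandbox/.config" LC_ALL=C
git config --global user.name "Demo"
git config --global user.email "demo@example.com"
git config --global protocol.file.allow always             # assumption: needed for local-path submodules on newer Git
cd "$sandbox"

git init -q lib
echo "def double(x): return 2 * x" > lib/utils.py
git -C lib add utils.py && git -C lib commit -q -m "v0.1.0"
git clone -q --bare lib lib.git
git init -q outer
git -C outer submodule --quiet add "$sandbox/lib.git" vendor/math-utils
git -C outer commit -q -m "Add math-utils submodule"

cd outer/vendor/math-utils
git checkout -q --detach                                   # mimic the detached state described above
echo "def halve(x): return x / 2" >> utils.py
git commit -q -am "Add halve helper"

if err=$(git push 2>&1); then echo "unexpected: push succeeded"; fi
echo "$err" | head -n 1                                    # the detached-HEAD refusal

git switch -q -c update-halve                              # attach HEAD; the commit comes along
git push -q -u origin update-halve
pushed=$(git --git-dir="$sandbox/lib.git" rev-parse update-halve)
echo "remote now has: $pushed"
```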

Task 4: Bump the pinned SHA

git add vendor/math-utils
git commit -m "Bump math-utils: add halve helper"
git log -1 -p vendor/math-utils   # one changed line: -Subproject commit ... / +Subproject commit ...
💡 The six commands are six invariants — derive them yourself

The ceremony looks arbitrary; each step preserves one invariant:

| # | Command | Invariant preserved |
|---|---|---|
| 1 | cd sub; git switch -c <branch> | HEAD is branch-attached (not detached) |
| 2 | git commit inside sub | Your change is a commit object |
| 3 | git push inside sub | New SHA exists on the sub’s remote |
| 4 | cd ../..; git add <path> | Outer tree stages the new pinned SHA |
| 5 | git commit outer | Outer records a commit pinning the new SHA |
| 6 | git push outer | New pin is visible to teammates |

Know the invariants and the commands derive themselves — no memorization needed.
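The six steps can be run as one script against sandbox remotes, then checked from a teammate's point of view. A sketch only: lib.git and outer.git are hypothetical bare remotes, teammate is a made-up clone name, HOME is redirected for isolation, and protocol.file.allow is an assumption for newer Git.

```shell
set -eu
sandbox=$(mktemp -d)                                       # hypothetical scratch area
export HOME="$sandbox" XDG_CONFIG_HOME="$sandbox/.config"  # isolate "global" config for the demo
git config --global user.name "Demo"
git config --global user.email "demo@example.com"
git config --global protocol.file.allow always             # assumption: needed for local-path submodules on newer Git
cd "$sandbox"

git init -q lib
echo "def double(x): return 2 * x" > lib/utils.py
git -C lib add utils.py && git -C lib commit -q -m "v0.1.0"
git clone -q --bare lib lib.git                            # the submodule's remote
git init -q --bare outer.git                               # the outer repo's remote
git clone -q outer.git outer 2>/dev/null && cd outer
git submodule --quiet add "$sandbox/lib.git" vendor/math-utils
git commit -q -m "Add math-utils submodule" && git push -q origin HEAD

cd vendor/math-utils
git switch -q -c update-halve                              # 1: HEAD attached to a branch
echo "def halve(x): return x / 2" >> utils.py
git commit -q -am "Add halve helper"                       # 2: change is a commit object
git push -q -u origin update-halve                         # 3: SHA exists on the sub's remote
cd ../..
git add vendor/math-utils                                  # 4: stage the new pin
git commit -q -m "Bump math-utils: add halve helper"       # 5: record the pin
git push -q origin HEAD                                    # 6: publish the pin

git clone -q --recursive "$sandbox/outer.git" "$sandbox/teammate"
grep -c halve "$sandbox/teammate/vendor/math-utils/utils.py"
```

Skipping step 3 is the classic failure: the outer repo would then pin a SHA that the teammate's clone cannot fetch.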

Task 5: Force-resync (the universal fix)

git submodule update --init --recursive
# add --force if local submodule changes should be discarded

🧭 Fixes 95% of “my submodule is weird” moments

git submodule update --init --recursive

Safe on any repo. Set git config --global submodule.recurse true to make pull/checkout do it automatically.

15

Capstone: On-Call Debugging Under Pressure

🎯 You will demonstrate you can

  • Compose 5+ advanced Git tools into one realistic end-to-end workflow — without step-by-step instruction.
  • Pick squash/rebase/merge based on the history shape you want, not memorized rules.
  • Trust the reflog safety net after chaining several destructive operations.
  • Read state first, act second — the professional habit that defeats blind-testing.
🩺 30-second readiness check — answer before starting

Without scrolling, answer from memory. If any feels shaky, revisit the listed step before attempting the capstone. Component-skill research (Lovett 2001, Ambrose et al. 2010): 45 min on a weak skill beats hours on the integrated task.

  1. Where do orphaned commits live, and how do you anchor one as a branch? Shaky? → revisit Step 2 (reflog).
  2. What’s the physical difference between git rebase and git revert in terms of which existing SHAs change? Shaky? → revisit Step 11 (revert) — or really, Step 3.
  3. Why does git stash not include feature.py if you never git add-ed it? Shaky? → revisit Step 4 (stash gotchas).
  4. What’s the verb to finish a paused cherry-pick after resolving conflicts? A paused rebase? Shaky? → revisit Step 5 or Step 8.
  5. After git bisect run, what’s the non-negotiable final command, and why? Shaky? → revisit Step 7 (bisect).

All five clear? Proceed. Two or more shaky? Spend 15 minutes on the weak step first. The capstone is an integration exercise — fragile components compound into frustration.

Scenario — no hand-holding

You’re on-call. Page: absolute(-4) == 4 fails on main. CI red. Teammate left a dirty tree with an unrelated note. Nobody knows which of ~6 recent commits broke things.

Your checklist:

  1. Shelve the unrelated in-progress note (tree must be clean for bisect).
  2. Find the bad commit via binary search.
  3. Read its message and diff before touching code (author intent).
  4. Fix on a dedicated branch. Messy WIP commits expected.
  5. Clean up so main sees one focused commit.
  6. Merge to main.
  7. Restore the shelved note.
  8. Verify reflog could still recover everything you rewrote.

Nothing new — every command came earlier. The point is choice and composition under pressure.

Style. Loop: read state → decide → act → re-read state. git status, git log --oneline --graph --all, git reflog are your dashboard. Lost? Re-read state, don’t guess.

The state you walk into

cd /tutorial/myproject
git status
git log --oneline --graph --all -12
python3 test_calculator.py

Hints — open only if stuck for a minute

Task 1 (shelve WIP)

Step 4. One command, noun form. Bisect needs a clean tree.

Task 2 (find the culprit)

Step 7, automated. Test exits 0 = good, non-zero = bad. Always end with git bisect reset (not git reset).

Task 3 (read intent)

Step 6’s chain: git blame + git show <sha>.

Task 4 (messy fix branch)

Branch off main, iterate, make any number of WIP commits, get tests green.

Task 5 (squash into one)

Step 9 rebase -i + squash, or Step 10 merge --squash. Either is fine.

Task 6 (merge)

Whatever strategy leaves main with one clean fix commit on top.

Task 7 (restore note)

Step 4. Inverse of Task 1. Leave uncommitted.

Task 8 (reflog verify)

Step 2. Read-only check: git reflog still sees your pre-squash commits.

Success criteria

  • python3 test_calculator.py prints all tests pass.
  • main ends with exactly one new fix commit.
  • calculator.py still has your uncommitted # TODO: add clamp helper note.
  • git reflog retains your intermediate messy commits.

The “burning down the repo” callback

From Step 1’s antipattern: panic = delete the folder, re-clone, force-push. You did the opposite:

| Situation | What you did | What novices do |
|---|---|---|
| Dirty tree | stash | delete folder |
| Unknown-culprit regression | bisect | read 30 diffs |
| Author intent | blame + show | guess |
| Messy intermediates | rebase / squash | rewrite from scratch |
| “Lost” commits | reflog | panicked rm -rf |

Same competence gap you’ll see on every team for the rest of your career.

🏔️ Stretch (optional, not auto-tested)

Re-run with one extra wrinkle: the shelved note conflicts with the bug-fix line on stash pop. Resolve the conflict, pick keep-both or keep-fix, verify tests + reflog. This is the capstone’s capstone.

🗺️ The unifying schema — one picture

Every command from the basic tutorial and these 14 advanced steps falls into exactly one of three categories. Only category 3 is dangerous to push. Internalize this picture and you can predict the safety of any unfamiliar Git command at a glance.

@startuml
layout vertical
box "1. ALWAYS SAFE - reads state or moves refs without changing history\nNo new SHAs, no force-push needed\n- git blame, git log, git show, git diff, git status\n- git branch (create), git switch, git checkout (read mode)" as Safe
box "2. SAFE TO PUSH - appends new SHAs without changing existing ones\nAdditive only - teammates fast-forward cleanly\n- git commit\n- git cherry-pick\n- git revert (the anti-matter commit)\n- git merge (with or without merge commit)\n- git merge --squash + git commit\n- git stash (local by design, never pushed)" as Additive
box "3. DANGEROUS TO PUSH - rewrites or abandons existing SHAs\nLocal/unpushed branches only - needs --force on shared\n- git rebase\n- git rebase -i (squash, drop, fixup, edit, reword)\n- git commit --amend\n- git reset --hard / --mixed / --soft" as Rewriting
@enduml

The single decision rule: before pushing, ask “did I rewrite or abandon any existing SHAs?” If yes, the command lives in category 3 and your teammates’ clones will diverge. Reach for category 2 (revert, merge, cherry-pick) when undoing pushed work.
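One way to turn that question into a check is to ask whether the remote-tracking tip is still an ancestor of your local tip. A sandbox sketch, assuming a hypothetical bare remote shared.git and a made-up helper named safe_to_push; it only knows what your last fetch or push saw.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"                      # hypothetical scratch area
git init -q --bare shared.git
git clone -q shared.git work 2>/dev/null && cd work
git config user.name "Demo" && git config user.email "demo@example.com"

echo one > notes.txt && git add notes.txt && git commit -q -m "One"
git push -q origin HEAD
echo two >> notes.txt && git commit -q -am "Two"
git push -q origin HEAD
branch=$(git symbolic-ref --short HEAD)

# Hypothetical helper: true when every pushed commit is still in local history.
safe_to_push() { git merge-base --is-ancestor "origin/$branch" HEAD; }

git revert --no-edit HEAD >/dev/null                       # category 2: additive
safe_to_push && after_revert=safe || after_revert=needs-force

git reset -q --hard HEAD~2                                 # category 3: abandons pushed commit "Two"
git commit -q --allow-empty -m "Replacement"
safe_to_push && after_reset=safe || after_reset=needs-force

echo "after revert: $after_revert"
echo "after reset:  $after_reset"
```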

🌱 What to do this week (post-tutorial spaced retrieval)

Without spaced retrieval, ~50% of what you learned today is gone in a week. Twenty minutes total over the next month locks it in:

| When | What |
|---|---|
| Tomorrow (10 min) | Recreate the capstone from a blank slate: same scenario, same tools, no scrolling back. If you stumble, re-do that step (not the whole capstone). |
| In 1 week (5 min) | Pick any 3 commands from this tutorial. From memory: state name, scenario, and the Step 3 schema (creates objects? moves pointers? both?). |
| In 1 month (5 min) | The next time you face a real “lost commit” or “messy branch” at work, reach for git reflog first and rm -rf .git never. That moment is the highest-value retrieval practice you’ll do. |

The Cepeda meta-analysis (254 studies, 14,000+ participants) shows spaced practice produces ~2× better retention than equal-duration massed practice — and the gap widens with delay. This 20 minutes is your highest-ROI study time.