Advanced Git: Debugging, History Rewriting, and Submodules
Go beyond the basics — master detached HEAD, git's object model, stash, cherry-pick, blame, bisect, rebase (including interactive), squash merge, and submodules. Every step is grounded in a real scenario you will meet on the job.
Branches, HEAD, and Detached HEAD
🎯 You will learn to
- Explain why branch creation is O(1) — no files get copied.
- Tell attached from detached HEAD by reading .git/HEAD.
- Anticipate where orphaned commits come from, setting up the reflog rescue.
📚 The 15-step arc (open once, then close)
| Phase | Steps | What you build |
|---|---|---|
| Foundations | 1–3 | Mental model: branches are pointers; commits are immutable hashed snapshots |
| Daily tools | 4–7 | Stash, cherry-pick, blame, bisect — used weekly on real teams |
| History rewriting | 8–11 | Rebase, interactive rebase, squash-merge, revert — when to use each |
| Submodules | 12–14 | Nested repos, the gitlink, six-step publish ceremony |
| Capstone | 15 | Compose 5+ tools under pressure with no hand-holding |
Steps 1–3 are foundational — every later step refers back. After Step 7, take a break before Step 8 (spacing helps consolidation).
Why this tutorial exists
You already know init, add, commit, branch, merge, remotes.
This tutorial lifts the hood — object database, refs, HEAD — so every
“scary” command becomes a safe, predictable pointer move.
Two antipatterns to retire on sight:
| Antipattern | What it looks like |
|---|---|
| Blind-testing | Typing random add/commit/push/pull permutations until errors stop |
| Burning down the repo | Deleting the folder, copying files out, re-cloning, force-pushing |
Both come from an inaccurate mental model. Each step fixes one piece.
Prerequisite self-check
Answer from memory. Any shaky? Revisit the basic tutorial.
- A new file shows red in git status. What is that state called, and which command turns it green?
- After a commit + one more edit, what does bare git diff compare?
- main and feature have diverged. Can merge feature fast-forward?
- A teammate pushed a buggy commit to shared main. reset --hard + force-push, or revert?
- You staged a .env with secrets. Does adding it to .gitignore now help?
Expected answers
- Untracked → git add stages it.
- Working tree vs. index. Index matches HEAD (nothing staged), so you see unstaged edits.
- No. Diverged branches need a merge commit with two parents.
- git revert. Additive; doesn’t break teammates’ clones.
- No. .gitignore only blocks future tracking. Use git rm --cached + rotate the secret.
Task 1: Prove a branch is a 41-byte pointer
Predict first: what’s in .git/refs/heads/main? A commit list? A snapshot?
cd /tutorial/myproject
cat .git/refs/heads/main
cat .git/refs/heads/feature-divide
cat .git/HEAD
Each branch file is one line — a commit SHA. HEAD is
ref: refs/heads/main — a pointer to a pointer.
@startuml
branch main:
A "Initial commit"
B "Add add function"
head main
@enduml
That indirection lets commit advance the branch pointer while HEAD
auto-follows — no HEAD rewrite needed.
Task 2: Detach HEAD and feel the difference
git switch --detach HEAD~1
cat .git/HEAD # now a raw 40-char SHA, not a ref
Detached HEAD = HEAD pinned to a commit, not a branch. Watch the graph: HEAD floats on the commit node itself.
Museum-archive analogy. You can read any document, but notes left without a label have nowhere to go when you leave. git switch -c <name> is that label.
Any commit you make here is anchored to nothing. Switching away with
git switch orphans it. The next step shows how to rescue orphans.
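The attached/detached distinction comes down to what text sits in .git/HEAD. Here is a minimal Python sketch (Python because calculator.py already uses it) that classifies that text. The function name describe_head and the sample SHA are illustrative assumptions, and the check assumes a SHA-1 repository with 40-character hashes.

```python
def describe_head(head_text: str) -> str:
    """Classify the contents of .git/HEAD as attached or detached."""
    text = head_text.strip()
    if text.startswith("ref: "):
        # Attached: HEAD names a branch file, which in turn names a commit.
        return "attached to " + text[len("ref: "):]
    # Detached: HEAD holds a raw commit SHA (40 hex chars is an assumption
    # that holds for SHA-1 repositories).
    if len(text) == 40 and all(c in "0123456789abcdef" for c in text):
        return "detached at " + text[:7]
    raise ValueError(f"unrecognized HEAD contents: {text!r}")


# Hypothetical file contents, for illustration only.
print(describe_head("ref: refs/heads/main\n"))
print(describe_head("4f2a9c1e5b7d8a0c3e6f1a2b4c5d7e8f9a0b1c2d\n"))
```

In a real repository you could feed it the result of reading .git/HEAD before and after git switch --detach to watch the answer flip.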
Cleanup
git switch main
✍️ Before moving on (30-second self-test)
Without scrolling up, answer:
- How many bytes is a branch?
- What’s the physical difference between attached and detached HEAD?
Got both? You’ve internalized the schema this whole tutorial rests on.
Branch Internals & Detached HEAD — Knowledge Check
Min. score: 80%
1. What is a Git branch, physically speaking?
A branch is just a tiny file holding a 40-character SHA. That is why git switch -c new-branch is instantaneous — Git does not copy files, it writes one line of text.
2. Which statements about Detached HEAD state are true? (Select all that apply)
Detached HEAD stores a raw SHA in .git/HEAD. No branch tracks commits made here. Before leaving, create a branch (git switch -c rescue) — otherwise the commits become orphaned and the next step’s reflog is your recovery path.
3. Why can HEAD point to a branch name rather than a commit SHA?
The pointer chain is HEAD → refs/heads/branch → commit. A commit only needs to rewrite the branch file — HEAD dereferences through it. This indirection is the engineering reason branches are cheap.
4. You want to inspect a commit from last week without risking any accidental edits. Which is the safest approach?
git switch --detach enters read-only-feeling detached HEAD at any commit. You can look around freely; git switch main returns you unchanged. git reset --hard would rewrite your current branch — destructive. git checkout <sha> . overwrites files without moving HEAD.
5. Put in order the steps to safely inspect an old commit and return to normal operation. (arrange in order)
git log --oneline main # find the old commit SHA
git switch --detach <sha> # enter detached HEAD at it
cat calculator.py # look around, read freely
git switch main # return; no changes made

git reset --hard <sha> # DANGER: rewinds main
git checkout <sha> . # DANGER: overwrites working dir
git branch -f main <sha> # DANGER: moves branch pointer
Detached HEAD is the safe inspection mode — HEAD anchored to the commit, no branch pointer moved. The distractors all modify state (main’s pointer, working directory), which is the opposite of inspection.
Rescuing Lost Work with git reflog
🎯 You will learn to
- Recover commits lost to bad rebases, hard resets, and detached-HEAD orphans.
- Tell what git log --all can see from what git reflog can see.
- Know reflog’s limits: it’s local, and disappears with the clone.
🤔 Predict first
You make an experimental commit in detached HEAD, then git switch main
away without creating a branch. Can git log --all find that commit?
Can anything?
log --all vs reflog — the load-bearing distinction
| | git log --all | git reflog |
|---|---|---|
| Walks | Commits reachable from refs | Every position HEAD occupied |
| Sees orphans? | No (unreachable = invisible) | Yes (reachability irrelevant) |
| Shared across clones? | Yes | No — local only |
Task 1: Deliberately lose work
cd /tutorial/myproject
git switch --detach HEAD
echo "# experimental note" >> calculator.py
git add calculator.py && git commit -m "Experimental: add note in detached HEAD"
git switch main
git log --all --oneline # the Experimental commit is GONE from this view
It’s orphaned — no ref reaches it, so log --all walks right past.
Task 2: Find the orphan
git reflog
Each line: <sha> HEAD@{n}: <action>: <description>.
| Expression | Meaning |
|---|---|
| HEAD@{0} | where HEAD is now |
| HEAD@{1} | where HEAD was one move ago |
| HEAD@{n} | n moves ago |
The detached-HEAD commit is at HEAD@{1}.
Task 3: Anchor it with a branch
git branch rescued-work HEAD@{1}
git log rescued-work --oneline
The universal recipe: git reflog → note the SHA or HEAD@{n} →
git branch <name> <sha> anchors it as reachable. Works for dropped
commits after interactive rebase, botched resets, failed rebases —
any “lost” commit that’s still in .git/objects.
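The reflog line format described above is regular enough to parse. The following is a rough Python sketch that pulls the most recent commit entry out of reflog text; the sample output, its SHAs, and the helper names parse_reflog and latest_commit_entry are all made up for illustration, and real reflog output may contain entries this simple pattern does not cover.

```python
import re

# One reflog line: "<sha> HEAD@{n}: <action>: <description>"
REFLOG_LINE = re.compile(
    r"^(?P<sha>[0-9a-f]+) HEAD@\{(?P<n>\d+)\}: (?P<action>[^:]+): (?P<desc>.*)$"
)

# Hypothetical output resembling what Task 1 might leave behind.
SAMPLE = """\
4f2a9c1 HEAD@{0}: checkout: moving from 7d3e8b05c1f2a4960e7b8d9c0a1b2c3d4e5f6071 to main
7d3e8b0 HEAD@{1}: commit: Experimental: add note in detached HEAD
4f2a9c1 HEAD@{2}: checkout: moving from main to HEAD
"""


def parse_reflog(output):
    """Turn reflog text into a list of dicts, newest entry first."""
    entries = []
    for line in output.splitlines():
        m = REFLOG_LINE.match(line.strip())
        if m:
            entries.append({
                "sha": m["sha"],
                "n": int(m["n"]),
                "action": m["action"],
                "desc": m["desc"],
            })
    return entries


def latest_commit_entry(entries):
    """Return the newest entry whose action is a commit, or None."""
    for entry in entries:
        if entry["action"].startswith("commit"):
            return entry
    return None


orphan = latest_commit_entry(parse_reflog(SAMPLE))
print(f"git branch rescued-work {orphan['sha']}  # from HEAD@{{{orphan['n']}}}")
```

The printed line is the rescue command from the recipe, built from the parsed entry rather than copied by hand.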
git reflog — Knowledge Check
Min. score: 80%
1. You made three commits in detached HEAD state, then ran git switch main without creating a branch. A teammate asks if the commits are lost. What do you tell them?
Orphaned commits remain in .git/objects/ until git gc prunes them. git reflog shows every position HEAD has been at, including the orphaned one. git branch rescue <sha> rescues the work.
2. In one sentence, why can git reflog show commits that git log --all cannot?
This is the load-bearing distinction. git log --all is a graph traversal starting at refs; an unreachable commit is invisible to it. git reflog is a literal diary of HEAD positions — reachability is irrelevant. Internalize this or later destructive commands will feel unpredictable.
3. What does HEAD@{2} mean?
HEAD@{n} is reflog syntax — n movements back in the HEAD-position log. Different from HEAD~n (n commits back along first-parent chain) and HEAD^n (nth parent of HEAD). Three similar-looking but semantically different suffixes — get them wrong and you will end up at a different commit than you intended.
4. Reflog is local only. Which of these destroys your reflog and the rescue path with it? (Select all that apply)
Reflog lives in .git/logs/. Destroying .git/ (option 1) takes the reflog with it. A fresh clone (option 2) starts with an empty reflog of that clone. Expiry (option 4) is configurable via gc.reflogExpire. A force-push (option 3) rewrites the remote’s branch but doesn’t touch your local reflog — your local rescue path is still intact.
5. Put these steps in the correct order to rescue an orphaned commit you just made in detached HEAD. (arrange in order)
git switch main # leave detached HEAD
git reflog # find the SHA of the orphaned commit
git branch rescued-work HEAD@{1} # anchor it with a branch
git log rescued-work --oneline # verify it is reachable again

git log --all --oneline # would NOT show the orphan
git checkout <sha> # needs the SHA you just lost
git push origin rescued-work
The canonical rescue recipe. Distractor 1 would fail silently — log --all cannot see orphans. Distractor 2 needs the SHA, which was lost when the terminal scrolled. Distractor 3 is a separate sharing concern — irrelevant to rescue.
6. [Revisit Step 3 — preview] In the next step you will see that commits are content-addressable and immutable. Given that, what does git branch rescue <orphan-sha> actually do to the commit object?
Same mechanic you learned in the previous step. Creating a branch is one fwrite() of 41 bytes. Rescue doesn’t move commits; it makes them reachable. This is why rescue is instantaneous regardless of commit size — a concept the next step formalizes as Git’s object model.
Relative Commit Addresses & Git's Object Database
🎯 You will learn to
- Name any commit without a SHA using HEAD~n, BRANCH^, and rev-parse.
- Prove Git’s history model is snapshot-based (commits point to trees that point to blobs holding full file bytes) by hashing content directly.
- Predict that a single trailing space changes the entire SHA chain, and say why that matters for blame later.
🚪 This is the threshold step
Step 3 is the conceptual hinge of the whole tutorial. Every later step (rebase, cherry-pick, bisect, submodules) becomes obvious or mysterious depending on whether the object model clicks here.
If it doesn’t click on the first read, that’s expected — threshold concepts (Meyer & Land) are transformative (they reframe the whole domain) and troublesome (they resist quick mastery). Re-read, re-run the hashing experiment, sleep on it. Most learners need two passes. The recall prompt at the bottom is your self-check.
Relative references
| Expression | Meaning |
|---|---|
| HEAD~n | n commits back along first-parent chain |
| BRANCH^ | shorthand for BRANCH~1 |
| BRANCH^2 | second parent of a merge commit |
@startuml
branch main:
A "Oldest commit"
B "main~2"
C "HEAD~1"
D "HEAD / main"
head main
@enduml
Task 1: Practice
cd /tutorial/myproject
git rev-parse HEAD # current SHA
git rev-parse HEAD~1 # parent
git rev-parse main # same as HEAD
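To see how ~n and ^n differ, it can help to resolve them by hand over a tiny graph. Below is a toy Python resolver, not how git rev-parse is implemented: the PARENTS graph, the single-letter commit names, and the resolve function are hypothetical, and the graph adds a merge commit (D, with side-branch commit F) so that ^2 has something to select.

```python
import re

# Hypothetical history: commit -> list of parents (first parent first).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "F": ["A"],        # side branch off A
    "D": ["C", "F"],   # merge commit: first parent C, second parent F
}
REFS = {"main": "D", "HEAD": "D"}


def resolve(rev):
    """Resolve expressions like main~2, HEAD^ or main^2 in the toy graph."""
    m = re.match(r"^([A-Za-z0-9_/-]+)((?:[~^]\d*)*)$", rev)
    if not m:
        raise ValueError(f"cannot parse {rev!r}")
    name, suffix = m.group(1), m.group(2)
    commit = REFS.get(name, name)
    for op, num in re.findall(r"([~^])(\d*)", suffix):
        n = int(num) if num else 1
        if op == "~":
            # ~n: follow the FIRST parent n times.
            for _ in range(n):
                commit = PARENTS[commit][0]
        elif n > 0:
            # ^n: pick the n-th parent of this one commit (^0 is the commit itself).
            commit = PARENTS[commit][n - 1]
    return commit


print(resolve("main~2"))  # two steps back along first parents
print(resolve("main^2"))  # second parent of the merge commit
```

The two calls land on different commits, which is the whole reason the suffixes are worth telling apart.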
Task 2: Prove content-addressability
Every object in .git/objects/ is addressed by the SHA-1 of its
content. Three object kinds:
| Object | Stores |
|---|---|
| blob | Raw file bytes (no filename) |
| tree | Directory: filename → blob/tree SHA |
| commit | Tree SHA + parent SHAs + author + message |
Hash the same bytes in two unrelated repos:
echo "hello world" | git hash-object --stdin
cd /tmp && git init -q bob-repo && cd bob-repo
echo "hello world" | git hash-object --stdin
cd /tutorial/myproject
Identical 40-char SHA. Same bytes → same hash, always, everywhere. That’s why Git deduplicates across branches and history for free.
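The hash needs no repository at all, which you can check outside Git. This short Python sketch recomputes a blob ID the way SHA-1 repositories do: hash a small header ("blob", the byte length, a NUL byte) followed by the content. The function name git_blob_sha1 is an assumption of this sketch, and SHA-256 repositories would use a different hash function.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Compute the blob ID Git assigns to these bytes (SHA-1 object format)."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# echo appends a newline, so the hashed bytes are "hello world\n".
print(git_blob_sha1(b"hello world\n"))
# → 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

The printed value should match what both git hash-object runs above reported, because nothing in the input depends on the repo, the branch, or the user.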
Task 3: Byte-exact means byte-exact
Predict: hashing "hello world " with one trailing space — same SHA?
printf 'hello world \n' | git hash-object --stdin
Different. One whitespace byte → new blob SHA → new tree SHA → new commit SHA. That’s why reformatter commits (Step 6) mask real authorship: every whitespace tweak rewrites the entire hash chain.
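The blob → tree → commit chain can be sketched in Python by hashing simplified versions of each object. This is an approximation under stated assumptions: the file name greeting.txt, the identity line, the fixed timestamp, and the helper names are invented for illustration, and real trees and commits usually carry more entries and fields.

```python
import hashlib


def obj_sha1(kind: str, body: bytes) -> str:
    """Hash an object as '<kind> <length>\\0<body>' (SHA-1 object format)."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (name, blob_sha) pairs as regular-file tree entries."""
    body = b""
    for name, sha in sorted(entries):
        body += b"100644 " + name.encode() + b"\0" + bytes.fromhex(sha)
    return body


def commit_body(tree_sha: str, message: str) -> bytes:
    """Serialize a parentless commit with a hypothetical, fixed identity."""
    ident = "Alice <alice@example.com> 0 +0000"  # assumption for the demo
    return (
        f"tree {tree_sha}\nauthor {ident}\ncommitter {ident}\n\n{message}\n"
    ).encode()


def snapshot(content: bytes):
    """Return (blob, tree, commit) IDs for a one-file snapshot."""
    blob = obj_sha1("blob", content)
    tree = obj_sha1("tree", tree_body([("greeting.txt", blob)]))
    commit = obj_sha1("commit", commit_body(tree, "Add greeting"))
    return blob, tree, commit


clean = snapshot(b"hello world\n")
spaced = snapshot(b"hello world \n")  # one extra trailing space
for label, a, b in zip(("blob", "tree", "commit"), clean, spaced):
    print(label, "changed" if a != b else "same")
```

All three levels report a change, since each object embeds the hash of the one below it.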
✍️ Before moving on (the unifying invariant)
Close this and answer from memory:
“What’s the one property of existing commit objects that lets every later step in this tutorial work?”
The invariant (peek only after attempting)
Existing commit objects are immutable. Git changes history by creating new objects and/or moving references — never by editing old commits in place.
Every Git command falls into one of these operation categories:
| Operation type | Examples | What changes |
|---|---|---|
| Create immutable objects | hash-object -w, commit, stash, cherry-pick, revert | New blob / tree / commit objects |
| Move refs | branch, reset, fast-forward merge, finalizing a successful rebase | Branch / ref points to a different commit |
| Update index | add, conflicted-resolution staging, merge --squash | Staging area changes |
| Update working tree | switch, restore, checkout, stash pop, submodule update | Files on disk change |
| Transfer objects/refs | fetch, push, pull | Local/remote object/ref sets change |
Most everyday commands combine categories (e.g., commit creates a
commit object and moves a branch ref, leaving the index matching the new commit).
The point isn’t that operations are pure — it’s that no operation
rewrites existing commit objects. Whenever a later step feels
confusing, ask: what objects is this creating? what refs is it
moving? what’s still in .git/objects that I could recover?
Relative Addresses & Object Database — Knowledge Check
Min. score: 80%
1. You want the commit two before main. Which reference is correct?
main~2 walks back 2 commits along the first-parent chain. main^2 means the second parent of a merge commit — completely different. main-2 and main..2 are not valid syntax.
2. What does git hash-object do?
git hash-object is the low-level plumbing that every commit uses internally. Because identical content always yields the same SHA, Git deduplicates identical files across the entire history for free.
3. Which statements about Git objects are true? (Select all that apply)
Git’s history model is snapshot-based: each commit points to a tree, which points to blobs holding full file content. Storage may later be packed and delta-compressed (git gc produces pack files using delta encoding) without changing the model — the abstraction commits expose is always whole snapshots. Filenames live in tree objects, not in blobs, so two files with identical content share one blob.
4. [Revisit Step 1] You are in detached HEAD at a commit that is 4 back from main. Which command prints that SHA without copying it from git log?
git rev-parse is the universal ‘ref → SHA’ translator. It accepts relative (main~4), symbolic (main), short (a73f), or branch/tag references. git show displays the full commit diff, not just the SHA. git log --limit is not valid syntax.
5. Why does git branch feature complete in milliseconds even on a 10-GB repo?
A branch creation is a single fwrite() of 41 bytes. No copying, no traversal, no network. Once you see branches as tiny pointer files, their speed and cheapness stops being mysterious.
6. [Revisit Steps 1-2] You enter detached HEAD at an old commit, make one exploratory commit, and switch away without creating a branch. In terms of Git objects, what happens to that commit?
Objects in Git live until garbage collection. An orphaned commit is not ‘deleted’ — it is just unreachable from any ref. git reflog still records HEAD’s path through it, which is how git branch rescue <sha> can rescue it. This links Step 2’s reflog safety net to Step 3’s object-model view: unreachable ≠ deleted.
7. Put in order the commands that prove Git is content-addressable (same bytes → same SHA, across unrelated repos). (arrange in order)
echo "hello world" | git hash-object --stdin # in your main repo
# note the 40-char SHA; call it SHA-A
cd /tmp && git init -q other && cd other # fresh unrelated repo
echo "hello world" | git hash-object --stdin # same bytes
# output is IDENTICAL: call it SHA-B; SHA-A == SHA-B

git push origin main # sharing has nothing to do with hashing
git config user.email alice # author metadata is in commits, not blobs
git commit -m "hello" # commit SHA includes parent/time; not the demo
Content-addressability is a property of bytes hashed, independent of repo, branch, or user. Distractor 1 (push) is irrelevant — hashes are local. Distractor 2 (email) affects commit objects but not blob hashes. Distractor 3 (commit) creates a commit object whose SHA depends on parent + time + author — not the cleanest demo of blob deduplication.
Saving Work Temporarily with git stash
🎯 You will learn to
- Context-switch cleanly mid-feature without polluting history with WIP commits.
- Pick pop vs. apply correctly.
- Diagnose the classic “stash missed my new file” footgun.
Scenario
You’re mid-feature when your lead yells “hotfix on main, now!”
Your options without stash are all bad: WIP commit (pollutes history),
git restore (destroys work), or stay put (can’t isolate the fix).
git stash is the escape hatch.
🤔 Predict first
After git stash, where does your in-progress work end up — in the
index, in the working tree, in a private commit, or deleted? And what
will git status say about your working tree?
Task 1: See the dirty tree
A half-finished power function is already sitting in calculator.py:
def power(a, b):
    # TODO: add input validation
    return a ** b
cd /tutorial/myproject
git status
git diff
Task 2: Stash it
git stash
git status # clean!
git stash list # your WIP is here
💡 How stash works internally (Step 3 callback)
A stash is a merge commit at refs/stash — first parent is HEAD at stash
time, second parent records the index (and a third parent records untracked
files when you use -u). Same object model as every other commit, which is
why git stash apply <sha> works on any historical stash.
Task 3: Do the hotfix on a dedicated branch
git switch -c hotfix-divide-zero
In the editor, append to calculator.py:
def safe_divide(a, b):
    """Divide a by b, raising ValueError on zero denominator."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b
git add calculator.py
git commit -m "Hotfix: add safe_divide to prevent zero-division errors"
git switch main
git merge hotfix-divide-zero --no-edit
git branch -d hotfix-divide-zero
Task 4: Restore your WIP
git stash pop
git stash list # empty — pop removed it
pop = apply + drop. Use apply instead if you want to keep the stash
(e.g. to apply it on multiple branches).
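The pop = apply + drop relationship can be modeled with a toy stack. This Python sketch is only a mental model, not how Git stores stashes: the StashStack class and its string "changes" are hypothetical, and it ignores the real-world detail that a pop which hits a conflict leaves the stash entry in place.

```python
class StashStack:
    """Toy model of the stash list; the newest entry is stash@{0}."""

    def __init__(self):
        self._entries = []

    def push(self, changes):
        # New stashes go on top, shifting older ones down.
        self._entries.insert(0, changes)

    def apply(self, index=0):
        # Hand the changes back but leave the entry in the list.
        return self._entries[index]

    def drop(self, index=0):
        # Delete the entry without applying it.
        del self._entries[index]

    def pop(self, index=0):
        # pop is literally apply followed by drop.
        changes = self.apply(index)
        self.drop(index)
        return changes

    def list(self):
        return [f"stash@{{{i}}}: {c}" for i, c in enumerate(self._entries)]


stack = StashStack()
stack.push("WIP: power function")
print(stack.apply(), "| entries left:", len(stack.list()))
print(stack.pop(), "| entries left:", len(stack.list()))
```

After apply the entry is still listed, so the same changes can be applied again on another branch; after pop the list is empty.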
📋 Full stash cheat sheet (other flags)
| Command | Effect |
|---|---|
| git stash | Save tracked mods + staged; clean tree |
| git stash pop | Restore and drop the top stash |
| git stash apply | Restore but keep the stash |
| git stash drop | Delete without applying |
| git stash push -m "msg" | Save with a message |
| git stash -u | Also include untracked files |
Gotcha: plain git stash skips untracked (never-add-ed) files. Use
-u to include them — the most common stash footgun.
Task 5: Finish the feature
Edit calculator.py so power has real validation, then commit
(message must include “power”):
def power(a, b):
    """Return a raised to the power of b."""
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError("Arguments must be numbers")
    return a ** b
git stash — Knowledge Check
Min. score: 80%
1. You are mid-edit on app.py when your lead asks for an urgent hotfix on main. You have NOT staged your changes yet. Which approach keeps your tree clean for the hotfix without losing your in-progress work?
git stash is built for this: it saves tracked modifications and staged changes to a private stack and resets the tree so you can context-switch cleanly. Recover the work with git stash pop.
2. What does git status report immediately after git stash?
git stash resets the working tree to match HEAD — so git status reports clean. Your changes are safe in the stash commit at refs/stash.
3. Difference between git stash pop and git stash apply?
Use pop for the usual workflow. Use apply when you want the same stash on multiple branches — the entry stays in the list until you manually git stash drop.
4. You ran git stash but your brand-new file feature.py (never git add-ed) is still there. Why?
Plain git stash only captures what Git is tracking — modified tracked files and staged changes. For brand-new files, use git stash -u (--include-untracked).
5. [Evaluate] A teammate says: ‘I never use stash — I just commit with WIP and squash later.’ Best evaluation?
Both preserve work. The difference is visibility. Pushed WIP commits enter shared history and degrade git log, git bisect, and code review. Stash is private — no pollution, but you can forget it. Neither is universally right.
6. [Revisit Step 1] You stashed on main, then switched to a commit with git switch --detach HEAD~2 to inspect old code. What is the safe way to recover the stash?
Stashes are not tied to a branch — but a conflicting pop in detached HEAD leaves you with unresolved changes and nothing anchoring them. Always return to a named branch before popping.
7. [Revisit Step 3] Where does Git physically store a stash entry?
A stash is a proper commit in the object database, anchored by refs/stash. This is why git stash survives across terminals and reboots, and why git stash apply <sha> works with any historical stash. Same object model as Step 3 — stash is not a special case.
8. Put in order the complete “stash → hotfix → resume” workflow from Task 3 of this step. (arrange in order)
git stash # save WIP, clean working tree
git switch -c hotfix-xyz # dedicated branch for the hotfix
git commit -am "Hotfix: ..." # fix + commit on the hotfix branch
git switch main && git merge hotfix-xyz --no-edit
git branch -d hotfix-xyz # clean up the merged hotfix branch
git stash pop # restore your WIP on main

git commit -m "WIP" # pollutes history if pushed
git restore . # discards WIP permanently
git stash drop # would throw away the WIP
git push origin stash # you cannot push a stash
The canonical context-switch sequence. Each distractor is a common novice mistake — committing WIP pollutes shared history; git restore destroys work; dropping the stash before popping loses it; stashes are local-only (no push). Learn this six-line sequence as a unit.
Cherry-Pick: Copy One Specific Commit
🎯 You will learn to
- Pick cherry-pick for one-commit backports; reject it for many-commit integration.
- Resolve a cherry-pick conflict end-to-end (same marker dance as merge — different final verb).
- Explain why the copied commit has a new SHA (apply Step 3’s object model).
Scenario
Lead: “The absolute helper on experimental is useful on main too.
Bring that one commit over — leave the half-baked multiply behind.”
🤔 Predict first
Cherry-pick produces a new commit with a new SHA. What happens to the
original commit on experimental — does it move, get rewritten,
vanish, or stay put unchanged?
cherry-pick <sha> replays one commit’s patch on top of HEAD as a
new commit (new parent → new SHA, same message + diff).
Task 1: Inspect
The pre-built experimental has two commits: a half-baked
experimental_multiply, and a reusable absolute.
cd /tutorial/myproject
git log experimental --oneline
You only want the second commit.
Task 2: Cherry-pick the tip
A branch name resolves to its tip commit — no SHA copy needed:
git switch main
git cherry-pick experimental
git log --oneline
A new commit Add absolute value function sits on main with a
different SHA from the original. Same patch, new parent → new SHA.
💡 Schema check (Step 3 callback). Cherry-pick creates a new immutable object and moves the branch pointer to it. The original commit on experimental is untouched: Git never edits commits in place. This pattern repeats in every step from here on.
🔍 Contrast: what’s not like cherry-pick. git branch foo at the same commit creates zero new objects (just a 41-byte ref file). Both move pointers; only cherry-pick also creates a new commit. That’s why branch creation is instant and cherry-pick can fail with a conflict.
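Why a new parent forces a new SHA can be shown by hashing two commit objects that differ only in their parent line. The Python sketch below follows the SHA-1 commit layout (tree, parents, author, committer, blank line, message); every concrete value in it (tree ID, parent IDs, identity, timestamp) is a made-up placeholder, and the helper name commit_sha1 is an assumption.

```python
import hashlib


def commit_sha1(tree, parents, author, committer, message):
    """Hash a commit object built from the given fields (SHA-1 format)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    header = f"commit {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical values, identical for both commits except the parent.
TREE = "a" * 40
IDENT = "Alice <alice@example.com> 0 +0000"
MESSAGE = "Add absolute value function"

original = commit_sha1(TREE, ["1" * 40], IDENT, IDENT, MESSAGE)  # on experimental
picked = commit_sha1(TREE, ["2" * 40], IDENT, IDENT, MESSAGE)    # replayed on main

print(original)
print(picked)
print("same SHA?", original == picked)
```

Even with the tree, author, and message held constant here (a real cherry-pick usually changes the tree and committer too), the differing parent alone yields a different SHA.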
Task 3: Produce and resolve a conflict
Make the same line differ on both branches:
On main, edit calculator.py so def add(a, b): return a + b becomes:
def add(a, b):
    """Return the sum of two numbers."""
    return a + b
git add calculator.py && git commit -m "Document add function"
On experimental, change the same line differently:
git switch experimental
Edit to:
def add(a, b): return a + b # simple addition
git add calculator.py && git commit -m "Inline comment on add"
git switch main
git cherry-pick experimental # CONFLICT
git status
You’ll see <<<<<<< / ======= / >>>>>>> in the file. Conflicts
are not failures — Git is asking a human to combine two valid
changes.
Edit the block to keep both sides:
def add(a, b):
    """Return the sum of two numbers."""
    return a + b # simple addition
git add calculator.py
git cherry-pick --continue # NOT `git commit` — use the cherry-pick verb
🆘 Stuck on the conflict?
- Open calculator.py and find the <<<<<<< / ======= / >>>>>>> block.
- The block has two halves: above ======= is what you have (HEAD), below is what’s coming in (the cherry-picked commit).
- Edit so the result keeps the docstring and the inline comment, then delete all three marker lines.
- git add calculator.py → git cherry-pick --continue.
- To bail at any point: git cherry-pick --abort resets cleanly.
Cherry-Pick — Knowledge Check
Min. score: 80%
1. What does git cherry-pick <sha> do?
Cherry-pick replays one commit as a new commit on HEAD. The source commit is unchanged. The new commit has the same patch and message but a new parent and therefore a new SHA.
2. You cherry-pick commit abc123 from experimental onto main. Afterwards, what is on experimental?
Cherry-pick is a copy operation. The source commit stays where it is. Two commits with the same patch now live in two branches with different SHAs.
3. During a cherry-pick, Git reports a conflict. Which sequence correctly completes it?
Standard conflict resolution: edit the file to remove the <<<<<<< / ======= / >>>>>>> markers, git add to mark it resolved, then git cherry-pick --continue. After a conflict, --continue normally opens your editor pre-filled with the original message; recent Git versions accept --no-edit to keep it unchanged. --abort bails and restores HEAD.
4. [Revisit Step 3] After a cherry-pick, the new commit has a different SHA from the source. Why?
A commit’s SHA is SHA-1(tree + parent(s) + author + committer + message). Same patch on a different parent → different tree (possibly) and definitely different parent reference → different SHA. Step 3’s object-model lesson makes this inevitable.
5. Which scenario is a bad fit for cherry-pick?
For integrating many commits, use git merge or git rebase — cherry-picking 50 commits by hand is laborious and loses merge base information, which complicates future merges. Cherry-pick is surgical — reserve it for one or a few commits.
6. [Revisit Step 4] Mid-cherry-pick, a conflict pauses Git. You realize you need to check something on another branch first. Which sequence safely preserves your conflict-resolution progress so far?
You cannot cleanly stash or switch with an in-progress cherry-pick — Git’s internal state (MERGE_MSG, CHERRY_PICK_HEAD, conflicted index) is not stash-compatible. Abort, switch, do the other task, come back, and re-start the cherry-pick. The abort is cheap and restores a clean state.
7. Put in order the commands that resolve a conflicted cherry-pick end-to-end. (arrange in order)
git switch main # be on the target branch
git cherry-pick <sha> # CONFLICT reported
git status # see which files conflict
# edit the conflicted file: remove <<<<<<<, =======, >>>>>>> markers
git add <file> # mark resolved
git cherry-pick --continue # finalize with original message

git commit -m "resolve conflict" # breaks the cherry-pick flow
git cherry-pick --force # not a real flag
git merge --continue # wrong verb: not a merge
git reset --hard # would abort + discard
The post-conflict verb is cherry-pick --continue, not commit. The other distractors are common reflex mistakes — --force doesn’t exist here; merge --continue is a different operation; reset --hard discards rather than finalizes. Use --abort to bail out cleanly.
git blame: Who Last Changed This Line (and Why)?
🎯 You will learn to
- Answer “why does this line exist?” by chaining blame -L → show <sha>.
- Predict when plain blame lies: reformatter commits mask real authors.
- Defuse the lie with -w or blame.ignoreRevsFile.
- Recognize blame’s blind spot: it can only see existing lines.
The two-command forensic workflow
- git blame -L <start>,<end> <file> → find the SHA that last touched the line.
- git show <sha> → read the commit message and diff; the why lives here.
Blame is for context, not accusation.
Task 1: Why does this line exist?
git blame -L 7,7 calculator.py
# Copy the SHA from the first column, then:
git show <that-sha>
Who, when, why — covered. That chain is 90% of real blame use.
Task 2: The reformatter-masked authorship case
Setup planted: Bob wrote clip. CI-Bot later ran whitespace
normalization (no logic change).
Predict: who will plain blame name as the last author of def clip?
git blame -L 1,$(wc -l < calculator.py) calculator.py | grep -i 'clip'
Last-toucher wins — blame names CI-Bot, masking Bob. Inspect:
git show <ci-bot-sha> # pure whitespace diff
Add -w to skip whitespace-only changes:
git blame -w -L 1,$(wc -l < calculator.py) calculator.py | grep -i 'clip'
Now the author is Bob — the real logic author. For recurring formatters, persist this:
echo "<ci-bot-sha>" >> .git-blame-ignore-revs
git config blame.ignoreRevsFile .git-blame-ignore-revs
GitHub’s web blame UI honors this file too.
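The “last toucher wins” rule, and how ignoring whitespace changes the answer, can be imitated with a toy model. This Python sketch is not Git’s blame algorithm (which works on diffs between commits): it simply credits each final line to the oldest snapshot in an unbroken run that already contained it. The toy_blame function, the two-snapshot history, and the clip source lines are invented for illustration.

```python
def toy_blame(history, ignore_whitespace=False):
    """Credit each line of the newest snapshot to an author.

    history: list of (author, lines) snapshots, oldest first.
    """
    def norm(line):
        # With ignore_whitespace, compare lines with all whitespace removed.
        return "".join(line.split()) if ignore_whitespace else line

    newest_author, newest_lines = history[-1]
    result = []
    for line in newest_lines:
        author = newest_author
        # Walk back in time while older snapshots already contain the line.
        for older_author, older_lines in reversed(history[:-1]):
            if norm(line) in {norm(old) for old in older_lines}:
                author = older_author
            else:
                break
        result.append((author, line))
    return result


# Hypothetical history: Bob writes clip, CI-Bot only normalizes spacing.
HISTORY = [
    ("Bob", ["def clip(x,lo,hi):", "    return max(lo,min(x,hi))"]),
    ("CI-Bot", ["def clip(x, lo, hi):", "    return max(lo, min(x, hi))"]),
]

print([a for a, _ in toy_blame(HISTORY)])                          # plain
print([a for a, _ in toy_blame(HISTORY, ignore_whitespace=True)])  # like -w
```

Plain mode credits CI-Bot for both lines, while the whitespace-insensitive mode credits Bob, mirroring the before/after of adding -w in this task.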
Task 3: Default blame vs. HEAD -- blame
Predict first: if your working tree has uncommitted edits to a
file, will plain git blame <file> show those uncommitted lines or
hide them?
echo "# uncommitted note" >> calculator.py
git blame calculator.py | tail # the uncommitted line is shown — with a zero SHA "Not Committed Yet"
git blame HEAD -- calculator.py | tail # only what's committed at HEAD
git restore calculator.py # discard the experimental edit
The distinction. Default git blame <file> annotates the file
as it currently is on disk — uncommitted lines included, marked
with the zero SHA 00000000 and the author “Not Committed Yet”.
git blame HEAD -- <file> instead asks “who last touched this line
in the version recorded at HEAD?” Different question, different
answer when the working tree is dirty.
Still a real blind spot, though. Blame can only attribute existing
lines (in either mode). A bug caused by a deleted line is invisible.
For deletions, reach for git log -p, git log -S (pickaxe search),
or git bisect (next step) — the official Git docs are explicit that
deleted/replaced lines require diff- or pickaxe-style history search.
📋 Full flag cheat sheet (`-C`, `-M`, `ignoreRevsFile`)
| Flag | Use when |
|---|---|
| -L start,end | You know which lines matter (avoid scanning 1000 lines) |
| -w | A reformatter was the last toucher |
| -C -M | A line moved or was copied across files |
| blame.ignoreRevsFile | Permanently skip known reformat commits |
💡 Sanity check: when `-w` is a no-op (try it)
git blame -L 1,$(wc -l < calculator.py) calculator.py | grep -i 'def add'
Plain blame already shows the real author — -w is identical here.
Rule: -w matters only when a reformatter was the last toucher.
git blame — Knowledge Check
Min. score: 80%
1. What does git blame calculator.py show?
Blame gives per-line provenance — the last-touching commit and author. Combined with git show <sha>, you see the full context: why the line was written this way.
2. You need to know the commit message for line 42’s last modification. Which sequence gets you there fastest?
git blame -L 42,42 restricts output to line 42 only — instant. The first column is the SHA; pipe that SHA into git show for the full message. This two-step recipe is idiomatic.
3. A colleague recently ran black across the whole repository. Now git blame shows them as the author of every line. How do you see the real last-meaningful author?
-w ignores whitespace-only changes, hiding pure reformatting from blame. For recurring formatters, add the reformat commit SHAs to a file referenced by blame.ignoreRevsFile — now everyone skips them consistently.
4. [Revisit Step 3] When git blame prints a SHA for a line, what kind of object does that SHA refer to?
Blame attributes lines to commits. The SHA printed is a commit SHA — run git cat-file -t <sha> to confirm it reports commit. You then use git show <sha> to read it.
5. When is git blame the wrong tool for finding a bug?
Blame only tells you about existing lines. A bug caused by an absent line (e.g., forgetting to call validate()) leaves blame blind. For regressions introduced by a missing line, use git bisect (next step) or git log -p to scan history.
6. [Analyze] Give a concrete bug where git blame would mislead you even though the culprit line IS in the file. Which of the following fits best?
Reformatter commits are the classic blame-mislead scenario. The CI bot’s commit ‘last touched’ every line, so blame attributes all lines to the bot — hiding the real author who introduced the logic bug. Defense: git blame -w to skip whitespace-only changes, or blame.ignoreRevsFile to skip known reformat commits.
7. [Revisit Step 4] Your working tree has an uncommitted edit to calculator.py. You run plain git blame calculator.py. What do you see for the modified line?
Default git blame <file> annotates the file as it currently is — so an uncommitted line appears with the zero SHA 00000000 and author “Not Committed Yet”. To restrict to the committed version of the file, use git blame HEAD -- <file>. Two different questions (“who touched what I’m reading right now?” vs. “who touched what’s recorded at HEAD?”); two different commands. Note this is separate from the deletion blind spot — a line that no longer exists in the file is invisible to either mode of blame.
8. Put in order the “forensic chain” for understanding why a specific line in parser.py exists.
(arrange in order)
git blame -L 42,42 parser.py # find SHA of last change to line 42
# copy the SHA from the first column
git show <sha> # read commit message + full diff
# is the commit message a reformatter? If yes, try -w:
git blame -w -L 42,42 parser.py # ignore whitespace-only commits
Distractors:
git log parser.py | grep 42 # slow, imprecise
git diff HEAD parser.py # shows YOUR uncommitted edits, not authorship
git status parser.py # unrelated to authorship
The blame → show chain answers “why does this line exist?” The -w fallback defuses reformatter masking. The distractors are all plausible-looking commands that don’t answer the authorship question — common cul-de-sacs when learners panic-grep instead of reaching for blame.
git bisect: Binary Search for the Commit That Broke Things
🎯 You will learn to
- Decide when bisect is worth reaching for (rule: ≥ ~5 commits or slow tests).
- Run an automated bisect end-to-end and always reset afterward.
- Spot regressions blame cannot find — deletions, behavioral changes, and anything involving missing lines.
🤔 Predict first
A regression appeared somewhere in the last 1000 commits.
Roughly how many tests would git bisect need to find the exact
breaking commit? Pick one before reading on: 1000, 500, 100, or ~10.
Why bisect beats every alternative
Reading 30 diffs by hand is slow. blame can’t see missing lines.
log --grep="fix" is wishful thinking.
Bisect runs binary search on history: log₂(30) ≈ 5 tests to pin
the exact culprit. 1000 commits → ~10 tests. Scales forever.
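The log₂ claim can be checked with a small simulation. This is a sketch under a simplifying assumption: history is a straight line, ordered oldest to newest, with the last commit known bad. Real bisect works on a commit graph, and `find_first_bad` is a hypothetical helper, not a Git API.

```python
def find_first_bad(commits, is_bad):
    """Binary-search a linear, oldest-to-newest list whose last entry is known bad.

    Returns (first_bad_commit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one checkout plus one test run
        if is_bad(commits[mid]):
            hi = mid  # culprit is mid or earlier
        else:
            lo = mid + 1  # culprit is after mid
    return commits[lo], tests


# Hypothetical: 1000 commits numbered 1..1000, regression introduced at 700.
commits = list(range(1, 1001))
culprit, tests = find_first_bad(commits, lambda c: c >= 700)
print(culprit, tests)
```

Whichever commit is the culprit, the loop halves the candidate range on every test, so 1000 candidates need at most 10 tests and 30 need at most 5.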
Task 1: See the regression
Setup planted 5 commits; one of them broke absolute(-4) == 4.
cd /tutorial/myproject
git log --oneline -7
python3 test_calculator.py # AssertionError
Task 2: Manual bisect (feel the motion)
git bisect start
git bisect bad HEAD
git bisect good HEAD~5
# Git checks out a midpoint. Test it:
python3 test_calculator.py
# exit 0 → git bisect good ; exit ≠ 0 → git bisect bad
# Repeat until Git prints "<sha> is the first bad commit"
git bisect reset
Task 3: Automated bisect (the real-world default)
git bisect start HEAD HEAD~5
git bisect run python3 test_calculator.py
git bisect reset
bisect run uses the script’s exit code to drive the search: 0 = good,
1–127 (except 125) = bad, 125 = skip this commit, anything else aborts
the bisect. Always finish with reset; otherwise HEAD stays detached on
the last commit bisect checked out.
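As a rough sketch of how `git bisect run` reads exit codes, the mapping below follows the git-bisect documentation: 0 is good, 125 is skip, other codes from 1 to 127 are bad, and anything else aborts. The `bisect_verdict` and `run_check` helpers and the inline one-line checks are hypothetical stand-ins for a real test script.

```python
import subprocess
import sys


def bisect_verdict(exit_code: int) -> str:
    """Classify a test command's exit code the way `git bisect run` does."""
    if exit_code == 0:
        return "good"
    if exit_code == 125:
        return "skip"  # commit cannot be tested (e.g. it does not build)
    if 1 <= exit_code <= 127:
        return "bad"
    return "abort"  # bisect stops instead of guessing


def run_check(source: str) -> str:
    """Run a one-line Python check in a child process and classify the result."""
    result = subprocess.run([sys.executable, "-c", source], capture_output=True)
    return bisect_verdict(result.returncode)


print(run_check("assert abs(-4) == 4"))  # good
print(run_check("assert -4 == 4"))       # bad: an uncaught AssertionError exits 1
```

A plain Python test that fails on an assertion exits with status 1, which is why `python3 test_calculator.py` works as a bisect oracle without any wrapper.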
Task 4: Fix the bug
Bisect points at Simplify absolute (BUG: removes negation!).
In the editor, restore the body of absolute:
def absolute(x):
"""Return |x| — handles negatives, zero, and positives."""
return x if x >= 0 else -x
git commit -am "Fix: restore negation in absolute"
python3 test_calculator.py # all tests pass
⚠️ Test-portability caveat (real-world bisects)
Bisect runs the test at every historical commit in range. If the test itself was added mid-range, older commits won’t have it and bisect breaks. Restore the modern test each iteration:
git bisect run -- bash -c 'cp /tmp/test.py . && python3 test.py'
🌙 Halftime: take a break before Step 8
You’ve finished the daily tools phase (stash, cherry-pick, blame, bisect). Steps 8–11 are history rewriting — denser and structurally riskier.
Walk away for at least 30 minutes (overnight is better) before continuing. Spaced practice is one of the most replicated findings in cognitive science: a 30-minute break before harder material produces measurably better retention than pushing straight through. Your hippocampus consolidates while you’re not studying.
When you come back, predict from memory: what does git stash actually save?
Why does cherry-pick create a new SHA? If those don’t come fast, re-do the
step. If they do, Step 8 awaits.
git bisect — Knowledge Check
Min. score: 80%
1. A regression appeared somewhere in the last 50 commits. Roughly how many tests does git bisect need to find the exact breaking commit?
Binary search halves the range each test: 50 → 25 → 13 → 7 → 4 → 2 → 1. About 6 iterations. For 1000 commits, ~10 tests. This scaling is why bisect is irreplaceable on long-running projects.
2. Which sequence correctly runs an automated bisect?
You must tell bisect the boundaries first — bad (usually HEAD) and a known-good earlier commit. Only then can run automate. The command’s exit code (0 = good, nonzero = bad) drives the search.
3. You ran git bisect run successfully. What must you do afterwards?
git bisect reset is non-negotiable. It restores HEAD to where you started and removes bisect’s temporary refs. Skipping it leaves HEAD on a random historical commit — a common cause of ‘why is my code weird?’ panic.
4. Which test property is required for git bisect run to work?
Bisect uses the exit code as its oracle. Also critical: the test must actually run at every historical commit — if the test file was added mid-range, older commits will fail to even find the test, confusing bisect. Use git bisect run -- bash -c 'cp /tmp/test.py . && python3 test.py' to work around this.
5. [Revisit Step 1] In the middle of a manual bisect, Git leaves HEAD at a historical commit while you decide good/bad. What HEAD state are you in, and why is that OK?
During bisect, HEAD is detached at whichever historical commit Git picked as midpoint. That is fine because bisect keeps its own state: refs/bisect/* record the good and bad marks, and .git/BISECT_START remembers where you began. git bisect reset restores the pre-bisect HEAD. Same detached-HEAD concept as Step 1, just used in service of a search.
6. [Revisit Step 6] A bug appears because a line that used to exist was deleted. Which tool finds the deletion commit?
Blame only attributes existing lines. A deletion is invisible to blame (the line isn’t there!). Bisect operates on behavior, not lines: if the test failed after commit X and passed at commit X-1, X is the culprit, regardless of whether X added, modified, or deleted code.
7. Put in order the commands for an automated bisect that finds a regression in the last 100 commits and returns HEAD to normal. (arrange in order)
python3 test_calculator.py # verify HEAD currently fails
git bisect start HEAD HEAD~100 # bad=HEAD, good=100 back
git bisect run python3 test_calculator.py # Git iterates ~7 times
# Git prints: "<sha> is the first bad commit"
git bisect reset # return HEAD to pre-bisect state
Distractors:
git bisect stop # not a real subcommand
git bisect --force # not a real flag
git reflog # unrelated to bisect workflow
git bisect run -- python3 test.py HEAD HEAD~100 # wrong arg order
Bisect needs boundaries first (start <bad> <good>), then run. The run script’s exit code (0 = good, nonzero = bad) drives the binary search — ~log₂(100) ≈ 7 iterations. reset is non-negotiable; skipping it leaves HEAD on a historical midpoint commit and your code “looks weird.”
Rebase: Integrate Changes Without a Merge Commit
🎯 You will learn to
- Pick rebase for short local branches, merge for shared/long-lived ones — and say why.
- Produce linear history with rebase + fast-forward merge (no diamond).
- Resolve a rebase conflict: same marker dance as merge, but finish with rebase --continue.
- Recover from a bad rebase using reflog (Step 2’s safety net applied).
Mental model: the video-editor timeline cut
Select the clips (commits) unique to your feature, cut, move playhead to
main’s tip, paste. Each paste is a new commit object — same patch,
new parent, new SHA. Originals stay in .git/objects (reflog recovers).
💡 Schema check (Step 3 callback). Rebase = “cherry-pick a series” under the hood. New objects, branch pointer moved. Same mechanic Step 5 used on one commit; Step 8 just iterates.
🔍 Contrast — what’s not like rebase. A fast-forward merge on a strict-extension branch creates zero new commits —
main’s pointer just slides forward to the feature tip. Rebase + ff-merge together produce linear history because rebase did all the new-commit-creation up front; the merge has nothing left to do.
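The "same patch, new parent, new SHA" point can be sketched by hashing a commit object by hand. This assumes a SHA-1 repository and the standard commit layout (a `commit <size>` header, a NUL byte, then the tree, parent, author, and committer lines, a blank line, and the message). The identity string and both parent hashes are hypothetical placeholders.

```python
import hashlib


def commit_sha(tree, parents, author, committer, message):
    """SHA-1 of a commit object: 'commit <size>' + NUL + body."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode("utf-8")
    header = f"commit {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # Git's well-known empty tree
WHO = "Dev Example <dev@example.com> 1700000000 +0000"   # hypothetical identity
OLD_BASE = "a" * 40  # hypothetical: where the feature originally branched
NEW_BASE = "b" * 40  # hypothetical: main's current tip

before = commit_sha(EMPTY_TREE, [OLD_BASE], WHO, WHO, "Add square_root\n")
after = commit_sha(EMPTY_TREE, [NEW_BASE], WHO, WHO, "Add square_root\n")
print(before != after)  # True: only the parent line differs
```

In a real rebase the tree and the committer timestamp usually change as well, but the parent line alone is enough to force a new SHA.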
Task 1: Inspect the divergence
Pre-built: feature-sqrt has square_root; main later got
Bump version notes + Add identity helper.
cd /tutorial/myproject
git log --all --oneline --graph --decorate
Task 2: Rebase and fast-forward
Predict before running: how many parents will the feature tip have after rebase?
git switch feature-sqrt
git rebase main
git switch main
git merge feature-sqrt # fast-forward, no merge commit
git branch -d feature-sqrt
Result: one linear line on the graph. No diamond.
Task 3: Rebase through a conflict (desirable difficulty)
Real rebases conflict when upstream touched the same lines. Produce one deliberately:
git switch -c feature-trailer main~1
echo '# end-of-module trailer' >> calculator.py
git commit -am 'Add trailer comment at end of file'
git rebase main # CONFLICT — both sides appended at EOF
git status
Conflicts aren’t failures — they’re “two valid changes touched the
same lines; a human must combine them.” Edit calculator.py so the
bottom keeps both the identity helper and your trailer
comment, removing the <<< / === / >>> markers.
git add calculator.py
git rebase --continue # NOT `git commit` — use the rebase verb
git switch main
git branch -D feature-trailer
Remember: rebase conflict = merge conflict mechanics, but finalize with
git rebase --continue. Bail out with git rebase --abort.
When to rebase vs merge
| Situation | Prefer |
|---|---|
| Short feature branch (hours–days), only you | Rebase |
| Long-lived or already-pushed branch used by teammates | Merge |
| Cardinal rule | Never rebase shared history |
Rebase — Knowledge Check
Min. score: 80%
1. What does git rebase main do when run on feature-sqrt?
Rebase rewrites the branch: feature-sqrt’s unique commits become new commits on top of main. This linearizes history but changes SHAs — so never rebase pushed branches others are using.
2. After rebasing, why does the rebased commit have a different SHA than before?
Step 3 again: SHA(commit) = SHA-1(tree + parent(s) + author + committer + message). Change the parent → new SHA. Same patch, new identity.
3. When is git rebase a bad idea?
Rebase rewrites history. If others have the old SHAs, their branches will diverge and they will get ugly conflicts. Stick to merge for anything pushed and shared, rebase for local linearization.
4. [Revisit Step 2] You rebased feature-sqrt and realized it broke everything. Before pushing. How do you recover the pre-rebase state?
Rebase is only ‘destructive’ in the sense of changing branch pointers; the original commits remain in .git/objects until garbage collection. git reflog records every HEAD position, including the pre-rebase tip; git reset --hard <reflog-sha> restores it. Your safety net, earned in Step 2.
5. After rebasing a feature onto main, you run git merge feature on main. What happens?
After rebase, feature is a strict linear extension of main. The merge reduces to just advancing the main pointer (fast-forward). This is the whole reason many teams rebase before merging: clean, linear history.
6. Which statements about rebase are true? (Select all that apply)
Rebase applies each patch in turn and can conflict at any of them — you resolve, git add, git rebase --continue. --abort restores the pre-rebase state. Rebase does not push anything; that is a separate git push step (often needing --force-with-lease on rebased branches, which is where collaborator pain happens).
7. [Revisit Step 5] You had two choices for bringing a colleague’s single fix into main: cherry-pick or rebase. Both create new commits with new SHAs. What is the key difference in intent?
Under the hood they use the same machinery — patch out, replay on new parent, new SHA. The difference is scope: cherry-pick = one commit, rebase = a series. Step 5’s cherry-pick and Step 8’s rebase are the same technique at different scales.
8. [Revisit Step 3] During rebase, a conflict halts you mid-stream. git reflog at this moment shows many entries. Which entry do you want to git reset --hard to if you decide to abort manually instead of using git rebase --abort?
Reflog logs every HEAD movement. The pre-rebase position is typically labeled checkout or the last commit before the rebase entries. That SHA is the pre-rebase branch tip — the safe rescue point. This is the Step 2 reflog safety net applied to rebase.
9. [Revisit basic tutorial Step 11] You edit a conflicted file during a git rebase, remove the <<<<<<< / ======= / >>>>>>> markers, and run git add. What is the next command to finalize this one commit of the rebase?
A rebase conflict uses the same markers and the same git add step as a merge conflict (basic tutorial Step 11). The only difference is the final verb — git rebase --continue tells Git to replay the remaining commits, which git commit would not. Running git commit by reflex here often leaves the rebase half-done. git rebase --abort at any point restores the pre-rebase state.
10. Put in order the commands to rebase a private 3-commit feature branch onto the latest main and fast-forward merge, with nothing leftover on disk.
(arrange in order)
git switch feature # be on the branch being rebased
git rebase main # replay feature commits onto latest main
git switch main # target the integration branch
git merge feature # fast-forward, no merge commit
git branch -d feature # clean up the now-merged branch
Distractors:
git push --force # DANGER on shared branches
git merge feature --no-ff # would create a merge commit, not a fast-forward
git rebase feature # backwards: rebases main onto feature
git reset --hard feature # destroys main's history
The correct direction is “rebase the shorter branch onto the longer.” Running rebase feature from main (distractor 3) does the opposite — rebases main onto feature, usually rewriting commits you didn’t want to touch. --no-ff prevents fast-forward (that’s the point of this strategy — a linear, no-merge-commit result). --force has no place in a local pre-PR workflow.
Interactive Rebase: Edit, Squash, Reorder, Drop
🎯 You will learn to
- Squash messy WIP commits into one clean commit before opening a PR.
- Drop an accidentally-committed secret (and recover it from reflog if needed).
- Reword a commit message retroactively without changing its diff.
- Pick the right verb (pick/reword/squash/fixup/drop/edit) for the rewriting goal.
🚪 This is the second threshold step
Step 9 is the densest step in the tutorial — eight verbs, several edge cases, and the most “wait, what?” moments in real Git. That’s not a bug; it’s where most engineers’ command of Git plateaus. Crossing this threshold is what separates “I use Git” from “I shape Git history.” Plan two passes. Don’t worry if Task 4 needs a re-read.
⚠️ Safe zone only
Interactive rebase rewrites history (Step 3: new parents → new SHAs).
Run it only on commits that (a) are unpushed, or (b) live on a feature
branch only you use. For public history, use git revert (next).
🤔 Predict first
After rebase -i collapses four messy commits into one clean commit,
do the original four still exist anywhere — and could you recover one
of them with git reflog?
💡 Schema check. Same pattern as Steps 5 & 8: every rewriting verb here (squash, drop, reword, edit) creates new commit objects and moves the branch pointer. The “old” commits don’t disappear; they’re just unreferenced. Reflog finds them.
The four verbs you’ll use here
| Verb | Effect |
|---|---|
| pick | Use commit as-is (default) |
| squash | Meld into previous; combine messages |
| drop | Remove commit |
| reword | Edit message only |
📋 All six core verbs (`fixup`, `edit`)
| Verb | Effect |
|---|---|
| pick | Use commit as-is (default) |
| reword | Edit message only |
| edit | Pause so you can commit --amend or add fixes / split |
| squash | Meld into previous; combine messages |
| fixup | Like squash, drop this commit’s message |
| drop | Remove commit |
Two more verbs exist for advanced workflows: break (pause mid-rebase
so you can poke around, then git rebase --continue) and exec <cmd>
(run a shell command after each replayed commit, e.g. exec pytest).
See git help rebase if you need them.
🛠 Why this VM uses scripted `sed` instead of `$EDITOR`
Real workflow: git rebase -i HEAD~N opens your $EDITOR, you hand-edit
action words, save-and-close. This browser VM can’t host an interactive
editor, so we script it via GIT_SEQUENCE_EDITOR="sed -i …".
The skill is knowing what to change, not typing the sed. For each
task: (1) predict the edit on paper, (2) run the scripted version,
(3) verify the log matches your prediction.
Task 1: Inspect the messy branch
cd /tutorial/myproject
git log --oneline -5 # 4 ugly commits on refactor-power
Task 2: Squash four commits into one
Predict: which lines get squash, and why must line 1 stay pick?
GIT_SEQUENCE_EDITOR="sed -i '2,4s/^pick/squash/'" git rebase -i HEAD~4
git commit --amend -m "Refactor: cleanup notes in calculator.py"
git log --oneline -3
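For the "predict the edit on paper" step, it may help to see what the `sed` expression `2,4s/^pick/squash/` does to the todo list. The sketch below mimics that edit in Python. The short SHAs and commit subjects are hypothetical, and `rewrite_todo` is an illustration, not part of Git.

```python
import re


def rewrite_todo(todo: str, first: int, last: int, old: str, new: str) -> str:
    """Replace a leading `old` verb with `new` on lines first..last (1-based)."""
    out = []
    for number, line in enumerate(todo.splitlines(), start=1):
        if first <= number <= last:
            line = re.sub(rf"^{re.escape(old)}", new, line, count=1)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical todo list as `git rebase -i HEAD~4` would present it (oldest first).
todo = (
    "pick 1111111 WIP: start notes\n"
    "pick 2222222 fix typo\n"
    "pick 3333333 more notes\n"
    "pick 4444444 oops\n"
)

print(rewrite_todo(todo, 2, 4, "pick", "squash"), end="")
```

Line 1 keeps `pick` because `squash` melds a commit into the one before it in the list, and the first entry has nothing before it.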
Task 3: Drop a secret-leaking commit
Append to calculator.py: SECRET_API_KEY=oops. Commit:
git commit -am "Accidentally add secret (should be dropped)".
Then append def placeholder(): pass and commit:
git commit -am "Add placeholder function".
Drop the secret:
GIT_SEQUENCE_EDITOR="sed -i '1s/^pick/drop/'" git rebase -i HEAD~2
grep SECRET_API_KEY calculator.py || echo "secret is gone from branch"
Task 3b: Prove reflog rescues the “dropped” commit
Dropped ≠ deleted (Step 3 again).
git reflog -n 10
SECRET_SHA=$(git reflog | grep -m1 'Accidentally add secret' | awk '{print $1}')
git branch secret-backup $SECRET_SHA
git log secret-backup --oneline
⚠️ For *real* secrets: drop+rescue is the wrong workflow
Drop + rescue leaves more copies of the secret, not fewer. For an actual leaked credential:
- Rotate the credential immediately (the only step that truly mitigates).
- Scrub with git filter-repo or BFG.
- Ask collaborators to re-clone.
Use drop only for non-sensitive cleanup (debug prints, experiments).
Task 4: Reword a message
GIT_SEQUENCE_EDITOR="sed -i '1s/^pick/reword/'" \
GIT_EDITOR="sed -i '1s/.*/Refactor: cleanup notes and placeholder/'" \
git rebase -i HEAD~2
git log --oneline -3
Two env vars = two editors (todo list + message editor). In real life you’d hand-edit both.
Wrap-up: rule of thumb
- Local, unpushed history → rebase -i (any verb).
- Shared, pushed history → git revert only (next step).
Rewriting public history forces every collaborator to reconcile.
Interactive Rebase — Knowledge Check
Min. score: 80%
1. Which interactive-rebase action keeps the commit but lets you change only its message?
reword keeps the commit content identical but opens the editor to change the message. pick is no-op, squash melds into the previous commit, drop deletes it.
2. What is the difference between squash and fixup?
Both meld into the previous commit. squash opens the editor so you can combine messages; fixup just drops the squashed commit’s message. Use fixup for trivial typos, squash when both messages are meaningful.
3. You just ran git rebase -i HEAD~3 and realized you dropped a commit you needed. Can you recover it?
Dropped commits remain in .git/objects until garbage collection prunes them. git reflog is the bookmark that lets you find them. git reset --hard <reflog-sha> restores the exact pre-rebase state. This is the safety net. Always verify it works once before a high-stakes rebase.
4. [Revisit Step 3] After an interactive rebase, the rewritten commits have new SHAs even if their patches are identical. Why?
Same answer as for simple rebase: different parent → different SHA. The object model does not allow ‘editing’ a commit — all that changes is which commit the branch pointer references.
5. Which of these is the most dangerous use of interactive rebase?
Rewriting shared history is the nuclear option. Everyone who fetched the old commits now has a conflicting local copy of main; their pulls fail spectacularly. For public history, use git revert (which creates a new anti-matter commit) instead. Reserve interactive rebase for local cleanup.
6. You want to split one giant commit into three smaller ones during interactive rebase. Which action lets you do that?
edit pauses rebase at that commit with HEAD there. You then git reset HEAD~ (un-commit, leaving the changes unstaged in the working tree), split the changes into multiple git add + git commit cycles, and finally git rebase --continue. The original one commit is replaced by your new sequence.
7. [Revisit Step 2] In Task 3b you ‘rescued’ a dropped commit. In terms of the object database, what did git branch secret-backup <sha> actually do?
Same mechanic as Step 2’s rescued-work branch. The dropped commit was never deleted — only unreferenced. Creating a branch (one 41-byte file) re-anchors it as reachable. Now git gc won’t prune it. This is the same reflog + branch recipe, applied to a different scenario (rebase-drop vs detached-HEAD-orphan).
8. [Revisit Step 5] You are about to interactive-rebase a branch. You have uncommitted edits you want to keep but not carry through the rebase. Safest workflow?
Rebase refuses to start with a dirty working tree, so Git is already stopping you. Stash is the clean pattern: preserve the work-in-progress (Step 4), do the rebase, pop the stash onto the rebased branch. This composes tools across steps; recognizing when two tools work together is the mark of Git fluency.
9. Put in order the steps to squash 4 messy commits on a local branch into one clean commit (assuming you’re using the scripted VM editor). (arrange in order)
git log --oneline -5 # inspect what we're rewriting
GIT_SEQUENCE_EDITOR="sed -i '2,4s/^pick/squash/'" git rebase -i HEAD~4
# Git replays commit 1 as pick, squashes commits 2-4 into it
git commit --amend -m "Refactor: cleanup notes in calculator.py"
git log --oneline -3 # verify: 4 commits became 1
Distractors:
git push --force # DANGER: never on shared history
git rebase -i HEAD~4 --squash-all # not a real flag
git reset --hard HEAD~4 # destroys commits without combining
git merge --squash HEAD~4 # wrong tool for intra-branch squash
The canonical pre-PR cleanup. Distractor 1 is the cardinal rule broken. Distractor 2 is invented. Distractor 3 discards instead of squashing (no single commit preserves the combined patch). Distractor 4 (merge --squash) is for branch-to-branch collapse, not for cleaning up commits on the current branch.
Squash Merge: Collapse a Feature Into a Single Commit
🎯 You will learn to
- Pick squash vs. rebase vs. merge based on how main’s log should read.
- Anticipate the trade-off: clean main, lost intra-feature bisect precision.
- Recover individual feature commits if a regression needs fine-grained blame.
git merge --squash <branch> collapses a multi-commit feature into one
new commit on main. The feature branch is untouched.
🤔 Predict first
After git merge --squash feature followed by git commit, how many
parents does the new commit on main have — one, two, or three? And
what does that imply for git bisect later?
📋 Three merge strategies side by side (Steps 8 + 10 unified)
| Method | main’s graph | Use when |
|---|---|---|
| git merge feature | Merge commit, 2 parents (diamond) | Long-lived branch; preserve merge context |
| rebase + merge (ff) | Linear, each commit preserved | Short feature; keep individual commits |
| git merge --squash | One new commit, branch untouched | Want main to read as one commit per feature |
Task 1: Inspect the feature
cd /tutorial/myproject
git log feature-stats --oneline -5 # three focused commits
Task 2: Squash-merge
git switch main
git merge --squash feature-stats
git status # staged changes, but NO commit yet — squash stops here
git commit -m "Add descriptive statistics module (mean, variance, stddev)"
Task 3: Confirm + clean up
git log --oneline main # one new commit for the feature
git branch -D feature-stats # -D because not ff-merged in Git's view
⚠️ The cost: bisect granularity
bisect on main can only narrow to the whole feature commit, not one
of its three internal commits. Keeping the feature branch around (or its
reflog) preserves fine-grained recovery — the strongest argument against
deleting merged feature branches the same day they merge.
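The bisect cost comes down to reachability, which a toy commit graph can illustrate. This is a sketch with hypothetical commit names. It models only parent links, not Git's real objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    name: str
    parents: tuple = ()


def reachable(tip):
    """Names of all commits reachable from `tip` by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit.name not in seen:
            seen.add(commit.name)
            stack.extend(commit.parents)
    return seen


# Hypothetical history: main and a 3-commit feature diverge from `base`.
base = Commit("base")
main_tip = Commit("main-work", (base,))
f1 = Commit("stats-1", (base,))
f2 = Commit("stats-2", (f1,))
f3 = Commit("stats-3", (f2,))

merge = Commit("merge", (main_tip, f3))  # git merge: two parents
squash = Commit("squash", (main_tip,))   # git merge --squash + commit: one parent

print(len(merge.parents), "stats-2" in reachable(merge))    # 2 True
print(len(squash.parents), "stats-2" in reachable(squash))  # 1 False
```

Bisect can only visit commits reachable from the range it was given, so from the squash commit it has no path to `stats-1` through `stats-3`. Only a ref that still points at the feature tip keeps them searchable.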
Squash Merge — Knowledge Check
Min. score: 80%
1. What does git merge --squash feature do?
Squash stages a combined patch but does not commit — you supply the message. The result is one new commit on main containing all of the feature’s changes; the feature’s individual commits never appear on main.
2. After git merge --squash feature; git commit, what is true of the feature branch?
Squash merge does not touch the feature branch. It is still there with its full history. To delete it after squashing, use git branch -D feature (force, because it is not ff-merged by Git’s definition).
3. [Compare with Step 8] You have a 3-commit feature. You merge it three ways. Which output is correct?
Plain merge = 1 merge commit (2 parents). Rebase linearizes, so the merge fast-forwards main over the 3 replayed commits. Squash collapses the 3 into 1 new commit. Team preference decides which is right for the project.
4. Why might a team reject squash merge as a default policy?
With squash, git bisect can only narrow to ‘this whole feature’, not to which intermediate commit caused the regression. Intermediate authors also disappear from main’s history. Some teams prefer rebase/merge for richer history.
5. You squash-merged feature-stats into main. The next day you discover one of the three internal commits had a bug. How do you fix only that part?
Squash hides internal granularity on main. But the original commits still exist where the feature branch was (or via reflog). You can cherry-pick a fix or write a small revert patch on main. This is the classic squash trade-off — convenience on main, less surgical control later.
6. [Revisit Step 7] A regression is reported on main three months after a feature was squash-merged in. git bisect on main narrows the culprit to the squash commit. What is your next move?
Squash flattens main’s history, not the feature branch’s. The fine-grained commits are still preserved on the feature branch (assuming you didn’t delete it) and in reflog. Bisect on the feature branch pinpoints the exact internal commit. This is the strongest argument for keeping merged feature branches for a while, not deleting them immediately.
7. [Revisit Step 3] After git merge --squash feature; git commit, the new squash commit on main is a Git commit object like any other. What are its parents?
A squash commit has exactly one parent: the prior HEAD of the branch you ran merge --squash on. The feature branch tip is not referenced as a parent — which is why git log main shows a clean linear history and why git bisect on main cannot drill into the feature. Same object-model: the commit records exactly the parents it was given, nothing more.
8. Put in order the commands to squash-merge a 3-commit feature-stats branch into main, then clean up.
(arrange in order)
git switch main # target branch
git merge --squash feature-stats # stages combined diff; NO commit yet
git status # verify: staged changes, no commit
git commit -m "Add statistics module (mean, variance, stddev)"
git branch -D feature-stats # force-delete (not ff-merged in Git's view)
Distractors:
git merge feature-stats # creates a merge commit (wrong strategy)
git branch -d feature-stats # refuses: the branch's commits aren't on main
git cherry-pick feature-stats # only copies the tip commit, not combined
git push --force # unneeded and dangerous
--squash stages but does NOT commit — the extra git commit step is intentional so you write a fresh, whole-feature message. Use capital -D to delete: Git’s fast-forward definition says the feature branch is not merged (only a new combined commit landed on main), so lowercase -d refuses. Distractor 3 (cherry-pick) would only copy the tip commit’s patch, not the cumulative diff of the whole branch.
Revert: Safely Undo a Pushed Commit
🎯 You will learn to
- Reach for revert, not reset --hard, whenever a bad commit is already on a shared branch.
- Read the anti-matter pattern in the graph: the original stays; a new commit negates it.
- Decide between revert (public safety) and rebase-drop (private cleanup) by asking one question: has this been pushed?
Scenario
You pushed Refactor: rename divide → div to main. Ten teammates
already pulled. Then CI discovers every import of divide now breaks.
🤔 Predict first
You have two options on the table:
- A. git reset --hard HEAD~1 + git push --force
- B. git revert HEAD + git push
Which one breaks every teammate’s clone? Why? (Step 3’s schema is the key — what changes existing SHAs?)
The answer
reset --hard + push --force would fix your clone but break every
teammate’s: their local main still contains the commit you just removed,
so it has diverged from the rewritten remote. Not acceptable.
git revert <sha> is the additive, public-safe undo. It computes
the inverse patch of the target commit and commits that as a new
commit. No existing SHAs change; no force-push; no collaborator pain.
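The additive nature of revert can be sketched with a toy model. Here a "patch" is a single line replacement and history is a plain list. Real Git patches and commits are richer, so treat `apply_patch`, `inverse`, `checkout`, and `revert` as hypothetical helpers that only illustrate the inverse-patch idea.

```python
def apply_patch(lines, patch):
    """Apply a (old_line, new_line) replacement to a list of lines."""
    old, new = patch
    return [new if line == old else line for line in lines]


def inverse(patch):
    """The inverse patch simply swaps the two sides."""
    old, new = patch
    return (new, old)


def checkout(base, history):
    """Replay every commit's patch on top of the base content."""
    lines = list(base)
    for _message, patch in history:
        lines = apply_patch(lines, patch)
    return lines


def revert(history, index):
    """Append a new commit carrying the inverse patch; never edit old entries."""
    message, patch = history[index]
    return history + [(f'Revert "{message}"', inverse(patch))]


base = ["def divide(a, b):", "    return a / b"]
history = [("Refactor: rename divide -> div", ("def divide(a, b):", "def div(a, b):"))]

reverted = revert(history, 0)
print(len(reverted), checkout(base, reverted) == base)  # 2 True
```

The bad commit is still the first entry of the history. The content is restored purely by the second entry, which is why collaborators who already have the first can simply fast-forward.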
Task 1: See the bad commit
Setup planted a “pushed” refactor that broke callers.
cd /tutorial/myproject
git log --oneline -5
grep -c 'def divide\|def div' calculator.py
Task 2: Revert it
git revert HEAD --no-edit
git log --oneline -5
Two commits visible: the bad one and its revert. git log is now
a truthful record of what happened.
Task 3: Prove the reachable commit count
Predict: did revert delete anything? (Answer: no — history grew by 1.)
git rev-list --count HEAD
git cat-file -p HEAD # examine the revert commit object
git cat-file -p HEAD^ # the original bad commit, still reachable
The single rule
If anyone else has it, revert. If only you have it, rebase is fair game.
📋 Revert vs. reset vs. rebase-drop, side by side
| Goal | Pushed? | Tool |
|---|---|---|
| Remove a bad commit from shared history | Yes | git revert <sha> (additive) |
| Clean up a local WIP branch before PR | No | rebase -i with drop |
| Nuke local branch to a prior state | No | reset --hard <sha> |
💡 Reverting a *merge* commit (`-m 1`)
Merge commits have two parents; revert needs to know which side is the
“mainline” (the side you want to keep). git revert -m 1 <merge-sha>
keeps the first-parent side and undoes the merged-in branch. Get the
number wrong and you revert the wrong side.
git revert — Knowledge Check
Min. score: 80%
1. Why is git revert safe on shared branches where git reset --hard + push --force is not?
The rule compresses to one property: does this operation change existing SHAs? Revert — no. Reset/rebase/amend — yes. Changed SHAs break anyone who already fetched the old ones. Revert is the only undo that preserves shared-history safety.
2. What does git revert <sha> physically add to history?
Revert computes the inverse diff of <sha> and lands it as a regular commit on the current branch. The new commit’s parent is whatever HEAD was when you ran the command — when you revert HEAD, that happens to be <sha>, but when you revert an older commit it isn’t. git log shows both the bad commit and its undo, which is the honest story of what happened.
3. You accidentally pushed a bad commit to main. Three teammates have pulled. Best move?
Shared history was already distributed. Revert appends an undo; teammates’ next pull fast-forwards cleanly. Force-pushing after reset or rebase makes teammates’ branches diverge and their pulls fail — exactly what we avoid.
4. [Revisit Step 9] Rebase-drop and revert both “undo” a commit. Which is correct about their effect on SHAs?
The destructive/additive distinction is the heart of this step. Rebase-drop replays every commit after the dropped one on a new parent — new SHAs cascading. Revert just appends one new commit. Same apparent outcome (the bad change is gone); completely different impact on collaborators.
5. You want to revert a merge commit (one with two parents). What additional flag do you need?
Merge commits have two parents; revert needs to know which side is “mainline” (the version you want to keep). -m 1 means “first parent is mainline; undo the second-parent branch.” Getting this wrong reverts the wrong side.
6. Put in order the safe public-undo workflow after discovering a bad commit on shared main. Distractors rewrite history.
(arrange in order)
git log --oneline -5 # find the bad SHA
git revert <sha> --no-edit # create the anti-matter commit
git log --oneline -3 # verify: both bad + revert are there
git push # succeeds, no history rewrite
Distractors:
git reset --hard <sha>^ # rewrites main, breaks teammates
git push --force-with-lease # overwrites remote
git rebase -i <sha>^ drop # rewrites downstream commits
rm -rf .git && git clone <url> # "burning down the repo" antipattern
Revert-and-push is the only sequence that leaves every existing SHA untouched. Each distractor rewrites history in some way — which is exactly the failure mode revert exists to avoid. Run the safe one often enough that it becomes reflex.
Git Submodules: Add & Clone
🎯 You will learn to
- Add a submodule to an existing repo with one command.
- Clone a submodule-using repo correctly (--recursive), or recover after forgetting.
- Recognize the gitlink (mode 160000) + .gitmodules as the two structural differences from a regular file.
- Pick submodules vs. package manager vs. monorepo based on the actual problem.
🤔 Predict first
When you git submodule add a 200-MB repo, how much storage does the
outer repo’s tracked tree gain — a few hundred megabytes, or a few
hundred bytes?
📖 Three core terms (open before reading further)
| Term | What it is |
|---|---|
| Submodule | A nested Git repo inside an outer Git repo |
| .gitmodules | Plain-text config file in the outer repo listing each submodule’s path + URL |
| Gitlink | A tree entry with mode 160000 whose “content” is a 40-char commit SHA (instead of file bytes) |
Two more terms (Pinned SHA, --recursive) are introduced inline as
they come up; the full glossary is at the bottom of this step.
Mental model: library subscription
A submodule is a subscription to a specific edition of a library:
- No photocopy — no file duplication.
- You record the book title + edition number (.gitmodules URL + pinned SHA).
- Anyone with your note fetches the same edition.
- Upgrade by changing the edition number.
Edition number = commit SHA. Book = the submodule’s Git repo hosted elsewhere.
On-disk layout
@startuml
main-repo/
  .git/
    modules/
      math-utils/    ← submodule's actual git data (objects, refs, HEAD…)
  .gitmodules        ← where Git should fetch each submodule
  src/
  vendor/
    math-utils/      ← nested Git repo (the working tree)
      .git           ← gitfile: "gitdir: ../../.git/modules/math-utils"
      utils.py
@enduml
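The gitfile in the layout is just a relative pointer. As a rough illustration, the Python sketch below resolves that pointer the way the layout implies; the function name resolve_gitfile is hypothetical, and the sketch only handles the single "gitdir: <path>" line shown above.

```python
import posixpath

def resolve_gitfile(submodule_dir, gitfile_text):
    """Resolve a gitfile's 'gitdir: <relative path>' against the submodule directory."""
    prefix = "gitdir: "
    if not gitfile_text.startswith(prefix):
        raise ValueError("not a gitfile")
    relative = gitfile_text[len(prefix):].strip()
    # Join with the directory that holds the gitfile, then collapse the '..' parts.
    return posixpath.normpath(posixpath.join(submodule_dir, relative))

resolved = resolve_gitfile(
    "main-repo/vendor/math-utils",
    "gitdir: ../../.git/modules/math-utils\n",
)
print(resolved)  # main-repo/.git/modules/math-utils
```

The two ".." segments climb out of vendor/math-utils, which is why the submodule's objects and refs end up under the outer repo's .git/modules/ directory.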
Task 1: Inspect the “upstream” library
Pre-built: /tutorial/math-utils-src/ (working repo, double+triple)
and /tutorial/math-utils.git (bare clone acting as the remote URL).
cat /tutorial/math-utils-src/utils.py
Task 2: Add the submodule
cd /tutorial/myproject
git switch main
git submodule add /tutorial/math-utils.git vendor/math-utils
git status # TWO new entries
Open .gitmodules in the editor. Predict before scrolling the answers:
- How many lines per submodule?
- Is the pinned SHA stored here?
- What breaks if the file is deleted?
Answers
- 3 lines (header + path + url). Tiny by design.
- URL yes, SHA no. The SHA is the gitlink in the tree (see below). Two independent facts: where to fetch vs. which commit to check out.
- Teammates can’t clone the submodule. .gitmodules is the subscription directory; without it, clone --recursive has no URL.
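For readers who want to check those answers programmatically, here is a small Python sketch. It assumes the file content that Task 2 would typically produce, and it uses the standard-library configparser as a convenient stand-in; Git's own config syntax is richer than what configparser understands, so treat this as an approximation.

```python
import configparser

# Assumed contents of .gitmodules after Task 2 (header + path + url).
GITMODULES = (
    '[submodule "vendor/math-utils"]\n'
    '\tpath = vendor/math-utils\n'
    '\turl = /tutorial/math-utils.git\n'
)

def read_submodules(text):
    """Map each submodule name to the keys recorded for it."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    result = {}
    for section in parser.sections():
        # Section headers look like: submodule "vendor/math-utils"
        name = section.split('"')[1]
        result[name] = dict(parser[section])
    return result

submodules = read_submodules(GITMODULES)
for name, keys in submodules.items():
    print(name, sorted(keys))  # only path and url, no SHA
```

The parsed entry has exactly two keys, path and url. The pinned SHA is absent because it lives in the gitlink tree entry, not in this file.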
Inspect the gitlink:
git ls-files -s vendor/math-utils # mode 160000 = submodule
git commit -m "Add math-utils submodule at v0.1.0"
Task 3: Clone with --recursive
cd /tutorial
git clone --recursive myproject colleague-clone
ls colleague-clone/vendor/math-utils
Without --recursive, the folder exists empty until the teammate
runs git submodule update --init --recursive.
💡 When submodules are the *right* tool
Yes: versioned code you own shared across several repos.
No: third-party deps (use a package manager — npm, pip, cargo), or single config files (use config management).
📋 Submodule glossary (full)
| Term | What it is |
|---|---|
| Submodule | A nested Git repo inside an outer Git repo |
| .gitmodules | Plain-text config file in the outer repo listing each submodule’s path + URL |
| Gitlink | A tree entry with mode 160000 whose “content” is a 40-char commit SHA (instead of file bytes) |
| Pinned SHA | The exact commit of the submodule the outer repo wants checked out at the gitlink path |
| --recursive | Clone flag that fetches submodules at clone-time (otherwise the folder is empty) |
Git Submodules — Knowledge Check
Min. score: 80%
1. What does a Git submodule actually store in the outer repository?
The outer repo stores ONE SHA per submodule (the pinned commit) plus a .gitmodules entry for the URL. The submodule’s working files are checked out in the submodule path; its git data (objects, refs, HEAD) lives in the outer repo’s .git/modules/<name>/ — the submodule directory itself contains only a .git text file (a “gitfile”) pointing there, NOT a full .git/ directory.
2. A teammate clones your repo normally with git clone <url>. What do they see at the submodule path?
Plain git clone records the submodule entries but does not fetch their content. The folder exists but is empty. git clone --recursive <url> or git submodule update --init --recursive after the fact populates it.
3. Which statements about submodules are true? (Select all that apply)
The outer repo stores only a pinned SHA (gitlink, mode 160000) and a .gitmodules entry — not file copies. The submodule is a genuine nested repo.
4. [Synthesis — revisits Steps 1, 3] Why is it internally consistent that a submodule is ‘just a pinned commit SHA’?
Back to the object model (Step 3). A commit SHA uniquely identifies a whole-project snapshot (commit → tree → blobs). Pinning a commit SHA is enough to reconstruct the submodule’s entire content. No file duplication is necessary — exactly the same property that makes branches cheap (Step 1).
5. [Revisit Step 8] A submodule’s pinned SHA is 40 characters, just like a regular commit SHA. In terms of Git objects, what kind of object does it point to?
A gitlink pins a commit SHA — which (via the commit’s tree and blobs) uniquely determines the submodule’s entire file state. The commit lives in the submodule’s .git/objects/, not the outer repo’s. This is exactly the same commit-SHA-as-snapshot-identity property rebase relies on (Step 8) and that makes the whole object model coherent.
6. Put in order the commands a teammate runs to add a submodule, commit it, and set up a colleague’s workstation so the submodule files appear. Distractors are verb-variants that look right but fail. (arrange in order)
cd /tutorial/myproject
git submodule add /tutorial/math-utils.git vendor/math-utils
git commit -m "Add math-utils submodule at v0.1.0"
cd /tutorial && git clone --recursive myproject colleague-clone
Distractors:
git submodule init /tutorial/math-utils.git vendor/math-utils # init alone does not fetch
git clone myproject colleague-clone # submodule folder empty
git submodule fetch /tutorial/math-utils.git # not a valid subcommand
git merge /tutorial/math-utils.git # confuses a nested repo with a branch
git submodule add combines clone + config in one step; init without update is half the story. Plain git clone creates an empty submodule folder. git submodule fetch is invented. git merge on a URL fails because a URL is not a branch or commit in the current repo. Verb selection is what separates a working submodule workflow from a broken one.
Updating Submodules: Upstream Bumps & Resync
🎯 You will learn to
- Upgrade a submodule to new upstream work via the two-step dance (fetch/checkout inside, add/commit outside).
- Diagnose and fix the “teammate forgot submodule update” trap, building muscle memory for post-pull.
- Force-resync any drifted submodule back to the pinned SHA with one deterministic command.
🤔 Predict first
Upstream publishes new commits. After you git pull the outer repo,
will your local submodule’s working directory show the new content
automatically — or do you have to do something extra?
Task 1: Upstream publishes v0.2
/tutorial/publish-math-utils-v0.2.sh
git --git-dir=/tutorial/math-utils.git log --oneline --all
cd /tutorial/myproject
git status # nothing changed here — push doesn't propagate
Task 2: Fetch + checkout inside the submodule
A submodule is a nested repo. Use normal git inside it:
cd /tutorial/myproject/vendor/math-utils
git fetch
git checkout origin/HEAD
cd /tutorial/myproject
git status # vendor/math-utils (new commits)
git diff vendor/math-utils
The outer diff is exactly one line — -Subproject commit <old> /
+Subproject commit <new>. Line-level diffs live in the submodule’s
own object database.
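If you ever need to read a bump out of a diff in a script, the one-line shape makes it easy. The Python sketch below is an illustration only: the two SHAs are obvious placeholders rather than real commits, and parse_submodule_bump is a hypothetical helper name.

```python
OLD_SHA = "1" * 40  # placeholder, not a real commit
NEW_SHA = "2" * 40  # placeholder, not a real commit

SAMPLE_DIFF = (
    "diff --git a/vendor/math-utils b/vendor/math-utils\n"
    "--- a/vendor/math-utils\n"
    "+++ b/vendor/math-utils\n"
    "@@ -1 +1 @@\n"
    f"-Subproject commit {OLD_SHA}\n"
    f"+Subproject commit {NEW_SHA}\n"
)

def parse_submodule_bump(diff_text):
    """Return (old_sha, new_sha) from the 'Subproject commit' lines of an outer diff."""
    old = new = None
    for line in diff_text.splitlines():
        if line.startswith("-Subproject commit "):
            old = line.split()[-1]
        elif line.startswith("+Subproject commit "):
            new = line.split()[-1]
    return old, new

old_sha, new_sha = parse_submodule_bump(SAMPLE_DIFF)
print(old_sha != new_sha)  # True: the bump is the whole change
```

Nothing about utils.py appears in the outer diff. The pair of SHAs is the entire record of the upgrade.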
Task 3: Bump the pinned SHA in the outer repo
git add vendor/math-utils
git commit -m "Bump math-utils to v0.2.0 (adds quadruple)"
Task 4: The teammate trap
cd /tutorial/colleague-clone
git pull
cat vendor/math-utils/utils.py # still v0.1 on disk!
pull updated the pinned SHA in the tree, but did not touch
their submodule working directory. Code that imports quadruple
now fails. Fix:
git submodule update --init --recursive
cat vendor/math-utils/utils.py # now has quadruple
💡 Make this a habit (one-time config)
After every pull that might touch submodule paths, run
git submodule update --init --recursive. Or, one-time setup:
git config --global submodule.recurse true
Now pull and checkout do the right thing automatically.
Task 5: Force-resync a drifted submodule
Simulate drift:
cd /tutorial/colleague-clone/vendor/math-utils
git checkout HEAD~1
cd /tutorial/colleague-clone
git status # modified: vendor/math-utils (new commits)
git submodule update --init --recursive
git status # clean — pinned SHA restored
Same command works for never-initialized, partially-fetched, or drifted submodules.
Updating Submodules — Knowledge Check
Min. score: 80%
1. You bumped a submodule to v0.2 and pushed. A teammate pulls your change and reports tests failing because quadruple does not exist. Most likely cause?
Classic trap. git pull on the outer repo updates the pinned SHA in the tree but does NOT touch the submodule working directory. They need git submodule update --init --recursive to actually reflect the new SHA on disk. Configure git config submodule.recurse true to make pull do this automatically.
2. Upgrading a submodule requires how many git commit calls in total (inside + outside)?
The answer depends on whether you are authoring the upgrade (write code inside submodule → commit inside → push → commit outside) or just pulling in upstream work (checkout new commit inside → commit outside). In either case the outer commit is mandatory — that is the SHA bump.
3. git status in the outer repo shows modified: vendor/math-utils (new commits). What does it mean?
The outer repo compares the pinned SHA with the submodule’s actual HEAD. Mismatch → new commits. Fix: git add <path> + commit to pin the new SHA, or git submodule update to snap the submodule back to the pinned SHA.
4. [Revisit Step 1] Why doesn’t git pull automatically update submodule working directories — what Git principle is respected by this design?
Git keeps the outer/inner repo boundary strict: an outer pull updates the pinned SHA (a fact about the outer tree) but does not reach into the inner repo and rewrite its HEAD. You must explicitly say git submodule update. It is the same conservative philosophy by which git checkout refuses to overwrite uncommitted changes: Git does not silently replace working-tree content you did not ask it to touch.
5. [Revisit Step 3] The outer repo’s diff for a submodule change is always just one line: -Subproject commit <old> / +Subproject commit <new>. Why is that enough?
Step 3’s object-model insight applied again. A commit SHA resolves to a deterministic snapshot. Pinning a new SHA is, by construction, equivalent to changing the entire content — no further diff data is needed in the outer commit. Minimum information, maximum fidelity.
6. [Evaluate] A teammate says: ‘After every git pull I always run git submodule update --init --recursive, even on repos without submodules. Paranoia, or sensible?’
The command is safe on any repo. Running it unconditionally is a cheap habit that prevents the most common submodule bug (stale working dir). Equivalent hardening: git config --global submodule.recurse true to make pull/checkout do it automatically.
7. Upstream publishes a v0.2 commit. Put in order the commands that land it as a pinned version bump in your outer repo. Distractors are verb-variants that look right but fail or do the wrong thing. (arrange in order)
cd /tutorial/myproject/vendor/math-utils
git fetch # pulls new commits into submodule
git checkout origin/HEAD # move submodule HEAD to new SHA
cd /tutorial/myproject # back to outer repo
git add vendor/math-utils # stage the gitlink SHA change
git commit -m "Bump math-utils to v0.2.0"
Distractors:
git pull # ambiguous in detached-HEAD submodule
git submodule update # resets TO pinned SHA, opposite of what we want
git commit -am "Bump" # would sweep in unrelated WD changes
git merge origin/HEAD # creates a merge commit inside submodule
git submodule update (distractor 2) is exactly the wrong verb here — it resets the submodule back to whatever the outer tree pins, erasing the new checkout. That’s the single most common submodule confusion, and getting the direction right is the heart of this step. git pull in detached HEAD is unreliable. -am would include unrelated changes. merge creates a commit structure we don’t want inside the submodule.
Submodule Internals: What 'Content Changed' Means
🎯 You will learn to
- Read modified content vs. new commits straight from git status and pick the right fix.
- Execute the six-step publish ceremony without falling into the detached-HEAD trap.
- Resync any weird submodule state deterministically with one command.
- Reason from first principles — outer repo tracks one SHA; inner repo is a full Git repo; they’re independent.
🤔 Predict first
You edit vendor/math-utils/utils.py directly without cd-ing into
the submodule. What does the outer repo’s git status say about
vendor/math-utils — modified content, new commits, both, or
nothing?
The mental model
The outer repo stores exactly one thing per submodule (besides
.gitmodules): the pinned commit SHA. On every git status, Git compares:
SHA the outer tree pins      vs   SHA at the submodule's current HEAD
(gitlink, mode 160000)            (what's actually checked out)
| Condition | Message |
|---|---|
| SHAs match | clean |
| Submodule committed new SHA | new commits |
| Submodule working tree dirty | modified content |
| Both | both messages |
Nothing else can cause a “modified” submodule, apart from untracked files inside it, which Git reports separately as untracked content.
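The table above can be read as a small decision function. The Python sketch below is a simplified model of that comparison, not Git's actual code; outer_status and its parameters are hypothetical names, and it covers only the rows in the table.

```python
def outer_status(pinned_sha, head_sha, worktree_dirty):
    """Return the note the outer repo would attach to a submodule path."""
    notes = []
    if head_sha != pinned_sha:
        notes.append("new commits")       # submodule HEAD moved away from the pin
    if worktree_dirty:
        notes.append("modified content")  # uncommitted edits inside the submodule
    return ", ".join(notes) if notes else "clean"

PIN = "a" * 40    # placeholder SHA for the pinned commit
OTHER = "b" * 40  # placeholder SHA for a different commit

print(outer_status(PIN, PIN, False))    # clean
print(outer_status(PIN, OTHER, False))  # new commits
print(outer_status(PIN, PIN, True))     # modified content
print(outer_status(PIN, OTHER, True))   # new commits, modified content
```

Two independent inputs, four outcomes: the SHA comparison and the dirtiness check never influence each other, which is why both messages can appear at once.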
Task 1: Clean starting state
cd /tutorial/myproject
git submodule status
Prefix: ` ` clean, + HEAD ≠ pinned, - not initialized.
Task 2: Dirty the submodule working tree
Open vendor/math-utils/utils.py. Append:
def halve(x):
return x / 2
Save. Back in outer:
cd /tutorial/myproject
git status # modified content
git diff vendor/math-utils # no real line diff — just a summary
cd vendor/math-utils && git diff # the real diff lives here
Task 3: Commit inside the submodule — then try to push
# inside vendor/math-utils
git add utils.py
git commit -m "Add halve helper"
git push # FAILS — predict the error
Likely: fatal: You are not currently on a branch (detached HEAD from
submodule update) or no upstream branch. This is the top submodule
footgun — Step 1’s detached-HEAD concept, encountered here.
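The attached/detached distinction is visible in the HEAD file itself, which for a submodule sits with the rest of its git data under the outer repo's .git/modules/ directory. The Python sketch below classifies the two shapes that file can take; head_state is a hypothetical helper, and the SHA is a placeholder.

```python
def head_state(head_text):
    """Classify the contents of a HEAD file as attached or detached."""
    text = head_text.strip()
    if text.startswith("ref: "):
        ref = text[len("ref: "):]
        branch_prefix = "refs/heads/"
        # Attached: HEAD names a branch, and commits advance that branch.
        if ref.startswith(branch_prefix):
            return ("attached", ref[len(branch_prefix):])
        return ("attached", ref)
    # Detached: HEAD holds a raw commit SHA, so no branch follows new commits.
    return ("detached", text)

print(head_state("2" * 40 + "\n"))                   # detached at a raw SHA
print(head_state("ref: refs/heads/update-halve\n"))  # attached to update-halve
```

A raw SHA in HEAD is what git submodule update leaves behind, and it is the state in which git push has no branch to publish.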
Fix:
git switch -c update-halve 2>/dev/null || git switch update-halve
git log --oneline -2
# git push -u origin update-halve # real push would succeed now
Back in outer:
cd /tutorial/myproject
git status # now: new commits (not modified content)
Task 4: Bump the pinned SHA
git add vendor/math-utils
git commit -m "Bump math-utils: add halve helper"
git log -1 -p vendor/math-utils # shows ONE line: -Subproject commit ... / +Subproject commit ...
💡 The six commands are six invariants — derive them yourself
The ceremony looks arbitrary; each step preserves one invariant:
| # | Command | Invariant preserved |
|---|---|---|
| 1 | cd sub; git switch -c <branch> | HEAD is branch-attached (not detached) |
| 2 | git commit inside sub | Your change is a commit object |
| 3 | git push inside sub | New SHA exists on the sub’s remote |
| 4 | cd ../..; git add <path> | Outer tree stages the new pinned SHA |
| 5 | git commit outer | Outer records a commit pinning the new SHA |
| 6 | git push outer | New pin is visible to teammates |
Know the invariants and the commands derive themselves — no memorization needed.
Task 5: Force-resync (the universal fix)
git submodule update --init --recursive
# add --force if local submodule changes should be discarded
🧭 Fixes 95% of “my submodule is weird” moments
git submodule update --init --recursive
Safe on any repo. Set git config --global submodule.recurse true
to make pull/checkout do it automatically.
Submodule Internals — Knowledge Check
Min. score: 80%
1. You see modified: vendor/math-utils (modified content) in the outer git status. What caused it?
modified content specifically means: the submodule working tree is dirty, that is, tracked files inside have uncommitted changes (untracked files are reported separately as untracked content). The HEAD may still match the pinned SHA. Running git status inside the submodule will show the dirty files.
2. You see modified: vendor/math-utils (new commits) in the outer git status. What caused it?
new commits means: inside the submodule, HEAD advanced (someone committed, or checked out a different SHA). The outer repo still records the OLD pinned SHA, so it flags the divergence. Fix: git add <path> + commit to bump the pinned SHA, or git submodule update to reset the submodule back to the pinned SHA.
3. You run git diff vendor/math-utils in the outer repo after making and committing a change in the submodule. What do you see?
The outer repo’s diff for a submodule path is always the gitlink SHA change — one line. To see content-level diffs, cd into the submodule and run plain git diff there. Two repos, two diff domains.
4. Which commands reset a submodule’s working directory and HEAD to exactly the SHA the outer repo pins?
git submodule update --init --recursive is the deterministic reset. It clones missing submodules and checks out each one at the outer tree’s pinned SHA. By default, git reset --hard in the outer repo does NOT affect submodule working directories, because Git treats them as separate repos; it recurses only with --recurse-submodules or when submodule.recurse is set.
5. [Revisit Step 3] Why is it consistent that the outer repo records ONLY a pinned SHA for each submodule — not the submodule’s files?
Same object-model insight as Step 3. A commit SHA points at a tree that points at blobs — one SHA resolves to a deterministic snapshot. Storing the SHA is equivalent to storing the files. No duplication is needed.
6. [Evaluate] You edited vendor/math-utils/utils.py and saved. Your teammate pulls your branch and sees a clean git status. Why didn’t your edit get to them?
An edit to a submodule file affects only your working tree until you perform the two-step commit: (1) commit inside the submodule and push its new commit to the submodule’s remote, (2) git add <path> + commit in the outer repo to bump the pinned SHA. Skip either step and the change never reaches teammates.
7. [Revisit Step 1] You edit a file inside a submodule, run git add && git commit inside the submodule, then git push. Git errors with something like fatal: You are not currently on a branch. What Step-1 concept explains this?
After git submodule update, submodules are in detached HEAD at the pinned SHA (because that’s what the outer tree specified — no branch context). Any commit you make there is anchored to nothing. Fix: git switch -c <branch> inside the submodule before committing. Same detached-HEAD pattern as Step 1, encountered in a submodule setting.
8. [Revisit Step 8] You ran git rebase main inside a submodule and rewrote three of its commits. The outer repo’s git status says modified: vendor/math-utils (new commits). Is anything wrong with this?
A submodule is a real Git repo — rebase works there exactly as in Step 8/9. The complication is that the outer repo may still pin the pre-rebase SHAs; if those weren’t pushed, teammates checking out old outer-repo commits will fail to fetch them (fatal: reference is not a tree). Same cardinal rule: rebase only unpushed/local history.
9. The full “publish a submodule change” ceremony. Put the six required commands in order. Distractors are verb-variants that break one or more of the ceremony’s causal invariants. (arrange in order)
cd vendor/math-utils # (1) enter the submodule
git switch -c update-halve # (2) attach HEAD to a branch (NOT detached)
git commit -am "Add halve helper" # (3) create the commit inside submodule
git push -u origin update-halve # (4) publish the submodule SHA to its remote
cd ../.. # (5) return to outer repo
git add vendor/math-utils && git commit -m "Bump math-utils: add halve" # (6) pin new SHA and push outer
Distractors:
git commit -am "..." # BEFORE switching to a branch: detached HEAD, un-pushable
git submodule update --init --recursive # resets inner HEAD, losing your new commit
git push --force origin update-halve # unneeded; no history to overwrite
git rebase origin/main # rewrites SHAs you just created; defeats (4)
Each ceremony step preserves one invariant — branch-attached HEAD, commit-exists-in-submodule, SHA-on-remote, SHA-pinned-in-outer, outer-pushed. Each distractor breaks one. Committing first orphans the commit; submodule update resets it; --force is a shared-history violation; rebase rewrites the commits you just tried to publish. Knowing the invariants is the schema that makes the recipe stick.
Capstone: On-Call Debugging Under Pressure
🎯 You will demonstrate you can
- Compose 5+ advanced Git tools into one realistic end-to-end workflow — without step-by-step instruction.
- Pick squash/rebase/merge based on the history shape you want, not memorized rules.
- Trust the reflog safety net after chaining several destructive operations.
- Read state first, act second — the professional habit that defeats blind-testing.
🩺 30-second readiness check — answer before starting
Without scrolling, answer from memory. If any feels shaky, revisit the listed step before attempting the capstone. Component-skill research (Lovett 2001, Ambrose et al. 2010): 45 min on a weak skill beats hours on the integrated task.
- Where do orphaned commits live, and how do you anchor one as a branch? Shaky? → revisit Step 2 (reflog).
- What’s the physical difference between git rebase and git revert in terms of which existing SHAs change? Shaky? → revisit Step 11 (revert), or really, Step 3.
- Why does git stash not include feature.py if you never git add-ed it? Shaky? → revisit Step 4 (stash gotchas).
- What’s the verb to finish a paused cherry-pick after resolving conflicts? A paused rebase? Shaky? → revisit Step 5 or Step 8.
- After git bisect run, what’s the non-negotiable final command, and why? Shaky? → revisit Step 7 (bisect).
All five clear? Proceed. Two or more shaky? Spend 15 minutes on the weak step first. The capstone is an integration exercise — fragile components compound into frustration.
Scenario — no hand-holding
You’re on-call. Page: absolute(-4) == 4 fails on main. CI red.
Teammate left a dirty tree with an unrelated note. Nobody knows which
of ~6 recent commits broke things.
Your checklist:
- Shelve the unrelated in-progress note (tree must be clean for bisect).
- Find the bad commit via binary search.
- Read its message and diff before touching code (author intent).
- Fix on a dedicated branch. Messy WIP commits expected.
- Clean up so main sees one focused commit.
- Merge to main.
- Restore the shelved note.
- Verify reflog could still recover everything you rewrote.
Nothing new — every command came earlier. The point is choice and composition under pressure.
Style. Loop: read state → decide → act → re-read state.
git status, git log --oneline --graph --all, git reflog are your dashboard. Lost? Re-read state, don’t guess.
The state you walk into
cd /tutorial/myproject
git status
git log --oneline --graph --all -12
python3 test_calculator.py
Hints — open only if stuck for a minute
Task 1 (shelve WIP)
Step 4. One command, noun form. Bisect needs a clean tree.
Task 2 (find the culprit)
Step 7, automated. Test exits 0 = good, non-zero = bad. Always end with reset.
Task 3 (read intent)
Step 6’s chain: git blame + git show <sha>.
Task 4 (messy fix branch)
Branch off main, iterate, make any number of WIP commits, get tests green.
Task 5 (squash into one)
Step 9 rebase -i + squash, or Step 10 merge --squash. Either is fine.
Task 6 (merge)
Whatever strategy leaves main with one clean fix commit on top.
Task 7 (restore note)
Step 4. Inverse of Task 1. Leave uncommitted.
Task 8 (reflog verify)
Step 2. Read-only check: git reflog still sees your pre-squash commits.
Success criteria
- python3 test_calculator.py prints all tests pass.
- main ends with exactly one new fix commit.
- calculator.py still has your uncommitted # TODO: add clamp helper note.
- git reflog retains your intermediate messy commits.
The “burning down the repo” callback
From Step 1’s antipattern: panic = delete the folder, re-clone, force-push. You did the opposite:
| Situation | What you did | What novices do |
|---|---|---|
| Dirty tree | stash | delete folder |
| Unknown-culprit regression | bisect | read 30 diffs |
| Author intent | blame + show | guess |
| Messy intermediates | rebase / squash | rewrite from scratch |
| “Lost” commits | reflog | panicked rm -rf |
Same competence gap you’ll see on every team for the rest of your career.
🏔️ Stretch (optional, not auto-tested)
Re-run with one extra wrinkle: the shelved note conflicts with the
bug-fix line on stash pop. Resolve the conflict, pick keep-both
or keep-fix, verify tests + reflog. This is the capstone’s capstone.
🗺️ The unifying schema — one picture
Every command from the basic tutorial and these 14 advanced steps falls into exactly one of three categories. Only category 3 is dangerous to push. Internalize this picture and you can predict the safety of any unfamiliar Git command at a glance.
@startuml
layout vertical
box "1. ALWAYS SAFE - reads state or moves refs without changing history\nNo new SHAs, no force-push needed\n- git blame, git log, git show, git diff, git status\n- git branch (create), git switch, git checkout (read mode)" as Safe
box "2. SAFE TO PUSH - appends new SHAs without changing existing ones\nAdditive only - teammates fast-forward cleanly\n- git commit\n- git cherry-pick\n- git revert (the anti-matter commit)\n- git merge (with or without merge commit)\n- git merge --squash + git commit\n- git stash (local by design, never pushed)" as Additive
box "3. DANGEROUS TO PUSH - rewrites or abandons existing SHAs\nLocal/unpushed branches only - needs --force on shared\n- git rebase\n- git rebase -i (squash, drop, fixup, edit, reword)\n- git commit --amend\n- git reset --hard / --mixed / --soft" as Rewriting
@enduml
The single decision rule: before pushing, ask “did I rewrite or abandon
any existing SHAs?” If yes, the command lives in category 3 and your
teammates’ clones will diverge. Reach for category 2 (revert, merge,
cherry-pick) when undoing pushed work.
🌱 What to do this week (post-tutorial spaced retrieval)
Without spaced retrieval, ~50% of what you learned today is gone in a week. Twenty minutes total over the next month locks it in:
| When | What |
|---|---|
| Tomorrow (10 min) | Recreate the capstone from a blank slate — same scenario, same tools, no scrolling back. If you stumble, re-do that step (not the whole capstone). |
| In 1 week (5 min) | Pick any 3 commands from this tutorial. From memory: state name, scenario, and the Step 3 schema (creates objects? moves pointers? both?). |
| In 1 month (5 min) | The next time you face a real “lost commit” or “messy branch” at work, reach for git reflog first and rm -rf .git never. That moment is the highest-value retrieval practice you’ll do. |
The Cepeda meta-analysis (254 studies, 14,000+ participants) shows spaced practice produces ~2× better retention than equal-duration massed practice — and the gap widens with delay. This 20 minutes is your highest-ROI study time.
Cumulative Final Quiz — Choose the Right Tool
Min. score: 80%
1. Match each scenario to the single best tool. Which option correctly pairs all four?
Cherry-pick is surgical (one commit), merge is bulk (many commits), interactive rebase is for history cleanup, squash-merge collapses a branch into one commit. Steps 5, 8, 9, 10 each framed this table; this question just asks you to recognize when to use which. The others mis-apply cherry-pick (wrong for 50 commits) or merge (doesn’t clean WIP).
2. [Interleaves Steps 3, 5, 8, 9, 10] Which of these operations create commits with new SHAs even when the patch is identical to an earlier commit? Select all that apply.
A commit’s SHA hashes its tree + parent(s) + author + committer + message. Change any of those and the SHA changes. Cherry-pick, rebase, interactive-rebase, and squash-merge all create new commit objects with different parents or combined trees. Fast-forward merge and git branch do NOT create commits — they only move pointers (Step 1’s whole point). This is the deep schema: commits are immutable; “moving” a commit is always “copy + move pointer to copy.”
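You can watch the parent field change a commit's ID without running Git at all. The Python sketch below hashes objects in the form Git uses for SHA-1 repositories (type, size, a NUL byte, then the body). The identity line and the two parent SHAs are made-up placeholders; the empty-tree and empty-blob IDs are the well-known constants.

```python
import hashlib

def git_object_id(kind, body):
    """SHA-1 over '<kind> <size>', a NUL byte, then the body bytes."""
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree, parent, message):
    """Build a minimal commit object body with one parent."""
    ident = "A Dev <dev@example.com> 1700000000 +0000"  # hypothetical identity
    return (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {ident}\n"
        f"committer {ident}\n"
        f"\n{message}\n"
    ).encode()

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

# Same tree, same author, same message; only the parent differs (placeholders).
original = git_object_id("commit", commit_body(EMPTY_TREE, "1" * 40, "Fix absolute"))
replayed = git_object_id("commit", commit_body(EMPTY_TREE, "2" * 40, "Fix absolute"))
empty_blob = git_object_id("blob", b"")

print(original != replayed)  # True: a new parent means a new commit ID
print(empty_blob)            # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

This is the mechanical reason a replayed commit can never keep its old SHA: the parent is part of the hashed bytes.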
3. [Interleaves Steps 2, 9, 1] You performed three destructive-feeling operations in sequence: git reset --hard HEAD~3, then git rebase -i dropping a commit, then entering detached HEAD and making a throwaway commit. Which single tool can recover commits lost in all three cases?
Reflog is the universal safety net because it records HEAD’s position history, not the cause. Whether HEAD moved via reset, rebase drop, or leaving detached HEAD, the SHA it was at is recorded. Branch that SHA back into reachability and the “lost” work is found. This is the Step 2 lesson cashed in on a composite workflow — and the reason the tutorial framed destructive commands as “less scary than they sound.”
4. [Interleaves basic tutorial Step 11 + advanced Step 8] You hit a conflict during git rebase main. You edit the file, remove all <<<<<<< / ======= / >>>>>>> markers, and run git add. Which command finishes this one commit’s resolution?
A rebase conflict is identical in mechanics to a merge conflict — same markers, same git add to mark resolved — but the final verb differs because rebase is replaying, not merging. Reflex-typing git commit here is the single most common mistake; it leaves the rebase half-done. git rebase --abort at any point restores the pre-rebase state.
5. [Interleaves Steps 6, 7] A bug appeared because someone removed a line of validation that used to prevent it. Which investigation tool finds the commit that introduced the bug?
Blame attributes existing lines only — a missing line is invisible to it. Bisect operates on behavioral outcomes (did the test pass or fail?) regardless of whether the change was an addition, modification, or deletion. This is why the capstone you just finished started with bisect, not blame — the bug could just as easily have been a deletion, and starting with bisect generalizes.
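To see the difference concretely, the sketch below builds a five-commit history in which the third commit deletes a validation line, then lets `git bisect run` find it. Everything specific here is an assumption for illustration: the file `app.txt`, the "validate input" marker, and the use of `grep` as a stand-in for a real test suite. `git init -b main` assumes Git 2.28 or newer.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # assumes Git >= 2.28

printf 'validate input\nprocess input\n' > app.txt
git add app.txt; git commit -q -m "c1: add app with validation"
good=$(git rev-parse HEAD)

echo "docs" > README; git add README; git commit -q -m "c2: docs"

printf 'process input\n' > app.txt       # the validation line is deleted here
git commit -q -am "c3: simplify app"
culprit=$(git rev-parse HEAD)

echo "more docs" >> README; git commit -q -am "c4: more docs"
echo "even more" >> README; git commit -q -am "c5: even more docs"

# Bisect between the known-good c1 and the bad tip.
git bisect start HEAD "$good" >/dev/null
# The "test": exit 0 (good) if the validation line exists, 1 (bad) if not.
git bisect run grep -q 'validate input' app.txt >/dev/null
found=$(git rev-parse refs/bisect/bad)   # first bad commit, per bisect
git bisect reset >/dev/null 2>&1

echo "first bad commit: $(git log -1 --format=%s "$found")"

# Blame can only attribute the surviving line, which dates from c1.
git blame -s app.txt
```

Bisect lands on c3 even though that commit added nothing, while blame never mentions it.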
6. [Interleaves Steps 3, 11, 13] A submodule is stored in the outer repo as a gitlink entry (mode 160000) containing a 40-character SHA. That SHA references which kind of Git object?
The gitlink pins a commit SHA (in the submodule’s repo). That commit deterministically resolves to a tree, which resolves to blobs — so one SHA is equivalent to a full content snapshot. Same object-model reasoning as Step 3 — snapshot-identity is carried by the commit SHA, which is why “one 40-char pin” is enough information to reconstruct the entire submodule’s state.
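The gitlink entry can be inspected directly. The sketch below takes a plumbing shortcut that is an assumption of this demo, not the tutorial's workflow: instead of `git submodule add`, it writes the mode-160000 index entry by hand with `git update-index --cacheinfo`, which avoids cloning. The repository names `inner` and `outer` and the path `vendor/inner` are hypothetical.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/inner"
git init -q "$work/outer"

# The would-be submodule: one commit whose SHA we will pin.
cd "$work/inner"
echo lib > lib.txt; git add lib.txt; git commit -q -m "inner: first commit"
pin=$(git rev-parse HEAD)

# In the outer repo, record a gitlink (mode 160000) pointing at that SHA.
# This hand-written index entry stands in for what 'git submodule add' records.
cd "$work/outer"
git update-index --add --cacheinfo 160000,"$pin",vendor/inner
git commit -q -m "outer: pin vendor/inner"

# The outer tree stores only mode, type, SHA, and path.
entry=$(git ls-tree HEAD vendor/inner)
echo "$entry"

# Resolved inside the inner repo, the pinned SHA is a commit object.
kind=$(git -C "$work/inner" cat-file -t "$pin")
echo "object type in inner repo: $kind"
```

Note that the outer repository never stores the inner commit itself; it stores only the 40-character pin, which is why the submodule must be fetched separately.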
7. [Interleaves Steps 8, 9, 10] Which of these operations are forbidden on a branch that has been pushed and is shared with teammates? Select all that apply.
The cardinal rule — anything that rewrites published commits is forbidden on shared branches, because teammates holding the old SHAs will diverge. That rules out rebase (any flavor), amend, and force-push. Revert and merge are additive (they only append new commits without changing existing history), so they are safe. Same rule, different commands. Memorize the property (rewrite = dangerous), not the per-command list.
8. [Interleaves Steps 2, 8] You rebased feature locally, then git push was rejected because teammate Alice had pushed to feature in the meantime. What is the safe recovery sequence?
(arrange in order)
   1. git reflog  # find the pre-rebase SHA of `feature`
   2. git reset --hard <pre-rebase-SHA>  # undo your rebase locally
   3. git pull  # merge Alice's changes into your (unrebased) branch
   4. git push  # succeeds: no history was rewritten

Distractors (not part of the sequence):
   - git push --force
   - git rebase --abort
   - git revert HEAD
   - rm -rf .git && git clone
When rebasing a shared branch goes wrong, the fix is always the same: undo your rewrite first, then integrate normally. Reflog finds the pre-rebase SHA; reset --hard restores it; pull merges Alice’s work; push succeeds. (On recent Git versions with no pull strategy configured, a plain git pull on diverged branches stops and asks how to reconcile them; git pull --no-rebase requests the merge explicitly.) The distractors represent the antipatterns Step 1 named: push --force overwrites Alice’s work; rebase --abort does not apply after the rebase is complete; revert is for undoing a single commit, not a rebase; rm -rf .git && git clone is the “burning down the repo” antipattern.
9. [Interleaves Steps 4, 7] You are mid-edit on a feature when a teammate asks you to bisect a regression on main. Your working tree has uncommitted changes you want to keep. Two tools from this tutorial compose to solve this cleanly — which pair?
Bisect moves HEAD across arbitrary historical commits — a dirty working tree either blocks it or carries your edits into commits they don’t belong in. Stash is designed exactly for this — private, local, temporary. Committing WIP pollutes history if pushed; git restore . destroys your work. Recognize the compose-two-tools pattern — most real Git tasks chain more than one command.
10. [Interleaves Steps 7, 10] Three months ago your team squash-merged the feature-stats branch into main. A regression has surfaced that bisect on main narrows down to the squash commit. The squash commit changed 800 lines. What is your next move?
Squash-merge collapses main’s history, not the feature branch’s. The feature branch’s commits still exist (and its reflog too) if it wasn’t deleted. Bisect there pinpoints the exact internal commit. This is the strongest pragmatic argument for keeping merged feature branches around for a while, not deleting them the day they merge. Step 10’s quiz framed this; this question checks that you can reach for the recovery without being reminded.
11. [The unifying insight] After working through the whole advanced tutorial, which statement best captures what every command you learned actually does?
This is the load-bearing invariant from Step 3 cashed in across all 15 steps. Branch creation moves a ref. Commit creates an object + moves a ref + leaves the index matching the new commit. Rebase creates a series of new objects + moves a ref. Cherry-pick creates one new object + moves a ref. Squash-merge creates one new object + moves a ref. Even the “destructive” commands (reset, rebase drop) never edit existing objects: the old commits remain in .git/objects and the reflog keeps their addresses until those entries expire and garbage collection prunes them. If you internalize immutability of existing commits, nothing in Git is mysterious.
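Two of these claims are easy to verify by counting objects. The sketch below is a rough check in a scratch repository (file and branch names are made up; `git init -b main` assumes Git 2.28 or newer): creating a branch adds no objects, and a commit dropped by `reset --hard` can still be read from the object database.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # assumes Git >= 2.28
echo one > f.txt; git add f.txt; git commit -q -m "one"
echo two > f.txt; git commit -q -am "two"

# Branch creation: count loose objects before and after.
before=$(git count-objects | cut -d' ' -f1)
git branch topic HEAD~1                  # hypothetical branch at the first commit
after=$(git count-objects | cut -d' ' -f1)
echo "loose objects before/after 'git branch': $before / $after"

# "Destructive" reset: no branch points at the old tip any more...
old=$(git rev-parse HEAD)
git reset -q --hard HEAD~1

# ...yet the commit object is still in .git/objects and readable.
kind=$(git cat-file -t "$old")
echo "object $old is still a: $kind"
```

The object survives only as long as the reflog entry and garbage-collection grace periods allow, so treat this as a safety net rather than long-term storage.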
12. [Evaluate — meta] A junior teammate says: “Destructive Git commands like rebase and reset are too dangerous to use; I’ll stick with merge and revert only.” Evaluate this position.
Conditional knowledge is the mark of an expert (Ambrose et al. 2010). The cardinal rule is not “rebase = bad”; it is “rebase rewrites history, which is dangerous on shared branches and safe on local ones.” The junior’s heuristic is safer but less effective; teaching them when each tool is appropriate is the goal. The same tool, on the same commits, is either routine or catastrophic depending on one thing: has this history been pushed and pulled by others? This is the single most important distinction the advanced tutorial taught.