Version Control with Git
Want to practice? Try the Interactive Git Tutorial — hands-on exercises in a real Linux system right in the browser!
In modern software construction, version control is not just a convenience—it is a foundational practice, solving several major challenges associated with managing code. Git is by far the most common tool for version control. Let’s dive into both!
Basics
What is Version Control?
Version control (also known as source control or revision control) is the software engineering practice of controlling, organizing, and tracking different versions in the history of computer files. While it works best with text-based source code, it can theoretically track any file type.
We call a tool that supports version control a Version Control System (VCS). The most common version control systems are:
- Git (most common for open source systems, also used by Microsoft, Apple, and most other companies)
- Mercurial (used by Meta, formerly Facebook (Goode and Rain 2014), Jane Street, and some others)
- Piper (internal tool used by Google (Potvin and Levenberg 2016))
- Subversion (used by some older projects)
Why is it Essential?
Manual version control—saving files with names like Homework_final_v2_really_final.txt—is cumbersome and error-prone. Automated systems like Git solve several critical problems:
- Collaboration: Multiple developers can work concurrently on the same project without overwriting each other’s changes.
- Change Tracking: Developers can see exactly what has changed since they last worked on a file.
- Traceability: It provides a summary of every modification: who made it, when it happened, and why.
- Reversion/Rollback: If a bug is introduced, you can easily revert to a known stable version.
- Parallel Development: Branching allows for the isolated development of new features or bug fixes without affecting the main codebase.
Centralized vs. Distributed Version Control
There are two primary models of version control systems:
| Feature | Centralized (e.g., Subversion, Piper) | Distributed (e.g., Git, Mercurial) |
|---|---|---|
| Data Storage | Data is stored in a single central repository. | Each developer has a full copy of the entire repository history. |
| Offline Work | Committing requires a connection to the central server. | Developers can work and commit changes locally while offline. |
| Best For | Organizations that want a single authoritative server with strict centralized control. | Large teams, open-source projects, and distributed workflows. |
The Git Architecture: The Three States
To understand Git, you must understand where your files live at any given time. Git operates across three main “states” or areas:
- Working Directory (or Working Tree): This is where you currently edit your files. It contains the files as they exist on your disk.
- Staging Area (or Index): This is a middle ground where you “stage” changes you want to include in your next snapshot.
- Local Repository: This is where Git stores the compressed snapshots (commits) of your project’s history.
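As a rough illustration of the three areas, the sketch below follows one file through them and uses `git status --short` to show where it currently lives. The temporary directory, the file name `notes.txt`, and the identity values are placeholders chosen for this example.

```shell
# Throwaway repository; names and identity below are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "draft" > notes.txt
git status --short      # "?? notes.txt": exists only in the working directory

git add notes.txt
git status --short      # "A  notes.txt": a copy now sits in the staging area

echo "more" >> notes.txt
git status --short      # "AM notes.txt": staged, plus a newer unstaged edit

git commit -q -m "Add notes"   # the staged snapshot enters the local repository
git status --short      # " M notes.txt": the later edit is still only on disk
```

Note that the commit records what was staged, not what is on disk: the second edit stays in the working directory until it is staged too.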
Fundamental Git Workflow
A typical Git workflow follows these steps:
- Initialize: Turn a directory into a Git repository using `git init`.
- Stage: Add file contents to the staging area with `git add <filename>`.
- Commit: Record a snapshot of the staged changes with `git commit -m "message"`.
- Check Status: Use `git status` to see which files are modified, staged, or untracked.
- Review History: Use `git log` to see the sequence of past commits.
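The steps above can be sketched end to end as follows. This is a minimal example run in a temporary directory; the file `app.py`, the commit message, and the identity values are assumptions made for illustration.

```shell
# Work in a throwaway directory so no real project is touched.
repo=$(mktemp -d) && cd "$repo"

git init -q                                # Initialize: creates the .git directory
git config user.name "Example Dev"         # placeholder identity, this repo only
git config user.email "dev@example.com"

echo "print('hello')" > app.py             # create a file in the working directory
git status --short                         # "?? app.py": untracked

git add app.py                             # Stage
git commit -q -m "Add hello script"        # Commit
git status --short                         # prints nothing: nothing left to commit
git log --oneline                          # Review History: one line per commit
```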
Inspecting Differences
git diff is used to compare different versions of your code:
- `git diff`: Compares the working directory to the staging area.
- `git diff --staged` (also `--cached`): Compares the staging area to the latest commit. Useful for reviewing exactly what you are about to commit.
- `git diff HEAD`: Compares the working directory to the latest commit.
- `git diff HEAD^ HEAD`: Compares the parent commit to the latest commit (shows what the latest commit changed).
- `git diff main..feature`: Compares the tip of `main` to the tip of `feature` (equivalent to `git diff main feature`). To see only the changes made on `feature` since it diverged from `main`, which is usually what you want when reviewing a branch before merging, use three dots: `git diff main...feature`.
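To make the first three comparisons concrete, the sketch below puts a different version of one file in each area and then runs each form of `git diff`. The file `f.txt` and its contents are made up for this example.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

printf 'one\n' > f.txt
git add f.txt && git commit -q -m "v1"     # latest commit:    one

printf 'one\ntwo\n' > f.txt
git add f.txt                              # staging area:     one, two

printf 'one\ntwo\nthree\n' > f.txt         # working directory: one, two, three

git diff             # working directory vs. staging area: adds "three"
git diff --staged    # staging area vs. latest commit:     adds "two"
git diff HEAD        # working directory vs. latest commit: adds "two" and "three"
```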
Branching and Merging
A branch in Git is like a pointer to a commit (implemented as a lightweight, 41-byte text file stored in .git/refs/heads/ that contains the SHA checksum of the commit it currently points to). Creating or destroying a branch is nearly instantaneous — Git writes or deletes a tiny reference, not a copy of your project. The HEAD pointer (stored in .git/HEAD) normally holds a symbolic reference to the current branch, such as ref: refs/heads/main.
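The pointer files can be inspected directly. The sketch below assumes Git's default file-based ref storage and a freshly created branch (refs that have been packed live in `.git/packed-refs` instead); the branch name `feature` is a placeholder.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git branch -M main                         # make sure the branch is named main

git branch feature                         # creating a branch writes one small file
cat .git/refs/heads/feature                # the commit hash the branch points to
git rev-parse feature                      # the same hash, as resolved by Git
cat .git/HEAD                              # "ref: refs/heads/main"
```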
Integrating Changes
When you want to bring changes from a feature branch back into the main codebase, Git typically uses one of two automatic merge strategies:
- Fast-Forward Merge: When the target branch (`main`) has received no new commits since the feature branch was created, Git simply advances the `main` pointer to the tip of the feature branch. No merge commit is created, and the history stays linear. Use `git merge --no-ff` to force Git to create a merge commit even when a fast-forward is possible; this preserves a record that a feature branch existed.
- Three-Way Merge: When the branches have diverged (each has commits the other doesn’t), Git compares both tips against their common ancestor and creates a new merge commit with two parents. The commit graph forms a diamond shape where the two diverging paths converge.
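Both outcomes can be reproduced in a scratch repository. In the sketch below, the branch names `feature-a` and `feature-b` and the file names are placeholders; the first merge fast-forwards, the second has to create a merge commit.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git branch -M main

# Fast-forward: main has not moved since feature-a was created.
git switch -q -c feature-a
git commit -q --allow-empty -m "Work on feature-a"
git switch -q main
git merge -q feature-a                     # main's pointer simply advances

# Three-way: both branches receive commits, so they diverge.
git switch -q -c feature-b
echo b > b.txt && git add b.txt && git commit -q -m "Work on feature-b"
git switch -q main
echo m > m.txt && git add m.txt && git commit -q -m "Work on main"
git merge -q --no-edit feature-b           # creates a merge commit with two parents

git log --graph --oneline                  # the divergence shows up as a diamond
```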
Alternative Integration Workflows
For more control over your project’s history, you can use these manual techniques:
- Rebasing: Re-applies commits from one branch onto a new base, producing new commit objects with new SHA hashes. Creates a linear history but must never be used on shared branches, as it rewrites history that collaborators may already have.
- Squashing: `git merge --squash` collapses all commits from a feature branch into a single set of staged changes on the target branch, which you then record with one `git commit`, keeping the main history tidy.
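A small sketch of both techniques, using a made-up `feature` branch with two commits: the rebase replays those commits on top of a newer `main`, and the squash merge then lands them on `main` as one commit.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"
echo base > base.txt && git add base.txt && git commit -q -m "Base"
git branch -M main

git switch -q -c feature
echo one > one.txt && git add one.txt && git commit -q -m "Feature step 1"
echo two > two.txt && git add two.txt && git commit -q -m "Feature step 2"

git switch -q main
echo fix > fix.txt && git add fix.txt && git commit -q -m "Hotfix on main"

# Rebase: replay the feature commits on top of the updated main.
old_tip=$(git rev-parse feature)
git switch -q feature
git rebase -q main
git rev-parse feature                      # differs from $old_tip: commits were re-created

# Squash: stage the combined result on main, then record it as one commit.
git switch -q main
git merge -q --squash feature
git commit -q -m "Add feature (squashed)"
git log --oneline                          # Base, Hotfix on main, Add feature (squashed)
```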
Complications
- Merge Conflict: Happens when Git cannot automatically reconcile differences, usually because the same lines of code were changed in both branches. Git marks the conflicting sections directly in the file using conflict markers:

      <<<<<<< HEAD
      your version of the code
      =======
      incoming branch version
      >>>>>>> feature-branch

  To resolve: edit the file to keep the correct content (removing all markers), then `git add` the resolved file and `git commit` to complete the merge. Use `git merge --abort` to cancel a merge in progress and return to the pre-merge state.
- Detached HEAD: Occurs when HEAD points directly to a commit hash rather than a branch reference, for example when using `git switch --detach <commit>` to inspect an older version of the codebase. New commits made in this state are not anchored to any branch and can easily be lost when switching away. To preserve work from a detached HEAD, create a new branch with `git switch -c <name>` before switching elsewhere. Use `git reflog` to recover the hash of any commits made in detached HEAD state.
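The conflict workflow described above can be sketched as follows. The file `settings.txt`, its contents, and the decision to keep the green value are assumptions made for this example.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"
echo "color = blue" > settings.txt
git add settings.txt && git commit -q -m "Base"
git branch -M main

git switch -q -c feature-branch
echo "color = green" > settings.txt && git commit -q -am "Use green"
git switch -q main
echo "color = red" > settings.txt && git commit -q -am "Use red"

# The merge stops because both branches changed the same line.
if ! git merge feature-branch; then
    cat settings.txt                       # shows the <<<<<<<, =======, >>>>>>> markers
fi

# Resolve: write the final content, stage it, and conclude the merge.
echo "color = green" > settings.txt
git add settings.txt
git commit -q --no-edit                    # reuses the prepared merge message
```

Running `git merge --abort` instead of the last three commands would return `main` to the state it had before the merge started.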
Advanced Power Tools
Git includes several advanced commands for debugging and project management:
- `git stash`: Temporarily saves local changes (staged and unstaged) so you can switch branches without committing messy or incomplete work.
- `git cherry-pick`: Selectively applies a specific commit from one branch onto another.
- `git bisect`: Uses a binary search through your commit history to find the exact commit that introduced a bug.
- `git blame`: Annotates each line of a file with the author and commit hash of the last change to that line.
- `git revert`: Safely “undoes” a previous commit by creating a new commit with the inverse changes, preserving the original history.
- `git reflog`: Records every position HEAD has pointed to, even when you switch branches, reset, or make commits in detached HEAD state. This is your safety net for recovering “lost” commits: if a commit is no longer reachable via any branch, `git reflog` will show its hash so you can recover it with `git switch -c <name> <hash>`.
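Three of these tools fit into one short scenario: unfinished work is stashed, an urgent fix is committed on another branch, that single commit is cherry-picked onto `main`, and then undone again with a revert. The branch name `hotfix` and the file names are placeholders for this sketch.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"
echo "v1" > app.txt && git add app.txt && git commit -q -m "Base"
git branch -M main

echo "half-finished idea" >> app.txt       # uncommitted work in progress
git stash -q                               # shelve it; the working tree is clean again

git switch -q -c hotfix
echo "fix" > fix.txt && git add fix.txt && git commit -q -m "Urgent fix"
fix_commit=$(git rev-parse HEAD)

git switch -q main
git cherry-pick "$fix_commit" >/dev/null   # copy just that commit onto main
git revert --no-edit HEAD >/dev/null       # undo it with a new inverse commit
git stash pop -q                           # bring the shelved work back
git log --oneline                          # Base, Urgent fix, Revert "Urgent fix"
```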
Managing Large Projects: Submodules
For very large projects, Git Submodules allow you to keep one Git repository as a subdirectory of another. This is ideal for including external libraries or shared modules while maintaining their independent history. Internally, the parent repository does not store the submodule’s files: it records a special entry (a gitlink) that pins the subdirectory to a specific commit ID in the external repository, and it keeps that repository’s URL in a `.gitmodules` file.
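A minimal local sketch of adding a submodule is shown below. The repositories `lib` and `app` and the path `vendor/lib` are made up for the example. Because the "external" repository here is just a local directory, the sketch passes `-c protocol.file.allow=always`, which recent Git versions require for submodules cloned over the local file transport; with a normal HTTPS or SSH URL that option would not be needed.

```shell
work=$(mktemp -d) && cd "$work"

# A stand-in for the external library repository.
git init -q lib
git -C lib config user.name "Example Dev"      # placeholder identity
git -C lib config user.email "dev@example.com"
echo "util" > lib/util.txt
git -C lib add util.txt
git -C lib commit -q -m "Library v1"

# The parent project that embeds the library.
git init -q app && cd app
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

git -c protocol.file.allow=always submodule add "$work/lib" vendor/lib
git commit -q -m "Add lib as a submodule"

cat .gitmodules                        # path and URL of the submodule
git ls-files --stage vendor/lib        # mode 160000: an entry pinned to one commit
```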
Best Practices for Professional Use
- Write Meaningful Commit Messages: Messages should explain what was changed and why. Avoid vague messages like “bugfix” or “small changes”.
- Commit Small and Often: Aim for small, coherent commits rather than massive, “everything” updates.
- Never Force-Push (`git push -f`) on Shared Branches: Force-pushing overwrites the remote history to match your local copy, discarding from the remote branch any commits your collaborators pushed in the meantime.
- Use `git revert` to Undo Shared History: When a bad commit has already been pushed, use `git revert <hash>` to create a new “anti-commit” that safely inverts the change while preserving the full history. Never move a shared branch backwards with `git reset --hard` followed by a force-push: that rewrites history and leaves every collaborator’s local copy out of sync.
- Use `.gitignore`: Always include a `.gitignore` file to prevent tracking unnecessary or sensitive files. The file uses glob patterns:
  - `*.pyc`: ignore all files with a given extension.
  - `__pycache__/`: ignore an entire directory (trailing slash).
  - `.env`: ignore a specific file (commonly used to protect secrets and API keys).
  - `node_modules/`, `venv/`: ignore dependency folders.
  - `.DS_Store`, `Thumbs.db`: ignore OS-generated clutter files.

  Note: `.gitignore` has no retroactive effect. Files already tracked by Git must be explicitly removed with `git rm --cached <file>` before the ignore pattern applies. Commit the `.gitignore` itself so the whole team benefits.
- Pull Frequently: Regularly pull the latest changes from the main branch to catch merge conflicts early.
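The non-retroactive behaviour of `.gitignore` is easy to trip over, so here is a small sketch of it. The `.env` contents are dummy values; the point is that the ignore rule only takes effect once the file is untracked with `git rm --cached`.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

echo "SECRET=123" > .env
git add .env && git commit -q -m "Accidentally track .env"

printf '%s\n' '*.pyc' '__pycache__/' '.env' > .gitignore
git add .gitignore && git commit -q -m "Add .gitignore"

echo "SECRET=456" > .env
git status --short                         # " M .env": still tracked despite the rule

git rm -q --cached .env                    # stop tracking; the file stays on disk
git commit -q -m "Stop tracking .env"
git status --short                         # prints nothing: .env is now ignored
git check-ignore -v .env                   # shows which .gitignore line matched
```

The earlier commits still contain the old file contents, so a real secret that was committed this way should also be rotated.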
Git Command Manual
Common Git commands can be categorized into several functional groups, ranging from basic setup to advanced debugging and collaboration.
Configuration and Initialization
Before working with Git, you must establish your identity and initialize your project.
- `git config`: Used to set global or repository-specific settings. Common configurations include setting your username, email, and preferred text editor.
- `git init`: Initializes a new, empty Git repository in your current directory, allowing Git to begin tracking files.
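As a brief sketch, the commands below set an identity and editor for a single scratch repository. The name, email, and editor are placeholder values. Adding `--global` to the same `git config` commands would store the settings for every repository of the current user instead; that variant is left out here so the example does not modify your real configuration.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q                                  # create an empty repository

# Inside a repository, git config writes to .git/config by default.
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config core.editor "nano"

git config --get user.name                   # prints: Example Dev
git config --local --list                    # all settings stored in this repository
```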
The Core Workflow (Local Changes)
These commands manage the lifecycle of your changes across the three Git states: the working directory, the staging area (index), and the repository history.
- `git add`: Adds file contents to the staging area to be included in the next commit.
- `git status`: Provides an overview of which files are currently modified, staged for the next commit, or untracked by Git.
- `git commit`: Records a snapshot of all changes currently in the staging area and saves it as a new version in the local repository’s history. Professional practice encourages writing meaningful commit messages to help team members understand the “what” and “why” of changes.
- `git log`: Displays the sequence of past commits. Common flags:
  - `git log -p`: Shows the actual changes (patches) introduced in each commit.
  - `git log --oneline`: Displays each commit as a single compact line (short hash + message).
  - `git log --graph --all`: Renders an ASCII art graph of all branch and merge history.
- `git diff`: Compares different versions of your project:
  - `git diff`: Compares the working directory to the staging area.
  - `git diff --staged` (alias `--cached`): Compares the staging area to the latest commit.
  - `git diff HEAD`: Compares the working directory to the latest commit.
  - `git diff HEAD^ HEAD`: Compares the parent commit to the latest commit (shows what the latest commit changed).
  - `git diff main..feature`: Compares the tips of `main` and `feature`. Use `git diff main...feature` (three dots) to see only what `feature` changed since it diverged from `main`.
- `git restore` (Git 2.23+): The modern command for undoing file changes, replacing the file-restoration uses of the older `git checkout` and `git reset`:
  - `git restore --staged <file>`: Unstages a file, moving it out of the staging area while leaving working directory modifications untouched.
  - `git restore <file>`: Discards the unstaged changes to a file in the working directory, restoring it to the version in the staging area (which matches the last commit if nothing is staged). This is irreversible: the discarded changes are permanently lost.
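The two forms of `git restore` can be sketched as follows, using a made-up `file.txt`: the first command only unstages the edit, the second throws the edit away.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"
echo "stable" > file.txt && git add file.txt && git commit -q -m "Base"

echo "experiment" >> file.txt
git add file.txt                           # the edit is now staged

git restore --staged file.txt              # unstage; the edit stays on disk
git status --short                         # " M file.txt"

git restore file.txt                       # discard the edit (cannot be undone)
git status --short                         # prints nothing
```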
Branching and Merging
Branching allows for parallel development, such as working on a new feature without affecting the main codebase.
- `git branch`: Lists, creates, or deletes branches. A branch is a lightweight pointer (a 41-byte file in `.git/refs/heads/`) to a specific commit.
  - `git branch -d <branch>`: Deletes a branch that has already been merged (safe: Git will refuse if unmerged commits would be lost).
  - `git branch -D <branch>`: Force-deletes a branch regardless of merge status (use with care).
- `git switch` (recommended, Git 2.23+): The modern, dedicated command for navigating branches.
  - `git switch <branch>`: Switches to an existing branch.
  - `git switch -c <new-branch>`: Creates a new branch and immediately switches to it.
  - `git switch --detach <commit>`: Checks out an arbitrary commit in detached HEAD state for safely inspecting older code without affecting any branch.
- `git checkout` (legacy): The older multi-purpose command that handled both branch switching and file restoration. Still widely encountered in documentation and scripts. `git checkout <branch>` is equivalent to `git switch <branch>`; `git checkout -b <name>` is equivalent to `git switch -c <name>`.
- `git merge`: Integrates changes from one branch into another.
  - `git merge --squash`: Combines all commits from a feature branch into a single commit on the target branch to maintain a cleaner history.
  - `git merge --no-ff`: Forces creation of a merge commit even when a fast-forward would be possible, preserving the record that a feature branch existed.
  - `git merge --abort`: Cancels an in-progress merge (including one with conflicts) and restores the branch to its pre-merge state.
- `git rebase`: Re-applies commits from one branch onto a new base. This is often used to create a linear history, though it must never be used on shared branches.
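Two of the safety behaviours mentioned above can be tried in a scratch repository: `git branch -d` refusing to delete unmerged work, and `git switch -c` rescuing a commit made on a detached HEAD. The branch names `experiment` and `rescued` are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git branch -M main

git switch -q -c experiment
git commit -q --allow-empty -m "Unmerged experiment"
git switch -q main

# -d refuses because experiment has a commit that main lacks.
if ! git branch -d experiment 2>/dev/null; then
    echo "refused: experiment is not fully merged"
fi

git merge -q experiment                    # fast-forward main to include the work
git branch -d experiment                   # now the deletion succeeds

# A commit made on a detached HEAD belongs to no branch until you create one.
git switch -q --detach HEAD
git commit -q --allow-empty -m "Commit made on a detached HEAD"
git switch -q -c rescued                   # anchor the commit before moving on
```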
Remote Operations
These commands facilitate collaboration by syncing your local work with a remote server (like GitHub).
- `git clone`: Creates a local copy of an existing remote repository.
- `git remote`: Lists remote connections. `git remote add origin <url>` registers a remote named `origin` (the conventional primary remote name).
- `git pull`: Fetches changes from a remote repository and immediately merges them into your current local branch.
- `git push`: Uploads your local commits to a remote repository. Note: Never use `git push -f` (force-push) on shared branches, as it can overwrite and destroy work pushed by other team members.
  - `git push -u origin <branch>`: Pushes the branch and sets up upstream tracking, so future `git push` and `git pull` calls on this branch no longer need to specify the remote and branch name.
- Bare Repositories: A bare repository (created with `git init --bare`) contains only the Git metadata with no working directory. It stores history, but you cannot edit files in it directly. Remote servers (GitHub, GitLab, self-hosted) use bare repositories as the central point that all developers push to and pull from.
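The remote commands can be tried without any server by letting a local bare repository play the role of the remote. In this sketch the directory names `server.git`, `alice`, and `bob` are invented; a hosted remote would use an HTTPS or SSH URL in place of the local path.

```shell
work=$(mktemp -d) && cd "$work"
git init -q --bare server.git              # stands in for a hosted remote

# First developer: create history and publish it.
git init -q alice && cd alice
git config user.name "Alice Example"       # placeholder identity
git config user.email "alice@example.com"
echo "hello" > README.md && git add README.md && git commit -q -m "Initial commit"
git branch -M main
git remote add origin "$work/server.git"
git push -q -u origin main                 # first push also sets upstream tracking

# Second developer: clone the full history.
cd "$work"
git clone -q -b main server.git bob

# Alice publishes another commit; Bob pulls it.
cd "$work/alice"
echo "more" >> README.md && git commit -q -am "Update README"
git push -q                                # no remote or branch needed thanks to -u
git -C "$work/bob" pull -q
```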
Advanced and Debugging Tools
Git includes powerful utilities for handling complex scenarios and tracking down bugs.
- `git stash` / `git stash pop`: Temporarily saves uncommitted changes (both staged and unstaged) so you can switch contexts without making a messy commit. Use `pop` to re-apply those changes later.
- `git cherry-pick`: Selectively applies a single specific commit from one branch onto another.
- `git bisect`: Uses a binary search through commit history to find the exact commit that introduced a bug.
- `git blame`: Annotates each line of a file with the author and commit ID of the last change to that line.
- `git revert <commit>`: Creates a new “anti-commit” that applies the exact inverse changes of a previous commit, safely undoing it without rewriting history. Prefer this over `git reset` whenever the commit to undo has already been pushed to a shared branch.
- `git reflog`: Shows a chronological log of every position HEAD has pointed to in the local repository. Indispensable for recovering “lost” commits: commits made in detached HEAD state or after an accidental reset can be found here and recovered with `git switch -c <name> <hash>`.
- `git show`: Displays detailed information about a specific Git object, such as a commit.
- `git submodule`: Allows you to include an external Git repository as a subdirectory of your project while maintaining its independent history.
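As an illustration of the binary search, the sketch below builds six commits, deliberately "breaks" the project from the fourth one onward, and lets `git bisect run` find that commit automatically. The files and the `grep` check are stand-ins for a real project and its test command: exit status 0 marks a commit as good, a non-zero status marks it as bad.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

# Six commits; the simulated bug appears in commit 4 and stays.
for i in 1 2 3 4 5 6; do
    if [ "$i" -ge 4 ]; then echo "broken" > status.txt; else echo "ok" > status.txt; fi
    echo "$i" > version.txt
    git add . && git commit -q -m "Commit $i"
done

git bisect start HEAD HEAD~5 >/dev/null    # HEAD is bad, five commits back is good
git bisect run grep -q ok status.txt >/dev/null
first_bad=$(git rev-parse refs/bisect/bad) # bisect's record of the first bad commit
git bisect reset >/dev/null 2>&1           # return to the original branch

git log -1 --format=%s "$first_bad"        # prints: Commit 4
```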
Quiz
Git Commands Flashcards
Which Git command would you use for the following scenarios?
You have some uncommitted, incomplete changes in your working directory, but you need to switch to another branch to urgently fix a bug. How do you temporarily save your current work without making a messy commit?
You know a bug was introduced recently, but you aren’t sure which commit caused it. How do you perform a binary search through your commit history to find the exact commit that broke the code?
You are looking at a file and want to know exactly who last modified a specific line of code, and in which commit they did it.
You want to safely ‘undo’ a previous commit that introduced an error, but you don’t want to rewrite history or force-push. How do you create a new commit with the exact inverse changes?
You want to see exactly what has changed in your working directory compared to your last saved snapshot (the most recent commit).
You have a feature branch with several experimental commits, but you only want to move one specific, completed commit over to your main branch.
You want to integrate a feature branch into main, but instead of bringing over all 15 tiny incremental commits, you want them combined into one clean commit on the main branch.
You are building a massive project and want to include an entirely separate external Git repository as a subdirectory within your project, while keeping its history independent.
You are starting a brand new project in an empty folder on your computer and want Git to start tracking changes in this directory.
You have just installed Git on a new computer and need to set up your username and email address so that your commits are properly attributed to you.
You’ve made changes to three different files, but you only want two of them to be included in your next snapshot. How do you move those specific files to the staging area?
You’ve lost track of what you’ve been doing. You want a quick overview of which files are modified, which are staged, and which are completely untracked by Git.
You have staged all the files for a completed feature and are ready to permanently save this snapshot to your local repository’s history with a descriptive message.
You want to review the chronological history of all past commits on your current branch, including their author, date, and commit message.
You’ve made edits to a file but haven’t staged it yet. You want to see the exact lines of code you added or removed compared to what is currently in the staging area.
You want to start working on a completely new feature in isolation without affecting the main codebase.
You are currently on your feature branch and need to switch your working directory back to the ‘main’ branch.
Your feature branch is complete, and you want to integrate its entire commit history into your current ‘main’ branch.
Instead of creating a merge commit, you want to take the commits from your feature branch and re-apply them directly on top of the latest ‘main’ branch to create a clean, linear history.
You want to start working on an open-source project hosted on GitHub. How do you download a full local copy of that repository to your machine?
Your team members have uploaded new commits to the shared remote repository. You want to fetch those changes and immediately integrate them into your current local branch.
You have finished making several commits locally and want to upload them to the remote GitHub repository so your team can see them.
You have a specific commit hash and want to see detailed information about it, including the commit message, author, and the exact code diff it introduced.
You want to start working on a new feature in isolation. How do you create a new branch called ‘feature-auth’ and immediately switch to it in a single command?
You accidentally staged a file you didn’t intend to include in your next commit. How do you move it back to the working directory without losing your modifications?
You made some experimental changes to a file but want to discard them entirely and revert to the version from your last commit.
You merge a feature branch into main, and Git performs the merge without creating a new merge commit — it simply moves the ‘main’ pointer forward. What type of merge is this, and when does it occur?
You want to safely inspect the codebase at a specific older commit without modifying any branch. How do you do this?
Version Control and Git Quiz
Test your knowledge of core version control concepts, Git architecture, branching strategies, and advanced commands.
Which of the following best describes the core difference between centralized and distributed version control systems (like Git)?
What are the three primary local states that a file can reside in within a standard Git workflow?
What does the command git diff HEAD compare?
Which Git command should you NEVER use on a shared branch because it can permanently overwrite and destroy work pushed by other team members?
You have some uncommitted, incomplete changes in your working directory, but you need to switch to another branch to urgently fix a bug. Which command is best suited to temporarily save your current work without making a messy commit?
What happens when you enter a ‘Detached HEAD’ state in Git?
Which Git command utilizes a binary search through your commit history to help you pinpoint the exact commit that introduced a bug?
What is the primary purpose of Git Submodules?
Which of the following are advantages of a Distributed Version Control System (like Git) compared to a Centralized one? (Select all that apply)
Which of the following represent the core local states (or areas) where files can reside in a standard Git architecture? (Select all that apply)
Which of the following commands are primarily used to review changes, history, or differences in a Git repository? (Select all that apply)
In which of the following scenarios would using git stash be considered an appropriate and helpful practice? (Select all that apply)
Which of the following are valid methods or strategies for integrating changes from a feature branch back into the main codebase? (Select all that apply)
A faulty commit was pushed to a shared ‘main’ branch last week and your teammates have already synced it. Why should you use git revert to fix this rather than git reset --hard followed by a force-push?
When integrating a feature branch into ‘main’, under what condition will Git perform a fast-forward merge rather than creating a three-way merge commit?
What does the file .git/HEAD contain when you are checked out on a branch, compared to when you are in a detached HEAD state?
Arrange the Git commands into the correct order to: create a feature branch, make changes, and integrate them back into main via a merge.
`git switch -c feature && git add app.py && git commit -m 'Add feature' && git switch main && git merge feature`
Arrange the commands to safely stash your work, pull remote changes, and restore your stashed work.
`git stash && git pull && git stash pop`
Arrange the commands to undo a bad commit on a shared branch safely: first identify the commit, then revert it, then push the fix.
`git log --oneline && git revert <hash> && git push`
Arrange the commands to initialize a new repository and record an initial commit.
Correct order:
`git init && git add . && git commit -m 'Initial commit'`
Arrange the commands to register a remote called origin and push the main branch to it for the first time.
Correct order:
`git remote add origin <url> && git push -u origin main`
Arrange the commands to stage a forgotten file and fold it into the last commit without changing the commit message.
Correct order:
`git add forgotten.py && git commit --amend --no-edit`