C for C++ Programmers: Origin Story
Every systems programmer has an origin story. Yours starts here. Shed the safety net of C++, master raw memory, and emerge with powers most programmers never unlock — all through hands-on practice in an interactive Linux terminal.
Origin Story — Shedding the C++ Armor
Chapter 1: Every hero starts by losing something.
Welcome to the C Tutorial! You already know C++ — so instead of starting from zero, we’ll focus on what’s different and what’s missing.
Think of C++ as a suit of high-tech armor: classes, std::string, templates — layers of protection built over decades. C is what’s underneath: raw, exposed, powerful. Learning C means voluntarily removing the armor to understand what it was protecting you from. That’s not a downgrade — it’s an origin story. Every systems programming superhero (Linux kernel devs, embedded engineers, OS hackers) started right here.
Prerequisites — what we assume you know
We assume you’ve written non-trivial C++, meaning you’ve used std::cout, std::string, std::vector, classes with constructors / destructors, references (int&), and new / delete. You should be comfortable reading a for loop, a function signature, and a header #include. Templates, the STL beyond <vector> / <string>, RAII, and exceptions are referenced but not required; we’ll point out what you lose when each one is dropped. No prior C exposure is required; in fact, prior C will make some sections feel slow.
Total time: ~120 min for all 11 chapters at a deliberate pace. Each chapter is gated by working code + a knowledge check, so you can stop and resume between chapters without losing state.
🎯 You will learn to
- Identify the C++ features that simply don’t exist in C (references, namespaces, overloading, templates).
- Apply gcc -Wall -std=c11 to compile a C source file, and explain why g++ would mask the differences.
- Predict whether printf adds an implicit newline before you run the program.
C is not a “simpler C++.” It’s an older, smaller language that C++ grew out of. Many features you rely on in C++ simply don’t exist:
| C++ Feature | C Equivalent |
|---|---|
| cout << x | printf("%d", x) |
| new / delete | malloc() / free() |
| class | struct (no methods, no access control) |
| string | char[] arrays + string functions |
| References (&) | Pointers only |
| bool | #include <stdbool.h> or use int |
| Namespaces | None — everything is global |
| Function overloading | Not supported |
| Templates | Not supported |
Task: Compile and run your first C program
A file hello.c has been created. Look at it in the editor, then compile and run it:
cd c_project
gcc -Wall -std=c11 hello.c -o hello
./hello
Important: We use gcc, not g++. Using g++ would compile as C++ and mask the differences we’re here to learn.
Before you start editing code, study the program first. You’ll learn more by reading code before writing it. Read hello.c carefully and identify all the differences from C++ you can spot.
Notice:
- #include <stdio.h> instead of #include <iostream>
- printf() instead of cout <<
- No using namespace std; (C has no namespaces)
✏️ Predict before you compile
Look at the four printf calls in hello.c. Each ends with \n. Mentally delete the \n from the third line’s printf — so it reads printf("Just you, raw memory, and a compiler."); (no \n).
Now predict: when you compile and run that modified version, what would the output look like? Pick one:
- (a) Identical to the original: printf always adds an implicit newline.
- (b) Lines 3 and 4 collapse onto a single line: output ends with Just you, raw memory, and a compiler.Let's go.
- (c) Line 3 disappears entirely: without \n, printf doesn’t flush.
- (d) Compile error: printf requires every string to end with \n.
Commit to a letter on paper. Then compile the original and read the actual output. (The next exercise won’t ask you to actually delete the \n — this is a thought experiment.)
⚠️ Open after you've committed to an answer
The answer is (b). C’s printf writes exactly the bytes you give it — no implicit newline, no implicit flush rule based on string content. Lines 3 and 4 would collapse: Just you, raw memory, and a compiler.Let's go. This is the C++→C trap to lock in early: in C, every \n is something you explicitly wrote. Coming from cout << x << endl; it’s easy to forget that endl was doing two things — newline and flush — and that printf does neither for you automatically.
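Why does this matter? A missing \n is a common reason for “my program ran but I didn’t see any output.” When stdout is a terminal it is usually line-buffered, so text without a trailing \n can sit in the buffer. A normal return from main flushes it, but if the program crashes, is killed, or blocks waiting for input first, that text never appears (or appears too late). We’ll meet fflush(stdout) properly in Step 3 when we mix printf with scanf.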
#include <stdio.h>
int main(void) {
printf("=== Welcome to the Danger Zone ===\n");
printf("No classes. No RAII. No safety net.\n");
printf("Just you, raw memory, and a compiler.\n");
printf("Let's go.\n");
return 0;
}
Step 1 — Knowledge Check
Min. score: 80%
1. In C, what is the correct way to print text to the terminal?
C uses printf() from <stdio.h> for output. cout is C++ only. C has no objects, no operator overloading, and no << for I/O.
2. Why do we compile with gcc instead of g++ in this tutorial?
g++ compiles .c files as C++, silently accepting features like references, classes, and overloading that don’t exist in C. Using gcc ensures we learn real C.
3. What does int main(void) mean in C, and how does it differ from int main()?
In C before C23, int main() uses an old-style declarator that says nothing about the parameters, so the compiler cannot check calls against it. int main(void) explicitly says ‘no arguments.’ In C++, both mean the same thing, and C23 adopts that rule too, but under the -std=c11 we use here the distinction matters.
4. A C++ program uses std::string name = "Alice"; std::cout << name.length();. Why can’t this approach work in C? (Select the most fundamental reason.)
The core issue isn’t a missing function — it’s a missing paradigm. C has no objects, no methods, no operator overloading. A C ‘string’ is just a char[] array. You must use standalone functions like strlen() from <string.h>. This is the fundamental shift: C gives you data and functions, not objects and methods.
5. Arrange the lines to write a minimal C program that prints "42" to the terminal.
(arrange in order)
- #include <stdio.h>
- int main(void) {
- printf("%d\n", 42);
- return 0;
- }

Distractors:
- #include <iostream>
- std::cout << 42 << std::endl;
A C program needs #include <stdio.h> (not <iostream>), uses printf with a format specifier (not cout), and has the standard int main(void) signature. The distractors are C++ syntax that won’t compile under gcc.
Power #1 — printf: Speak to the Machine
Power Unlocked: Formatted Output
Your first superpower: talking directly to the terminal. printf is C’s Swiss Army knife for output. It takes a format string containing ordinary text and conversion specifiers that start with %:
🎯 You will learn to
- Apply printf conversion specifiers (%d, %f, %s, %c, %x, %%) to format mixed values.
- Analyze width / precision / padding modifiers (%.2f, %-20s, %05d) and predict their output.
- Modify a working program by adding a new conversion to lock in the syntax.
| Specifier | Type | Example |
|---|---|---|
| %d | int | printf("%d", 42) → 42 |
| %f | double | printf("%f", 3.14) → 3.140000 |
| %c | char | printf("%c", 'A') → A |
| %s | char* (string) | printf("%s", "hi") → hi |
| %p | pointer | printf("%p", ptr) → 0x7fff... |
| %x | hex int | printf("%x", 255) → ff |
| %% | literal % | printf("100%%") → 100% |
Width and Precision
You can control formatting with width and precision modifiers:
- %10d: right-align integer in a field 10 characters wide
- %-10s: left-align string in a field 10 characters wide
- %.2f: show exactly 2 decimal places
- %05d: pad with zeros, e.g. 00042
Predict Before You Run (PRIMM)
Before compiling, predict what each line in format_lab.c will print. Write down your predictions on paper, then compile and check. This predict-then-verify cycle is called PRIMM (Predict, Run, Investigate, Modify, Make) — and it’s one of the most effective ways to learn a new language’s quirks.
gcc -Wall -std=c11 format_lab.c -o format_lab
./format_lab
How many did you get right?
Investigate and Modify
Now try these modifications to deepen your understanding:
- Investigate: Change %.1f to %.5f. How many decimal places appear now?
- Investigate: What does %+d do? Try printf("%+d", 42) and printf("%+d", -7).
- Modify: Add a new line that prints: Score in hex: 0x2a (Hint: use %x and the 0x prefix).
#include <stdio.h>
int main(void) {
int xp = 42;
double hp = 97.5;
char rank = 'S';
char player[] = "xX_SlayerKing_Xx";
// Basic specifiers
printf("Player: %s\n", player);
printf("XP: %d\n", xp);
printf("HP: %f\n", hp);
printf("Rank: %c\n", rank);
// Width and precision
printf("HP (1 decimal): %.1f\n", hp);
printf("HP (no decimals): %.0f\n", hp);
printf("XP (zero-padded): [%05d]\n", xp);
printf("Player (right-20):[%20s]\n", player);
printf("Player (left-20): [%-20s]\n", player);
// Multiple values in one call
int xp_needed = 100;
printf("%s: %d/%d XP (%.1f%% to next level)\n",
player, xp, xp_needed, (xp * 100.0) / xp_needed);
return 0;
}
Step 2 — Knowledge Check
Min. score: 80%
1. What does printf("%.2f", 3.14159) print?
.2f means ‘show exactly 2 decimal places.’ The value is rounded to 3.14.
2. You want to print a literal % character. Which format string is correct?
Since % starts a conversion specifier, the only way to print a literal % in printf is %%. Using \% is not valid in C’s printf (unlike some other languages).
3. What happens if you use the wrong specifier, like printf("%d", 3.14)?
printf cannot check the types of its arguments at run time; it trusts the format string. %d tells it to fetch an int, but 3.14 was passed as a double (8 bytes on typical platforms, and often in a different register than an int would use). The result is undefined behavior, typically garbage output. The compiler warns (-Wall enables format checking) but won’t stop you.
4. Arrange the printf arguments to correctly print: Player xX_SlayerKing_Xx has 42/100 XP (42.0%)
(arrange in order)
printf("Player %s has %d/%d XP (%.1f%%)\n", "xX_SlayerKing_Xx", 42, 100, 42.0);
%f has %s"42",
%s matches the string "xX_SlayerKing_Xx", %d matches ints 42 and 100, %.1f matches the double 42.0, and %% is the printf escape that produces a single literal % in the output. The distractor "42" is wrong because %d expects an int, not a string.
5. [Interleaved: Revisit Step 1] Which of the following C++ features does NOT exist in C?
C has pointers, structs, and header files — these are shared with C++. But function overloading (two functions with the same name but different parameters) is a C++ feature. In C, every function must have a unique name.
Power #2 — scanf: Listen (But Watch Your Back)
Power Unlocked: Reading Input (with great danger)
Every superpower has a dark side. scanf lets you hear the user — but it’s also how most C programs get hacked.
scanf reads formatted input from the user. Its % specifiers look like printf’s (with a few mismatches, such as %lf being required to read a double), but there is one critical difference: scanf needs pointers because it must store the input somewhere.
🎯 You will learn to
- Identify the buffer-overflow risk in unbounded scanf("%s", ...) and gets()-style input.
- Apply fgets(buf, sizeof(buf), stdin) as the safe alternative for reading lines.
- Explain why fflush(stdout) is required after a prompt that lacks a trailing \n.
int age;
scanf("%d", &age); // & gives the ADDRESS of age
The & (address-of operator) is required for basic types. Without it, scanf would receive the value of age (garbage, since it’s uninitialized), interpret it as a memory address, and write to a random location — a classic undefined behavior bug.
The Buffer Overflow Danger
Reading strings with scanf is notoriously dangerous:
char name[10];
scanf("%s", name); // DANGER: no length limit!
If the user types more than 9 characters, scanf writes past the end of the array — a buffer overflow. This is the exact vulnerability class that has caused thousands of real-world security exploits.
The safe alternative: Use fgets() to read a line with a length limit:
fgets(name, sizeof(name), stdin); // reads at most 9 chars + '\0'
Why fflush(stdout) Matters
Notice the template code has fflush(stdout) after each printf prompt. Why? When your program writes to stdout, C doesn’t send the text to the screen immediately — it buffers it for efficiency. A newline \n usually flushes the buffer, but our prompts ("Enter server name: ") don’t end with \n. Without fflush(stdout), the prompt might never appear before scanf/fgets blocks waiting for input — the user sees a blank screen. fflush(stdout) forces the buffer to the screen immediately.
Task: Fix the vulnerable program
The file input_lab.c has a buffer overflow bug. This is a Bug Hunt — you’ll learn more from finding and fixing broken code than from writing it yourself. Let’s go.
- Replace the dangerous scanf("%s", ...) with fgets().
- Compile with gcc -Wall -std=c11 input_lab.c -o input_lab.
- Run ./input_lab and test it.
Hint: fgets includes the newline character \n in the buffer. The provided strip_newline helper removes it.
#include <stdio.h>
#include <string.h>
// Helper: remove trailing newline from fgets input
void strip_newline(char *str) {
size_t len = strlen(str);
if (len > 0 && str[len - 1] == '\n') {
str[len - 1] = '\0';
}
}
int main(void) {
char server[20];
int players;
printf("Enter server name: ");
fflush(stdout);
// BUG: this scanf has no length limit — buffer overflow!
scanf("%s", server);
printf("Enter player count: ");
fflush(stdout);
scanf("%d", &players);
printf("Server %s: %d players online.\n", server, players);
return 0;
}
Step 3 — Knowledge Check
Min. score: 80%
1. Why does scanf("%d", &age) need the & before age?
scanf must write the parsed value somewhere. &age provides the memory address of age. Without &, scanf would interpret the current (garbage) value of age as an address — undefined behavior.
2. What is the specific danger of scanf("%s", buffer) when the user types more characters than buffer can hold?
scanf with %s has no built-in length limit. It keeps writing characters until it sees whitespace, potentially overwriting adjacent memory. This is a classic security vulnerability.
3. fgets(buf, 20, stdin) reads at most how many characters into buf?
fgets reads at most size - 1 characters, reserving the last byte for \0. So fgets(buf, 20, stdin) reads at most 19 characters. This is what makes it safe — unlike scanf, it respects the buffer boundary.
4. Arrange the lines to safely read a city name into a 30-byte buffer (at most 29 characters plus the terminator), strip its trailing newline, and print it back as City: <name>. The pattern is the same as input_lab.c, but you must transfer it to a new buffer name, a new size, and a different output format.
(arrange in order)
- char city[30];
- printf("Enter city: ");
- fgets(city, sizeof(city), stdin);
- strip_newline(city);
- printf("City: %s\n", city);

Distractors:
- scanf("%s", city);
- gets(city);
- char city[1000];
Declare a buffer with a sensible bound (30 chars covers most real city names — bigger isn’t always better; oversized buffers waste stack and don’t fix the safety issue), prompt, read safely with fgets (which limits input to sizeof(city) - 1 chars), strip the trailing newline that fgets includes, then print with the format the question asked for. scanf("%s") and gets() are both unsafe — gets was removed from the C standard entirely because it cannot be used safely. char city[1000] would also work but it’s not a fix — even a 1000-char buffer can be overflowed; the right defense is fgets-with-sizeof, not just larger buffers.
5. [Interleaved: Revisit Step 2] What does printf("%05d", 42) print?
The 0 flag means ‘pad with zeros instead of spaces’, and 5 is the field width. So 42 gets zero-padded to 5 digits: 00042. Without the 0 flag, %5d would pad with spaces instead, giving three spaces followed by 42.
Power #3 — malloc/free: Control Over Memory Itself
Power Unlocked: Manual Memory Management
This is the big one. The power that separates C programmers from everyone else: you control memory directly. No garbage collector. No smart pointers. Just you and the heap. With great power comes great responsibility — and great bugs.
This step teaches you the discipline that prevents the silent memory bugs that have crashed real systems for decades. You’ll meet the grim student-error stats at the boss fight in step 11 — for now, focus on building the schema that prevents them.
🎯 You will learn to
- Apply malloc/free correctly: request bytes with sizeof, validate the NULL return, and pair every allocation with a release.
- Analyze the four-state pointer lifecycle (Uninitialized → Alive → Null → Dead) and explain which transitions cause use-after-free.
- Distinguish stack-allocated locals from heap allocations and predict when each becomes invalid.
In C++, you allocate heap memory with new and release it with delete. C uses lower-level functions from <stdlib.h>:
| C++ | C |
|---|---|
| int *p = new int; | int *p = malloc(sizeof(int)); |
| int *a = new int[10]; | int *a = malloc(10 * sizeof(int)); |
| delete p; | free(p); |
| delete[] a; | free(a); |
Stack vs. Heap: Where Does Memory Live?
Before diving into malloc, you need to know where your variables live:
@startuml
layout vertical
box "Stack\n(grows downward)\nlocal variables, auto-managed" as stack
box "(free space)" as freesp
box "Heap\n(grows upward)\nmalloc'd memory, manual" as heap
box "Global / Static\nglobal variables, string literals" as glob
box "Code (Text)\nyour compiled functions" as code
stack -- freesp
freesp -- heap
heap -- glob
glob -- code
note right of stack : High address
note right of code : Low address
@enduml
Key insight: Stack memory is free and automatic — but it dies when the function returns. Heap memory survives function calls — but you must free() it yourself. Returning a pointer to a local stack variable is a classic bug: the memory is gone by the time the caller uses the pointer.
✏️ Predict: returning the address of a local
Before reading on, predict what this program does:
int *make_seven(void) {
int x = 7;
return &x; // <- returning the address of a local
}
int main(void) {
int *p = make_seven();
printf("%d\n", *p);
return 0;
}
Pick one — commit before you scroll:
- (a) Always prints 7: x is just an integer, the value gets returned with the pointer.
- (b) Compile error: gcc rejects return &x for a local.
- (c) Sometimes prints 7, sometimes garbage, sometimes segfaults: undefined behavior. The stack frame holding x died when make_seven returned.
- (d) Always segfaults: the OS detects the stale pointer.
⚠️ Open after you've committed
The answer is (c). When make_seven returns, its stack frame is reclaimed — x no longer exists in any meaningful sense. The pointer p now points at memory that will be reused by the next function call. On a quiet main, the bytes might still happen to read 7 (giving the illusion of correctness). Call another function before printing, and the bytes are different — segfault, garbage value, or worse, plausible-looking-but-wrong data.
With gcc -Wall, you’ll likely see warning: function returns address of local variable [-Wreturn-local-addr]. Heed the warning. This is exactly what the Ownership Rule’s first question prevents: who allocates? If the answer is “the function’s stack frame,” the lifetime ends at the return statement.
The fix is one of: (1) caller passes in a buffer (void make_seven(int *out) { *out = 7; }), (2) the function mallocs and returns the heap pointer (caller now must free), or (3) x is a static local (lives for the program’s lifetime, but is shared — usually wrong).
🔧 Tool callout: AddressSanitizer makes lifetime bugs visible
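The dangling-pointer bug above is invisible at runtime by default: your program “works” until it doesn’t. AddressSanitizer (built into gcc and clang) adds checks around memory accesses at compile time, then reports errors such as use-after-free and heap overflow at run time, at the moment they happen. Leaks are reported when the program exits (on platforms where leak detection is supported), and stack-use-after-return detection may need to be switched on explicitly, depending on the compiler and version.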
gcc -Wall -std=c11 -g -fsanitize=address memory_lab.c -o memory_lab
./memory_lab
For a clean program you’ll see no extra output. For the dangling-pointer program above, AddressSanitizer prints a precise diagnostic naming the offending line. You’ll meet this tool again in the boss fight (step 11) — think of it as the X-ray vision that turns silent C bugs into loud ones.
Key Differences from C++
- malloc returns void*. In C, this implicitly converts to any pointer type (no cast needed). Don’t add a cast; it hides bugs.
- malloc does NOT initialize memory: the bytes are garbage. Use calloc() if you need zeroed memory.
- malloc can fail: it returns NULL if there’s no memory. Always check.
- No constructors: malloc just gives you raw bytes. You must initialize fields yourself.
📋 The Ownership Rule: name it before you write it
C++ has destructors and unique_ptr to keep track of who owns what. C does not. The discipline that replaces it is answering four questions about every pointer you write. Before you allocate or pass a pointer in C, force yourself to commit to:
- Who allocates? Which function calls malloc? (Often the only honest answer is “this one, right here.”)
- Who frees? Which function calls free on this pointer? (Must be exactly one, on every code path including errors.)
- Who borrows it? Which functions read/write through this pointer without taking ownership? They must not free it.
- What’s mutable? Can the function modify the pointed-to data? If not, the parameter type should say const T *, not T *.
Most C bugs that aren’t syntax errors come from skipping one of these questions. Make answering them a reflex.
The Pointer Lifecycle: A Mental Model
Here’s a mental model that will save you hours of debugging. Every pointer variable is in one of four states:
@startuml
[*] --> Uninitialized
Uninitialized --> Alive : malloc()
Alive --> Dead : free()
Alive --> Null : p = NULL
Null --> Alive : p = malloc()
@enduml
| State | Meaning | Safe Operations |
|---|---|---|
| Uninitialized | Declared but not assigned | None — using it is undefined behavior |
| Alive | Points to valid, allocated memory | Dereference (*p), member access (p->x), free |
| Null | Explicitly set to NULL | Compare (p == NULL), reassign |
| Dead | Was freed; the memory went back to the allocator | Nothing! Accessing a dead pointer is use-after-free |
The most dangerous transition is Alive → Dead (via free()), because the pointer variable still holds the old address — it just doesn’t point to valid memory anymore. The pointer looks fine, but the memory behind it is gone. Pro tip: set pointers to NULL immediately after freeing them — it converts a future use-after-free (silent corruption) into a NULL-deref (loud crash you can debug).
Task: Build a dynamic array
Complete the program in memory_lab.c:
- Allocate an array of count integers using malloc.
- Check if malloc returned NULL.
- Fill the array with squares: squares[i] = i * i.
- Print the array.
- Free the memory when done.
gcc -Wall -std=c11 memory_lab.c -o memory_lab
./memory_lab
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int count = 5;
// Sub-goal 1: Allocate heap memory
// Use malloc(count * sizeof(int)) to request space for 'count' ints
int *squares = NULL; // Replace NULL with your malloc call
// Sub-goal 2: Validate allocation
// Check if malloc returned NULL (out of memory). If so, print error and exit.
// Sub-goal 3: Initialize data
// Fill array with squares: squares[i] = i * i
// Print the array
printf("Squares:");
for (int i = 0; i < count; i++) {
printf(" %d", squares[i]);
}
printf("\n");
// Sub-goal 4: Release memory
// Every malloc must have a matching free
return 0;
}
Step 4 — Knowledge Check
Min. score: 80%
1. What does malloc(10 * sizeof(int)) return?
malloc allocates raw, uninitialized bytes and returns a pointer. 10 * sizeof(int) = 40 bytes (assuming 4-byte ints). Unlike calloc, malloc does NOT zero-initialize. It returns NULL if allocation fails.
2. In C, should you cast the return value of malloc? E.g., int *p = (int*)malloc(...);
In C, void* implicitly converts to any pointer type — no cast needed. Adding a cast like (int*) can mask the bug of forgetting #include <stdlib.h>, because without the header, the compiler assumes malloc returns int (in older C standards), and the cast silently converts the wrong type.
3. What happens if you forget to call free() on malloc’d memory?
While the OS does reclaim memory on process exit, memory leaks in long-running programs (servers, daemons) gradually consume all available RAM. In C, there is no garbage collector — you are responsible for every byte you allocate.
4. Arrange the lines to dynamically allocate an array of 100 doubles, check for failure, use it, and clean up. (arrange in order)
- double *data = malloc(100 * sizeof(double));
- if (data == NULL) { return 1; }
- data[0] = 3.14;
- printf("%.2f\n", data[0]);
- free(data);

Distractors:
- double *data = new double[100];
- delete[] data;
The sequence is: (1) allocate with malloc, (2) check for NULL, (3) use the memory, (4) print, (5) free. The distractors use C++ syntax (new/delete[]), which doesn’t exist in C.
5. [Interleaved: Revisit Step 3] You write scanf("%d", age) (without &). What happens?
Without &, scanf receives the value of age (which is uninitialized garbage), interprets that garbage as a memory address, and writes the parsed input there. This is undefined behavior — it might crash, corrupt memory, or appear to work by coincidence. The compiler may warn with -Wall, but won’t stop you.
Power #4 — Strings: Bare-Knuckle Text Wrangling
Power Unlocked: Raw String Manipulation
In C++, std::string does the heavy lifting — memory, length tracking, concatenation, all automatic. In C, you are the string class. Every byte, every null terminator, every bounds check — that’s on you. A “string” is just an array of char terminated by a null byte '\0':
🎯 You will learn to
- Apply strcmp for string equality and explain why == silently compares pointer addresses instead.
- Apply strncpy with manual '\0' termination to copy strings safely without buffer overflow.
- Identify the C++ “false friends” (+, =, .length()) that look familiar but do the wrong thing, or fail to compile, on char*.
char name[] = "Alice";
// Memory layout: ['A']['l']['i']['c']['e']['\0']
// [0] [1] [2] [3] [4] [5]
The null terminator '\0' marks where the string ends. Every string function (strlen, printf %s, etc.) scans forward until it hits '\0'. If you forget the null terminator, functions will read past the end of your array — undefined behavior.
String Functions (from <string.h>)
| Function | Purpose | Gotcha |
|---|---|---|
| strlen(s) | Returns length (not counting '\0') | O(n): scans for '\0' every time |
| strcpy(dst, src) | Copies src into dst | No bounds checking! Use strncpy |
| strcat(dst, src) | Appends src to dst | No bounds checking! |
| strcmp(a, b) | Compares: returns 0 if equal | You CANNOT use == to compare strings |
| strncpy(dst, src, n) | Copies at most n chars | Does NOT null-terminate if strlen(src) >= n |
“False Friends” from C++
Some C syntax looks like C++ but does something completely different. These traps will get you if you’re on autopilot:
- + on strings: In C++, str1 + str2 concatenates. In C, adding an integer to a char* is pointer arithmetic (it moves the address, it does not concatenate), and adding two pointers does not compile at all. Use strcat().
- = on strings: In C++, str1 = str2 copies. In C, = on char[] is illegal after declaration, and = on char* copies only the pointer. Use strcpy() or strncpy().
- No .length(): C strings have no methods. Use strlen(), and remember it’s O(n), not O(1).
✏️ Predict: two ways to “make a string”
Both lines below look like reasonable ways to make a string named cat. But they have very different storage. Predict before you read on:
const char *literal = "cat"; // line A
char array[] = "cat"; // line B
array[0] = 'b'; // legal? what does `array` hold afterward?
literal[0] = 'b'; // legal? same question.
Pick one — commit before you scroll:
- (a) Both lines work. literal and array are both "bat" afterward.
- (b) array[0] = 'b' works (array becomes "bat"); literal[0] = 'b' is undefined behavior, likely a segfault.
- (c) Both lines compile but produce undefined behavior: string literals are read-only.
- (d) literal and array are aliases for the same memory, so both succeed and end up "bat".
⚠️ Open after you've committed
The answer is (b).
- char array[] = "cat" allocates a writable 4-byte char array on the stack and copies the literal "cat\0" into it. array owns its bytes. Mutation is fine.
- const char *literal = "cat" stores the string literal in a read-only segment of the program’s memory (often .rodata). literal is a pointer into that read-only memory. Writing through it is undefined behavior, usually a segfault on Linux/macOS.
The const on const char *literal is your safety net: the compiler refuses literal[0] = 'b'. Drop the const (char *literal = "cat") and the compiler accepts it without warning, but the program will still crash at runtime — silent UB. Always declare string-literal pointers as const char *.
The deeper lesson: two variables that look identical at the call site can have completely different lifetimes and write permissions. C’s “everything is bytes” simplicity stops at the storage class.
The #1 Mistake: Using == to Compare Strings
if (name == "Alice") // WRONG! Compares pointer addresses, not contents
if (strcmp(name, "Alice") == 0) // CORRECT! Compares character-by-character
Task: Fix the string bugs
The file strings_lab.c has three bugs related to C strings. Find and fix all of them:
- A string comparison using == instead of strcmp
- An unsafe strcpy that should use strncpy
- A missing null terminator after strncpy
gcc -Wall -std=c11 strings_lab.c -o strings_lab
./strings_lab
#include <stdio.h>
#include <string.h>
int main(void) {
// Bug 1: comparing strings with ==
char lang[] = "C";
if (lang == "C") {
printf("Language is C\n");
} else {
printf("Language is not C\n");
}
// Bug 2: strcpy with no size limit
char dest[8];
char src[] = "A very long string that overflows the buffer";
strcpy(dest, src);
printf("Copied: %s\n", dest);
// Bug 3: strncpy may not null-terminate
char abbrev[4];
strncpy(abbrev, "Pittsburgh", sizeof(abbrev));
printf("Abbreviation: %s\n", abbrev);
return 0;
}
Step 5 — Knowledge Check
Min. score: 80%
1. What is the length of the string "Hello" in memory (including the null terminator)?
‘Hello’ has 5 visible characters, plus the invisible \0 null terminator = 6 bytes total. strlen("Hello") returns 5 (it doesn’t count \0), but the array needs 6 bytes of storage.
2. Why can’t you use == to compare C strings?
In C, a string is an array, and array names decay to pointers. str1 == str2 compares whether both pointers refer to the same memory address, not whether the characters match. Use strcmp(str1, str2) == 0 to compare contents.
3. Arrange the lines to safely copy a string from src into dest (size 20), ensuring null-termination.
(arrange in order)
- char dest[20];
- char *src = "Hello, World!";
- strncpy(dest, src, sizeof(dest) - 1);
- dest[sizeof(dest) - 1] = '\0';
- printf("%s\n", dest);

Distractors:
- strcpy(dest, src);
- dest = src;
Declare the buffer, define the source, copy safely with strncpy (reserving space for \0), manually null-terminate, then print. strcpy has no size limit (unsafe). dest = src doesn’t copy — it just changes the pointer (and is illegal for arrays).
4. [Interleaved: Revisit Step 4] After char *s = malloc(50);, what is the content of the 50 bytes?
malloc returns uninitialized memory. The bytes could be anything — remnants of previous allocations. If you need zeroed memory, use calloc(50, 1) instead. For a string buffer, you must at minimum set s[0] = '\0' before using it with string functions.
Power #5 — Structs: Build Your Own Data Types
Power Unlocked: Custom Data Structures
Time to level up from primitive types. With structs, you can bundle related data together and build the foundations of any system — game engines, operating systems, databases. C has no classes, but structs + functions give you everything you need.
🎯 You will learn to
- Define a typedef’d struct and access its fields through a pointer with ->.
- Apply the C “no-methods” idiom: pass Struct * (or const Struct *) to standalone functions instead of writing member functions.
- Distinguish C struct semantics from C++ struct/class (no access control, no constructors, no inheritance).
In C++, class and struct are nearly identical (differing only in default access). In C, struct is all you have, and it’s much more limited:
- No methods: functions that operate on a struct are standalone
- No access control: no private, protected, or public
- No constructors/destructors: you write init/cleanup functions yourself
- No inheritance: you can nest structs for composition
⚠️ Negative-transfer trap: struct defaults differ between C++ and C
If your C++ habit is “struct and class are basically the same”, unlearn it for C:
| Comparison point | C++ struct | C++ class | C struct |
|---|---|---|---|
| Default access | public | private | (no concept of access at all) |
| Methods | yes | yes | no |
| Constructors | yes | yes | no |
| Inheritance | yes | yes | no |
So when a C++ programmer writes struct Point { double x, y; };, they have a perfectly valid public-by-default C++ class. When you write the same line in C, you have a passive data record — no methods, no encapsulation, no this. Functions that operate on a struct live outside it and take a pointer to it as their first parameter. That convention is everything you’ll do in this step.
Side-by-side: same idea in C++ and C
To lock in the paradigm shift, here’s the same concept (a translatable point) written both ways. The C++ version uses methods; the C version uses standalone functions that take a pointer as their first argument:
// C++: data + methods bound together
struct Point {
double x, y;
void translate(double dx, double dy) {
x += dx; y += dy;
}
double magnitude() const {
return std::sqrt(x*x + y*y);
}
};
Point p{3, 4};
p.translate(1, 1); // method call: p.translate(...)
double m = p.magnitude();
// C: data and functions live separately, linked by convention
typedef struct {
double x, y;
} Point;
void point_translate(Point *p, double dx, double dy) {
p->x += dx; p->y += dy;
}
double point_magnitude(const Point *p) {
return sqrt(p->x * p->x + p->y * p->y);
}
Point p = {3, 4};
point_translate(&p, 1, 1); // function call: point_translate(&p, ...)
double m = point_magnitude(&p);
Three conventions to internalize from the C version:
- Module prefix on every function: point_translate, point_magnitude. C has no namespaces, so the prefix is the namespace.
- First parameter is Type *self, by convention. The function knows nothing about its receiver until you hand it one. Pass &p at the call site instead of writing p.translate.
- Use const Type *self for read-only access: point_magnitude doesn’t modify p, so its parameter is const Point *. This is C’s best approximation of a C++ const method.
⚠️ Negative-transfer trap: struct assignment is fieldwise, not deep
In C++, you’d reach for a copy constructor to control what happens when one object is copied to another. C has no copy constructors. Struct assignment in C is a literal byte-by-byte copy of the fields. That’s fine for value-type structs (like Point above) — but it’s a trap for any struct that holds a pointer to heap memory.
Predict the output of this program. Commit before you scroll:
typedef struct {
char *data; // points to memory owned elsewhere (often the heap)
} Buffer;
int main(void) {
char text[] = "hello";
Buffer a = { text }; // a.data points at `text`
Buffer b = a; // struct assignment
b.data[0] = 'y'; // mutate through b
printf("%s %s\n", a.data, b.data);
return 0;
}
- (a) hello hello: assignment doesn’t actually run; the compiler optimizes it away.
- (b) hello yello: b got an independent copy; mutating b.data doesn’t affect a.
- (c) yello yello: a and b share the same data pointer; mutating one mutates the other.
- (d) Compile error: C forbids assigning between structs.
⚠️ Open after you've committed
The answer is (c): yello yello. The line Buffer b = a copies the one field of Buffer — which is the pointer data, not what it points to. After the assignment, a.data and b.data are aliases for the same character array. Mutating through one is visible through the other.
This is the trap the Ownership Rule prevents. The four questions:
- Who allocates the bytes that a.data and b.data point at? → The local array text in main.
- Who frees them? → text lives on the stack; freed automatically when main returns. But if text had been malloced, who frees it: a or b?
- Who borrows? → After b = a, you have two borrowers of the same memory.
- What’s mutable? → Both can mutate. Neither can tell the other “I’m mutating now.”
In C++, a copy constructor would deep-copy the buffer. In C, you write that yourself: a buffer_clone(const Buffer *src) function that mallocs a new array and memcpys the contents. C makes the work explicit because the compiler refuses to guess your ownership intent.
Declaring and Using Structs
struct Point {
double x;
double y;
};
// Without typedef, you must write 'struct Point' everywhere:
struct Point p1;
p1.x = 3.0;
p1.y = 4.0;
typedef Saves Typing
typedef struct {
double x;
double y;
} Point;
// Now you can just write 'Point':
Point p1 = {3.0, 4.0};
The Arrow Operator (->)
When you have a pointer to a struct, use -> instead of .:
Point *pp = &p1;
pp->x = 5.0; // same as (*pp).x = 5.0
Task: Build an RPG Character Sheet
Complete structs_lab.c to create a Character struct (think RPG character sheet) and functions that operate on it. This is how you do “OOP” in C — structs hold data, standalone functions provide behavior.
We’ve provided the main() function — your job is to build the struct and its functions. Filling in a working skeleton is a faster path to understanding than staring at a blank file.
- Define the Character struct using typedef (fields: name[50], level, hp).
- Implement character_init to populate a character.
- Implement character_print to display a character’s stats.
gcc -Wall -std=c11 structs_lab.c -o structs_lab
./structs_lab
#include <stdio.h>
#include <string.h>
// TODO: Define a Character struct using typedef with fields:
// - char name[50]
// - int level
// - double hp
// TODO: Implement character_init
// Takes a POINTER to Character, plus name, level, hp as parameters
// Copies name into c->name using strncpy (safely!)
// Sets c->level and c->hp
// TODO: Implement character_print
// Takes a POINTER to Character (use const for safety)
// Prints: "<name> [Lv.<level>] HP: <hp>"
int main(void) {
Character hero;
character_init(&hero, "LinkSlayer99", 42, 97.5);
character_print(&hero);
Character boss;
character_init(&boss, "DarkLord_X", 99, 1000.0);
character_print(&boss);
return 0;
}
Step 6 — Knowledge Check
Min. score: 80%
1. Why do C programmers pass struct pointers to functions instead of passing structs by value?
C passes everything by value. Passing a 200-byte struct copies all 200 bytes. A pointer is typically just 8 bytes on a 64-bit system and lets the function modify the original. C has no references, so pointers are the only option for ‘pass by reference’ behavior.
2. Given Character *c = &hero;, which syntax accesses the name field?
-> is the member access operator for pointers to structs. c->name is equivalent to (*c).name. Using c.name would fail because c is a pointer, not a struct.
3. Arrange the lines to define a Rectangle struct and a function that calculates its area.
(arrange in order)
typedef struct {
double width;
double height;
} Rectangle;
double rect_area(const Rectangle *r) {
return r->width * r->height;
}
Distractors (not part of the answer):
class Rectangle {
return r.width * r.height;
typedef struct { ... } Rectangle; defines the struct. The area function takes a const pointer (read-only) and uses -> to access members through the pointer. class doesn’t exist in C. r.width would be wrong because r is a pointer — you need r->width.
4. [Interleaved: Revisit Step 5] Why does character_init use strncpy instead of strcpy for the name?
As we learned in the strings step, strcpy has no length limit and can overflow the destination buffer. strncpy copies at most n characters, making it safe for fixed-size char arrays like name[50]. But remember: strncpy may NOT null-terminate, so we add '\0' manually.
5. In C++, you’d write p.translate(1, 1). The closest equivalent in idiomatic C is:
The C convention is prefix_action(&p, args...). The prefix (point_) substitutes for namespaces, the &p substitutes for the implicit this, and the function lives outside the struct. This pattern repeats for every C ‘class-like’ API you’ll meet — pthread_create, fopen, git_repository_open all follow it.
Power #6 — Unions: Shape-Shifting Memory
Power Unlocked: One Memory Location, Many Forms
This power is subtle but deadly useful. A union lets a single block of memory shape-shift between different types — like a Pokemon swapping between Fire, Water, and Electric attack types using the same move slot. It’s normal to wonder “when would I ever use this?” The answer: unions show up in parsers, network protocols, every Pokemon-style “this thing can be one of N variants” system, and any code that handles multiple data shapes through the same interface. If this step feels harder than previous ones, that’s expected — you’re building a more sophisticated mental model.
🎯 You will learn to
- Apply the tagged-union pattern (enum tag + anonymous union) to represent a value that can hold one of N variants.
- Analyze why sizeof(union) is determined by its largest member, and predict which member is valid at any moment.
- Distinguish C tagged unions from C++ std::variant, and explain which guarantees the compiler does not give you in C.
Motivating example: a single attack slot, three element types
Imagine a Pokemon battle engine. An attack can be Fire (with burn_dmg), Water (with splash_radius), or Electric (with volts). Each type carries different data, but a Pokemon stores them all in the same attack slot. You could declare three separate fields and waste two-thirds of the memory every time, or you could declare one union and accept that only one variant is valid at a time:
union AttackData {
int burn_dmg; // valid when type == FIRE
double splash_radius; // valid when type == WATER
int volts; // valid when type == ELECTRIC
};
This is exactly the trade-off unions make: all members share the same memory. The size of a union is the size of its largest member, plus any padding the compiler adds for alignment.
union Value {
int i; // 4 bytes
double d; // 8 bytes
char s[8]; // 8 bytes
};
// sizeof(union Value) == 8 on typical platforms (size of largest member)
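At any moment, only one member holds a meaningful value. Writing to val.d overwrites whatever was in val.i. Reading a member you didn’t last write to makes C reinterpret the stored bytes as the other type (in C++ the same read is undefined behavior), and the result is rarely meaningful: the Pokemon equivalent of “asking the Fire attack what its splash radius is.”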
✏️ Predict before you read on
Suppose union Value v; and you do:
v.i = 42; // write 4 bytes as int
printf("%f\n", v.d); // read 8 bytes as double — what prints?
Pick one — commit before you scroll:
- (a) 42.000000: C converts the int to a double on read.
- (b) 0.000000: the unwritten upper bytes are zero, so the double is well-defined.
- (c) An unpredictable garbage float: C reinterprets the raw bytes; the upper 4 bytes are whatever was on the stack.
- (d) Compile error: the compiler rejects mismatched member access.
⚠️ Open after you've committed to a letter
The answer is (c). C does no conversion between union members; it reinterprets the same bytes through whichever type you ask for. Four of the eight bytes hold the int 42; the other four were never written, so they hold whatever happened to be in that memory before. Read as a double, that bit pattern is meaningless.
Why does this matter? Because the union itself doesn’t know which member is currently valid. There’s no runtime check, no compiler warning. The discipline is on you — and that discipline is what the tagged union pattern below formalizes.
Tagged Unions: The C Pattern for “Variant Types”
Since the union doesn’t know which member is active, you need to track it yourself. The standard pattern is a struct with a tag (enum) and a union — the tag is the Pokemon’s type, the union holds the type-specific data:
typedef enum { TYPE_INT, TYPE_DOUBLE, TYPE_STRING } ValueType;
typedef struct {
ValueType type; // tag: which union member is valid
union {
int i;
double d;
char s[32];
}; // anonymous union (C11)
} TaggedValue;
⚠️ Negative-transfer trap: this is not std::variant
C++17 introduced std::variant<int, double, std::string> — a type-safe tagged union with constructors, destructors, and the std::visit machinery to dispatch on the active alternative. C has none of that. The C tagged-union pattern is what std::variant was built on top of. In C:
- You manage the tag yourself.
- The compiler can’t help you avoid reading the wrong member.
- There’s no std::visit: you write the switch by hand.
If you came from C++17 expecting std::variant-style guarantees, uninstall that habit before this step. The C version is hand-rolled discipline, not language support.
Task: Build a tagged value system
Complete unions_lab.c to implement a TaggedValue that can hold an int, double, or string. Implement the print_value function that uses a switch on the tag.
gcc -Wall -std=c11 unions_lab.c -o unions_lab
./unions_lab
#include <stdio.h>
#include <string.h>
typedef enum { TYPE_INT, TYPE_DOUBLE, TYPE_STRING } ValueType;
typedef struct {
ValueType type;
union {
int i;
double d;
char s[32];
};
} TaggedValue;
// TODO: Implement print_value
// Use a switch on val->type to print the correct member:
// TYPE_INT: printf("int: %d\n", ...)
// TYPE_DOUBLE: printf("double: %.2f\n", ...)
// TYPE_STRING: printf("string: %s\n", ...)
void print_value(const TaggedValue *val) {
}
int main(void) {
TaggedValue v1 = { .type = TYPE_INT, .i = 42 };
TaggedValue v2 = { .type = TYPE_DOUBLE, .d = 3.14 };
TaggedValue v3 = { .type = TYPE_STRING };
strncpy(v3.s, "hello", sizeof(v3.s) - 1);
v3.s[sizeof(v3.s) - 1] = '\0';
print_value(&v1);
print_value(&v2);
print_value(&v3);
return 0;
}
Step 7 — Knowledge Check
Min. score: 80%
1. A union with an int (4 bytes), double (8 bytes), and char[4] (4 bytes). What is sizeof this union?
A union’s size equals its largest member. All members share the same starting address in memory, so the union must be large enough to hold any one of them. Here, double at 8 bytes is largest.
2. What happens if you write to val.i and then read val.d (without writing to val.d first)?
Only the last-written member is valid. Reading a different member reinterprets the raw bytes as a different type — the result is unpredictable. This is why tagged unions use an explicit type tag.
3. Arrange the lines to create a tagged union for a Shape that can be a circle (with radius) or rectangle (with width and height), and print the area.
(arrange in order)
typedef enum { CIRCLE, RECT } ShapeType;
typedef struct {
ShapeType type;
union { double radius; struct { double w, h; }; };
} Shape;
if (s.type == CIRCLE) printf("%.2f\n", 3.14 * s.radius * s.radius);
else printf("%.2f\n", s.w * s.h);
Distractors (not part of the answer):
class Shape { virtual double area(); };
First define the enum for shape types, then the tagged struct with an anonymous union containing either a radius or a {w, h} sub-struct. The if dispatches on the tag. The distractor uses C++ classes/virtual functions, which don’t exist in C.
4. [Interleaved: Revisit Step 5] In the TaggedValue struct, the string member is char s[32]. If you assign strncpy(v.s, "hello", sizeof(v.s)), is the string safely null-terminated?
strncpy null-terminates ONLY if the source string is shorter than n. Since "hello" (5 chars) < 32, the remaining bytes are filled with \0. But if the source were 32+ chars, no null terminator would be added. The safe habit is always s[sizeof(s)-1] = '\0' after strncpy.
5. A teammate writes print_value like this — no switch on the tag:
void print_value(const TaggedValue *val) {
printf("int: %d, double: %.2f, string: %s\n",
val->i, val->d, val->s);
}
Without the tag-based dispatch, print_value reads ALL three union members, but only one was ever validly written. The other two reads reinterpret raw bytes through the wrong type and print meaningless values; printing s with %s can even run past the array if those bytes contain no \0, which is undefined behavior. This is exactly what the tag is for: it tells you which member is currently meaningful, so you only read that one. Skipping the tag dispatch defeats the entire pattern.
Power #7 — Function Pointers: Code That Rewires Itself
Power Unlocked: Functions as Values
This is arguably C’s most mind-bending power: functions are just addresses in memory, and you can store, pass, and swap them at runtime. This is how C programs achieve polymorphism without classes — and it’s the secret behind qsort, callback systems, and plugin architectures.
🎯 You will learn to
- Read the function-pointer declaration syntax (int (*fp)(int, int)) and explain why the inner parentheses matter.
- Apply qsort with a custom comparator, casting const void* parameters back to the real type before comparing.
- Create ascending and descending comparators and predict their effect on the same input array.
In C, a function name (without parentheses) evaluates to the function’s memory address. You can store this address in a function pointer and call the function through it.
int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }
// Declare a function pointer
int (*operation)(int, int);
operation = add; // point to 'add'
int result = operation(3, 4); // calls add(3, 4) → 7
operation = sub; // repoint to 'sub'
result = operation(3, 4); // calls sub(3, 4) → -1
Reading the Syntax (Pair Up!)
Function pointer syntax is notoriously confusing — even experienced C programmers have to pause and think about it. If you’re working alongside a classmate, this is an excellent moment for pair programming. Two brains parsing int (*fp)(const void*, const void*) is genuinely better than one.
The syntax int (*operation)(int, int) reads as:
- operation is a pointer (the *)
- to a function (the parameter list (int, int))
- that returns int
Warning: Without the inner parentheses, int *operation(int, int) means “a function returning int*” — completely different!
qsort: The Classic Callback Example
The C standard library’s qsort sorts any array using a comparison function you provide:
void qsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void*, const void*));
The comparison function receives void* pointers (generic pointers — C’s limited version of templates). You must cast them to the correct type inside.
Worked Example: A Complete Comparator
Before you write your own, study this fully worked comparator for sorting doubles:
// Sub-goal: Cast void* to the actual type
int compare_doubles(const void *a, const void *b) {
double da = *(const double *)a; // cast void* → double*, then dereference
double db = *(const double *)b;
// Sub-goal: Return comparison result
if (da < db) return -1;
if (da > db) return 1;
return 0;
}
Notice the pattern: (1) cast void* to the real type, (2) dereference to get the value, (3) compare. Your task below follows the same pattern but for int.
Task: Sort an array with qsort
Complete funcptr_lab.c:
- Implement compare_ascending for qsort (return negative if *a < *b, zero if equal, positive if *a > *b).
- Implement compare_descending (reverse order).
- Use qsort with each comparator.
gcc -Wall -std=c11 funcptr_lab.c -o funcptr_lab
./funcptr_lab
#include <stdio.h>
#include <stdlib.h>
void print_array(const int *arr, int n) {
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
printf("\n");
}
// TODO: Implement compare_ascending for qsort
// Parameters are const void* pointers — cast to const int*
// Return: negative if *a < *b, zero if equal, positive if *a > *b
int compare_ascending(const void *a, const void *b) {
return 0; // Replace this
}
// TODO: Implement compare_descending (reverse of ascending)
int compare_descending(const void *a, const void *b) {
return 0; // Replace this
}
int main(void) {
int data[] = {42, 17, 93, 8, 56, 31, 74};
int n = sizeof(data) / sizeof(data[0]);
printf("Original: ");
print_array(data, n);
qsort(data, n, sizeof(int), compare_ascending);
printf("Ascending: ");
print_array(data, n);
qsort(data, n, sizeof(int), compare_descending);
printf("Descending: ");
print_array(data, n);
return 0;
}
Step 8 — Knowledge Check
Min. score: 80%
1. What does the declaration int (*fp)(double, double); mean?
The parentheses in (*fp) are critical. They make fp a pointer to a function. Without them, int *fp(double, double) would declare a function returning int* — very different!
2. Why does qsort use void* parameters in its comparison function?
C lacks C++ templates. void* is C’s mechanism for generic programming — it’s a pointer to ‘any type.’ The downside: you must manually cast to the correct type inside the callback, with no compiler safety net.
3. Arrange the lines to define a comparison function for sorting strings with qsort, then call qsort on a string array.
(arrange in order)
int cmp_str(const void *a, const void *b) {
return strcmp(*(const char **)a, *(const char **)b);
}
char *words[] = {"banana", "apple", "cherry"};
qsort(words, 3, sizeof(char *), cmp_str);
Distractors (not part of the answer):
return *(char *)a - *(char *)b;
std::sort(words, words + 3);
For an array of char* strings, qsort passes pointers to array elements — i.e., char** cast as void*. We cast back to const char** and dereference to get the char*, then compare with strcmp. The distractor *(char*)a - *(char*)b compares single characters, not full strings. std::sort is C++ only.
4. [Interleaved: Revisit Step 6] How do function pointers relate to structs in C?
By putting function pointers inside structs, C programmers can simulate object-oriented patterns: the struct holds data plus function pointers, much like a hand-built C++ vtable. Cfront, the original C++ compiler, worked by translating C++ into C.
Trial by Fire — Arrays, Pointers, and the Decay Trap
Every Hero Has a Weakness. This Is Yours.
Array decay and pass-by-value are the kryptonite of C programmers. More bugs come from misunderstanding these two concepts than from almost anything else in the language. This step is a trial — survive it, and you’ll have the mental model that separates beginners from real systems programmers.
Scaffolding pause: You’ve been writing code from scratch in the last few steps. Now we’re deliberately giving you back some scaffolding — pre-written buggy code to debug — because this concept is a notorious trap even for experienced programmers. Finding bugs is the right exercise type here: it forces you to reason about why code breaks, which is exactly the skill you need for array/pointer issues.
🎯 You will learn to
- Explain array-to-pointer decay and predict what sizeof(arr) returns inside a function vs. at the call site.
- Apply the C convention of passing an array’s length as a separate parameter.
- Apply pointer-to-pointer (int **) parameters to let a function modify the caller’s pointer (output parameter).
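Arrays and pointers are distinct types in both C and C++, and arrays decay the same way in both. The difference is exposure: in C++ you mostly reach for std::vector, while in C raw arrays are everywhere, and students routinely confuse them with pointers. Treat this as the most treacherous “false friend” you bring over from C++.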
The Decay Rule: When you pass an array to a function, it silently decays into a pointer to its first element. The function receives just a pointer — all size information is lost.
void print_size(int arr[]) {
// SURPRISE: sizeof(arr) is the pointer size (8 on a typical 64-bit system), NOT the array size!
printf("sizeof = %zu\n", sizeof(arr)); // prints 8
}
int main(void) {
int data[100];
printf("sizeof = %zu\n", sizeof(data)); // prints 400 (100 elements × 4-byte int)
print_size(data); // prints 8!
}
This is a classic source of bugs in C array code. In a parameter list, int arr[] is identical to int *arr; the brackets are just syntactic sugar.
Quick Refresh: The Pointer Lifecycle (from Step 4)
Remember the four pointer states? You’ll need them for Bug 3:
- Alive → points to valid memory (after malloc)
- Dead → was freed (use-after-free if you touch it)
- Null → explicitly set to NULL (safe to check, unsafe to dereference)
- Uninitialized → never assigned (garbage address)
Bug 3 involves a pointer that should transition from Null to Alive — but doesn’t, because of how C passes arguments.
C Is Strictly Pass-by-Value
C++ has references (int &x). C does not. Everything in C is passed by value — including pointers. When you pass a pointer, the function gets a copy of the pointer (the address), not a reference to the original pointer variable.
This means:
- Modifying *ptr inside a function changes the pointed-to data (the copy points to the same address)
- Modifying ptr itself (e.g., ptr = malloc(...)) does NOT affect the caller’s pointer
To modify a pointer from inside a function, you need a pointer to a pointer (int **pp).
Task: Find and fix the array/pointer bugs
The file arrays_lab.c has three bugs, ordered by difficulty:
- Bug 1 (easy): array_length uses sizeof on a decayed array. Fix: pass length as parameter.
- Bug 2 (easy): zero_fill has the same sizeof bug.
- Bug 3 (hard): allocate modifies a local copy of the pointer. Fix: change the parameter to int **ptr and use *ptr = malloc(...). Also update the caller to pass &heap_data.
Start with Bugs 1-2. Once those compile and run, tackle Bug 3 — it’s conceptually different (pass-by-value for pointers).
gcc -Wall -std=c11 arrays_lab.c -o arrays_lab
./arrays_lab
#include <stdio.h>
#include <stdlib.h>
// Bug 1: This function tries to compute array length
// but sizeof(arr) gives POINTER size, not array size!
int array_length(int arr[]) {
return sizeof(arr) / sizeof(arr[0]);
}
// Bug 2: This function tries to zero-fill an array
// but uses the wrong size
void zero_fill(int arr[]) {
int len = sizeof(arr) / sizeof(arr[0]); // BUG: decay!
for (int i = 0; i < len; i++) {
arr[i] = 0;
}
}
// Bug 3: This function tries to allocate memory for the caller
// but the caller's pointer never changes (pass-by-value!)
void allocate(int *ptr, int n) {
ptr = malloc(n * sizeof(int)); // BUG: modifies local copy only
if (ptr != NULL) {
for (int i = 0; i < n; i++) {
ptr[i] = i * 10;
}
}
}
int main(void) {
// Test Bug 1 & 2
int data[5] = {1, 2, 3, 4, 5};
printf("Array length: %d (expected 5)\n", array_length(data));
zero_fill(data);
printf("After zero_fill: %d %d %d %d %d (expected all 0s)\n",
data[0], data[1], data[2], data[3], data[4]);
// Test Bug 3
int *heap_data = NULL;
allocate(heap_data, 5);
if (heap_data == NULL) {
printf("heap_data is still NULL! allocate() didn't work.\n");
}
// After fixing: uncomment these lines
// printf("heap_data[0] = %d (expected 0)\n", heap_data[0]);
// free(heap_data);
return 0;
}
Step 9 — Knowledge Check
Min. score: 80%
1. What happens to an array when you pass it to a function in C?
Array decay is one of C’s most important rules. void f(int arr[]) is identical to void f(int *arr): both receive a pointer. sizeof(arr) inside the function returns the pointer size (typically 8 bytes on a 64-bit system), not the array size. You must pass the length separately.
2. A function void resize(int *p, int new_size) calls p = realloc(p, new_size * sizeof(int)) inside. After resize(data, 100) returns, what is data in the caller?
C is strictly pass-by-value. The function modifies its local copy of p, not the caller’s data. If realloc had to move the block, the original memory has been freed, so data now points to freed memory (a use-after-free bug) and the new block is leaked. Fix: use int **p or return the new pointer.
3. Arrange the lines to write a function that doubles every element in an array, accepting the length as a parameter (since sizeof won’t work on a decayed array).
(arrange in order)
void double_array(int *arr, int len) {
for (int i = 0; i < len; i++) {
arr[i] *= 2;
}
}
Distractors (not part of the answer):
int len = sizeof(arr) / sizeof(arr[0]);
void double_array(int arr[100]) {
The function must accept len as a parameter because sizeof(arr) would return 8 (pointer size) due to array decay. The distractor sizeof(arr) / sizeof(arr[0]) is the classic bug this step teaches. int arr[100] in a parameter is misleading — it’s still just a pointer.
4. [Interleaved: Revisit Step 4] After free(p), what state is the pointer p in (using the pointer lifecycle model)?
After free(p), the pointer is in the Dead state. It still holds the old memory address — free does NOT set it to NULL automatically. Any dereference of a dead pointer is undefined behavior (use-after-free). Best practice: immediately write p = NULL; after free(p);.
Power #8 — File I/O: Read and Write the World
Power Unlocked: Persistent Storage
Up until now, everything you’ve built vanishes when the program exits. This power changes that — you can read from and write to files on disk, making your programs interact with the real world. Config files, save games, log files, databases — it all starts here.
🎯 You will learn to
- Apply the open-use-close pattern (fopen → read/write → fclose) and check the NULL return on every fopen.
- Distinguish file modes ("r", "w", "a", "r+") and predict whether existing contents survive each one.
- Apply fprintf/fgets to write and read a file line-by-line, and explain why missing fclose causes silent data loss.
Files in C: Open, Use, Close
File I/O in C follows a simple pattern that mirrors how you use files in real life:
- Open the file with fopen() → get a FILE* handle
- Read or write using the handle
- Close the file with fclose()
FILE *fp = fopen("data.txt", "r"); // "r" = read mode
if (fp == NULL) {
perror("fopen failed"); // prints reason (e.g., file not found)
return 1;
}
// ... use fp ...
fclose(fp);
File Modes
| Mode | Meaning | If file doesn’t exist |
|---|---|---|
| "r" | Read only | Returns NULL (error) |
| "w" | Write (truncates existing content!) | Creates new file |
| "a" | Append (adds to end) | Creates new file |
| "r+" | Read and write | Returns NULL (error) |
Warning: "w" destroys existing file contents. Use "a" to append.
Predict: What happens here?
Before reading further, predict what this code does:
FILE *fp = fopen("important_data.txt", "w");
fclose(fp);
Does important_data.txt still have its original contents? (Answer: No — "w" truncated it to zero bytes. This two-line program just erased the file’s contents.)
Reading and Writing Functions
| Function | Purpose | How it relates to printf/scanf |
|---|---|---|
| fprintf(fp, fmt, ...) | Write formatted text to file | printf → stdout; fprintf → file |
| fscanf(fp, fmt, ...) | Read formatted input from file | scanf → stdin; fscanf → file |
| fgets(buf, n, fp) | Read a line (safe, with limit) | Same call reads the console when fp is stdin |
| feof(fp) | Check if end-of-file reached | Returns non-zero once a read has hit EOF |
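Notice the pattern: printf and scanf have file-based counterparts. Just add f and pass the FILE* as the first argument. fgets is the odd one out: it already takes the FILE* as its last argument, so the same function reads from a file or from stdin.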
✏️ Predict: how do you know you’ve reached end-of-file?
You’re about to write a loop that reads every line from a file. The natural way to write it in many languages is while (not at EOF) { read line; process line; }. Most C tutorials warn against the equivalent while (!feof(fp)) — but why?
Suppose data.txt contains exactly two lines:
hello
world
And you write:
while (!feof(fp)) {
fgets(line, sizeof(line), fp);
printf("got: %s", line);
}
How many lines does the loop print? Pick one — commit before scrolling:
- (a) 2: feof becomes true exactly when we’ve consumed both lines.
- (b) 3: the last iteration prints world twice because feof doesn’t trip until after a failing read.
- (c) Infinite loop: feof is only set by fseek, never by fgets.
- (d) 0: feof returns true on the first iteration because the file is opened with the cursor past the end.
⚠️ Open after you've committed
The answer is (b). feof returns true only after a read function has tried to read past the end and failed. The loop:
- Reads “hello\n”, feof is still false → prints got: hello.
- Reads “world\n”, feof is still false (we haven’t tried to read past EOF yet) → prints got: world.
- feof is still false! Re-enters loop. fgets fails (returns NULL), but line still contains “world\n” from the previous read. Prints got: world again.
- Now feof is true → exits.
The fix that this tutorial’s code uses: while (fgets(line, sizeof(line), fp) != NULL). fgets returns NULL exactly when there’s nothing more to read — no off-by-one, no stale buffer. Rule: drive the loop by the read function’s return value, not by feof.
The Resource Management Pattern
C has no RAII (like C++ destructors) and no with statement (like Python). You must manually close every file you open. Forgetting fclose() can cause:
- Data loss (buffered writes not flushed to disk)
- File descriptor leaks (the OS limits how many files a process can have open)
Task: Save and load a playlist
Complete fileio_lab.c to:
- Write a playlist of songs to a file using fprintf.
- Read the file back line by line using fgets.
- Count the total number of tracks and print the result.
- Properly close all files.
gcc -Wall -std=c11 fileio_lab.c -o fileio_lab
./fileio_lab
#include <stdio.h>
#include <string.h>
int main(void) {
    // === PART 1: Save the playlist ===
    // TODO: Open "playlist.txt" for writing ("w" mode)
    // TODO: Check if fopen returned NULL (use perror for error message)
    const char *songs[] = {"Bohemian Rhapsody", "Blinding Lights", "Levitating",
                           "Anti-Hero", "Bad Guy", "Cruel Summer"};
    int num_songs = sizeof(songs) / sizeof(songs[0]);
    // TODO: Write each song on its own line using fprintf
    // TODO: Close the file
    printf("Saved %d tracks to playlist.txt\n", num_songs);

    // === PART 2: Load the playlist back ===
    // TODO: Open "playlist.txt" for reading ("r" mode)
    // TODO: Check if fopen returned NULL
    char line[100];
    int track_count = 0;
    // TODO: Read lines with fgets until it returns NULL (EOF)
    // TODO: Increment track_count for each line
    // TODO: Close the file
    printf("Loaded %d tracks from playlist.txt\n", track_count);
    return 0;
}
Step 10 — Knowledge Check
Min. score: 80%
1. What happens if you open an existing file with fopen("data.txt", "w")?
The "w" mode truncates the file to zero length before writing. This is a common source of data loss. If you want to add to an existing file, use "a" (append mode) instead.
2. What does fgets(buf, 100, fp) return when it reaches the end of the file?
fgets returns NULL when there is nothing more to read (end-of-file or error). This is why the standard reading loop is while (fgets(buf, size, fp) != NULL). Note: EOF is used with character-level functions like fgetc, not with fgets.
3. Why is it important to call fclose() on every file you open?
C I/O is buffered — fprintf writes to an in-memory buffer, not directly to disk. fclose flushes this buffer. Without it, the last writes may never reach the file. Additionally, each open file uses a file descriptor, and the OS limits how many a process can hold.
4. Arrange the lines to safely read all lines from a file and print them with line numbers. (arrange in order)
FILE *fp = fopen("input.txt", "r");
if (fp == NULL) { perror("open"); return 1; }
char buf[256];
int n = 1;
while (fgets(buf, sizeof(buf), fp) != NULL) {
    printf("%d: %s", n++, buf);
}
fclose(fp);
Distractor lines (not part of the answer):
while (!feof(fp)) {
fp.close();
Open the file, check for NULL, declare buffer and counter, loop with fgets (which returns NULL at EOF), print each line with its number, then close. The distractor while (!feof(fp)) is a classic C bug — feof only returns true after a read fails, causing the last line to be processed twice. fp.close() is C++/Java syntax — C uses fclose(fp).
5. [Interleaved: Revisit Step 3] How is fprintf(fp, "%s\n", word) related to printf("%s\n", word)?
In fact, printf(...) is essentially fprintf(stdout, ...). The C standard I/O library uses the same formatting engine for both. stdout, stdin, and stderr are all FILE* pointers — they’re just pre-opened for you.
Final Boss — A Linked List in C
The Final Boss Fight
Every origin story ends with a boss battle. This is yours.
You’ll combine every power you’ve unlocked — structs, pointers, malloc, free, printf, and scanf — to build a singly linked list from scratch. The starter file gives you the function signatures (node_create, list_print, list_free) and a working main() that drives them. The bodies are empty — that’s your fight. No TODO comments naming the lines. No partial implementations to nudge you. Just the contract and the compiler.
This is supposed to be hard. If you get stuck, that doesn’t mean you’re not cut out for C — it means you’re fighting the boss, not the tutorial. Go back and re-read the specific step that covers the concept you’re struggling with. Every power you need is already in your toolkit. The challenge is wielding them all at once.
🎯 You will learn to
- Create a singly-linked list end-to-end: define the recursive Node struct, allocate nodes with malloc, traverse, and free every node without leaks.
- Apply head and tail pointers to insert at the tail in O(1).
- Analyze a 3-node trace by hand before writing code, predicting malloc/free counts and the loop-termination condition.
⚠️ Negative-transfer trap: in C++ you’d just #include <list>
In C++ you’d reach for std::list<int> (doubly-linked) or std::forward_list<int> (singly-linked) and the standard library would handle every memory bug for you — push_back, pop_front, the destructor, the works. The C standard library has none of that. No list.h, no built-in container. Every linked-list operation in C is hand-rolled — you write the struct, the malloc, the traversal, the free, and the bug fixes when one of those goes sideways. That’s why this is the capstone: it’s the moment the C++ training wheels come off.
Why linked lists are the ultimate pointer test: When researchers tracked real student code, three categories of pointer errors accounted for nearly all bugs:
| Error Category | % of Students Who Make It |
|---|---|
| Memory leak (pointer leaves scope without free) | 74% |
| Dereferencing a dead pointer (use-after-free) | 70% |
| Dereferencing a null pointer | 57% |
Building a linked list exercises all three. Pay special attention to freeing nodes and checking for NULL.
Requirements
Your program should:
- Read an integer n from stdin (how many values to insert).
- Read n integers and insert each into a linked list.
- Print the list (space-separated values, then a newline).
- Free all memory — every node must be deallocated.
The Node Struct
typedef struct Node {
int value;
struct Node *next;
} Node;
Note: For recursive (self-referencing) structs, you must name the struct (struct Node) and use struct Node *next inside — because Node (the typedef) isn’t defined yet at that point.
✏️ Predict warm-up — trace 3 nodes by hand before you compile
Before you write a single line of node_create, work through this on paper. The point is to load the data structure into your head so you’re coding from a model, not flailing.
Imagine the user answers the Enter count: prompt with 3, then enters the values 10, 20, 30. After all three insertions, draw:
- Three boxes, one per node, each labeled with value and next.
- Arrows for every next pointer (where does node 1’s next point? Node 3’s?).
- Two outside arrows: one labeled head, one labeled tail. Where do they point?
Now answer (commit to a number):
- How many malloc(sizeof(Node)) calls happen total?
- How many free(...) calls must happen during cleanup?
- In list_free, the curr pointer takes how many distinct values during the walk? (Hint: it visits every node exactly once, plus one terminal value.)
- When list_print prints node 3, what does curr->next equal? What stops the loop?
Once you have these numbers, then start coding node_create / list_print / list_free. The implementation almost writes itself once the picture is clear. Without the picture, every implementation move is guesswork — and guesswork is why 70% of students hit use-after-free.
Example Run
Enter count: 4
Enter value: 10
Enter value: 20
Enter value: 30
Enter value: 40
List: 10 20 30 40
Hints
- To insert at the tail, track a tail pointer.
- malloc(sizeof(Node)) allocates one node.
- Set new_node->next = NULL for the last node.
- To free the list, walk through and free each node, but save next before calling free!
gcc -Wall -std=c11 linked_list.c -o linked_list
echo "4 10 20 30 40" | ./linked_list
🔬 Boss-level verification: run it under AddressSanitizer
You met AddressSanitizer in step 4 as the X-ray vision for memory bugs. The boss fight is exactly where to use it: linked-list code is the densest source of leaks, double-frees, and use-after-frees in real C programs. Once your basic version passes the tests, recompile with the sanitizer and run again:
gcc -Wall -std=c11 -g -fsanitize=address linked_list.c -o linked_list
echo "4 10 20 30 40" | ./linked_list
A correct implementation produces no extra output. If you see a wall of red text — congratulations, you’ve just found a real bug, with the offending line number underlined. Common things AddressSanitizer catches at this step:
- Memory leak: you forgot to free (or freed only the head node, not the rest of the list).
- Use-after-free: you read curr->next after free(curr). This is the classic trap from the hints above.
- Heap-buffer-overflow: you wrote past malloc'd memory (rare for nodes; more likely if you allocate n ints and write n+1).
Pass under both gcc-with-warnings and AddressSanitizer and you’ve cleared the boss fight properly. In real C code review, “it passes the tests” without “it passes the sanitizer” is not enough.
#include <stdio.h>
#include <stdlib.h>
typedef struct Node {
int value;
struct Node *next;
} Node;
Node *node_create(int value) {
    return NULL;
}

void list_print(const Node *head) {
}

void list_free(Node *head) {
}

int main(void) {
    int n;
    printf("Enter count: ");
    // Reject non-numeric input instead of using n uninitialized.
    if (scanf("%d", &n) != 1) {
        fprintf(stderr, "invalid count\n");
        return 1;
    }

    Node *head = NULL;
    Node *tail = NULL;
    for (int i = 0; i < n; i++) {
        int val;
        printf("Enter value: ");
        if (scanf("%d", &val) != 1) {
            fprintf(stderr, "invalid value\n");
            list_free(head);
            return 1;
        }
        Node *new_node = node_create(val);
        if (new_node == NULL) {
            fprintf(stderr, "malloc failed\n");
            list_free(head);
            return 1;
        }
        // Tail insertion: the first node becomes both head and tail.
        if (head == NULL) {
            head = new_node;
            tail = new_node;
        } else {
            tail->next = new_node;
            tail = new_node;
        }
    }

    printf("List: ");
    list_print(head);
    list_free(head);
    return 0;
}
Step 11 — Knowledge Check
Min. score: 80%
1. Why must you save curr->next BEFORE calling free(curr) in list_free?
After free(curr), the memory is returned to the allocator. Any access to curr->next is undefined behavior — the allocator may have already overwritten that memory, or the page may be unmapped. Always save what you need before freeing.
2. In typedef struct Node { ... struct Node *next; } Node;, why do we need both the struct tag Node and the typedef name Node?
Inside the struct definition, the typedef Node doesn’t exist yet — it’s defined at the closing brace. So self-referential structs must use the tag name struct Node. The typedef Node only becomes available after the full definition is complete.
3. Arrange the lines to free a linked list without leaking memory or causing use-after-free. (arrange in order)
Node *curr = head;
while (curr != NULL) {
    Node *next = curr->next;
    free(curr);
    curr = next;
}
Distractor lines (not part of the answer):
curr = curr->next;
free(next);
Save curr->next into a temp variable BEFORE freeing curr. Then advance to the saved next. The distractor curr = curr->next after free(curr) is a use-after-free bug — the most common mistake. free(next) would free the wrong node.
4. Arrange the lines to create a node, insert it at the tail of a linked list, and update the tail pointer. (arrange in order)
Node *new_node = malloc(sizeof(Node));
new_node->value = val;
new_node->next = NULL;
tail->next = new_node;
tail = new_node;
Distractor lines (not part of the answer):
new_node->next = tail;
head = new_node;
Allocate a new node, set its value, set its next to NULL (it’s the new tail). Link it to the current tail with tail->next = new_node, then update the tail pointer. new_node->next = tail would create a circular reference (wrong direction). head = new_node would lose the rest of the list.
5. [Evaluate] Your main() keeps both a head and a tail pointer. A teammate proposes simplifying it to only a head pointer — every insertion would walk to the end of the list before linking the new node. For a list of N existing nodes, what’s the cost of inserting one new node at the tail under each design?
With a tail pointer, each tail-insert is two pointer assignments — O(1). Without one, you walk the entire list to find the tail before linking — O(N) per insert, O(N²) for building a list of N nodes. This is the same cost analysis behind C++’s std::list (which also stores both endpoints) and Python’s collections.deque (doubly-linked, both ends O(1)).
6. [Final Integration] Which of the following C features have you used in the linked list program? (select all that apply)
The linked list integrates everything: structs (Node), malloc/free (allocation/cleanup), pointers (traversal, next-links, pass-by-reference), and printf/scanf (I/O). If you got this right, you just used every power in the toolkit at once. Boss defeated. Origin story complete. You’re a C programmer now.