Architectural styles describe the dominant shape of a system: pipe-and-filter, layered, publish-subscribe, client-server, and so on. Architectural tactics are smaller design moves that an architect uses to improve one quality attribute inside that larger shape.
Think of tactics as the architect’s quality-attribute toolbox. A style says, “organize this subsystem as independent filters connected by pipes.” A tactic says, “add a watchdog and timeout so failed components are detected quickly,” or “add a cache so repeated requests avoid expensive reacquisition.”
Tactics are useful because they make quality attributes concrete. Instead of saying “make it available,” the architect can ask: What failure do we need to detect? How quickly? What recovery action happens after detection? What performance cost are we willing to pay for that detection?
Tactics vs. Styles
| Concept | Scope | Example | Main question |
| --- | --- | --- | --- |
| Architectural style | Shapes the gross structure of a subsystem or whole system | publish-subscribe, layered, pipe-and-filter | What element types, connector types, and constraints dominate this design? |
| Architectural tactic | Improves a target quality attribute through a reusable design move | heartbeat, ping-echo, caching, redundancy | Which quality scenario improves, and what qualities does the tactic trade away? |
A system usually combines both. A robot might use publish-subscribe as its communication style, then apply heartbeat to detect failed components and caching to avoid repeatedly recomputing expensive map data.
Availability Tactics
Availability is the ability of a system to mask, detect, repair, or recover from faults. Many availability tactics start with the same problem: before a system can recover from a failed component, it has to notice the failure.
Ping-Echo
Goal: detect that a component, process, node, or service has stopped responding before the fault escalates into a visible failure.
Solution: a watchdog periodically sends an asynchronous request, the ping, to each monitored component. A healthy component replies with an echo. If the watchdog does not receive the echo before a timeout, it activates a recovery mechanism, such as restarting the component, routing around it, or starting a replacement instance.
Quality impact:
Promotes availability: the system can detect failed components and trigger recovery.
Inhibits performance: pings and echoes consume network bandwidth, processing cycles, and queue capacity.
Simplifies monitored components: most of the logic lives in the watchdog; a monitored component only needs to answer the ping.
Ping-echo is a good fit when the watchdog controls the monitoring schedule and when the extra request-response traffic is acceptable.
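A minimal sketch of the ping-echo idea, assuming each monitored component is represented by a hypothetical `ping` callable that returns `True` on a timely echo and raises `TimeoutError` when no echo arrives. The class and component names are illustrative only; a real watchdog would send pings over a network or IPC channel and run on a schedule.

```python
from typing import Callable, Dict, List


class PingEchoWatchdog:
    """Watchdog that pings each monitored component and expects an echo."""

    def __init__(self, recover: Callable[[str], None]) -> None:
        # name -> ping callable (hypothetical stand-in for a real request)
        self._components: Dict[str, Callable[[], bool]] = {}
        self._recover = recover

    def register(self, name: str, ping: Callable[[], bool]) -> None:
        self._components[name] = ping

    def check_once(self) -> List[str]:
        """Ping every component once and recover those that do not echo."""
        failed: List[str] = []
        for name, ping in self._components.items():
            try:
                alive = ping()
            except TimeoutError:
                alive = False  # a missing echo is treated as a failure
            if not alive:
                failed.append(name)
                self._recover(name)
        return failed


def healthy_ping() -> bool:
    return True  # the component echoed in time


def hung_ping() -> bool:
    raise TimeoutError("no echo before timeout")


restarted: List[str] = []
watchdog = PingEchoWatchdog(recover=restarted.append)
watchdog.register("planner", healthy_ping)
watchdog.register("perception", hung_ping)
failed = watchdog.check_once()
```

Note that the monitored side only has to answer the ping; the timeout handling and the recovery decision both live in the watchdog.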
Heartbeat
Goal: detect that a component, process, node, or service has stopped working.
Solution: each monitored component periodically sends a heartbeat message to a watchdog. If the watchdog does not receive a heartbeat before a timeout, it activates recovery.
Quality impact:
Promotes availability: the system can infer failure from silence.
Inhibits performance: heartbeat messages consume resources, though heartbeat usually needs fewer messages than ping-echo because there is no request-response pair.
Complicates monitored components: every monitored component needs a heartbeat routine and must keep sending heartbeats even while doing its normal work.
Heartbeat is a good fit when monitored components already have their own control loop, or when reducing monitoring traffic matters more than keeping monitored components simple.
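A minimal sketch of the watchdog side of heartbeat, assuming components call a hypothetical `beat` method and the monitor infers failure from silence. The injected `clock` is an assumption made so the example is deterministic; in practice something like `time.monotonic` could be passed instead.

```python
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Records heartbeats and reports components that have gone silent."""

    def __init__(self, timeout_s: float, clock: Callable[[], float]) -> None:
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def beat(self, name: str) -> None:
        """Called whenever a heartbeat from `name` arrives."""
        self._last_seen[name] = self._clock()

    def silent_components(self) -> List[str]:
        """Components whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(
            name
            for name, seen in self._last_seen.items()
            if now - seen > self._timeout_s
        )


# Simulated time so the example does not need to sleep.
now = [0.0]
monitor = HeartbeatMonitor(timeout_s=2.0, clock=lambda: now[0])
monitor.beat("planner")
monitor.beat("perception")
now[0] = 1.5
monitor.beat("planner")  # "perception" sends nothing further
now[0] = 3.0
silent = monitor.silent_components()
```

Here the monitor stays passive, while each component carries the responsibility of calling `beat` on schedule alongside its normal work.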
Ping-Echo vs. Heartbeat
| Tactic | Who initiates the message? | Message pattern | Main benefit | Main cost |
| --- | --- | --- | --- | --- |
| Ping-echo | Watchdog | watchdog ping, component echo | simple monitored components | more messages and centralized monitoring work |
| Heartbeat | Monitored component | component heartbeat | fewer messages and easy passive monitoring | heartbeat logic inside every monitored component |
Both tactics need carefully chosen timeout values. A timeout that is too short creates false positives and unnecessary recovery. A timeout that is too long lets failures remain hidden.
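The traffic and timeout trade-offs can be estimated with back-of-the-envelope arithmetic. The sketch below is illustrative only: the function names, the 100-component fleet, and the timing values are hypothetical, and the detection bound assumes a simple ping-echo model that ignores network delay and retries.

```python
def monitoring_messages_per_second(
    components: int, interval_s: float, messages_per_check: int
) -> float:
    """Steady-state monitoring traffic for a fleet of components."""
    return components * messages_per_check / interval_s


def ping_echo_worst_case_detection_s(interval_s: float, timeout_s: float) -> float:
    """Worst case under the assumed model: the component fails just after
    echoing, the next ping goes out interval_s later, and the watchdog then
    waits timeout_s for the echo that never arrives."""
    return interval_s + timeout_s


# Ping-echo sends two messages per check (ping + echo); heartbeat sends one.
ping_echo_rate = monitoring_messages_per_second(100, 2.0, messages_per_check=2)
heartbeat_rate = monitoring_messages_per_second(100, 2.0, messages_per_check=1)

# A 2-second interval with a 0.5-second timeout cannot meet a 2-second
# detection target in the worst case under this model.
detection_s = ping_echo_worst_case_detection_s(interval_s=2.0, timeout_s=0.5)
meets_two_second_target = detection_s <= 2.0
```

Shortening the interval improves the detection bound but raises the message rate, which is the availability-versus-performance trade-off in numeric form.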
Redundancy
Redundancy improves availability by ensuring that another component can take over when one component fails.
Active redundancy: multiple replicas run at the same time. If one fails, another already-running replica can continue service quickly. This improves recovery time but costs more CPU, memory, and coordination.
Cold spare: a backup component is available but not running the workload until failure occurs. This saves resources but recovery is slower because the spare must be started, warmed up, or synchronized.
Redundancy is rarely enough on its own. The system still needs detection, failover, state synchronization, and tests that prove the recovery path actually works.
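The difference between active redundancy and a cold spare can be sketched as a failover choice. Everything here is a simplified assumption: the `Replica` fields, the `startup_s` figure, and the `fail_over` policy are hypothetical, and setting `running = True` merely stands in for starting, warming, and synchronizing a spare.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    running: bool            # True for an active replica, False for a cold spare
    healthy: bool = True
    startup_s: float = 0.0   # assumed time to start and warm a stopped replica


def fail_over(replicas: List[Replica]) -> Optional[Replica]:
    """Pick a healthy replica, preferring ones that are already running."""
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        return None
    # Running replicas sort first (False < True), then the shortest startup.
    chosen = min(candidates, key=lambda r: (not r.running, r.startup_s))
    if not chosen.running:
        chosen.running = True  # placeholder for starting and warming the spare
    return chosen


cold_pool = [
    Replica("primary", running=True, healthy=False),
    Replica("cold-spare", running=False, startup_s=4.0),
]
cold_takeover = fail_over(cold_pool)

active_pool = [
    Replica("primary", running=True, healthy=False),
    Replica("active-replica", running=True),
    Replica("cold-spare", running=False, startup_s=4.0),
]
active_takeover = fail_over(active_pool)
```

The sketch deliberately omits failure detection and state synchronization, which a real failover path would still need.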
Performance Tactic: Caching
Goal: avoid expensive reacquisition or recomputation of a resource.
Solution: store a local copy of a resource in a fast-access cache. When a later request asks for the same resource, the system serves the cached copy instead of asking the slower provider again.
Quality impact:
Promotes performance: repeated requests can avoid slow network calls, database reads, file-system access, or expensive computation.
May improve availability: cached data can sometimes let a system keep serving degraded responses when the source is temporarily unavailable.
Inhibits consistency and modifiability: the system now has to decide when cached data is stale, how invalidation works, and which components are responsible for cache correctness.
Consumes memory or storage: a cache trades space for time.
A good caching requirement names the scenario and the measure. “Use caching” is not a quality requirement. “When the product catalog receives repeated requests for the same item within a 10-minute window, at least 90% of those requests are served from cache and p95 response time stays below 100 ms” is a quality requirement that caching might satisfy.
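A minimal sketch of a time-to-live cache that also counts hits and misses, so a hit-rate measure like the one in the requirement above could be checked. The `TtlCache` class, the `load_item` loader, and the injected `clock` are hypothetical; expiry by age is only one possible staleness rule.

```python
from typing import Any, Callable, Dict, List, Tuple


class TtlCache:
    """Serve cached values until they are older than ttl_s seconds."""

    def __init__(
        self,
        load: Callable[[str], Any],
        ttl_s: float,
        clock: Callable[[], float],
    ) -> None:
        self._load = load
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl_s:
            self.hits += 1
            return entry[1]
        # Missing or stale: go back to the slower provider and refresh.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (now, value)
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


loads: List[str] = []


def load_item(key: str) -> Dict[str, str]:
    loads.append(key)  # records each expensive fetch
    return {"id": key}


now = [0.0]
cache = TtlCache(load_item, ttl_s=600.0, clock=lambda: now[0])
cache.get("sku-1")   # miss: first request
now[0] = 60.0
cache.get("sku-1")   # hit
now[0] = 120.0
cache.get("sku-1")   # hit
now[0] = 700.0
cache.get("sku-1")   # miss: entry is older than the 10-minute TTL
```

The TTL is where the consistency cost shows up: a longer value raises the hit rate but lets stale data be served for longer.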
Choosing a Tactic
Use tactics after the quality attribute scenario is specific enough to judge them. A practical sequence is:
1. State the quality scenario and measure.
2. Identify the failure, delay, change, or risk that blocks the measure.
3. Choose a tactic that directly addresses that blocker.
4. Name the qualities the tactic will likely inhibit.
5. Add observability so the team can verify the tactic works in production-like conditions.
For example, a team trying to improve availability might start with this scenario: “If one perception worker crashes while the robot is operating, the system detects the crash within 2 seconds and starts a replacement worker within 5 seconds.” Ping-echo, heartbeat, or process supervision could all be candidate tactics. The right choice depends on the runtime style, the acceptable monitoring traffic, and how much logic the team wants inside each worker.
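One way to make the scenario above checkable is to record timestamps for each recovery and compare them with the stated limits. This is a sketch under assumptions: the `RecoveryObservation` type and `meets_scenario` function are hypothetical, and both limits are measured from the moment of the crash, which is one possible reading of the scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryObservation:
    crashed_at_s: float    # when the worker actually crashed
    detected_at_s: float   # when monitoring flagged the crash
    replaced_at_s: float   # when the replacement worker was started


def meets_scenario(
    obs: RecoveryObservation,
    detect_within_s: float = 2.0,
    replace_within_s: float = 5.0,
) -> bool:
    """Check one observed recovery against the scenario's two measures."""
    detection_s = obs.detected_at_s - obs.crashed_at_s
    replacement_s = obs.replaced_at_s - obs.crashed_at_s  # assumed: from crash
    return detection_s <= detect_within_s and replacement_s <= replace_within_s


fast = RecoveryObservation(crashed_at_s=10.0, detected_at_s=11.5, replaced_at_s=14.0)
slow_detection = RecoveryObservation(
    crashed_at_s=10.0, detected_at_s=12.5, replaced_at_s=14.0
)
fast_ok = meets_scenario(fast)
slow_ok = meets_scenario(slow_detection)
```

Whichever tactic the team picks, a check like this only works if detection and recovery times are actually logged.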
Tactics do not remove trade-offs. They make trade-offs inspectable.
Architectural Tactics Quiz and Flashcards
Use these flashcards and quiz questions to practice distinguishing tactics from styles, matching tactics to quality scenarios, and naming the costs of ping-echo, heartbeat, redundancy, and caching.
Architectural Tactics Flashcards
Availability and performance tactics, including ping-echo, heartbeat, redundancy, and caching.
Difficulty: Basic
What is an architectural tactic?
A reusable design move that helps achieve a specific quality attribute, such as availability, performance, testability, or modifiability.
Architectural styles shape the dominant structure of a system. Tactics are smaller moves inside that structure: heartbeat for availability, caching for performance, dependency injection for testability.
Difficulty: Basic
How does a tactic differ from an architectural style?
A style defines the gross structure: element types, connector types, and constraints. A tactic improves one quality scenario inside that structure.
Publish-subscribe is a style. Heartbeat is a tactic. A pub-sub robot can still use heartbeat to detect failed components.
Difficulty: Basic
Describe the ping-echo availability tactic.
A watchdog sends a ping to monitored components; healthy components reply with an echo. If the watchdog does not receive an echo before a timeout, it triggers recovery.
Ping-echo centralizes monitoring logic in the watchdog, but it creates request-response monitoring traffic.
Difficulty: Basic
Describe the heartbeat availability tactic.
Each monitored component periodically sends a heartbeat message to a watchdog. If no heartbeat arrives before the timeout expires, the watchdog infers failure and triggers recovery.
Heartbeat often uses fewer messages than ping-echo, but every monitored component must implement heartbeat behavior.
Difficulty: Intermediate
Compare ping-echo and heartbeat.
Ping-echo: watchdog initiates monitoring; simpler monitored components; more messages. Heartbeat: monitored components initiate monitoring messages; fewer messages; more logic inside each monitored component.
Both improve availability by detecting faults before they become visible failures. Both inhibit performance because monitoring consumes bandwidth, processing cycles, and queue capacity.
Difficulty: Intermediate
Why do timeout values matter in ping-echo and heartbeat tactics?
A timeout that is too short causes false failure detections and unnecessary recovery. A timeout that is too long lets real failures remain hidden.
Timeout selection is part of the architecture, not an implementation afterthought. It directly shapes availability, performance, and operational noise.
Difficulty: Basic
Distinguish active redundancy and cold spare.
Active redundancy: multiple replicas run at the same time so another can take over quickly. Cold spare: a backup exists but is inactive until failure, saving resources but increasing recovery time.
Active redundancy improves recovery time at higher runtime cost. Cold spares lower steady-state cost but require startup, warm-up, or synchronization during recovery.
Difficulty: Basic
Describe the caching performance tactic.
A system stores a fast local copy of a resource so later requests can avoid expensive retrieval or recomputation.
Caching trades space and consistency complexity for lower latency or higher throughput.
Difficulty: Intermediate
What quality attributes can caching inhibit?
Caching can inhibit consistency and modifiability because the system must define cache invalidation, stale-data rules, ownership, and coherence across components.
Caching is not just a performance win. It creates a second place where data can live, so correctness now depends on keeping cached data fresh enough for the scenario.
Difficulty: Advanced
What sequence should an architect follow when choosing a tactic?
State the quality scenario and measure, identify the blocker, choose a tactic that addresses it, name inhibited qualities, and add observability to verify the tactic works.
Tactics should be selected because they improve a specific scenario, not because they are popular or familiar.
Architectural Tactics Quiz
Apply availability and performance tactics to concrete quality-attribute scenarios.
Difficulty: Basic
Which statement best distinguishes an architectural tactic from an architectural style?
The labels are swapped. Styles describe the gross structure (publish-subscribe, layered, pipe-and-filter), and tactics are the smaller quality-attribute moves applied inside that structure.
Tactics are not tied to object-oriented programming. Heartbeat, caching, and redundancy appear in many paradigms and runtimes.
Both styles and tactics can affect many qualities. Caching is a performance tactic and dependency injection is a testability tactic, so there is no fixed performance-versus-maintainability split.
Correct Answer:
Explanation
Styles are structural constraints at architectural scale; tactics are reusable quality-attribute moves applied inside a design. A publish-subscribe system can still use heartbeat, redundancy, and caching.
Difficulty: Basic
A watchdog sends a request every 2 seconds to each worker. Each healthy worker replies immediately. If no reply arrives before timeout, the watchdog restarts the worker. Which tactic is this?
In heartbeat, the monitored component initiates periodic messages. Here the watchdog initiates the check and expects a reply.
A cold spare is a backup component waiting to be activated after failure. It does not describe the failure-detection message pattern.
Caching stores resources to avoid expensive reacquisition. It is unrelated to liveness checks.
Correct Answer:
Explanation
Ping-echo has the watchdog initiate the check. The monitored component only needs to answer the ping; missing echoes trigger recovery.
Difficulty: Basic
Each worker sends an “alive” message to a monitor every 5 seconds. If the monitor stops receiving messages from one worker, it replaces that worker. Which tactic is this, and what is one cost?
Ping-echo is watchdog-initiated. The stem says each worker initiates the periodic “alive” message, so the workers are not passive responders.
Cold spare describes the recovery resource (a standby kept stopped until needed), not how the monitor detects that a worker has failed.
Active redundancy is about running multiple replicas simultaneously so failover is fast. It does not describe the periodic liveness signal in the stem.
Correct Answer:
Explanation
Heartbeat shifts the periodic message to the monitored component. It can use fewer messages than ping-echo, but it complicates each monitored component and still consumes network and processing resources.
Difficulty: Intermediate
A team is choosing between ping-echo and heartbeat for 10,000 IoT devices on a low-bandwidth network. Which trade-offs should they consider? Select all that apply.
Ping-echo’s two-message-per-check pattern is exactly what matters at 10,000 devices on a low-bandwidth network — easy to overlook when comparing tactics on a whiteboard.
Heartbeat saves the ping side of the exchange, but the device firmware now owns periodic liveness behavior — this is a real cost to weigh, not a free lunch.
Heartbeat still needs timeouts. The monitor infers failure from silence, but only after a threshold elapses — without one, the monitor could never declare a device dead.
Monitoring is not free under either tactic. Even tiny liveness messages add up at scale and compete with real workload traffic.
Both are availability tactics — both detect failed components so recovery can run. Both also inhibit performance as a cost. There is no clean split where one is a performance tactic and the other an availability tactic.
Correct Answers:
Explanation
The useful comparison is who sends messages, how many messages exist, and where complexity lives. Both tactics improve availability by detecting faults, and both charge a performance cost for monitoring.
Difficulty: Basic
A checkout service keeps a standby payment worker stopped until the active worker fails. On failure, the standby is started and warmed up. Which redundancy tactic is this?
Active redundancy keeps multiple replicas running at the same time so another can take over quickly. The stem says the standby is stopped until failure, which is the opposite end of the redundancy trade-off.
Ping-echo is a detection tactic — it tells the system that the active worker has failed. The question asks about the recovery resource the system has waiting after detection.
Caching stores resources to avoid expensive reacquisition. It does not describe whether a backup worker is already running or kept stopped until needed.
Correct Answer:
Explanation
Cold spare saves steady-state resources but increases recovery time. The system must start, warm, and synchronize the spare after detecting failure.
Difficulty: Intermediate
A product catalog receives repeated requests for the same item. A cache serves 92% of repeat requests and keeps p95 latency below 100 ms. Which quality attribute does the tactic primarily improve, and what risk did it introduce?
Caching can sometimes help degraded availability, but the scenario’s measure is latency. Dependency cycles are not the cache-specific risk.
Caching may affect tests, but the scenario is explicitly about latency. Lower bandwidth is usually a benefit, not the central risk.
Portability is about moving across environments. CPU scheduling is not the relevant cache trade-off.
Correct Answer:
Explanation
Caching primarily improves performance by avoiding expensive reacquisition. The architectural cost is deciding when cached data is fresh enough and how invalidation works.
Difficulty: Intermediate
A team says, “We should add caching.” What is the best architectural response?
Caching can slow systems down or break semantics if hit rates are low or invalidation is hard.
Caching is not inherently wrong. It is wrong when the consistency cost exceeds the performance benefit for the scenario.
Pipe-and-filter is a style choice and is unrelated to whether a repeated resource should be cached.
Correct Answer:
Explanation
Tactics should be tied to scenarios — what repeated resource, under what load, with what hit-rate/latency target and stale-data tolerance. A cache is justified when the measured performance gain is worth the memory, invalidation, and stale-data costs.
Difficulty: Advanced
A quality scenario says: “If one perception worker crashes while the robot is operating, the system detects the crash within 2 seconds and starts a replacement worker within 5 seconds.” Which architectural elements or tactics are likely relevant? Select all that apply.
The scenario’s first half is fault detection within 2 seconds, exactly what heartbeat or ping-echo addresses.
Starting a replacement worker requires recovery capacity, commonly redundancy or supervision.
Old heartbeat messages would hide failure. Liveness must be current.
If the team cannot observe detection and recovery times, it cannot verify the quality scenario.
Layer bridging is a layered-style performance trade-off, not a recovery tactic.
Correct Answers:
Explanation
Availability tactics often compose. Detection, recovery capacity, and observability all have to work together for the quality scenario to be satisfied.