Understanding Circuit Breakers in Software Engineering: From Traditional to Serverless

Friday, March 14, 2025 at 10:46:27 AM GMT+8

What Is a Circuit Breaker?

Imagine you’re using electricity at home, and a short circuit occurs. The circuit breaker in your electrical panel cuts the power to prevent a fire. In software, the concept is similar: a circuit breaker is a design pattern that protects your system from repeated failures when calling external services (APIs, databases, etc.).

Main Purposes:

Detect Failures

The first job of a circuit breaker is to act as a vigilant watchdog, constantly monitoring interactions between your application and external services like APIs, databases, or third-party systems. It keeps an eye on every request, tracking whether each one succeeds or fails based on specific criteria, such as receiving an error code (e.g., HTTP 500), timing out after a set duration (e.g., no response within 2 seconds), or encountering exceptions like network disconnections.
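As a rough sketch of these failure criteria, the helper below counts a call as failed when it throws, when it takes longer than a deadline, or when it resolves with a 5xx status. The names `withTimeout` and `isFailure`, the 2-second default, and the `{ status }` response shape are assumptions made for illustration, not part of any library.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms = 2000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Clear the timer once either side settles so it does not linger.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Resolve to true when the call should count as a failure.
async function isFailure(service, ms = 2000) {
  try {
    const response = await withTimeout(service(), ms);
    // Assumed response shape: an object with a numeric `status` field.
    return Boolean(response && response.status >= 500);
  } catch {
    return true; // exceptions and timeouts both count as failures
  }
}
```

Which outcomes count as failures is a design choice; a 4xx response, for instance, usually signals a bad request rather than an unhealthy service, so this sketch does not count it.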

To do this effectively, the circuit breaker collects data over a defined window—perhaps the last 10 requests or the past 30 seconds—and calculates metrics like the total number of failures or the failure rate (e.g., 60% of calls failed). If these metrics cross a configurable threshold—say, five failures in a row or a 50% error rate—it recognizes that something’s wrong with the external service. This detection isn’t just about noticing a single hiccup; it’s about identifying patterns of unreliability that could harm your system if left unchecked. By catching these issues early, the circuit breaker ensures your application doesn’t blindly keep trying a service that’s clearly struggling.
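One way to picture the windowed metrics is a small counter that keeps only the most recent outcomes. This is a minimal sketch under assumed names (`FailureWindow`, `record`, `shouldTrip`) and assumed defaults (a window of 10 calls, a 50% threshold); real libraries often use time-based windows instead.

```javascript
class FailureWindow {
  constructor(size = 10, rateThreshold = 0.5) {
    this.size = size;                   // how many recent calls to remember
    this.rateThreshold = rateThreshold; // failure rate that trips the breaker
    this.outcomes = [];                 // true = failure, false = success
  }

  record(failed) {
    this.outcomes.push(Boolean(failed));
    if (this.outcomes.length > this.size) this.outcomes.shift(); // drop the oldest
  }

  failureRate() {
    if (this.outcomes.length === 0) return 0;
    const failures = this.outcomes.filter(Boolean).length;
    return failures / this.outcomes.length;
  }

  // Only trip once the window is full, so a single early error cannot open the breaker.
  shouldTrip() {
    return this.outcomes.length === this.size &&
      this.failureRate() >= this.rateThreshold;
  }
}

const windowStats = new FailureWindow(10, 0.5);
[false, true, true, false, true, true, false, true, true, false]
  .forEach((failed) => windowStats.record(failed));
console.log(windowStats.failureRate()); // 0.6
console.log(windowStats.shouldTrip());  // true
```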

Prevent Cascading Failures

Once a failure is detected, the circuit breaker steps in to stop a domino effect known as cascading failures, where one broken component drags down the entire system. Imagine an e-commerce app where the payment API is down: without a circuit breaker, every user request might hang while waiting for a timeout, tying up server resources, slowing the database, and eventually crashing the whole application.

In its Closed state, the circuit breaker allows calls to proceed, but as soon as failures hit the threshold, it flips to Open, cutting off all further attempts to contact the faulty service. This immediate halt prevents the problem from rippling through your system—your app stops wasting threads, memory, or CPU cycles on a hopeless task. Instead of letting a single point of failure—like a slow third-party API—overload your servers or exhaust connection pools, the circuit breaker isolates the issue, keeping the rest of your application stable and responsive. It’s like closing a floodgate to protect the town downstream from a burst dam.

Provide a Fallback Response

When the circuit breaker blocks calls in its Open state, it doesn’t just leave users hanging—it offers a fallback response to keep the system usable. This fallback is a preplanned alternative to the failed service’s output, designed to minimize disruption.

For example, if a weather API fails, the circuit breaker might return a cached forecast from an hour ago or a simple message like "Weather data unavailable, try again later." In a payment system, it could redirect users to an alternative checkout method or log the attempt for later retry. The fallback doesn’t fix the root problem, but it ensures graceful degradation.

Your application keeps running in a limited capacity rather than crashing or showing cryptic errors. Crafting a good fallback requires understanding your use case: it might be static data, a default value, or even a call to a backup service. By providing this safety net, the circuit breaker maintains user trust and buys time for the external service to recover without sacrificing functionality entirely.
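The cached-forecast idea can be sketched as a wrapper that remembers the last good result and serves it when the live call fails. The name `withCachedFallback`, the returned `{ source, value }` shape, and the default message are assumptions for illustration only.

```javascript
// Wrap `fetchFresh` so that failures fall back to the last good value,
// or to a static default when nothing has been cached yet.
function withCachedFallback(fetchFresh, defaultValue = "Weather data unavailable, try again later.") {
  let lastGood = null; // { value, at } of the most recent successful call

  return async function () {
    try {
      const value = await fetchFresh();
      lastGood = { value, at: Date.now() };
      return { source: "live", value };
    } catch {
      if (lastGood) {
        // Report the age so callers can decide whether stale data is acceptable.
        return { source: "cache", value: lastGood.value, ageMs: Date.now() - lastGood.at };
      }
      return { source: "default", value: defaultValue };
    }
  };
}
```

Returning the source alongside the value lets the UI label stale or default data honestly instead of presenting it as fresh.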

Circuit Breaker Pattern

Overall Mechanism

Closed: All calls are forwarded. If failures reach the threshold (e.g., 5), it switches to Open.
Open: Calls are blocked, and a fallback is used. After a set time (e.g., 30 seconds), it moves to Half-Open.
Half-Open: A test call is made. Success → Closed, Failure → Open.
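The three states and their transitions can also be written down as a small lookup table, which makes it easy to check that no transition is missing. The event names here are hypothetical labels chosen for this sketch.

```javascript
// Allowed transitions: TRANSITIONS[currentState][event] gives the next state.
const TRANSITIONS = {
  CLOSED: { failure_threshold_reached: "OPEN" },
  OPEN: { reset_timeout_elapsed: "HALF_OPEN" },
  HALF_OPEN: { trial_succeeded: "CLOSED", trial_failed: "OPEN" },
};

// Pure function: events that do not apply leave the state unchanged.
function nextState(state, event) {
  return TRANSITIONS[state]?.[event] ?? state;
}

// Walk through one full cycle.
let state = "CLOSED";
for (const event of ["failure_threshold_reached", "reset_timeout_elapsed", "trial_succeeded"]) {
  state = nextState(state, event);
}
console.log(state); // CLOSED
```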

Simple Code Example

Here’s a basic implementation in JavaScript:

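class CircuitBreaker {
  constructor(maxFailures = 5, resetTimeout = 30000) {
    this.state = "CLOSED";
    this.failureCount = 0;            // consecutive failures so far
    this.maxFailures = maxFailures;   // failures needed to open the circuit
    this.resetTimeout = resetTimeout; // ms to stay Open before a test call
    this.resetTime = 0;               // timestamp at which Open becomes Half-Open
  }

  async call(service) {
    if (this.state === "OPEN") {
      if (Date.now() >= this.resetTime) {
        this.state = "HALF_OPEN"; // let one test call through
      } else {
        return "Fallback: Service unavailable"; // fail fast without calling the service
      }
    }

    try {
      const result = await service();
      // Any success closes the circuit and clears the failure streak.
      this.state = "CLOSED";
      this.failureCount = 0;
      return result;
    } catch (error) {
      this.failureCount++;
      // A failed test call reopens immediately; otherwise open at the threshold.
      if (this.state === "HALF_OPEN" || this.failureCount >= this.maxFailures) {
        this.state = "OPEN";
        this.resetTime = Date.now() + this.resetTimeout;
      }
      return "Fallback: Service unavailable";
    }
  }
}

// Example usage: a service that fails about half the time
const breaker = new CircuitBreaker();
const fakeService = async () => {
  if (Math.random() > 0.5) return "Success";
  throw new Error("Error");
};
breaker.call(fakeService).then(console.log);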

Circuit Breakers in Serverless

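In a serverless environment (e.g., AWS Lambda), circuit breakers are still valuable, but the stateless nature of functions poses a challenge: in-memory state does not reliably survive between invocations or across concurrent instances. The breaker's state must therefore be stored externally, such as in DynamoDB.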

Example in AWS Lambda

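// AWS SDK for JavaScript v2. Node.js 18+ Lambda runtimes bundle only v3,
// so package v2 with the function or port these calls to v3.
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

const TABLE = "CircuitBreakerState";
const KEY = { Service: "ExternalAPI" };
const MAX_FAILURES = 5;      // assumed threshold, same as the class above
const RESET_TIMEOUT = 30000; // assumed: ms to stay Open before a test call

// Placeholder: replace with the real call to the external service.
async function callExternalAPI() {
  throw new Error("callExternalAPI is not implemented");
}

async function handler(event) {
  const { Item } = await dynamodb.get({ TableName: TABLE, Key: KEY }).promise();

  // Open and still cooling down: fail fast without calling the service.
  if (Item?.State === "OPEN" && Date.now() < Item.ResetTime) {
    return { statusCode: 503, body: "Service unavailable" };
  }
  // An Open state whose ResetTime has passed acts as Half-Open: this is the test call.
  const isTestCall = Item?.State === "OPEN";

  try {
    const response = await callExternalAPI();
    if (isTestCall || Item?.FailureCount > 0) {
      // Success: close the circuit and clear the failure count.
      await dynamodb.update({
        TableName: TABLE,
        Key: KEY,
        UpdateExpression: "SET #state = :closed, FailureCount = :zero",
        ExpressionAttributeNames: { "#state": "State" },
        ExpressionAttributeValues: { ":closed": "CLOSED", ":zero": 0 }
      }).promise();
    }
    return { statusCode: 200, body: response };
  } catch (error) {
    // Failure: increment the counter atomically and read back the new value.
    const { Attributes } = await dynamodb.update({
      TableName: TABLE,
      Key: KEY,
      UpdateExpression: "ADD FailureCount :one",
      ExpressionAttributeValues: { ":one": 1 },
      ReturnValues: "UPDATED_NEW"
    }).promise();
    // Open on a failed test call or once the threshold is reached.
    if (isTestCall || Attributes.FailureCount >= MAX_FAILURES) {
      await dynamodb.update({
        TableName: TABLE,
        Key: KEY,
        UpdateExpression: "SET #state = :open, ResetTime = :reset",
        ExpressionAttributeNames: { "#state": "State" },
        ExpressionAttributeValues: { ":open": "OPEN", ":reset": Date.now() + RESET_TIMEOUT }
      }).promise();
    }
    return { statusCode: 503, body: "Service unavailable" };
  }
}

exports.handler = handler;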
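To try the externalized-state idea without an AWS account, the same flow can be exercised against an in-memory stand-in for the table. This is only a sketch: `MemoryStore`, `callWithBreaker`, the item fields, and the injectable `now` clock are hypothetical names, and a real store would need conditional or atomic writes to stay correct when several function instances update the state at once.

```javascript
// In-memory stand-in for an external store such as a DynamoDB table.
class MemoryStore {
  constructor() { this.items = new Map(); }
  async get(name) {
    return this.items.get(name) ?? { state: "CLOSED", failures: 0, resetTime: 0 };
  }
  async put(name, item) { this.items.set(name, item); }
}

// Stateless breaker: every call loads the state, decides, and saves the result.
async function callWithBreaker(store, name, service,
  { maxFailures = 5, resetTimeout = 30000, now = Date.now } = {}) {
  const item = await store.get(name);
  // Open with an expired reset time means this call is the Half-Open test call.
  const isTestCall = item.state === "OPEN" && now() >= item.resetTime;

  if (item.state === "OPEN" && !isTestCall) {
    return { ok: false, reason: "open" }; // fail fast, service not called
  }

  try {
    const value = await service();
    await store.put(name, { state: "CLOSED", failures: 0, resetTime: 0 });
    return { ok: true, value };
  } catch {
    const failures = item.failures + 1;
    const open = isTestCall || failures >= maxFailures;
    await store.put(name, {
      state: open ? "OPEN" : "CLOSED",
      failures,
      resetTime: open ? now() + resetTimeout : 0,
    });
    return { ok: false, reason: "failed" };
  }
}
```

Passing the clock in as `now` keeps the function easy to test, because a test can move time forward instead of waiting for the reset timeout.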

Conclusion

Circuit breakers are a powerful pattern for building resilient systems, whether on traditional servers or in serverless environments. With the code above, I hope you’ve gained a clearer understanding of how they work.

🚀Darmawan