How a Simple `for` Loop Can Freeze Your Go Service

In backend engineering, the most dangerous code is often the simplest.

We recently encountered a classic scaling pitfall in one of our core services. The system required a background reconciliation loop — a process that periodically checks the health of millions of in-memory objects.

The solution seemed obvious: write a for loop to scan the map.

The result?

A service that periodically “froze” for hundreds of milliseconds, causing API timeouts and P99 latency spikes.

This is the story of how a harmless loop turned into a performance bottleneck — and the “Paced Monitor” pattern we used to fix it.

The Scenario: Managing State in RAM

Imagine you're building:

A session manager
A job scheduler
A real-time game server
A high-performance control plane

To keep things fast, you store your objects (let’s call them Entities) in memory.

You need a background worker that:

Checks for timeouts
Repairs stale state
Cleans up inconsistent objects

Sounds simple.

The Naive Implementation

The standard thread-safe pattern in Go:

func (m *Manager) RunHealthCheck() {
    // 🔒 Read-lock the entire store for the whole scan
    m.store.mu.RLock()
    defer m.store.mu.RUnlock()

    for id, entity := range m.store.Entities {
         if !entity.IsHealthy() {
             m.repair(id)
         }
    }
}
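
The snippet above leaves its types implicit. A minimal, self-contained sketch of what it assumes might look like the following. The `Entity` fields, the 30-second timeout, the `repaired` counter, and the body of `repair` are all hypothetical stand-ins, not the original service's code.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Entity is a hypothetical in-memory object; only lastSeen matters here.
type Entity struct {
	mu       sync.Mutex
	lastSeen time.Time
}

// IsHealthy reports whether the entity was seen within an assumed 30s timeout.
func (e *Entity) IsHealthy() bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	return time.Since(e.lastSeen) < 30*time.Second
}

// Store guards the whole map with a single RWMutex, as in the snippet above.
type Store struct {
	mu       sync.RWMutex
	Entities map[string]*Entity
}

type Manager struct {
	store    *Store
	repaired int // touched only by the health-check goroutine
}

// repair is a placeholder. It must not call store.mu.Lock(), because
// RunHealthCheck invokes it while still holding the read lock.
func (m *Manager) repair(id string) { m.repaired++ }

// RunHealthCheck is the naive scan: one read lock held for the entire loop.
func (m *Manager) RunHealthCheck() {
	m.store.mu.RLock()
	defer m.store.mu.RUnlock()
	for id, entity := range m.store.Entities {
		if !entity.IsHealthy() {
			m.repair(id)
		}
	}
}

// newDemoManager builds a store with one fresh and one stale entity.
func newDemoManager() *Manager {
	now := time.Now()
	return &Manager{store: &Store{Entities: map[string]*Entity{
		"fresh": {lastSeen: now},
		"stale": {lastSeen: now.Add(-time.Minute)},
	}}}
}

func main() {
	m := newDemoManager()
	m.RunHealthCheck()
	fmt.Println("repaired:", m.repaired)
}
```

One detail worth noticing: because `repair` runs under the read lock, any implementation that tried to take the write lock on the same mutex would deadlock.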

It works perfectly in unit tests with 10 or 100 items.

But working ≠ scaling.

The Math: Why 1.6 Million Objects Hurt

Assume:

1 GB RAM allocated for metadata
~1.6 million entities fit in memory

When the health check runs:

It iterates over 1.6 million items

CPU Cost Per Iteration

Operation	Estimated Cost
Fetch pointer	~10 ns
Check condition	~2 ns
Access nested data (likely cache miss)	~200 ns
Total per item	~212 ns, rounded up to ~250 ns

Now multiply:

1,600,000 × 250 ns ≈ 0.4 seconds

The loop takes ~400 milliseconds.

Sounds small?

It’s not.

The Real Problem: Lock Contention

The issue is this line:

m.store.mu.RLock()

We hold a global read lock for 400ms.

In a highly concurrent system, 400ms is an eternity.

During that window:

❌ Writers block in Lock() until the scan finishes
❌ New readers queue behind any waiting writer
❌ API calls stall
❌ P99 latency spikes
❌ Load balancers may time out
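
The writer exclusion can be observed directly with `sync.RWMutex.TryLock` (available since Go 1.18). This is a small illustrative probe, not production code; `writerCanLock` and `probe` are names invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// writerCanLock reports whether a writer could take mu right now,
// releasing the lock again immediately if it could.
func writerCanLock(mu *sync.RWMutex) bool {
	if mu.TryLock() {
		mu.Unlock()
		return true
	}
	return false
}

// probe stands in for the scan holding the read lock and checks whether a
// writer could get in during and after it.
func probe() (during, after bool) {
	var mu sync.RWMutex
	mu.RLock() // the health check's long-held read lock
	during = writerCanLock(&mu)
	mu.RUnlock()
	after = writerCanLock(&mu)
	return during, after
}

func main() {
	during, after := probe()
	fmt.Println("writer can lock during scan:", during)
	fmt.Println("writer can lock after scan: ", after)
}
```

A real writer calls `Lock()` rather than `TryLock()`, so instead of getting `false` it simply waits, and once it is waiting, new `RLock()` calls wait behind it.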

Now imagine scaling to 10 GB RAM (~16 million items):

16,000,000 × 250 ns ≈ 4 seconds

Your service freezes for 4 seconds.

This creates latency jitter:

Fast one moment
Unresponsive the next

The Solution: The “Paced Monitor” Pattern

We moved from:

Eager Locking → Lock everything at once

To:

Lazy Locking → Lock only what we need, when we need it

Also known as:

Snapshot & Yield

Step 1: Snapshot the Keys

Instead of holding the lock during processing, we:

Lock briefly
Copy the list of keys
Unlock immediately

func (m *Manager) GetEntityIDs() []string {
    m.store.mu.RLock()
    defer m.store.mu.RUnlock()

    ids := make([]string, 0, len(m.store.Entities))
    for id := range m.store.Entities {
        ids = append(ids, id)
    }
    return ids
}

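Copying string headers (a pointer and a length each, not the bytes themselves) is much cheaper than running full health logic inside the lock. The lock is still held for one pass over the map, but each iteration is only an append.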

Step 2: Fine-Grained Locking

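Now we process each entity individually.

We hold the lock only for a single map lookup. The health check and repair then run outside the store lock, so they must handle their own synchronization: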

func (m *Manager) checkSingleEntity(id string) {
    m.store.mu.RLock()
    entity, ok := m.store.Entities[id]
    m.store.mu.RUnlock()

    if !ok {
        return
    }

    if !entity.IsHealthy() {
        m.repair(id)
    }
}
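
The `!ok` branch matters because the snapshot can go stale: an entity may be deleted between Step 1 and the moment we reach its ID. The sketch below simulates that with a deliberately tiny store. The `Store` methods and `scanWithConcurrentDelete` are hypothetical names for this example, and the delete is done inline rather than from a real request handler.

```go
package main

import (
	"fmt"
	"sync"
)

type Entity struct{ healthy bool }

func (e *Entity) IsHealthy() bool { return e.healthy }

type Store struct {
	mu       sync.RWMutex
	Entities map[string]*Entity
}

// snapshotIDs copies the keys under a brief read lock (Step 1).
func (s *Store) snapshotIDs() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	ids := make([]string, 0, len(s.Entities))
	for id := range s.Entities {
		ids = append(ids, id)
	}
	return ids
}

// check looks up one entity under the lock and inspects it after releasing
// the lock. found is false when the ID vanished since the snapshot.
func (s *Store) check(id string) (found, healthy bool) {
	s.mu.RLock()
	entity, ok := s.Entities[id]
	s.mu.RUnlock()
	if !ok {
		return false, false
	}
	return true, entity.IsHealthy()
}

// scanWithConcurrentDelete snapshots, deletes one ID (as a request handler
// might), then checks every snapshotted ID and counts the outcomes.
func scanWithConcurrentDelete() (checked, skipped int) {
	s := &Store{Entities: map[string]*Entity{
		"a": {healthy: true}, "b": {healthy: false}, "c": {healthy: true},
	}}
	ids := s.snapshotIDs()

	s.mu.Lock()
	delete(s.Entities, "b")
	s.mu.Unlock()

	for _, id := range ids {
		if found, _ := s.check(id); found {
			checked++
		} else {
			skipped++
		}
	}
	return checked, skipped
}

func main() {
	checked, skipped := scanWithConcurrentDelete()
	fmt.Println("checked:", checked, "skipped:", skipped)
}
```

The flip side is that entities created after the snapshot are not visited until the next pass, which is usually acceptable for a periodic reconciliation loop.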

Step 3: Yield to the Scheduler (The Secret Sauce)

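After each small batch, we yield. Sleeping after every single item would be far too slow: 1.6 million × 1 ms is roughly 27 minutes per pass.

func (m *Manager) RunPacedHealthCheck() {
    // Assumed batch size; tune it for your workload.
    const batchSize = 256

    allIDs := m.GetEntityIDs()

    for i, id := range allIDs {
        m.checkSingleEntity(id)

        // PACING: after each batch, let user requests run
        if (i+1)%batchSize == 0 {
            time.Sleep(1 * time.Millisecond)
        }
    }
}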

That tiny sleep:

Allows waiting goroutines to acquire locks
Lets user requests "squeeze in"
Reduces tail latency
Prevents request starvation

The Trade-Off

We traded:

Before	After
Fast scan (~0.4s)	Slow scan (5–10s)
Massive freeze	No long stalls for users
High throughput	Stable latency
Terrible P99	Smooth tail

In latency-sensitive systems:

Background tasks must be second-class citizens.

They should never compete with the user request path.

Key Takeaways

1️⃣ Big-O Is Not Enough

Even an O(n) scan can stall your system at scale when it runs under a shared lock.

2️⃣ Locks Amplify Latency

Holding a global lock turns CPU time into system-wide pause time.

3️⃣ Throughput vs Latency Is a Trade

Sometimes slowing down background work improves overall system performance.

4️⃣ Always Think in Tail Latency

Users don’t care about average latency. They feel P99.

The Principle

If your background job:

Iterates millions of items
Holds a shared lock
Runs periodically

It’s not a loop.

It’s a self-inflicted denial-of-service.

The Pattern Name

You can call it:

Paced Monitor
Snapshot & Yield
Cooperative Background Processing
Latency-Friendly Reconciliation

But the principle is simple:

Do small work. Release locks quickly. Yield often. Protect the request path at all costs.

In backend engineering, the simplest code can be the most dangerous.

Sometimes the fix is not smarter algorithms.

Sometimes it’s just:

time.Sleep(1 * time.Millisecond)

And the discipline to respect concurrency.
