Strategies for dealing with Swift actor data races



When Swift introduced actors in Swift 5.5, many developers breathed a sigh of relief. Finally, a language-level construct that would protect us from data races! But here's the uncomfortable truth: actors rule out simultaneous access to their state, yet they don't eliminate race conditions. They just change where those races can occur.

The culprit? Re-entrancy.

💡
Article written by AI under human supervision & guidance.

The False Sense of Security

Consider this innocent-looking actor:

actor BankAccount {
    private var balance: Double = 1000

    func withdraw(_ amount: Double) async -> Bool {
        // Check if we have sufficient funds
        guard balance >= amount else {
            return false
        }

        // Simulate an async operation (logging, validation, network call, etc.)
        await performAsyncValidation()

        // Deduct the amount
        balance -= amount
        return true
    }

    private func performAsyncValidation() async {
        try? await Task.sleep(for: .milliseconds(100))
    }

    // Read-only accessor used by the test below
    func getBalance() -> Double {
        balance
    }
}

Looks safe, right? The actor should serialize access to balance. Let's test it:

let account = BankAccount()

async let withdrawal1 = account.withdraw(600)
async let withdrawal2 = account.withdraw(600)

let results = await [withdrawal1, withdrawal2]
print(results) // [true, true] — Wait, what?!

print(await account.getBalance()) // 💥 -200

Both withdrawals succeeded, and we've gone negative. This is a race condition in the actor's logic, even though no two threads ever touched balance at the same time.

Why Does This Happen? The Mailbox Model

Actors only do one thing at a time—that's true. But actors also don't like to sit around doing nothing. Think of an actor as having a "mailbox" of pending work. When you call an actor method, you're dropping a message into that mailbox.

Here's the key insight: when an actor method hits an await, it suspends—and the actor immediately picks up the next message from its mailbox rather than waiting idle.

Here's the sequence that leads to our bug:

  1. withdraw(600) #1 starts, checks balance >= 600 ✅ (balance is 1000)

  2. withdraw(600) #1 hits await performAsyncValidation() and suspends

  3. Actor picks up next message: withdraw(600) #2 starts (re-entrancy!)

  4. #2 checks balance >= 600 ✅ (balance is still 1000—#1 hasn't modified it yet)

  5. #2 hits await and suspends

  6. #1 resumes, deducts 600 → balance is now 400

  7. #2 resumes, deducts 600 → balance is now -200 💥

The guard check and the mutation are not atomic across the await boundary. The actor only guarantees that synchronous code between suspension points doesn't interleave.
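The sequence above can be made visible with a small trace. This is a minimal sketch, not part of the original example: TraceRecorder and traceInterleaving are hypothetical names, and the 100 ms sleep simply stands in for any suspension point.

```swift
// Hypothetical actor, used only to make the interleaving visible.
actor TraceRecorder {
    private(set) var events: [String] = []

    func work(_ name: String) async {
        events.append("start \(name)")                  // runs before the suspension point
        try? await Task.sleep(nanoseconds: 100_000_000) // the actor is free to run other calls here
        events.append("end \(name)")                    // runs after this call resumes
    }
}

// Starts two calls concurrently and returns the recorded order of events.
func traceInterleaving() async -> [String] {
    let recorder = TraceRecorder()
    async let first: Void = recorder.work("#1")
    async let second: Void = recorder.work("#2")
    _ = await (first, second)
    return await recorder.events
}
```

On a typical run both "start" entries are recorded before either "end" entry, which is exactly the window in which the second withdrawal passes its guard. Which of the two calls starts first is not guaranteed.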

Two Flavors of Re-Entrancy Problems

Re-entrancy can cause two distinct types of issues:

Problem 1: Incorrect State (Data Corruption)

This is what we saw above—multiple operations interleave and corrupt shared state, leading to invalid results like a negative bank balance.

Problem 2: Wasteful Duplicate Work

Even when state doesn't get corrupted, re-entrancy can cause unnecessary work. Consider a cache that fetches from a remote server:

actor DataCache {
    private var cache: [UUID: Data] = [:]

    func read(_ key: UUID) async -> Data? {
        if let data = cache[key] {
            return data
        }

        // Fetch from remote if not cached locally
        guard let data = try? await fetchFromServer(key) else {
            return nil
        }

        cache[key] = data
        return data
    }
}

When multiple concurrent reads request the same uncached key, a version of read instrumented with log statements prints:

cache read called for DDFA2377-...
attempt to read remote cache for DDFA2377-...
cache read called for DDFA2377-...          // Re-entrancy!
attempt to read remote cache for DDFA2377-... // Duplicate request!
cache read called for DDFA2377-...          // Re-entrancy again!
attempt to read remote cache for DDFA2377-... // Another duplicate!

We made three network requests when one would have sufficed. The first request would have cached the result for the others—but they all started before any completed.
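One way to observe the duplicate work without a real server is to count the fetches. The sketch below is an assumption-laden stand-in: CountingCache, fetchCount and countDuplicateFetches are hypothetical names, and fetchFromServer is faked with a sleep.

```swift
import Foundation

// Hypothetical variant of DataCache that counts how often it "hits the network".
actor CountingCache {
    private var cache: [UUID: Data] = [:]
    private(set) var fetchCount = 0

    func read(_ key: UUID) async -> Data? {
        if let data = cache[key] { return data }
        // Suspension point: other reads for the same key can slip in here.
        guard let data = try? await fetchFromServer(key) else { return nil }
        cache[key] = data
        return data
    }

    // Stand-in for a real request: counts the call, waits, returns dummy bytes.
    private func fetchFromServer(_ key: UUID) async throws -> Data {
        fetchCount += 1
        try await Task.sleep(nanoseconds: 100_000_000)
        return Data(key.uuidString.utf8)
    }
}

// Issues three concurrent reads for one key and reports how many fetches ran.
func countDuplicateFetches() async -> Int {
    let cache = CountingCache()
    let key = UUID()
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<3 {
            group.addTask { _ = await cache.read(key) }
        }
    }
    return await cache.fetchCount
}
```

With all three reads started before the first fetch finishes, the count is normally greater than one, and usually three.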


Now let's look at strategies to handle both types of problems.

Strategy 1: Bounce Concurrent Requests

The simplest approach: if an operation is already in progress, reject new requests immediately.

actor BankAccount {
    private var balance: Double = 1000
    private var isWithdrawing = false

    enum WithdrawError: Error {
        case operationInProgress
        case insufficientFunds
    }

    func withdraw(_ amount: Double) async throws -> Double {
        // Reject if another withdrawal is in progress
        guard !isWithdrawing else {
            throw WithdrawError.operationInProgress
        }

        isWithdrawing = true
        defer { isWithdrawing = false }

        guard balance >= amount else {
            throw WithdrawError.insufficientFunds
        }

        await performAsyncValidation()

        balance -= amount
        return balance
    }
}

When to use: Idempotent operations, UI button debouncing, preventing duplicate submissions.

Pros: Simple, explicit, fast failure.

Cons: Callers must handle rejection and potentially retry.
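What handling a rejection might look like on the caller side is sketched below. Everything here is illustrative: BouncingAccount is a compact copy of the actor above so the snippet compiles on its own, withdrawWithRetry is a hypothetical helper, and the backoff values are arbitrary.

```swift
enum WithdrawError: Error { case operationInProgress, insufficientFunds }

// Compact copy of the bouncing account so this sketch is self-contained.
actor BouncingAccount {
    private var balance: Double = 1000
    private var isWithdrawing = false

    func withdraw(_ amount: Double) async throws -> Double {
        guard !isWithdrawing else { throw WithdrawError.operationInProgress }
        isWithdrawing = true
        defer { isWithdrawing = false }
        guard balance >= amount else { throw WithdrawError.insufficientFunds }
        try? await Task.sleep(nanoseconds: 50_000_000) // stand-in for performAsyncValidation()
        balance -= amount
        return balance
    }
}

// Hypothetical helper: retries only when the request was bounced.
func withdrawWithRetry(
    _ amount: Double,
    from account: BouncingAccount,
    maxAttempts: Int = 5
) async throws -> Double {
    for attempt in 1...max(1, maxAttempts) {
        do {
            return try await account.withdraw(amount)
        } catch WithdrawError.operationInProgress where attempt < maxAttempts {
            // Bounced: back off a little longer each time, then try again.
            try await Task.sleep(nanoseconds: UInt64(attempt) * 20_000_000)
        }
    }
    // Only reached if the loop ends without returning or rethrowing.
    throw WithdrawError.operationInProgress
}
```

Errors other than operationInProgress, such as insufficientFunds, propagate immediately because retrying them would not help.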

Strategy 2: Queue and Await Previous Operations

Instead of rejecting concurrent requests, queue them up and process them serially.

actor BankAccount {
    private var balance: Double = 1000
    private var pendingOperations: [CheckedContinuation<Void, Never>] = []
    private var isOperationInProgress = false

    private func acquireLock() async {
        if isOperationInProgress {
            // Wait our turn
            await withCheckedContinuation { continuation in
                pendingOperations.append(continuation)
            }
        }
        isOperationInProgress = true
    }

    private func releaseLock() {
        if let next = pendingOperations.first {
            pendingOperations.removeFirst()
            next.resume() // Wake up the next waiter
        } else {
            isOperationInProgress = false
        }
    }

    func withdraw(_ amount: Double) async -> Result<Double, WithdrawError> {
        await acquireLock()
        defer { releaseLock() }

        guard balance >= amount else {
            return .failure(.insufficientFunds)
        }

        await performAsyncValidation()

        balance -= amount
        return .success(balance)
    }
}

When to use: When all requests must eventually be processed, order matters, or you need transactional semantics.

Pros: No requests are dropped; guaranteed serialization.

Cons: Increased latency for queued requests; potential for queue buildup.
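As a quick check that queuing fixes the opening example, the sketch below replays the two concurrent 600 withdrawals. QueuedAccount is a condensed, hypothetical copy of the actor above, the top-level WithdrawError is an assumed definition, and a sleep replaces performAsyncValidation.

```swift
enum WithdrawError: Error, Equatable { case insufficientFunds }

// Condensed copy of the queued account so this sketch is self-contained.
actor QueuedAccount {
    private var balance: Double = 1000
    private var waiters: [CheckedContinuation<Void, Never>] = []
    private var isBusy = false

    private func acquireLock() async {
        if isBusy {
            // Park this caller until releaseLock() hands the lock over.
            await withCheckedContinuation { continuation in
                waiters.append(continuation)
            }
        }
        isBusy = true
    }

    private func releaseLock() {
        if waiters.isEmpty {
            isBusy = false
        } else {
            waiters.removeFirst().resume() // lock stays held, ownership moves to the next waiter
        }
    }

    func withdraw(_ amount: Double) async -> Result<Double, WithdrawError> {
        await acquireLock()
        defer { releaseLock() }
        guard balance >= amount else { return .failure(.insufficientFunds) }
        try? await Task.sleep(nanoseconds: 50_000_000) // stand-in for performAsyncValidation()
        balance -= amount
        return .success(balance)
    }
}

// Replays the two concurrent withdrawals from the opening example.
func queueDemo() async -> [Result<Double, WithdrawError>] {
    let account = QueuedAccount()
    async let first = account.withdraw(600)
    async let second = account.withdraw(600)
    return await [first, second]
}
```

Whichever call acquires the lock first succeeds and leaves 400; the other runs its guard only afterwards and fails with insufficientFunds.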

Strategy 3: Optimistic Execution with Rollback

Sometimes you want to proceed optimistically and verify/rollback if conditions changed during the async operation.

actor BankAccount {
    private var balance: Double = 1000
    private var transactionLog: [UUID: Double] = [:]

    func withdraw(_ amount: Double) async -> Result<Double, WithdrawError> {
        // Initial validation
        guard balance >= amount else {
            return .failure(.insufficientFunds)
        }

        // Tag this reservation so it can be rolled back later
        let transactionId = UUID()

        // Optimistically reserve the funds
        balance -= amount
        transactionLog[transactionId] = amount

        // Perform async work (this variant of performAsyncValidation is assumed to return a Bool)
        let validationPassed = await performAsyncValidation()

        // Post-await verification
        let stateCorrupted = balance < 0
        let validationFailed = !validationPassed

        if stateCorrupted || validationFailed {
            // Rollback
            if let reserved = transactionLog.removeValue(forKey: transactionId) {
                balance += reserved
            }
            return .failure(validationFailed ? .validationFailed : .insufficientFunds)
        }

        // Commit
        transactionLog.removeValue(forKey: transactionId)
        return .success(balance)
    }
}

When to use: When async operations are expensive and you want to maximize throughput; when rollback is cheap.

Pros: Maximum concurrency, no blocking.

Cons: Rollback logic can be complex; may waste work on rolled-back transactions.

Strategy 4: Re-validate After Await

A simpler variant of Strategy 3—just re-check your preconditions after awaiting:

actor BankAccount {
    private var balance: Double = 1000

    func withdraw(_ amount: Double) async -> Result<Double, WithdrawError> {
        // First check
        guard balance >= amount else {
            return .failure(.insufficientFunds)
        }

        let balanceSnapshot = balance

        await performAsyncValidation()

        // Re-validate after await
        guard balance == balanceSnapshot else {
            // State changed during await—abort or retry
            return .failure(.stateChanged)
        }

        guard balance >= amount else {
            return .failure(.insufficientFunds)
        }

        balance -= amount
        return .success(balance)
    }
}

When to use: When you can afford to fail and have the caller retry.

Pros: Very simple; no complex state tracking.

Cons: May require retry logic; can fail even when it theoretically could have succeeded.
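If pushing the retry onto callers is undesirable, the re-check can be looped inside the actor instead. This is a sketch under assumptions: RevalidatingAccount, the maxAttempts parameter and the top-level WithdrawError are hypothetical, and a sleep stands in for performAsyncValidation.

```swift
enum WithdrawError: Error, Equatable { case insufficientFunds, stateChanged }

// Hypothetical variant that retries internally when the balance moved during the await.
actor RevalidatingAccount {
    private var balance: Double

    init(balance: Double = 1000) {
        self.balance = balance
    }

    func withdraw(_ amount: Double, maxAttempts: Int = 3) async -> Result<Double, WithdrawError> {
        for _ in 0..<maxAttempts {
            guard balance >= amount else { return .failure(.insufficientFunds) }
            let snapshot = balance

            try? await Task.sleep(nanoseconds: 50_000_000) // stand-in for performAsyncValidation()

            if balance == snapshot {
                // Nothing changed while suspended, so the earlier check still holds.
                balance -= amount
                return .success(balance)
            }
            // Balance changed: loop around and validate against the new value.
        }
        return .failure(.stateChanged)
    }
}

// Two concurrent withdrawals that can both be satisfied.
func revalidateDemo() async -> [Result<Double, WithdrawError>] {
    let account = RevalidatingAccount()
    async let first = account.withdraw(300)
    async let second = account.withdraw(300)
    return await [first, second]
}
```

Here one call commits on its first pass and the other notices the changed balance, re-validates against 700, and commits on its second pass. Under heavier contention a call can still exhaust its attempts and report stateChanged.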

Strategy 5: Coalesce with In-Progress Task Tracking

This elegant pattern solves the "duplicate work" problem by tracking in-flight operations. Subsequent requests for the same resource await the existing task rather than starting a new one.

The key insight is to store the task itself (not just a boolean flag) so concurrent callers can await the same result:

actor DataCache {
    enum CacheEntry {
        case inProgress(Task<Data?, Error>)
        case loaded(Data)
    }

    private var cache: [UUID: CacheEntry] = [:]

    func read(_ key: UUID) async -> Data? {
        // Already have the data? Return immediately.
        if case let .loaded(data) = cache[key] {
            return data
        }

        // Already fetching? Await the existing task.
        if case let .inProgress(task) = cache[key] {
            return try? await task.value
        }

        // Start a new fetch and store the task immediately (before awaiting!)
        let task: Task<Data?, Error> = Task {
            try await fetchFromServer(key)
        }

        cache[key] = .inProgress(task)

        // Now await our own task
        if let data = try? await task.value {
            cache[key] = .loaded(data)
            return data
        } else {
            cache[key] = nil
            return nil
        }
    }
}

Now concurrent reads coalesce into a single network request:

cache read called for DDFA2377-...
cache read called for DDFA2377-...  // Sees .inProgress, awaits same task
cache read called for DDFA2377-...  // Sees .inProgress, awaits same task
attempt to read remote cache for DDFA2377-...  // Only ONE network call!
remote cache HIT for DDFA2377-...
cache read finished for DDFA2377-...
cache read finished for DDFA2377-...
cache read finished for DDFA2377-...

When to use: Caching, token refresh flows, any idempotent fetch where duplicate requests are wasteful.

Pros: Eliminates duplicate work; all callers get the same result; elegant state machine.

Cons: Slightly more complex; requires thinking about task lifecycle.

This pattern is explained in depth in Donny Wals' excellent article on actor re-entrancy, which also covers practical applications like token refresh flows and image loaders.
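The same idea applied to a token refresh flow might look like the sketch below. It is not taken from the referenced article: TokenStore, validToken, requestNewToken and the "token-N" strings are all made up for illustration, and the network call is simulated with a sleep.

```swift
// Hypothetical token store: concurrent callers share one in-flight refresh.
actor TokenStore {
    private var token: String?
    private var refreshTask: Task<String, Error>?
    private(set) var refreshCount = 0

    func validToken() async throws -> String {
        if let cached = token { return cached }
        if let inFlight = refreshTask { return try await inFlight.value }

        // Store the task before the first await so later callers can find it.
        let task = Task<String, Error> { try await self.requestNewToken() }
        refreshTask = task
        defer { refreshTask = nil } // on failure this lets a later call start a fresh attempt

        let fresh = try await task.value
        token = fresh
        return fresh
    }

    // Stand-in for the real refresh request.
    private func requestNewToken() async throws -> String {
        refreshCount += 1
        try await Task.sleep(nanoseconds: 100_000_000)
        return "token-\(refreshCount)"
    }
}

// Five concurrent callers; returns what they received and how many refreshes ran.
func tokenDemo() async -> (tokens: [String], refreshes: Int) {
    let store = TokenStore()
    let tokens = await withTaskGroup(of: String?.self) { group -> [String] in
        for _ in 0..<5 {
            group.addTask { try? await store.validToken() }
        }
        var collected: [String] = []
        for await token in group {
            if let token = token { collected.append(token) }
        }
        return collected
    }
    let refreshes = await store.refreshCount
    return (tokens, refreshes)
}
```

Because the task is stored before the first suspension point, every caller either finds the cached token or the in-flight task, so only one refresh runs for the whole burst.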

Strategy 6: Immutable State + Version (Compare-and-Swap)

Instead of mutating state, use a version number to detect concurrent modifications:

actor BankAccount {
    private var state: AccountState

    struct AccountState: Sendable {
        let balance: Double
        let version: Int
    }

    init(balance: Double) {
        self.state = AccountState(balance: balance, version: 0)
    }

    func withdraw(_ amount: Double) async -> Result<Double, WithdrawError> {
        let snapshot = state

        guard snapshot.balance >= amount else {
            return .failure(.insufficientFunds)
        }

        await performAsyncValidation()

        // Compare-and-swap: reject if state changed
        guard state.version == snapshot.version else {
            return .failure(.stateChanged)
        }

        state = AccountState(
            balance: snapshot.balance - amount,
            version: snapshot.version + 1
        )
        return .success(state.balance)
    }
}

When to use: Complex state objects; when you want clear semantics about what constitutes a "change."

Pros: Clear state transitions; easy to debug; works well with audit trails.

Cons: Requires immutable state design; version conflicts need retry logic.
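One reason to track a version rather than compare values, as Strategy 4 does, is that a value can change and then change back while a call is suspended. The sketch below illustrates this; VersionedAccount, adjust(by:) and abaDemo are hypothetical names introduced only for the example.

```swift
struct AccountState: Sendable, Equatable {
    let balance: Double
    let version: Int
}

// Hypothetical account where every mutation bumps the version.
actor VersionedAccount {
    private(set) var state = AccountState(balance: 1000, version: 0)

    func adjust(by delta: Double) {
        state = AccountState(balance: state.balance + delta, version: state.version + 1)
    }
}

// Withdraws 100 and deposits it again, then compares the state before and after.
func abaDemo() async -> (sameBalance: Bool, sameVersion: Bool) {
    let account = VersionedAccount()
    let before = await account.state
    await account.adjust(by: -100)
    await account.adjust(by: 100)
    let after = await account.state
    return (before.balance == after.balance, before.version == after.version)
}
```

The balance ends where it started, so a value comparison would report "no change", while the version has advanced by two and the compare-and-swap guard would correctly reject the stale snapshot.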

Choosing the Right Strategy

Strategy → Best for • Complexity • Throughput

Bounce → Debouncing, idempotent ops • Low complexity • High throughput (fails fast)

Queue → Ordered processing, transactions • Medium complexity • Low throughput (serialized)

Optimistic + Rollback → High-throughput systems, cheap rollback • High complexity • High throughput

Re-validate → Simple operations, retriable • Low complexity • Medium throughput

Coalesce (Task Tracking) → Caching, deduplication • Medium complexity • High throughput

Immutable + Version → Complex state, audit trails • Medium complexity • Medium throughput

Key Takeaways

  1. Actors protect synchronous access, not sequences of operations spanning await.

  2. Think "mailbox": when your method suspends, the actor immediately processes the next queued message.

  3. Two problems to watch for: incorrect state (data corruption) and wasteful duplicate work.

  4. The golden question: Every time you write await inside an actor, ask yourself: "What assumptions have I made about state before this await that I need to re-verify after?"

  5. Choose your strategy based on your use case: correctness vs. throughput vs. simplicity.

  6. Test concurrent access explicitly—create stress tests that hammer your actors from multiple tasks simultaneously.
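A stress test along the lines of the last takeaway could look like the sketch below. It assumes a hypothetical ReservingAccount that checks and deducts before suspending, and a stressTest helper; in a real project the final checks would live in a test case, for example under XCTest.

```swift
// Hypothetical account that reserves funds before its suspension point.
actor ReservingAccount {
    private(set) var balance: Double = 1000

    func withdraw(_ amount: Double) async -> Bool {
        guard balance >= amount else { return false }
        balance -= amount      // check and deduct happen with no await in between
        await Task.yield()     // suspension point where other calls may interleave
        return true
    }
}

// Hammers one account from many tasks and reports what happened.
func stressTest(tasks: Int = 100, amount: Double = 50) async -> (successes: Int, finalBalance: Double) {
    let account = ReservingAccount()
    let successes = await withTaskGroup(of: Bool.self) { group -> Int in
        for _ in 0..<tasks {
            group.addTask { await account.withdraw(amount) }
        }
        var count = 0
        for await succeeded in group {
            if succeeded { count += 1 }
        }
        return count
    }
    let finalBalance = await account.balance
    return (successes, finalBalance)
}
```

The invariant to check is that successful withdrawals never exceed the starting balance: with 1000 available and 50 per withdrawal, exactly 20 of the 100 calls should succeed and the balance should end at 0. Running the same harness against the original BankAccount is a quick way to see the invariant break.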

The next time you write an actor method with an await, pause and ask yourself: "What could change while I'm suspended?" Your future self will thank you.


For more on this topic, I highly recommend Donny Wals' deep dive on actor re-entrancy, which includes practical examples like building token refresh flows and async image loaders.

What patterns have you found useful for managing actor re-entrancy? Share your experiences in the comments!
