Race conditions in web apps: TOCTOU, limit-overrun and single-packet

A race condition occurs when the outcome of an operation depends on the order or timing in which concurrent requests are processed. In web applications, the appearance of sequential execution is an illusion: the server handles dozens of requests in parallel, and when two of them touch the same state — a balance, a coupon-usage counter, a password-attempt count — without proper coordination, the application may enforce a business rule fewer times than it should. The attacker doesn’t break encryption or inject any payload; they simply arrive at the right moment, many times at once.

The classic case is TOCTOU (time-of-check to time-of-use): the application checks a condition (“the coupon hasn’t been used yet”) and only afterward acts on it (“mark it as used”). Between those two steps there’s a window in which another request can run the same check and get the same result. Redeeming the same gift card five times, withdrawing a balance twice, bypassing the MFA attempt limit: these are all instances of the same structural flaw. Since James Kettle’s research on the single-packet attack (2023), exploiting these windows has become far more reliable, even when the window is under a millisecond wide.

How it works

The vulnerable pattern almost always takes the form of a check-then-act sequence executed non-atomically. Consider redeeming a discount coupon:

# VULNERABLE: check-then-act without atomicity
@app.post("/cupom/resgatar")
def resgatar():
    codigo = request.json["codigo"]
    cupom = db.query("SELECT * FROM cupons WHERE codigo = %s", codigo)

    # 1) CHECK: is the coupon still available?
    if cupom.usado:
        return {"erro": "cupom já utilizado"}, 400

    # ... business logic, credit to the wallet ...
    creditar_saldo(current_user, cupom.valor)

    # 2) ACT: mark as used (TOO LATE)
    db.execute("UPDATE cupons SET usado = TRUE WHERE codigo = %s", codigo)
    return {"ok": True}

With one request at a time, this works. The problem shows up when two requests arrive almost together and both run the SELECT before either one reaches the UPDATE. Visually, the concurrent timeline looks like this:

Request A:  SELECT (usado=FALSE) ──┐
Request B:  SELECT (usado=FALSE) ──┤  <- both read the old state
Request A:  creditar_saldo()       │
Request B:  creditar_saldo()       │  <- DUPLICATE credit
Request A:  UPDATE usado=TRUE   ───┤
Request B:  UPDATE usado=TRUE   ───┘  <- harmless, it was already TRUE

Both read usado = FALSE, both passed the check, both credited the balance. The window between the SELECT and the UPDATE — often just a few milliseconds of application logic, I/O wait, or a database round-trip — is exactly where the attack lives. This interval is called the race window, and the attacker’s goal is to fit as many requests as possible inside it.
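The interleaving in the timeline can be reproduced deterministically in a few lines. The sketch below is an in-memory simulation, not the application's code: `Coupon`, `redeem_vulnerable` and `simulate` are hypothetical names, and a `threading.Barrier` stands in for the race window by holding each request between its check and its action.

```python
import threading

class Coupon:
    """Hypothetical in-memory stand-in for the cupons row plus the wallet."""
    def __init__(self, value):
        self.value = value
        self.used = False
        self.balance = 0
        self.credit_lock = threading.Lock()  # makes the credit itself atomic

def redeem_vulnerable(coupon, window):
    # 1) CHECK: every request sees used == False
    if coupon.used:
        return False
    # Race window: block here until all requests have passed the check.
    window.wait()
    # 2) ACT: credit, then mark as used (too late)
    with coupon.credit_lock:
        coupon.balance += coupon.value
    coupon.used = True
    return True

def simulate(requests=2, value=50):
    coupon = Coupon(value)
    window = threading.Barrier(requests)
    results = []

    def worker():
        results.append(redeem_vulnerable(coupon, window))

    threads = [threading.Thread(target=worker) for _ in range(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, coupon.balance
```

With the defaults, both redemptions succeed and a coupon worth 50 leaves a balance of 100. The lock around the credit doesn't help, because the check sits outside it.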

The raw HTTP form of what you want to fire is trivial — the “trick” is sending many of them simultaneously:

POST /cupom/resgatar HTTP/2
Host: app.exemplo.com
Cookie: session=eyJ...
Content-Type: application/json
Content-Length: 22

{"codigo":"WELCOME50"}

Variations and bypasses

The same flaw shows up in several forms. It’s worth knowing the archetypes because the injection point and the evidence change.

Limit-overrun (exceeding a limit). This is the most lucrative category: any action with a numeric limit that’s checked before it’s applied. Withdrawing/transferring beyond the balance (double-spend), redeeming the same credit N times, applying the same voucher repeatedly to the cart, exceeding an invite quota, or going over the item limit in a “1 per customer” promotion.

Rate-limit / anti-bruteforce bypass. If the password or MFA-code attempt counter follows a read-modify-write pattern, firing 50 OTP guesses simultaneously can get all of them evaluated before the counter reaches the limit. The effect is a brute-force attempt against a 6-digit code that should have been blocked after 3 attempts.

Single-endpoint vs. multi-endpoint races. The most common case collides the same route with itself (multiple redemptions of the same coupon). But there’s the multi-endpoint variant: two different routes that touch the same state and whose execution order isn’t guaranteed — for example, applying a coupon while, in parallel, confirming the order, causing the discount to “leak” into an already-closed state. These require synchronizing requests to distinct endpoints at the same instant.

TOCTOU in files and state outside the database. Not every race is in the relational database. An upload that validates the file type and then moves it; a cached permission check that expires between the check and the use; Redis flags manipulated with separate GET/SET instead of atomic operations. The principle is identical: the check and the use don’t share the same “lock.”
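For the file case, the difference between a split check and a shared one can be sketched with exclusive file creation. This is an illustration under assumptions: `claim_racy` and `claim_atomic` are hypothetical helpers, and the "resource" is simply a marker file.

```python
import os

def claim_racy(path):
    """VULNERABLE: the existence check and the creation are separate steps."""
    if os.path.exists(path):         # CHECK
        return False
    with open(path, "w") as fh:      # USE: another process may have created it by now
        fh.write("claimed")
    return True

def claim_atomic(path):
    """Mode "x" asks the OS to create the file only if it does not exist yet,
    so the check and the creation are a single operation."""
    try:
        with open(path, "x") as fh:
            fh.write("claimed")
        return True
    except FileExistsError:
        return False
```

Called twice on the same path, `claim_atomic` succeeds once and then returns False, regardless of how the callers are interleaved.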

The bypass that makes it all viable: the single-packet attack. The historical obstacle to exploiting these windows was network jitter: sending 20 requests “at the same time” over the internet meant they arrived spread across dozens of milliseconds, often too wide for the race window. The single-packet attack technique (Kettle, 2023) solves this using HTTP/2: the client sends almost all of each request, holding back a small final fragment of each one, and then releases all the final fragments in a single TCP packet. The server receives the completion of all the requests at virtually the same time and processes them in parallel, removing network jitter from the equation. In practice, the technique reliably accommodates 20–30 requests per packet (limited by the ~1500-byte MTU), which is usually enough for most races. A caveat: the single-packet attack targets network jitter; server-side jitter (variation in processing time due to CPU contention, etc.) is not eliminated — which is precisely why you fire dozens of requests, not just two. When the target only speaks HTTP/1.1, the fallback is the classic last-byte sync, which isn’t as precise but still narrows the arrival spread considerably.
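The HTTP/1.1 fallback mentioned above, last-byte sync, is simple enough to sketch with raw sockets. This is not the single-packet attack itself and not how any particular tool implements it: `last_byte_sync` and its parameters are hypothetical, and the sketch assumes a plaintext HTTP/1.1 target (a TLS target would need each socket wrapped with `ssl`).

```python
import socket

def last_byte_sync(host, port, raw_request, copies, timeout=5.0):
    """Send `copies` requests, withholding the final byte of each one,
    then release the withheld bytes back to back."""
    conns = []
    try:
        # Phase 1: open every connection and send all but the last byte.
        for _ in range(copies):
            s = socket.create_connection((host, port), timeout=timeout)
            # Disable Nagle so the single final byte is not delayed.
            s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            s.sendall(raw_request[:-1])
            conns.append(s)
        # Phase 2: complete every request in a tight loop.
        for s in conns:
            s.sendall(raw_request[-1:])
        # Phase 3: collect the status line of each response.
        statuses = []
        for s in conns:
            data = b""
            while b"\r\n" not in data:
                chunk = s.recv(4096)
                if not chunk:
                    break
                data += chunk
            statuses.append(data.split(b"\r\n", 1)[0].decode("latin-1"))
        return statuses
    finally:
        for s in conns:
            s.close()
```

The server cannot start processing any request until its last byte arrives, so the slow part (connection setup, headers, most of the body) is out of the way before the burst. The final bytes still travel in separate packets on separate connections, which is why this narrows the spread without eliminating it.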

How we exploit it in a pentest

Exploitation is methodical, and the reference tool is Turbo Intruder (a Burp Suite extension), which implements the single-packet attack. Burp Repeater also supports the technique via tab groups (“Send group in parallel”), useful for a quick shot before scripting.

1. Map limited, state-changing actions. We look for any operation with a “can only happen once” or “at most X” rule: coupon/gift-card redemption, withdrawal, transfer, discount application, vote, OTP validation, “1 per customer,” resource creation with a plan limit. Every limit check followed by a mutation is suspect.

2. Establish the baseline. We perform the action once legitimately and record the expected result (final balance, “coupon used,” counter). It’s against this baseline that the overrun will appear.

3. Fire in parallel with single-packet. We send the same request (or the multi-endpoint set) many times simultaneously. In Turbo Intruder, you use the Engine.BURP2 engine with concurrentConnections=1, queuing the copies with engine.queue(..., gate=...) and firing them all with engine.openGate():

# Turbo Intruder — single-packet trigger (HTTP/2)
def queueRequests(target, wordlists):
    engine = RequestEngine(
        endpoint=target.endpoint,
        concurrentConnections=1,
        engine=Engine.BURP2,   # Burp's HTTP/2 stack; requires an HTTP/2 target
    )
    # Queue 30 copies of the SAME request, held at the "gate"
    for i in range(30):
        engine.queue(target.req, gate='race1')
    # Release all final fragments in the same packet
    engine.openGate('race1')

def handleResponse(req, interesting):
    table.add(req)

For multi-endpoint races, you queue different requests on the same gate (replacing target.req with the raw text of each request) — for example, POST /cupom/aplicar and POST /pedido/confirmar — so that both cross the window together. When debugging synchronization, it helps to “warm up” the connection with an innocuous request before the group, so that connection-setup latency doesn’t skew the measurement.

4. Observe the overrun. The visual confirmation is direct: the coupon marked as used 5 times, a balance going negative, several 200 OK responses for an action that should return 400 after the first, or the OTP accepted after dozens of attempts that the rate-limit should have cut off. We attach the Turbo Intruder response table showing multiple successes for the same operation.

5. Quantify the impact. We distinguish “5 duplicated credits” from “arbitrary negative balance”: the latter is usually Critical. We always validate that the result is the actual overrun (an inconsistent state persisted in the database/dashboard), not just harmless concurrent responses — two 200 responses that converge to the same final state do not constitute the vulnerability.
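The distinction in step 5 between "several success responses" and "an overrun that actually persisted" can be made explicit when writing up results. The helper below is a sketch under assumptions: the `Burst` fields are hypothetical names for values we already collect (the baseline effect of one legitimate execution, and the persisted state before and after the burst).

```python
from dataclasses import dataclass

@dataclass
class Burst:
    successes: int    # responses that reported success during the burst
    before: int       # persisted value before the burst (e.g. balance)
    after: int        # persisted value after the burst
    unit_effect: int  # change caused by ONE legitimate execution (from the baseline)

def applied_count(burst):
    """How many times the effect was really applied, judged from persisted state."""
    return (burst.after - burst.before) // burst.unit_effect

def is_overrun(burst, limit=1):
    """True only if the persisted state shows more executions than the rule allows."""
    return applied_count(burst) > limit
```

For a coupon worth 50, `Burst(successes=5, before=0, after=250, unit_effect=50)` is an overrun, while `Burst(successes=5, before=0, after=50, unit_effect=50)` is not: five success responses that converge to a single credit are harmless.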

Summary for the report

Impact: a business-limited action executed more times than allowed, e.g., the same coupon/gift-card redeemed N times, a balance driven negative (double-spend), or a bypass of the OTP/password attempt limit. Results in direct financial loss, fraud, or account compromise.

Severity: High to Critical (CVSS in the ~8.1–9.1 range when there’s financial loss or authentication bypass; typically a network vector, low complexity once the window is identified). The exact score depends on context — calculate it from the actual demonstrated impact.

Preconditions: a state-changing action with a non-atomic check-then-act pattern; the ability to send concurrent requests (HTTP/2 makes this easier via single-packet, but it isn’t strictly required). Many targets require authentication, but not all.

Suggested evidence: the original request; Turbo Intruder output with multiple 200/success responses for the operation that should be one-time only; the resulting state (negative balance, counter uses = 5, coupon used multiple times) with before-and-after screenshots from the database/dashboard; and the baseline result (1 request) for contrast.

How to mitigate

The single rule is: make the operation atomic and enforce the limit at the database layer, never only in memory or in the application. The application sees state “frozen” at the moment of the SELECT; the database is the only arbiter that sees all concurrent transactions.

1. Pessimistic row lock with `SELECT ... FOR UPDATE`

Open a transaction, lock the resource’s row, and only then check and update. The second concurrent transaction waits until the first finishes — and by then it already reads the new state:

# SAFE: the check and the action happen within the SAME transaction,
# with the row locked. The 2nd request waits and sees usado=TRUE.
@app.post("/cupom/resgatar")
def resgatar():
    codigo = request.json["codigo"]
    with db.transaction():
        cupom = db.query(
            "SELECT * FROM cupons WHERE codigo = %s FOR UPDATE",  # <- row lock
            codigo,
        )
        if cupom.usado:
            return {"erro": "cupom já utilizado"}, 400
        db.execute("UPDATE cupons SET usado = TRUE WHERE codigo = %s", codigo)
        # The credit must run on this same transaction, so a failure rolls back both writes.
        creditar_saldo(current_user, cupom.valor)
    return {"ok": True}

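Note: FOR UPDATE only serializes access when every code path that touches the row takes the same lock. Under READ COMMITTED (PostgreSQL’s default), the waiting transaction re-reads the locked row once the first one commits and sees usado = TRUE. Under REPEATABLE READ or SERIALIZABLE, it can fail with a serialization error instead, so be prepared to catch that error and retry the whole transaction.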
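Retrying a failed transaction can be sketched as a small wrapper. The names here are assumptions: `SerializationFailure` is a stand-in for whatever your driver raises for SQLSTATE 40001, and `run_with_retry` is a hypothetical helper, not part of any library.

```python
import time

class SerializationFailure(Exception):
    """Stand-in for the driver's serialization error (SQLSTATE 40001)."""

def run_with_retry(txn, attempts=3, backoff=0.01):
    """Run txn() and retry it from scratch when it loses a serialization conflict.

    txn must contain the WHOLE transaction (lock, check, update), so that each
    retry re-reads the current state instead of reusing a stale check.
    """
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == attempts:
                raise                      # give up: surface the error to the caller
            time.sleep(backoff * attempt)  # brief, growing pause before retrying
```

The important design choice is what gets retried: the whole check-then-act unit, never just the final write.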

2. Let the database enforce the limit atomically

Better still: eliminate the window by writing the condition inside the UPDATE itself, so that the database decides who wins. For a balance, this rules out a negative value entirely:

-- The UPDATE only affects rows that still satisfy the condition.
-- Under concurrency, a transaction gets rowcount = 1 only while the balance still covers the debit.
UPDATE contas
   SET saldo = saldo - 100
 WHERE id = 42
   AND saldo >= 100;       -- the check and the action are the SAME atomic operation

If rowcount = 0, there was no balance — reject. For single-use coupons, let the uniqueness constraint do the work: a resgates(cupom_id ... UNIQUE) table makes the second INSERT fail on a key violation, and the database never allows the double redemption, regardless of timing.
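Both database-enforced patterns can be tried locally with SQLite from the Python standard library. This is a sketch under assumptions: the column layout of `contas` and `resgates` (including `usuario_id`) is invented for illustration, and `debit` and `redeem_once` are hypothetical helpers.

```python
import sqlite3

def create_schema(conn):
    # Assumed schema, for illustration only.
    conn.executescript("""
        CREATE TABLE contas (id INTEGER PRIMARY KEY, saldo INTEGER NOT NULL);
        CREATE TABLE resgates (cupom_id INTEGER NOT NULL UNIQUE,
                               usuario_id INTEGER NOT NULL);
        INSERT INTO contas (id, saldo) VALUES (42, 150);
    """)

def debit(conn, conta_id, valor):
    """Conditional UPDATE: the balance check is part of the write itself."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE contas SET saldo = saldo - ? WHERE id = ? AND saldo >= ?",
            (valor, conta_id, valor),
        )
        return cur.rowcount == 1  # 0 rows touched means insufficient balance

def redeem_once(conn, cupom_id, usuario_id):
    """Single-use redemption: the UNIQUE constraint rejects the second INSERT."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO resgates (cupom_id, usuario_id) VALUES (?, ?)",
                (cupom_id, usuario_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Starting from a balance of 150, the first `debit(conn, 42, 100)` succeeds and the second is rejected, leaving 50. Neither function contains an `if` on previously read state, so there is nothing for a second request to race against.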

3. Optimistic locking (versioning)

When locking rows is expensive, use a version column. The write only counts if the version hasn’t changed; if it did, someone got there first and you repeat the read:

UPDATE pedidos
   SET status = 'pago', version = version + 1
 WHERE id = 7 AND version = 3;   -- fails (rowcount=0) if the version already advanced
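A version check with a bounded re-read loop might look like the following. It is a sketch using SQLite for convenience: the `pedidos` columns are assumed, and `read_order`, `pay_if_unchanged` and `mark_paid` are hypothetical names.

```python
import sqlite3

def create_schema(conn):
    # Assumed schema, for illustration only.
    conn.executescript("""
        CREATE TABLE pedidos (id INTEGER PRIMARY KEY, status TEXT NOT NULL,
                              version INTEGER NOT NULL);
        INSERT INTO pedidos (id, status, version) VALUES (7, 'pendente', 3);
    """)

def read_order(conn, pedido_id):
    """Return (status, version), or None if the order does not exist."""
    return conn.execute(
        "SELECT status, version FROM pedidos WHERE id = ?", (pedido_id,)
    ).fetchone()

def pay_if_unchanged(conn, pedido_id, seen_version):
    """Write only if the row still has the version we read earlier."""
    with conn:
        cur = conn.execute(
            "UPDATE pedidos SET status = 'pago', version = version + 1 "
            "WHERE id = ? AND version = ?",
            (pedido_id, seen_version),
        )
    return cur.rowcount == 1

def mark_paid(conn, pedido_id, max_retries=3):
    for _ in range(max_retries):
        row = read_order(conn, pedido_id)
        if row is None or row[0] == "pago":
            return False            # nothing to do, or someone already paid it
        if pay_if_unchanged(conn, pedido_id, row[1]):
            return True
        # The version advanced between our read and our write: re-read and retry.
    return False
```

Two writers that both read version 3 cannot both succeed: the first UPDATE moves the row to version 4, so the second matches zero rows and has to look at the new state before deciding anything.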

4. Idempotency keys for accidental duplication

To keep a double-click or a network retry from running the action twice, require a unique idempotency key per operation. The first request records the key (with a UNIQUE constraint); the second collides and receives the same result, without re-executing the effect:

POST /pagamentos HTTP/2
Idempotency-Key: 4e1c2a9f-7b3d-4f8a-9c10-2de5b1a07f33
Content-Type: application/json

{"valor": 100, "destino": "conta-99"}

Note that idempotency keys neutralize accidental duplication (retry/double-click), but they don’t replace the lock: an attacker simply sends a different key with each request. For limit-overrun, database atomicity (items 1–2) is the primary control.
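A server-side handler for such a key can be sketched as follows, again with SQLite standing in for the real database. The table layout and the names `create_idempotency_table`, `handle_payment` and `execute_effect` are assumptions made for illustration.

```python
import json
import sqlite3

def create_idempotency_table(conn):
    # PRIMARY KEY gives the uniqueness guarantee on the key column.
    conn.execute(
        "CREATE TABLE idempotency (key TEXT PRIMARY KEY, response TEXT NOT NULL)"
    )

def handle_payment(conn, key, payload, execute_effect):
    """Run execute_effect(payload) at most once per key; replay the stored result."""
    try:
        with conn:  # key reservation, effect and stored response commit together
            conn.execute(
                "INSERT INTO idempotency (key, response) VALUES (?, '')", (key,)
            )
            response = execute_effect(payload)
            conn.execute(
                "UPDATE idempotency SET response = ? WHERE key = ?",
                (json.dumps(response), key),
            )
        return response
    except sqlite3.IntegrityError:
        # Key already recorded: return the original result without re-executing.
        row = conn.execute(
            "SELECT response FROM idempotency WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0])
```

Sending the same key twice executes the effect once. Sending a fresh key executes it again, which is exactly why the key protects against retries but not against an attacker.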

5. Defense in depth

Distributed locks (e.g., Redlock on Redis) and atomic counters (INCR) protect state outside the relational database. Treat distributed locks as a coordination convenience, not as a guarantee of correctness under failure: the safety of algorithms like Redlock is debated and depends on timing assumptions. To guarantee uniqueness, always prefer the transactional arbiter (constraint/atomic UPDATE) at the source of truth. Keep OTP/password rate-limiting robust by using an atomic increment in the store rather than a read-modify-write.
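The difference between the two counter styles fits in a few lines. The sketch below uses an in-process `CounterStore` as a stand-in for the real store: its `incr` mimics an atomic increment that returns the new value, as Redis INCR does. The class, the function names and the key format are hypothetical.

```python
import threading

class CounterStore:
    """In-process stand-in for a key-value store with an atomic increment."""
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key, 0)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def incr(self, key):
        # Read, add and write happen under one lock; the NEW value is returned.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]

def allow_racy(store, key, limit=3):
    # VULNERABLE: read-modify-write. Concurrent callers can all read the same count.
    count = store.get(key)
    if count >= limit:
        return False
    store.set(key, count + 1)
    return True

def allow_atomic(store, key, limit=3):
    # Increment first, then decide from the value the store hands back.
    return store.incr(key) <= limit
```

With `allow_atomic`, every concurrent caller receives a distinct counter value, so exactly `limit` of them are allowed no matter how many arrive together. Counter expiry is left out of the sketch.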

Never trust if (recurso.disponivel) { usar(recurso); } in memory. Between the if and the usar, another request may have already “used” the same resource. The lock has to wrap both steps.

Mitigation checklist

Eliminate the in-memory check-then-act pattern; check and act within the same transaction.
Use SELECT ... FOR UPDATE (pessimistic lock) on rows for limited resources, with retry handling for serialization errors.
Enforce the limit in the UPDATE ... WHERE saldo >= X itself and validate the rowcount.
Add UNIQUE constraints in the database for single-use actions (redemptions, votes).
Optimistic locking (a version column) where locking rows is costly.
Idempotency keys with UNIQUE to neutralize duplication from retry/double-click (complements, doesn’t replace, the lock).
Atomic counters (INCR) and distributed locks for cached/Redis state — as a layer, not as the sole guarantee.
Race-resistant OTP/password rate-limiting (atomic increment, not read-modify-write).
Test concurrency in CI: fire the action in parallel and assert that the limit holds.
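The last checklist item can be sketched as a small harness. The names `fire_in_parallel` and `assert_limit_holds` are hypothetical, and `action` is assumed to be any callable that performs the limited operation once (for example, an HTTP call against a test environment) and returns True on success.

```python
import threading

def fire_in_parallel(action, copies):
    """Call action() from `copies` threads released at the same instant."""
    start = threading.Barrier(copies)
    results = [None] * copies

    def run(i):
        start.wait()           # hold every thread until all are ready
        results[i] = action()

    threads = [threading.Thread(target=run, args=(i,)) for i in range(copies)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def assert_limit_holds(action, copies, limit):
    """Fail the test if more than `limit` parallel calls report success."""
    successes = sum(1 for ok in fire_in_parallel(action, copies) if ok)
    assert successes <= limit, f"{successes} successes, limit is {limit}"
    return successes
```

A test for the coupon endpoint would then be a single call such as `assert_limit_holds(redeem, copies=20, limit=1)`, where `redeem` is your own function. Threads on one machine are a coarser trigger than a single-packet burst, so a passing run is evidence, not proof, that the limit holds.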

Race conditions are a reminder that “the code is logically correct” and “the code is correct under concurrency” are different claims. The window between checking and using is invisible in any sequential reading of the code — it exists only in time. The defender has to close it in the only layer that sees all requests at once: the database. The attacker only needs one well-timed packet.