Sweepers

The fulfillment subsystem is backed by two background sweepers that ensure no order is left silently broken by a process crash or a half-finished checkout. They run in the same Node.js process as the API server and are scheduled by @nestjs/schedule.

Stuck-order sweeper

Schedule: every minute.

Symptom it fixes: payment succeeded but the order is still INITIATED or PENDING more than 5 minutes later. This happens when the process dies after writing payments.status = 'SUCCEEDED' but before the fire-and-forget call to FulfillmentService.fulfillOrder completes.

What it does:

  1. Finds up to 25 such orders per tick (indexed query).
  2. Re-runs fulfillOrder for each.
  3. If the re-run throws, transitions the order to NEEDS_ATTENTION and raises a fulfillment_failed admin alert with payload { source: 'fulfillment-sweeper', error: '<message>' }.

The sweeper is idempotent: once an order moves out of INITIATED or PENDING, later ticks no longer select it, so running the sweeper twice in a row on the same order is a no-op.
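The three steps above can be sketched as a single tick function. This is a minimal in-memory illustration, not the actual FulfillmentSweeperService code: the Order and AdminAlert shapes, the paidAt field used to measure the 5-minute window, and the findStuckOrders and sweepStuckOrders helpers are all assumptions made for the example, and the array filter stands in for the indexed database query.

```typescript
type OrderStatus = 'INITIATED' | 'PENDING' | 'FULFILLED' | 'NEEDS_ATTENTION';

// Hypothetical order shape; the real schema may differ.
interface Order {
  id: string;
  status: OrderStatus;
  paymentStatus: 'PENDING' | 'SUCCEEDED' | 'FAILED';
  paidAt: Date; // assumed reference point for the grace period
}

// Hypothetical alert shape; only the payload fields come from the docs.
interface AdminAlert {
  type: 'fulfillment_failed';
  orderId: string;
  payload: { source: 'fulfillment-sweeper'; error: string };
}

const STUCK_AFTER_MINUTES = 5;
const MAX_PER_RUN = 25;

// Step 1: paid orders still INITIATED or PENDING after the grace period,
// capped at MAX_PER_RUN. Stand-in for the indexed query.
function findStuckOrders(orders: Order[], now: Date): Order[] {
  const cutoff = now.getTime() - STUCK_AFTER_MINUTES * 60 * 1000;
  return orders
    .filter(
      (o) =>
        o.paymentStatus === 'SUCCEEDED' &&
        (o.status === 'INITIATED' || o.status === 'PENDING') &&
        o.paidAt.getTime() < cutoff,
    )
    .slice(0, MAX_PER_RUN);
}

// One sweeper tick. Returns how many orders were picked up.
async function sweepStuckOrders(
  orders: Order[],
  fulfillOrder: (order: Order) => Promise<void>,
  alerts: AdminAlert[],
  now: Date = new Date(),
): Promise<number> {
  const stuck = findStuckOrders(orders, now);
  for (const order of stuck) {
    try {
      // Step 2: re-run fulfillment.
      await fulfillOrder(order);
    } catch (err) {
      // Step 3: park the order and raise an admin alert.
      order.status = 'NEEDS_ATTENTION';
      alerts.push({
        type: 'fulfillment_failed',
        orderId: order.id,
        payload: {
          source: 'fulfillment-sweeper',
          error: err instanceof Error ? err.message : String(err),
        },
      });
    }
  }
  return stuck.length;
}
```

In this sketch both outcomes move the order out of INITIATED or PENDING (a successful fulfillOrder is assumed to update the status itself), so a second tick finds nothing to do for that order.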

Stale-reservation sweeper

Schedule: every 5 minutes.

Symptom it fixes: inventory_movements ORDER_RESERVE rows older than 30 minutes that never got a paired ORDER_CONFIRM or ORDER_RELEASE. This happens when a checkout is abandoned after locking voucher rows but before the gateway callback fires.

What it does: calls InventoryService.sweepStaleReservations(30), which releases the reserved voucher rows back to available and writes an audit ORDER_RELEASE movement row with the note "stale-reservation sweep".
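The release logic can be illustrated with an in-memory model. The sketch below is not the real InventoryService: the InventoryMovement and Voucher shapes, the extra parameters, and the rule that a reservation is "paired" when a later row shares its orderId and voucherId are assumptions chosen to make the example self-contained.

```typescript
type MovementType = 'ORDER_RESERVE' | 'ORDER_CONFIRM' | 'ORDER_RELEASE';

// Hypothetical row shapes; the real tables may differ.
interface InventoryMovement {
  orderId: string;
  voucherId: string;
  type: MovementType;
  createdAt: Date;
  note?: string;
}

interface Voucher {
  id: string;
  status: 'available' | 'reserved' | 'sold';
}

// Releases reservations older than staleMinutes that have no paired
// ORDER_CONFIRM or ORDER_RELEASE. Returns the number released.
function sweepStaleReservations(
  staleMinutes: number,
  movements: InventoryMovement[],
  vouchers: Map<string, Voucher>,
  now: Date = new Date(),
): number {
  const cutoff = now.getTime() - staleMinutes * 60 * 1000;
  const key = (m: InventoryMovement) => `${m.orderId}:${m.voucherId}`;

  // Reservations that already have a CONFIRM or RELEASE row are settled.
  const settled = new Set(
    movements.filter((m) => m.type !== 'ORDER_RESERVE').map(key),
  );

  const stale = movements.filter(
    (m) =>
      m.type === 'ORDER_RESERVE' &&
      m.createdAt.getTime() < cutoff &&
      !settled.has(key(m)),
  );

  for (const reserve of stale) {
    // Return the voucher to the available pool.
    const voucher = vouchers.get(reserve.voucherId);
    if (voucher && voucher.status === 'reserved') {
      voucher.status = 'available';
    }
    // Audit row; it also pairs the reservation so later sweeps skip it.
    movements.push({
      orderId: reserve.orderId,
      voucherId: reserve.voucherId,
      type: 'ORDER_RELEASE',
      createdAt: now,
      note: 'stale-reservation sweep',
    });
  }
  return stale.length;
}
```

Because the audit row itself counts as the paired ORDER_RELEASE, running the sweep again over the same data releases nothing further.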

Tuning

Defaults live as constants on FulfillmentSweeperService:

Constant                     Default  Effect
STUCK_AFTER_MINUTES          5        Grace period before re-triggering fulfillment.
MAX_PER_RUN                  25       Hard ceiling per minute (back-pressure safety).
RESERVATION_STALE_MINUTES    30       Reservation TTL before stale-sweep releases it.

Reduce STUCK_AFTER_MINUTES for snappier recovery; raise MAX_PER_RUN only if your DB can handle the additional fulfillment load per minute.
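To get a feel for the trade-off, the rough estimate below computes an upper bound on how long the last order in a backlog waits before fulfillment is re-triggered. It is a back-of-the-envelope sketch, not part of FulfillmentSweeperService: SweeperTuning and worstCaseRecoveryMinutes are hypothetical names, and the model assumes one run per minute, that every order in the backlog got stuck at the same moment, that each run finishes within its minute, and that no new orders get stuck in the meantime.

```typescript
// Hypothetical tuning shape mirroring the two constants discussed above.
interface SweeperTuning {
  stuckAfterMinutes: number;
  maxPerRun: number;
}

const defaults: SweeperTuning = { stuckAfterMinutes: 5, maxPerRun: 25 };

// Upper bound, in minutes after payment, before the last order of the
// backlog is picked up: the grace period, up to one minute waiting for
// the next tick, then one further minute per additional run.
function worstCaseRecoveryMinutes(backlog: number, tuning: SweeperTuning): number {
  if (backlog <= 0) return 0;
  const runsNeeded = Math.ceil(backlog / tuning.maxPerRun);
  return tuning.stuckAfterMinutes + runsNeeded;
}

// 100 stuck orders with the defaults: 5 + ceil(100 / 25) = 9 minutes.
const withDefaults = worstCaseRecoveryMinutes(100, defaults);

// Doubling maxPerRun halves the number of runs: 5 + 2 = 7 minutes.
const withLargerBatch = worstCaseRecoveryMinutes(100, { ...defaults, maxPerRun: 50 });
```

Under this model, lowering STUCK_AFTER_MINUTES shortens recovery for every order, while raising MAX_PER_RUN only helps when the backlog exceeds a single run.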