Sweepers
The fulfillment subsystem is backed by two background sweepers that
ensure no order is left silently broken by a process crash or a
half-finished checkout. Both run in the same Node.js process as the API
server and are scheduled by @nestjs/schedule.
Stuck-order sweeper
Schedule: every minute.
Symptom it fixes: the payment succeeded but the order is still
INITIATED or PENDING more than 5 minutes later. This happens when
the process dies after writing payments.status = 'SUCCEEDED' but before
the fire-and-forget call to FulfillmentService.fulfillOrder completes.
What it does:
- Finds up to 25 such orders per tick (indexed query).
- Re-runs fulfillOrder for each.
- If the re-run throws, transitions the order to NEEDS_ATTENTION and
  raises a fulfillment_failed admin alert with payload
  { source: 'fulfillment-sweeper', error: '<message>' }.
The sweeper is idempotent: once an order has moved out of INITIATED or
PENDING, running the sweeper on it again is a no-op.
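The tick described above can be sketched as follows. This is a minimal in-memory illustration, not the actual FulfillmentSweeperService code: the Order and AdminAlert shapes, the paymentSucceededAt field, and passing fulfillOrder in as a callback are all assumptions made so the example runs on its own. The real service would use an indexed database query instead of filtering an array.

```typescript
// Hypothetical in-memory model; field names are assumptions for illustration.
type OrderStatus = "INITIATED" | "PENDING" | "FULFILLED" | "NEEDS_ATTENTION";

interface Order {
  id: string;
  status: OrderStatus;
  paymentStatus: "SUCCEEDED" | "PENDING" | "FAILED";
  paymentSucceededAt: Date; // assumed timestamp used to measure the grace period
}

interface AdminAlert {
  type: "fulfillment_failed";
  orderId: string;
  payload: { source: "fulfillment-sweeper"; error: string };
}

const STUCK_AFTER_MINUTES = 5;
const MAX_PER_RUN = 25;

// Paid orders still INITIATED or PENDING after the grace period, capped per tick.
function findStuckOrders(orders: Order[], now: Date): Order[] {
  const cutoff = now.getTime() - STUCK_AFTER_MINUTES * 60 * 1000;
  return orders
    .filter(
      (o) =>
        o.paymentStatus === "SUCCEEDED" &&
        (o.status === "INITIATED" || o.status === "PENDING") &&
        o.paymentSucceededAt.getTime() < cutoff,
    )
    .slice(0, MAX_PER_RUN);
}

// One sweeper tick: re-run fulfillment, escalate on failure.
// Returns the number of orders picked up in this tick.
async function sweepStuckOrders(
  orders: Order[],
  fulfillOrder: (order: Order) => Promise<void>,
  alerts: AdminAlert[],
  now: Date = new Date(),
): Promise<number> {
  const stuck = findStuckOrders(orders, now);
  for (const order of stuck) {
    try {
      await fulfillOrder(order);
    } catch (err) {
      // The re-run threw: park the order and raise an admin alert.
      order.status = "NEEDS_ATTENTION";
      alerts.push({
        type: "fulfillment_failed",
        orderId: order.id,
        payload: {
          source: "fulfillment-sweeper",
          error: err instanceof Error ? err.message : String(err),
        },
      });
    }
  }
  return stuck.length;
}
```

In this sketch every order a tick touches leaves INITIATED or PENDING, provided a successful fulfillOrder call moves it to a later status. A second tick then selects none of them, which is what the idempotency guarantee relies on.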
Stale-reservation sweeper
Schedule: every 5 minutes.
Symptom it fixes: inventory_movements ORDER_RESERVE rows older
than 30 minutes that never got a paired ORDER_CONFIRM or
ORDER_RELEASE. This happens when a checkout is abandoned after locking
voucher rows but before the gateway callback fires.
What it does: calls InventoryService.sweepStaleReservations(30),
which releases the reserved voucher rows back to available and writes
an audit ORDER_RELEASE movement row with the note "stale-reservation sweep".
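The release logic can be illustrated with a small in-memory stand-in. It is a sketch under stated assumptions, not the real InventoryService.sweepStaleReservations: the InventoryMovement shape, pairing rows by orderId and voucherId, and the voucher status map are hypothetical, and the real method works against database tables.

```typescript
// Hypothetical shapes; the real inventory_movements schema is not shown in this document.
type MovementType = "ORDER_RESERVE" | "ORDER_CONFIRM" | "ORDER_RELEASE";
type VoucherStatus = "AVAILABLE" | "RESERVED" | "SOLD";

interface InventoryMovement {
  orderId: string;
  voucherId: string;
  type: MovementType;
  createdAt: Date;
  note?: string;
}

// Releases reservations older than staleAfterMinutes that have no paired
// ORDER_CONFIRM or ORDER_RELEASE. Returns the number of reservations released.
function sweepStaleReservations(
  movements: InventoryMovement[],
  voucherStatus: Record<string, VoucherStatus>,
  staleAfterMinutes: number,
  now: Date = new Date(),
): number {
  const cutoff = now.getTime() - staleAfterMinutes * 60 * 1000;

  // Assumption: a reservation is "paired" when a confirm or release row
  // exists for the same order and voucher.
  const isSettled = (reserve: InventoryMovement): boolean =>
    movements.some(
      (m) =>
        m.type !== "ORDER_RESERVE" &&
        m.orderId === reserve.orderId &&
        m.voucherId === reserve.voucherId,
    );

  const stale = movements.filter(
    (m) =>
      m.type === "ORDER_RESERVE" &&
      m.createdAt.getTime() < cutoff &&
      !isSettled(m),
  );

  for (const reserve of stale) {
    // Return the voucher to the pool and record an audit row.
    voucherStatus[reserve.voucherId] = "AVAILABLE";
    movements.push({
      orderId: reserve.orderId,
      voucherId: reserve.voucherId,
      type: "ORDER_RELEASE",
      createdAt: now,
      note: "stale-reservation sweep",
    });
  }
  return stale.length;
}
```

Because the audit ORDER_RELEASE row itself pairs the reservation, a second run over the same data finds nothing left to release.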
Tuning
Defaults live as constants on FulfillmentSweeperService:
| Constant | Default | Effect |
|---|---|---|
| STUCK_AFTER_MINUTES | 5 | Grace period before re-triggering fulfillment. |
| MAX_PER_RUN | 25 | Hard ceiling per minute (back-pressure safety). |
| RESERVATION_STALE_MINUTES | 30 | Reservation TTL before stale-sweep releases it. |
Reduce STUCK_AFTER_MINUTES for snappier recovery; raise
MAX_PER_RUN only if your DB can handle the additional fulfillment
load per minute.
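When weighing a change to MAX_PER_RUN, it can help to estimate how long a backlog of stuck orders takes to clear. The helper below is a hypothetical back-of-the-envelope calculation and not part of the service. It assumes each order is handled once (fulfilled or moved to NEEDS_ATTENTION) and that no new orders become stuck in the meantime.

```typescript
// Hypothetical helper: minutes of sweeper ticks needed to work through a backlog.
function minutesToDrain(
  backlog: number,
  maxPerRun: number,
  intervalMinutes: number = 1,
): number {
  // Each tick handles at most maxPerRun orders, so round the tick count up.
  return Math.ceil(backlog / maxPerRun) * intervalMinutes;
}

// With the defaults, 100 stuck orders need 4 ticks, i.e. about 4 minutes.
const defaultDrain = minutesToDrain(100, 25);
// Doubling MAX_PER_RUN halves the drain time but doubles the per-minute load.
const doubledDrain = minutesToDrain(100, 50);
```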