0045: Pruning Soft-Deleted Offers to Improve Query Performance
STATUS
Accepted
CONTEXT
The PostgreSQL offers table in Offer API contains ~5.1 million rows, of which roughly 96% are soft-deleted (deleted_at IS NOT NULL).
Here is a graph representing live vs. soft-deleted Offers at the time of writing this ADR.
Laravel queries on active offers automatically include WHERE deleted_at IS NULL, meaning each read must filter through a bloated table and associated indexes.
As the table grows:
- Query latency increases due to larger scans and index lookups.
- Cache effectiveness decreases because most rows are irrelevant.
- VACUUM/ANALYZE workloads increase.
- The operational table stops behaving like “live data” and instead becomes a long-term log.
We must decide:
- Whether pruning old soft-deleted rows will materially improve performance.
- If pruning is recommended, what retention window to use.
- Whether pruned rows should be archived.
- How frequently pruning should occur.
How Soft-Deleted Offers Are Currently Accessed
In Offer API, there are two routes that allow lookups for soft-deleted/trashed Offers.
GET /v1/admin/offers/{offer_id}
This route allows soft-deleted Offers to be accessed by their snowflake ID. It is an internal route used exclusively by the adgem/api project during click processing when creating transactions.
This is the most critical dependency we must address. The business rule is: if a player saw an offer, we honor their payout even if the offer was later soft-deleted.
Current usage of this functionality is infrequent, but common enough that pruning old offers could impact some publishers. Here is a graph representing the number of clicks on Offers that had been soft-deleted for longer than their attribution window allows.
Note: Jan 12 and Jan 20 represent partial days of data collection.
Note: During this time period, only 2 apps were seen doing this.
GET /v1/offer_lookup/{offer:snowflake_id}
This endpoint also allows access to soft-deleted Offers. It is used as a backoffice lookup tool by the adgem/new_dashboard project. Pruning old offers would not break this functionality.
However, if we archive old offers rather than fully deleting them, should this lookup tool also query the archive? That would require additional work.
Considered Options
Option 1 — Add or Migrate to Partial Indexes
Because only ~4% of rows are "alive", most of our important indexes should be partial so they only index active rows. That way:
- Index size ≈ 4% of full-table index
- Active queries touch much smaller structures
Example:

```sql
CREATE INDEX CONCURRENTLY idx_live_offers_adgem_app_id
ON offers (adgem_app_id)
WHERE deleted_at IS NULL;
```
We have already tested this approach with the following index:

```sql
CREATE INDEX IF NOT EXISTS offers_campaign_id_partial_index
ON offers (campaign_id)
WHERE deleted_at IS NULL;
```
We saw immediate improvements on the v2 offers index endpoint after deployment. We should also analyze whether existing indexes should be migrated to partial indexes.
Option 2 — Hard-delete soft-deleted rows older than a defined retention window
Prune (hard-delete) soft-deleted Offers once they are older than a defined retention window. Columns like attribution_window_days and deleted_at can be used to determine when an Offer stopped being relevant. Example: prune soft-deleted Offers 3 months after their attribution window ends.
This implies a policy change: conversions are honored only within the offer's attribution window; conversions outside that window are not.
If an offer outside this window has been pruned (hard-deleted) and cannot be found, the player sees an "Offer Unavailable" page and their payout is not completed.
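The prune predicate implied above could be sketched as follows. This is a sketch only: the column names come from this ADR, but the assumption that attribution_window_days is an integer column and deleted_at a timestamp should be verified against the real schema.

```sql
-- Sketch: hard-delete offers whose attribution window ended
-- more than 3 months ago. Assumes attribution_window_days is an
-- integer column and deleted_at is a timestamp.
DELETE FROM offers
WHERE deleted_at IS NOT NULL
  AND deleted_at
      + make_interval(days => attribution_window_days)
      + interval '3 months' < now();
```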
Here are a few options for retention windows and their impact on the table.
Total Offers as of January 14, 2026: 5,167,521

| Retention Window | Rows Deleted | Rows Remaining |
|---|---|---|
| 0 days after soft-delete | 4,979,247 | 188,274 (~96% reduction) |
| 30 days after soft-delete | 4,967,250 | 200,271 (~96% reduction) |
| 90 days after soft-delete | 4,183,462 | 984,059 (~81% reduction) |
| 180 days after soft-delete | 3,663,280 | 1,504,241 (~71% reduction) |
| 365 days after soft-delete | 3,377,338 | 1,790,183 (~65% reduction) |

Note: 90 days is the maximum attribution window for Offers; the average is 30 days.
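Row counts of the shape shown above can be reproduced with a query roughly like the following (a sketch, shown here for the 90-day window):

```sql
-- Count rows that would be deleted under a 90-day retention window.
SELECT count(*) AS rows_to_delete
FROM offers
WHERE deleted_at IS NOT NULL
  AND deleted_at < now() - interval '90 days';
```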
Option 3 — Archive soft-deleted rows to a separate database, then hard-delete from the main table
Same as Option 2, but first archive the Offers to a separate database.
The archived Offers should live in a data warehouse, not a regular PostgreSQL application table.
DECISION
Retention Window
We will use a 90-day retention window after soft-deletion. This aligns with the maximum attribution window for Offers, ensuring that any valid conversion can still be processed. Based on the data analysis, this results in an ~81% reduction in table size (from 5.1M to ~984K rows).
Handling the Critical Dependency
The GET /v1/admin/offers/{offer_id} endpoint used during click processing is the most critical dependency. After pruning:
- Clicks on offers that have been soft-deleted for less than 90 days will continue to work as expected.
- Clicks on offers that have been soft-deleted for more than 90 days will return a 404, and the player will see an "Offer Unavailable" page. Their payout will not be completed.
This is an acceptable business trade-off. The data shows that clicks in this scenario peaked at ~520-540/day (Jan 12-13) from 2 apps, then dropped to single digits by Jan 15 and remained low through Jan 20. These clicks are already outside the offer's attribution window and should not be honored. Publishers exhibiting this behavior should be notified of the policy.
Phase I: Use Partial Indexes Where Applicable
Whenever we find the need to create indexes on the Offers table, we should create them as partial indexes. This has been done already for the latest index we created and should be done for all future indexes when applicable.
We will also create tasks to analyze the current indexes and migrate them to partial indexes, which can dramatically reduce their size and improve their performance on the bloated table.
Phase II: Archive Old Offers to Data Warehouse
Before hard-deleting, we will copy offers outside the retention window to a separate database in our data warehouse. This phase is required and must be completed before Phase III begins.
Criteria for completion:
- Archived offers are queryable in Metabase for debugging and auditing purposes.
- The dashboard offer lookup tool (GET /v1/offer_lookup/{offer:snowflake_id}) is updated to query the archive as a fallback when an offer is not found in the main table.
This ensures we maintain historical data for compliance, debugging, and business intelligence purposes.
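If the warehouse is reachable directly from PostgreSQL (e.g., through a postgres_fdw foreign table — an assumption; the actual pipeline may be an ETL job instead), the archival copy could be sketched as:

```sql
-- Sketch: copy prune candidates into the archive before deletion.
-- archive_offers is a hypothetical foreign/archive table name;
-- idempotency (re-running safely) is left to the archival job.
INSERT INTO archive_offers
SELECT *
FROM offers
WHERE deleted_at IS NOT NULL
  AND deleted_at < now() - interval '90 days';
```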
Phase III: Hard-Delete Old Offers from Main Table
We will hard-delete offers outside our 90-day retention window from the main table. This will be done via a scheduled Artisan command that:
- Runs daily during off-peak hours (e.g., 3 AM UTC).
- Processes deletions in batches of 10,000 rows to avoid lock contention.
- Logs the number of rows deleted per run for monitoring.
- Can be triggered manually for initial catch-up.
Once the backlog is cleared, the job frequency can be reduced to weekly.
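Each batch of the scheduled job could take roughly this shape, keeping every transaction small (a sketch; using id as the primary key is an assumption):

```sql
-- One batch of the pruning job: delete at most 10,000 candidates,
-- keyed by primary key to keep the lock footprint small.
DELETE FROM offers
WHERE id IN (
  SELECT id
  FROM offers
  WHERE deleted_at IS NOT NULL
    AND deleted_at < now() - interval '90 days'
  ORDER BY id
  LIMIT 10000
);
-- The Artisan command would repeat this until 0 rows are affected,
-- sleeping briefly between batches.
```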
Rollback Strategy
If issues are discovered after pruning:
- Within the archive retention period: Restore affected offers from the data warehouse archive back to the main table.
- Pruning job issues: The job can be disabled immediately via feature flag or by removing it from the scheduler. No automatic rollback of deletions is possible once committed, which is why Phase II archival is mandatory.
- Emergency stop: The Artisan command will include a --dry-run flag to preview deletions without executing them.
This will keep our offers table lean and improve database performance.
CONSEQUENCES
Positive
- Smaller main table and indexes, improving query speed for all active-offer queries.
- Reduced I/O and better memory/cache performance.
- More predictable operational footprint and improved long-term scalability.
Negative
- Soft-deleted Offers that have been archived must be retrieved separately, from the data warehouse.
- If any legacy workflow depends on old soft-deleted records existing in the main table, it may break silently.
- The initial cleanup may be resource-heavy and must be chunked to avoid long locks.
Risks and Mitigations
| Risk | Mitigation |
|---|---|
| Complex data recovery after the retention window | Phase II archival is mandatory; archived data can be restored from the data warehouse if needed. |
| Complex operational costs for archival strategy | Use existing data warehouse infrastructure; limit archive queries to Metabase and the lookup tool fallback. |
| Undocumented dependencies on old soft-deleted rows | Announce the change internally before rollout; monitor error logs for unexpected 404s on offer lookups. |
| Misconfigured pruning job deletes too aggressively | Implement --dry-run flag; alert on deletion counts exceeding expected thresholds; require Phase II completion before Phase III. |
| Lock contention from large pruning batches | Limit batch size to 10,000 rows; schedule during off-peak hours (3 AM UTC); monitor query latency during runs. |
NOTES
- After rollout, table metrics (relation size, index size, query latency) should be tracked for 1–2 weeks.
- Retention window can be adjusted later based on observed usage and business needs.
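The table and index metrics mentioned above can be tracked with standard PostgreSQL size functions, for example:

```sql
-- Relation, index, and total on-disk sizes for the offers table.
SELECT
  pg_size_pretty(pg_relation_size('offers'))       AS table_size,
  pg_size_pretty(pg_indexes_size('offers'))        AS index_size,
  pg_size_pretty(pg_total_relation_size('offers')) AS total_size;
```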
References
- Laravel 11.x Soft Delete / Pruning documentation
- Efficient Soft-Delete Models Without Query Degradation
- PostgreSQL Documentation: Partial Indexes
- DELETEs are difficult: soft deletes are not the solution
- PR #92: feat(AGPI-1537): Pruning Soft-Deleted Offers to Improve Query Performance
- PR #127: docs: backfill PR reference links for existing ADRs
Original Author
Luiz Bueno