Replay jobs and audit
The current console exposes both replay execution and administrative history.
What replay is for
Replay lets you re-run historical traffic through a chosen encoding version and rule version.
This is useful when you want to answer questions like:
- What would have happened if this rule had existed earlier?
- Did my encoding change improve or break interpretation of old traffic?
- Can I test safely before depending only on new live events?
Replay jobs
Replay jobs are listed with:
- `job_id`
- `job_type`
- `encoding_version`
- `rule_version`
- `time_range_start`
- `time_range_end`
- `status`
- `events_processed`
- `events_failed`
- `created_at`
- `completed_at`
What each replay field means
| Field | Plain-English meaning | Why you care |
|---|---|---|
| `job_id` | The unique ID for this replay run | Useful when discussing a specific job |
| `job_type` | What kind of replay work Esper should perform | Controls the purpose of the run |
| `encoding_version` | Which encoding plan version to use | Lets you test a specific translation layer |
| `rule_version` | Which rule set version to use | Lets you test a specific policy state |
| `time_range_start` | Beginning of the historical window | Defines which past traffic is included |
| `time_range_end` | End of the historical window | Defines where the replay stops |
| `status` | Current state of the replay | Tells you whether to wait, inspect, or retry |
| `events_processed` | How many events were handled successfully | Main progress and throughput signal |
| `events_failed` | How many events could not be processed | Main signal that something may be wrong |
| `created_at` | When the replay was requested | Useful for operator history |
| `completed_at` | When the replay finished, if it has finished | Helps measure duration |
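As a rough illustration of how these fields fit together, the sketch below models one replay job row and derives its duration from `created_at` and `completed_at`. The Python types (string IDs, integer versions, `datetime` timestamps) and the `ReplayJob` class itself are assumptions made for the example, not a documented Esper schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ReplayJob:
    """One replay job row; field names follow the table above, types are assumed."""
    job_id: str
    job_type: str
    encoding_version: int
    rule_version: int
    time_range_start: datetime
    time_range_end: datetime
    status: str
    events_processed: int
    events_failed: int
    created_at: datetime
    completed_at: Optional[datetime] = None  # unset until the job has finished

    def duration(self) -> Optional[timedelta]:
        """Time from request to completion, or None if the job has not finished."""
        if self.completed_at is None:
            return None
        return self.completed_at - self.created_at


# Hypothetical sample values, for illustration only.
requested = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
job = ReplayJob(
    job_id="job-123",
    job_type="DryRun",
    encoding_version=1,
    rule_version=1,
    time_range_start=requested - timedelta(hours=2),
    time_range_end=requested - timedelta(hours=1),
    status="Complete",
    events_processed=980,
    events_failed=20,
    created_at=requested,
    completed_at=requested + timedelta(minutes=5),
)
```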
Supported job types:
- `DryRun`
- `StateRebuild`
- `DecisionOnly`
How to think about them:
| Job type | Best use |
|---|---|
| `DryRun` | Safest first test of a proposed change |
| `StateRebuild` | Recompute saved state from historical traffic |
| `DecisionOnly` | Focus on decision outcomes rather than broader state rebuilds |
Supported statuses:
- `Pending`
- `Running`
- `Complete`
- `Failed`
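The field table notes that `status` tells you whether to wait, inspect, or retry. One plausible reading of that is sketched below; the mapping from each status to an operator action is an assumption for illustration, and `ReplayStatus` and `next_step` are hypothetical names rather than part of the product.

```python
from enum import Enum


class ReplayStatus(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETE = "Complete"
    FAILED = "Failed"


def next_step(status: str) -> str:
    """Suggest an operator action for a status string (assumed mapping)."""
    parsed = ReplayStatus(status)  # raises ValueError for an unknown status
    if parsed in (ReplayStatus.PENDING, ReplayStatus.RUNNING):
        return "wait"      # the job has not finished yet
    if parsed is ReplayStatus.COMPLETE:
        return "inspect"   # review decisions and event counts
    return "retry"         # Failed: investigate, then consider re-launching
```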
Launching a replay job
The current form submits:
POST /tenants/{tenant_id}/replay-jobs
Required fields:
- `job_type`
- `encoding_version`
- `rule_version`
- `time_range_start`
- `time_range_end`
The app currently initializes the form with UTC timestamps and version 1 for
both encoding and rules.
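A minimal sketch of assembling that request follows. It only builds the path and JSON-ready body and does not send anything. The ISO 8601 timestamp format, the client-side range check, and the `build_replay_request` helper are assumptions; the endpoint path, required field names, and the default of version 1 come from the description above.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_FIELDS = (
    "job_type",
    "encoding_version",
    "rule_version",
    "time_range_start",
    "time_range_end",
)


def build_replay_request(tenant_id, job_type, start, end,
                         encoding_version=1, rule_version=1):
    """Return (path, body) for a replay job submission (hypothetical helper)."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    if end <= start:
        # Client-side sanity check; the server's own validation is not documented here.
        raise ValueError("time_range_end must be after time_range_start")
    body = {
        "job_type": job_type,
        "encoding_version": encoding_version,
        "rule_version": rule_version,
        # Assumed wire format: ISO 8601 strings in UTC.
        "time_range_start": start.astimezone(timezone.utc).isoformat(),
        "time_range_end": end.astimezone(timezone.utc).isoformat(),
    }
    return f"/tenants/{tenant_id}/replay-jobs", body


# Example: a one-hour DryRun window ending now, for a hypothetical tenant ID.
now = datetime.now(timezone.utc)
path, body = build_replay_request("tenant-a", "DryRun", now - timedelta(hours=1), now)
```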
Good first replay habit:
- use a short time window first
- prefer `DryRun` before heavier replay modes
- compare `events_processed` and `events_failed` before trusting the results
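One simple way to make that comparison is a failure ratio. The sketch below assumes, per the field table, that `events_processed` counts only successfully handled events, so the total is the sum of the two counters. The `failure_rate` helper is hypothetical, and any acceptable threshold is a judgment call for your team rather than something the product defines.

```python
def failure_rate(events_processed: int, events_failed: int) -> float:
    """Fraction of replayed events that failed; 0.0 when nothing was replayed."""
    total = events_processed + events_failed
    if total == 0:
        return 0.0
    return events_failed / total


# Example with made-up counters: 25 failures out of 100 events.
rate = failure_rate(events_processed=75, events_failed=25)
```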
Audit history
The audit page lists tenant-scoped entries with:
- `audit_entry_id`
- `actor`
- `action`
- `recorded_at`
The overview page also uses audit entry count as part of the tenant’s operational posture summary.
What each audit field means
| Field | Plain-English meaning | Why you care |
|---|---|---|
| `audit_entry_id` | Unique record of an admin action | Useful when tracing specific changes |
| `actor` | Who performed the action | Answers ownership questions |
| `action` | What changed | Quick summary of the mutation |
| `recorded_at` | When it happened | Helps line up a change with later outcomes |
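To line up changes with later outcomes, it can help to pull out the audit entries recorded up to a given moment, such as the time a replay job was requested. The sketch below shows one way to do that over already-fetched entries; the `AuditEntry` class, its Python types, and the `changes_before` helper are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class AuditEntry:
    """One audit row; field names follow the table above, types are assumed."""
    audit_entry_id: str
    actor: str
    action: str
    recorded_at: datetime


def changes_before(entries: Iterable[AuditEntry], cutoff: datetime) -> List[AuditEntry]:
    """Entries recorded at or before the cutoff, oldest first."""
    matching = (e for e in entries if e.recorded_at <= cutoff)
    return sorted(matching, key=lambda e: e.recorded_at)
```

Passing a replay job's `created_at` as the cutoff gives the list of administrative changes that could have influenced that run.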
How to use these together
In the current product flow:
- Create or revise encoding and rule definitions.
- Launch a replay job against a historical time window.
- Review resulting decisions and events.
- Confirm the administrative changes in audit history.
This is the cleanest way for a non-technical operator to explain a change:
- what changed
- when it changed
- what historical data was replayed
- what outcome changed as a result