The slow test monitor detects test cases whose measured duration exceeds a threshold you set, evaluated over a configurable percentile, time window, and sample size. It applies labels to slow tests so your team can identify and prioritize performance improvements without classifying tests as flaky or broken.

When to Use This Monitor

  • Identify tests slowing down CI: Surface the specific tests adding the most wall time to your pipeline.
  • Enforce duration budgets: Label any test that exceeds an acceptable runtime so it gets reviewed before merging.
  • Track regressions: Catch tests that were fast but became slow after a code change.

How It Works

The monitor evaluates each test’s duration at the configured percentile across the runs in a rolling time window. When that percentile duration meets or exceeds the configured threshold and at least the minimum number of sample runs has been collected, the monitor activates and applies the configured labels.

The monitor resolves when the test’s measured duration drops back below the threshold over subsequent runs. If staleAfterMinutes is set, the monitor also resolves any active test that has had no runs on monitored branches for that many minutes, which prevents labels from persisting on tests that have been removed from the suite.

Once the monitor activates, the detection evidence (the specific runs that triggered it) is visible in the Events tab on the test details page.
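The activation and resolution rules above can be sketched as a small state function. This is an illustrative model only, not the monitor's actual implementation: the function name and parameters are hypothetical, and the behavior when too few samples exist (keeping the current state) is an assumption the source does not confirm.

```python
def next_state(active, percentile_ms, sample_count, threshold_ms, min_samples):
    """Return True if the test should carry the slow label after this evaluation.

    All names are hypothetical and exist only for this sketch.
    """
    if sample_count < min_samples:
        # Assumption: with too few runs, leave the test in its current state.
        return active
    # Activate (or stay active) at or above the threshold; resolve below it.
    return percentile_ms >= threshold_ms


# A test at exactly the threshold activates, because the rule is "meets or exceeds".
print(next_state(False, 5000, sample_count=5, threshold_ms=5000, min_samples=5))
# An active test whose percentile duration falls below the threshold resolves.
print(next_state(True, 4000, sample_count=5, threshold_ms=5000, min_samples=5))
```

Stale resolution is a separate check on the time since the last run and is not modeled here.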

Configuration

| Setting | Description | Default |
| --- | --- | --- |
| Duration threshold | Test duration (milliseconds) at the configured percentile required to trigger detection | Required |
| Percentile | The percentile of run durations used to evaluate the threshold (e.g., p50, p95) | Required |
| Window | Time window (minutes) over which duration samples are collected | Required |
| Sample size | Minimum number of runs required before the monitor can activate | Required |
| Stale after | Minutes without any run on monitored branches before an active test resolves (optional) | Disabled |
| Branch scope | Branch names or glob patterns to monitor | All branches |
| Action | Apply labels (the only available action; this monitor does not classify) | Apply labels |

Duration Threshold

Set the threshold in milliseconds. A value of 5000 flags any test whose duration at the configured percentile meets or exceeds 5 seconds. Tune it to your CI time budget: a lower threshold surfaces more tests but requires more review bandwidth.

Percentile

The percentile controls which point on the duration distribution is compared to the threshold.
  • p95 is the duration that 95% of a test’s runs are at or below. A small share of slow runs (a little more than 5% of the runs in the window) is enough to push p95 above the threshold, so this setting triggers more easily and catches intermittent slowness.
  • p50 is the median: 50% of a test’s runs are at or below it. About half of the runs must be slow before the test triggers, so this setting is harder to trigger and filters out one-off spikes.
Choose a higher percentile (p75, p90, p95) to catch tests with sporadic slowness. Choose a lower percentile (p50) to surface only tests that are consistently slow.
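To make the difference concrete, here is a minimal sketch comparing p50 and p95 on two made-up duration samples. It uses the nearest-rank percentile definition, which is an assumption: the source does not say which percentile method the monitor uses. The helper name, the sample data, and the 5000 ms threshold are all illustrative.

```python
import math


def percentile_nearest_rank(durations_ms, percentile):
    """Smallest sample such that at least `percentile`% of samples are at or below it."""
    ordered = sorted(durations_ms)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


THRESHOLD_MS = 5000                 # illustrative threshold
spiky = [800] * 18 + [9000] * 2     # mostly fast, two slow outliers
steady = [6000] * 20                # slow on every run

for name, sample in (("spiky", spiky), ("steady", steady)):
    p50 = percentile_nearest_rank(sample, 50)
    p95 = percentile_nearest_rank(sample, 95)
    print(name, "p50 slow:", p50 >= THRESHOLD_MS, "p95 slow:", p95 >= THRESHOLD_MS)
```

In this sketch the spiky test crosses the threshold only at p95, while the steady test crosses it at both percentiles.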

Window and Sample Size

The window controls how far back duration samples are collected. Sample size sets the minimum number of runs needed before the monitor will activate, which prevents a single slow run from triggering the monitor on a test with no history. For example, a window of 1440 minutes (one day) and a sample size of 5 means the monitor evaluates the last day’s runs and requires at least five of them before drawing a conclusion.

The preview panel shows up to 1000 tests that ran within the configured window.
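The interaction between window and sample size can be sketched as follows. This is a simplified model under stated assumptions: runs are represented as hypothetical `(finished_at, duration_ms)` tuples, and the function names are invented for illustration rather than taken from the product.

```python
from datetime import datetime, timedelta


def durations_in_window(runs, now, window_minutes):
    """Keep durations of runs that finished within the last `window_minutes`."""
    cutoff = now - timedelta(minutes=window_minutes)
    return [duration_ms for finished_at, duration_ms in runs if finished_at >= cutoff]


def can_evaluate(durations_ms, sample_size):
    """The monitor may only activate once enough runs have been collected."""
    return len(durations_ms) >= sample_size


now = datetime(2024, 1, 2, 12, 0)
runs = [
    (now - timedelta(minutes=2000), 7000),  # older than one day, so ignored
    (now - timedelta(minutes=600), 5200),
    (now - timedelta(minutes=30), 5400),
]
recent = durations_in_window(runs, now, window_minutes=1440)
print(recent, can_evaluate(recent, sample_size=5))
```

With only two runs inside the one-day window, the sample size of 5 is not met, so the test would not be flagged even though both recent runs are slow.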

Stale After

When staleAfterMinutes is set, any active test (one currently labeled slow) that has not run on monitored branches for that many minutes is resolved automatically. Use this to clean up labels after a slow test is removed from the suite or renamed.
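A minimal sketch of the staleness check, assuming the monitor compares the time since a test's last run on a monitored branch against the configured value. The function and parameter names are hypothetical, and treating a gap exactly equal to the limit as stale is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_stale(last_run_at: datetime, now: datetime, stale_after_minutes: Optional[int]) -> bool:
    """Return True if an active test should be resolved for lack of recent runs."""
    if stale_after_minutes is None:
        return False  # stale resolution is disabled by default
    # Assumption: a gap equal to the limit already counts as stale.
    return now - last_run_at >= timedelta(minutes=stale_after_minutes)


now = datetime(2024, 1, 10, 9, 0)
last_run = now - timedelta(days=3)
print(is_stale(last_run, now, stale_after_minutes=2880))  # two-day limit
print(is_stale(last_run, now, stale_after_minutes=None))  # setting left unset
```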

Branch Scope

Scope the monitor to branches where test duration matters most, such as main or merge queue branches. Runs on feature branches may be intentionally limited in scope or use variable infrastructure, so their durations may not indicate genuine slowness.
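Branch scoping by name or glob pattern can be approximated with Python's standard `fnmatch` module. This is only a sketch: the exact glob dialect the monitor supports is not stated in the source, and the branch names and patterns below are made up for illustration.

```python
from fnmatch import fnmatchcase


def branch_in_scope(branch, patterns):
    """Return True if `branch` matches any configured name or glob pattern."""
    if not patterns:
        return True  # an empty scope means all branches, matching the default
    return any(fnmatchcase(branch, pattern) for pattern in patterns)


scope = ["main", "merge-queue/*"]  # hypothetical scope
for branch in ("main", "merge-queue/pr-42", "feature/login"):
    print(branch, branch_in_scope(branch, scope))
```

Note that `fnmatch` lets `*` match across `/`, which may differ from the matching rules the monitor applies.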

Choosing Between Monitors

| Goal | Recommended monitor |
| --- | --- |
| Flag tests that are taking too long | Slow test monitor |
| Track recently added tests | New test monitor |
| Detect tests consistently being skipped | Skipped test monitor |
| Detect tests that fail then pass on retry | Pass-on-retry monitor |
| Alert on tests failing at a sustained rate | Failure rate monitor |