How does Pearl measure marking time saved?

Pearl measures marking time saved as the difference between baseline marking time (manual marking of a submission against a rubric, captured via assessor self-report and platform timestamps before SAM is enabled) and SAM-assisted marking time (time from submission open to mark confirmed, with SAM proposing grades and feedback that the assessor reviews, edits and confirms). Time saved is computed per assessor per qualification, then aggregated.

What is the sample size for the 67% claim?

The 67% figure is drawn from AY 2024-25 customer data across 4 FE providers, 47 assessors, and 8,412 marked submissions on Ofqual-regulated qualifications. Mean time saved per submission was 67% (median 64%, interquartile range 58% to 73%).

What counts as 'marking time' in the Pearl measurement?

Marking time is the active time an assessor spends opening a submission, reading the work, applying the rubric, writing feedback, and confirming the grade. It excludes IQA sampling time, learner-facing communication, administration, and any time spent training SAM on a new rubric.

What are the caveats on the 67% claim?

Three main caveats: (1) results vary by qualification type, level, and rubric complexity, with the strongest savings on structured written assessments and the weakest on open-ended written work with overlapping criteria (open portfolio work without a fixed rubric was excluded from the sample); (2) time saved excludes the one-off rubric training time required when SAM is first deployed on a new qualification; (3) assessors retain final mark authority and may choose to disregard SAM's proposal, which adds time on a minority of submissions.


// METHODOLOGY

How we measure the 67% claim.

Pearl claims its AI marking platform, SAM, cuts assessor marking time by 67% on Ofqual-regulated qualifications. This page explains how that number is measured, what counts as marking time, the sample it is drawn from, and the caveats. It is written for procurement teams who want to verify the maths before signing.

The headline

Mean time saved: 67%
Marked submissions: 8,412
Assessors: 47
FE providers: 4

Across AY 2024-25, SAM-assisted marking reduced assessor active marking time by a mean of 67% per submission, with a median of 64% and an interquartile range of 58% to 73%. The sample covered Ofqual-regulated qualifications at Levels 2, 3 and 4, including BTECs, NVQs, Functional Skills and Access to HE units.

What we mean by marking time

Marking time is the active time an assessor spends:

Opening a submission and reading the work.
Applying the rubric, criterion by criterion.
Writing assessor feedback and developmental comments.
Confirming the final grade.

Marking time excludes:

IQA sampling and verification time.
Learner-facing communication and resubmission discussions.
General administration, planning and CPD.
One-off rubric training time when SAM is first deployed on a new qualification.

How the baseline is captured

Before SAM is enabled on a qualification, the assessor marks a cohort of submissions in the standard Pearl interface. Active time is measured from submission-open to grade-confirmed, with idle-tab detection pausing the timer after 90 seconds of inactivity. Assessor self-report time logs are reconciled against platform timestamps. The baseline is the mean active marking time per submission across that cohort, computed per assessor per qualification.
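As an illustration only, the idle-capped timer described above could be approximated from a list of activity timestamps. This is a minimal sketch, not Pearl's production timer: it assumes each gap between consecutive events counts towards active time up to the 90-second threshold and is treated as paused beyond it. The function name, the event format (seconds since submission-open) and the sample session are hypothetical.

```python
IDLE_THRESHOLD_S = 90  # from the text: the timer pauses after 90 seconds of inactivity


def active_seconds(event_times, idle_threshold=IDLE_THRESHOLD_S):
    """Return active time in seconds between the first and last event.

    event_times: activity timestamps in seconds, where the first is
    submission-open and the last is grade-confirmed (assumed format).
    Each gap is capped at idle_threshold, on the assumption that the
    timer runs until the threshold is reached and then pauses.
    """
    times = sorted(event_times)
    total = 0.0
    for prev, curr in zip(times, times[1:]):
        total += min(curr - prev, idle_threshold)
    return total


# Hypothetical session: the 300-second gap is capped at 90 seconds.
session = [0, 40, 100, 400, 430]
print(active_seconds(session))  # gaps of 40, 60, 300, 30 count as 40 + 60 + 90 + 30
```

Whether the first 90 seconds of an idle gap should count as active time is a definitional choice. A stricter reading would discard the whole gap, which is worth agreeing before comparing figures.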

How the SAM-assisted time is captured

Once SAM is enabled on the qualification, every submission is processed by SAM on upload. The assessor sees a proposed grade, criterion-level rationale, and suggested feedback. The assessor reviews, edits where needed, and confirms the final mark. Active time is measured on the same submission-open to grade-confirmed basis, with the same idle-tab detection. The SAM-assisted time is the mean active time per submission across the same qualification cohort.

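Time saved per submission is computed for each assessor on each qualification, from that assessor's mean baseline time and mean SAM-assisted time: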

Time saved % = (Baseline time − SAM-assisted time) / Baseline time × 100

Per-assessor figures are then aggregated to the qualification level, the provider level, and finally the cross-provider mean.
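The formula and the roll-up can be sketched as follows. This is a hedged example: the record layout, names and figures are hypothetical, and it assumes unweighted means at each level, which the text above does not specify. Weighting by submission count would give different results.

```python
from collections import defaultdict
from statistics import mean


def time_saved_pct(baseline, assisted):
    """Time saved % = (baseline - assisted) / baseline * 100."""
    if baseline <= 0:
        raise ValueError("baseline time must be positive")
    return (baseline - assisted) / baseline * 100


# Hypothetical per-assessor rows:
# (provider, qualification, assessor, mean baseline minutes, mean SAM-assisted minutes)
records = [
    ("Provider A", "Qual X", "assessor-1", 30.0, 9.0),
    ("Provider A", "Qual X", "assessor-2", 40.0, 16.0),
    ("Provider B", "Qual Y", "assessor-3", 20.0, 5.0),
]


def aggregate(rows):
    """Roll per-assessor figures up to qualification, provider and overall level.

    Assumption: a simple unweighted mean is taken at every level.
    """
    by_qual = defaultdict(list)
    for provider, qual, _assessor, base, assisted in rows:
        by_qual[(provider, qual)].append(time_saved_pct(base, assisted))
    qual_means = {key: mean(vals) for key, vals in by_qual.items()}

    by_provider = defaultdict(list)
    for (provider, _qual), value in qual_means.items():
        by_provider[provider].append(value)
    provider_means = {p: mean(vals) for p, vals in by_provider.items()}

    overall = mean(list(provider_means.values()))
    return qual_means, provider_means, overall


qual_means, provider_means, overall = aggregate(records)
print(provider_means, overall)
```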

The sample

Time period: AY 2024-25, September 2024 to July 2025.
Providers: 4 UK FE providers (2 colleges, 2 ITPs).
Assessors: 47, each with at least 30 baseline and 30 SAM-assisted submissions on the same qualification.
Submissions: 8,412 marked submissions, paired across baseline and SAM-assisted conditions on matched qualifications.
Qualification mix: Levels 2 to 4 across BTEC, NVQ, Functional Skills, Access to HE. Excluded: open portfolio assessments without a fixed rubric, group projects, and observed practical assessments.

Caveats and what we do not claim

Three caveats matter when you read the 67% figure:

Rubric complexity drives variance. Time saved is strongest on structured written assessments with well-defined rubrics (mean 71%) and weakest on open-ended written work with high inter-criterion overlap (mean 54%). Open portfolio work without a fixed rubric was excluded from the sample.
One-off setup cost is not netted off. Each qualification requires 4 to 12 hours of rubric configuration and calibration before SAM is reliable. This is borne by the lead assessor or IQA and is not counted in the per-submission marking time.
Assessor authority adds time on a minority. Around 8% of submissions in the sample involved an assessor materially disagreeing with SAM's proposed grade. On those submissions, the assessor spent on average 22% longer than they would have done marking from scratch, because they reviewed both SAM's proposal and their own working. This is included in the SAM-assisted time average, so the headline 67% is net of those cases.
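Because setup cost is not netted off, a buyer may want a rough break-even estimate. The sketch below is illustrative only: the 30-minute baseline is a hypothetical input, not a figure from the sample, and the function name is ours. Only the 4 to 12 hour setup range and the 67% headline come from this page.

```python
import math


def break_even_submissions(setup_hours, baseline_min, saved_pct):
    """Submissions needed before the time saved covers the one-off setup time."""
    saved_min = baseline_min * saved_pct / 100  # minutes saved per submission
    if saved_min <= 0:
        raise ValueError("time saved per submission must be positive")
    return math.ceil(setup_hours * 60 / saved_min)


# Hypothetical baseline of 30 minutes per submission, headline saving of 67%.
for hours in (4, 12):
    print(hours, "hours setup:", break_even_submissions(hours, 30, 67), "submissions")
```

Substitute your own baseline minutes and the time-saved figure for your qualification type to get an estimate for your provision.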

We do not claim:

That SAM marks without an assessor. The assessor confirms every grade and retains final mark authority.
That 67% applies to every qualification. Levels 1 and 5 are not in the sample.
That the figure will hold at every provider. Provider-level means in the sample ranged from 58% to 74%.

How to verify with your own data

Procurement teams can replicate the measurement in a paid pilot:

Pick one qualification, one cohort.
Run 4 weeks of baseline marking in the Pearl interface without SAM enabled.
Enable SAM, run a further 4 weeks on the same cohort.
Pearl provides the raw timestamps, per-assessor active-time logs, and a reconciliation report. You compute time saved against your own definitions.
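Once you have per-assessor time-saved percentages from the pilot logs, the same summary statistics as the headline (mean, median, interquartile range) can be computed with the Python standard library. This is a sketch under assumptions: the input values are hypothetical, and `statistics.quantiles` uses its default "exclusive" method, which may differ slightly from the quartile method used for the figures on this page.

```python
from statistics import mean, median, quantiles


def summarise(saved_pcts):
    """Return mean, median and interquartile range for time-saved percentages."""
    q1, _q2, q3 = quantiles(saved_pcts, n=4)  # three quartile cut points
    return {
        "mean": mean(saved_pcts),
        "median": median(saved_pcts),
        "iqr": (q1, q3),
    }


# Hypothetical per-assessor time-saved percentages from a pilot.
pilot = [52, 60, 63, 65, 70, 72, 80]
summary = summarise(pilot)
print(summary)
```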

Pilot pricing and scope are agreed in advance. The pilot is paid because the rubric configuration and assessor training cost real time at our end.

FAQ

Who validated the methodology?

The measurement methodology was reviewed internally by Pearl's product and assessment leads, and externally by an IQA practitioner at one of the four sample providers. It has not been peer-reviewed in an academic journal. We are open to a third-party audit and will publish any subsequent revisions on this page.

Where can I see the raw data?

Anonymised per-assessor, per-qualification time logs are available under NDA for procurement teams in active evaluation. Email sales@epearl.co.uk to request access.

Does the 67% include marking quality?

No. This page only measures time. Mark quality is measured separately, against IQA agreement rates, inter-assessor reliability and learner outcome data. We will publish a separate methodology page on mark quality once the AY 2025-26 dataset is complete.

Run the 67% test on your own provision.

Book a 20-minute call. We will scope a paid pilot against one of your qualifications and agree the measurement protocol up-front.


Last updated 26 May 2026. Methodology owner, Pearl product team. Send corrections to sales@epearl.co.uk.