01 · Methodology

The methodology, at the depth you'd take to your own team.

You're here because you want to know whether the score in the Workbench is something you could justify to your own technical leadership, your CFO, or your board. This page is built to answer that — by showing the architecture, the composite logic, the confidence model, the source catalog, and the flywheel that closes the loop.

It is deliberately one level less detailed than the implementation. Exact weights, prompts, thresholds, and per-vendor allocation logic live behind the line — under change control, under quarterly review, visible to design partners under NDA. What you see below is enough to evaluate the rigor — not enough to lift the methodology straight out.

First principles · The substrate · Substrate → score · What we measure · Confidence ladder · The flywheel · Governance

02 · First principles

Four convictions everything else derives from.

The substrate, the scoring, and the primitive all follow from these.

Beat 01

Slice the world by work, not by industry.

Automation maps to work, not to logos. The same Unit of Potential (UoP) for legal drafting ships across financial services, insurance, and government, because the alignment math is the same. Industry is downstream of work.

Beat 02

Enterprises are complex adaptive systems, not task lists.

Work is coordination, not tasks. Trying to match and automate tasks assumes they are static and still relevant in the new system of work. They are not. Tasks are downstream of what humans align on doing.

Beat 03

Human-centered — productivity captured, not deflected.

We stay neutral on the outcome: we are not a zero-displacement enforcer. But by surfacing how people can be redeployed around new AI capability, we give senior leaders a clear alternative to layoffs, which is to capture the productivity gain internally through reorganization and reskilling. Every UoP carries a named redeployment plan; the decision stays with the enterprise.

Beat 04

Large organizations sit below their latent potential. Emergence is a recipe.

Most large organizations run well below what their people, AI, capital, and tooling could deliver together. Human productivity depends on motivation, and motivation depends on conditions — the right manager, the right incentives, the right environment, the right AI tools to augment the work. Get the recipe right and emergence happens. Get it wrong and the same resources deliver a fraction of what they could. That's why the primitive is Unit of Potential, not Unit of Plan. The plan is the floor; emergence is the ceiling.

03 · The substrate

Every score traces to the Global Labor Graph.

The Global Labor Graph is the substrate. Every score in the Workbench, every UoP candidate, every Vendor Allocation cell is computed dynamically from it — not hand-tuned, not made up.

The Global Labor Graph (GLG) is a joined multi-source dataset, not any single vendor's product. It carries depth (TAG's proprietary placement substrate), breadth (Lightcast + public stats from BLS, ILOSTAT, OECD), and frontier signal (capability research from OpenAI, METR, Scale, Anthropic). The composition is the point: any one source can be wrong; the joined substrate is auditable.

1.28B job postings · 16-yr depth

594M professional profiles

104M annual placements (TAG)

249 countries · cross-country wages

Source catalog · provenance + refresh per source

Source	Role in the substrate	Coverage	Refresh
Lightcast Delta Share	Substrate · postings, profiles, skills, firmographics	1.28B postings (16-yr) · 594M profiles · 28,795 skills · 7.5M+ companies	Daily
TAG (proprietary depth)	Substrate · workforce telemetry, placements, role transitions	100K enterprise clients · 104M placements/yr · 300M candidate interactions/yr	Continuous
O*NET	Substrate · DWAs, KSA ontology, occupation reference	19,265 DWAs · 1,016 occupations	Per release
BLS · OEWS · ILOSTAT · OECD · Eurostat	Substrate · cross-country workforce + wages	249 countries · 396 metros · CPS + OEWS p10–p90 wages	Per release
Eloundou et al. 2023 (OpenAI)	Frontier capability · exposure structure	100% of O*NET occupations · α/β/γ exposure	Static (research)
GDPval (OpenAI 2025)	Frontier capability · task capability benchmark	1,320 tasks · 44 occupations × 9 GDP sectors	Quarterly
METR HCAST Time Horizon	Frontier capability · agentic horizon	Doubling cadence · 7mo → ~3mo (2024-)	Quarterly
Remote Labor Index (Scale AI + CAIS)	Frontier capability · end-to-end project deliverability	240 real Upwork projects · max 2.5% frontier automation	Annual
Acemoglu · Frey-Osborne · Dingel-Neiman	Macro · automation caps, bottleneck residuals, teleworkability	Macro economic priors · cross-checks	Static (research)
WEF Future of Jobs	Workforce readiness · CHRO-surveyed signal	1,000+ CHROs surveyed · reskilling-need trajectories	Annual
Fusebox AgentFuze telemetry	Live telemetry · per-UoP outcome + workforce actuals	Per-UoP real-time signal	Continuous
Anthropic Economic Index	Frontier capability calibration · task-level usage data	Task-level usage signal	Quarterly

The substrate refreshes continuously. Every score that ships is anchored to a snapshot. Coverage and refresh cadence per source are listed above — and visible per-cell in the Workbench.

04 · From substrate to score

We compute dynamically from the substrate — not from hand-tuned fixtures.

The pipeline cascades: substrate signal → archetype classification → enterprise-specific readiness → gated composite → scored cell. Every Workbench score, every UoP candidate is the output of this cascade, not a stored constant.

Methodology · topology

how a score gets computed, at the system level

Tier 01

The Global Labor Graph

multi-source · provenance per signal

Breadth

Lightcast · public stats · O*NET

Depth

TAG placement substrate

Frontier signal

OpenAI · METR · Scale · Anthropic research

Live telemetry

Fusebox AgentFuze · per-UoP signal

provenance + refresh cadence per signal

Tier 02

The pipeline · cascade

dynamic compute · binding-constraint composite

Archetype classification

what work, at what density

Enterprise readiness + gating

outside-in macro · inside-out coalition

Composite + ceilings

binding constraint surfaces · caps visible · no silent downweights

scored + tier-tagged · ceiling-flagged

Tier 03

What ships to the Workbench

every cell is auditable to its substrate inputs

Scored cell

Alignment + Absorption · per archetype

Confidence tier

Estimated · Modeled · Configured · Measured

Reason chip

when a ceiling caps the cell

UoP candidate

scored UoP proposed to the operator

Compute dynamically, not statically.

Substrate signal in, real-time compute out. When the substrate refreshes — a new posting, a new placement, a new capability benchmark — the next score reflects it. No hand-tuned constants in the cells you see.

Binding constraint, not weighted average.

A UoP is a chain. The weakest link breaks it. The composite surfaces the binding constraint (the single dimension that gates the deployment) rather than averaging away chain-breakers. Weighted sums hide the failure modes the weakest signal already reveals.
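To make the contrast concrete, here is a minimal Python sketch under stated assumptions. The real composite logic is not published on this page; using a plain minimum as the binding constraint, the dimension names, the equal weights, and the sample values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple


def weighted_average(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Classic weighted mean: a strong dimension can mask a weak one."""
    total_weight = sum(weights[dim] for dim in scores)
    return sum(scores[dim] * weights[dim] for dim in scores) / total_weight


def binding_constraint(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the weakest dimension and its value (assumed here to be a plain min)."""
    weakest = min(scores, key=scores.get)
    return weakest, scores[weakest]


# Hypothetical dimension scores on a 0-100 scale.
scores = {"value_clarity": 85.0, "agency": 80.0, "governance": 30.0}
weights = {dim: 1.0 for dim in scores}  # hypothetical equal weights

avg = weighted_average(scores, weights)
constraint, composite = binding_constraint(scores)

print(f"weighted average: {avg:.1f}")
print(f"binding constraint: {constraint} at {composite:.1f}")
```

In this toy case the average lands in the mid-60s while the chain is actually gated by a governance score of 30, which is the failure mode a weighted sum would hide.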

Ceilings are explicit, not silent.

When the substrate doesn't carry enough signal for a confident score, the cell shows a ceiling cap with the reason — no sponsor, external data only, governance dimension low, archetype-volume below floor. Operators close the gap to lift the score. We do not silently downweight.
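One way to picture an explicit ceiling is a cap that always travels with its reason. The sketch below is illustrative only: the `Ceiling` and `ScoredCell` types, the rule that the tightest active cap wins, and the cap values are assumptions, not the production rules. The reason strings are borrowed from the examples in the paragraph above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ceiling:
    reason: str     # text shown on the reason chip
    cap: float      # hypothetical maximum score while this ceiling applies
    applies: bool   # whether the condition currently holds for the cell


@dataclass
class ScoredCell:
    raw: float                  # score before any ceiling
    score: float                # score shown to the operator
    reason_chip: Optional[str]  # set only when a ceiling actually capped the score


def apply_ceilings(raw: float, ceilings: List[Ceiling]) -> ScoredCell:
    """Cap the raw score at the tightest active ceiling and keep the reason visible."""
    active = [c for c in ceilings if c.applies]
    if not active:
        return ScoredCell(raw, raw, None)
    tightest = min(active, key=lambda c: c.cap)
    if raw <= tightest.cap:
        return ScoredCell(raw, raw, None)
    return ScoredCell(raw, tightest.cap, tightest.reason)


# Hypothetical cap values.
rules = [
    Ceiling("no sponsor", 70.0, True),
    Ceiling("external data only", 60.0, True),
]
capped = apply_ceilings(82.0, rules)
uncapped = apply_ceilings(55.0, rules)
print(capped)
print(uncapped)
```

The raw score stays on the cell next to the capped one, so an operator can see both how far the cap bites and which gap to close.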

Multi-source provenance, per signal.

Every signal feeding the composite traces to its source in the catalog above. The blend isn't opaque; the contribution is auditable. When sources disagree, the cell reflects the discord — usually as a lower confidence tier.
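As a rough illustration of per-signal provenance, each signal can carry its source and snapshot, and disagreement between sources can be made visible as a spread. The `Signal` type, the spread measure, the threshold of 20 points, and the sample values and dates are all hypothetical; the page does not specify how discord is actually detected.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass(frozen=True)
class Signal:
    source: str    # entry in the source catalog
    value: float   # hypothetical 0-100 reading for one cell
    snapshot: str  # date of the substrate snapshot the reading came from


def summarize(signals: List[Signal], discord_threshold: float = 20.0) -> Dict[str, object]:
    """Keep every contributing source visible and flag disagreement between them."""
    values = [s.value for s in signals]
    spread = max(values) - min(values)
    return {
        "sources": [s.source for s in signals],
        "mean": mean(values),
        "spread": spread,
        "discord": spread > discord_threshold,  # assumed rule, for illustration
    }


# Hypothetical readings and snapshot dates.
signals = [
    Signal("Lightcast Delta Share", 70.0, "2025-01-15"),
    Signal("GDPval (OpenAI 2025)", 40.0, "2025-01-01"),
    Signal("TAG (proprietary depth)", 65.0, "2025-01-15"),
]
summary = summarize(signals)
print(summary)
```

A flagged discord like this one would be the kind of condition that, per the paragraph above, usually shows up as a lower confidence tier.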

The specifics (exact weights, sub-dimension counts, threshold values) are versioned and evolve as the substrate deepens. Today's pipeline matches what's defensible at today's data depth. Every change ships behind audit (see §08 Governance).

05 · What we measure

Three measurements. Two scores and a telemetry layer.

Absorption Score

Can the organization structurally take this on?

Measures the organization's capacity to absorb AI in a given work archetype: culture, processes, workforce readiness, metrics, incentives, norms. Slow-to-change, system-level. The lever when the score is low: reorganize the system of work.

Alignment Score

Are the conditions right for humans and AI to co-create value on this specific UoP?

Measures whether Value Clarity, Agency, and Substance are in place on a single deployment. Stakeholder-level, faster to move than Absorption. Both scores need to be high: a high-Absorption org with low Alignment deploys the tool and watches it sit unused; a high-Alignment team in a low-Absorption org builds a brilliant pocket that fails at scale.
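The two-score reading above can be sketched as a simple two-by-two diagnostic. The threshold of 70 and the label wording are hypothetical; only the two failure patterns come from the text.

```python
def diagnose(absorption: float, alignment: float, high: float = 70.0) -> str:
    """Classify a UoP by whether each score clears a hypothetical 'high' threshold."""
    org_ready = absorption >= high
    deployment_ready = alignment >= high
    if org_ready and deployment_ready:
        return "both high"
    if org_ready:
        return "adoption risk: tool deployed, then sits unused"
    if deployment_ready:
        return "scale risk: strong pocket that fails at scale"
    return "both low"


print(diagnose(80.0, 45.0))
print(diagnose(40.0, 85.0))
```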

Blended Telemetry

Did the deployment actually work — for the business, the agent, and the human?

Agentic telemetry alone — what the AI did, what it cost, where it succeeded — does not prove the deployment created business value. We blend three streams into one neutral measurement layer baked into every UoP. No single vendor controls the score.

01 · Agentic

What the AI actually did, per workflow step. Cost, success and failure, latency, escalation rate.

02 · Business

Did the productivity gain reach where the CFO can see it? Top-line growth, bottom-line margin, throughput, cycle time, customer outcomes the business actually feels.

03 · Human · Worker Voice

Across the frontline and middle management, was the job actually augmented? Are the people inside the deployment engaged?

Status — Work in progress. We’re instrumenting it with agent providers and enterprise customers.

06 · Confidence

Every score wears its tier.

Four tiers, each with a visible badge wherever a score appears. The tier tells you not just how much signal we have, but what kind — external priors only, or modeled with org context, or stakeholder-calibrated, or live-measured.

A CFO acting on an Estimated number believing it's Measured is the credibility-killing event. The ladder prevents that.

Estimated

Modeled

Configured

Measured

Estimated · cap ≤ 60

External priors only — Lightcast firmographics, posting density, public benchmarks, frontier-capability research. Raw substrate signal, no further enrichment yet.

Modeled · methodology-derived

Multi-source signals composed through our scoring algorithm and work-archetype ontology. Methodology-derived, not raw — but no enterprise-specific engagement yet.

Configured · enterprise-calibrated

Enterprise context layered in — sponsor identified, governance posture mapped, workflow dimension scoped. Stakeholder alignment cleared. Deployment plan committed.

Measured · live telemetry

Live telemetry from agentic + business + human (Worker Voice) streams. Realized outcome measured against the deployment plan. Independent of the deploying entity. Audit-grade.
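The ladder can be modeled as an ordered type so that a score never travels without its tier. This is a sketch, not the implementation: the `Tier` enum, `displayed_score`, and `badge` are hypothetical names, and the only cap encoded is the Estimated cap of 60 stated above. Whether other tiers carry caps is not stated here, so none are assumed.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Ordered confidence tiers; a higher value means stronger evidence."""
    ESTIMATED = 1
    MODELED = 2
    CONFIGURED = 3
    MEASURED = 4


ESTIMATED_CAP = 60.0  # from the ladder: Estimated scores are capped at 60


def displayed_score(raw: float, tier: Tier) -> float:
    """Apply the Estimated cap; other tiers pass through (assumption)."""
    if tier is Tier.ESTIMATED:
        return min(raw, ESTIMATED_CAP)
    return raw


def badge(score: float, tier: Tier) -> str:
    """Render the score together with its tier label."""
    return f"{score:.0f} · {tier.name.title()}"


shown = displayed_score(82.0, Tier.ESTIMATED)
print(badge(shown, Tier.ESTIMATED))
print(badge(displayed_score(82.0, Tier.MEASURED), Tier.MEASURED))
```

Using an ordered enum also makes "moving a cell up the ladder" a simple comparison, for example `Tier.MEASURED > Tier.ESTIMATED`.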

3,315 / 3,316 scored cells today are Estimated

1 is Measured: TAG × Harvey · Legal Drafting

The progression up the ladder is the product. Every cycle moves at least one cell up. The substrate refreshes from Measured outcomes — that's the flywheel below.

07 · The flywheel

The lifecycle compounds back into the substrate.

The methodology isn't a one-way pipeline. The substrate computes scores onto Workbench cells. Users act on those scores to generate UoPs. UoPs travel through their lifecycle. Outcomes, blockers, Worker Voice, and user behavior all feed back into the substrate. The next cycle is sharper.

This is the structural reason the platform compounds — and the reason no single vendor can credibly produce this measurement on their own.

Stage 01

Global Labor Graph

Multi-source substrate · refreshes continuously

Stage 02

Workbench scoring

Substrate computes onto every cell · with confidence tier

Stage 03

UoP generation pipeline

Scored UoPs proposed to enterprises

Stage 04

UoP lifecycle

Map · Generated · Configured · Deployed · Gated · Compounding

Stage 05

Deployment Intelligence + Realized Potential

Outcomes, blockers, Worker Voice, user behavior

Refreshes the substrate · next cycle starts smarter

Every deployment teaches the substrate. Every Gated UoP teaches the substrate. Every Worker Voice signal teaches the substrate. The moat isn't data volume — it's credibility that compounds.

08 · Governance

Change-controlled. Quarterly review. Audit-trailed.

Every weight, every threshold, every ceiling rule is owned by the methodology board. Changes ship on a quarterly cadence. The audit log captures every revision and its rationale.

Locked

Global Labor Graph as substrate
Dynamic compute (not fixtures)
Binding-constraint composite
4-tier confidence ladder
Explicit ceilings + reason chips
Multi-source provenance per signal

Evolving

Peer-score structure (Absorption + Alignment)
Agent Capability gate (per-archetype currency)
Workforce-readiness signals (TAG depth integration)
Vendor allocation dimension count
Substrate refresh cadence per source

Open

Cross-cohort weight normalization
Density-penalty calibration
Workforce-volume floor for capping
Live-telemetry write-back to substrate

When weights revise: deployed UoPs see their composite shift ±2–5 points; tier classifications hold within Ready / Gated. We track this explicitly so revisions are calibrated, not disruptive.
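A revision check along these lines could be sketched as follows. The function name, the readiness threshold of 70, and the sample composites are hypothetical; the 5-point bound mirrors the upper end of the ±2–5 range stated above.

```python
from typing import Dict


def revision_report(
    before: Dict[str, float],
    after: Dict[str, float],
    max_shift: float = 5.0,
    ready_at: float = 70.0,  # hypothetical Ready / Gated boundary
) -> Dict[str, Dict[str, object]]:
    """Flag UoPs whose composite moved more than max_shift or crossed the Ready line."""
    flagged = {}
    for uop, old in before.items():
        new = after[uop]
        shift = new - old
        flipped = (old >= ready_at) != (new >= ready_at)
        if abs(shift) > max_shift or flipped:
            flagged[uop] = {"shift": shift, "flipped": flipped}
    return flagged


# Hypothetical composites before and after a weight revision.
before = {"uop-a": 72.0, "uop-b": 71.0, "uop-c": 50.0}
after = {"uop-a": 75.0, "uop-b": 68.0, "uop-c": 58.0}

flagged = revision_report(before, after)
print(flagged)
```

Here `uop-a` moves 3 points and stays Ready, so it passes; `uop-b` crosses the boundary and `uop-c` moves 8 points, so both would be surfaced for review before the revision ships.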

Full audit log available on request to methodology@rpotential.ai. Methodology canon is maintained by the methodology-architect agent under board review.