Whitepaper

The Stori Trait Assessment: A Dual-Output Framework for Candidate Intelligence

Fraser Hill / Stori Labs — 2026

For decades, hiring has relied on proxies — resumes, behavioral interviews, and intuition. These tools have persisted not because they are accurate, but because they produce outcomes that appear acceptable. Most hires do not fail catastrophically. Most employees are “good enough.” And so the system has remained largely unchallenged.

Artificial intelligence has not created this problem. It has exposed it. Resumes are now generated and optimized by machines. Candidates rehearse with AI coaches. The signals that hiring processes were designed to read have become indistinguishable from noise.

This paper describes the measurement system designed to replace those proxies: a structured trait assessment and narrative interview framework that captures how people actually think, work, and perform — grounded in behavioral reality rather than self-reported history.

1. The problem with hiring today

The tools companies use to evaluate candidates have barely changed in fifty years. Resumes list credentials. Behavioral interviews ask candidates to recall polished highlights. Reference checks confirm employment dates. Each of these is a proxy — an indirect signal that correlates weakly, and often misleadingly, with actual job performance.

Schmidt and Hunter's (1998) meta-analysis reported that unstructured interviews predict job performance at r = 0.38. Structured interviews improve to r = 0.51, which is better but still explains only about a quarter of the variance in performance (0.51² ≈ 0.26). The same analysis put cognitive ability tests at r = 0.51 and conscientiousness measures at r = 0.31 alone, with higher validity when methods are combined. Standardized assessments therefore match or add to what interviews provide, yet most companies still treat the interview as their primary evaluation tool.

The arrival of generative AI has made this worse. Resumes are now drafted and optimized by machines, so one candidate's resume is hard to tell apart from another's, and candidates can rehearse interview answers with AI coaches. The 400-word resume, already a poor signal, has become noise.

The question is no longer how to interview better. It is whether the interview, as traditionally practiced, should remain the primary source of hiring data at all.

2. Eight years of original research

Between 2012 and 2020, Stori founder Fraser Hill conducted over 1,700 leadership interviews across banking, technology, and professional services. The research, published in The CEO's Greatest Asset (2020), examined what differentiates high performers from the rest — not on paper, but in how they think, decide, communicate, and respond to pressure.

The core finding: the traits that predict success are observable in conversation, but traditional interview formats are not designed to elicit them. Behavioral questions (“Tell me about a time when...”) invite rehearsed performance. They measure interview skill, not the underlying cognitive and behavioral patterns that drive results.

This research identified a set of twelve behavioral facets — observable, measurable dimensions of how people work — organized into four meta-traits. These became the foundation of the Stori Trait Assessment.

3. The Stori Trait Assessment (STA)

The STA is a dual-output psychometric system. From a single, six-minute experience, it produces two complementary views of a candidate:

Stori Traits

Four meta-traits and twelve facets describing behavioral and cognitive tendencies. Applied, narrative, immediately useful for hiring decisions.

Big Five Personality Profile

Openness, Conscientiousness, Extraversion, Agreeableness, Emotional Stability. The most widely studied personality framework in psychology, enabling cross-system benchmarking.

Both outputs derive from the same 120 adjective rankings. The Stori Traits tell you how someone works. The Big Five overlay places them on an established research framework, which allows benchmarking against other instruments. Together, they provide both practical utility and research-grounded context.

4. Assessment design: forced-choice triads

The STA uses a forced-choice triad model. Candidates see 40 blocks of three adjectives and rank each block: most like me (+2), more like me (+1), less like me (-1). Each of the 120 adjectives appears exactly once across the entire assessment.
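The point scheme above can be illustrated with a minimal sketch of raw scoring, which sums rank points per facet. The adjective-to-facet key here is a small hypothetical excerpt, and `score_blocks` is an illustrative name rather than part of the STA implementation.

```python
from collections import defaultdict

# Rank weights stated in the text: most like me, more like me, less like me.
RANK_POINTS = (2, 1, -1)

# Hypothetical excerpt of the adjective-to-facet key (the full key has 120 entries).
ADJECTIVE_FACET = {
    "Inquisitive": "Curious",
    "Analytical": "Curious",
    "Persistent": "Driven",
    "Honest": "Principled",
    "Bold": "Courageous",
    "Eloquent": "Articulate",
}

def score_blocks(ranked_blocks):
    """Sum rank points per facet.

    Each block is a tuple of three adjectives ordered from
    'most like me' to 'less like me'.
    """
    totals = defaultdict(int)
    for block in ranked_blocks:
        if len(block) != 3 or len(set(block)) != 3:
            raise ValueError(f"expected three distinct adjectives, got {block!r}")
        for adjective, points in zip(block, RANK_POINTS):
            totals[ADJECTIVE_FACET[adjective]] += points
    return dict(totals)

# Two example blocks, each drawing on three different meta-traits.
responses = [
    ("Inquisitive", "Eloquent", "Persistent"),
    ("Analytical", "Bold", "Honest"),
]
raw = score_blocks(responses)
print(raw)
```

These raw sums are only the input to scoring; they are not yet comparable across candidates.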

This design is not arbitrary. Forced-choice formats are built to reduce three critical biases that undermine traditional self-report questionnaires:

  • Acquiescence bias — the tendency to agree with statements regardless of content. Forced ranking makes “agree with everything” impossible.
  • Social desirability — the tendency to present oneself favorably. When all three options are positive, there is no “right answer” to game.
  • Central tendency — the tendency to choose middle options. Triads force differentiation.

Each triad draws from three different meta-traits, with each meta-trait omitted from exactly ten triads. This rotation ensures balanced measurement without fatigue. Completion takes approximately six minutes.
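The rotation arithmetic can be checked with a short sketch. With four meta-traits there are four ways to choose three of them, so using each combination ten times yields 40 triads in which every meta-trait is omitted exactly ten times. The placeholder indices and the `build_triads` helper are assumptions for illustration; the actual adjective-to-triad assignment is not described in this paper.

```python
from collections import Counter
from itertools import combinations

META_TRAITS = ("Thinking", "Discipline", "Execution", "Communication")
TRIADS_PER_COMBO = 10  # 4 combinations x 10 = 40 triads

def build_triads():
    """Return 40 triads of (meta_trait, adjective_index) pairs.

    Indices stand in for real adjectives. A production form would also
    shuffle presentation order and balance facets within each meta-trait.
    """
    next_index = {meta: 0 for meta in META_TRAITS}
    triads = []
    for combo in combinations(META_TRAITS, 3):
        for _ in range(TRIADS_PER_COMBO):
            triad = []
            for meta in combo:
                triad.append((meta, next_index[meta]))
                next_index[meta] += 1
            triads.append(tuple(triad))
    return triads

triads = build_triads()

# How often each meta-trait is left out of a triad.
omitted = Counter(
    meta
    for triad in triads
    for meta in META_TRAITS
    if meta not in {m for m, _ in triad}
)
# How often each placeholder adjective is used.
used = Counter(item for triad in triads for item in triad)

print(len(triads), dict(omitted), len(used))
```

Each meta-trait contributes 30 placeholder adjectives (three facets of ten), and each is used once, which matches the 120-adjective total.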

5. Twelve facets of human performance

Each facet is anchored by ten adjectives drawn from the International Personality Item Pool (IPIP) — a public-domain lexical base that underpins decades of peer-reviewed personality research. The adjectives were selected and refined using insights from the 1,700-interview research corpus.

Facet | Meta-Trait | Anchoring Adjectives
Curious | Thinking | Inquisitive, Analytical, Investigative, Exploratory, Probing, Reflective, Questioning, Studious, Insightful, Curious
Imaginative | Thinking | Creative, Visionary, Inventive, Vivid, Abstract, Conceptual, Dreaming, Innovative, Artistic, Imaginative
Intuitive | Thinking | Instinctive, Discerning, Sensitive, Intuitive, Holistic, Foresighted, Subtle, Prescient, Speculative, Strategic
Driven | Discipline | Motivated, Purposeful, Ambitious, Persistent, Determined, Industrious, Self-starting, Diligent, Competitive, Goal-oriented
Principled | Discipline | Honest, Ethical, Trustworthy, Principled, Reliable, Genuine, Transparent, Moral, Fair, Authentic
Consistent | Discipline | Organized, Structured, Steady, Predictable, Consistent, Systematic, Dependable, Thorough, Regular, Focused
Courageous | Execution | Brave, Bold, Confident, Decisive, Daring, Resilient, Assertive, Fearless, Steadfast, Courageous
Adaptable | Execution | Flexible, Resourceful, Versatile, Calm, Composed, Open-minded, Easy-going, Agile, Adaptable, Patient
Accountable | Execution | Responsible, Dependable, Dutiful, Loyal, Reliable, Committed, Conscientious, Answerable, Accountable, Faithful
Articulate | Communication | Well-spoken, Clear, Coherent, Concise, Verbal, Precise, Fluent, Lucid, Eloquent, Articulate
Influential | Communication | Persuasive, Charismatic, Inspiring, Confident, Assertive, Convincing, Poised, Engaging, Energetic, Motivating
Perceptive | Communication | Perceptive, Observant, Attentive, Attuned, Receptive, Tactful, Astute, Responsive, Empathetic, Aware

6. Scoring: from ipsative to normative

Forced-choice data is inherently ipsative — it tells you which traits are strongest relative to each other within one person, but not how that person compares to others. This is a well-known limitation of forced-choice formats.
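The ipsative constraint follows directly from the point scheme: every block hands out the same three values (+2, +1, -1), so each respondent's total across all facets is fixed at 40 × 2 = 80 regardless of how they answer. The simulation below is a small illustration with randomly answering respondents; the function name is hypothetical.

```python
import random

RANK_POINTS = (2, 1, -1)
N_BLOCKS = 40

def simulate_total(seed):
    """Total raw points for one simulated respondent answering at random."""
    rng = random.Random(seed)
    total = 0
    for _ in range(N_BLOCKS):
        ranks = list(RANK_POINTS)
        rng.shuffle(ranks)  # any ranking is a permutation of the same points
        total += sum(ranks)
    return total

# Collect the distinct totals seen across 100 simulated respondents.
totals = {simulate_total(seed) for seed in range(100)}
print(totals)
```

Because the total never varies, a higher raw score on one facet must come at the expense of others, which is why raw sums cannot be compared between people.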

The STA resolves this using established psychometric methods developed for forced-choice instruments, including ipsative-to-normative conversion techniques. The process:

  1. Estimate latent trait scores per facet from the forced-choice rankings.
  2. Compute z-scores relative to a normative distribution.
  3. Transform to T-scores (mean = 50, SD = 10) for interpretability.
  4. Generate percentile equivalents for benchmarking.
  5. Assign banding: High (≥65), Moderate-High (55-64), Average (45-54), Low (≤44).
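Steps 2 through 5 can be sketched as follows. This is a minimal illustration under stated assumptions: the norm mean and SD are hypothetical values, the percentile assumes a normal norm distribution, and T-scores are rounded to the nearest integer before banding (the paper does not say how fractional scores are handled). Step 1, latent trait estimation, is outside the sketch.

```python
from statistics import NormalDist

# Hypothetical norm parameters for one facet; real values would come
# from a norming sample.
NORM_MEAN = 0.3
NORM_SD = 1.0

def to_t_score(latent, norm_mean=NORM_MEAN, norm_sd=NORM_SD):
    """Steps 2 and 3: z-score against the norm, then rescale to mean 50, SD 10."""
    z = (latent - norm_mean) / norm_sd
    return 50.0 + 10.0 * z

def to_percentile(t_score):
    """Step 4: percentile equivalent, assuming normally distributed T-scores."""
    return 100.0 * NormalDist(mu=50.0, sigma=10.0).cdf(t_score)

def to_band(t_score):
    """Step 5: band labels from the text, applied to the rounded T-score (assumption)."""
    t = round(t_score)
    if t >= 65:
        return "High"
    if t >= 55:
        return "Moderate-High"
    if t >= 45:
        return "Average"
    return "Low"

t = to_t_score(1.2)  # hypothetical latent estimate for one facet
print(round(t), round(to_percentile(t)), to_band(t))
```

A latent estimate 0.9 SD above the norm mean gives a T-score near 59, which sits at roughly the 82nd percentile and in the Moderate-High band.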

The result is interval-level data suitable for radar charts, statistical models, and cross-candidate comparison — not just “this person is more curious than they are organized,” but “this person is in the 82nd percentile for curiosity.”

7. Dual-output mapping

Each Stori facet carries a primary Big Five loading and, where the literature supports it, a secondary loading. This allows the same 120 data points to produce both the Stori profile and a Big Five Personality Profile.

Big Five Dimension | Stori Meta-Trait Link | Contributing Facets
Openness | Thinking | Curious, Imaginative, Intuitive
Conscientiousness | Discipline + Execution | Driven, Principled, Consistent, Accountable
Extraversion | Communication | Articulate, Influential
Agreeableness | Communication | Perceptive (+ secondary loadings)
Emotional Stability | Execution | Courageous, Adaptable

The Stori meta-traits are designed to align with their Big Five anchors while preserving the distinctiveness of the Stori framework. Formal construct validity studies are planned as assessment data accumulates.
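One way to picture the mapping is as a lookup from each Big Five dimension to its contributing facets. The sketch below uses only the primary loadings from the table and combines facets with an unweighted mean. That aggregation rule is an assumption made for illustration, since the actual loadings and weights are not published here, and secondary loadings are omitted.

```python
from statistics import mean

# Primary loadings from the mapping table; secondary loadings are omitted.
PRIMARY_FACETS = {
    "Openness": ["Curious", "Imaginative", "Intuitive"],
    "Conscientiousness": ["Driven", "Principled", "Consistent", "Accountable"],
    "Extraversion": ["Articulate", "Influential"],
    "Agreeableness": ["Perceptive"],
    "Emotional Stability": ["Courageous", "Adaptable"],
}

def big_five_profile(facet_t):
    """Unweighted mean of contributing facet T-scores (illustrative rule only)."""
    return {
        dimension: round(mean(facet_t[f] for f in facets), 1)
        for dimension, facets in PRIMARY_FACETS.items()
    }

# Hypothetical facet T-scores for one candidate.
facet_t = {
    "Curious": 62, "Imaginative": 55, "Intuitive": 48,
    "Driven": 58, "Principled": 52, "Consistent": 50, "Accountable": 60,
    "Articulate": 45, "Influential": 51, "Perceptive": 57,
    "Courageous": 66, "Adaptable": 54,
}
profile = big_five_profile(facet_t)
print(profile)
```

An average of several T-scores generally has a narrower spread than a single T-score, so composites like these would need to be re-standardized against a norm group before being reported on the same scale.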

8. Interview intelligence: the evidence layer

A trait score tells you what someone is like. An interview tells you what they have done. Neither is complete alone. The STA framework is designed so that both speak the same language.

When a candidate completes a Stori interview, the full transcript is analyzed by AI to extract structured intelligence across the same facet dimensions measured by the trait assessment. The system identifies timestamped moments where the candidate demonstrates specific facets: Courageous when they describe a high-stakes decision, Driven when they talk about pursuing an aggressive target, Accountable when they own a mistake.

These highlights are surfaced as tagged, seekable moments in the interview player. A hiring manager can click “Driven” and watch the 45 seconds where it showed up. This is not a summary or a score — it is the primary evidence, timestamped and accessible.
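A tagged, seekable moment can be modeled as a small record holding a facet label and a time range. The sketch below is one possible shape, not the product's data model: the `Highlight` fields, the `moments_for` helper, and the sample quotes are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Highlight:
    """One tagged moment in an interview recording (hypothetical schema)."""
    facet: str
    start_s: float  # seconds from the start of the recording
    end_s: float
    quote: str

    @property
    def duration_s(self):
        return self.end_s - self.start_s

def moments_for(highlights, facet):
    """Return the highlights for one facet in playback order."""
    return sorted(
        (h for h in highlights if h.facet == facet),
        key=lambda h: h.start_s,
    )

# Illustrative data only.
highlights = [
    Highlight("Driven", 905.0, 950.0, "We set a target nobody thought we could hit."),
    Highlight("Accountable", 410.0, 440.0, "That miss was on me."),
    Highlight("Driven", 120.0, 150.0, "I asked to own the hardest region."),
]

driven = moments_for(highlights, "Driven")
print([(h.start_s, h.duration_s) for h in driven])
```

A player would seek to `start_s` when the hiring manager selects a facet, so the evidence is watched in context rather than read as a summary.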

The unified report cross-references trait scores with interview evidence, showing alignment strength for every facet. Strong alignment means the candidate's self-assessed personality matches their demonstrated behavior. Weak alignment is a signal worth exploring — and now you can explore it with a click.

9. The Narrative Method: AI-resistant by design

The Stori interview uses what we call the Narrative Method. Rather than asking behavioral questions that invite rehearsed answers, the interview asks candidates to tell their story in the context of specific, verifiable details:

  • Names and relationships — “Who was your manager? Tell me about them.” AI cannot fabricate authentic relationship dynamics.
  • Rankings and comparisons — “Rank your last three roles by how much you learned.” These force genuine reflection, not rehearsed narratives.
  • Context and consequence — “What happened after that?” Follow-up probes test depth. Surface-level answers are flagged automatically.

The AI interviewer has a probe-or-proceed protocol: if an answer lacks specifics, it asks follow-up questions. If the candidate provides concrete detail, it moves on. This ensures every transcript contains genuine behavioral data, not rehearsed highlights.
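The control flow of probe-or-proceed can be sketched as a simple decision function. Everything specific here is an assumption: the real system judges specificity with AI, whereas this sketch uses a crude word-level heuristic, and the probe cap and threshold are hypothetical values chosen for the example.

```python
MAX_PROBES = 2      # hypothetical cap on follow-up questions per topic
MIN_SPECIFICS = 2   # hypothetical threshold for "concrete detail"

PRONOUNS = {"I", "I'm", "I'd", "I've", "I'll"}

def specificity(answer):
    """Crude stand-in for the AI's judgement of concrete detail.

    Counts words containing digits plus capitalized words after the
    first word, which are treated as likely names.
    """
    words = answer.split()
    numbers = sum(any(ch.isdigit() for ch in w) for w in words)
    names = sum(w[0].isupper() for w in words[1:] if w not in PRONOUNS)
    return numbers + names

def next_action(answer, probes_used):
    """Decide whether to ask a follow-up or move to the next question."""
    if specificity(answer) >= MIN_SPECIFICS:
        return "proceed"
    if probes_used >= MAX_PROBES:
        return "flag_and_proceed"  # surface-level answer is flagged
    return "probe"

vague = "We worked hard and it went well."
concrete = "My manager was Priya Shah and we cut churn by 12% in 2019."
print(next_action(vague, 0), next_action(concrete, 0), next_action(vague, 2))
```

The cap keeps an interview from stalling on one question, while the flag preserves the signal that the answer stayed at surface level.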

10. Psychometric reliability

The STA is designed to meet the following reliability benchmarks, consistent with established forced-choice personality instruments:

Internal Consistency Target | alpha .84 / omega .88
Test-Retest Target | r = .78 - .82
Standard Error Target | SEM = +/- 3.5 T

These benchmarks will be refined as assessment data accumulates and norming populations are established. The AI narrative layer operates under strict constraints: it can polish phrasing but cannot modify any numeric score. All computations are deterministic, logged, and auditable.
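The standard-error target is consistent with the classical relationship SEM = SD × sqrt(1 − reliability) on the T-score scale (SD = 10). The sketch below works through that arithmetic; whether the targets were derived this way is an assumption, and the function names are illustrative.

```python
from math import sqrt

T_SD = 10.0  # standard deviation of the T-score scale

def standard_error(reliability, sd=T_SD):
    """Classical standard error of measurement."""
    return sd * sqrt(1.0 - reliability)

def one_sem_interval(t_score, reliability):
    """Range of plus or minus one SEM around an observed T-score."""
    e = standard_error(reliability)
    return (t_score - e, t_score + e)

sem_omega = standard_error(0.88)  # omega target
sem_alpha = standard_error(0.84)  # alpha target
low, high = one_sem_interval(59.0, 0.88)
print(round(sem_omega, 2), round(sem_alpha, 2), round(low, 1), round(high, 1))
```

At the omega target of .88 the SEM comes to about 3.5 T-score points, so an observed T-score of 59 would be read as roughly 55.5 to 62.5, which can straddle a band boundary.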

11. Operational guardrails

The STA enforces a strict separation between data and narrative:

  • Scores are deterministic. The AI cannot change, round, or reinterpret any numeric output. T-scores, percentiles, and banding are computed by fixed algorithms.
  • Narrative generation follows templates. Each facet-band combination has a pre-written deterministic narrative. The AI polishes language but cannot alter meaning.
  • PII is masked. No personally identifiable information is sent to AI providers for narrative generation.
  • Every run is logged. Timestamps, model versions, and run IDs create a complete audit trail.
  • Fallback is always deterministic. If the AI fails validation twice, the system falls back to raw deterministic text with no AI involvement.
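The score-protection and fallback rules above can be sketched as a validation wrapper around the polishing step. This is a minimal illustration, assuming that validation means "the polished text contains exactly the same numbers as the template text". The `polish` callable stands in for an AI provider call, and all names here are hypothetical.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")
MAX_ATTEMPTS = 2  # the text specifies fallback after two failed validations

def numbers_in(text):
    """All numeric tokens in the text, sorted so order changes are tolerated."""
    return sorted(NUMBER.findall(text))

def polish_with_fallback(template_text, polish):
    """Return (text, source), accepting AI output only if no number changed."""
    expected = numbers_in(template_text)
    for attempt in range(1, MAX_ATTEMPTS + 1):
        candidate = polish(template_text, attempt)
        if numbers_in(candidate) == expected:
            return candidate, f"ai_attempt_{attempt}"
    return template_text, "deterministic_fallback"

template = "Curiosity: T-score 59, 82nd percentile (Moderate-High)."

# Hypothetical polishers used only to exercise the wrapper.
def rounding_polish(text, attempt):
    return "Highly curious, at around the 80th percentile."

def faithful_polish(text, attempt):
    return "Shows strong curiosity: a T-score of 59, at the 82nd percentile."

print(polish_with_fallback(template, rounding_polish))
print(polish_with_fallback(template, faithful_polish))
```

The returned source label is the kind of value that could be written to the audit log alongside timestamps, model versions, and run IDs.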

12. A unified language for understanding people

The Stori Trait Assessment is not another hiring tool bolted onto the same broken process. It is a new measurement system designed from first principles:

  • A forced-choice assessment that reduces the biases of traditional self-report.
  • A dual-output framework that produces both applied Stori Traits and Big Five personality scores from one experience.
  • A Narrative Method interview designed to surface authentic behavior, not rehearsed performance.
  • Interview highlights that tag behavioral evidence to the same facet language as the trait assessment.
  • A unified report that cross-references personality data with demonstrated behavior, so every score has evidence behind it.

The result is a candidate intelligence layer where traits, interviews, and evidence speak the same language — and where hiring decisions are based on measurement, not proxies.

Lexical base from the International Personality Item Pool (IPIP) — public domain. Scoring, dual-output mapping, interview intelligence, and visualization architecture are patent pending. © 2025 Stori Labs / Fraser Hill. All rights reserved.

This assessment is an interpretive tool grounded in established psychometric frameworks. It is not a clinical or diagnostic instrument.