How Soultrace Works: A Technical Deep-Dive (v3.0)

Most personality tests are glorified surveys. Fixed questions, fixed order, dump answers into a spreadsheet, calculate percentages. Boring. Statistically lazy.

Soultrace uses a two-stage latent trait model with adaptive question selection. We infer underlying psychological traits from your answers, then compute your color profile as a learned transformation of those traits. This post explains the actual math and code behind it.

The Problem with Direct Classification

Traditional personality tests (and our v2.x system) update archetype probabilities directly from answers. This has issues:

  1. No principled trait modeling: Colors are computed from raw answer patterns, not grounded psychological constructs
  2. Response style bias: Someone who always answers "strongly agree" gets different results than someone who answers "agree" - even if they have the same underlying personality
  3. Calibration nightmare: You need to calibrate likelihood tables for every question × color × score combination

Our solution: a latent trait layer that mediates between answers and colors.

The Architecture

Answers → Trait Updates (Bayesian) → Weight Matrix × Traits → Softmax → Colors
                ↓
         ERS Conditioning

Instead of updating 5 color probabilities directly, we update 8 psychological trait probabilities. Colors emerge as a learned function of traits.

The Eight Latent Traits

We model eight binary traits, each representing a psychologically grounded dimension with established research backing:

| Trait | Cluster | Description | Key Research |
|---|---|---|---|
| Conscientiousness | White | Organization, dependability, self-discipline | Costa & McCrae (1992) |
| Need for Cognition | Blue | Enjoying effortful cognitive activity | Cacioppo & Petty (1982) |
| Analytical Thinking | Blue | Preference for systematic reasoning | Frederick (2005) |
| Agency Motivation | Black | Drive for achievement and power | Bakan (1966); Wiggins (1991) |
| Promotion Focus | Black | Orientation toward gains and aspirations | Higgins (1997) |
| Sensation Seeking | Red | Need for novel, intense experiences | Zuckerman (1979) |
| Emotional Expressivity | Red | Comfort with displaying emotions | Kring et al. (1994) |
| Communion Motivation | Green | Drive for connection and belonging | Bakan (1966) |

Each trait is modeled as P(trait = true) ∈ [0, 1], starting at 0.5 (maximum uncertainty).

Step 1: Bayesian Trait Updates

Question-Trait Mappings

Each question specifies which traits it measures and how:

QUESTION_TRAIT_UPDATES[questionId] = {
  conscientiousness: 'POSITIVE',     // Agreement → trait likely true
  sensationSeeking: 'NEGATIVE',      // Disagreement → trait likely true
  analyticalThinking: 'WEAK_POSITIVE' // Modest positive signal
  // Unlisted traits use NON_INFORMATIVE (no update)
}

Templates define the likelihood P(score | trait = true):

| Template | Effect |
|---|---|
| POSITIVE | High scores (6, 7) indicate trait presence |
| NEGATIVE | Low scores (1, 2) indicate trait presence |
| WEAK_* | Same direction, half the update strength |
| NON_INFORMATIVE | Uniform likelihood; no information |

Symmetric Likelihood

Here's the key insight: we derive P(score | trait = false) by reversing the score:

P(s | trait = false) = P(8 − s | trait = true)

Where 1↔7, 2↔6, 3↔5, and crucially: 4↔4.

This ensures neutral answers (score = 4) produce zero posterior shift. If you're ambivalent, you don't bias the estimate.
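To make the reversal concrete, here is a minimal sketch. The likelihood numbers and the `posterior` helper are illustrative, not the production code; only the reversal rule itself comes from the model:

```typescript
// Minimal sketch of score reversal; likelihood values are hypothetical.
type LikertScore = 1 | 2 | 3 | 4 | 5 | 6 | 7

function reverseScore(s: LikertScore): LikertScore {
  return (8 - s) as LikertScore // 1↔7, 2↔6, 3↔5, 4↔4
}

// Generic two-hypothesis Bayes update.
function posterior(prior: number, pGivenTrue: number, pGivenFalse: number): number {
  return (pGivenTrue * prior) / (pGivenTrue * prior + pGivenFalse * (1 - prior))
}

// A POSITIVE-style likelihood table (hypothetical numbers summing to 1).
const P_GIVEN_TRUE: Record<LikertScore, number> =
  { 1: 0.03, 2: 0.06, 3: 0.10, 4: 0.14, 5: 0.18, 6: 0.22, 7: 0.27 }

// Neutral answer: P(4 | false) = P(reverse(4) | true) = P(4 | true),
// so the likelihood ratio is 1 and the prior comes back unchanged.
const p = posterior(0.5, P_GIVEN_TRUE[4], P_GIVEN_TRUE[reverseScore(4)])
console.log(p) // 0.5 — zero posterior shift
```

An extreme answer, by contrast, pits P(7 | true) against P(1 | true), which is where the update gets its pull.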

The Update

For each trait with a non-uniform template:

P(trait = true | s) = P(s | true) · P(true) / P(s)

Where: P(s) = P(s | true) · P(true) + P(s | false) · P(false)

function updateTrait(
  currentProb: number,
  templateName: UpdateTemplateName,
  answer: LikertScore,
  ersProb: number
): number {
  const template = UPDATE_TEMPLATES[templateName]

  // P(score | trait=true) - mixture conditioned on ERS
  const pScoreGivenTrue = getMixtureLikelihood(template, answer, ersProb)

  // P(score | trait=false) = P(reversed_score | trait=true)
  const reversedAnswer = reverseScore(answer)
  const pScoreGivenFalse = getMixtureLikelihood(template, reversedAnswer, ersProb)

  // Bayes update
  const pScore = pScoreGivenTrue * currentProb + pScoreGivenFalse * (1 - currentProb)
  return (pScoreGivenTrue * currentProb) / pScore
}

Step 2: Extreme Response Style (ERS) Conditioning

The Problem

Some people systematically choose extreme answers (1, 7). Others prefer moderate responses (3, 4, 5). This response style is independent of actual traits, but naive models conflate them.

The Solution

We model ERS as a separate binary latent variable, updated by every answer:

const ERS_LIKELIHOOD = {
  true:  { 1: 0.22, 2: 0.16, 3: 0.08, 4: 0.08, 5: 0.08, 6: 0.16, 7: 0.22 }, // Extremes
  false: { 1: 0.04, 2: 0.10, 3: 0.20, 4: 0.32, 5: 0.20, 6: 0.10, 7: 0.04 }  // Moderate
}

Each update template has two variants (extreme/moderate). The actual likelihood is a mixture:

P(s | trait = true) = P(ERS) · P_extreme(s) + (1 − P(ERS)) · P_moderate(s)

The extreme variant is flatter (less peaked), so extreme responders don't get inflated trait estimates just because they pick endpoints.
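A sketch of how that mixture can be computed. The template shape and all numbers here are hypothetical; only the mixture formula comes from the model:

```typescript
// ERS-conditioned mixture likelihood; template numbers are hypothetical.
type LikertScore = 1 | 2 | 3 | 4 | 5 | 6 | 7
type LikelihoodTable = Record<LikertScore, number>

interface UpdateTemplate {
  extreme: LikelihoodTable  // flatter: endpoint answers are less diagnostic
  moderate: LikelihoodTable // more peaked around mid-to-high answers
}

// Illustrative POSITIVE template variants (each sums to 1).
const POSITIVE_TEMPLATE: UpdateTemplate = {
  extreme:  { 1: 0.05, 2: 0.07, 3: 0.10, 4: 0.12, 5: 0.16, 6: 0.22, 7: 0.28 },
  moderate: { 1: 0.02, 2: 0.05, 3: 0.10, 4: 0.16, 5: 0.25, 6: 0.27, 7: 0.15 },
}

function getMixtureLikelihood(
  template: UpdateTemplate,
  answer: LikertScore,
  ersProb: number
): number {
  // P(s | trait=true) = P(ERS)·P_extreme(s) + (1 − P(ERS))·P_moderate(s)
  return ersProb * template.extreme[answer] +
         (1 - ersProb) * template.moderate[answer]
}
```

As P(ERS) rises, a "7" carries less evidential weight: the extreme variant explains endpoint answers partly as style rather than trait.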

Update Order

  1. Update P(ERS) based on answer extremity
  2. Update all traits using the new P(ERS)

This cascading update means the first few answers calibrate our response style estimate, which then conditions all subsequent trait updates.
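The ERS step itself is a plain two-hypothesis Bayes update over the likelihood table above. A sketch (`updateERS` is an illustrative name, not the production API; the table values mirror the post):

```typescript
// Bayes update for the ERS latent variable; table values are from the post.
type LikertScore = 1 | 2 | 3 | 4 | 5 | 6 | 7

const ERS_LIKELIHOOD: Record<'true' | 'false', Record<LikertScore, number>> = {
  true:  { 1: 0.22, 2: 0.16, 3: 0.08, 4: 0.08, 5: 0.08, 6: 0.16, 7: 0.22 },
  false: { 1: 0.04, 2: 0.10, 3: 0.20, 4: 0.32, 5: 0.20, 6: 0.10, 7: 0.04 },
}

function updateERS(prior: number, answer: LikertScore): number {
  const pTrue = ERS_LIKELIHOOD.true[answer]
  const pFalse = ERS_LIKELIHOOD.false[answer]
  return (pTrue * prior) / (pTrue * prior + pFalse * (1 - prior))
}

// An endpoint answer raises P(ERS); a midpoint answer lowers it.
console.log(updateERS(0.5, 7)) // 0.22 / (0.22 + 0.04) ≈ 0.846
console.log(updateERS(0.5, 4)) // 0.08 / (0.08 + 0.32) ≈ 0.2
```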

Step 3: Trait-to-Color Transformation

The Weight Matrix

Colors are computed via a learned linear transformation:

logits = W × traits + bias
colors = softmax(logits)

Where W is a 5×8 matrix:

const TRAIT_TO_COLOR_WEIGHTS = [
  // [consc, NFC, analyt, agency, promo, sens, emot, comm]
  [ 0.9,  0.1,  0.3, -0.2, -0.1, -0.4, -0.2,  0.2], // White
  [ 0.3,  0.8,  0.7,  0.1,  0.1, -0.2,  0.0,  0.0], // Blue
  [ 0.2,  0.2,  0.2, 0.75, 0.65,  0.2, -0.1, -0.4], // Black
  [-0.4,-0.15, -0.3,  0.2,  0.3,  0.8,  0.7, -0.1], // Red
  [ 0.2,  0.1,  0.0, -0.4, -0.2, -0.2,  0.3,  0.9], // Green
]

Positive weights mean the trait increases that color. The bias vector ensures uniform colors when all traits are at 0.5.

function computeColors(state: TraitState): ColorDistribution {
  const traitVector = TRAIT_IDS.map(id => state.traits[id])

  const logits: number[] = []
  for (let c = 0; c < 5; c++) {
    let sum = TRAIT_TO_COLOR_BIAS[c]
    for (let t = 0; t < 8; t++) {
      sum += TRAIT_TO_COLOR_WEIGHTS[c][t] * traitVector[t]
    }
    logits.push(sum)
  }

  return softmax(logits)
}

Why This Works

The weight matrix encodes domain knowledge about how traits map to colors:

  • High conscientiousness + low sensation seeking → White
  • High NFC + high analytical → Blue
  • High agency + high promotion focus → Black
  • High sensation + high emotional expressivity → Red
  • High communion + low agency → Green

The softmax ensures valid probabilities. The bias ensures calibration (all traits at 0.5 → 20% each color).
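One way to satisfy that calibration constraint is to derive the bias directly from the weights: setting bias[c] = −(row sum)/2 exactly cancels W × 0.5, so every logit is zero at maximum uncertainty. This is a sketch of the construction; the production bias may be set differently:

```typescript
// Deriving a bias so that all-0.5 traits yield uniform colors (a sketch).
const TRAIT_TO_COLOR_WEIGHTS = [
  [ 0.9,  0.1,  0.3, -0.2, -0.1, -0.4, -0.2,  0.2], // White
  [ 0.3,  0.8,  0.7,  0.1,  0.1, -0.2,  0.0,  0.0], // Blue
  [ 0.2,  0.2,  0.2, 0.75, 0.65,  0.2, -0.1, -0.4], // Black
  [-0.4,-0.15, -0.3,  0.2,  0.3,  0.8,  0.7, -0.1], // Red
  [ 0.2,  0.1,  0.0, -0.4, -0.2, -0.2,  0.3,  0.9], // Green
]

// bias[c] = −0.5 · Σ_t W[c][t] cancels the dot product with a 0.5-vector.
const TRAIT_TO_COLOR_BIAS = TRAIT_TO_COLOR_WEIGHTS.map(
  row => -0.5 * row.reduce((a, w) => a + w, 0)
)

function softmax(logits: number[]): number[] {
  const m = Math.max(...logits) // subtract max for numerical stability
  const exps = logits.map(x => Math.exp(x - m))
  const z = exps.reduce((a, b) => a + b, 0)
  return exps.map(e => e / z)
}

// All traits at 0.5 → every logit is 0 → softmax returns 20% per color.
const uniform = softmax(
  TRAIT_TO_COLOR_WEIGHTS.map((row, c) =>
    TRAIT_TO_COLOR_BIAS[c] + row.reduce((a, w) => a + 0.5 * w, 0))
)
console.log(uniform) // each entry ≈ 0.2
```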

Step 4: Adaptive Question Selection

Information Gain

We select questions to maximize expected information gain across traits. For each candidate question:

function computeInformationGain(questionId: number, state: TraitState): number {
  const updates = QUESTION_TRAIT_UPDATES[questionId] ?? {}
  let totalIG = 0

  for (const [traitId, templateName] of Object.entries(updates)) {
    const currentProb = state.traits[traitId]

    // Current entropy
    const hBefore = binaryEntropy(currentProb)

    // Expected entropy after observing answer
    let hAfterExpected = 0
    for (let s = 1; s <= 7; s++) {
      const pScore = predictScoreProbability(s, traitId, templateName, state)
      const pPosterior = bayesianUpdate(currentProb, s, templateName)
      hAfterExpected += pScore * binaryEntropy(pPosterior)
    }

    // Weight by template strength
    const weight = templateName.startsWith('WEAK_') ? 0.5 : 1.0
    totalIG += weight * (hBefore - hAfterExpected)
  }

  return totalIG
}

A high information gain means the answer is expected to substantially reduce our uncertainty about the traits the question measures.
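The binaryEntropy helper used above is the standard Shannon entropy of a Bernoulli variable:

```typescript
// Shannon entropy (in bits) of a binary variable with P(true) = p.
function binaryEntropy(p: number): number {
  if (p <= 0 || p >= 1) return 0 // no uncertainty at the extremes
  return -(p * Math.log2(p) + (1 - p) * Math.log2(1 - p))
}

console.log(binaryEntropy(0.5)) // 1 — maximum uncertainty
console.log(binaryEntropy(0.9)) // ≈ 0.469 — nearly resolved
```

Because entropy is flat near 0 and 1, a trait we've already pinned down contributes almost no gain, and question selection naturally shifts toward unresolved traits.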

Coverage Bonus

To ensure all traits get measured, we add a bonus for under-probed traits:

bonus(q) = Σ_{trait ∈ q} 1 / (1 + count[trait])

Where count[trait] is how many questions have already measured this trait.

Combined Score

score = info_gain + β × coverage_bonus

With β = 0.3, we favor informative questions while ensuring coverage.

Softmax Sampling

Questions are selected via temperature-controlled softmax:

P(select qᵢ) = exp(scoreᵢ / T) / Σⱼ exp(scoreⱼ / T)

With T = 0.2 (low temperature → favor high scores with some stochasticity).
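Putting the scoring pieces together, a sketch of what the scoring stage might compute per candidate (β and T are the values from the post; the function names are illustrative):

```typescript
// Combined scoring + temperature softmax; β and T per the post.
const BETA = 0.3
const TEMPERATURE = 0.2

function softmaxWithTemperature(scores: number[], T: number): number[] {
  const m = Math.max(...scores) // subtract max for numerical stability
  const exps = scores.map(s => Math.exp((s - m) / T))
  const z = exps.reduce((a, b) => a + b, 0)
  return exps.map(e => e / z)
}

// bonus(q) = Σ 1 / (1 + count[trait]) over the traits q measures
function coverageBonus(traitIds: string[], counts: Record<string, number>): number {
  return traitIds.reduce((acc, t) => acc + 1 / (1 + (counts[t] ?? 0)), 0)
}

// score_i = infoGain_i + β · bonus_i, then softmax over all candidates
function selectionProbabilities(infoGains: number[], bonuses: number[]): number[] {
  const scores = infoGains.map((ig, i) => ig + BETA * bonuses[i])
  return softmaxWithTemperature(scores, TEMPERATURE)
}
```

The low temperature makes the distribution sharply favor the top-scoring question while leaving some randomness, so two users with similar trait states don't see identical question sequences.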

function selectNextQuestion(
  traitState: TraitState,
  selectionState: QuestionSelectionState,
  allQuestionIds: number[]
): number | null {
  const scored = scoreQuestions(traitState, selectionState, allQuestionIds)
  if (scored.length === 0) return null // no unasked questions left

  // Inverse-CDF sample from the softmax distribution
  const r = Math.random()
  let cumulative = 0
  for (const q of scored) {
    cumulative += q.probability
    if (r <= cumulative) {
      return q.questionId
    }
  }

  // Fallback for floating-point shortfall in the cumulative sum
  return scored[scored.length - 1].questionId
}

The Complete Flow

async function runAssessment(questionPool: Question[]): Promise<ColorDistribution> {
  let traitState = createInitialTraitState()  // All traits at 0.5, ERS at 0.5
  let selectionState = createQuestionSelectionState()
  const allQuestionIds = questionPool.map(q => q.id)

  for (let i = 0; i < MAX_QUESTIONS; i++) {
    // Select next question (the first at random, before any signal exists)
    const questionId = i === 0
      ? randomChoice(allQuestionIds)
      : selectNextQuestion(traitState, selectionState, allQuestionIds)
    if (questionId === null) break  // question pool exhausted

    // Get answer (1-7); resolves when the user responds
    const answer = await presentQuestion(questionId)

    // Update trait state (ERS first, then all traits)
    traitState = updateOnAnswer(traitState, questionId, answer)

    // Track coverage
    selectionState = recordAskedQuestion(selectionState, questionId)
  }

  // Compute final colors from traits
  return computeColors(traitState)
}

Why This Actually Works

Psychologically Grounded

Unlike opaque "color likelihoods," our traits have established psychological foundations. Need for Cognition, Agency/Communion, Regulatory Focus - these are real constructs with decades of research. This makes the model interpretable and testable.

Response Style Invariant

ERS conditioning means someone who always picks "strongly agree" gets similar trait estimates to someone who picks "agree" - only the ERS probability differs. This eliminates a major source of bias in self-report assessments.

Calibrated by Design

The weight matrix and bias ensure:

  • All traits at 0.5 → uniform color distribution (20% each)
  • Random answers → approximately uniform colors

We verified this with Monte Carlo simulations.

Efficient Convergence

Trait-based information gain focuses questions on high-uncertainty traits. We converge to stable estimates in ~24 questions instead of 50-100.

Limitations

Myopic selection: We pick questions greedily based on current state. A globally optimal strategy might sacrifice short-term information for better long-term probing.

Weight matrix is hand-tuned: We derived weights from psychological theory and calibrated via simulation. A learned matrix from user data might perform better.

Independence assumption: We treat answers as conditionally independent given traits. In reality, there may be order effects or fatigue.

Try It

That's the actual methodology. No hand-waving, no "proprietary AI" bullshit.

Take the test and see if the math holds up for you: Start Assessment


References

  • Bakan, D. (1966). The duality of human existence. Basic Books.
  • Cacioppo, J. T., & Petty, R. E. (1982). The need for cognition. Journal of Personality and Social Psychology, 42(1), 116-131.
  • Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO-PI-R). Psychological Assessment Resources.
  • Frederick, S. (2005). Cognitive reflection and decision making. Journal of Economic Perspectives, 19(4), 25-42.
  • Higgins, E. T. (1997). Beyond pleasure and pain. American Psychologist, 52(12), 1280-1300.
  • Kring, A. M., Smith, D. A., & Neale, J. M. (1994). Individual differences in dispositional expressiveness. Journal of Personality and Social Psychology, 66(5), 934-949.
  • Wiggins, J. S. (1991). Agency and communion as conceptual coordinates for the understanding and measurement of interpersonal behavior. In D. Cicchetti & W. M. Grove (Eds.), Thinking clearly about psychology (Vol. 2, pp. 89-113). University of Minnesota Press.
  • Zuckerman, M. (1979). Sensation seeking: Beyond the optimal level of arousal. Lawrence Erlbaum Associates.