Architecture Overview

How Tensalis detects and corrects hallucinations in LLM outputs — without calling another LLM.

The Problem: Embedding Similarity Is Not Verification

Modern RAG systems evaluate output quality using semantic similarity — cosine distance between embeddings of the generated response and retrieved context. This is a useful measure of topical relevance, but it has a fundamental blind spot.

Contradictory statements can have high embedding similarity:

Context:   "The policy covers damages up to $50,000."
Response:  "The policy covers damages up to $500,000."

Cosine similarity: 0.97  ←  Nearly identical embeddings
Factual accuracy:  WRONG ←  10x numeric error

This is the "embedding similarity trap." Cosine distance measures topical alignment, not logical consistency. A response can be perfectly on-topic while containing factual contradictions that similarity metrics cannot detect — wrong numbers, flipped negations, swapped entities, fabricated specifics.

Tensalis addresses this by treating verification as a logical inference problem, not a similarity problem.
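The trap is easy to demonstrate even without a learned embedding model. The toy sketch below uses a plain bag-of-words cosine (a stand-in assumption, not the embedding model Tensalis or typical RAG stacks use) on the two sentences above: they contradict each other on the dollar amount, yet their vectors are nearly identical because every other token is shared.

```python
# Toy illustration of the embedding similarity trap, using a
# bag-of-words cosine as a stand-in for a learned embedding.
from collections import Counter
from math import sqrt

def cosine(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)          # Counter returns 0 for missing tokens
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm

context  = "The policy covers damages up to $50,000."
response = "The policy covers damages up to $500,000."
sim = cosine(context, response)  # high similarity despite a 10x numeric error
```

With a real sentence embedding the effect is even stronger, since the model maps both sentences to nearly the same point in semantic space.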

Five-Layer Defense-in-Depth

Every request to /v1/rag passes through five complementary detection layers. The layers are connected by OR-gate logic — if any layer detects an anomaly, the response is flagged. Different hallucination types are caught by different layers, ensuring no single point of failure.

Request
  ├──→ Layer 1A: Semantic Drift Detection    (sub-ms)
  ├──→ Layer 1B: Fabrication Detection       (sub-ms)
  │
  ├──→ OR-Gate: Any trigger?
  │      │
  │      ├──→ Layer 2: Atomic Fact Verification
  │      ├──→ Layer 3: Surgical Correction
  │      └──→ Layer 4: Evidence Chains
  │
  └──→ Layer 5: Audit Ledger (always)
  
  ──→ Response (verified, corrected, logged)
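The control flow in the diagram can be sketched as follows. Function names and the `Verdict` shape are illustrative assumptions, not the actual Tensalis API:

```python
# Sketch of the OR-gate pipeline: cheap pre-filters always run;
# any trigger escalates to the expensive verification layers.
from dataclasses import dataclass, field

@dataclass
class Verdict:
    triggers: dict = field(default_factory=dict)  # which pre-filter layers fired
    verified: bool = True

def run_pipeline(response: str, context: str, drift, fabrication, deep_verify) -> Verdict:
    v = Verdict()
    # Layers 1A/1B: sub-millisecond pre-filters, always executed.
    v.triggers["drift"] = drift(response, context)
    v.triggers["fabrication"] = fabrication(response, context)
    # OR-gate: if ANY layer triggers, run layers 2-4.
    if any(v.triggers.values()):
        v.verified = deep_verify(response, context)
    # Layer 5 (audit ledger) would record v unconditionally here.
    return v
```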

Layer 1A · Pre-Filter
Semantic Drift Detection

Patent Pending

A proprietary, physics-inspired approach to measuring how LLM output evolves through semantic space during generation. Rather than computing a single similarity score for the whole response, this layer analyzes the trajectory of meaning across response segments.

This provides early warning of drift onset — detecting when the LLM begins departing from source material — before the overall response score would reflect the problem. The majority of adversarial cases are caught at this pre-filtering stage in sub-millisecond time, before more expensive verification layers need to run.
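The general idea of trajectory analysis (not the patented Tensalis method, which is proprietary) can be sketched by scoring each response segment against the context and reporting the first segment that falls below a grounding threshold. Token overlap stands in for the real per-segment similarity measure:

```python
# Generic sketch of drift-onset detection: score segments individually
# so we can see WHERE the response departs from the source, rather than
# computing one score for the whole response.
def overlap(segment: str, context: str) -> float:
    s, c = set(segment.lower().split()), set(context.lower().split())
    return len(s & c) / len(s) if s else 0.0

def drift_onset(segments, context, threshold=0.4):
    for i, seg in enumerate(segments):
        if overlap(seg, context) < threshold:
            return i  # index of the segment where drift begins
    return None       # no drift detected
```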

Layer 1B · Pre-Filter
Fabrication Detection

Analyzes the information density of the response relative to the source context. When a response contains significantly more specific detail than the provided material could support, it indicates the LLM is fabricating — inventing plausible-sounding specifics not grounded in the source.

This catches a different failure mode from drift: the response may stay on-topic (no drift detected) but still contain invented details, statistics, or conditions.
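A simplified version of the density check, restricted to numeric tokens (an illustrative assumption; the real heuristic covers more kinds of specifics): any number the response cites that the context never mentions is a fabrication signal.

```python
# Sketch of an information-density check: flag responses that cite
# specific numbers absent from the source context.
import re

NUM = re.compile(r"\$?\d[\d,]*(?:\.\d+)?%?")

def ungrounded_numbers(response: str, context: str) -> set:
    in_ctx = set(NUM.findall(context))
    return {n for n in NUM.findall(response) if n not in in_ctx}

def looks_fabricated(response: str, context: str) -> bool:
    return bool(ungrounded_numbers(response, context))
```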

Layer 2 · Core Engine
Atomic Fact Verification

The core verification engine. Every response is decomposed into individual, typed atomic claims. Each claim is verified independently against the source context.

Claim types and verification:

  Claim Type   What It Catches
  Numeric      "increased by 23%" — verifies the number matches context
  Currency     "costs $9.99" — catches $9.99 vs $99.90 confusion
  Date         "effective January 15" — validates against context dates
  Duration     "within 30 business days" — catches timeframe manipulation
  Entity       "headquartered in London" — verifies named entities and locations
  Negation     "does not cover" — catches polarity flips (most dangerous hallucination type)
  Relation     "CEO of Acme Corp" — validates roles, relationships, attributions
  General      Paraphrased claims verified via Natural Language Inference (NLI)

Each fact type uses a deterministic, type-specific verifier — no LLM calls. When a deterministic verifier returns uncertain, the claim cascades to an NLI model (run locally) that classifies the entailment relationship between the claim and context. This cascading approach ensures both speed (deterministic first) and coverage (NLI fallback).
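The cascade can be sketched for a single claim type. The numeric verifier below is illustrative (the real engine's extraction and matching are more sophisticated): a deterministic check runs first, and only an "uncertain" result falls through to the NLI model, passed in here as a plain callable.

```python
# Sketch of the deterministic-first verification cascade for numeric claims.
import re

def extract_numbers(text: str) -> set:
    return set(re.findall(r"\d[\d,]*(?:\.\d+)?", text))

def verify_numeric(claim: str, context: str) -> str:
    nums, ctx = extract_numbers(claim), extract_numbers(context)
    if not nums:
        return "uncertain"     # nothing to check deterministically
    if nums <= ctx:
        return "supported"     # every cited number appears in context
    return "contradicted"      # at least one number is not grounded

def verify_claim(claim: str, context: str, nli) -> str:
    verdict = verify_numeric(claim, context)
    # Cascade: fall back to the local NLI model only when uncertain.
    return nli(claim, context) if verdict == "uncertain" else verdict
```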

Layer 3 · Correction
Surgical Auto-Correction

When contradictions are identified, the correction engine replaces only the incorrect values while preserving the original response's structure and language.

  • Numeric errors: "45% increase" → "23% increase" (context-grounded value)
  • Entity swaps: "headquartered in London" → "headquartered in New York"
  • Negation flips: "does not cover" → "covers" (based on context polarity)

The original uncorrected response is always preserved in the API response for audit comparison. The caller receives both versions.
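In its simplest form, a surgical numeric correction is a targeted substitution: only the contradicted value changes, and the surrounding wording is preserved. A minimal sketch, assuming the wrong and context-grounded values have already been identified by the verification layer:

```python
# Sketch of surgical correction: replace only the first exact
# occurrence of the wrong value, leaving the rest of the response intact.
import re

def correct_value(response: str, wrong: str, grounded: str) -> str:
    return re.sub(re.escape(wrong), grounded, response, count=1)

original  = "Sales rose by a 45% increase year over year."
corrected = correct_value(original, "45%", "23%")
# corrected == "Sales rose by a 23% increase year over year."
```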

Layer 4 · Explainability
Evidence Chains

Every verdict is fully interpretable — no black-box scoring. The API response includes:

  • Per-fact status: supported, contradicted, unsupported, or uncertain
  • Confidence score calibrated to verification accuracy
  • Source evidence: the exact context passage supporting or contradicting each claim
  • Verification method: which verifier produced each result
  • Detection trigger map: which layers fired and which passed
  • Per-layer timing breakdown in milliseconds
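
The items above translate into a per-fact evidence record along these lines. Field names here are illustrative assumptions, not the documented Tensalis response schema:

```python
# Illustrative shape of one per-fact evidence record in the API response.
evidence = {
    "claim": "The policy covers damages up to $500,000.",
    "type": "currency",
    "status": "contradicted",                        # supported | contradicted | unsupported | uncertain
    "confidence": 0.97,                              # calibrated to verification accuracy
    "source_evidence": "The policy covers damages up to $50,000.",
    "method": "deterministic.currency",              # which verifier produced the result
    "triggers": {"drift": False, "fabrication": True},
    "timing_ms": {"prefilter": 0.4, "verification": 38.0},
}
```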

Layer 5 · Compliance
Hash-Chained Audit Ledger

Every verification result is appended to a tamper-evident audit trail:

  • Cryptographic chain: Each record is hash-linked to the previous record. Modifying any past record breaks the chain from that point forward, making tampering detectable.
  • Deterministic audit IDs: Every request receives a unique, reproducible identifier
  • Full per-fact breakdown stored: every claim, type, status, confidence, evidence, and correction
  • Query API: Filter records by time range, severity, trust verdict, source, or audit ID
  • Aggregate analytics: Trust rate, latency trends, contradiction rate, severity distribution
  • Chain verification endpoint: Re-computes and validates every hash in the ledger on demand
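
The hash-chaining and verification mechanics can be sketched as follows. This illustrates the technique, not the exact Tensalis record format:

```python
# Minimal SHA-256 hash-chained ledger: each entry's hash covers its
# record plus the previous entry's hash, so editing any past record
# breaks every hash from that point forward.
import hashlib
import json

GENESIS = "0" * 64

def append(ledger: list, record: dict) -> None:
    prev = ledger[-1]["hash"] if ledger else GENESIS
    body = {"prev": prev, "record": record}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    ledger.append({**body, "hash": digest})

def verify_chain(ledger: list) -> bool:
    prev = GENESIS
    for entry in ledger:
        body = {"prev": entry["prev"], "record": entry["record"]}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False  # tampering detected
        prev = entry["hash"]
    return True
```

Persisting each entry as one JSON line yields the JSONL ledger; the chain verification endpoint is essentially `verify_chain` run over the full file.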

This provides the foundation for Gartner's AI TRiSM (AI Trust, Risk and Security Management) framework — delivering the Trust pillar (audit trail, evidence chains, explainability) with the infrastructure to build Risk analytics (trend detection, severity tracking) on top.

Why No LLM Dependency?

Many hallucination detection tools use an "LLM-as-judge" pattern — sending the response to another LLM and asking it to evaluate accuracy. Tensalis deliberately avoids this for three reasons: a judge LLM can itself hallucinate, so the check inherits the very failure mode it is meant to catch; per-request inference calls add cost and latency; and LLM verdicts are non-deterministic, making results hard to reproduce for audit.

All models run locally on the same Cloud Run instance — no external inference calls, no API keys to manage, no rate limits to hit.

Design Principles

  Principle                 Rationale
  Zero LLM dependency       Deterministic extractors + local NLI. No external API calls, no per-request cost, reproducible results.
  Defense in depth          Five complementary layers with OR-gate logic. Different layers catch different hallucination types.
  Model agnostic            Works with any LLM provider. Verifies the output, not the model. Drop-in for any RAG pipeline.
  Correct, don't just flag  Surgical replacement of wrong values. The caller receives a usable response, not just a score.
  Auditable by default      Every verification recorded with full evidence and tamper-proof hash linking.
  Sub-200ms latency         Pre-filters catch the majority of cases in sub-ms. Full pipeline under 200ms on 2-vCPU Cloud Run.

Infrastructure

  Component    Technology
  Runtime      Google Cloud Run (auto-scaling, us-central1)
  Framework    FastAPI + Uvicorn
  Embeddings   Sentence Transformers (local, 384-dim)
  NLI          DeBERTa-based entailment model (local)
  NER          spaCy (local)
  Audit        JSONL with SHA-256 hash chaining
  Resources    4 GiB RAM, 2 vCPU, zero GPU

Benchmark Results

  Benchmark                                                                  Cases     Result
  Production accuracy suite                                                  65 cases  63/65 — 96.9%
  Adversarial subset (entity swaps, numeric manipulation, negation attacks)  52 cases  52/52 — perfect detection
  Engine unit tests                                                          40 tests  40/40 passing
  Ledger unit tests                                                          33 tests  33/33 passing

On benchmark scope

The adversarial suite focuses on the hallucination types that cause real regulatory and business risk: entity swaps, numeric errors, currency manipulation, negation flips, and date falsification. These are the cases where a wrong answer triggers compliance violations or financial loss.

Semantic reasoning tasks (unit conversion, modal logic, implicit inference) represent a separate capability tier that requires LLM-level inference — planned for a future release.

All third-party product names and trademarks are the property of their respective owners. References are for informational purposes only.  ·  Last updated: February 2026  ·  v6.1.2