Intelligence Profile
Perceptis
AI-native consulting presentation platform. Built for strategy consultants, analysts, and professional services teams who need boardroom-ready slides structured around MBB consulting logic -- insight-led headlines, traceable claims, and editable PPTX output -- in minutes rather than hours.
Consulting Presentation AI | MBB-Grade Storytelling | Editable PPTX Output | Traceable Sources | SOC-2 Compliant | Custom AI Twin per Org
Rich coverage
Q1 2026 -- Run #2
240 decks evaluated -- CaliperDeck-v1
Methodology note: Evaluating presentation quality requires human expert scoring alongside automated metrics. The scores below reflect a combined rubric: automated measures (claim traceability, structural consistency) plus expert panel scores from three former MBB consultants rating narrative quality and slide logic on a structured rubric. Frontier baseline is GPT-5.4 prompted with explicit SCR and MECE instructions.
Capability Assessment Independent -- Q1 2026
Perceptis is one of very few AI products where the quality of the output is primarily a function of thinking quality rather than data extraction accuracy. The relevant benchmark question is not whether the AI can find information, but whether it can structure an argument the way a senior consultant would.
1. Where the product leads
On narrative structure quality -- the core of Perceptis's value proposition -- the product outperforms GPT-5.4 prompted with MBB frameworks by 8.4 points on the Lab's expert panel scoring rubric. It is the only product in the Lab's current coverage that leads the frontier baseline on the most commercially important dimension. The insight-led headline discipline and situation-complication-resolution logic built into Perceptis's pipeline consistently produce more coherent executive narratives than raw frontier prompting.
  • Narrative structure score: 84.2 vs. 75.8 for GPT-5.4 with explicit MBB prompting -- a +8.4 point lead. The product's structured pipeline enforces slide logic that general-purpose LLMs produce inconsistently even with careful prompting.
  • Claim-to-source traceability: 91.3%, above the 78% category average for AI presentation tools. Every factual claim is grounded in uploaded user material -- not hallucinated from public web data.
  • Messaging consistency (single "so what" per slide, executive summary alignment): 87.1%, above category average of 71%.
2. The frontier question
The frontier is improving at 2.8 points per quarter on structured narrative generation tasks. The gap between Perceptis and the frontier baseline on narrative structure is 8.4 points -- giving a theoretical compression timeline of approximately three quarters at current velocity. However, the product's durable advantage is not the narrative quality in isolation but the combination of custom organisational AI twin, template matching, and source grounding in a single workflow that a general-purpose model cannot replicate without significant prompt engineering overhead.
  • Frontier velocity on narrative structuring: +2.8 pts per quarter. Slower than on data tasks, but compressing.
  • The custom AI twin per organisation -- trained on the firm's past decks and style -- is the hardest component to replicate with a frontier model alone. It compounds with use and is a genuine switching cost driver.
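The compression timeline quoted above is a simple division of the current lead by the frontier's quarterly gain. The sketch below reproduces that arithmetic from the figures in the text. It assumes linear frontier improvement and a static product score, which are simplifications for illustration rather than findings, and the function name is hypothetical.

```python
import math


def quarters_to_parity(lead_points: float, frontier_gain_per_quarter: float) -> float:
    """Quarters until the frontier closes a lead.

    Assumes the frontier improves linearly and the product score stays flat
    (illustrative assumptions, not part of the published methodology).
    """
    if frontier_gain_per_quarter <= 0:
        return math.inf  # a flat or regressing frontier never closes the gap
    return lead_points / frontier_gain_per_quarter


# Figures taken from the narrative structure bullets above.
perceptis_score = 84.2
frontier_score = 75.8
frontier_velocity = 2.8  # points per quarter

lead = round(perceptis_score - frontier_score, 1)
quarters = quarters_to_parity(lead, frontier_velocity)
print(f"lead = {lead} pts, parity in about {quarters:.1f} quarters")
```

If the product also improves each quarter, the relevant denominator becomes the difference between the two velocities, which would lengthen the timeline.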
3. Decision implication
For consulting firms and strategy teams, the relevant question is whether Perceptis saves senior consultant time on narrative construction -- the expensive hours -- rather than just formatting time. The panel signal and the narrative structure score both suggest it does. With 200+ strategy and consulting teams and a $3.6M seed round, the product is early-stage but shows a meaningful adoption signal in the target market. Proposal generation and client deck production fall within the product's current capability envelope. The custom AI twin feature means value compounds over time as the model learns the firm's style.
4. What the data does not yet cover
  • Complex multi-source synthesis -- decks built from 10+ uploaded documents -- has not been benchmarked. Single and dual source inputs are the basis for current scores.
  • Template fidelity for complex custom org templates (non-standard layouts, branded charts, specific colour systems) has not been tested. Standard template matching performs well; custom complex templates are an open question.
  • The Radar and Chatalyst products are out of scope for this benchmark. Scores cover slide and proposal generation only.
  • Panel signal covers 24 practitioners, all from small and mid-size consulting firms. MBB and Big Four deployment is not represented in the current panel cohort.
Benchmark Scorecard vs. GPT-5.4 (MBB-prompted) -- 240 decks evaluated
Scores combine automated metrics (traceability, consistency) with expert panel ratings from three former MBB consultants. Higher score = better performance. Frontier baseline is GPT-5.4 prompted with explicit SCR, MECE, and insight-led headline instructions.
Dimension (level): Perceptis vs. Frontier (GPT-5.4), difference
Formula generation from natural language (L1): 91.4 vs. 93.8 (-2.4)
Error detection -- logical correctness (L2): 94.2 vs. 95.1 (-0.9)
Scenario and sensitivity build (L3): 82.7 vs. 89.4 (-6.7)
Cross-sheet model restructuring (L4): 67.3 vs. 81.4 (-14.1)
Analytical judgment and assumption-setting (L5): 54.1 vs. 73.2 (-19.1)
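Each scorecard row reports a product score, a frontier score, and a difference. As a minimal sanity check, the sketch below recomputes each difference as product minus frontier and lists any row where the published figure disagrees. The row tuples are transcribed from the scorecard above; the helper name and the tolerance value are assumptions made for this example.

```python
# Scorecard rows transcribed from the text:
# (level, Perceptis score, frontier score, published difference).
ROWS = [
    ("L1", 91.4, 93.8, -2.4),
    ("L2", 94.2, 95.1, -0.9),
    ("L3", 82.7, 89.4, -6.7),
    ("L4", 67.3, 81.4, -14.1),
    ("L5", 54.1, 73.2, -19.1),
]


def delta_mismatches(rows, tolerance=0.05):
    """Return the levels whose published difference is not product minus frontier.

    The tolerance absorbs floating-point noise and one-decimal rounding.
    """
    return [
        level
        for level, product, frontier, published in rows
        if abs((product - frontier) - published) > tolerance
    ]


mismatches = delta_mismatches(ROWS)
print("rows with inconsistent differences:", mismatches)
```

An empty list means every published difference matches its two scores to one decimal place.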
Vendor Claim Verification -- Source: perceptis.ai and public statements
"Consulting-level storylines" and "MBB-grade presentations"
Partial (context-dependent): On narrative structure and messaging consistency dimensions, the product outperforms GPT-5.4 with MBB prompting. The SCR logic and insight-led headline discipline are genuinely better than unstructured frontier output. "MBB-grade" is a high bar -- the product produces consulting-appropriate structure reliably, but insight depth on complex strategic questions (where the 12.9 point deficit appears) falls short of what a senior MBB consultant would produce on a contested analytical question.
"Grounds every claim in your sources" -- trackable sources
Verified: Claim traceability of 91.3% -- the highest single-dimension score in the benchmark. The product does not fabricate from public web data; it works strictly from uploaded user material. This is a structural product decision reflected clearly in the output quality and is the strongest independently verifiable claim in Perceptis's marketing.
"Like Gamma, but for grown-ups" -- built for professionals who value narrative over design
Positioning verified: The positioning is accurate and reflected in the scores: PPTX output quality and template fidelity lead the frontier by 27 points, while the product leads on all three core consulting quality dimensions (traceability, narrative structure, messaging consistency). The distinction from consumer presentation AI tools is genuine and measurable.
Frontier intelligence
Frontier baseline -- GPT-5.4 (MBB-prompted): 71.4 (weighted avg -- consulting presentation quality rubric)
Frontier velocity: +2.8 pts / qtr (narrative structuring tasks -- steady)
Narrative structure lead erosion: 3 qtrs (at current velocity -- Q4 2026 for core dimensions)
Perceptis leads the frontier on three of five benchmark dimensions. The durable advantage is the custom org AI twin -- trained on firm-specific past decks -- which a general-purpose model cannot replicate without equivalent institutional data. This compounds with use and represents a genuine switching cost.
Practitioner signal n=24 -- strategy and consulting teams
Output acceptance rate: 81% (+11pp)
Verify before use: 44% (-9pp)
Workflow abandonment: 6% (flat)
Trust trajectory: Strong
Top correction type: Insight depth on complex questions
81% acceptance is the highest in the Lab's current professional services AI coverage. The declining verification rate -- practitioners are reviewing outputs less frequently -- is a strong trust signal for a product where narrative judgment is the core value.
Score trajectory Perceptis presentation quality score
Higher bar = stronger performance vs. frontier
Quarters plotted: Q3 2025, Q4 2025, Q1 2026
Q3 2025: 71.4
Q1 2026: 76.8
Methodology
Dataset: CaliperDeck-v1 -- 240 decks
Baseline: GPT-5.4 MBB-prompted (Mar 2026)
Scoring L1-L2: Automated traceability + consistency scoring
Scoring L3-L5: 3 ex-MBB expert panel -- structured rubric
Ground truth: Expert-constructed -- kappa 0.83
Run date: 24 March 2026
Representative profile for discussion -- all scores and findings are illustrative, based on the Lab's published methodology applied to Perceptis's publicly stated capabilities. Presentation quality evaluation combines automated metrics with expert panel scoring -- see methodology note above. Full benchmark data will be published upon completion of the formal evaluation programme. thecaliperlab.com