Research & Insights

LLM Evaluation

Test cases, regression checks, acceptance criteria, and evidence for AI workflows.

Archive

All Articles

LLM Evaluation • 6 min

How to Evaluate an AI Workflow Before Scaling It

A scorecard for deciding whether an AI workflow should scale, stay in pilot, be redesigned, or be rejected.

June 9, 2026

Enterprise AI Workflows • 6 min

Lightweight Process Mining to Find Measurable AI Use Cases

Before building a copilot or agent, map the process friction, decision latency, rework, and evidence needed to prove value.

June 7, 2026

LLM Evaluation • 6 min

Scale-Invariance Testing for AI Systems

A research-backed note on using scale-invariance tests as part of AI system evaluation, falsification, and deployment discipline.

December 8, 2025

LLM Evaluation • 7 min

Recursivity Experiments as a Validation Discipline for Complex AI Systems

A research note on recursive structure, compression, and why complex AI systems need falsifiable evaluation rather than abstract certainty.

December 2, 2025