BeCapable Research

Applied AI lab notes

Benchmarks and field systems for AI that has to make decisions in messy, practical environments.

Benchmarks
Field data
Decision logs
Replay proof
Eval loop
Agent traces
Public demos

Current Tracks

DukaanBench

A 30-day kirana store benchmark where AI agents manage stock, cash, trust, khata, marketing, and customer service.

Evaluation Interfaces

Tools for making model outputs inspectable: replay screens, evidence logs, scoreboards, and provider response audits.

Field Systems

Applied experiments for education, commerce, operations, and public-interest workflows where AI has to survive real constraints.