AI Evaluation

This section presents a systematic evaluation of entity extraction quality, comparing Claude (an LLM) against spaCy (a statistical NLP library) on U.S. government domain text. Explore the results, browse the gold dataset, and read the full methodology.

Explore