A new set of benchmarks is needed to assess artificial intelligence's real-world knowledge and to help specialists understand what AI systems actually know.
Artificial intelligence models have shown impressive performance on law exams, answering multiple-choice, short-answer, and essay questions as well as humans [1]. However, they struggle with real-world legal tasks.
Some lawyers have learned this the hard way: they have been fined for filing AI-generated court briefs that misstated principles of law and cited non-existent cases.
Author: Chaudhri, principal scientist at Knowledge Systems Research in Sunnyvale, California.