METR Blog

26 April 2024

Emma Abele is METR’s new Executive Director

Emma moves from President to Executive Director; Beth moves to Head of Research.

15 March 2024

Autonomy Evaluation Resources

A collection of resources for evaluating potentially dangerous autonomous capabilities of frontier models.

29 February 2024

Portable Evaluation Tasks via the METR Task Standard

METR has published a standard way to define tasks for evaluating the capabilities of AI agents.

07 February 2024

2023 Year In Review

A summary of what METR accomplished in 2023 – our first full year of operation.

16 December 2023

Bounty: Diverse hard tasks for LLM agents

METR (formerly ARC Evals) is looking for (1) ideas, (2) detailed specifications, and (3) well-tested implementations for tasks to measure the performance of autonomous LLM agents.

04 December 2023

ARC Evals is now METR

ARC Evals is wrapping up our incubation period at ARC and spinning off into our own standalone nonprofit.