This paper empirically evaluates the ability of current Large Language Models (LLMs) to analyze macrofinancial coverage in IMF Article IV staff reports, using human economists' assessments as a benchmark. We test several GPT models on reports from 2016–2024, assessing their performance on both qualitative ratings and binary questions. Our findings indicate that the latest models can meaningfully assist economists: in 2024, across advanced GPT models, they achieve an average accuracy of 71–75% on ratings and an average exact-match rate of 76–81% on binary questions. However, we find that LLMs tend to assign higher, less-dispersed ratings than human experts and struggle with open-ended questions that require deep contextual judgment. The paper provides quantitative evidence on current LLM accuracy in this domain, explores the drivers of model performance, and discusses key limitations such as optimistic bias.