Document Type

Article

Publication Date

2025

Journal / Book Title

Journal of Organizational and End User Computing

Abstract

Timely and accurate access to financial data is crucial for empirical research in accounting and finance. However, current data collection processes are often manual, inconsistent, and difficult to scale. This study asks: How can large language models (LLMs) be effectively used to automate financial data collection? Using design science research methodology (DSRM), the author develops a modular architecture that integrates a real-time search API and auxiliary information processing into LLM workflows. The study applies the model to two tasks: extracting ESG report release dates and identifying customer firm tickers from COMPUSTAT. The system achieves 96% and 95% accuracy, respectively, comparable to human performance. This study advances LLM applications in accounting by providing a scalable, practical framework for automating financial data retrieval.

DOI

10.4018/JOEUC.388470

Rights

This article is published as an Open Access article distributed under the terms of the Creative Commons Attribution License (CC-BY) (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the author of the original work and original publication source are properly credited.

Published Citation

Li, Yang. "Collecting Financial Data From Online Sources: Enhancing Large Language Models With Real-Time Search." JOEUC, vol. 37, no. 1, 2025, pp. 1-23. https://doi.org/10.4018/JOEUC.388470
