
Benchmarking OpenAI’s APIs and other Large Language Models for Repeatable and Efficient Question Answering Across Multiple Documents


Abstract:

The rapid growth of document volumes and complexity in various domains necessitates advanced automated methods to enhance the efficiency and accuracy of information extraction and analysis. This paper evaluates the efficiency and repeatability of OpenAI’s APIs and other Large Language Models (LLMs) in automating question-answering tasks across multiple documents, specifically focusing on analyzing the Data Privacy Policy (DPP) documents of selected EdTech providers. We test how well these models perform on large-scale text processing tasks using OpenAI’s LLMs (GPT-3.5 Turbo, GPT-4, GPT-4o) and APIs in several frameworks: direct API calls (i.e., one-shot learning), LangChain, and Retrieval Augmented Generation (RAG) systems. We also evaluate a local deployment of a quantized LLM (Llama-2-13B-chat-GPTQ) combined with FAISS. Through systematic evaluation against predefined use cases and a range of metrics, including response format, execution time, and cost, our study aims to provide insights into optimal practices for document analysis. Our findings demonstrate that using OpenAI’s LLMs via API calls is a practical approach for accelerating document analysis when local GPU-powered infrastructure is not a viable option, particularly for long texts. On the other hand, local deployment is valuable for keeping data within private infrastructure. Our findings also show that the quantized models remain highly relevant despite having fewer parameters than ChatGPT, and they impose no restrictions on the number of tokens processed. In addition to confirming the usefulness of LLMs for improving document analysis procedures, this study offers insights into maximizing their use for better efficiency and data governance.
Date of Conference: 08-11 September 2024
Date Added to IEEE Xplore: 04 November 2024
Conference Location: Belgrade, Serbia

I. Introduction

Traditional document analysis methods often rely on manual review or simplistic keyword-based searches, leading to significant inefficiencies and limitations. As the volume and diversity of documents continue to expand, the need arises for innovative approaches that streamline analysis while preserving accuracy and comprehensiveness. In this study, we explore different methods for assisting the analysis of selected EdTech providers’ Data Privacy Policy (DPP) documents. For this task, we aim to evaluate the efficacy, consistency, benefits, and limitations of various LLMs in assessing DPP documents. The importance of such an assessment stems from the need for an automated, scalable, and reliable way to systematically analyze large bodies of text semantically. We also aim to determine the optimal way of using LLMs with respect to the factor being optimized - whether it is the price, the execution time, or the format of the answers provided that plays the key role in a given technical task. These LLMs are trained on extensive datasets and exhibit remarkable proficiency in generating human-like responses and cognitive reasoning across diverse tasks. LLMs are built to handle various tasks, such as text generation, translation, content summarization, chatbot conversations, and more [1].
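To make the retrieval-augmented setup concrete, the sketch below illustrates the retrieval step that precedes the LLM call in a RAG pipeline. It is a minimal, self-contained approximation: the paper uses FAISS over embedded document chunks, whereas here a toy bag-of-words cosine similarity stands in for neural embeddings, and the policy snippets are hypothetical examples, not text from the evaluated DPP documents.

```python
# Minimal sketch of the retrieval step in a RAG pipeline.
# Assumptions: bag-of-words vectors replace neural embeddings, and the
# "chunks" below are invented stand-ins for real DPP document passages.
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' standing in for a neural embedder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Hypothetical policy chunks; in practice these come from splitting a
# DPP document and indexing the embeddings (e.g., with FAISS).
chunks = [
    "We collect student names and email addresses for account creation.",
    "Data is retained for two years after account deletion.",
    "Third parties may receive anonymized usage statistics.",
]
context = retrieve("How long is student data retained?", chunks)
# The retrieved context is then prepended to the prompt sent to the LLM,
# so the model answers from the document rather than from memory alone.
```

In a production pipeline, a library such as FAISS replaces the linear scan in `retrieve` with an approximate nearest-neighbor index, which is what makes the approach scale to large document collections.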

