An Empirical Analysis on Large Language Models in Debate Evaluation
April 23, 2024
In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation.