Text Analytics Evaluation Framework: A Case Study on LLMs and Social Media
A new evaluation framework assesses how well large language models (LLMs) analyze social media data. The framework comprises 470 curated questions and was applied to Twitter datasets for tasks such as sentiment analysis and hate speech detection. The study found that LLM performance degrades significantly as input scale increases, especially beyond 500 instances and on numerical tasks, pointing to architectural limitations for quantitative analysis of large text collections.
AI IMPACT: Highlights critical architectural bottlenecks in current LLMs for quantitative analysis over large text collections.