Disentangling Similarity and Relatedness in Topic Models
Researchers have developed a new method to distinguish between thematic relatedness and taxonomic similarity in topic models, particularly those augmented with large language models. They created a synthetic benchmark using LLM annotations to train a neural scorer capable of measuring these two semantic axes. This scorer revealed that different topic model families occupy distinct positions in the similarity-relatedness space and that optimizing for one axis can degrade performance on tasks requiring the other.
AI IMPACT: Provides a framework for evaluating the semantic nuances captured by topic models, potentially improving their application in downstream NLP tasks.