New AI frameworks tackle concept extraction, taxonomy generation, and materials discovery
By PulseAugur Editorial
Summary by gemini-2.5-flash-lite
from 11 sources
Researchers have developed SC-Taxo, a framework using large language models (LLMs) to generate semantically consistent hierarchical taxonomies for scientific literature. This approach addresses inconsistencies in existing methods by employing a bidirectional generation mechanism that refines both bottom-up abstractions and top-down constraints. Experiments show SC-Taxo improves hierarchy alignment and heading quality, even generalizing to Chinese scientific literature.
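The bidirectional idea — abstract papers upward into candidate headings, then prune those headings against top-down constraints — can be caricatured in a few lines. This is a hypothetical sketch in the spirit of the description above, not SC-Taxo's actual pipeline; the function names, stub grouping logic, and `allowed_roots` constraint are all assumptions.

```python
# Hypothetical two-pass taxonomy refinement (illustrative only; the real
# SC-Taxo framework uses LLM generation, which is stubbed out here).

def bottom_up(papers):
    """Abstract papers into candidate parent headings (stub grouping)."""
    groups = {}
    for title, topic in papers:
        groups.setdefault(topic, []).append(title)
    return groups

def top_down(groups, allowed_roots):
    """Drop headings that violate top-down constraints (stub filter)."""
    return {h: ts for h, ts in groups.items() if h in allowed_roots}

papers = [("SAE probing", "interpretability"),
          ("Transcoder circuits", "interpretability"),
          ("ElementBERT", "materials")]
taxonomy = top_down(bottom_up(papers),
                    allowed_roots={"interpretability", "materials"})
print(taxonomy)
```

In the real framework both passes would be LLM calls that iterate until the hierarchy stabilizes; the point here is only the shape of the loop.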
arXiv:2605.00620v1 Announce Type: new Abstract: Scientific literature is expanding at an unprecedented pace, making it increasingly challenging to efficiently organize and access domain knowledge. A high-quality scientific taxonomy offers a structured and hierarchical representat…
arXiv cs.AI
Usha Bhalla, Thomas Fel, Can Rager, Sheridan Feucht, Tal Haklay, Daniel Wurgaft, Siddharth Boppana, Matthew Kowal, Vasudev Shyam, Jack Merullo, Atticus Geiger, Ekdeep Singh Lubana
arXiv:2604.28119v1 Announce Type: cross Abstract: Sparse autoencoders (SAEs) are widely used to extract interpretable features from neural network representations, often under the implicit assumption that concepts correspond to independent linear directions. However, a growing bo…
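The "independent linear directions" assumption the abstract questions is easiest to see in a minimal SAE: a linear encoder with a ReLU produces a sparse code, and each row of the decoder is treated as one concept direction, so the reconstruction is a sum of active directions. The sketch below uses random untrained weights purely to show the mechanics; it is not the cited paper's code.

```python
import numpy as np

# Minimal sparse-autoencoder sketch (illustrative; weights are random, not
# trained). Decoder rows play the role of candidate "concept directions"
# under the linear-representation assumption.

rng = np.random.default_rng(0)
d_model, d_dict = 8, 32                    # activation dim, dictionary size

W_enc = rng.normal(size=(d_model, d_dict)) * 0.1
b_enc = np.zeros(d_dict)
W_dec = rng.normal(size=(d_dict, d_model)) * 0.1

def sae_encode(x):
    # ReLU yields a nonnegative, typically sparse feature code.
    return np.maximum(x @ W_enc + b_enc, 0.0)

def sae_decode(z):
    # Reconstruction = weighted sum of decoder concept directions.
    return z @ W_dec

x = rng.normal(size=d_model)
z = sae_encode(x)
x_hat = sae_decode(z)
sparsity = float((z > 0).mean())           # fraction of active features
print(sparsity, x_hat.shape)
```

If concepts are not actually independent linear directions, individual dictionary rows stop being faithful concept handles even when reconstruction is good, which is the failure mode the paper investigates.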
arXiv:2502.14912v2 Announce Type: replace Abstract: We present a framework for generating universal semantic embeddings of chemical elements to advance materials inference and discovery. This framework leverages ElementBERT, a domain-specific BERT-based natural language processin…
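One common use of element embeddings like those ElementBERT produces is nearest-neighbor lookup for substitution candidates. The snippet below shows only that cosine-similarity pattern; the vectors are random placeholders, not ElementBERT's actual embeddings, and the element choice is arbitrary.

```python
import numpy as np

# Toy element-embedding similarity lookup. Embeddings are random
# placeholders standing in for learned vectors (e.g., from ElementBERT).

rng = np.random.default_rng(1)
elements = ["Fe", "Co", "Ni", "Cu"]
emb = {el: rng.normal(size=16) for el in elements}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = "Fe"
ranked = sorted((el for el in elements if el != query),
                key=lambda el: cosine(emb[query], emb[el]), reverse=True)
print(ranked)   # elements most similar to Fe under these toy vectors
```

With real learned embeddings, the same ranking step is what turns "semantic embeddings of chemical elements" into concrete candidates for materials inference.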
arXiv cs.CL
Tomer Ashuach, Dana Arad, Aaron Mueller, Martin Tutek, Yonatan Belinkov
arXiv:2508.13650v3 Announce Type: replace Abstract: As large language models (LLMs) are increasingly deployed in real-world applications, the need to selectively remove unwanted knowledge while preserving model utility has become paramount. Recent work has explored sparse autoenc…
arXiv:2604.23758v1 Announce Type: new Abstract: The discovery of novel materials is critical for global energy and quantum technology transitions. While deep learning has fundamentally reshaped this landscape, existing predictive or generative models typically operate in isolatio…
arXiv:2604.24936v1 Announce Type: cross Abstract: Techniques for concept extraction, such as sparse autoencoders and transcoders, aim to extract high-level symbolic concepts from low-level nonsymbolic representations. When these extracted concepts are used for downstream tasks su…
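The downstream use the abstract mentions — steering a model with an extracted concept — usually amounts to adding a scaled concept direction to an activation vector. The sketch below is a generic illustration of that pattern with synthetic vectors; the steering strength `alpha` and the vectors themselves are assumptions, not values from any paper.

```python
import numpy as np

# Generic activation-steering sketch: nudge an activation along a unit
# concept direction. All vectors here are synthetic placeholders.

rng = np.random.default_rng(2)
activation = rng.normal(size=8)            # stand-in for a model activation
concept_dir = rng.normal(size=8)
concept_dir /= np.linalg.norm(concept_dir) # normalize to a unit direction

alpha = 3.0                                # assumed steering strength
steered = activation + alpha * concept_dir

# The projection onto the concept direction grows by exactly alpha.
delta = float((steered - activation) @ concept_dir)
print(round(delta, 6))
```

This is also why faithfulness of the extracted direction matters: if the "concept" vector entangles several features, the same additive edit moves the model along all of them at once.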