NoisyCausal: A Benchmark for Evaluating Causal Reasoning Under Structured Noise
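New AI methods advance causal discovery for complex, noisy, and large-scale data
By the PulseAugur editorial team · [37 sources]
Several recent arXiv papers introduce new methods and benchmarks for causal discovery, a field focused on identifying causal relationships from data. The advances include techniques for handling noisy or incomplete data, incorporating expert knowledge, and improving scalability to large datasets. New benchmarks and testing frameworks are also being developed to rigorously evaluate how robust existing causal discovery algorithms are when their assumptions are violated, particularly for time series data and natural language reasoning.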
AI
Causal inference, estimating causal effects from observational data, is a fundamental tool in many disciplines. Of particular importance across a variety of domains is the continuous treatment setting, where the variable of intervention has a continuous range. This setting is far…
Causal discovery, the problem of inferring the direction of causality, is generally ill-posed. We use the language of structural causal models (SCM) to show that assuming that the causal relations are acyclic and invariant across multiple environments (e.g., the way minimum wage …
Selection bias is pervasive in observational studies. For example, large-scale biobank data can exhibit "healthy volunteer bias" when respondents are healthier and of higher socio-economic status than the population they are meant to represent. Recovering causal effects from s…
The instrumental-variables (IV) setting is standard for partial identification of causal effects when unobserved confounding makes point identification impossible. Existing approaches face methodological bottlenecks: closed-form bound estimands are required -- e.g., Balke-Pearl e…
Constraint-based causal discovery is widely used for learning causal structures, but heavy reliance on conditional independence (CI) testing makes it computationally expensive in high-dimensional settings. To mitigate this limitation, many divide-and-conquer frameworks have been …
In causal inference, confounders are variables that influence both treatment decisions and outcomes. However, unlike in randomized clinical trials, the treatment assignment mechanism in observational studies is not known, and it is thus unclear which covariates act as confound…
Recent work on causal abstraction, in particular graphical approaches focusing on causal structure between clusters of variables, aims to summarize a high-dimensional causal structure in terms of a low-dimensional one. Existing methods for learning such summaries from data assume…
Causal inference, especially in observational studies, relies on untestable assumptions about the true data-generating process. Sensitivity analysis helps us determine how robust our conclusions are when we alter these underlying assumptions. Existing frameworks for sensitivity a…
arXiv cs.LG
Marvin Sextro, Weronika Kłos, Gabriel Dernbach
arXiv:2601.21092v3 Announce Type: replace Abstract: Planning effective interventions in biological systems requires treatment-effect models that adapt to unseen biological contexts by identifying their specific underlying mechanisms. Yet single-cell perturbation datasets span onl…
arXiv cs.LG
Shicheng Fan, Nour Elhendawy, Jianle Sun, Ke Fang, Kun Zhang, Yihang Wang, Lu Cheng
arXiv:2605.05524v1 Announce Type: new Abstract: Causal representation learning (CRL) seeks to recover latent variables with identifiability guarantees, typically up to permutation and component-wise reparameterization under appropriate assumptions. However, identifiability does n…
arXiv:2605.05568v1 Announce Type: cross Abstract: Despite the growing availability of large datasets, causal structure learning remains computationally prohibitive at scale. We revisit sparsest-permutation learning for linear structural equation models and show that exact Cholesk…
arXiv:2605.05743v1 Announce Type: cross Abstract: Gaussian process marginal likelihood scores and kernel conditional independence tests are theoretically appealing for nonlinear causal discovery but computationally prohibitive at scale. We present two complementary RFF-based meth…
arXiv cs.LG
Adrick Tench, Thomas Demeester
arXiv:2601.16715v2 Announce Type: replace Abstract: Would-be practitioners of causal discovery face a dizzying array of algorithms without a clear best choice. This abundance of competitive methods makes ensembling a natural strategy for practical applications. At the same time, …
arXiv:2605.04313v1 Announce Type: new Abstract: Causal reasoning in natural language requires identifying relevant variables, understanding their interactions, and reasoning about effects and interventions, often under noisy or ambiguous conditions. While large language models (L…
arXiv cs.LG
Bruno Petrungaro, Anthony C. Constantinou
arXiv:2605.04081v1 Announce Type: new Abstract: Causal Bayesian Networks (CBNs) are a powerful tool for reasoning under uncertainty about complex real-world problems. Such problems evolve over time, responding to external shocks as they occur. To support decision-making, CBNs req…
arXiv cs.LG
Geert Mesters, Alvaro Ribot, Anna Seigal, Piotr Zwiernik
arXiv:2605.04381v1 Announce Type: cross Abstract: Causal discovery methods such as LiNGAM identify causal structure from observational data by assuming mutually independent disturbances. This assumption is fragile: shared volatility, common scale effects, or other forms of depend…
arXiv cs.LG
Thomas S. Robinson, Ranjit Lall
arXiv:2605.04838v1 Announce Type: cross Abstract: The standard constraint-based paradigm for causal discovery with incomplete data -- impute first, test second -- is frequently miscalibrated: any consistent conditional independence (CI) test rejects a true null with probability a…
arXiv cs.LG
Gideon Stein, Niklas Penzel, Tristan Piater, Joachim Denzler
arXiv:2605.03045v1 Announce Type: new Abstract: Causal Discovery (CD) is a powerful framework for scientific inquiry. Yet, its practical adoption is hindered by a reliance on strong, often unverifiable assumptions and a lack of robust performance assessment. To address these limi…
Causal discovery methods such as LiNGAM identify causal structure from observational data by assuming mutually independent disturbances. This assumption is fragile: shared volatility, common scale effects, or other forms of dependence can cause the methods to recover the wrong ca…
Causal reasoning in natural language requires identifying relevant variables, understanding their interactions, and reasoning about effects and interventions, often under noisy or ambiguous conditions. While large language models (LLMs) exhibit strong general reasoning abilities,…
arXiv stat.ML
Francesco Montagna, Francesco Locatello
arXiv:2605.13589v1 Announce Type: new Abstract: Causal discovery, the problem of inferring the direction of causality, is generally ill-posed. We use the language of structural causal models (SCM) to show that assuming that the causal relations are acyclic and invariant across mu…
arXiv stat.ML
Oliver J. Hines, Caleb H. Miles
arXiv:2510.16127v2 Announce Type: replace Abstract: The ratio of two probability density functions is a fundamental quantity that appears in many areas of statistics and machine learning, including causal inference, reinforcement learning, covariate shift, outlier detection, inde…
arXiv stat.ML
Jin Du, Li Chen, Xun Xian, An Luo, Fangqiao Tian, Ganghua Wang, Charles Doss, Xiaotong Shen, Jie Ding
arXiv:2505.13770v3 Announce Type: replace-cross Abstract: Reliable causal inference is essential for making decisions in high-stakes areas like medicine, economics, and public policy. However, it remains unclear whether large language models (LLMs) can handle rigorous and trustwo…
Causal sensitivity analysis aims to provide bounds for causal effect estimates in the presence of unobserved confounding. However, existing methods for causal sensitivity analysis are per-instance procedures, meaning that changes to the dataset, causal query, sensitivity level, o…
arXiv:2605.06993v1 Announce Type: cross Abstract: Causal queries are often only partially identifiable from observational data, and experiments that could tighten the resulting bounds are typically costly. We study the problem of selecting, prior to observing experimental outcome…
arXiv stat.ML
Shakeel Gavioli-Akilagun, Kieran Wood, Francesco Quinzan
arXiv:2605.05809v1 Announce Type: cross Abstract: We propose a framework for determining whether the causal dependence of an outcome $Y$ on a covariate $X$ changes at a given time point, given confounders $\boldsymbol{Z}$. For instance, in financial markets, the effect of a marke…
Causal queries are often only partially identifiable from observational data, and experiments that could tighten the resulting bounds are typically costly. We study the problem of selecting, prior to observing experimental outcomes, a cost-constrained subset of experiments that m…
We propose a framework for determining whether the causal dependence of an outcome $Y$ on a covariate $X$ changes at a given time point, given confounders $\boldsymbol{Z}$. For instance, in financial markets, the effect of a market indicator on asset returns may causally change o…
Gaussian process marginal likelihood scores and kernel conditional independence tests are theoretically appealing for nonlinear causal discovery but computationally prohibitive at scale. We present two complementary RFF-based methods forming a practical toolkit for score-based, c…
Despite the growing availability of large datasets, causal structure learning remains computationally prohibitive at scale. We revisit sparsest-permutation learning for linear structural equation models and show that exact Cholesky factorization is unnecessary for structure recov…
The standard constraint-based paradigm for causal discovery with incomplete data -- impute first, test second -- is frequently miscalibrated: any consistent conditional independence (CI) test rejects a true null with probability approaching 1 when imputation error induces spuriou…
arXiv stat.ML
Xihang Shan, Da Zhou
arXiv:2605.01669v1 Announce Type: new Abstract: External priors of unknown reliability create a brittle trade-off in causal discovery: blind trust amplifies errors, blind rejection wastes signal. Real priors are also heterogeneously reliable -- physical laws are trustworth…
External priors of unknown reliability create a brittle trade-off in causal discovery: blind trust amplifies errors, blind rejection wastes signal. Real priors are also heterogeneously reliable -- physical laws are trustworthy, LLM-suggested edges are speculative -- yet ex…