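arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering & systems.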

2509.25666 2026-02-11 cs.LG cs.CL

Nudging the Boundaries of LLM Reasoning

Justin Chih-Yao Chen, Becky Xiangyu Peng, Prafulla Kumar Choubey, Kung-Hsiang Huang, Jiaxin Zhang, Mohit Bansal, Chien-Sheng Wu

Comments ICLR 2026 (Camera-Ready)

Abstract

Current online reinforcement learning (RL) algorithms like GRPO share a key limitation in LLM reasoning: they cannot learn from problems that are "unsolvable" to the model. In other words, they can only improve performance on problems where the model is capable of reaching the correct answer through exploration. Consequently, the model's "upper limit" remains unchanged after RL training, even though the likelihood of solving easier, solvable problems may increase. These hard samples cannot contribute to training, as no rollouts yield rewards and thus no gradients are produced. To unlock learning from these hard samples, we propose NuRL, a "nudging" method that aims to push the upper bound of LLM reasoning using self-generated hints, i.e., abstract cues that help reduce the problem difficulty for the model. Given a question and its gold answer, the model generates a CoT and then produces a hint containing the core knowledge needed to solve the problem. During training, we generate G rollouts from the base policy and use the pass rate to decide whether the hint should be injected. For hard samples with a 0% pass rate, we inject the hint and regenerate a new batch of trajectories. This yields two benefits: (1) the hint boosts pass rates (from 0% to non-zero), thereby introducing training signals for previously unsolvable samples, and (2) the hints are self-generated, which avoids distributional shift and does not rely on external models. NuRL achieves consistent improvements across 6 benchmarks and 3 models, while remaining complementary to test-time scaling. Notably, NuRL can raise the model's upper limit, whereas GRPO leaves pass@1024 unchanged from the base model. Furthermore, we present a systematic study of what makes an effective hint and when hints are most useful. Interestingly, the best hints are abstract and high-level, and are most beneficial when applied only when necessary and after GRPO has converged.
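The rollout-and-nudge step described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the `sample` and `is_correct` callables, the binary reward, and the "Hint:" prompt format are all assumptions made for the example. The advantage is the mean-centred group reward (the division by the group standard deviation is omitted for simplicity), which is enough to show why a group with a 0% pass rate yields no training signal.

```python
from typing import Callable, List, Tuple


def group_advantages(rewards: List[float]) -> List[float]:
    """Mean-centred group advantages (std normalisation omitted for simplicity).

    If every rollout in the group gets the same reward (e.g. all zeros),
    every advantage is zero and the group contributes no gradient.
    """
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]


def rollouts_with_nudge(
    question: str,
    hint: str,
    sample: Callable[[str], str],       # hypothetical: prompt -> model answer
    is_correct: Callable[[str], bool],  # hypothetical: answer checker
    g: int = 8,
) -> Tuple[List[str], List[float], bool]:
    """Sample g rollouts; if the pass rate is 0%, inject the hint and resample.

    Returns (answers, rewards, hint_used).
    """
    answers = [sample(question) for _ in range(g)]
    rewards = [1.0 if is_correct(a) else 0.0 for a in answers]
    if sum(rewards) > 0:
        # At least one rollout succeeded: train on the unhinted batch.
        return answers, rewards, False

    # 0% pass rate: regenerate the whole batch with the hint in the prompt.
    # The prompt template below is an assumption, not taken from the paper.
    hinted_prompt = f"{question}\nHint: {hint}"
    answers = [sample(hinted_prompt) for _ in range(g)]
    rewards = [1.0 if is_correct(a) else 0.0 for a in answers]
    return answers, rewards, True
```

In this sketch the hint is only used when the unhinted batch fails completely, which mirrors the abstract's statement that the pass rate decides whether the hint is injected.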

2509.22761 2026-02-11 cs.CV cs.AI

MILR: Improving Multimodal Image Generation via Test-Time Latent Reasoning

Yapeng Mi, Yanpeng Zhao, Hengli Li, Chenxi Li, Huimin Wu, Xiaojian Ma, Song-Chun Zhu, Ying Nian Wu, Qing Li

Comments 21 pages, 14 figures, 9 tables

2509.22576 2026-02-11 cs.LG cs.CL

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Wujiang Xu, Wentian Zhao, Zhenting Wang, Yu-Jhe Li, Can Jin, Mingyu Jin, Kai Mei, Kun Wan, Dimitris N. Metaxas

2509.21836 2026-02-11 cs.AI cs.MA

Axiomatic Choice

Ben Abramowitz, Nicholas Mattei

2509.21797 2026-02-11 cs.CV

MoWM: Mixture-of-World-Models for Embodied Planning via Latent-to-Pixel Feature Modulation

Yangcheng Yu, Xin Jin, Yu Shang, Xin Zhang, Haisheng Su, Wei Wu, Yong Li

2509.21699 2026-02-11 cs.LG

Exact Subgraph Isomorphism Network with Mixed $L_{0,2}$ Norm Constraint for Predictive Graph Mining

Taiga Kojima, Haruto Kajita, Ayato Kohara, Masayuki Karasuyama

2509.19713 2026-02-11 cs.CV cs.RO

VIMD: Monocular Visual-Inertial Motion and Depth Estimation

Saimouli Katragadda, Guoquan Huang

2509.16596 2026-02-11 cs.CL cs.AI

Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels

Junjie Ye, Yuming Yang, Yang Nan, Shuo Li, Qi Zhang, Tao Gui, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan

Comments Accepted by EMNLP 2025 Main Conference. Codes for parameter restoration are available at https://github.com/UmeanNever/ParamRestore

2509.13761 2026-02-11 cs.AI cs.CL

THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning

Qikai Chang, Zhenrong Zhang, Pengfei Hu, Jun Du, Jiefeng Ma, Yicheng Pan, Jianshu Zhang, Quan Liu, Jianqing Gao

Comments 22 pages, 13 figures, ICLR 2026

2509.13666 2026-02-11 cs.RO cs.AI

DREAM: Domain-aware Reasoning for Efficient Autonomous Underwater Monitoring

Zhenqi Wu, Abhinav Modi, Angelos Mavrogiannis, Kaustubh Joshi, Nikhil Chopra, Yiannis Aloimonos, Nare Karapetyan, Ioannis Rekleitis, Xiaomin Lin

Comments In Proceedings of ICRA 2026

2509.11362 2026-02-11 cs.LG cs.CV

PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits

Loka Li, Wong Yu Kang, Minghao Fu, Guangyi Chen, Zhenhao Chen, Gongxu Luo, Yuewen Sun, Salman Khan, Peter Spirtes, Kun Zhang

Comments ICLR 2026

2509.08422 2026-02-11 cs.CV cs.LG

LD-ViCE: Latent Diffusion Model for Video Counterfactual Explanations

Payal Varshney, Adriano Lucieri, Christoph Balada, Sheraz Ahmed, Andreas Dengel

Comments 44 pages

2509.03219 2026-02-11 cs.AI cs.LG

A Novel Framework for Uncertainty-Driven Adaptive Exploration

Leonidas Bakopoulos, Georgios Chalkiadakis

Comments This is an extended version (full paper + appendix) of the paper titled "A Novel Framework for Uncertainty-Driven Adaptive Exploration", accepted as a full paper at AAMAS 2026. The accepted paper is available at https://openreview.net/forum?id=j5awxzdsU9

2508.21540 2026-02-11 cs.AI

HealthProcessAI: A Technical Framework and Proof-of-Concept for LLM-Enhanced Healthcare Process Mining

Eduardo Illueca-Fernandez, Kaile Chen, Fernando Seoane, Farhad Abtahi

Comments Figure 1 updated, typos corrected, references added, under review

DOI: 10.3389/frai.2026.1716819

Abstract

Process mining has emerged as a powerful analytical technique for understanding complex healthcare workflows. However, its application faces significant barriers, including technical complexity, a lack of standardized approaches, and limited access to practical training resources. We introduce HealthProcessAI, a GenAI framework designed to simplify process mining applications in healthcare and epidemiology by providing a comprehensive wrapper around existing Python (PM4PY) and R (bupaR) libraries. To address unfamiliarity and improve accessibility, the framework integrates multiple Large Language Models (LLMs) for automated process map interpretation and report generation, helping translate technical analyses into outputs that diverse users can readily understand. We validated the framework using sepsis progression data as a proof-of-concept example and compared the outputs of five state-of-the-art LLMs through the OpenRouter platform. In functional testing, the framework successfully processed sepsis data across four proof-of-concept scenarios, demonstrating robust technical performance and its capability to generate reports through automated LLM analysis. An evaluation using five independent LLMs as automated evaluators revealed distinct model strengths: Claude Sonnet-4 and Gemini 2.5-Pro achieved the highest consistency scores (3.79/4.0 and 3.65/4.0, respectively). By integrating multiple LLMs for automated interpretation and report generation, the framework addresses widespread unfamiliarity with process mining outputs, making them more accessible to clinicians, data scientists, and researchers. This combination of structured analytics and AI-driven interpretation represents a novel methodological advance in translating complex process mining results into potentially actionable insights for healthcare applications.

2508.21091 2026-02-11 cs.CV

ERTACache: Error Rectification and Timesteps Adjustment for Efficient Diffusion

Xurui Peng, Chenqian Yan, Hong Liu, Rui Ma, Fangmin Chen, Xing Wang, Zhihua Wu, Songwei Liu, Mingbao Lin

2508.16651 2026-02-11 cs.LG cs.AI

HiCL: Hippocampal-Inspired Continual Learning

Kushal Kapoor, Wyatt Mackey, Yiannis Aloimonos, Xiaomin Lin

Comments In Proceedings of AAAI

2508.15214 2026-02-11 cs.CL

Self-Guided Function Calling in Large Language Models via Stepwise Experience Recall

Sijia Cui, Aiyao He, Shuai Xu, Hongming Zhang, Yanna Wang, Qingyang Zhang, Yajing Wang, Bo Xu

Comments Accepted to EMNLP 2025

2508.08712 2026-02-11 cs.CL cs.AI cs.DC

A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models

Lingzhe Zhang, Liancheng Fang, Chiming Duan, Minghua He, Leyi Pan, Pei Xiao, Shiyu Huang, Yunpeng Zhai, Xuming Hu, Philip S. Yu, Aiwei Liu

2508.08688 2026-02-11 cs.AI cs.CV

STELAR-VISION: Self-Topology-Aware Efficient Learning for Aligned Reasoning in Vision

Chen Li, Han Zhang, Zhantao Yang, Fangyi Chen, Zihan Wang, Anudeepsekhar Bolimera, Marios Savvides

Comments This paper has been accepted at AAAI 2026. This is the author's extended version. The final version will appear in the official proceedings

2508.06617 2026-02-11 cs.LG cs.AI cs.PF

Generalizing Scaling Laws for Dense and Sparse Large Language Models

Md Arafat Hossain, Xingfu Wu, Valerie Taylor, Ali Jannesari

Comments 8 pages, 8 figures

2508.04337 2026-02-11 cs.CL cs.AI cs.HC cs.IR

Modelling and Classifying the Components of a Literature Review

Francisco Bolaños, Angelo Salatino, Francesco Osborne, Enrico Motta

2507.15911 2026-02-11 cs.CV

Local Dense Logit Relations for Enhanced Knowledge Distillation

Liuchi Xu, Kang Liu, Jinshuai Liu, Lu Wang, Lisheng Xu, Jun Cheng

Comments Accepted by ICCV 2025. Code available at https://github.com/yema-web/LDRLD

2507.11097 2026-02-11 cs.CL

The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs

Zichen Wen, Jiashu Qu, Zhaorun Chen, Xiaoya Lu, Dongrui Liu, Zhiyuan Liu, Ruixi Wu, Yicun Yang, Xiangqi Jin, Haoyun Xu, Xuyang Liu, Weijia Li, Chaochao Lu, Jing Shao, Conghui He, Linfeng Zhang

Comments Accepted by ICLR 2026

Abstract

Diffusion-based large language models (dLLMs) have recently emerged as a powerful alternative to autoregressive LLMs, offering faster inference and greater interactivity via parallel decoding and bidirectional modeling. However, despite strong performance in code generation and text infilling, we identify a fundamental safety concern: existing alignment mechanisms fail to safeguard dLLMs against context-aware, masked-input adversarial prompts, exposing novel vulnerabilities. To this end, we present DIJA, the first systematic study and jailbreak attack framework that exploits unique safety weaknesses of dLLMs. Specifically, DIJA constructs adversarial interleaved mask-text prompts that exploit the text generation mechanisms of dLLMs, i.e., bidirectional modeling and parallel decoding. Bidirectional modeling drives the model to produce contextually consistent outputs for masked spans, even when harmful, while parallel decoding limits the model's dynamic filtering and rejection sampling of unsafe content. This causes standard alignment mechanisms to fail, enabling harmful completions in alignment-tuned dLLMs, even when harmful behaviors or unsafe instructions are directly exposed in the prompt. Through comprehensive experiments, we demonstrate that DIJA significantly outperforms existing jailbreak methods, exposing a previously overlooked threat surface in dLLM architectures. Notably, our method achieves up to 100% keyword-based ASR on Dream-Instruct, surpassing the strongest prior baseline, ReNeLLM, by up to 78.5% in evaluator-based ASR on JailbreakBench and by 37.7 points in StrongREJECT score, while requiring no rewriting or hiding of harmful content in the jailbreak prompt. Our findings underscore the urgent need to rethink safety alignment in this emerging class of language models. Code is available at https://github.com/ZichenWen1/DIJA.

2507.10382 2026-02-11 cs.LG

Leveraging RAG-LLMs for Urban Mobility Simulation and Analysis

Yue Ding, Conor McCarthy, Kevin O'Shea, Mingming Liu

2507.10155 2026-02-11 cs.CL

What Should Feature Distillation Transfer in LLMs? A Task-Tangent Geometry View

Khouloud Saadi, Di Wang

2507.05526 2026-02-11 cs.LG stat.ME stat.ML

Estimating Interventional Distributions with Uncertain Causal Graphs through Meta-Learning

Anish Dhir, Cristiana Diaconu, Valentinian Mihai Lungu, James Requeima, Richard E. Turner, Mark van der Wilk

2507.04397 2026-02-11 cs.CV

Multi-Expert Learning Framework with the State Space Model for Optical and SAR Image Registration

Wei Wang, Dou Quan, Ning Huyan, Chonghua Lv, Shuang Wang, Yunan Li, Licheng Jiao

2506.22274 2026-02-11 cs.CV cs.CL

Common Objects Out of Context (COOCo): Investigating Multimodal Context and Semantic Scene Violations in Referential Communication

Filippo Merlo, Ece Takmaz, Wenkai Chen, Albert Gatt

Comments Accepted to TACL (pre-MIT Press publication version)

2506.18862 2026-02-11 cs.CV cs.AI

TAMMs: Change Understanding and Forecasting in Satellite Image Time Series with Temporal-Aware Multimodal Models

Zhongbin Guo, Yuhao Wang, Ping Jian, Chengzhi Li, Xinyue Chen, Zhen Yang, Ertai E

Comments Published as a conference paper at The Fourteenth International Conference on Learning Representations (ICLR 2026)

2506.15792 2026-02-11 cs.LG physics.chem-ph

Deep Learning Foundation Models from Classical Molecular Descriptors

Jackson W. Burns, Akshat Shirish Zalte, Charlles R. A. Abreu, Jochen Sieg, Christian Feldmann, Miriam Mathea, William H. Green
