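arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.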

2604.21501 2026-04-28 cs.AI

GeoMind: An Agentic Workflow for Lithology Classification with Reasoned Tool Invocation

Yitong Zhou, Mingyue Cheng, Jiahao Wang, Qingyang Mao, Qi Liu

Abstract

Lithology classification in well logs is a fundamental geoscience data mining task that aims to infer rock types from multidimensional geophysical sequences. Despite recent progress, existing approaches typically formulate the problem as a static, single-step discriminative mapping. This static paradigm limits evidence-based diagnostic reasoning against geological standards, often yielding predictions that are detached from geological reality due to a lack of domain priors. In this work, we propose GeoMind, a tool-augmented agentic framework that models lithology classification as a sequential reasoning process. GeoMind organizes its toolkit into perception, reasoning, and analysis modules, which respectively translate raw logs into semantic trends, infer lithology hypotheses from multi-source evidence, and verify predictions against stratigraphic constraints. A global planner adaptively coordinates these modules based on input characteristics, enabling geologically plausible and evidence-grounded decisions. To guarantee the logical consistency of GeoMind, we introduce a fine-grained process supervision strategy. Unlike standard methods that focus solely on final outcomes, our approach optimizes intermediate reasoning steps, ensuring the validity of decision trajectories and alignment with geological constraints. Experiments on four benchmark well-log datasets demonstrate that GeoMind consistently outperforms strong baselines in classification performance while providing transparent and traceable decision-making processes.
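
The perception, reasoning, and analysis stages described in this abstract can be pictured with a toy pipeline. This is only an illustrative sketch, not GeoMind's implementation: the function names, the single gamma-ray input, the `GAMMA_CUTOFF` value, and the two-class rule are all hypothetical, and the stages run in a fixed order rather than under an adaptive planner.

```python
from statistics import mean

# Hypothetical cutoff (gamma-ray API units) used only for this toy example.
GAMMA_CUTOFF = 75.0


def perceive(gamma_ray):
    """Perception: summarize raw readings as a coarse semantic trend."""
    return {"gamma_level": "high" if mean(gamma_ray) > GAMMA_CUTOFF else "low"}


def reason(trend):
    """Reasoning: map the trend to a lithology hypothesis (toy rule)."""
    return "shale" if trend["gamma_level"] == "high" else "sandstone"


def verify(hypothesis, allowed_lithologies):
    """Analysis: accept the hypothesis only if the constraint set allows it."""
    return hypothesis in allowed_lithologies


def classify(gamma_ray, allowed_lithologies):
    """Run the three stages in a fixed order and keep a step-by-step trace."""
    trace = []
    trend = perceive(gamma_ray)
    trace.append(("perception", trend))
    hypothesis = reason(trend)
    trace.append(("reasoning", hypothesis))
    accepted = verify(hypothesis, allowed_lithologies)
    trace.append(("analysis", accepted))
    return (hypothesis if accepted else "unresolved"), trace


label, trace = classify([90.0, 110.0, 95.0], {"shale", "limestone"})
print(label)
for stage, output in trace:
    print(stage, output)
```

The recorded trace is what makes each decision inspectable stage by stage, which is the kind of traceability the abstract refers to.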

2604.21395 2026-04-28 cs.LG cs.AI cs.CV

Supervised Learning Has a Necessary Geometric Blind Spot: Theory, Consequences, and Minimal Repair

Vishal Rajput

Comments 30 pages, 5 figures. Code: https://github.com/vishalstark512/PMH "Revised version with corrected manuscript text."

2604.21164 2026-04-28 cs.SD

MAGIC-TTS: Fine-Grained Controllable Speech Synthesis with Explicit Local Duration and Pause Control

Jialong Mai, Xiaofen Xing, Xiangmin Xu

Comments Release MAGIC-TTS code, pretrained models, and demo: https://github.com/yongaifadian1/MAGIC-TTS, https://huggingface.co/maimai11/MAGIC-TTS, https://yongaifadian1.github.io/MAGIC-TTS/

2604.18491 2026-04-28 cs.LG cs.AI

Faster by Design: Interactive Aerodynamics via Neural Surrogates Trained on Expert-Validated CFD

Nicholas Thumiger, Andrea Bartezzaghi, Mattia Rigotti, Cezary Skura, Thomas Frick, Elisa Serioli, Fabrizio Arbucci, A. Cristiano I. Malossi

Comments 7 pages, 4 figures

2604.15184 2026-04-28 cs.AI

Agent-Aided Design for Dynamic CAD Models

Mitch Adler, Matthew Russo, Michael Cafarella

Comments 5 pages, 3 figures, published in CAIS'26

2604.13015 2026-04-28 cs.RO

Learning Versatile Humanoid Manipulation with Touch Dreaming

Yaru Niu, Zhenlong Fang, Binghong Chen, Shuai Zhou, Revanth Krishna Senthilkumaran, Hao Zhang, Bingqing Chen, Chen Qiu, H. Eric Tseng, Jonathan Francis, Ding Zhao

2604.13006 2026-04-28 cs.CL cs.AI

One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness

Erfan Baghaei Potraghloo, Seyedarmin Azizi, Souvik Kundu, Massoud Pedram

Abstract

Instruction-tuned large language models produce helpful, structured responses, but how robust is this helpfulness under trivial constraints? We show that simple lexical constraints (banning a single punctuation character or common word) cause instruction-tuned LLMs to collapse their responses, losing 14 to 48% of comprehensiveness across seven models spanning five families (7B to 70B, open- and closed-weight). A blinded human evaluation with 10 STEM-trained evaluators confirms genuine content loss, with information criteria degrading 1.5 to 2.3 times more than surface criteria, a finding corroborated by over 4,100 automated pairwise comparisons (77 to 100% baseline preference) across three LLM judges from two model families. Diagnostic analysis identifies this as a planning failure: two-pass generation recovers 59 to 96% of response length, and linear probes on prompt representations predict response length with $R^2$ = 0.51 to 0.94 before generation begins. The same probes yield negative $R^2$ on base models, confirming that instruction tuning introduces the representational structure underlying the collapse. Base models show no systematic degradation under identical constraints, demonstrating that instruction tuning couples task competence to narrow surface-form templates. The effect extends to realistic deployment constraints (preamble suppression, corporate tone guidelines, legal compliance hedging, accessibility requirements) causing comparable degradation (-22% to -34%), with suppressing the conversational opener alone ("Certainly!") causing 40% collapse on our most fragile model despite restricting only the opening tokens. We further show that standard independent LLM-as-judge evaluation detects only a 3.5% quality drop where pairwise evaluation reveals 23%, exposing a methodological blind spot in current evaluation practice.
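
For readers unfamiliar with how a probe's coefficient of determination can be negative, here is a minimal sketch of a least-squares linear probe and its held-out score. It assumes synthetic features in place of real prompt representations; the helper names and the data are hypothetical and are not taken from the paper.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_linear_probe(features, targets):
    """Least-squares linear probe with a bias column appended."""
    design = np.hstack([features, np.ones((len(features), 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def probe_predict(weights, features):
    """Apply the fitted probe to new feature rows."""
    design = np.hstack([features, np.ones((len(features), 1))])
    return design @ weights


# Synthetic stand-in for prompt representations and response lengths.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))
true_weights = rng.normal(size=8)
lengths = features @ true_weights + 0.01 * rng.normal(size=200)

# Fit on the first half, score on the held-out second half.
weights = fit_linear_probe(features[:100], lengths[:100])
held_out_r2 = r_squared(lengths[100:], probe_predict(weights, features[100:]))
print(f"held-out R^2: {held_out_r2:.3f}")

# Predictions worse than the mean of the targets give a negative score.
print(r_squared([1, 2, 3], [3, 2, 1]))
```

A score of 0 corresponds to always predicting the mean of the targets, so a negative value means the probe does worse than that constant baseline on the data it is scored on.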

2604.12290 2026-04-28 cs.AI cs.CL

Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization

Yizhe Chi, Deyao Hong, Dapeng Jiang, Tianwei Luo, Kaisen Yang, Boshi Zhang, Zhe Cao, Xiaoyan Fan, Bingxiang He, Han Hao, Weiyang Jin, Dianqiao Lei, Qingle Liu, Houde Qian, Bowen Wang, Situ Wang, Youjie Zheng, Yifan Zhou, Calvin Xiao, Eren Cai, Qinhuai Na

2604.11991 2026-04-28 cs.RO

Complementarity by Construction: A Lie-Group Approach to Solving Quadratic Programs with Linear Complementarity Constraints

Arun L. Bishop, Micah I. Reich, Zachary Manchester

2604.11564 2026-04-28 cs.CV

Training-Free Model Ensemble for Single-Image Super-Resolution via Strong-Branch Compensation

Gengjia Chang, Xining Ge, Weijun Yuan, Zhan Li, Qiurong Song, Luen Zhu, Shuhong Liu

2604.11468 2026-04-28 cs.CV

Beyond Model Design: Data-Centric Training and Self-Ensemble for Gaussian Color Image Denoising

Gengjia Chang, Xining Ge, Weijun Yuan, Zhan Li, Qiurong Song, Luen Zhu, Shuhong Liu

2604.10334 2026-04-28 cs.CV

SIMPLER: H&E-Informed Representation Learning for Structured Illumination Microscopy

Abu Zahid Bin Aziz, Syed Fahim Ahmed, Gnanesh Rasineni, Mei Wang, Olcaytu Hatipoglu, Marisa Ricci, Malaiyah Shaw, Guang Li, J. Quincy Brown, Valerio Pascucci, Shireen Elhabian

2604.10112 2026-04-28 cs.CV

Dual-Branch Remote Sensing Infrared Image Super-Resolution

Xining Ge, Gengjia Chang, Weijun Yuan, Zhan Li, Zhanglu Chen, Boyang Yao, Yihang Chen, Yifan Deng, Shuhong Liu

2604.08015 2026-04-28 cs.CV cs.LG

Component-Adaptive and Lesion-Level Supervision for Improved Small Structure Segmentation in Brain MRI

Minh Sao Khue Luu, Evgeniy N. Pavlovskiy, Bair N. Tuchinov

Comments This version includes additional false-negative and false-positive error analysis in the Results

2604.05500 2026-04-28 cs.CV

CLIP-Guided Data Augmentation for Night-Time Image Dehazing

Xining Ge, Weijun Yuan, Gengjia Chang, Xuyang Li, Shuhong Liu

2604.02374 2026-04-28 cs.SD

Evaluating Generalization and Robustness in Russian Anti-Spoofing: The RuASD Initiative

Ksenia Lysikova, Kirill Borodin, Grach Mkrtchian

Comments Submitted to IEEE Access. Under review

2603.29928 2026-04-28 cs.AI

ScoringBench: A Benchmark for Evaluating Tabular Foundation Models with Proper Scoring Rules

Jonas Landsgesell, Pascal Knoll, Tizian Wenzel

2603.28463 2026-04-28 cs.CV

Decoupling Wavelet Sub-bands for Single Source Domain Generalization in Fundus Image Segmentation

Shramana Dey, Varun Ajith, Abhirup Banerjee, Sushmita Mitra

2603.21421 2026-04-28 cs.RO cs.AI

HyReach: Vision-Guided Hybrid Manipulator Reaching in Unseen Cluttered Environments

Shivani Kamtikar, Kendall Koe, Justin Wasserman, Samhita Marri, Benjamin Walt, Naveen Kumar Uppalapati, Girish Krishnan, Girish Chowdhary

Comments 8 pages, 5 figures, 5 tables

2603.20806 2026-04-28 cs.CV

Less is More in Semantic Space: Intrinsic Decoupling via Clifford-M for Fundus Image Classification

Yifeng Zheng

Comments Withdrawn by the author because this early version does not reflect the current scope, validation protocol, and contributor information of the work. A substantially revised version is being prepared

2603.17573 2026-04-28 cs.RO cs.DB cs.LG

HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness

Zihao Zheng, Zhihao Mao, Sicheng Tian, Maoliang Li, Jiayu Chen, Xinhao Sun, Zhaobo Zhang, Xuanzhe Liu, Donggang Cao, Hong Mei, Xiang Chen

2603.15941 2026-04-28 cs.CV

Towards Fair and Robust Volumetric CT Classification via KL-Regularised Group Distributionally Robust Optimisation

Samuel Johnny, Blessed Guda, Goodness Obasi, Aaron Emmanuel, Moise Busogi

Comments CVPR 2026 Medical Imaging & Healthcare Workshop

2603.15130 2026-04-28 cs.CL

Indirect Question Answering in English, German and Bavarian: A Challenging Task for High- and Low-Resource Languages Alike

Miriam Winkler, Verena Blaschke, Barbara Plank

Comments LREC 2026 (this version fixes an error with the baseline scores)

2603.10926 2026-04-28 cs.LG cs.AI

ECoLAD: Deployment-Oriented Evaluation for Automotive Time-Series Anomaly Detection

Kadir-Kaan Özer, René Ebeling, Markus Enzweiler

Comments 6 pages, 3 figures, 5 tables

2603.03269 2026-04-28 cs.CV cs.LG

LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

Junyi Zhang, Charles Herrmann, Junhwa Hur, Chen Sun, Ming-Hsuan Yang, Forrester Cole, Trevor Darrell, Deqing Sun

Comments Project page: https://LoGeR-project.github.io/

2603.01581 2026-04-28 cs.RO cs.LG

KERV: Kinematic-Rectified Speculative Decoding for Embodied VLA Models

Zihao Zheng, Zhihao Mao, Maoliang Li, Jiayu Chen, Xinhao Sun, Zhaobo Zhang, Donggang Cao, Hong Mei, Xiang Chen

Comments This paper has been accepted by DAC 2026

2603.01147 2026-04-28 cs.CV

ConVibNet: Needle Detection during Continuous Insertion via Frequency-Inspired Features

Jiamei Guo, Zhehao Duan, Maria Neiiendam, Dianye Huang, Nassir Navab, Zhongliang Jiang

Comments Accepted by IPCAI

DOI: 10.1007/s11548-026-03649-5

Abstract

Purpose: Ultrasound-guided needle interventions are widely used in clinical practice, but their success critically depends on accurate needle placement, which is frequently hindered by the poor and intermittent visibility of needles in ultrasound images. Existing approaches remain limited by artifacts, occlusions, and low contrast, and often fail to support real-time continuous insertion. To overcome these challenges, this study introduces a robust real-time framework for continuous needle detection. Methods: We present ConVibNet, an extension of VibNet for detecting needles with significantly reduced visibility, addressing real-time, continuous needle tracking during insertion. ConVibNet leverages temporal dependencies across successive ultrasound frames to enable continuous estimation of both needle tip position and shaft angle in dynamic scenarios. To strengthen temporal awareness of needle-tip motion, we introduce a novel intersection-and-difference loss that explicitly leverages motion correlations across consecutive frames. In addition, we curated a dedicated dataset for model development and evaluation. Results: The performance of the proposed ConVibNet model was evaluated on our dataset, demonstrating superior accuracy compared to the baseline VibNet and UNet-LSTM models. Specifically, ConVibNet achieved a tip error of 2.80 ± 2.42 mm and an angle error of 1.69 ± 2.00 deg. These results represent a 0.75 mm improvement in tip localization accuracy over the best-performing baseline, while preserving real-time inference capability. Conclusion: ConVibNet advances real-time needle detection in ultrasound-guided interventions by integrating temporal correlation modeling with a novel intersection-and-difference loss, thereby improving accuracy and robustness and demonstrating high potential for integration into autonomous insertion systems.
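
Metrics of the "mean ± standard deviation" form reported above can be computed along the following lines. This is a sketch under assumptions: the abstract does not define the metrics, so tip error is taken here as the Euclidean distance between predicted and true tip positions, angle error as the difference between undirected shaft orientations, and the spread as the population standard deviation. All function names and sample values are hypothetical.

```python
import numpy as np


def tip_errors_mm(pred_tips, true_tips):
    """Euclidean distance between predicted and true tip positions (mm)."""
    pred = np.asarray(pred_tips, dtype=float)
    true = np.asarray(true_tips, dtype=float)
    return np.linalg.norm(pred - true, axis=-1)


def angle_errors_deg(pred_angles, true_angles):
    """Difference between undirected line orientations, in [0, 90] degrees."""
    pred = np.asarray(pred_angles, dtype=float)
    true = np.asarray(true_angles, dtype=float)
    diff = np.abs(pred - true) % 180.0
    return np.minimum(diff, 180.0 - diff)


def summarize(errors):
    """Return (mean, population standard deviation) of the errors."""
    return float(np.mean(errors)), float(np.std(errors))


# Made-up frames: two predicted tips and shaft angles against ground truth.
tip = tip_errors_mm([[3.0, 4.0], [1.0, 1.0]], [[0.0, 0.0], [1.0, 1.0]])
ang = angle_errors_deg([170.0, 30.0], [10.0, 25.0])

tip_mean, tip_std = summarize(tip)
ang_mean, ang_std = summarize(ang)
print(f"tip error: {tip_mean:.2f} ± {tip_std:.2f} mm")
print(f"angle error: {ang_mean:.2f} ± {ang_std:.2f} deg")
```

Treating the shaft as an undirected line means orientations of 170 and 10 degrees differ by 20 degrees rather than 160; whether the paper uses this convention is not stated in the abstract.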

2602.15603 2026-04-28 cs.LG cs.SC math.OC

Symbolic recovery of PDEs from measurement data

Erion Morina, Philipp Scholl, Martin Holler

Abstract

Models based on partial differential equations (PDEs) are powerful for describing a wide range of complex phenomena in the natural sciences. Accurately identifying the PDE model, which represents the underlying physical law, is essential for a proper understanding of the problem. This reconstruction typically relies on indirect and noisy measurements of the system's state and, without specifically tailored methods, rarely yields symbolic expressions, thereby limiting interpretability. In this work, we address this limitation by considering neural network architectures based on rational functions for the symbolic representation of physical laws. These networks combine the approximation power of rational functions with the flexibility to represent arithmetic operations, and generalize ParFam and EQL-type architectures used in symbolic regression for physical law learning. We further establish regularity results for these symbolic networks. Our main contribution is a reconstruction result showing that, if there exists an admissible physical law that is expressible within the symbolic network architecture, then in the limit of noiseless and complete measurements, symbolic networks recover a physical law within the PDE model that is representable by the architecture. Moreover, the recovered law corresponds to a regularization-minimizing parameterization, promoting interpretability and sparsity in case of $L^1$-regularization. Under an additional identifiability condition, the unique true physical law is recovered. These reconstruction and regularity results are derived at the continuous level prior to discretization due to a formulation in function space. Empirical results using the ParFam architecture are consistent with the theoretical findings and suggest the feasibility of reconstructing interpretable physical laws in practice.
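
To make the role of rational functions and $L^1$-regularization concrete, here is a small sketch of a single rational unit P(x)/Q(x) and an $L^1$ penalty on its coefficients. It is not the ParFam or EQL architecture; the function names and coefficient values are hypothetical and chosen only to show that two parameterizations of the same function can carry different penalties.

```python
import numpy as np


def rational_unit(x, num_coeffs, den_coeffs):
    """Evaluate P(x) / Q(x); coefficients are listed highest degree first."""
    return np.polyval(num_coeffs, x) / np.polyval(den_coeffs, x)


def l1_penalty(*coeff_sets):
    """Sum of absolute coefficient values across all coefficient sets."""
    return sum(float(np.sum(np.abs(c))) for c in coeff_sets)


x = np.array([1.0, 2.0, 4.0])

# Compact form: x / (x + 1).
compact_num, compact_den = [1.0, 0.0], [1.0, 1.0]

# Redundant form: (x^2 + x) / (x^2 + 2x + 1), equal to x / (x + 1) for x != -1.
redundant_num, redundant_den = [1.0, 1.0, 0.0], [1.0, 2.0, 1.0]

compact_out = rational_unit(x, compact_num, compact_den)
redundant_out = rational_unit(x, redundant_num, redundant_den)

print(np.allclose(compact_out, redundant_out))
print(l1_penalty(compact_num, compact_den))
print(l1_penalty(redundant_num, redundant_den))
```

Both forms agree on the sample points, but the compact one has the smaller penalty, which illustrates why an $L^1$ term favors sparser and more readable expressions among equivalent ones.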

2602.08377 2026-04-28 cs.LG cs.AI cs.CL

Reinforcement Learning with Backtracking Feedback

Bilgehan Sel, Vaishakh Keshava, Phillip Wallis, Lukas Rutishauser, Ming Jin, Dingcheng Li

Comments NeurIPS 2025

2602.07605 2026-04-28 cs.CV cs.AI

Fine-R1: Make Multi-modal LLMs Excel in Fine-Grained Visual Recognition by Chain-of-Thought Reasoning

Hulingxiao He, Zijun Geng, Yuxin Peng

Comments Published as a conference paper at ICLR 2026. The models are available at https://huggingface.co/collections/StevenHH2000/fine-r1