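arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.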

2511.08016 2026-04-24 cs.RO cs.AI cs.MA

AVOID-JACK: Avoidance of Jackknifing for Swarms of Long Heavy Articulated Vehicles

Adrian Schönnagel, Michael Dubé, Christoph Steup, Felix Keppler, Sanaz Mostaghim

Comments 6+1 pages, 9 figures, accepted for publication in IEEE MRS 2025

DOI: 10.1109/MRS66243.2025.11357246

Abstract

This paper presents a novel approach to avoiding jackknifing and mutual collisions in Heavy Articulated Vehicles (HAVs) by leveraging decentralized swarm intelligence. In contrast to typical swarm robotics research, our robots are elongated and exhibit complex kinematics, introducing unique challenges. Despite its relevance to real-world applications such as logistics automation, remote mining, airport baggage transport, and agricultural operations, this problem has not been addressed in the existing literature. To tackle this new class of swarm robotics problems, we propose a purely reaction-based, decentralized swarm intelligence strategy tailored to automate elongated, articulated vehicles. The method presented in this paper prioritizes jackknifing avoidance and establishes a foundation for mutual collision avoidance. We validate our approach through extensive simulation experiments and provide a comprehensive analysis of its performance. For the experiments with a single HAV, we observe that for 99.8% jackknifing was successfully avoided and that 86.7% and 83.4% reach their first and second goals, respectively. With two HAVs interacting, we observe 98.9%, 79.4%, and 65.1%, respectively, while 99.7% of the HAVs do not experience mutual collisions.

2511.07752 2026-04-24 cs.CL

Back to the Future: The Role of Past and Future Context Predictability in Incremental Language Production

Shiva Upadhye, Richard Futrell

Comments 73 pages, 12 figures

2511.06209 2026-04-24 cs.AI cs.CL

ReProbe: Efficient Test-Time Scaling of Multi-Step Reasoning by Probing Internal States of Large Language Models

Jingwei Ni, Ekaterina Fadeeva, Tianyi Wu, Mubashara Akhtar, Jiaheng Zhang, Elliott Ash, Markus Leippold, Timothy Baldwin, See-Kiong Ng, Artem Shelmanov, Mrinmaya Sachan

Comments ACL 2026 Main

2511.04638 2026-04-24 cs.LG cs.AI

Addressing divergent representations from causal interventions on neural networks

Satchel Grant, Simon Jerome Han, Alexa R. Tartaglini, Christopher Potts

2511.03232 2026-04-24 cs.CV

Transformer-Progressive Mamba Network for Lightweight Image Super-Resolution

Sichen Guo, Wenjie Li, Yuanyang Liu, Guangwei Gao, Jian Yang, Chia-Wen Lin

Comments 14 pages, 12 figures, 9 tables

2510.23912 2026-04-24 cs.LG cs.AI

Key and Value Weights Are Probably All You Need: On the Necessity of the Query, Key, Value weight Triplet in Self-Attention Transformers

Marko Karbevski, Antonij Mijoski

Comments Detailed version of the long paper (poster) accepted at the ICLR 2026 workshop on Deep Generative Models: Theory, Principle, and Efficacy (DeLTa)

2510.20505 2026-04-24 cs.CL cs.AI

RELOOP: Recursive Retrieval with Multi-Hop Reasoner and Planners for Heterogeneous QA

Ruiyi Yang, Hao Xue, Imran Razzak, Hakim Hacid, Flora D. Salim

Comments 19 pages, 2 figures

2510.20064 2026-04-24 cs.LG

Not-a-Bandit: Provably No-Regret Drafter Selection in Speculative Decoding for LLMs

Hongyi Liu, Jiaji Huang, Zhen Jia, Youngsuk Park, Yu-Xiang Wang

Comments ICLR'26

2510.18914 2026-04-24 cs.CL cs.AI

Fairness Evaluation and Inference Level Mitigation in LLMs

Afrozah Nadeem, Mark Dras, Usman Naseem

Comments Accepted at the 64th Annual Meeting of the Association for Computational Linguistics, San Diego, California, United States, July 2 to 7, 2026

2510.18457 2026-04-24 cs.CV cs.LG

VFM-VAE: Vision Foundation Models Can Be Good Tokenizers for Latent Diffusion Models

Tianci Bi, Xiaoyi Zhang, Yan Lu, Nanning Zheng

Comments Accepted at CVPR 2026. Code and models available at: https://github.com/tianciB/VFM-VAE

2510.18091 2026-04-24 cs.CV cs.AI cs.LG

Accelerating Vision Transformers with Adaptive Patch Sizes

Rohan Choudhury, JungEun Kim, Jinhyung Park, Eunho Yang, László A. Jeni, Kris M. Kitani

Comments Accepted to ICLR 2026. Project page at https://rccchoudhury.github.io/apt/

2510.15313 2026-04-24 cs.CL

Capabilities and Evaluation Biases of Large Language Models in Classical Chinese Poetry Generation: A Case Study on Tang Poetry

Bolei Ma, Yina Yao, Anna-Carolina Haensch

Comments ACL 2026 Findings

2510.15096 2026-04-24 cs.AI cs.LG

OpenEstimate: Evaluating LLMs on Reasoning Under Uncertainty with Real-World Data

Alana Renda, Jillian Ross, Michael Cafarella, Jacob Andreas

2510.10971 2026-04-24 cs.CL cs.AI

RV-HATE: Reinforced Multi-Module Voting for Implicit Hate Speech Detection

Yejin Lee, Hyeseon Ahn, Yo-Sub Han

Comments 20 pages, 9 figures, ACL 2026

2510.04823 2026-04-24 cs.CV

Flow Matching for Conditional MRI-CT and CBCT-CT Image Synthesis

Arnela Hadzic, Simon Johannes Joham, Martin Urschler

Comments Published in the Proceedings of the Third Austrian Symposium on AI, Robotics, and Vision (AIRoV 2026)

2510.04573 2026-04-24 cs.LG cs.AI cs.CL

LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning

Haoqiang Kang, Yizhe Zhang, Nikki Lijing Kuang, Nicklas Majamaki, Navdeep Jaitly, Yi-An Ma, Lianhui Qin

2510.03247 2026-04-24 cs.LG cs.AI

Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data

Jiancheng Zhang, Yinglun Zhu

Comments Accepted by Transactions on Machine Learning Research (TMLR)

2509.24239 2026-04-24 cs.LG cs.AI

ChessArena: A Chess Testbed for Evaluating Strategic Reasoning Capabilities of Large Language Models

Jincheng Liu, Sijun He, Jingjing Wu, Xiangsen Wang, Yang Chen, Zhaoqi Kuang, Siqi Bao, Yuan Yao

2509.20712 2026-04-24 cs.LG cs.CL

CE-GPPO: Coordinating Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning

Zhenpeng Su, Leiyu Pan, Minxuan Lv, Yuntao Li, Wenping Hu, Fuzheng Zhang, Kun Gai, Guorui Zhou

Comments This paper has been accepted by ACL 2026

2509.18629 2026-04-24 cs.LG cs.AI

HyperAdapt: Simple High-Rank Adaptation

Abel Gurung, Joseph Campbell

Comments Published in Transactions on Machine Learning Research

2509.10897 2026-04-24 cs.CV physics.optics

TV Subgradient-Guided Multi-Source Fusion for Spectral Imaging in Dual-Camera CASSI Systems

Weiqiang Zhao, Tianzhu Liu, Yuzhe Gui, Wei Bian, Yanfeng Gu

Comments Main text: 14 pages, 12 figures; Supplementary material: 8 pages, 3 figures

2508.21720 2026-04-24 cs.AI

PosterForest: Hierarchical Multi-Agent Collaboration for Scientific Poster Generation

Jiho Choi, Seojeong Park, Seongjong Song, Hyunjung Shim

Comments ACL 2026

2508.19353 2026-04-24 cs.LG cs.CV

Efficient Multi-Source Knowledge Transfer by Model Merging

Marcin Osial, Bartosz Wójcik, Bartosz Zieliński, Sebastian Cygert

2508.10177 2026-04-24 cs.AI

KompeteAI: Accelerated Autonomous Multi-Agent System for End-to-End Pipeline Generation for Machine Learning Problems

Stepan Kulibaba, Artem Dzhalilov, Roman Pakhomov, Oleg Svidchenko, Alexander Gasnikov, Aleksei Shpilman

2508.06283 2026-04-24 cs.RO

Situationally-aware Path Planning Exploiting 3D Scene Graphs

Saad Ejaz, Marco Giberna, Muhammad Shaheer, Jose Andres Millan-Romera, Ali Tourani, Paul Kremer, Holger Voos, Jose Luis Sanchez-Lopez

2507.05385 2026-04-24 cs.CL

EduCoder: An Open-Source Annotation System for Education Transcript Data

Guanzhong Pan, Mei Tan, Hyunji Nam, Lucía Langlois, James Malamut, Liliana Deonizio, Dorottya Demszky

2507.04023 2026-04-24 cs.CL

Do LLMs Overthink Basic Math Reasoning? Benchmarking the Accuracy-Efficiency Tradeoff in Language Models

Gaurav Srivastava, Aafiya Hussain, Sriram Srinivasan, Xuan Wang

Comments Accepted to ACL 2026 Findings

Abstract

Large language models (LLMs) achieve impressive performance on complex mathematical benchmarks yet sometimes fail on basic math reasoning while generating unnecessarily verbose responses. In this paper, we present LLMThinkBench, a systematic benchmark and comprehensive empirical study to evaluate the efficiency of reasoning in LLMs, focusing on the fundamental tradeoff between accuracy and overthinking. First, we formalize the accuracy-verbosity tradeoff. Second, we introduce the Overthinking Score, a harmonic-mean metric combining accuracy and token-efficiency for holistic model evaluation. Third, we establish an evaluation protocol with dynamically-generated data across 14 basic math tasks. Fourth, we conduct a large-scale empirical study evaluating 53 LLMs, including reasoning and quantized variants across different reasoning budgets. Fifth, we release LLMThinkBench as an open-source Python package and public leaderboard for reproducibility. Our findings reveal: 1) model performance on complex benchmarks does not translate directly to basic math reasoning; 2) reasoning models generate ~18x more tokens while sometimes achieving lower accuracy and exhibit catastrophic collapse when tokens are constrained, dropping by up to ~36%; 3) the accuracy-verbosity relationship is non-monotonic with extended reasoning budgets yielding diminishing returns (GPT-5/o-series models show zero accuracy gain from low -> medium -> high reasoning effort). Our findings challenge the assumption that longer reasoning in LLMs necessarily improves mathematical reasoning. Our public leaderboard is available at https://ctrl-gaurav.github.io/LLMThinkBench/. Our open-source Python package is available at https://pypi.org/project/llmthinkbench/, and the codebase can be found at https://github.com/ctrl-gaurav/LLMThinkBench for easy and reproducible evaluation.

2507.03933 2026-04-24 cs.CL

Losing our Tail, Again: (Un)Natural Selection & Multilingual LLMs

Eva Vanmassenhove

Comments 12 pages

2506.12721 2026-04-24 cs.AI cs.CL cs.LG stat.ML

Strategic Scaling of Test-Time Compute: A Bandit Learning Approach

Bowen Zuo, Yinglun Zhu

Comments To appear at ICLR 2026

2505.22266 2026-04-24 cs.SD cs.MM eess.AS

FGAS: Fixed Decoder Network-Based Audio Steganography with Adversarial Perturbation Generation

Jialin Yan, Yu Cheng, Zhaoxia Yin, Xinpeng Zhang, Shilin Wang, Tanfeng Sun, Xinghao Jiang

Abstract

The rapid development of Artificial Intelligence Generated Content (AIGC) has made high-fidelity generated audio widely available across the Internet, driving the advancement of audio steganography. Benefiting from advances in deep learning, current audio steganography schemes are mainly based on encoder-decoder network architectures. While these methods guarantee a certain level of perceptual quality for stego audio, they typically face high computational cost and long implementation time, as well as poor anti-steganalysis performance. To address the aforementioned issues, we pioneer a Fixed Decoder Network-Based Audio Steganography with Adversarial Perturbation Generation (FGAS). Adversarial perturbations carrying a secret message are embedded into the cover audio to generate stego audio. The receiver only needs to share the structure and key of the fixed decoder network to accurately extract the secret message from the stego audio. In FGAS, we propose an Audio Adversarial Perturbation Generation (A2PG) strategy with an optional robust extension and design a lightweight fixed decoder. The fixed decoder guarantees reliable extraction of the hidden message, while adversarial perturbations are optimized to keep the stego audio perceptually and statistically close to the cover audio, thereby improving anti-steganalysis performance. The experimental results show that FGAS significantly improves stego audio quality, achieving an average PSNR gain of over 10 dB compared to SOTA methods. Furthermore, FGAS demonstrates strong robustness against common audio processing attacks. Moreover, FGAS exhibits superior anti-steganalysis performance across different relative payloads; under high-capacity embedding, it achieves a classification error rate about 2% higher, indicating stronger anti-steganalysis performance than current SOTA methods.
