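arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.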

2603.27628 2026-03-31 cs.AI

DSevolve: Enabling Real-Time Adaptive Scheduling on Dynamic Shop Floor with LLM-Evolved Heuristic Portfolios

Jin Huang, Jie Yang, XinLei Zhou, Qihao Liu, Liang Gao, Xinyu Li

Abstract

In dynamic manufacturing environments, disruptions such as machine breakdowns and new order arrivals continuously shift the optimal dispatching strategy, making adaptive rule selection essential. Existing LLM-powered Automatic Heuristic Design (AHD) frameworks evolve toward a single elite rule that cannot meet this adaptability demand. To address this, we present DSevolve, an industrial scheduling framework that evolves a quality-diverse portfolio of dispatching rules offline and adaptively deploys them online with second-level response time. Multi-persona seeding and topology-aware evolutionary operators produce a behaviorally diverse rule archive indexed by a MAP-Elites feature space. Upon each disruption event, a probe-based fingerprinting mechanism characterizes the current shop floor state, retrieves high-quality candidate rules from an offline knowledge base, and selects the best one via rapid look-ahead simulation. Evaluated on 500 dynamic flexible job shop instances derived from real industrial data, DSevolve outperforms state-of-the-art AHD frameworks, classical dispatching rules, genetic programming, and deep reinforcement learning, offering a practical and deployable solution for intelligent shop floor scheduling.
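
A minimal sketch of the archive-and-retrieve idea described in the abstract above, assuming a two-dimensional behavior descriptor normalized to [0, 1] and a scalar fitness where higher is better. The names `Rule`, `MapElitesArchive`, and `nearest` are hypothetical and not taken from the paper; DSevolve's actual feature space, probe-based fingerprinting, and look-ahead simulation are not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Rule:
    name: str
    features: Tuple[float, float]  # assumed behavior descriptor in [0, 1]^2
    fitness: float                 # assumed: higher is better


class MapElitesArchive:
    """Keeps at most one elite rule per discretized feature cell."""

    def __init__(self, bins: int = 4) -> None:
        self.bins = bins
        self.cells: Dict[Tuple[int, ...], Rule] = {}

    def _cell(self, features: Tuple[float, float]) -> Tuple[int, ...]:
        # Map each feature to a bin index, clamping 1.0 into the last bin.
        return tuple(min(int(f * self.bins), self.bins - 1) for f in features)

    def add(self, rule: Rule) -> bool:
        """Insert the rule if its cell is empty or it beats the incumbent."""
        key = self._cell(rule.features)
        incumbent = self.cells.get(key)
        if incumbent is None or rule.fitness > incumbent.fitness:
            self.cells[key] = rule
            return True
        return False

    def nearest(self, features: Tuple[float, float], k: int = 2) -> List[Rule]:
        """Return up to k elites from the cells closest to the queried features."""
        target = self._cell(features)
        ranked = sorted(
            self.cells.items(),
            key=lambda kv: (
                sum(abs(a - b) for a, b in zip(kv[0], target)),  # cell distance
                -kv[1].fitness,                                  # then best first
            ),
        )
        return [rule for _, rule in ranked[:k]]


archive = MapElitesArchive(bins=4)
archive.add(Rule("rule_a", (0.10, 0.10), 1.0))
archive.add(Rule("rule_b", (0.15, 0.20), 2.0))  # same cell, higher fitness: replaces rule_a
archive.add(Rule("rule_c", (0.90, 0.90), 0.5))
candidates = archive.nearest((0.0, 0.0), k=1)
```

In this toy setting, an online selector would query `nearest` with a descriptor of the current shop floor state and then evaluate only the returned candidates, which is one way a small retrieval step could keep selection fast.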

2603.27626 2026-03-31 cs.CL cs.AI

Umwelt Engineering: Designing the Cognitive Worlds of Linguistic Agents

Rodney Jehu-Appiah

Comments 24 pages, 2 figures, 7 tables

2603.27625 2026-03-31 cs.CV

Clore: Interactive Pathology Image Segmentation with Click-based Local Refinement

Tiantong Wang, Minfan Zhao, Jun Shi, Hannan Wang, Yue Dai

2603.27611 2026-03-31 cs.AI q-bio.NC

What does a system modify when it modifies itself?

Florentin Koch

Comments Working Paper

2603.27599 2026-03-31 cs.CV

You Only Erase Once: Erasing Anything without Bringing Unexpected Content

Yixing Zhu, Qing Zhang, Wenju Xu, Wei-Shi Zheng

Comments Accepted by CVPR2026

2603.27597 2026-03-31 cs.AI q-bio.NC

From indicators to biology: the calibration problem in artificial consciousness

Florentin Koch

Comments Working Paper (Spotlight Commentary)

2603.27593 2026-03-31 cs.CV cs.AI

STRIDE: When to Speak Meets Sequence Denoising for Streaming Video Understanding

Junho Kim, Hosu Lee, James M. Rehg, Minsu Kim, Yong Man Ro

Comments Project page: https://interlive-team.github.io/STRIDE

2603.27589 2026-03-31 cs.LG

An Energy-Efficient Spiking Neural Network Architecture for Predictive Insulin Delivery

Sahil Shrivastava

Comments 10 pages, 6 figures, 12 tables. IEEE conference format. Independent Research

Abstract

Diabetes mellitus affects over 537 million adults worldwide. Insulin-dependent patients require continuous glucose monitoring and precise dose calculation while operating under strict power budgets on wearable devices. This paper presents PDDS - an in-silico, software-complete research prototype of an event-driven computational pipeline for predictive insulin dose calculation. Motivated by neuromorphic computing principles for ultra-low-power wearable edge devices, the core contribution is a three-layer Leaky Integrate-and-Fire (LIF) Spiking Neural Network trained on 128,025 windows from OhioT1DM (66.5% real patients) and the FDA-accepted UVa/Padova physiological simulator (33.5%), achieving 85.90% validation accuracy. We present three rigorously honest evaluations: (1) a standard test-set comparison against ADA threshold rules, bidirectional LSTM (99.06% accuracy), and MLP (99.00%), where the SNN achieves 85.24% - we demonstrate this gap reflects the stochastic encoding trade-off, not architectural failure; (2) a temporal benchmark on 426 non-obvious clinician-annotated hypoglycemia windows where neither the SNN (9.2% recall) nor the ADA rule (16.7% recall) performs adequately, identifying the system's key limitation and the primary direction for future work; (3) a power-efficiency analysis showing the SNN requires 79,267x less energy per inference than the LSTM (1,551 femtojoules vs. 122.9 nanojoules), justifying the SNN architecture for continuous wearable deployment. The system is not yet connected to physical hardware; it constitutes the computational middle layer of a five-phase roadmap toward clinical validation. Keywords: spiking neural network, glucose severity classification, edge computing, hypoglycemia detection, event-driven architecture, LIF neuron, Poisson encoding, OhioT1DM, in-silico, neuromorphic, power efficiency.
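
To make the keywords "Poisson encoding" and "LIF neuron" concrete, here is a minimal single-neuron sketch. It assumes inputs normalized to [0, 1], a per-step Bernoulli approximation of Poisson rate coding, and illustrative values for the weight, leak factor, and threshold. The function names and parameters are hypothetical; this is not the paper's three-layer network or its training procedure.

```python
import random
from typing import List


def poisson_encode(value: float, steps: int, rng: random.Random) -> List[int]:
    """Rate-code a value in [0, 1] as a 0/1 spike train.

    Each step spikes independently with probability `value`
    (a Bernoulli approximation of a Poisson process).
    """
    return [1 if rng.random() < value else 0 for _ in range(steps)]


def lif_run(spikes: List[int], weight: float = 0.6,
            beta: float = 0.9, threshold: float = 1.0) -> List[int]:
    """Simulate one leaky integrate-and-fire neuron on an input spike train.

    The membrane potential decays by `beta` each step, integrates the
    weighted input, and resets to zero after emitting an output spike.
    """
    v = 0.0
    out = []
    for s in spikes:
        v = beta * v + weight * s  # leak, then integrate
        if v >= threshold:         # fire and reset
            out.append(1)
            v = 0.0
        else:
            out.append(0)
    return out


rng = random.Random(0)                       # fixed seed for repeatability
train = poisson_encode(0.7, steps=50, rng=rng)
output = lif_run(train)
```

Because the encoding is random, two runs with different seeds can produce different spike trains for the same input, which is the kind of stochastic encoding trade-off the abstract refers to.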

2603.27583 2026-03-31 cs.RO cs.SY eess.SY

LLM-Enabled Low-Altitude UAV Natural Language Navigation via Signal Temporal Logic Specification Translation and Repair

Yuqi Ping, Huahao Ding, Tianhao Liang, Longyu Zhou, Guangyu Lei, Xinglin Chen, Junwei Wu, Jieyu Zhou, Tingting Zhang

2603.27579 2026-03-31 cs.CV math.OC

A Robust Low-Rank Prior Model for Structured Cartoon-Texture Image Decomposition with Heavy-Tailed Noise

Weihao Tang, Hongjin He

Comments This paper introduces a robust model for cartoon-texture image decomposition with heavy-tailed noise. It has 11 figures and 4 tables

2603.27577 2026-03-31 cs.CV cs.RO

Structured Observation Language for Efficient and Generalizable Vision-Language Navigation

Daojie Peng, Fulong Ma, Jun Ma

2603.27558 2026-03-31 cs.CV

Learning to See through Illumination Extremes with Event Streaming in Multimodal Large Language Models

Baoheng Zhang, Jiahui Liu, Gui Zhao, Weizhou Zhang, Yixuan Ma, Jun Jiang, Yingxian Chen, Wilton W. T. Fok, Xiaojuan Qi, Hayden Kwok-Hay So

Comments IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026

2603.27556 2026-03-31 cs.CV

Towards Domain-Generalized Open-Vocabulary Object Detection: A Progressive Domain-invariant Cross-modal Alignment Method

Xiaoran Xu, Xiaoshan Yang, Jiangang Yang, Yifan Xu, Jian Liu, Changsheng Xu

2603.27555 2026-03-31 cs.CV

PANDORA: Pixel-wise Attention Dissolution and Latent Guidance for Zero-Shot Object Removal

Dinh-Khoi Vo, Van-Loc Nguyen, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

Comments ICME 2026

2603.27553 2026-03-31 cs.CV

Annotation-Free Detection of Drivable Areas and Curbs Leveraging LiDAR Point Cloud Maps

Fulong Ma, Daojie Peng, Jun Ma

2603.27542 2026-03-31 cs.CV

MV-RoMa: From Pairwise Matching into Multi-View Track Reconstruction

Jongmin Lee, Seungyeop Kang, Sungjoo Yoo

Comments CVPR 2026 Accepted

2603.27538 2026-03-31 cs.CV cs.CL

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Meituan LongCat Team, Bin Xiao, Chao Wang, Chengjiang Li, Chi Zhang, Chong Peng, Hang Yu, Hao Yang, Haonan Yan, Haoze Sun, Haozhe Zhao, Hong Liu, Hui Su, Jiaqi Zhang, Jiawei Wang, Jing Li, Kefeng Zhang, Manyuan Zhang, Minhao Jing, Peng Pei, Quan Chen, Taofeng Xue, Tongxin Pan, Xiaotong Li, Xiaoyang Li, Xiaoyu Zhao, Xing Hu, Xinyang Lin, Xunliang Cai, Yan Bai, Yan Feng, Yanjie Li, Yao Qiu, Yerui Sun, Yifan Lu, Ying Luo, Yipeng Mei, Yitian Chen, Yuchen Xie, Yufang Liu, Yufei Chen, Yulei Qian, Yuqi Peng, Zhihang Yu, Zhixiong Han, Changran Wang, Chen Chen, Dian Zheng, Fengjiao Chen, Ge Yang, Haowei Guo, Haozhe Wang, Hongyu Li, Huicheng Jiang, Jiale Hong, Jialv Zou, Jiamu Li, Jianping Lin, Jiaxing Liu, Jie Yang, Jing Jin, Jun Kuang, Juncheng She, Kunming Luo, Kuofeng Gao, Lin Qiu, Linsen Guo, Mianqiu Huang, Qi Li, Qian Wang, Rumei Li, Siyu Ren, Wei Wang, Wenlong He, Xi Chen, Xiao Liu, Xiaoyu Li, Xu Huang, Xuanyu Zhu, Xuezhi Cao, Yaoming Zhu, Yifei Cao, Yimeng Jia, Yizhen Jiang, Yufei Gao, Zeyang Hu, Zhenlong Yuan, Zijian Zhang, Ziwen Wang

Comments LongCat-Next Technical Report

2603.27537 2026-03-31 cs.RO

Learning Smooth and Robust Space Robotic Manipulation of Dynamic Target via Inter-frame Correlation

Siyi Lang, Hongyi Gao, Yingxin Zhang, Zihao Liu, Hanlin Dong, Zhaoke Ning, Zhiqiang Ma, Panfeng Huang

2603.27536 2026-03-31 cs.AI

Dual-Stage LLM Framework for Scenario-Centric Semantic Interpretation in Driving Assistance

Jean Douglas Carvalho, Hugo Taciro Kenji, Ahmad Mohammad Saber, Glaucia Melo, Max Mauro Dias Santos, Deepa Kundur

2603.27534 2026-03-31 cs.RO

S3KF: Spherical State-Space Kalman Filtering for Panoramic 3D Multi-Object Tracking

Zhongyuan Liu, Shaonan Yu, Jianping Li, Pengfei Wan, Xinhang Xu, Pengfei Wang, Maggie Y. Gao, Lihua Xie

Abstract

Panoramic multi-object tracking is important for industrial safety monitoring, wide-area robotic perception, and infrastructure-light deployment in large workspaces. In these settings, the sensing system must provide full-surround coverage, metric geometric cues, and stable target association under wide field-of-view distortion and occlusion. Existing image-plane trackers are tightly coupled to the camera projection and become unreliable in panoramic imagery, while conventional Euclidean 3D formulations introduce redundant directional parameters and do not naturally unify angular, scale, and depth estimation. In this paper, we present $\mathbf{S^3KF}$, a panoramic 3D multi-object tracking framework built on a motorized rotating LiDAR and a quad-fisheye camera rig. The key idea is a geometry-consistent state representation on the unit sphere $\mathbb{S}^2$, where object bearing is modeled by a two-degree-of-freedom tangent-plane parameterization and jointly estimated with box scale and depth dynamics. Based on this state, we derive an extended spherical Kalman filtering pipeline that fuses panoramic camera detections with LiDAR depth observations for multimodal tracking. We further establish a map-based ground-truth generation pipeline using wearable localization devices registered to a shared global LiDAR map, enabling quantitative evaluation without motion-capture infrastructure. Experiments on self-collected real-world sequences show decimeter-level planar tracking accuracy, improved identity continuity over a 2D panoramic baseline in dynamic scenes, and real-time onboard operation on a Jetson AGX Orin platform. These results indicate that the proposed framework is a practical solution for panoramic perception and industrial-scale multi-object tracking. The project page can be found at https://kafeiyin00.github.io/S3KF/.
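
The two-degree-of-freedom tangent-plane idea can be illustrated with the standard exponential map on the unit sphere: a bearing is a unit 3-vector, and a correction is a 2-vector expressed in an orthonormal basis of the tangent plane at that bearing. The sketch below assumes this generic construction; the helper names and the basis choice are hypothetical, and it is not the filter derived in the paper.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def normalize(v: Vec3) -> Vec3:
    n = math.sqrt(sum(c * c for c in v))
    return (v[0] / n, v[1] / n, v[2] / n)


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tangent_basis(b: Vec3) -> Tuple[Vec3, Vec3]:
    """Return two orthonormal vectors spanning the tangent plane at unit bearing b."""
    # Pick a helper axis that is not nearly parallel to b.
    helper = (1.0, 0.0, 0.0) if abs(b[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = normalize(cross(helper, b))
    e2 = cross(b, e1)
    return e1, e2


def retract(b: Vec3, delta: Tuple[float, float]) -> Vec3:
    """Apply a 2-DoF tangent-plane step to a unit bearing via the exponential map."""
    e1, e2 = tangent_basis(b)
    t = tuple(delta[0] * e1[i] + delta[1] * e2[i] for i in range(3))
    theta = math.sqrt(sum(c * c for c in t))  # rotation angle in radians
    if theta < 1e-12:
        return b
    moved = tuple(math.cos(theta) * b[i] + math.sin(theta) * t[i] / theta
                  for i in range(3))
    return normalize(moved)


bearing = (0.0, 0.0, 1.0)
updated = retract(bearing, (0.0, math.pi / 2))  # quarter turn along the second tangent axis
```

The point of the construction is that the update has only two free parameters and the result always stays on the unit sphere, whereas a raw 3-vector update would need an extra normalization constraint.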

2603.27533 2026-03-31 cs.CV cs.AI

Demo-Pose: Depth-Monocular Modality Fusion For Object Pose Estimation

Rachit Agarwal, Abhishek Joshi, Sathish Chalasani, Woo Jin Kim

Comments Accepted at ICASSP 2026, 5 pages, 3 figures, 3 tables

2603.27531 2026-03-31 cs.CV

OmniColor: A Unified Framework for Multi-modal Lineart Colorization

Xulu Zhang, Haoqian Du, Xiaoyong Wei, Qing Li

2603.27530 2026-03-31 cs.CL

A gentle tutorial and a structured reformulation of Bock's algorithm for minimum directed spanning trees

Yuxi Wang, Jungyeul Park

2603.27528 2026-03-31 cs.SD cs.IR

Advancing Multi-Instrument Music Transcription: Results from the 2025 AMT Challenge

Ojas Chaturvedi, Kayshav Bhardwaj, Tanay Gondil, Benjamin Shiue-Hal Chou, Kristen Yeon-Ji Yun, Yung-Hsiang Lu, Yujia Yan, Sungkyun Chang

Comments 7 pages, 3 figures. Accepted to the AI for Music Workshop at NeurIPS 2025

2603.27527 2026-03-31 cs.LG

Visualization of Machine Learning Models through Their Spatial and Temporal Listeners

Siyu Wu, Lei Shi, Lei Xia, Cenyang Wu, Zipeng Liu, Yingchaojie Feng, Liang Zhou, Wei Chen

2603.27526 2026-03-31 cs.LG

Q-BIOLAT: Binary Latent Protein Fitness Landscapes for QUBO-Based Optimization

Truong-Son Hy

2603.27522 2026-03-31 cs.CL cs.CR cs.LG

Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Models

Duanyi Yao, Changyue Li, Zhicong Huang, Cheng Hong, Songze Li

2603.27520 2026-03-31 cs.CV

TokenDial: Continuous Attribute Control in Text-to-Video via Spatiotemporal Token Offsets

Zhixuan Liu, Peter Schaldenbrand, Yijun Li, Long Mai, Aniruddha Mahapatra, Cusuh Ham, Jean Oh, Jui-Hsien Wang

Comments Project page: https://tokendial.github.io/

2603.27519 2026-03-31 cs.CV

SPROUT: A Scalable Diffusion Foundation Model for Agricultural Vision

Shuai Xiang, Wei Guo, James Burridge, Shouyang Liu, Hao Lu, Tokihiro Fukatsu

2603.27515 2026-03-31 cs.LG

Match or Replay: Self Imitating Proximal Policy Optimization

Gaurav Chaudhary, Laxmidhar Behera, Washim Uddin Mondal