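arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.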

2503.21491 2026-04-16 cs.RO cs.SY eess.SY

Data-Driven Contact-Aware Control Method for Real-Time Deformable Tool Manipulation: A Case Study in the Environmental Swabbing

Siavash Mahmoudi, Amirreza Davar, Dongyi Wang

Comments Submitted for Journal Review

Details

DOI: 10.1002/adrr.202500142

English abstract

Deformable Object Manipulation (DOM) remains a critical challenge in robotics due to the complexities of developing suitable model-based control strategies. Deformable Tool Manipulation (DTM) further complicates this task by introducing additional uncertainties between the robot and its environment. While humans effortlessly manipulate deformable tools using touch and experience, robotic systems struggle to maintain stability and precision. To address these challenges, we present a novel State-Adaptive Koopman LQR (SA-KLQR) control framework for real-time deformable tool manipulation, demonstrated through a case study in environmental swab sampling for food safety. This method leverages Koopman operator-based control to linearize nonlinear dynamics while adapting to state-dependent variations in tool deformation and contact forces. A tactile-based feedback system dynamically estimates and regulates the swab tool's angle, contact pressure, and surface coverage, ensuring compliance with food safety standards. Additionally, a sensor-embedded contact pad monitors force distribution to mitigate tool pivoting and deformation, improving stability during dynamic interactions. Experimental results validate the SA-KLQR approach, demonstrating accurate contact angle estimation, robust trajectory tracking, and reliable force regulation. The proposed framework enhances precision, adaptability, and real-time control in deformable tool manipulation, bridging the gap between data-driven learning and optimal control in robotic interaction tasks.

2503.06078 2026-04-16 cs.LG cs.IT eess.SP math.IT

Biased Federated Learning under Wireless Heterogeneity

Muhammad Faraz Ul Abrar, Nicolò Michelusi

Comments Accepted in IEEE Transactions on Wireless Communications

Details

DOI: 10.1109/TWC.2026.3678173

English abstract

Federated learning (FL) has emerged as a promising framework for distributed learning, enabling collaborative model training without sharing private data. Existing wireless FL works primarily adopt two communication strategies: (1) over-the-air (OTA) computation, which exploits wireless signal superposition for simultaneous gradient aggregation, and (2) digital communication, which allocates orthogonal resources for gradient uploads. Prior works on both schemes typically assume homogeneous wireless conditions (equal path loss across devices) to enforce zero-bias updates or permit uncontrolled bias, resulting in suboptimal performance and high-variance model updates in heterogeneous environments, where devices with poor channel conditions slow down convergence. This paper addresses FL over heterogeneous wireless networks by proposing novel OTA and digital FL updates that allow a structured, time-invariant model bias, thereby reducing variance in FL updates. We analyze their convergence under a unified framework and derive an upper bound on the model "optimality error", which explicitly quantifies the effect of bias and variance in terms of design parameters. Next, to optimize this trade-off, we study a non-convex optimization problem and develop a successive convex approximation (SCA)-based framework to jointly optimize the design parameters. We perform extensive numerical evaluations with several related design variants and state-of-the-art OTA and digital FL schemes. Our results confirm that minimizing the bias-variance trade-off while allowing a structured bias provides better FL convergence performance than existing schemes.

2502.00414 2026-04-16 cs.CL

Social media polarization during conflict: Insights from an ideological stance dataset on Israel-Palestine Reddit comments

Hasin Jawad Ali, Ajwad Abrar, S. M. Hozaifa Hossain, M. Firoz Mridha

2501.19378 2026-04-16 cs.CL

TableMaster: A Recipe to Advance Table Understanding with Language Models

Lang Cao, Hanbing Liu

2501.02378 2026-04-16 cs.LG q-bio.NC stat.ML

A ghost mechanism: An analytical model of abrupt learning in recurrent networks

Fatih Dinc, Ege Cirakman, Bariscan Kurtkaya, Mert Yuksekgonul, Yiqi Jiang, Mark J. Schnitzer, Hidenori Tanaka

Comments to appear in Physical Review X

Details

English abstract

Abrupt learning is a common phenomenon in recurrent neural networks (RNNs) trained on working memory tasks. In such cases, the networks develop transient slow regions in state space that extend the effective timescales of computation. However, the mechanisms driving sudden performance improvements and their causal role remain unclear. To address this gap, we introduce the ghost mechanism, a process by which dynamical systems exhibit transient slowdown near the remnant of a saddle-node bifurcation. By reducing the high-dimensional dynamics near ghost points, we derive a one-dimensional canonical form that analytically captures learning as a process controlled by a single scale parameter. Using this model, we study a form of abrupt learning emerging from ghost points and identify a critical learning rate that scales as an inverse power law with the timescale of the learned computation. Beyond this rate, learning collapses through two interacting modes: (i) vanishing gradients and (ii) oscillatory gradients near minima. These features can lock the system into high-confidence but incorrect predictions when parameter updates trigger a no-learning zone, a region of parameter space where gradients vanish. We validate these predictions in low-rank RNNs, where ghost points precede abrupt transitions, and further demonstrate their generality in full-rank RNNs trained on canonical working memory tasks. Our theory offers two approaches to address these learning difficulties: increasing trainable ranks stabilizes learning trajectories, while reducing output confidence mitigates entrapment in no-learning zones. Overall, the ghost mechanism reveals how the computational demands of a task constrain the optimization landscape, demonstrating that well-known learning difficulties in RNNs partly arise from the dynamical systems they must learn to implement.
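
For intuition about the "transient slowdown near the remnant of a saddle-node bifurcation", the textbook saddle-node normal form gives a short worked example. This is an illustrative assumption: the abstract does not state the paper's own one-dimensional canonical form, and the symbols $x$, $r$, and $T$ below are introduced here only for illustration.

```latex
% Textbook saddle-node normal form, just past the bifurcation (r > 0):
\dot{x} = r + x^{2}, \qquad r > 0 .
% No fixed point exists, but the flow is slowest near x = 0 (the "ghost").
% Time to pass through the bottleneck:
T(r) = \int_{-\infty}^{\infty} \frac{dx}{r + x^{2}}
     = \frac{1}{\sqrt{r}} \Bigl[ \arctan\!\bigl( x / \sqrt{r} \bigr) \Bigr]_{-\infty}^{\infty}
     = \frac{\pi}{\sqrt{r}} .
% As r -> 0^{+}, T(r) diverges: a single scale parameter sets the timescale.
```

In this toy form, a small change in $r$ near zero produces a large change in transit time, which is one way a single scale parameter can control long effective timescales.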

2412.09819 2026-04-16 cs.LG cs.SY eess.SY

FDM-Bench: A Comprehensive Benchmark for Evaluating Large Language Models in Additive Manufacturing Tasks

Ahmadreza Eslaminia, Adrian Jackson, Beitong Tian, Avi Stern, Hallie Gordon, Rajiv Malhotra, Klara Nahrstedt, Chenhui Shao

Details

DOI: 10.1016/j.mfglet.2025.06.161

English abstract

Fused Deposition Modeling (FDM) is a widely used additive manufacturing (AM) technique valued for its flexibility and cost-efficiency, with applications in a variety of industries including healthcare and aerospace. Recent developments have made affordable FDM machines accessible and encouraged adoption among diverse users. However, the design, planning, and production processes in FDM require specialized interdisciplinary knowledge. Managing the complex parameters and resolving print defects in FDM remain challenging. These technical complexities form the most critical barrier preventing individuals without technical backgrounds and even professional engineers without training in other domains from participating in AM design and manufacturing. Large Language Models (LLMs), with their advanced capabilities in text and code processing, offer the potential for addressing these challenges in FDM. However, existing research on LLM applications in this field is limited, typically focusing on specific use cases without providing comprehensive evaluations across multiple models and tasks. To this end, we introduce FDM-Bench, a benchmark dataset designed to evaluate LLMs on FDM-specific tasks. FDM-Bench enables a thorough assessment by including user queries across various experience levels and G-code samples that represent a range of anomalies. We evaluate two closed-source models (GPT-4o and Claude 3.5 Sonnet) and two open-source models (Llama-3.1-70B and Llama-3.1-405B) on FDM-Bench. A panel of FDM experts assesses the models' responses to user queries in detail. Results indicate that closed-source models generally outperform open-source models in G-code anomaly detection, whereas Llama-3.1-405B demonstrates a slight advantage over other models in responding to user queries. These findings underscore FDM-Bench's potential as a foundational tool for advancing research on LLM capabilities in FDM.

2411.17511 2026-04-16 cs.LG cs.NA math.NA

Training Hamiltonian neural networks without backpropagation

Atamert Rahma, Chinmay Datar, Felix Dietrich

Comments 5 pages, 2 figures and 2 tables in the main text, includes an Appendix section, accepted to NeurIPS 2024 Workshop ML4PS

2411.10703 2026-04-16 cs.LG eess.SP

Hybrid Attention Model Using Feature Decomposition and Knowledge Distillation for Glucose Forecasting

Ebrahim Farahmand, Shovito Barua Soumma, Nooshin Taheri Chatrudi, Hassan Ghasemzadeh

Comments Final accepted version. IEEE TMC Journal

Details

English abstract

The availability of continuous glucose monitors as over-the-counter commodities has created a unique opportunity to monitor a person's blood glucose levels, forecast blood glucose trajectories, and provide automated interventions to prevent devastating chronic complications that arise from poor glucose control. However, forecasting blood glucose levels (BGL) is challenging because blood glucose changes consistently in response to food intake, medication intake, physical activity, sleep, and stress. It is particularly difficult to accurately predict BGL from multimodal and irregularly sampled data and over long prediction horizons. Furthermore, these forecasting models must operate in real time on edge devices to provide in-the-moment interventions. To address these challenges, we propose GlucoNet, an AI-powered sensor system for continuously monitoring behavioral and physiological health and robust forecasting of blood glucose patterns. GlucoNet devises a feature decomposition-based transformer model that incorporates patients' behavioral and physiological data and transforms sparse and irregular patient data (e.g., diet and medication intake data) into continuous features using a mathematical model, facilitating better integration with the BGL data. Given the non-linear and non-stationary nature of blood glucose signals, we propose a decomposition method to extract both low- and high-frequency components from the BGL signals, thus providing accurate forecasting. To reduce the computational complexity, we also propose to employ knowledge distillation to compress the transformer model. GlucoNet achieves a 60% improvement in RMSE and a 21% reduction in the number of parameters, improving RMSE and MAE by 51% and 57%, using data obtained from 12 participants with type 1 diabetes. These results underscore GlucoNet's potential as a compact and reliable tool for real-world diabetes prevention and management.

2410.21326 2026-04-16 cs.LG cs.AI

Self-Supervised Learning and Opportunistic Inference for Continuous Monitoring of Freezing of Gait in Parkinson's Disease

Shovito Barua Soumma, Daniel Peterson, Shyamal Mehta, Hassan Ghasemzadeh

Comments 24 pages

2410.01473 2026-04-16 cs.CV

SinkSAM-Net: Knowledge-Driven Self-Supervised Sinkhole Segmentation Using Topographic Priors and Segment Anything Model

Osher Rafaeli, Tal Svoray, Ariel Nahlieli

Comments 17 pages, 8 figures

Details

DOI: 10.1016/j.isprsjprs.2025.06.035

English abstract

Soil sinkholes significantly influence soil degradation, infrastructure vulnerability, and landscape evolution. However, their irregular shapes, combined with interference from shadows and vegetation, make it challenging to accurately quantify their properties using remotely sensed data. In addition, manual annotation can be laborious and costly. In this study, we introduce a novel self-supervised framework for sinkhole segmentation, termed SinkSAM-Net, which integrates traditional topographic computations of closed depressions with an iterative, geometry-aware, prompt-based Segment Anything Model (SAM). We generate high-quality pseudo-labels through pixel-level refinement of sinkhole boundaries by integrating monocular depth information with a random prompt augmentation technique named coordinate-wise bounding box jittering (CWBJ). These pseudo-labels iteratively enhance a lightweight EfficientNetV2-UNet target model, ultimately transferring knowledge to a prompt-free, low-parameter, and fast inference model. Our proposed approach achieves approximately 95% of the performance obtained through manual supervision by human annotators. The framework's performance was evaluated on a large sinkhole database, covering diverse sinkhole dataset-induced sinkholes using both aerial and high-resolution drone imagery. This paper presents the first self-supervised framework for sinkhole segmentation, demonstrating the robustness of foundational models (such as SAM and Depth Anything V2) when combined with prior topographic and geometry knowledge and an iterative self-learning pipeline. SinkSAM-Net has the potential to be trained effectively on extensive unlabeled RGB sinkhole datasets, achieving comparable performance to a supervised model. The code and interactive demo for SinkSAM-Net are available at https://osherr1996.github.io/SinkSAMNet

2408.00601 2026-04-16 cs.LG

AutoPV: Automatically Design Your Photovoltaic Power Forecasting Model

Dayin Chen, Xiaodan Shi, Mingkun Jiang, Haoran Zhang, Dongxiao Zhang, Yuntian Chen, Jinyue Yan

2407.08101 2026-04-16 cs.CV

What to Say and When to Say it: Live Fitness Coaching as a Testbed for Situated Interaction

Sunny Panchal, Apratim Bhattacharyya, Guillaume Berger, Antoine Mercier, Cornelius Bohm, Florian Dietrichkeit, Reza Pourreza, Xuanlin Li, Pulkit Madan, Mingu Lee, Mark Todorovich, Ingo Bax, Roland Memisevic

Comments Accepted to the 2024 NeurIPS Datasets and Benchmarks track; Data: https://www.qualcomm.com/developer/software/qevd-dataset Dataset quick start guide: https://github.com/varworkshop/ai_coach_fitness_2026 and Stream-VLM code: https://github.com/Qualcomm-AI-research/FitCoach

2405.19088 2026-04-16 cs.CL cs.CV

Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions

Zhe Hu, Tuo Liang, Jing Li, Yiren Lu, Yunlai Zhou, Yiran Qiao, Jing Ma, Yu Yin

Comments NeurIPS 2024 (Oral)

2403.08462 2026-04-16 cs.CL cs.LG

Grammar as a Behavioral Biometric: Using Cognitively Motivated Grammar Models for Authorship Verification

Andrea Nini, Oren Halvani, Lukas Graner, Sophie Titze, Valerio Gherardi, Shunichi Ishihara

2310.02540 2026-04-16 cs.LG cs.AI cs.DB cs.IR

Auto-FP: An Experimental Study of Automated Feature Preprocessing for Tabular Data

Danrui Qi, Jinglin Peng, Yongjun He, Jiannan Wang

Details

English abstract

Classical machine learning models, such as linear models and tree-based models, are widely used in industry. These models are sensitive to data distribution, thus feature preprocessing, which transforms features from one distribution to another, is a crucial step to ensure good model quality. Manually constructing a feature preprocessing pipeline is challenging because data scientists need to make difficult decisions about which preprocessors to select and in which order to compose them. In this paper, we study how to automate feature preprocessing (Auto-FP) for tabular data. Due to the large search space, a brute-force solution is prohibitively expensive. To address this challenge, we interestingly observe that Auto-FP can be modelled as either a hyperparameter optimization (HPO) or a neural architecture search (NAS) problem. This observation enables us to extend a variety of HPO and NAS algorithms to solve the Auto-FP problem. We conduct a comprehensive evaluation and analysis of 15 algorithms on 45 public ML datasets. Overall, evolution-based algorithms show the leading average ranking. Surprisingly, the random search turns out to be a strong baseline. Many surrogate-model-based and bandit-based search algorithms, which achieve good performance for HPO and NAS, do not outperform random search for Auto-FP. We analyze the reasons for our findings and conduct a bottleneck analysis to identify the opportunities to improve these algorithms. Furthermore, we explore how to extend Auto-FP to support parameter search and compare two ways to achieve this goal. In the end, we evaluate Auto-FP in an AutoML context and discuss the limitations of popular AutoML tools. To the best of our knowledge, this is the first study on automated feature preprocessing. We hope our work can inspire researchers to develop new algorithms tailored for Auto-FP.
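
The abstract reports that random search is a strong baseline for the pipeline search problem. A minimal sketch of that idea, under loud assumptions, is below: the three toy preprocessors, the sample feature, and the skewness-based score are all hypothetical stand-ins chosen so the example runs with the standard library alone. They are not the preprocessors, datasets, or evaluation metric used in the study, where a pipeline would be scored by the quality of a downstream model.

```python
import math
import random
import statistics

# Hypothetical toy preprocessors: each maps a list of floats to a list of floats.
def standardize(xs):
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs) or 1.0  # avoid division by zero on constant input
    return [(x - mu) / sd for x in xs]

def min_max(xs):
    lo, hi = min(xs), max(xs)
    span = (hi - lo) or 1.0
    return [(x - lo) / span for x in xs]

def signed_log(xs):
    return [math.copysign(math.log1p(abs(x)), x) for x in xs]

PREPROCESSORS = {
    "standardize": standardize,
    "min_max": min_max,
    "signed_log": signed_log,
}

# Hypothetical, heavily right-skewed feature column.
FEATURE = [1.0, 2.0, 3.0, 4.0, 100.0, 1000.0]

def apply_pipeline(pipeline, xs):
    """Apply the named preprocessors in order."""
    for name in pipeline:
        xs = PREPROCESSORS[name](xs)
    return xs

def abs_skewness(xs):
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    if sd == 0:
        return float("inf")
    return abs(statistics.fmean([((x - mu) / sd) ** 3 for x in xs]))

def score(pipeline):
    """Toy objective (an assumption): higher is better, less skew is better."""
    return -abs_skewness(apply_pipeline(pipeline, FEATURE))

def random_search(score_fn, n_trials=50, max_len=3, seed=0):
    """Sample pipelines uniformly at random; keep the best one seen."""
    rng = random.Random(seed)
    names = sorted(PREPROCESSORS)
    best_pipeline, best_score = (), score_fn(())  # empty pipeline as baseline
    for _ in range(n_trials):
        length = rng.randint(1, max_len)
        pipeline = tuple(rng.choice(names) for _ in range(length))
        s = score_fn(pipeline)
        if s > best_score:
            best_pipeline, best_score = pipeline, s
    return best_pipeline, best_score

best_pipeline, best_score = random_search(score)
print(best_pipeline, best_score)
```

Both the choice of preprocessors and their order are sampled, which mirrors the two decisions the abstract says data scientists must make by hand.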

2204.13635 2026-04-16 cs.CV

SemAttNet: Towards Attention-based Semantic Aware Guided Depth Completion

Danish Nazir, Marcus Liwicki, Didier Stricker, Muhammad Zeshan Afzal

Comments accepted at IEEE Access

Details

DOI: 10.1109/ACCESS.2022.3214316
Journal ref: IEEE Access, vol. 10, pp. 120781-120791, 2022

English abstract

Depth completion involves recovering a dense depth map from a sparse map and an RGB image. Recent approaches focus on utilizing color images as guidance images to recover depth at invalid pixels. However, color images alone are not enough to provide the necessary semantic understanding of the scene. Consequently, the depth completion task suffers from sudden illumination changes in RGB images (e.g., shadows). In this paper, we propose a novel three-branch backbone comprising color-guided, semantic-guided, and depth-guided branches. Specifically, the color-guided branch takes a sparse depth map and an RGB image as input and generates color depth, which includes color cues (e.g., object boundaries) of the scene. The predicted dense depth map of the color-guided branch, along with the semantic image and sparse depth map, is passed as input to the semantic-guided branch for estimating semantic depth. The depth-guided branch takes sparse, color, and semantic depths to generate the dense depth map. The color depth, semantic depth, and guided depth are adaptively fused to produce the output of our proposed three-branch backbone. In addition, we propose to apply a semantic-aware multi-modal attention-based fusion block (SAMMAFB) to fuse features between all three branches. We further use CSPN++ with atrous convolutions to refine the dense depth map produced by our three-branch backbone. Extensive experiments show that our model achieves state-of-the-art performance in the KITTI depth completion benchmark at the time of submission.

1804.09154 2026-04-16 cs.LG cs.HC stat.ML

DOOM Level Generation using Generative Adversarial Networks

Edoardo Giacomello, Pier Luca Lanzi, Daniele Loiacono

2604.14114 2026-04-16 cs.IR cs.LG

ID and Graph View Contrastive Learning with Multi-View Attention Fusion for Sequential Recommendation

Xiaofan Zhou, Kyumin Lee

2604.14075 2026-04-16 math.OC cs.LG stat.ML

Multistage Conditional Compositional Optimization

Buse Şen, Yifan Hu, Daniel Kuhn

2604.14059 2026-04-16 econ.GN cs.LG q-fin.EC

A Comparative Study of Dynamic Programming and Reinforcement Learning in Finite Horizon Dynamic Pricing

Lev Razumovskiy, Nikolay Karenin

2604.14034 2026-04-16 cs.SE cs.AI cs.IR

Large Language Models to Enhance Business Process Modeling: Past, Present, and Future Trends

João Bettencourt, Sérgio Guerreiro

Comments 27 pages, 2 images, 1 table

2604.14017 2026-04-16 math.OC cs.LG

Stochastic Trust-Region Methods for Over-parameterized Models

Aike Yang, Hao Wang

Comments 26 pages, 3 figures

2604.13956 2026-04-16 cs.HC cs.AI cs.CV

Creo: From One-Shot Image Generation to Progressive, Co-Creative Ideation

Zoe De Simone, Angie Boggust, Fredo Durand, Ashia Wilson, Arvind Satyanarayan

Comments 11 pages, 5 figures

2604.13919 2026-04-16 physics.flu-dyn cs.LG physics.comp-ph

Nested Fourier-enhanced neural operator for efficient modeling of radiation transfer in fires

Anran Jiao, Wengyao Jiang, Xiaoyi Lu, Yi Wang, Lu Lu

Details

English abstract

Computational fluid dynamics (CFD) has become an essential tool for predicting fire behavior, yet maintaining both efficiency and accuracy remains challenging. A major source of computational cost in fire simulations is the modeling of radiation transfer, which is usually the dominant heat transfer mechanism in fires. Solving the high-dimensional radiative transfer equation (RTE) with traditional numerical methods can be a performance bottleneck. Here, we present a machine learning framework based on Fourier-enhanced multiple-input neural operators (Fourier-MIONet) as an efficient alternative to direct numerical integration of the RTE. We first investigate the performance of neural operator architectures for a small-scale 2D pool fire and find that Fourier-MIONet provides the most accurate radiative solution predictions. The approach is then extended to 3D CFD fire simulations, where the computational mesh is locally refined across multiple levels. In these high-resolution settings, monolithic surrogate models for direct field-to-field mapping become difficult to train and computationally inefficient. To address this issue, a nested Fourier-MIONet is proposed to predict radiation solutions across multiple mesh-refinement levels. We validate the approach on 3D McCaffrey pool fires simulated with FireFOAM, including fixed fire sizes and a unified model trained over a continuous range of heat release rates (HRRs). The proposed method achieves global relative errors of 2-4% for 3D varying-HRR scenarios while providing faster inference than the estimated cost of one finite-volume radiation solve in FireFOAM for the 16-solid-angle case. With fast and accurate inference, the surrogate makes higher-fidelity radiation treatments practical and enables the incorporation of more spectrally resolved radiation models into CFD fire simulations for engineering applications.

2604.13890 2026-04-16 physics.soc-ph cs.LG econ.EM econ.TH stat.ML

Sandpile Economics: Theory, Identification, and Evidence

Diego Vallarino

2604.13870 2026-04-16 math.OC cs.LG

Gradient Descent's Last Iterate is Often (slightly) Suboptimal

Guy Kornowski, Ohad Shamir

2604.13849 2026-04-16 cs.CR cs.AI

MCPThreatHive: Automated Threat Intelligence for Model Context Protocol Ecosystems

Yi Ting Shen, Kentaroh Toyoda, Alex Leung

Comments A white paper of our presentation at DEFCON SG 2026 (Demo Labs) https://defcon.org/html/defcon-singapore/dc-singapore-demolabs.html

2604.13830 2026-04-16 math.NA cs.LG cs.NA

Randomized Neural Networks for Integro-Differential Equations with Application to Neutron Transport

Haoning Dang, Fei Wang, Yifan Chen, Zhouyu Liu, Dong Liu, Hongchun Wu

2604.13826 2026-04-16 cs.SE cs.AI

Sentiment analysis for software engineering: How far can zero-shot learning (ZSL) go?

Reem Alfayez, Manal Binkhonain

Details

DOI: 10.1016/j.infsof.2025.107971

English abstract

Sentiment analysis in software engineering focuses on understanding emotions expressed in software artifacts. Previous research highlighted the limitations of applying general off-the-shelf sentiment analysis tools within the software engineering domain and indicated the need for specialized tools tailored to various software engineering contexts. The development of such tools heavily relies on supervised machine learning techniques that necessitate annotated datasets. Acquiring such datasets is a substantial challenge, as it requires domain-specific expertise and significant effort. Objective: This study explores the potential of zero-shot learning (ZSL) to address the scarcity of annotated datasets in sentiment analysis within software engineering. Method: We conducted an empirical experiment to evaluate the performance of various ZSL techniques, including embedding-based, NLI-based, TARS-based, and generative-based ZSL techniques. We assessed the performance of these techniques under different label setups to examine the impact of label configurations. Additionally, we compared the results of the ZSL techniques with state-of-the-art fine-tuned transformer-based models. Finally, we performed an error analysis to identify the primary causes of misclassifications. Results: Our findings demonstrate that ZSL techniques, particularly those combining expert-curated labels with embedding-based or generative-based models, can achieve macro-F1 scores comparable to fine-tuned transformer-based models. The error analysis revealed that subjectivity in annotation and polar facts are the main contributors to ZSL misclassifications. Conclusion: This study demonstrates the potential of ZSL for sentiment analysis in software engineering. ZSL can provide a solution to the challenge of annotated dataset scarcity by reducing reliance on annotated datasets.

2604.13814 2026-04-16 cs.HC cs.AI

Cognitive Offloading in Agile Teams: How Artificial Intelligence Reshapes Risk Assessment and Planning Quality

Adriana Caraeni, Alexander Shick, Andrew Lan

Comments 7 pages, 5 Tables, under review