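arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.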

2503.06145 2026-03-19 cs.LG

Adaptive UAV-Assisted Hierarchical Federated Learning: Optimizing Energy, Latency, and Resilience for Dynamic Smart IoT

Xiaohong Yang, Minghui Liwang, Liqun Fu, Yuhan Su, Seyyedali Hosseinalipour, Xianbin Wang, Yiguang Hong

Comments: Accepted by IEEE Transactions on Services Computing (22 pages, 11 figures)

Details

English abstract

Hierarchical Federated Learning (HFL) extends conventional Federated Learning (FL) by introducing intermediate aggregation layers, enabling distributed learning in geographically dispersed environments, particularly relevant for smart IoT systems, such as remote monitoring and battlefield operations, where cellular connectivity is limited. In these scenarios, UAVs serve as mobile aggregators, dynamically connecting terrestrial IoT devices. This paper investigates an HFL architecture with energy-constrained, dynamically deployed UAVs prone to communication disruptions. We propose a novel approach to minimize global training costs by formulating a joint optimization problem that integrates learning configuration, bandwidth allocation, and device-to-UAV association, ensuring timely global aggregation before UAV disconnections and redeployments. The problem accounts for dynamic IoT devices and intermittent UAV connectivity and is NP-hard. To tackle this, we decompose it into three subproblems: (i) optimizing learning configuration and bandwidth allocation via an augmented Lagrangian to reduce training costs; (ii) introducing a device fitness score based on data heterogeneity (via Kullback-Leibler divergence), device-to-UAV proximity, and computational resources, using a TD3-based algorithm for adaptive device-to-UAV assignment; (iii) developing a low-complexity two-stage greedy strategy for UAV redeployment and global aggregator selection, ensuring efficient aggregation despite UAV disconnections. Experiments on diverse real-world datasets validate the approach, demonstrating cost reduction and robust performance under communication disruptions.
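
The abstract above mentions measuring data heterogeneity with the Kullback-Leibler divergence. The following is a minimal sketch of that single ingredient, not the paper's fitness score: it compares one device's empirical label distribution against a reference distribution. The function names, the uniform reference, the smoothing constant `eps`, and the toy labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def label_distribution(labels, num_classes, eps=1e-12):
    """Empirical class distribution of one device's labels, smoothed by eps."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    # eps keeps every probability strictly positive so the log below is defined.
    probs = [counts.get(c, 0) / total + eps for c in range(num_classes)]
    norm = sum(probs)
    return [p / norm for p in probs]


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical example: 3 classes, with a uniform reference distribution.
reference = [1 / 3, 1 / 3, 1 / 3]
balanced_device = label_distribution([0, 1, 2, 0, 1, 2], num_classes=3)
skewed_device = label_distribution([0, 0, 0, 0, 0, 1], num_classes=3)

kl_balanced = kl_divergence(balanced_device, reference)  # close to zero
kl_skewed = kl_divergence(skewed_device, reference)      # clearly positive
```

A device whose labels match the reference yields a divergence near zero, while a skewed device yields a larger value. How such a value would be combined with proximity and computational resources is not specified in the abstract and is left out here.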

2502.20030 2026-03-19 cs.LG cs.SY eess.SY math.OC

Offline Reinforcement Learning via Inverse Optimization

Ioannis Dimanidis, Tolga Ok, Peyman Mohajerin Esfahani

Comments: preprint

2502.17292 2026-03-19 cs.LG cs.GT cs.IT math.IT stat.ME stat.ML

Joint Value Estimation and Bidding in Repeated First-Price Auctions

Yuxiao Wen, Yanjun Han, Zhengyuan Zhou

Comments: POMS-HK 2026 Best Student Paper Finalist

2502.07139 2026-03-19 cs.CL cs.LG

Byte-token Enhanced Language Models for Temporal Point Processes Analysis

Quyu Kong, Yixuan Zhang, Yang Liu, Panrong Tong, Enqi Liu, Feng Zhou

Comments: WWW 2026

2501.14622 2026-03-19 cs.LG cs.AI

ACT-JEPA: Novel Joint-Embedding Predictive Architecture for Efficient Policy Representation Learning

Aleksandar Vujinovic, Aleksandar Kovacevic

2411.15206 2026-03-19 cs.LG cs.AI

Conditional Distribution Learning for Graph Classification

Jie Chen, Hua Mao, Chuanbin Liu, Zhu Wang, Xi Peng

Comments: 8 pages

2411.12127 2026-03-19 cs.LG cs.IT math.IT math.ST stat.ML stat.TH

Fine-Grained Uncertainty Quantification via Collisions

Jesse Friedbaum, Sudarshan Adiga, Ravi Tandon

2410.12346 2026-03-19 cs.CV cs.AI

Efficient Diffusion as Low Light Enhancer

Guanzhou Lan, Qianli Ma, Yuqi Yang, Zhigang Wang, Dong Wang, Xuelong Li, Bin Zhao

Comments: CVPR 2025 Camera Ready

2409.17049 2026-03-19 cs.CV cs.AI

From Geometric Mimicry to Comprehensive Generation: A Context-Informed Multimodal Diffusion Model for Urban Morphology Synthesis

Fangshuo Zhou, Huaxia Li, Liuchang Xu, Rui Hu, Sensen Wu, Liang Xu, Hailin Feng, Zhenhong Du

Comments: Accepted

Details

DOI: 10.1080/13658816.2026.2639026
Journal ref: International Journal of Geographical Information Science (2026)

English abstract

Urban morphology is fundamental to determining urban functionality and vitality. Prevailing simulation methods, however, often oversimplify morphological generation as a geometric problem, lacking a profound understanding of urban semantics and geographical context. To address this limitation, this study proposes ControlCity, a diffusion model that achieves comprehensive urban morphology generation through multimodal information fusion. We first constructed a quadruple dataset comprising "image-text-metadata-building footprints" from 22 cities worldwide. ControlCity uses this multidimensional information as joint control conditions, where an enhanced ControlNet architecture encodes spatial constraints from images, while text and metadata provide semantic guidance and geographical priors respectively, collectively directing the generation process. Experimental results demonstrate that compared to unimodal baselines, this method achieves significant advantages in morphological fidelity, with visual error (FID) reduced by 71.01%, reaching 50.94, and spatial overlap (MIoU) improved by 38.46%, reaching 0.36. Furthermore, the model demonstrates robust knowledge generalization and controllability, enabling cross-city style transfer and zero-shot generation for unknown cities. Ablation studies further reveal the distinct roles of images, text, and metadata in the generation process. This study confirms that multimodal fusion is crucial for achieving the transition from "geometric mimicry" to "understanding-based comprehensive generation," providing a novel paradigm for urban morphology research and applications.

2405.10642 2026-03-19 cs.LG

Hi-GMAE: Hierarchical Graph Masked Autoencoders

Chuang Liu, Zelin Yao, Xueqi Ma, Mukun Chen, Luzhi Wang, Jia Wu, Wenbin Hu

Comments: 12 pages, 9 figures. Accepted by WWW 2026

2404.19725 2026-03-19 cs.LG cs.AI cs.DC

CurvFed: Curvature-Aligned Federated Learning for Fairness without Demographics

Harshit Sharma, Shaily Roy, Asif Salekin

Comments: *equal contribution

Details

English abstract

Modern human sensing applications often rely on data distributed across users and devices, where privacy concerns prevent centralized training. Federated Learning (FL) addresses this challenge by enabling collaborative model training without exposing raw data or attributes. However, achieving fairness in such settings remains difficult, as most human sensing datasets lack demographic labels, and FL's privacy guarantees limit the use of sensitive attributes. This paper introduces CurvFed: Curvature-Aligned Federated Learning for Fairness without Demographics, a theoretically grounded framework that promotes fairness in FL without requiring any demographic or sensitive attribute information, a concept termed Fairness without Demographics (FWD), by optimizing the underlying loss landscape curvature. Building on the theory that equivalent loss landscape curvature corresponds to consistent model efficacy across sensitive attribute groups, CurvFed regularizes the top eigenvalue of the Fisher Information Matrix (FIM) as an efficient proxy for loss landscape curvature, both within and across clients. This alignment promotes uniform model behavior across diverse bias-inducing factors, offering an attribute-agnostic route to algorithmic fairness. CurvFed is especially suitable for real-world human sensing FL scenarios involving single- or multi-user edge devices with unknown or multiple bias factors. We validate CurvFed through theoretical and empirical justifications, as well as comprehensive evaluations using three real-world datasets and a deployment on a heterogeneous testbed of resource-constrained devices. Additionally, we conduct sensitivity analyses on local training data volume, client sampling, communication overhead, resource costs, and runtime performance to demonstrate its feasibility for practical FL edge device deployment.
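
The abstract above uses the top eigenvalue of the Fisher Information Matrix as a curvature proxy. As a rough illustration of that quantity only, the sketch below estimates the largest eigenvalue of an empirical Fisher matrix F = (1/n) Σ g_i g_iᵀ by power iteration, without forming F. The abstract does not say how CurvFed estimates or regularizes this eigenvalue, so the power-iteration approach, the function names, and the toy per-sample gradients are assumptions.

```python
import math
import random


def fisher_matvec(grads, v):
    """Return F v for the empirical Fisher F = (1/n) * sum_i g_i g_i^T."""
    n = len(grads)
    out = [0.0] * len(v)
    for g in grads:
        # (g . v) / n scales this sample's contribution g * (g . v) / n.
        coeff = sum(gi * vi for gi, vi in zip(g, v)) / n
        for j, gj in enumerate(g):
            out[j] += coeff * gj
    return out


def top_fisher_eigenvalue(grads, iters=200, seed=0):
    """Power-iteration estimate of the largest eigenvalue of the empirical Fisher."""
    rng = random.Random(seed)
    dim = len(grads[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(iters):
        w = fisher_matvec(grads, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0  # all gradients are zero (or orthogonal to v)
        v = [x / norm for x in w]
        # F is positive semidefinite, so ||F v|| for unit v tends to lambda_max.
        eigenvalue = norm
    return eigenvalue


# Hypothetical per-sample gradients; here F = [[2, 0], [0, 0.5]].
toy_grads = [[2.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
lambda_max = top_fisher_eigenvalue(toy_grads)
```

This only estimates the eigenvalue. Using it as a training regularizer would additionally require per-sample gradients from a real model and a differentiable formulation, neither of which is covered here.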

2311.17697 2026-03-19 cs.RO cs.MA

Swarm Self Clustering for Communication denied Environments without Global Positioning

Sweksha Jain, Rugved Katole, Leena Vachhani

Comments: 36 pages, 15 figures, 8 tables, pre-print version

2310.07147 2026-03-19 cs.CL cs.LG

QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources

Zhikai Li, Xiaoxuan Liu, Banghua Zhu, Zhen Dong, Qingyi Gu, Kurt Keutzer

Comments: ICLR 2026 Workshop on Scaling Post-training for LLMs (SPOT)

2309.00952 2026-03-19 cs.CL cs.AI

Bridge Diffusion Model: Bridge Chinese Text-to-Image Diffusion Model with English Communities

Shanyuan Liu, Bo Cheng, Yuhang Ma, Liebucha Wu, Ao Ma, Xiaoyu Wu, Dawei Leng, Yuhui Yin

Comments: Accepted as Oral at AAAI 2025. 8 pages, 5 figures. Published in Proceedings of the 39th AAAI Conference on Artificial Intelligence. Code: https://github.com/360CVGroup/Bridge_Diffusion_Model

Details

DOI: 10.1609/aaai.v39i5.32590
Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 39(5), 5541-5549 (2025)

English abstract

Text-to-Image generation (TTI) technologies are advancing rapidly, especially in English-language communities. However, beyond the input-language barrier for users, English-native TTI models inherently carry biases from their English-world-centric training data, which creates a dilemma for developing TTI models native to other languages. One common choice is to fine-tune an English-native TTI model with translated samples, but this falls short of fully addressing the model bias problem. Alternatively, training non-English language-native models from scratch can effectively resolve the English-world bias, but a model trained this way diverges from the English TTI communities and can no longer benefit from their continuing advances. To build a Chinese TTI model while keeping compatibility with the English TTI communities, we propose a novel model structure referred to as the "Bridge Diffusion Model" (BDM). The proposed BDM employs a backbone-branch network structure to learn Chinese semantics while keeping the latent space compatible with the English-native TTI backbone, in an end-to-end manner. The unique advantages of the proposed BDM are that it is not only adept at generating images that precisely depict Chinese semantics, but also compatible with various English-native TTI plugins, such as different checkpoints, LoRA, ControlNet, Dreambooth, and Textual Inversion. Moreover, BDM can generate content that seamlessly combines Chinese-native and English-native semantics within a single image, fostering cultural interaction.

2306.11983 2026-03-19 cs.RO

Stability analysis of admittance control using asymmetric stiffness matrix

Toshiaki Tsuji, Yasuhiro Kato

2305.00594 2026-03-19 cs.CV

The MCC approaches the geometric mean of precision and recall as true negatives approach infinity

Jon Crall

Comments: 9 pages, 0 figures. Major revision: adds Lean 4 formalization, expanded related work, and revised discussion of the object-detection setting; includes a brief note on LLM-assisted formalization and literature search
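
The claim in this entry's title can be checked numerically. With MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)), dividing numerator and denominator by TN and letting TN grow leaves TP / sqrt((TP+FP)(TP+FN)), which is the geometric mean of precision and recall. The sketch below is only an illustration of that limit, not the paper's Lean 4 formalization; the confusion-matrix counts are made up for the example.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def geometric_mean_pr(tp, fp, fn):
    """Geometric mean of precision and recall (independent of TN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return math.sqrt(precision * recall)


# Hypothetical counts: fix TP, FP, FN and let TN grow.
tp, fp, fn = 80, 20, 10
target = geometric_mean_pr(tp, fp, fn)
gaps = [abs(mcc(tp, tn, fp, fn) - target) for tn in (10**2, 10**4, 10**6, 10**8)]
# gaps shrinks toward zero as TN increases
```

The gap between MCC and the geometric mean shrinks at each step, consistent with the limit stated in the title.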

2303.18223 2026-03-19 cs.CL cs.AI

A Survey of Large Language Models

Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, Yifan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, Ji-Rong Wen

Comments: ongoing work; 144 pages, 1081 citations

Details

English abstract

Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they further study the scaling effect by increasing the model size even further. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show some special abilities that are not present in small-scale language models. To distinguish models by parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size. Recently, research on LLMs has been largely advanced by both academia and industry, and a remarkable milestone is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community, and could revolutionize the way we develop and use AI algorithms. In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation. We also summarize the available resources for developing LLMs and discuss the remaining issues for future directions.

2208.05545 2026-03-19 cs.CL cs.CY cs.LG

The Moral Foundations Reddit Corpus

Jackson Trager, Alireza S. Ziabari, Elnaz Rahmati, Aida Mostafazadeh Davani, Preni Golazizian, Farzan Karimi-Malekabadi, Ali Omrani, Zhihe Li, Brendan Kennedy, Georgios Chochlakis, Nils Karl Reimer, Melissa Reyes, Kelsey Cheng, Mellow Wei, Christina Merrifield, Arta Khosravi, Evans Alvarez, Morteza Dehghani

2101.04264 2026-03-19 cs.LG

HighAir: A Hierarchical Graph Neural Network-Based Air Quality Forecasting Method

Ling Chen, Jiahui Xu, Binqing Wu, Mingqi Lv, Chaoqun Zhan, Sanjian Chen, Jian Chang

2603.17974 2026-03-19 cs.SE cs.AI

Toward Scalable Automated Repository-Level Datasets for Software Vulnerability Detection

Amine Lbath

Comments: Supervisor: Prof. Massih-Reza Amini

2603.17902 2026-03-19 cs.CR cs.AI

Differential Privacy in Generative AI Agents: Analysis and Optimal Tradeoffs

Ya-Ting Yang, Quanyan Zhu

2603.17896 2026-03-19 stat.ML cs.LG

A Noise Sensitivity Exponent Controls Large Statistical-to-Computational Gaps in Single- and Multi-Index Models

Leonardo Defilippis, Florent Krzakala, Bruno Loureiro, Antoine Maillard

2603.17887 2026-03-19 cs.HC cs.AI

AI-Assisted Goal Setting Improves Goal Progress Through Social Accountability

Michel Schimpf, Julian Voigt, Thomas Bohné

2603.17836 2026-03-19 eess.SY cs.LG cs.SY

Verification and Validation of Physics-Informed Surrogate Component Models for Dynamic Power-System Simulation

Petros Ellinas, Indrajit Chaudhuri, Johanna Vorwerk, Spyros Chatzivasileiadis

2603.17829 2026-03-19 cs.SE cs.AI cs.CL

CodeScout: An Effective Recipe for Reinforcement Learning of Code Search Agents

Lintang Sutawika, Aditya Bharat Soni, Bharath Sriraam R R, Apurva Gandhi, Taha Yassine, Sanidhya Vijayvargiya, Yuchen Li, Xuhui Zhou, Yilin Zhang, Leander Melroy Maben, Graham Neubig

2603.17826 2026-03-19 cs.SE cs.AI

FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair

Ruize Ma, Yilei Jiang, Shilin Zhang, Zheng Ma, Yi Feng, Vincent Ng, Zhi Wang, Xiangyu Yue, Chuanyi Li, Lewei Lu

2603.17822 2026-03-19 eess.AS cs.CL

Multi-Source Evidence Fusion for Audio Question Answering

Aivo Olev, Tanel Alumäe

2603.17785 2026-03-19 math.OC cs.AI

A Dual Certificate Approach to Sparsity in Infinite-Width Shallow Neural Networks

Leonardo Del Grande, Christoph Brune, Marcello Carioni

2603.17767 2026-03-19 cs.HC cs.CV

Facial Movement Dynamics Reveal Workload During Complex Multitasking

Carter Sale, Melissa N. Stolar, Gaurav Patil, Michael J. Gostelow, Julia Wallier, Margaret C. Macpherson, Jan-Louis Kruger, Mark Dras, Simon G. Hosking, Rachel W. Kallen, Michael J. Richardson

Comments: 26 pages, 7 figures, under review at Royal Society Open Science

2603.17704 2026-03-19 cs.GR cs.CV cs.HC

DancingBox: A Lightweight MoCap System for Character Animation from Physical Proxies

Haocheng Yuan, Adrien Bousseau, Hao Pan, Lei Zhong, Changjian Li

Comments: Accepted to CHI 2026