arXivDaily arXiv每日学术速递 周一至周五更新
全部学科分类 1261
专题追踪 全部专题
2510.04407 2026-02-18 cs.GT cs.LG

Scale-Invariant Regret Matching and Online Learning with Optimal Convergence: Bridging Theory and Practice in Zero-Sum Games

Brian Hu Zhang, Ioannis Anagnostides, Tuomas Sandholm

Comments Compared to the previous version, this version includes new results on harmonic games and extensive-form games. Abstract abridged due to arXiv length constraints

详情
英文摘要

A considerable chasm has been looming for decades between theory and practice in zero-sum game solving through first-order methods. Although a convergence rate of $T^{-1}$ has long been established, the most effective paradigm in practice is counterfactual regret minimization (CFR), which is based on regret matching and its modern variants. In particular, the state of the art across most benchmarks is predictive regret matching$^+$ (PRM$^+$). Yet, such algorithms can exhibit slower $T^{-1/2}$ convergence even in self-play. In this paper, we close the gap between theory and practice. We propose a new scale-invariant and parameter-free variant of PRM$^+$, which we call IREG-PRM$^+$. We show that it achieves $T^{-1/2}$ best-iterate and $T^{-1}$ (i.e., optimal) average-iterate convergence guarantees, while also being on par or even better relative to PRM$^+$ on benchmark games. From a technical standpoint, we draw an analogy between (IREG-)PRM$^+$ and optimistic gradient descent with adaptive learning rate. Reflecting this theoretical bridge, we find that the adaptive version of optimistic gradient descent we consider performs on par with IREG-PRM$^+$. This demystifies the effectiveness of the regret matching family vis-a-vis more standard optimization techniques. Moreover, we extend our analysis beyond zero-sum games to a family of variational inequality problems that includes harmonic games, as well as extensive-form games with fully-mixed equilibria, via a new and intriguing connection between CFR and harmonic games. Unlike prior work in harmonic games, our algorithms do not require knowing the underlying weights by virtue of scale invariance. Under the weighted Minty condition, we show that any algorithm satisfying a scale-invariant RVU property (such as IREG-PRM$^+$) has constant regret (in self-play) and $T^{-1/2}$ iterate convergence.

2509.14461 2026-02-18 quant-ph cs.CC cs.LG

Learning depth-3 circuits via quantum agnostic boosting

Srinivasan Arunachalam, Arkopal Dutt, Alexandru Gheorghiu, Michael de Oliveira

Comments 53 pages; Typos fixed for depth-3 circuits result

详情
英文摘要

We initiate the study of quantum agnostic learning of phase states with respect to a function class $\mathsf{C}\subseteq \{c:\{0,1\}^n\rightarrow \{0,1\}\}$: given copies of an unknown $n$-qubit state $|ψ\rangle$ which has fidelity $\textsf{opt}$ with a phase state $|ϕ_c\rangle=\frac{1}{\sqrt{2^n}}\sum_{x\in \{0,1\}^n}(-1)^{c(x)}|x\rangle$ for some $c\in \mathsf{C}$, output $|ϕ\rangle$ which has fidelity $|\langle ϕ| ψ\rangle|^2 \geq \textsf{opt}-\varepsilon$. To this end, we give agnostic learning protocols for the following classes: (i) Size-$t$ decision trees which runs in time $\textsf{poly}(n,t,1/\varepsilon)$. This also implies $k$-juntas can be agnostically learned in time $\textsf{poly}(n,2^k,1/\varepsilon)$. (ii) $s$-term DNF formulas in time $\textsf{poly}(n,(s/\varepsilon)^{\log \log (s/\varepsilon) \cdot \log(1/\varepsilon)})$. Our main technical contribution is a quantum agnostic boosting protocol which converts a weak agnostic learner, which outputs a parity state $|ϕ\rangle$ such that $|\langle ϕ|ψ\rangle|^2\geq \textsf{opt}/\textsf{poly}(n)$, into a strong learner which outputs a superposition of parity states $|ϕ'\rangle$ such that $|\langle ϕ'|ψ\rangle|^2\geq \textsf{opt} - \varepsilon$. Using quantum agnostic boosting, we obtain a $n^{O(\log(n/\varepsilon) \cdot \log \log n)}$-time algorithm for $\varepsilon$-learning $\textsf{poly}(n)$-sized depth-$3$ circuits (consisting of $\textsf{AND}$, $\textsf{OR}$, $\textsf{NOT}$ gates) in the uniform $\textsf{PAC}$ model given quantum examples. Classically, obtaining an algorithm with a similar complexity has been an open question in the $\textsf{PAC}$ model and our work answers this given quantum examples.

2509.07790 2026-02-18 nucl-th cs.LG

Nuclear Data Adjustment for Nonlinear Applications in the OECD/NEA WPNCS SG14 Benchmark -- A Bayesian Inverse UQ-based Approach for Data Assimilation

Christopher Brady, Xu Wu

Comments 31 pages, 9 tables, 8 figures, submitted to Nuclear Science and Engineering, included in proceedings of International Conference on Mathematics and Computational Methods Applied to Nuclear Science and Engineering (M&C 2025)

详情
英文摘要

The Organization for Economic Cooperation and Development (OECD) Working Party on Nuclear Criticality Safety (WPNCS) proposed a benchmark exercise to assess the performance of current nuclear data adjustment techniques applied to nonlinear applications and experiments with low correlation to applications. This work introduces Bayesian Inverse Uncertainty Quantification (IUQ) as a method for nuclear data adjustments in this benchmark, and compares IUQ to the more traditional methods of Generalized Linear Least Squares (GLLS) and Monte Carlo Bayes (MOCABA). Posterior predictions from IUQ showed agreement with GLLS and MOCABA for linear applications. When comparing GLLS, MOCABA, and IUQ posterior predictions to computed model responses using adjusted parameters, we observe that GLLS predictions fail to replicate computed response distributions for nonlinear applications, while MOCABA shows near agreement, and IUQ uses computed model responses directly. We also discuss observations on why experiments with low correlation to applications can be informative to nuclear data adjustments and identify some properties useful in selecting experiments for inclusion in nuclear data adjustment. Performance in this benchmark indicates potential for Bayesian IUQ in nuclear data adjustments.

2509.03561 2026-02-18 quant-ph cs.AI cs.LG

Quantum-Assisted Correlation Clustering

Antonio Macaluso, Supreeth Mysore Venkatesh, Diego Arenas, Matthias Klusch, Andreas Dengel

Comments To be published in IEEE QAI 2025 conference

详情
英文摘要

This work introduces a hybrid quantum-classical method to correlation clustering, a graph-based unsupervised learning task that seeks to partition the nodes in a graph based on pairwise agreement and disagreement. In particular, we adapt GCS-Q, a quantum-assisted solver originally designed for coalition structure generation, to maximize intra-cluster agreement in signed graphs through recursive divisive partitioning. The proposed method encodes each bipartitioning step as a quadratic unconstrained binary optimization problem, solved via quantum annealing. This integration of quantum optimization within a hierarchical clustering framework enables handling of graphs with arbitrary correlation structures, including negative edges, without relying on metric assumptions or a predefined number of clusters. Empirical evaluations on synthetic signed graphs and real-world hyperspectral imaging data demonstrate that, when adapted for correlation clustering, GCS-Q outperforms classical algorithms in robustness and clustering quality on real-world data and in scenarios with cluster size imbalance. Our results highlight the promise of hybrid quantum-classical optimization for advancing scalable and structurally-aware clustering techniques in graph-based unsupervised learning.

2509.02594 2026-02-18 q-bio.QM cs.AI cs.ET cs.IR

OpenAIs HealthBench in Action: Evaluating an LLM-Based Medical Assistant on Realistic Clinical Queries

Sandhanakrishnan Ravichandran, Shivesh Kumar, Rogerio Corga Da Silva, Miguel Romano, Reinhard Berkels, Michiel van der Heijden, Olivier Fail, Valentine Emmanuel Gnanapragasam

Comments 13 pages, two graphs

详情
英文摘要

Evaluating large language models (LLMs) on their ability to generate high-quality, accurate, situationally aware answers to clinical questions requires going beyond conventional benchmarks to assess how these systems behave in complex, high-stakes clinical scenarios. Traditional evaluations are often limited to multiple-choice questions that fail to capture essential competencies such as contextual reasoning, contextual awareness, and uncertainty handling. To address these limitations, we evaluate our agentic RAG-based clinical support assistant, DR. INFO, using HealthBench, a rubric-driven benchmark composed of open-ended, expert-annotated health conversations. On the Hard subset of 1,000 challenging examples, DR. INFO achieves a HealthBench Hard score of 0.68, outperforming leading frontier LLMs including the GPT-5 model family (GPT-5: 0.46, GPT-5.2: 0.42, GPT-5.1: 0.40), Grok 3 (0.23), Gemini 2.5 Pro (0.19), and Claude 3.7 Sonnet (0.02) across all behavioral axes (accuracy, completeness, instruction following, etc.). In a separate 100-sample evaluation against similar agentic RAG assistants (OpenEvidence and Pathway.md, now DoxGPT by Doximity), it maintains a performance lead with a HealthBench Hard score of 0.72. These results highlight the strengths of DR. INFO in communication, instruction following, and accuracy, while also revealing areas for improvement in context awareness and response completeness. Overall, the findings underscore the utility of behavior-level, rubric-based evaluation for building reliable and trustworthy AI-enabled clinical support systems.

2508.02766 2026-02-18 cs.CY cs.AI

The Generative Reasonable Person

Yonathan A. Arbel

Comments 51 pages, 8 figures

详情
英文摘要

This Article introduces the generative reasonable person, a new tool for estimating how ordinary people judge reasonableness. As claims about AI capabilities often outpace evidence, the Article proceeds empirically: adapting randomized controlled trials to large language models, it replicates three published studies of lay judgment across negligence, consent, and contract interpretation, drawing on nearly 10,000 simulated decisions. The findings reveal that models can replicate subtle patterns that run counter to textbook treatment. Like human subjects, models prioritize social conformity over cost-benefit analysis when assessing negligence, inverting the hierarchy that textbooks teach. They reproduce the paradox that material lies erode consent less than lies about a transaction's essence. And they track lay contract formalism, judging hidden fees more enforceable than fair. For two centuries, scholars have debated whether the reasonable person is empirical or normative, majoritarian or aspirational. But much of this debate assumed a constraint that no longer holds: that lay judgments are expensive to surface, slow to collect, and unavailable at scale. Generative reasonable people loosen that constraint. They offer judges empirical checks on elite intuition, give resource-constrained litigants access to simulated jury feedback, and let regulators pilot-test public comprehension, all at a fraction of survey costs. The reasonable person standard has long functioned as a vessel for judicial intuition precisely because the empirical baseline was missing. With that baseline now available, departures from lay understanding become transparent rather than hidden, a choice to be justified, not a fact to be assumed. Properly cabined, the generative reasonable person may become a dictionary for reasonableness judgments.

2507.16641 2026-02-18 quant-ph cs.AI cs.LG

Hybrid Reward-Driven Reinforcement Learning for Efficient Quantum Circuit Synthesis

Sara Giordano, Kornikar Sen, Miguel A. Martin-Delgado

Comments 35 pages, 7 figures, color figures

Journal ref Quantum Mach. Intell. 8, 9 (2026)

详情
英文摘要

A reinforcement learning (RL) framework is introduced for the efficient synthesis of quantum circuits that generate specified target quantum states from a fixed initial state, addressing a central challenge in both the Noisy Intermediate-Scale Quantum (NISQ) era and future fault-tolerant quantum computing. The approach utilizes tabular Q-learning, based on action sequences, within a discretized quantum state space, to effectively manage the exponential growth of the space dimension. The framework introduces a hybrid reward mechanism, combining a static, domain-informed reward that guides the agent toward the target state with customizable dynamic penalties that discourage inefficient circuit structures such as gate congestion and redundant state revisits. This is a circuit-aware reward, in contrast to the current trend of works on this topic, which are primarily fidelity-based. By leveraging sparse matrix representations and state-space discretization, the method enables practical navigation of high-dimensional environments while minimizing computational overhead. Benchmarking on graph-state preparation tasks for up to seven qubits, we demonstrate that the algorithm consistently discovers minimal-depth circuits with optimized gate counts. Moreover, extending the framework to a universal gate set still yields low depth circuits, highlighting the algorithm robustness and adaptability. The results confirm that this RL-driven approach, with our completely circuit-aware method, efficiently explores the complex quantum state space and synthesizes near-optimal quantum circuits, providing a resource-efficient foundation for quantum circuit optimization.

2507.12202 2026-02-18 cs.IR cs.AI cs.LG

Sparse Autoencoders for Sequential Recommendation Models: Interpretation and Flexible Control

Anton Klenitskiy, Konstantin Polev, Daria Denisova, Alexey Vasilev, Dmitry Simakov, Gleb Gusev

详情
英文摘要

Many current state-of-the-art models for sequential recommendations are based on transformer architectures. Interpretation and explanation of such black box models is an important research question, as a better understanding of their internals can help understand, influence, and control their behavior, which is very important in a variety of real-world applications. Recently, sparse autoencoders (SAE) have been shown to be a promising unsupervised approach to extract interpretable features from neural networks. In this work, we extend SAE to sequential recommender systems and propose a framework for interpreting and controlling model representations. We show that this approach can be successfully applied to the transformer trained on a sequential recommendation task: directions learned in such an unsupervised regime turn out to be more interpretable and monosemantic than the original hidden state dimensions. Further, we demonstrate a straightforward way to effectively and flexibly control the model's behavior, giving developers and users of recommendation systems the ability to adjust their recommendations to various custom scenarios and contexts.

2506.20367 2026-02-18 cs.GR cs.CV

DreamAnywhere: Object-Centric Panoramic 3D Scene Generation

Edoardo Alberto Dominici, Jozef Hladky, Floor Verhoeven, Lukas Radl, Thomas Deixelberger, Stefan Ainetter, Philipp Drescher, Stefan Hauswiesner, Arno Coomans, Giacomo Nazzaro, Konstantinos Vardis, Markus Steinberger

Comments WACV 2026 Oral

详情
英文摘要

Recent advances in text-to-3D scene generation have demonstrated significant potential to transform content creation across multiple industries. Although the research community has made impressive progress in addressing the challenges of this complex task, existing methods often generate environments that are only front-facing, lack visual fidelity, exhibit limited scene understanding, and are typically fine-tuned for either indoor or outdoor settings. In this work, we address these issues and propose DreamAnywhere, a modular system for the fast generation and prototyping of 3D scenes. Our system synthesizes a 360° panoramic image from text, decomposes it into background and objects, constructs a complete 3D representation through hybrid inpainting, and lifts object masks to detailed 3D objects that are placed in the virtual environment. DreamAnywhere supports immersive navigation and intuitive object-level editing, making it ideal for scene exploration, visual mock-ups, and rapid prototyping -- all with minimal manual modeling. These features make our system particularly suitable for low-budget movie production, enabling quick iteration on scene layout and visual tone without the overhead of traditional 3D workflows. Our modular pipeline is highly customizable as it allows components to be replaced independently. Compared to current state-of-the-art text and image-based 3D scene generation approaches, DreamAnywhere shows significant improvements in coherence in novel view synthesis and achieves competitive image quality, demonstrating its effectiveness across diverse and challenging scenarios. A comprehensive user study demonstrates a clear preference for our method over existing approaches, validating both its technical robustness and practical usefulness.

2506.15723 2026-02-18 q-fin.ST cs.LG econ.GN q-fin.EC stat.AP

Modern approaches to building interpretable models of the property market using machine learning on the base of mass cadastral valuation

Alexey S. Tanashkin, Irina G. Tanashkina, Alexander S. Maksimchuik

Comments 62 pages, 21 figures, 11 tables; after the major revision, accepted in journal Land Use Policy; changes: literature review is added to introduction section, new conclusion, comparison of the models with the random forest is added, the feature selection section is reconsidered, many minor corrections, language sufficiently improved

Journal ref Land Use Policy, Volume 165, 2026, 107970

详情
英文摘要

In this paper, we review modern approaches to building interpretable models of property markets using machine learning on the base of mass valuation of property in the Primorye region, Russia. There are numerous potential difficulties one could encounter in the effort to build a good model. Their main source is the huge difference between noisy real market data and ideal data usually used in tutorials on machine learning. This paper covers all stages of modeling: collection of initial data, identification of outliers, search and analysis of patterns in the data, formation and final choice of price factors, building of the model, and evaluation of its efficiency. For each stage, we highlight potential issues and describe sound methods for overcoming emerging difficulties on actual examples. We show that the combination of classical linear regression with kriging (interpolation method of geostatistics) allows to build an effective model for land parcels. For flats, when many objects are attributed to one spatial point, the application of geostatistical methods becomes problematic. Instead, we suggest linear regression with automatic generation and selection of additional rules on the base of decision trees, so called the RuleFit method. We compare the performance of our inherently interpretable models with well-proven "black-box" Random Forest method and demonstrate similar results. Thus we show, that despite such a strong restriction as the requirement of interpretability which is important in practical aspects, for example, legal matters, it is still possible to build effective models of real property markets.

2504.15370 2026-02-18 physics.chem-ph cond-mat.mtrl-sci cs.LG

Transferable Learning of Reaction Pathways from Geometric Priors

Juno Nam, Miguel Steiner, Max Misterka, Soojung Yang, Avni Singhal, Rafael Gómez-Bombarelli

Comments 14 pages, 6 figures; Supporting Information in ancillary files

详情
英文摘要

Identifying minimum-energy paths (MEPs) is crucial for understanding chemical reaction mechanisms but remains computationally demanding. We introduce MEPIN, a scalable machine-learning method for efficiently predicting MEPs from reactant and product configurations, without relying on transition-state geometries or pre-optimized reaction paths during training. The task is defined as predicting deviations from geometric interpolations along reaction coordinates. We address this task with a continuous reaction path model based on a symmetry-broken equivariant neural network that generates a flexible number of intermediate structures. The model is trained using an energy-based objective, with efficiency enhanced by incorporating geometric priors from geodesic interpolation as initial interpolations or pre-training objectives. Our approach generalizes across diverse chemical reactions and achieves accurate alignment with reference intrinsic reaction coordinates, as demonstrated on various small molecule reactions and [3+2] cycloadditions. Our method enables the exploration of large chemical reaction spaces with efficient, data-driven predictions of reaction pathways.

2504.14696 2026-02-18 cs.IT cs.CR cs.DS cs.LG math.IT

Reveal-or-Obscure: A Differentially Private Sampling Algorithm for Discrete Distributions

Naima Tasnim, Atefeh Gilani, Lalitha Sankar, Oliver Kosut

Comments 8 pages, 3 figures

详情
英文摘要

We introduce a differentially private (DP) algorithm called reveal-or-obscure (ROO) to generate a single representative sample from a dataset of $n$ observations drawn i.i.d. from an unknown discrete distribution $P$. Unlike methods that add explicit noise to the estimated empirical distribution, ROO achieves $ε$-differential privacy by randomly choosing whether to "reveal" or "obscure" the empirical distribution. While ROO is structurally identical to Algorithm 1 proposed by Cheu and Nayak (arXiv:2412.10512), we prove a strictly better bound on the sampling complexity than that established in Theorem 12 of (arXiv:2412.10512). To further improve the privacy-utility trade-off, we propose a novel generalized sampling algorithm called Data-Specific ROO (DS-ROO), where the probability of obscuring the empirical distribution of the dataset is chosen adaptively. We prove that DS-ROO satisfies $ε$-DP, and provide empirical evidence that DS-ROO can achieve better utility under the same privacy budget of vanilla ROO.

2504.08456 2026-02-18 quant-ph cs.AI

Generalization Bounds in Hybrid Quantum-Classical Machine Learning Models

Tongyan Wu, Amine Bentellis, Alona Sakhnenko, Jeanette Miriam Lorenz

Comments Added activation function and multi layer formulations for the generalization bound

详情
英文摘要

Hybrid classical-quantum models aim to harness the strengths of both quantum computing and classical machine learning, but their practical potential remains poorly understood. In this work, we develop a unified mathematical framework for analyzing generalization in hybrid models, offering insight into how these systems learn from data. We establish a novel generalization bound of the form $\tilde{\mathcal O}\left( \tfrac{α^{k}}{\sqrt{N}}\, \big( k^{\tfrac{3}{2}}\sqrt{m n}\;+\;\sqrt{T\log T}\big) \right)$ for $N$ training data points, $T$ trainable quantum gates, $n$ dimensional quantum circuit output, and $k$ bounded linear layers $ \|F_i\|_F \leq α$ where $ i = 1, \dots, k $ and $F_i \in \mathbb{R}^{m \times n} $ interspersed with activation functions. This generalization bound decomposes into quantum and classical contributions, providing a theoretical framework to separate their influence and clarifying their interaction. Alongside the bound, we highlight conceptual limitations of applying classical statistical learning theory in the hybrid setting and suggest promising directions for future theoretical work.

2502.18545 2026-02-18 cs.CR cs.AI cs.CL cs.IR

PII-Bench: Evaluating Query-Aware Privacy Protection Systems

Hao Shen, Zhouhong Gu, Haokai Hong, Weili Han

详情
英文摘要

The widespread adoption of Large Language Models (LLMs) has raised significant privacy concerns regarding the exposure of personally identifiable information (PII) in user prompts. To address this challenge, we propose a query-unrelated PII masking strategy and introduce PII-Bench, the first comprehensive evaluation framework for assessing privacy protection systems. PII-Bench comprises 2,842 test samples across 55 fine-grained PII categories, featuring diverse scenarios from single-subject descriptions to complex multi-party interactions. Each sample is carefully crafted with a user query, context description, and standard answer indicating query-relevant PII. Our empirical evaluation reveals that while current models perform adequately in basic PII detection, they show significant limitations in determining PII query relevance. Even state-of-the-art LLMs struggle with this task, particularly in handling complex multi-subject scenarios, indicating substantial room for improvement in achieving intelligent PII masking.

2410.11855 2026-02-18 cs.DC cs.AI cs.AR cs.LG

Online GPU Energy Optimization with Switching-Aware Bandits

Xiongxiao Xu, Solomon Abera Bekele, Brice Videau, Kai Shu

Comments ACM Web Conference 2026 (WWW'26)

详情
英文摘要

Energy consumption has become a bottleneck for future computing architectures, from wearable devices to leadership-class supercomputers. Existing energy management techniques largely target CPUs, even though GPUs now dominate power draw in heterogeneous high performance computing (HPC) systems. Moreover, many prior methods rely on either purely offline or hybrid offline and online training, which is impractical and results in energy inefficiencies during data collection. In this paper, we introduce a practical online GPU energy optimization problem in a HPC scenarios. The problem is challenging because (1) GPU frequency scaling exhibits performance-energy trade-offs, (2) online control must balance exploration and exploitation, and (3) frequent frequency switching incurs non-trivial overhead and degrades quality of service (QoS). To address the challenges, we formulate online GPU energy optimization as a multi-armed bandit problem and propose EnergyUCB, a lightweight UCB-based controller that dynamically adjusts GPU core frequency in real time to save energy. Specifically, EnergyUCB (1) defines a reward that jointly captures energy and performance using a core-to-uncore utilization ratio as a proxy for GPU throughput, (2) employs optimistic initialization and UCB-style confidence bonuses to accelerate learning from scratch, and (3) incorporates a switching-aware UCB index and a QoS-constrained variant that enforce explicit slowdown budgets while discouraging unnecessary frequency oscillations. Extensive experiments on real-world workloads from the world's third fastest supercomputer Aurora show that EnergyUCB achieves substantial energy savings with modest slowdown and that the QoS-constrained variant reliably respects user-specified performance budgets.

2405.20178 2026-02-18 eess.SY cs.LG cs.SY

Non-intrusive data-driven model order reduction for circuits based on Hammerstein architectures

Joshua Hanson, Paul Kuberry, Biliana Paskaleva, Pavel Bochev

Comments 14 pages, 18 figures; accepted to IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Journal ref IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 44(6) (2025) 2314-2327

详情
英文摘要

We demonstrate that system identification techniques can provide a basis for effective, non-intrusive model order reduction (MOR) for common circuits that are key building blocks in microelectronics. Our approach is motivated by the practical operation of these circuits and utilizes a canonical Hammerstein architecture. To demonstrate the approach we develop parsimonious Hammerstein models for a nonlinear CMOS differential amplifier and an operational amplifier circuit. We train these models on a combination of direct current (DC) and transient Spice circuit simulation data using a novel sequential strategy to identify their static nonlinear and linear dynamical parts. Simulation results show that the Hammerstein model is an effective surrogate for for these types of circuits that accurately and efficiently reproduces their behavior over a wide range of operating points and input frequencies.

2309.08615 2026-02-18 cs.CY cs.AI cs.PF

Energy Concerns with HPC Systems and Applications

Roblex Nana, Claude Tadonki, Petr Dokladal, Youssef Mesri

Comments 20 pages

详情
英文摘要

For various reasons including those related to climate changes, {\em energy} has become a critical concern in all relevant activities and technical designs. For the specific case of computer activities, the problem is exacerbated with the emergence and pervasiveness of the so called {\em intelligent devices}. From the application side, we point out the special topic of {\em Artificial Intelligence}, who clearly needs an efficient computing support in order to succeed in its purpose of being a {\em ubiquitous assistant}. There are mainly two contexts where {\em energy} is one of the top priority concerns: {\em embedded computing} and {\em supercomputing}. For the former, power consumption is critical because the amount of energy that is available for the devices is limited. For the latter, the heat dissipated is a serious source of failure and the financial cost related to energy is likely to be a significant part of the maintenance budget. On a single computer, the problem is commonly considered through the electrical power consumption. This paper, written in the form of a survey, we depict the landscape of energy concerns in computer activities, both from the hardware and the software standpoints.

2211.01231 2026-02-18 eess.SY cs.AI cs.SY

Interval Markov Decision Processes with Continuous Action-Spaces

Giannis Delimpaltadakis, Morteza Lahijanian, Manuel Mazo, Luca Laurenti

Comments This work will be presented at the 26th ACM International Conference on Hybrid Systems Computation and Control (HSCC), 09-12 May, 2023, San Antonio, TX, USA

Journal ref 26th ACM International Conference on Hybrid Systems Computation and Control (HSCC), 2023

详情
英文摘要

Interval Markov Decision Processes (IMDPs) are finite-state uncertain Markov models, where the transition probabilities belong to intervals. Recently, there has been a surge of research on employing IMDPs as abstractions of stochastic systems for control synthesis. However, due to the absence of algorithms for synthesis over IMDPs with continuous action-spaces, the action-space is assumed discrete a-priori, which is a restrictive assumption for many applications. Motivated by this, we introduce continuous-action IMDPs (caIMDPs), where the bounds on transition probabilities are functions of the action variables, and study value iteration for maximizing expected cumulative rewards. Specifically, we decompose the max-min problem associated to value iteration to $|\mathcal{Q}|$ max problems, where $|\mathcal{Q}|$ is the number of states of the caIMDP. Then, exploiting the simple form of these max problems, we identify cases where value iteration over caIMDPs can be solved efficiently (e.g., with linear or convex programming). We also gain other interesting insights: e.g., in certain cases where the action set $\mathcal{A}$ is a polytope, synthesis over a discrete-action IMDP, where the actions are the vertices of $\mathcal{A}$, is sufficient for optimality. We demonstrate our results on a numerical example. Finally, we include a short discussion on employing caIMDPs as abstractions for control synthesis.

2602.15470 2026-02-18 physics.soc-ph cs.LG

The Skeletal Trap: Mapping Spatial Inequality and Ghost Stops in Ankara's Transit Network

Elifnaz Kancan

Comments 13 pages, 12 figures. Spatial analysis of Ankara transit network using anomaly detection and grid-based modeling

详情
英文摘要

Ankara's public transport crisis is commonly framed as a shortage of buses or operational inefficiency. This study argues that the problem is fundamentally morphological and structural. The city's leapfrog urban expansion has produced fragmented peripheral clusters disconnected from a rigid, center-oriented bus network. As a result, demand remains intensely concentrated along the Kizilay-Ulus axis and western corridors, while peripheral districts experience either chronic under-service or enforced transfer dependency. The deficiency is therefore not merely quantitative but rooted in the misalignment between urban macroform and network architecture. The empirical analysis draws on a 173-day operational dataset derived from route-level passenger and trip reports published by EGO under the former "Transparent Ankara" initiative. To overcome the absence of stop-level geospatial data, a Connectivity-Based Weighted Distribution Model reallocates passenger volumes to 1 km x 1 km grid cells using network centrality. The findings reveal persistent center-periphery asymmetries, structural bottlenecks, and spatially embedded accessibility inequalities.

2602.15439 2026-02-18 cs.CY cs.AI cs.SI

Algorithmic Approaches to Opinion Selection for Online Deliberation: A Comparative Study

Salim Hafid, Manon Berriche, Jean-Philippe Cointet

详情
英文摘要

During deliberation processes, mediators and facilitators typically need to select a small and representative set of opinions later used to produce digestible reports for stakeholders. In online deliberation platforms, algorithmic selection is increasingly used to automate this process. However, such automation is not without consequences. For instance, enforcing consensus-seeking algorithmic strategies can imply ignoring or flattening conflicting preferences, which may lead to erasing minority voices and reducing content diversity. More generally, across the variety of existing selection strategies (e.g., consensus, diversity), it remains unclear how each approach influences desired democratic criteria such as proportional representation. To address this gap, we benchmark several algorithmic approaches in this context. We also build on social choice theory to propose a novel algorithm that incorporates both diversity and a balanced notion of representation in the selection strategy. We find empirically that while no single strategy dominates across all democratic desiderata, our social-choice-inspired selection rule achieves the strongest trade-off between proportional representation and diversity.

2602.15381 2026-02-18 cs.IR cs.CV

Automatic Funny Scene Extraction from Long-form Cinematic Videos

Sibendu Paul, Haotian Jiang, Caren Chen

Journal ref Association for the Advancement of Artificial Intelligence 2026

详情
英文摘要

Automatically extracting engaging and high-quality humorous scenes from cinematic titles is pivotal for creating captivating video previews and snackable content, boosting user engagement on streaming platforms. Long-form cinematic titles, with their extended duration and complex narratives, challenge scene localization, while humor's reliance on diverse modalities and its nuanced style add further complexity. This paper introduces an end-to-end system for automatically identifying and ranking humorous scenes from long-form cinematic titles, featuring shot detection, multimodal scene localization, and humor tagging optimized for cinematic content. Key innovations include a novel scene segmentation approach combining visual and textual cues, improved shot representations via guided triplet mining, and a multimodal humor tagging framework leveraging both audio and text. Our system achieves an 18.3% AP improvement over state-of-the-art scene detection on the OVSD dataset and an F1 score of 0.834 for detecting humor in long text. Extensive evaluations across five cinematic titles demonstrate 87% of clips extracted by our pipeline are intended to be funny, while 98% of scenes are accurately localized. With successful generalization to trailers, these results showcase the pipeline's potential to enhance content creation workflows, improve user engagement, and streamline snackable content generation for diverse cinematic media formats.

2602.15379 2026-02-18 cs.DC cs.LG

FlashMem: Supporting Modern DNN Workloads on Mobile with GPU Memory Hierarchy Optimizations

Zhihao Shu, Md Musfiqur Rahman Sanim, Hangyu Zheng, Kunxiong Zhu, Miao Yin, Gagan Agrawal, Wei Niu

详情
英文摘要

The increasing size and complexity of modern deep neural networks (DNNs) pose significant challenges for on-device inference on mobile GPUs, with limited memory and computational resources. Existing DNN acceleration frameworks primarily deploy a weight preloading strategy, where all model parameters are loaded into memory before execution on mobile GPUs. We posit that this approach is not adequate for modern DNN workloads that comprise very large model(s) and possibly execution of several distinct models in succession. In this work, we introduce FlashMem, a memory streaming framework designed to efficiently execute large-scale modern DNNs and multi-DNN workloads while minimizing memory consumption and reducing inference latency. Instead of fully preloading weights, FlashMem statically determines model loading schedules and dynamically streams them on demand, leveraging 2.5D texture memory to minimize data transformations and improve execution efficiency. Experimental results on 11 models demonstrate that FlashMem achieves 2.0x to 8.4x memory reduction and 1.7x to 75.0x speedup compared to existing frameworks, enabling efficient execution of large-scale models and multi-DNN support on resource-constrained mobile GPUs.

2602.15376 2026-02-18 cs.CR cs.AI

A Unified Evaluation of Learning-Based Similarity Techniques for Malware Detection

Udbhav Prasad, Aniesh Chawla

详情
英文摘要

Cryptographic digests (e.g., MD5, SHA-256) are designed to provide exact identity. Any single-bit change in the input produces a completely different hash, which is ideal for integrity verification but limits their usefulness in many real-world tasks like threat hunting, malware analysis and digital forensics, where adversaries routinely introduce minor transformations. Similarity-based techniques address this limitation by enabling approximate matching, allowing related byte sequences to produce measurably similar fingerprints. Modern enterprises manage tens of thousands of endpoints with billions of files, making the effectiveness and scalability of the proposed techniques more important than ever in security applications. Security researchers have proposed a range of approaches, including similarity digests and locality-sensitive hashes (e.g., ssdeep, sdhash, TLSH), as well as more recent machine-learning-based methods that generate embeddings from file features. However, these techniques have largely been evaluated in isolation, using disparate datasets and evaluation criteria. This paper presents a systematic comparison of learning-based classification and similarity methods using large, publicly available datasets. We evaluate each method under a unified experimental framework with industry-accepted metrics. To our knowledge, this is the first reproducible study to benchmark these diverse learning-based similarity techniques side by side for real-world security workloads. Our results show that no single approach performs well across all dimensions; instead, each exhibits distinct trade-offs, indicating that effective malware analysis and threat-hunting platforms must combine complementary classification and similarity techniques rather than rely on a single method.

2602.15362 2026-02-18 cs.SE cs.AI

Automated Multi-Source Debugging and Natural Language Error Explanation for Dashboard Applications

Devendra Tata, Mona Rajhans

Comments Accepted for publication at the 12th (Springer CCIS) International Conference on Information Management, March 27-29, 2026, Oxford, UK

详情
英文摘要

Modern web dashboards and enterprise applications increasingly rely on complex, distributed microservices architectures. While these architectures offer scalability, they introduce significant challenges in debugging and observability. When failures occur, they often manifest as opaque error messages to the end-user such as Something went wrong. This masks the underlying root cause which may reside in browser side exceptions, API contract violations, or server side logic failures. Existing monitoring tools capture these events in isolation but fail to correlate them effectively or provide intelligible explanations to non technical users. This paper proposes a novel system for Automated Multi Source Debugging and Natural Language Error Explanation. The proposed framework automatically collects and correlates error data from disparate sources such as browser, API, server logs and validates API contracts in real time, and utilizes Large Language Models to generate natural language explanations. This approach significantly reduces Mean Time to Resolution for support engineers and improves the user experience by transforming cryptic error codes into actionable insights.

2602.15350 2026-02-18 eess.SY cs.AI cs.SY

Fine-Tuning LLMs to Generate Economical and Reliable Actions for the Power Grid

Mohamad Chehade, Hao Zhu

详情
英文摘要

Public Safety Power Shutoffs (PSPS) force rapid topology changes that can render standard operating points infeasible, requiring operators to quickly identify corrective transmission switching actions that reduce load shedding while maintaining acceptable voltage behavior. We present a verifiable, multi-stage adaptation pipeline that fine-tunes an instruction-tuned large language model (LLM) to generate \emph{open-only} corrective switching plans from compact PSPS scenario summaries under an explicit switching budget. First, supervised fine-tuning distills a DC-OPF MILP oracle into a constrained action grammar that enables reliable parsing and feasibility checks. Second, direct preference optimization refines the policy using AC-evaluated preference pairs ranked by a voltage-penalty metric, injecting voltage-awareness beyond DC imitation. Finally, best-of-$N$ selection provides an inference-time addition by choosing the best feasible candidate under the target metric. On IEEE 118-bus PSPS scenarios, fine-tuning substantially improves DC objective values versus zero-shot generation, reduces AC power-flow failure from 50\% to single digits, and improves voltage-penalty outcomes on the common-success set. Code and data-generation scripts are released to support reproducibility.

2602.15339 2026-02-18 eess.IV cs.AI cs.CV

Benchmarking Self-Supervised Models for Cardiac Ultrasound View Classification

Youssef Megahed, Salma I. Megahed, Robin Ducharme, Inok Lee, Adrian D. C. Chan, Mark C. Walker, Steven Hawken

Comments 10 pages, 3 figures, 3 tables

详情
英文摘要

Reliable interpretation of cardiac ultrasound images is essential for accurate clinical diagnosis and assessment. Self-supervised learning has shown promise in medical imaging by leveraging large unlabelled datasets to learn meaningful representations. In this study, we evaluate and compare two self-supervised learning frameworks, USF-MAE, developed by our team, and MoCo v3, on the recently introduced CACTUS dataset (37,736 images) for automated simulated cardiac view (A4C, PL, PSAV, PSMV, Random, and SC) classification. Both models used 5-fold cross-validation, enabling robust assessment of generalization performance across multiple random splits. The CACTUS dataset provides expert-annotated cardiac ultrasound images with diverse views. We adopt an identical training protocol for both models to ensure a fair comparison. Both models are configured with a learning rate of 0.0001 and a weight decay of 0.01. For each fold, we record performance metrics including ROC-AUC, accuracy, F1-score, and recall. Our results indicate that USF-MAE consistently outperforms MoCo v3 across metrics. The average testing AUC for USF-MAE is 99.99% (+/-0.01% 95% CI), compared to 99.97% (+/-0.01%) for MoCo v3. USF-MAE achieves a mean testing accuracy of 99.33% (+/-0.18%), higher than the 98.99% (+/-0.28%) reported for MoCo v3. Similar trends are observed for the F1-score and recall, with improvements statistically significant across folds (paired t-test, p=0.0048 < 0.01). This proof-of-concept analysis suggests that USF-MAE learns more discriminative features for cardiac view classification than MoCo v3 when applied to this dataset. The enhanced performance across multiple metrics highlights the potential of USF-MAE for improving automated cardiac ultrasound classification.

2602.15326 2026-02-18 eess.SP cs.AI cs.DC cs.LG

SCENE OTA-FD: Self-Centering Noncoherent Estimator for Over-the-Air Federated Distillation

Hao Chen, Zavareh Bozorgasl

Comments Work in progress. Codes will be available on: https://github.com/zavareh1

详情
英文摘要

We propose SCENE (Self-Centering Noncoherent Estimator), a pilot-free and phase-invariant aggregation primitive for over-the-air federated distillation (OTA-FD). Each device maps its soft-label (class-probability) vector to nonnegative transmit energies under constant per-round power and constant-envelope signaling (PAPR near 1). At the server, a self-centering energy estimator removes the noise-energy offset and yields an unbiased estimate of the weighted soft-label average, with variance decaying on the order of 1/(SM) in the number of receive antennas M and repetition factor S. We also develop a pilot-free ratio-normalized variant that cancels unknown large-scale gains, provide a convergence bound consistent with coherent OTA-FD analyses, and present an overhead-based crossover comparison. SCENE targets short-coherence and hardware-constrained regimes, where avoiding per-round CSI is essential: it trades a modest noncoherent variance constant for zero uplink pilots, unbiased aggregation, and hardware-friendly transmission, and can outperform coherent designs when pilot overhead is non-negligible.

2602.15323 2026-02-18 cs.CR cs.AI cs.LG

Unforgeable Watermarks for Language Models via Robust Signatures

Huijia Lin, Kameron Shahabi, Min Jae Song

Comments 60 pages, 7 figures

详情
英文摘要

Language models now routinely produce text that is difficult to distinguish from human writing, raising the need for robust tools to verify content provenance. Watermarking has emerged as a promising countermeasure, with existing work largely focused on model quality preservation and robust detection. However, current schemes provide limited protection against false attribution. We strengthen the notion of soundness by introducing two novel guarantees: unforgeability and recoverability. Unforgeability prevents adversaries from crafting false positives, texts that are far from any output from the watermarked model but are nonetheless flagged as watermarked. Recoverability provides an additional layer of protection: whenever a watermark is detected, the detector identifies the source text from which the flagged content was derived. Together, these properties strengthen content ownership by linking content exclusively to its generating model, enabling secure attribution and fine-grained traceability. We construct the first undetectable watermarking scheme that is robust, unforgeable, and recoverable with respect to substitutions (i.e., perturbations in Hamming metric). The key technical ingredient is a new cryptographic primitive called robust (or recoverable) digital signatures, which allow verification of messages that are close to signed ones, while preventing forgery of messages that are far from all previously signed messages. We show that any standard digital signature scheme can be boosted to a robust one using property-preserving hash functions (Boyle, LaVigne, and Vaikuntanathan, ITCS 2019).

2602.15307 2026-02-18 eess.AS cs.SD

What Do Neurons Listen To? A Neuron-level Dissection of a General-purpose Audio Model

Takao Kawamura, Daisuke Niizumi, Nobutaka Ono

Comments 5 pages, 8 figures. Submitted to EUSIPCO 2026

详情
英文摘要

In this paper, we analyze the internal representations of a general-purpose audio self-supervised learning (SSL) model from a neuron-level perspective. Despite their strong empirical performance as feature extractors, the internal mechanisms underlying the robust generalization of SSL audio models remain unclear. Drawing on the framework of mechanistic interpretability, we identify and examine class-specific neurons by analyzing conditional activation patterns across diverse tasks. Our analysis reveals that SSL models foster the emergence of class-specific neurons that provide extensive coverage across novel task classes. These neurons exhibit shared responses across different semantic categories and acoustic similarities, such as speech attributes and musical pitch. We also confirm that these neurons have a functional impact on classification performance. To our knowledge, this is the first systematic neuron-level analysis of a general-purpose audio SSL model, providing new insights into its internal representation.

2602.15306 2026-02-18 stat.ML cs.LG

Sparse Additive Model Pruning for Order-Based Causal Structure Learning

Kentaro Kanamori, Hirofumi Suzuki, Takuya Takagi

Comments 15 pages, 12 figures, to appear in the 40th AAAI Conference on Artificial Intelligence (AAAI 2026)

详情
英文摘要

Causal structure learning, also known as causal discovery, aims to estimate causal relationships between variables as a form of a causal directed acyclic graph (DAG) from observational data. One of the major frameworks is the order-based approach that first estimates a topological order of the underlying DAG and then prunes spurious edges from the fully-connected DAG induced by the estimated topological order. Previous studies often focus on the former ordering step because it can dramatically reduce the search space of DAGs. In practice, the latter pruning step is equally crucial for ensuring both computational efficiency and estimation accuracy. Most existing methods employ a pruning technique based on generalized additive models and hypothesis testing, commonly known as CAM-pruning. However, this approach can be a computational bottleneck as it requires repeatedly fitting additive models for all variables. Furthermore, it may harm estimation quality due to multiple testing. To address these issues, we introduce a new pruning method based on sparse additive models, which enables direct pruning of redundant edges without relying on hypothesis testing. We propose an efficient algorithm for learning sparse additive models by combining the randomized tree embedding technique with group-wise sparse regression. Experimental results on both synthetic and real datasets demonstrated that our method is significantly faster than existing pruning methods while maintaining comparable or superior accuracy.