arXivDaily: arXiv daily academic digest, updated Monday through Friday
2604.01224 2026-04-02 cs.RO

Functional Force-Aware Retargeting from Virtual Human Demos to Soft Robot Policies

Uksang Yoo, Mengjia Zhu, Evan Pezent, Jom Preechayasomboon, Jean Oh, Jeffrey Ichnowski, Amir Memar, Ben Abbatematteo, Homanga Bharadhwaj, Ashish Deshpande, Harsha Prahlad

We introduce SoftAct, a framework for teaching soft robot hands to perform human-like manipulation skills by explicitly reasoning about contact forces. Leveraging immersive virtual reality, our system captures rich human demonstrations, including hand kinematics, object motion, dense contact patches, and detailed contact force information. Unlike conventional approaches that retarget human joint trajectories, SoftAct employs a two-stage, force-aware retargeting algorithm. The first stage attributes demonstrated contact forces to individual human fingers and allocates robot fingers proportionally, establishing a force-balanced mapping between human and robot hands. The second stage performs online retargeting by combining baseline end-effector pose tracking with geodesic-weighted contact refinements, using contact geometry and force magnitude to adjust robot fingertip targets in real time. This formulation enables soft robotic hands to reproduce the functional intent of human demonstrations while naturally accommodating extreme embodiment mismatch and nonlinear compliance. We evaluate SoftAct on a suite of contact-rich manipulation tasks using a custom non-anthropomorphic pneumatic soft robot hand. SoftAct's controller reduces fingertip trajectory tracking RMSE by up to 55 percent and reduces tracking variance by up to 69 percent compared to kinematic and learning-based baselines. At the policy level, SoftAct achieves consistently higher success in zero-shot real-world deployment and in simulation. These results demonstrate that explicitly modeling contact geometry and force distribution is essential for effective skill transfer to soft robotic hands, and cannot be recovered through kinematic imitation alone. Project videos and additional details are available at https://soft-act.github.io/.

2604.01221 2026-04-02 cs.AI cs.CV

HippoCamp: Benchmarking Contextual Agents on Personal Computers

Zhe Yang, Shulin Tian, Kairui Hu, Shuai Liu, Hoang-Nhat Nguyen, Yichi Zhang, Zujin Guo, Mengying Yu, Zinan Zhang, Jingkang Yang, Chen Change Loy, Ziwei Liu

Comments Project Page: https://hippocamp-ai.github.io/

We present HippoCamp, a new benchmark designed to evaluate agents' capabilities on multimodal file management. Unlike existing agent benchmarks that focus on tasks like web interaction, tool use, or software automation in generic settings, HippoCamp evaluates agents in user-centric environments to model individual user profiles and search massive personal files for context-aware reasoning. Our benchmark instantiates device-scale file systems over real-world profiles spanning diverse modalities, comprising 42.4 GB of data across over 2K real-world files. Building upon the raw files, we construct 581 QA pairs to assess agents' capabilities in search, evidence perception, and multi-step reasoning. To facilitate fine-grained analysis, we provide 46.1K densely annotated structured trajectories for step-wise failure diagnosis. We evaluate a wide range of state-of-the-art multimodal large language models (MLLMs) and agentic methods on HippoCamp. Our comprehensive experiments reveal a significant performance gap: even the most advanced commercial models achieve only 48.3% accuracy in user profiling, struggling particularly with long-horizon retrieval and cross-modal reasoning within dense personal file systems. Furthermore, our step-wise failure diagnosis identifies multimodal perception and evidence grounding as the primary bottlenecks. Ultimately, HippoCamp exposes the critical limitations of current agents in realistic, user-centric environments and provides a robust foundation for developing next-generation personal AI assistants.

2604.01220 2026-04-02 cs.CL

Universal YOCO for Efficient Depth Scaling

Yutao Sun, Li Dong, Tianzhu Ye, Shaohan Huang, Jianyong Wang, Furu Wei

The rise of test-time scaling has remarkably boosted the reasoning and agentic proficiency of Large Language Models (LLMs). Yet, standard Transformers struggle to scale inference-time compute efficiently, as conventional looping strategies suffer from high computational overhead and a KV cache that inflates alongside model depth. We present Universal YOCO (YOCO-U), which combines the YOCO decoder-decoder architecture with recursive computation to achieve a synergistic effect greater than either alone. Built on the YOCO framework, YOCO-U implements a Universal Self-Decoder that performs multiple iterations via parameter sharing, while confining the iterative process to shallow, efficient-attention layers. This combination yields a favorable capability-efficiency tradeoff that neither YOCO nor recursion achieves independently. The YOCO architecture provides a constant global KV cache and linear pre-filling, while partial recursion enhances representational depth with limited overhead. Together, YOCO-U improves token utility and scaling behavior while maintaining efficient inference. Empirical results confirm that YOCO-U remains highly competitive in general and long-context benchmarks, demonstrating that the integration of efficient-attention architectures and recursive computation is a promising direction for scalable LLMs.

2604.01217 2026-04-02 quant-ph cond-mat.other cs.IT hep-th math-ph math.IT math.MP

Conditional channel entropy sets fundamental limits on thermodynamic quantum information processing

Himanshu Badhani, Siddhartha Das

Comments 33+20 pages, 1 table, 3 figures

The thermodynamic resourcefulness of quantum channels primarily depends on their underlying causal structure and their ability to generate quantum correlations. We quantify this interplay within the resource theory of athermality for bipartite quantum channels in the presence of a side channel acting as memory, referred to as the resource theory of conditional athermality. For channels with trivial output Hamiltonians, we characterize the optimal one-shot rates for distilling the identity gate from a given channel, as well as the cost of simulating the channel using the identity gate, under conditional Gibbs-preserving superchannels. We show that these rates have a direct trade-off relation with the conditional channel entropies, attributing operational significance to signaling in quantum processes. Furthermore, we establish an equipartition property for the conditional channel min-entropy for classes of channels that are either tele-covariant or no-signaling from the non-conditioning input to the conditioning output. As a consequence, we demonstrate asymptotic reversibility of the resource theory for these channels. The asymptotic conditional athermality capacity of a tele-covariant channel is half the superdense coding capacity of its Choi state. Our work establishes the conditional channel entropy as a primitive information-theoretic concept for quantum processes, elucidating its potential for wider applications in quantum information science.

2604.01216 2026-04-02 cs.LG cs.AI cs.CV

LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED)

Yuxuan Bao, Xingyue Zhang, J. Nathan Kutz

Reconstructing full spatio-temporal dynamics from sparse observations in both space and time remains a central challenge in complex systems, as measurements can be spatially incomplete and can also be limited to narrow temporal windows. Yet approximating the complete spatio-temporal trajectory is essential for mechanistic insight and understanding, model calibration, and operational decision-making. We introduce LAPIS-SHRED (LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders), a modular architecture that reconstructs and/or forecasts complete spatiotemporal dynamics from sparse sensor observations confined to short temporal windows. LAPIS-SHRED operates through a three-stage pipeline: (i) a SHRED model is pre-trained entirely on simulation data to map sensor time-histories into a structured latent space, (ii) a temporal sequence model, trained on simulation-derived latent trajectories, learns to propagate latent states forward or backward in time to span unobserved temporal regions from short observational time windows, and (iii) at deployment, only a short observation window of hyper-sparse sensor measurements from the true system is provided, from which the frozen SHRED model and the temporal model jointly reconstruct or forecast the complete spatiotemporal trajectory. The framework supports bidirectional inference, inherits data assimilation and multiscale reconstruction capabilities from its modular structure, and accommodates extreme observational constraints including single-frame terminal inputs. We evaluate LAPIS-SHRED on six experiments spanning complex spatio-temporal physics: turbulent flows, multiscale propulsion physics, volatile combustion transients, and satellite-derived environmental fields, highlighting a lightweight, modular architecture suited for operational settings where observation is constrained by physical or logistical limitations.

2604.01215 2026-04-02 cs.LG cs.AI physics.ao-ph

The Recipe Matters More Than the Kitchen: Mathematical Foundations of the AI Weather Prediction Pipeline

Piyush Garg, Diana R. Gergel, Andrew E. Shao, Galen J. Yacalis

AI weather prediction has advanced rapidly, yet no unified mathematical framework explains what determines forecast skill. Existing theory addresses specific architectural choices rather than the learning pipeline as a whole, while operational evidence from 2023-2026 demonstrates that training methodology, loss function design, and data diversity matter at least as much as architecture selection. This paper makes two interleaved contributions. Theoretically, we construct a framework rooted in approximation theory on the sphere, dynamical systems theory, information theory, and statistical learning theory that treats the complete learning pipeline (architecture, loss function, training strategy, data distribution) rather than architecture alone. We establish a Learning Pipeline Error Decomposition showing that estimation error (loss- and data-dependent) dominates approximation error (architecture-dependent) at current scales. We develop a Loss Function Spectral Theory formalizing MSE-induced spectral blurring in spherical harmonic coordinates, and derive Out-of-Distribution Extrapolation Bounds proving that data-driven models systematically underestimate record-breaking extremes with bias growing linearly in record exceedance. Empirically, we validate these predictions via inference across ten architecturally diverse AI weather models using NVIDIA Earth2Studio with ERA5 initial conditions, evaluating six metrics across 30 initialization dates spanning all seasons. Results confirm universal spectral energy loss at high wavenumbers for MSE-trained models, rising Error Consensus Ratios showing that the majority of forecast error is shared across architectures, and linear negative bias during extreme events. A Holistic Model Assessment Score provides unified multi-dimensional evaluation, and a prescriptive framework enables mathematical evaluation of proposed pipelines before training.

2604.01213 2026-04-02 cs.RO cs.MA

Collaborative Task and Path Planning for Heterogeneous Robotic Teams using Multi-Agent PPO

Matthias Rubio, Julia Richter, Hendrik Kolvenbach, Marco Hutter

Comments 8 pages, 3 figures, associated code on https://github.com/leggedrobotics/multi_robot_global_planner

Efficient robotic extraterrestrial exploration requires robots with diverse capabilities, ranging from scientific measurement tools to advanced locomotion. A robotic team enables the distribution of tasks over multiple specialized subsystems, each providing specific expertise to complete the mission. The central challenge lies in efficiently coordinating the team to maximize utilization and the extraction of scientific value. Classical planning algorithms scale poorly with problem size, leading to long planning cycles and high inference costs due to the combinatorial growth of possible robot-target allocations and possible trajectories. Learning-based methods are a viable alternative that moves the scaling concern from runtime to training time, a critical step towards achieving real-time planning. In this work, we present a collaborative planning strategy based on Multi-Agent Proximal Policy Optimization (MAPPO) to coordinate a team of heterogeneous robots to solve a complex target allocation and scheduling problem. We benchmark our approach against single-objective optimal solutions obtained through exhaustive search and evaluate its ability to perform online replanning in the context of a planetary exploration scenario.

2604.01212 2026-04-02 cs.CL cs.AI

$\texttt{YC-Bench}$: Benchmarking AI Agents for Long-Term Planning and Consistent Execution

Muyu He, Adit Jain, Anand Kumar, Vincent Tu, Soumyadeep Bakshi, Sachin Patro, Nazneen Rajani

Comments 16 pages, 10 figures

As LLM agents tackle increasingly complex tasks, a critical question is whether they can maintain strategic coherence over long horizons: planning under uncertainty, learning from delayed feedback, and adapting when early mistakes compound. We introduce $\texttt{YC-Bench}$, a benchmark that evaluates these capabilities by tasking an agent with running a simulated startup over a one-year horizon spanning hundreds of turns. The agent must manage employees, select task contracts, and maintain profitability in a partially observable environment where adversarial clients and growing payroll create compounding consequences for poor decisions. We evaluate 12 models, both proprietary and open source, across 3 seeds each. Only three models consistently surpass the starting capital of \$200K, with Claude Opus 4.6 achieving the highest average final funds at \$1.27 M, followed by GLM-5 at \$1.21 M at 11$\times$ lower inference cost. Scratchpad usage, the sole mechanism for persisting information across context truncation, is the strongest predictor of success, and adversarial client detection is the primary failure mode, accounting for $47\%$ of bankruptcies. Our analysis reveals that frontier models still fail through distinct failure modes such as over-parallelization, demonstrating the capability gaps for long-horizon performance. $\texttt{YC-Bench}$ is open-source, reproducible, and configurable.

2604.01211 2026-04-02 eess.SY cs.SY

Making Every Bit Count for $A$-Optimal State Estimation

Cameron Khanpour, Daniel Turizo, Samuel Talkington

We study the problem of controlling how a limited communication bandwidth budget is allocated across heterogeneously quantized sensor measurements. The performance criterion is the trace of the error covariance matrix of the linear minimum mean square error (LMMSE) state estimator, i.e., an $A$-optimal design criterion. Minimizing this criterion with a bit budget constraint yields a nonconvex optimization problem. We derive a formula that reduces each evaluation of the gradient to a single Cholesky factorization. This enables efficient optimization by both a projection-free Frank-Wolfe method (with a computable convergence certificate) and an interior point method with L-BFGS Hessian approximation over the problem's continuous relaxation. A largest remainder rounding procedure recovers integer bit allocations with a bound on the quality of the rounded solution. Numerical experiments in IEEE power grid test cases with up to 300 buses compare both solvers and demonstrate that the analytic gradient is the key computational enabler for both methods. Additionally, the heterogeneous bit allocation is compared to standard uniform bit allocation on the 500 bus IEEE power grid test case.
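The rounding step mentioned in this abstract can be illustrated with the textbook largest remainder method. The sketch below is only an illustration under assumptions: the function name, the example allocation, and the tie-breaking rule are hypothetical, and the abstract does not give the paper's exact procedure or its quality bound.

```python
import math


def largest_remainder_round(fractional_bits, budget):
    """Round a nonnegative fractional allocation to integers summing to `budget`.

    Textbook largest remainder method: floor every entry, then hand the
    leftover units to the entries with the largest fractional parts.
    """
    floors = [math.floor(b) for b in fractional_bits]
    leftover = budget - sum(floors)
    if not 0 <= leftover <= len(fractional_bits):
        raise ValueError("budget is inconsistent with the fractional allocation")
    # Sort indices by fractional part, largest first (ties keep index order).
    order = sorted(
        range(len(fractional_bits)),
        key=lambda i: fractional_bits[i] - floors[i],
        reverse=True,
    )
    for i in order[:leftover]:
        floors[i] += 1
    return floors


# Hypothetical relaxed solution for four sensors sharing an 8-bit budget.
relaxed = [3.7, 1.2, 2.6, 0.5]
rounded = largest_remainder_round(relaxed, budget=8)
print(rounded)  # [4, 1, 3, 0]
```

Each entry moves by less than one bit, and the total always matches the budget.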

2604.01210 2026-04-02 cs.LG cs.AI

CliffSearch: Structured Agentic Co-Evolution over Theory and Code for Scientific Algorithm Discovery

Youssef Mroueh, Carlos Fonseca, Brian Belgodere, David Cox

Scientific algorithm discovery is iterative: hypotheses are proposed, implemented, stress-tested, and revised. Current LLM-guided search systems accelerate proposal generation, but often under-represent scientific structure by optimizing code-only artifacts with weak correctness/originality gating. We present CliffSearch, an agentic evolutionary framework in which the core evolution operators (pair selection, crossover, mutation, and review) are implemented as LLM agents, and the loop is designed around three principles: (1) each node is a structured scientific artifact, instantiated in either theory+code or code_only mode, (2) reviewer judgments of correctness and originality are first-class selection gates alongside optimization of the benchmark metric of interest, and (3) mutation is split into exploration and correction pathways with distinct objectives. Exploration mutation imports ideas from adjacent scientific domains to increase novelty, while correction mutation performs targeted evidence-guided repair using reviewer signals over theory, code, benchmark results, and runtime errors. We illustrate the framework on three benchmark-grounded studies: transformer hyper-connection evolution, optimizer discovery on a fixed nanoGPT stack, and a smaller native-optimizer ablation. Across these settings, the same loop supports explicit metric direction, reproducible persistence, and reviewer-gated comparison of discoveries under controlled search conditions. The result is a discovery workflow that prioritizes scientific interpretability and correctness while optimizing task metrics under controlled novelty constraints, rather than maximizing candidate throughput alone. Full run artifacts, interactive visualizations, and exported best nodes for the reported studies are available at https://cliffsearch.ai .

2604.01207 2026-04-02 cs.CV

TRACE: High-Fidelity 3D Scene Editing via Tangible Reconstruction and Geometry-Aligned Contextual Video Masking

Jiyuan Hu, Zechuan Zhang, Zongxin Yang, Yi Yang

Comments 22 pages, 9 figures

We present TRACE, a mesh-guided 3DGS editing framework that achieves automated, high-fidelity scene transformation. By anchoring video diffusion with explicit 3D geometry, TRACE uniquely enables fine-grained, part-level manipulation--such as local pose shifting or component replacement--while preserving the structural integrity of the central subject, a capability largely absent in existing editing methods. Our approach comprises three key stages: (1) Multi-view 3D-Anchor Synthesis, which leverages a sparse-view editor trained on our MV-TRACE dataset--the first multi-view consistent dataset dedicated to scene-coherent object addition and modification--to generate spatially consistent 3D-anchors; (2) Tangible Geometry Anchoring (TGA), which ensures precise spatial synchronization between inserted meshes and the 3DGS scene via two-phase registration; and (3) Contextual Video Masking (CVM), which integrates 3D projections into an autoregressive video pipeline to achieve temporally stable, physically-grounded rendering. Extensive experiments demonstrate that TRACE consistently outperforms existing methods, especially in editing versatility and structural integrity.

2604.01206 2026-04-02 cs.CL cs.LG

LLM REgression with a Latent Iterative State Head

Yiheng Su, Matthew Lease

We present RELISH (REgression with a Latent Iterative State Head), a novel, lightweight architecture designed for text regression with large language models. Rather than decoding numeric targets as text or aggregating multiple generated outputs, RELISH predicts scalar values directly from frozen LLM representations by iteratively refining a learned latent state through cross-attention over token-level representations, and then mapping the final state to a point estimate with a linear regressor. Across five datasets, four LLM backbones, and two LLM training regimes, RELISH consistently outperforms prior baselines from all three major LLM regression families, including autoregressive decoding, regression-aware inference, and existing predictive head methods. Despite these gains, RELISH remains highly parameter-efficient, requiring only 3.4-3.7M trainable parameters across frozen LLM backbones (only 0.01-0.04% additional overhead), far less than LoRA-based alternatives that grow with model size (0.26-0.42%).
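The mechanism described in this abstract (a latent state refined by cross-attention over token representations, then mapped to a scalar by a linear regressor) can be sketched in a heavily simplified form. The version below is an assumption-laden toy, not the RELISH architecture. It uses a single head, no learned query/key/value projections, and a plain residual update. All function names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def refine_state(state, tokens):
    """One cross-attention step: the latent state queries the token vectors."""
    d = len(state)
    scores = [
        sum(s * t for s, t in zip(state, tok)) / math.sqrt(d) for tok in tokens
    ]
    weights = softmax(scores)
    context = [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(d)]
    # Residual update of the latent state (a simplifying assumption).
    return [s + c for s, c in zip(state, context)]


def predict(tokens, init_state, w, b, steps=2):
    """Refine the latent state `steps` times, then apply a linear regressor."""
    state = list(init_state)
    for _ in range(steps):
        state = refine_state(state, tokens)
    return sum(wi * si for wi, si in zip(w, state)) + b


# Hypothetical frozen token representations (two tokens, dimension 2).
tokens = [[1.0, 0.0], [1.0, 0.0]]
y = predict(tokens, init_state=[0.0, 0.0], w=[0.5, 1.0], b=0.1, steps=2)
```

In this toy setup only `init_state`, `w`, and `b` would be trainable, which loosely mirrors the idea of a small head on top of a frozen backbone.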

2604.01205 2026-04-02 quant-ph cs.NA math.NA

Programmable Signal Design for Quantum Phase Estimation via Quantum Signal Processing

Zikang Jia, Suying Liu, Yulong Dong

Comments 23 pages, 7 figures

Quantum phase estimation is a central primitive in quantum algorithms and sensing, where performance is governed by the sensitivity of measurement signals to the target parameter. While existing methods have developed increasingly sophisticated inference and adaptive design strategies, the signal family used for phase learning is often largely pre-specified. Here we propose a programmable signal design framework for quantum phase estimation based on quantum signal processing, which enables the measurement signal to be tailored to the current uncertainty region. We cast phase estimation as a max-min optimization problem over admissible signals and introduce a sensitivity efficiency parameter that quantifies information gain per query depth. The resulting iterative algorithm combines optimized quantum signal transformations with structured classical inference, retaining Heisenberg-limited scaling while improving sensitivity efficiency and practical resource prefactors. Numerical results show reduced estimation variance compared with standard protocols such as robust phase estimation. Our framework also extends to Hamiltonian eigenvalue estimation in higher dimensions and establishes a quantum-classical co-design paradigm through programmable signal shaping.

2604.01200 2026-04-02 math.NA cs.NA math.AP

A Posteriori Error Analysis of Runge-Kutta Discontinuous Galerkin Schemes with SIAC Post-Processing for Nonlinear Convection-Diffusion Systems

Jan Giesselmann, Kiwoong Kwon, Sebastian Krumscheid

Comments 21 pages, 1 figure, 10 tables

We develop reliable a posteriori error estimators for fully discrete Runge-Kutta discontinuous Galerkin approximations of nonlinear convection-diffusion systems endowed with a convex entropy in multiple spatial dimensions on the flat torus T^d, with a focus on the convection-dominated regime. In order to use the relative entropy method, we reconstruct the numerical solution via tensor-product Smoothness-Increasing Accuracy-Conserving (SIAC) filtering which has superconvergence properties. We then derive reliable a posteriori error estimators for the difference between the entropy weak solution and the reconstruction, with constants that are uniform in the vanishing viscosity limit. Our numerical experiments show that the a posteriori error bounds converge with the same order as the error of the reconstructed numerical solution.

2604.01199 2026-04-02 math.NA cs.NA

A high-order, structure preserving scheme for the stochastic Galerkin shallow water equations -- unification and two-dimensional extension

Philipp Öffner, Per Pettersson, Andrew R. Winters

Recently, two independent research efforts have been made to study the stochastic Galerkin formulation of the shallow water equations. In particular, Bender and Öffner developed entropy-conservative discontinuous Galerkin (DG) methods to solve the stochastic shallow water equations in a stochastic Galerkin framework using a Roe variable transformation, while Dai, Epshteyn and collaborators proposed second-order, energy-stable and well-balanced schemes for the same class of problems with a specific projection step used inside the Galerkin projection together with high-order quadrature rules and a time-step restriction. In this paper, we provide a comprehensive comparison of the two methodologies mentioned, focusing on their theoretical properties and practical implementation aspects. We highlight shared foundational concepts and key differences of both approaches, with a particular focus on the selection of basis functions in the stochastic domain. As a highlight, we show that under specific conditions, the two formulations align, offering a unified framework that connects these distinct approaches. From our theoretical findings, we extend the development of high-order entropy conservative DG methods for the one-dimensional stochastic Galerkin shallow water equations to two space dimensions, constructing entropy conservative two-point fluxes via primitive variables instead of entropy variables and applying them in our high-order DG setting. In numerical simulations, we verify and support our theoretical findings of a well-balanced and entropy-stable DG scheme which can be used to solve geophysical fluid flows with uncertainty.

2604.01194 2026-04-02 cs.CR

AgentWatcher: A Rule-based Prompt Injection Monitor

Yanting Wang, Wei Zou, Runpeng Geng, Jinyuan Jia

Comments The code is available at https://github.com/wang-yanting/AgentWatcher

Large language models (LLMs) and their applications, such as agents, are highly vulnerable to prompt injection attacks. State-of-the-art prompt injection detection methods have the following limitations: (1) their effectiveness degrades significantly as context length increases, and (2) they lack explicit rules that define what constitutes prompt injection, causing detection decisions to be implicit, opaque, and difficult to reason about. In this work, we propose AgentWatcher to address the above two limitations. To address the first limitation, AgentWatcher attributes the LLM's output (e.g., the action of an agent) to a small set of causally influential context segments. By focusing detection on a relatively short text, AgentWatcher scales to long contexts. To address the second limitation, we define a set of rules specifying what does and does not constitute a prompt injection, and use a monitor LLM to reason over these rules based on the attributed text, making the detection decisions more explainable. We conduct a comprehensive evaluation on tool-use agent benchmarks and long-context understanding datasets. The experimental results demonstrate that AgentWatcher can effectively detect prompt injection and maintain utility without attacks. The code is available at https://github.com/wang-yanting/AgentWatcher.

2604.01193 2026-04-02 cs.CL

Embarrassingly Simple Self-Distillation Improves Code Generation

Ruixiang Zhang, Richard He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang

Can a large language model (LLM) improve at code generation using only its own raw outputs, without a verifier, a teacher model, or reinforcement learning? We answer in the affirmative with simple self-distillation (SSD): sample solutions from the model with certain temperature and truncation configurations, then fine-tune on those samples with standard supervised fine-tuning. SSD improves Qwen3-30B-Instruct from 42.4% to 55.3% pass@1 on LiveCodeBench v6, with gains concentrating on harder problems, and it generalizes across Qwen and Llama models at 4B, 8B, and 30B scale, including both instruct and thinking variants. To understand why such a simple method can work, we trace these gains to a precision-exploration conflict in LLM decoding and show that SSD reshapes token distributions in a context-dependent way, suppressing distractor tails where precision matters while preserving useful diversity where exploration matters. Taken together, SSD offers a complementary post-training direction for improving LLM code generation.

2604.01188 2026-04-02 eess.SY cs.SY

Learning Neural Network Controllers with Certified Robust Performance via Adversarial Training

Neelay Junnarkar, Yasin Sonmez, Murat Arcak

Neural network (NN) controllers achieve strong empirical performance on nonlinear dynamical systems, yet deploying them in safety-critical settings requires robustness to disturbances and uncertainty. We present a method for jointly synthesizing NN controllers and dissipativity certificates that formally guarantee robust closed-loop performance using adversarial training, in which we use counterexamples to the robust dissipativity condition to guide training. Verification is done post-training using alpha,beta-CROWN, a branch-and-bound-based method that enables direct analysis of the nonlinear dynamical system. The proposed method uses quadratic constraints (QCs) only for characterization of non-parametric uncertainties. The method is tested in numerical experiments on maximizing the volume of the set on which a system is certified to be robustly dissipative. Our method certifies regions up to 78 times larger than the region certified by a linear matrix inequality-based approach that we derive for comparison.

2604.01186 2026-04-02 cs.DL cs.IR

From Validity to Inter-Subjectivity: An Argument for Reliability Signals in Search Environments

Frans van der Sluis

Comments 4 pages. Extended abstract / conference paper for SEASON 2025 (September 24-25, 2025, Hamburg, Germany). Peer reviewed

Search engines and information platforms are increasingly scrutinized for their role in spreading misinformation. Traditional responses often focus on detecting falsehoods or verifying the ultimate validity of claims. This paper argues that such a validity-centered framing is inadequate for the epistemic challenges of search environments.

2602.02326 2026-04-02 cs.CL

Language Steering for Multilingual In-Context Learning

Neeraja Kirtane, Kuan-Hao Huang

If large language models operate in a universal semantic space, then switching between languages should require only a simple activation offset. To test this, we take multilingual in-context learning as a case study, where few-shot demonstrations are provided in English but the test query is in a target language. We propose language vectors, computed as the mean activation difference between parallel source and target language examples at a particular layer, and added as an offset to hidden states at inference time to shift the model's internal representations toward the target language. We evaluate our method across three multilingual tasks spanning 19 languages and three models. Our results show consistent improvements on multilingual in-context learning over baselines across all tasks and languages tested, demonstrating that a simple activation offset is sufficient to redirect a model's language mode without any parameter updates. Beyond performance, the vectors encode interpretable linguistic structure, with closely related languages forming tight clusters and vectors transferring across tasks, suggesting that language identity occupies separable and structured directions in a model's activation space.
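The language vector described in this abstract is a mean activation difference that is added to hidden states at inference time. The sketch below illustrates that idea on plain Python lists. It makes several assumptions: the sign convention (target minus source), the `scale` parameter, the function names, and the toy activations are all hypothetical. The abstract does not specify which layer is used or how activations are pooled.

```python
def language_vector(source_acts, target_acts):
    """Mean of (target - source) activation differences over parallel examples."""
    if len(source_acts) != len(target_acts):
        raise ValueError("source and target activations must be parallel")
    diffs = [
        [t - s for s, t in zip(src, tgt)]
        for src, tgt in zip(source_acts, target_acts)
    ]
    n = len(diffs)
    return [sum(col) / n for col in zip(*diffs)]


def steer(hidden_state, vector, scale=1.0):
    """Add the language vector as an offset to one hidden state."""
    return [h + scale * v for h, v in zip(hidden_state, vector)]


# Toy activations for two parallel sentence pairs at one (hypothetical) layer.
english = [[1.0, 0.0], [3.0, 2.0]]
target = [[2.0, 2.0], [4.0, 6.0]]

vec = language_vector(english, target)
print(vec)                     # [1.0, 3.0]
print(steer([0.5, 0.5], vec))  # [1.5, 3.5]
```

Because the offset is computed once and only added at inference, no model parameters change, which matches the "without any parameter updates" claim in the abstract.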

2601.02728 2026-04-02 cs.LG

CRoPE: Efficient Parametrization of Rotary Positional Embedding

Beicheng Lou, Zifei Xu, Vivian W. H. Wong

Rotary positional embedding has become the state-of-the-art approach to encode position information in transformer-based models. While it is often succinctly expressed in complex linear algebra, we note that the actual implementation of $Q/K/V$-projections is not equivalent to a complex linear transformation. We argue that a complex linear transformation is a more natural parametrization and saves nearly 50\% of the parameters within the attention block. We show empirically that removing this redundancy has negligible impact on model performance. Our modification achieves more efficient parameter usage, as well as a cleaner interpretation of the representation space.
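The parameter saving claimed in this abstract follows from simple counting. A real projection on a width-d vector has d² entries. A complex linear map on d/2 complex coordinates has (d/2)² complex entries, which is d²/2 real numbers. The sketch below checks that arithmetic for a single projection. The function names and the width 64 are hypothetical, and the abstract does not state exactly which matrices in the attention block are reparametrized.

```python
def real_projection_params(d):
    """Parameter count of an unconstrained real d x d projection."""
    return d * d


def complex_projection_params(d):
    """Real parameter count of a complex (d/2) x (d/2) linear map.

    Each complex entry stores a real part and an imaginary part.
    """
    if d % 2 != 0:
        raise ValueError("d must be even to pair coordinates into complex numbers")
    half = d // 2
    return 2 * half * half


d = 64  # hypothetical head width
real_count = real_projection_params(d)        # 4096
complex_count = complex_projection_params(d)  # 2048
print(complex_count / real_count)             # 0.5
```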

2511.23445 2026-04-02 quant-ph cs.CC cs.LO math.CO

Quantum Polymorphisms and the Complexity of Quantum Constraint Satisfaction

Lorenzo Ciardo, Gideo Joubert, Antoine Mottet

Comments We included several new results on quantum polymorphisms, quantum relational constructions, and the complexity of quantum CSPs

Details
English abstract

We introduce the concept of quantum polymorphisms to the complexity theory of quantum constraint satisfaction. Via this notion, we build an algebraic framework of reductions between quantum CSPs, and we establish a Galois connection between quantum polymorphism minions and quantum relational constructions. By leveraging a contextuality property of quantum polymorphisms, we fully characterise the existence of commutativity gadgets for relational structures, introduced by Ji as a method for achieving quantum soundness of classical CSP reductions. Prior to our work, only a partial classification was known for a subclass of Boolean languages and for non-Boolean languages meeting specific structural conditions [Culf--Mastel, FOCS'25]. As an application of our framework, we prove that the quantum CSPs parameterised by odd cycles and the quantum CSP expressing quantum satisfiability of Siggers clauses are undecidable.

2511.08592 2026-04-02 cs.CL cs.AI

The Collective Turing Test: Large Language Models Can Generate Realistic Multi-User Discussions

Azza Bouleimen, Giordano De Marzo, Taehee Kim, Nicolò Pagan, Hannah Metzler, Silvia Giordano, Anikó Hannák, David Garcia

Details
English abstract

Large Language Models (LLMs) offer new avenues to simulate online communities and social media. Potential applications range from testing the design of content recommendation algorithms to estimating the effects of content policies and interventions. However, the validity of using LLMs to simulate conversations between various users remains largely untested. We evaluated whether LLMs can convincingly mimic human group conversations on social media. We collected authentic human conversations from Reddit and generated artificial conversations on the same topic with two LLMs: Llama 3 70B and GPT-4o. When presented side-by-side to study participants, LLM-generated conversations were mistaken for human-created content 39\% of the time. In particular, when evaluating conversations generated by Llama 3, participants correctly identified them as AI-generated only 56\% of the time, barely better than random chance. Our study demonstrates that LLMs can generate social media conversations sufficiently realistic to deceive humans when reading them, highlighting both a promising potential for social simulation and a warning message about the potential misuse of LLMs to generate new inauthentic social media content.

2505.20507 2026-04-02 cs.CV cs.AI

Electrolyzers-HSI: Close-Range Multi-Scene Hyperspectral Imaging Benchmark Dataset

Elias Arbash, Ahmed Jamal Afifi, Ymane Belahsen, Margret Fuchs, Pedram Ghamisi, Paul Scheunders, Richard Gloaguen

Details
Journal ref
Sci Data 12, 1818 (2025)
English abstract

The global challenge of sustainable recycling demands automated, fast, and accurate state-of-the-art (SOTA) material detection systems that act as a bedrock for a circular economy. Democratizing access to these cutting-edge solutions that enable real-time waste analysis is essential for scaling up recycling efforts and fostering the Green Deal. In response, we introduce Electrolyzers-HSI, a novel multimodal benchmark dataset designed to accelerate the recovery of critical raw materials through accurate classification of electrolyzer materials. The dataset comprises 55 co-registered high-resolution RGB images and hyperspectral imaging (HSI) data cubes spanning the 400--2500 nm spectral range, yielding over 4.2 million pixel vectors, of which 424,169 are labeled. This enables non-invasive spectral analysis of shredded electrolyzer samples, supporting quantitative and qualitative material classification and investigation of spectral properties. We evaluate a suite of baseline machine learning (ML) methods alongside SOTA transformer-based deep learning (DL) architectures, including Vision Transformer, SpectralFormer, and the Multimodal Fusion Transformer, to investigate architectural bottlenecks for further efficiency optimisation when deploying transformers in material identification. We implement zero-shot detection techniques and majority voting across pixel-level predictions to establish object-level classification robustness. In adherence to the FAIR data principles, the Electrolyzers-HSI dataset and accompanying codebase are openly available at https://github.com/hifexplo/Electrolyzers-HSI and https://rodare.hzdr.de/record/3668, supporting reproducible research and facilitating the broader adoption of smart and sustainable e-waste recycling solutions.
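
The majority-voting step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the dataset's codebase: the function name, the object-ID mask, and the tie-breaking rule (lowest class index wins, a consequence of `argmax`) are assumptions.

```python
import numpy as np

def object_labels_by_majority(pixel_preds, object_ids, n_classes):
    """Assign each object the most frequent class among its pixels.

    pixel_preds: integer class prediction per pixel.
    object_ids:  integer object ID per pixel (same shape); hypothetical mask.
    Ties go to the lowest class index because argmax returns the first maximum.
    """
    pixel_preds = np.asarray(pixel_preds).ravel()
    object_ids = np.asarray(object_ids).ravel()
    result = {}
    for obj in np.unique(object_ids):
        votes = np.bincount(pixel_preds[object_ids == obj], minlength=n_classes)
        result[int(obj)] = int(votes.argmax())
    return result

# Toy 3x3 image with three objects (IDs 1, 2, 3) and three classes (0, 1, 2).
preds = np.array([[0, 0, 1],
                  [2, 2, 2],
                  [2, 1, 1]])
objs = np.array([[1, 1, 1],
                 [2, 2, 2],
                 [2, 3, 3]])

labels = object_labels_by_majority(preds, objs, n_classes=3)
```

Here object 1 receives class 0 despite one dissenting pixel, which is the kind of robustness to isolated pixel errors that object-level voting is meant to provide.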

2503.19115 2026-04-02 q-bio.MN cs.NE

Implementation of Support Vector Machines using Reaction Networks

Amey Choudhary, Jiaxin Jin, Abhishek Deshpande

Comments 28 pages, 4 figures, 1 table

Details
English abstract

Can machine learning algorithms be implemented using chemistry? We demonstrate that this is possible in the case of support vector machines (SVMs). SVMs are powerful tools for data classification, leveraging Vapnik-Chervonenkis theory to handle high-dimensional data and small datasets effectively. In this work, we propose a chemical reaction network scheme for implementing SVMs, utilizing the steady-state behavior of reaction network dynamics to model key computational aspects of SVMs. This approach introduces a novel biochemical framework for implementing machine learning algorithms in non-traditional computational environments.

1809.03377 2026-04-02 math.NA cs.NA

Isogeometric Simulation and Shape Optimization with Applications to Electrical Machines

Peter Gangl, Ulrich Langer, Angelos Mantzaflaris, Rainer Schneckenleitner

Details
English abstract

Future e-mobility calls for efficient electrical machines. For different areas of operation, these machines have to satisfy certain desired properties that often depend on their design. Here we investigate the use of multipatch Isogeometric Analysis (IgA) for the simulation and shape optimization of electrical machines. To obtain fast simulation and optimization results, we use non-overlapping domain decomposition (DD) methods to solve the large systems of algebraic equations arising from the IgA discretization of the underlying partial differential equations. The DD is naturally related to the multipatch representation of the computational domain, and provides the framework for the parallelization of the DD solvers.

1609.06236 2026-04-02 math.NA cs.NA

A Local Mesh Modification Strategy for Interface Problems with Application to Shape and Topology Optimization

Peter Gangl, Ulrich Langer

Comments 8 pages, 2 Figures, submitted to proceedings of SCEE (Scientific Computing in Electrical Engineering) 2016 in Strobl, Austria

Details
English abstract

We present and analyze a new finite element method for solving interface problems on a triangular grid. The method locally modifies a given triangulation such that the interfaces are accurately resolved and the maximal angle condition holds. Therefore, optimal order of convergence can be shown. Moreover, an appropriate scaling of the basis functions yields an optimal condition number of the stiffness matrix. The method is applied to an optimal design problem for an electric motor where the interface between different materials is evolving in the course of the optimization procedure.

2604.01183 2026-04-02 cs.HC

Assessing Affective Objectives for Communicative Visualizations

Elsie Lee-Robbins, Eytan Adar

Details
English abstract

Using learning objectives to define designer intents for communicative visualizations can be a powerful design tool. Cognitive and affective objectives are concrete and specific, which can be translated to assessments when creating, evaluating, or comparing visualization ideas. However, while there are many well-validated assessments for cognitive objectives, affective objectives are uniquely challenging. It is easy to see if a visualization helps someone remember the number of patients in a clinic, but harder to observe the change in their attitudes around donations to a crisis. In this work, we define a set of criteria for selecting assessments--from education, advocacy, economics, health, and psychology--that align with affective objectives. We illustrate the use of the framework in a complex affective design task that combines personal narratives and visualizations. Our chosen assessments allow us to evaluate different designs in the context of our objectives and competing psychological theories.

2604.01181 2026-04-02 cs.HC cs.CL cs.CV

True (VIS) Lies: Analyzing How Generative AI Recognizes Intentionality, Rhetoric, and Misleadingness in Visualization Lies

Graziano Blasilli, Marco Angelini

Details
English abstract

This study investigates the ability of multimodal Large Language Models (LLMs) to identify and interpret misleading visualizations, and recognize these observations along with their underlying causes and potential intentionality. Our analysis leverages concepts from visualization rhetoric and a newly developed taxonomy of authorial intents as explanatory lenses. We formulated three research questions and addressed them experimentally using a dataset of 2,336 COVID-19-related tweets, half of which contain misleading visualizations, and supplemented it with real-world examples of perceptual, cognitive, and conceptual errors drawn from VisLies, the IEEE VIS community event dedicated to showcasing deceptive and misleading visualizations. To ensure broad coverage of the current LLM landscape, we evaluated 16 state-of-the-art models. Among them, 15 are open-weight models, spanning a wide range of model sizes, architectural families, and reasoning capabilities. The selection comprises small models, namely Nemotron-Nano-V2-VL (12B parameters), Mistral-Small-3.2 (24B), DeepSeek-VL2 (27B), Gemma3 (27B), and GTA1 (32B); medium-sized models, namely Qianfan-VL (70B), Molmo (72B), GLM-4.5V (108B), LLaVA-NeXT (110B), and Pixtral-Large (124B); and large models, namely Qwen3-VL (235B), InternVL3.5 (241B), Step3 (321B), Llama-4-Maverick (400B), and Kimi-K2.5 (1000B). In addition, we employed OpenAI GPT-5.4, a frontier proprietary model. To establish a human perspective on these tasks, we also conducted a user study with visualization experts to assess how people perceive rhetorical techniques and the authorial intentions behind the same misleading visualizations. This allows comparison between model and expert behavior, revealing similarities and differences that provide insights into where LLMs align with human judgment and where they diverge.

2604.01180 2026-04-02 math.NA cs.NA

On the error of the Euler scheme for approximation of solutions of nonlinear DDEs under inexact information

Paweł Przybyłowicz, Martyna Wiącek

Details
English abstract

We analyze the behavior of the Euler method for delay differential equations under nonstandard assumptions on the right-hand-side function f, when evaluations of f are corrupted by informational noise. We provide theoretical upper bounds on the Euler discretization error and present results from numerical experiments.
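
For readers unfamiliar with the setting, the sketch below applies the explicit Euler scheme to a constant-delay equation x'(t) = f(t, x(t), x(t - tau)) with history x(t) = phi(t) on [-tau, 0], where each evaluation of f is perturbed by bounded noise. The noise model (uniform in [-delta, delta]), the step choice h = tau/m, and the test equation are assumptions for illustration; the paper's assumptions on f and on the noise may differ.

```python
import numpy as np

def euler_dde(f, phi, tau, t_end, m, delta=0.0, rng=None):
    """Explicit Euler for x'(t) = f(t, x(t), x(t - tau)) with noisy f.

    The step is h = tau / m, so the delayed value sits exactly m grid
    points back. `delta` bounds the additive noise on each f evaluation
    (a hypothetical noise model).
    """
    if rng is None:
        rng = np.random.default_rng(0)
    h = tau / m
    n_steps = int(round(t_end / h))
    # Grid index i corresponds to t = (i - m) * h; indices 0..m hold the history.
    t = (np.arange(n_steps + m + 1) - m) * h
    x = np.empty_like(t)
    x[: m + 1] = [phi(s) for s in t[: m + 1]]
    for k in range(m, m + n_steps):
        noise = delta * rng.uniform(-1.0, 1.0)
        x[k + 1] = x[k] + h * (f(t[k], x[k], x[k - m]) + noise)
    return t[m:], x[m:]

# Test problem: x'(t) = -x(t - 1), x(t) = 1 for t <= 0.
# Exact solution: x(t) = 1 - t on [0, 1], and x(2) = -0.5.
f = lambda t, x, x_delayed: -x_delayed
phi = lambda s: 1.0

t_exact, x_exact = euler_dde(f, phi, tau=1.0, t_end=2.0, m=100)
t_noisy, x_noisy = euler_dde(f, phi, tau=1.0, t_end=2.0, m=100, delta=1e-3)
```

On this test problem the noise-free Euler error at t = 2 is of order h, and the noisy run stays within a small multiple of delta of the noise-free one, which matches the general shape of error bounds one would expect in this setting.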