arXivDaily: arXiv daily academic digest, updated Monday through Friday
2603.26728 2026-03-31 cs.DB cs.AI cs.CL

SEAR: Schema-Based Evaluation and Routing for LLM Gateways

Zecheng Zhang, Han Zheng, Yue Xu

Comments 10 pages, 6 pages appendix, 4 figures, 12 tables

Abstract

Evaluating production LLM responses and routing requests across providers in LLM gateways requires fine-grained quality signals and operationally grounded decisions. To address this gap, we present SEAR, a schema-based evaluation and routing system for multi-model, multi-provider LLM gateways. SEAR defines an extensible relational schema covering both LLM evaluation signals (context, intent, response characteristics, issue attribution, and quality scores) and gateway operational metrics (latency, cost, throughput), with cross-table consistency links across around one hundred typed, SQL-queryable columns. To populate the evaluation signals reliably, SEAR proposes self-contained signal instructions, in-schema reasoning, and multi-stage generation that produces database-ready structured outputs. Because signals are derived through LLM reasoning rather than shallow classifiers, SEAR captures complex request semantics, enables human-interpretable routing explanations, and unifies evaluation and routing in a single query layer. Across thousands of production sessions, SEAR achieves strong signal accuracy on human-labeled data and supports practical routing decisions, including large cost reductions with comparable quality.

2603.26725 2026-03-31 cs.LO cs.AI

Capability Safety as Datalog: A Foundational Equivalence

Cosimo Spera

Abstract

We prove that capability safety admits an exact representation as propositional Datalog evaluation (Datalogprop: the monadic, ground, function-free fragment of first-order logic), enabling the transfer of algorithmic and structural results unavailable in the native formulation. This addresses two structural limitations of the capability hypergraph framework of Spera [2026]: the absence of efficient incremental maintenance, and the absence of a decision procedure for audit surface containment. The equivalence is tight: capability hypergraphs correspond to exactly this fragment, no more.
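
As an illustration of what propositional Datalog evaluation involves, the following is a minimal sketch of naive forward chaining over ground Horn rules. Reading a hyperedge as a rule whose body is a set of capabilities is an assumption made for this example, as are the atom names (read_file, open_socket, net_access, exfiltrate); the abstract does not give the paper's encoding.

```python
def derive(facts, rules):
    """Return the least fixpoint of ground Horn rules by naive forward chaining.

    facts: iterable of atoms that hold initially.
    rules: iterable of (head, body) pairs; head is derived once every atom
           in body is known.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in known and set(body) <= known:
                known.add(head)
                changed = True
    return known


# Hypothetical rules: (derived capability, capabilities required together).
rules = [
    ("exfiltrate", {"read_file", "net_access"}),
    ("net_access", {"open_socket"}),
]

closure = derive({"read_file", "open_socket"}, rules)
print(sorted(closure))
```

A safety check in this style would ask whether a forbidden atom appears in the closure. The naive loop rescans every rule on each pass; it is meant only to show the semantics, not the incremental maintenance the abstract refers to.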

2603.26722 2026-03-31 cs.NE cs.AI cs.AR cs.OS

Brain-inspired AI for Edge Intelligence: a systematic review

Yingchao Cheng, Meijia Wang, Zhifeng Hao, Rajkumar Buyya

Abstract

While Spiking Neural Networks (SNNs) promise to circumvent the severe Size, Weight, and Power (SWaP) constraints of edge intelligence, the field currently faces a "Deployment Paradox" where theoretical energy gains are frequently negated by the inefficiencies of mapping asynchronous, event-driven dynamics onto traditional von Neumann substrates. Transcending the reductionism of algorithm-only reviews, this survey adopts a rigorous system-level hardware-software co-design perspective to examine the 2020-2025 trajectory, specifically targeting the "last mile" technologies - from quantization methodologies to hybrid architectures - that translate biological plausibility into silicon reality. We critically dissect the interplay between training complexity (the dichotomy of direct learning vs. conversion), the "memory wall" bottlenecking stateful neuronal updates, and the critical software gap in neuromorphic compilation toolchains. Finally, we envision a roadmap to reconcile the fundamental "Sync-Async Mismatch," proposing the development of a standardized Neuromorphic OS as the foundational layer for realizing a ubiquitous, energy-autonomous Green Cognitive Substrate.

2603.26721 2026-03-31 eess.SP cs.AI cs.CV cs.LG

Stress Classification from ECG Signals Using Vision Transformer

Zeeshan Ahmad, Naimul Khan

Comments 10 pages

Abstract

Vision Transformers have shown tremendous success in numerous computer vision applications; however, they have not been exploited for stress assessment using physiological signals such as the electrocardiogram (ECG). To get the maximum benefit from the vision transformer for multilevel stress assessment, we transform the raw ECG data into 2D spectrograms using the short-time Fourier transform (STFT). These spectrograms are divided into patches that are fed to the transformer encoder. We also perform experiments with a 1D CNN and ResNet-18 (a CNN model). We perform leave-one-subject-out cross-validation (LOSOCV) experiments on the WESAD and Ryerson Multimedia Lab (RML) datasets. One of the biggest challenges of LOSOCV-based experiments is intersubject variability. In this research, we address intersubject variability using 2D spectrograms and the attention mechanism of the transformer. Experiments show that the vision transformer handles intersubject variability much better than CNN-based models and beats all previous state-of-the-art methods by a considerable margin. Moreover, our method is end-to-end, does not require handcrafted features, and can learn robust representations. The proposed method achieved accuracies of 71.01% on the RML dataset and 76.7% on the WESAD dataset for three-class classification, and 88.3% for binary classification on WESAD.

2603.26716 2026-03-31 eess.SP cs.LG

FEMBA on the Edge: Physiologically-Aware Pre-Training, Quantization, and Deployment of a Bidirectional Mamba EEG Foundation Model on an Ultra-low Power Microcontroller

Anna Tegon, Nicholas Lehmann, Yawei Li, Andrea Cossettini, Luca Benini, Thorir Mar Ingolfsson

Comments 10 pages, 9 tables, 1 figure

Abstract

Objective: To enable continuous, long-term neuro-monitoring on wearable devices by overcoming the computational bottlenecks of Transformer-based Electroencephalography (EEG) foundation models and the quantization challenges inherent to State-Space Models (SSMs). Methods: We present FEMBA, a bidirectional Mamba architecture pre-trained on over 21,000 hours of EEG. We introduce a novel Physiologically-Aware pre-training objective, consisting of a reconstruction with low-pass filtering, to prioritize neural oscillations over high-frequency artifacts. To address the activation outliers common in SSMs, we employ Quantization-Aware Training (QAT) to compress the model to 2-bit weights. The framework is deployed on a parallel ultra-low-power RISC-V microcontroller (GAP9) using a custom double-buffered memory streaming scheme. Results: The proposed low-pass pre-training improves downstream AUROC on TUAB from 0.863 to 0.893 and AUPR from 0.862 to 0.898 compared to the best contrastive baseline. QAT successfully compresses weights with negligible performance loss, whereas standard post-training quantization degrades accuracy by approximately 30%. The embedded implementation achieves deterministic real-time inference (1.70 s per 5 s window) and reduces the memory footprint by 74% (to ≈2 MB), achieving competitive accuracy with up to 27× fewer FLOPs than Transformer benchmarks. Conclusion: FEMBA demonstrates that Mamba-based foundation models can be effectively quantized and deployed on extreme-edge hardware without sacrificing the representation quality required for robust clinical analysis. Significance: This work establishes the first full-stack framework for deploying large-scale EEG foundation models on ultra-low-power wearables, facilitating continuous, SSM-based monitoring for epilepsy and sleep disorders.

2603.26712 2026-03-31 cs.SE cs.AI cs.CY econ.GN q-fin.EC

On the Carbon Footprint of Economic Research in the Age of Generative AI

Andres Alonso-Robisco, Carlos Esparcia, Francisco Jareño

Abstract

Generative artificial intelligence (AI) is increasingly used to write and refactor research code, expanding computational workflows. At the same time, Green AI research has largely measured the footprint of models rather than the downstream workflows in which GenAI is a tool. We shift the unit of analysis from models to workflows and treat prompts as decision policies that allocate discretion between researcher and system, governing what is executed and when iteration stops. We contribute in two ways. First, we map the recent Green AI literature into seven themes: training footprint is the largest cluster, while inference efficiency and system level optimisation are growing rapidly, alongside measurement protocols, green algorithms, governance, and security and efficiency trade-offs. Second, we benchmark a modern economic survey workflow, an LDA-based literature mapping implemented with GenAI assisted coding and executed in a fixed cloud notebook, measuring runtime and estimated CO2e with CodeCarbon. Injecting generic green language into prompts has no reliable effect, whereas operational constraints and decision rule prompts deliver large and stable footprint reductions while preserving decision equivalent topic outputs. The results identify human in the loop governance as a practical lever to align GenAI productivity with environmental efficiency.

2603.26710 2026-03-31 cs.IR cs.AI cs.CL cs.MA

Agentic AI for Human Resources: LLM-Driven Candidate Assessment

Kamer Ali Yuksel, Abdul Basit Anees, Ashraf Elneima, Sanjika Hewavitharana, Mohamed Al-Badrashiny, Hassan Sawaf

Comments Published in 19th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2026)

Journal ref 19th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2026), Rabat, Morocco
Abstract

In this work, we present a modular and interpretable framework that uses Large Language Models (LLMs) to automate candidate assessment in recruitment. The system integrates diverse sources, including job descriptions, CVs, interview transcripts, and HR feedback, to generate structured evaluation reports that mirror expert judgment. Unlike traditional ATS tools that rely on keyword matching or shallow scoring, our approach employs role-specific, LLM-generated rubrics and a multi-agent architecture to perform fine-grained, criteria-driven evaluations. The framework outputs detailed assessment reports, candidate comparisons, and ranked recommendations that are transparent, auditable, and suitable for real-world hiring workflows. Beyond rubric-based analysis, we introduce an LLM-Driven Active Listwise Tournament mechanism for candidate ranking. Instead of noisy pairwise comparisons or inconsistent independent scoring, the LLM ranks small candidate subsets (mini-tournaments), and these listwise permutations are aggregated using a Plackett-Luce model. An active-learning loop selects the most informative subsets, producing globally coherent and sample-efficient rankings. This adaptation of listwise LLM preference modeling (previously explored in financial asset ranking) provides a principled and highly interpretable methodology for large-scale candidate ranking in talent acquisition.
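
To make the Plackett-Luce aggregation step concrete, the sketch below computes the log-likelihood of one mini-tournament ranking given per-candidate log-strengths. It shows only the standard Plackett-Luce likelihood, not the paper's fitting procedure or its active subset selection; the candidate names and strength values are hypothetical.

```python
import math


def plackett_luce_log_likelihood(ranking, log_strength):
    """Log-probability of one ranking (best first) under Plackett-Luce.

    At each position, the chosen item wins with probability proportional
    to exp(log_strength) among the items not yet placed.
    """
    values = [log_strength[item] for item in ranking]
    total = 0.0
    for i in range(len(values)):
        tail = values[i:]
        # Numerically stable log-sum-exp over the remaining items.
        peak = max(tail)
        log_denominator = peak + math.log(sum(math.exp(v - peak) for v in tail))
        total += values[i] - log_denominator
    return total


# Hypothetical log-strengths for three candidates.
strengths = {"cand_a": 1.2, "cand_b": 0.3, "cand_c": -0.5}
agreeing = plackett_luce_log_likelihood(["cand_a", "cand_b", "cand_c"], strengths)
reversed_order = plackett_luce_log_likelihood(["cand_c", "cand_b", "cand_a"], strengths)
print(agreeing, reversed_order)
```

Fitting would choose the strengths that maximize the sum of these log-likelihoods over all observed mini-tournaments; sorting candidates by fitted strength then gives the global ranking.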

2603.26709 2026-03-31 eess.SY cs.RO cs.SY eess.SP

Neural Aided Adaptive Innovation-Based Invariant Kalman Filter

Barak Diker, Itzik Klein

Comments 11 pages and 2 figures

Abstract

Autonomous platforms require accurate positioning to complete their tasks. To this end, Kalman filter-based algorithms, such as the extended Kalman filter or the invariant Kalman filter, that fuse inertial and external sensor measurements are applied. To cope with real-world scenarios, adaptive noise estimation methods have been developed, primarily for classical Euclidean formulations. However, these methods remain largely unexplored in the tangent Lie space, even though it provides a principled geometric framework with favorable error dynamics on Lie groups. To fill this gap, we combine invariant filtering theory with neural-aided adaptive noise estimation in real-world settings. To this end, we derive a novel theoretical extension of classical innovation-based process noise adaptation formulated directly within the Lie-group framework. We further propose a lightweight neural network that estimates the process noise covariance parameters directly from raw inertial data. Trained entirely in a sim2real framework via domain adaptation, the network captures motion-dependent and sensor-dependent noise characteristics without requiring labeled real-world data. To examine our proposed neural-aided adaptive invariant Kalman filter, we focus on the challenging real-world scenario of autonomous underwater navigation. Experimental results demonstrate superior performance compared to existing methods in terms of position root mean square error. These results validate our sim2real pipeline and further confirm that geometric invariance significantly enhances learning-based adaptation and that adaptive noise estimation in the tangent Lie space offers a powerful mechanism for improving navigation accuracy in nonlinear autonomous platforms.

2603.26704 2026-03-31 eess.SY cs.AI cs.CV cs.SY

Deep Learning Multi-Horizon Irradiance Nowcasting: A Comparative Evaluation of Three Methods for Leveraging Sky Images

Erling W. Eriksen, Magnus M. Nygård, Niklas Erdmann, Heine N. Riise

Abstract

We investigate three distinct methods of incorporating all-sky imager (ASI) images into deep learning (DL) irradiance nowcasting. The first method relies on a convolutional neural network (CNN) to extract features directly from raw RGB images. The second method uses state-of-the-art algorithms to engineer 2D feature maps informed by domain knowledge, e.g., cloud segmentation, the cloud motion vector, solar position, and cloud base height. These feature maps are then passed to a CNN to extract compound features. The final method relies on aggregating the engineered 2D feature maps into time-series input. Each of the three methods was then used as part of a DL model trained on a high-frequency, 29-day dataset to generate multi-horizon forecasts of global horizontal irradiance up to 15 minutes ahead. The models were then evaluated using root mean squared error and skill score on 7 selected days of data. Aggregated engineered ASI features as model input yielded superior forecasting performance, demonstrating that integration of ASI images into DL nowcasting models is possible without complex spatially-ordered DL architectures and inputs, underscoring opportunities for alternative image processing methods as well as the potential for improved spatial DL feature processing methods.

2603.26699 2026-03-31 eess.SP cs.CV cs.LG

EMPD: An Event-based Multimodal Physiological Dataset for Remote Pulse Wave Detection

Qian Feng, Pengfei Li, Rongshan Gao, Jiale Xu, Rui Gong, Yidi Li

Comments 12 pages, 4 figures, 2 tables

Abstract

Remote photoplethysmography (rPPG) based on traditional frame-based cameras often struggles with motion artifacts and limited temporal resolution. To address these limitations, we introduce EMPD (Event-based Multimodal Physiological Dataset), the first benchmark dataset specifically designed for non-contact physiological sensing via event cameras. The dataset leverages a laser-assisted acquisition system where a high-coherence laser modulates subtle skin vibrations from the radial artery into significant signals detectable by a neuromorphic sensor. The hardware platform integrates a high-resolution event camera to capture micro-motions and intensity transients, an industrial RGB camera to provide traditional rPPG benchmarks, and a clinical-grade pulse oximeter to record ground truth PPG waveforms. EMPD contains 193 valid records collected from 83 subjects, covering a wide heart rate range (40-110 BPM) under both resting and post-exercise conditions. By providing precisely synchronized multimodal data with microsecond-level temporal precision, EMPD serves as a crucial resource for developing robust algorithms in the field of neuromorphic physiological monitoring. The dataset is publicly available at: https://doi.org/10.5281/zenodo.18765701

2603.26697 2026-03-31 eess.SY cs.AI cs.SY

Physicochemical-Neural Fusion for Semi-Closed-Circuit Respiratory Autonomy in Extreme Environments

Phillip Kingston, Nicholas Johnston

Comments 46 pages, 2 figures

Abstract

This paper introduces Galactic Bioware's Life Support System, a semi-closed-circuit breathing apparatus designed for integration into a positive-pressure firefighting suit and governed by an AI control system. The breathing loop incorporates a soda lime CO2 scrubber, a silica gel dehumidifier, and pure O2 replenishment with finite consumables. One-way exhaust valves maintain positive pressure while creating a semi-closed system in which outward venting gradually depletes the gas inventory. Part I develops the physicochemical foundations from first principles, including state-consistent thermochemistry, stoichiometric capacity limits, adsorption isotherms, and oxygen-management constraints arising from both fire safety and toxicity. Part II introduces an AI control architecture that fuses three sensor tiers: external environmental sensing, internal suit atmosphere sensing (with triple-redundant O2 cells and median voting), and firefighter biometrics. The controller combines receding-horizon model-predictive control (MPC) with a learned metabolic model and a reinforcement learning (RL) policy advisor, with all candidate actuator commands passing through a final control-barrier-function safety filter before reaching the hardware. This architecture is intended to optimize performance under unknown mission duration and exertion profiles. In this paper we introduce an 18-state, 3-control nonlinear state-space formulation using only sensors viable in structural firefighting, with triple-redundant O2 sensing and median voting. Finally, we introduce an MPC framework with a dynamic resource scarcity multiplier, an RL policy advisor for warm-starting, and a final control-barrier-function safety filter through which all actuator commands must pass, demonstrating 18-34% endurance improvement in simulation over PID baselines while maintaining tighter physiological and fire-safety margins.
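
The median voting mentioned for the triple-redundant O2 cells can be sketched in a few lines. This is a generic illustration of the technique, not the paper's implementation; the function name and the sample readings (taken here as percent O2 by volume) are hypothetical.

```python
import statistics


def vote_o2(readings):
    """Return the median of exactly three redundant O2 cell readings.

    With three cells, the median ignores a single cell that has drifted
    or failed high or low, as long as the other two still agree.
    """
    if len(readings) != 3:
        raise ValueError("expected exactly three O2 cell readings")
    return statistics.median(readings)


# One cell (27.5) has drifted; the voted value follows the other two.
voted = vote_o2([20.8, 21.0, 27.5])
print(voted)
```

Median voting tolerates one faulty cell but not two that fail in the same direction, which is why such schemes are usually paired with separate plausibility checks.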

2603.26695 2026-03-31 eess.SP cs.AI cs.LG eess.IV quant-ph

Complementarity-Preserving Generative Theory for Multimodal ECG Synthesis: A Quantum-Inspired Approach

Timothy Oladunni, Farouk Ganiyu-Adewumi, Clyde Baidoo, Kyndal Maclin

Abstract

Multimodal deep learning has substantially improved electrocardiogram (ECG) classification by jointly leveraging time, frequency, and time-frequency representations. However, existing generative models typically synthesize these modalities independently, resulting in synthetic ECG data that are visually plausible yet physiologically inconsistent across domains. This work establishes a Complementarity-Preserving Generative Theory (CPGT), which posits that physiologically valid multimodal signal generation requires explicit preservation of cross-domain complementarity rather than loosely coupled modality synthesis. We instantiate CPGT through Q-CFD-GAN, a quantum-inspired generative framework that models multimodal ECG structure within a complex-valued latent space and enforces complementarity-aware constraints regulating mutual information, redundancy, and morphological coherence. Experimental evaluation demonstrates that Q-CFD-GAN reduces latent embedding variance by 82%, decreases classifier-based plausibility error by 26.6%, and restores tri-domain complementarity from 0.56 to 0.91, while achieving the lowest observed morphology deviation (3.8%). These findings show that preserving multimodal information geometry, rather than optimizing modality-specific fidelity alone, is essential for generating synthetic ECG signals that remain physiologically meaningful and suitable for downstream clinical machine-learning applications.

2603.26688 2026-03-31 cs.IR cs.LG

EVNextTrade: Learning-to-Rank-Based Recommendation of Next Charging Nodes for EV-EV Energy Trading

Md Mahfujur Rahman, Alistair Barros, Raja Jurdak, Darshika Koggalahewa

Abstract

Peer-to-peer energy trading among electric vehicles (EVs) has been increasingly studied as a promising solution for improving supply-side resilience under growing charging demand and constrained charging infrastructure. While prior studies on EV-EV energy trading and related EV research have largely focused on transaction management or isolated mobility prediction tasks, the problem of identifying which charging nodes are more suitable for EV-EV trading in journey contexts remains open. We address this gap by formulating next charging nodes recommendation as a learning-to-rank problem, where each EV decision event is associated with a set of candidate charging locations. We propose a supervised ranking framework applied to a large-scale urban EV mobility dataset comprising millions of journey records and multidimensional EV trading-related features, including EV energy level, trading role, distance to charging locations, charging speed, and temporal station popularity. To account for uncertainty arising from the mobility of both energy providers and consumers, as well as the presence of multiple viable charging nodes at a decision point, we employ probabilistic relevance refinement to generate graded labels for ranking. We evaluate gradient-boosted learning-to-rank models, including LightGBM, XGBoost, and CatBoost, on EV journey records enriched with candidate charging nodes. Experimental results show that LightGBM consistently achieves the strongest ranking performance across standard metrics, including NDCG@k, Recall@k, and MRR, with particularly strong early-ranking quality, reflected in the highest NDCG@1 (0.9795) and MRR (0.9990). These results highlight the effectiveness of uncertainty-aware learning-to-rank for charging node recommendation and support improved coordination and matching in decentralized EV-EV energy trading systems.
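
For readers unfamiliar with the ranking metrics reported above, the following sketch computes NDCG@k and MRR from graded relevance labels listed in ranked order. It assumes the exponential gain convention (2^rel - 1) and treats any label above zero as relevant for MRR; the abstract does not state which conventions the paper uses, and the sample labels are made up.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top k positions."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 1)
        for pos, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mean_reciprocal_rank(ranked_lists):
    """Average of 1/rank of the first relevant item in each list."""
    total = 0.0
    for relevances in ranked_lists:
        rank = next(
            (pos for pos, rel in enumerate(relevances, start=1) if rel > 0),
            None,
        )
        total += 1.0 / rank if rank is not None else 0.0
    return total / len(ranked_lists)


# Hypothetical graded labels of candidate charging nodes, in ranked order.
events = [[3, 2, 0], [0, 3, 1]]
print([ndcg_at_k(e, 1) for e in events], mean_reciprocal_rank(events))
```

NDCG@1 rewards placing the most relevant node first, which is why it is read as a measure of early-ranking quality.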

2603.26683 2026-03-31 cs.IR cs.AI cs.CL cs.CV

LITTA: Late-Interaction and Test-Time Alignment for Visually-Grounded Multimodal Retrieval

Seonok Kim

Abstract

Retrieving relevant evidence from visually rich documents such as textbooks, technical reports, and manuals is challenging due to long context, complex layouts, and weak lexical overlap between user questions and supporting pages. We propose LITTA, a query-expansion-centric retrieval framework for evidence page retrieval that improves multimodal document retrieval without retriever retraining. Given a user query, LITTA generates complementary query variants using a large language model and retrieves candidate pages for each variant using a frozen vision retriever with late-interaction scoring. Candidates from expanded queries are then aggregated through reciprocal rank fusion to improve evidence coverage and reduce sensitivity to any single phrasing. This simple test-time strategy significantly improves retrieval robustness while remaining compatible with existing multimodal embedding indices. We evaluate LITTA on visually grounded document retrieval tasks across three domains: computer science, pharmaceuticals, and industrial manuals. Multi-query retrieval consistently improves top-k accuracy, recall, and MRR compared to single-query retrieval, with particularly large gains in domains with high visual and semantic variability. Moreover, the accuracy-efficiency trade-off is directly controllable by the number of query variants, making LITTA practical for deployment under latency constraints. These results demonstrate that query expansion provides a simple yet effective mechanism for improving visually grounded multimodal retrieval.
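
The aggregation step described above, reciprocal rank fusion over the pages retrieved for each query variant, can be sketched as follows. The smoothing constant k = 60 is a commonly used default assumed here, not a value from the paper, and the page identifiers are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of page ids into a single ranking.

    Each page scores the sum of 1 / (k + rank) over the lists that
    contain it, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by page id for determinism.
    return sorted(scores, key=lambda page: (-scores[page], page))


# Hypothetical top-3 pages returned for three variants of one query.
variant_results = [
    ["p3", "p1", "p7"],
    ["p1", "p3", "p9"],
    ["p1", "p4", "p3"],
]
fused = reciprocal_rank_fusion(variant_results)
print(fused)
```

Because only ranks are used, the fusion does not depend on the scale of the retriever's similarity scores, and pages that appear under several phrasings rise to the top.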

2603.26682 2026-03-31 cs.HC cs.AI cs.CY

Operationalizing Perceptions of Agent Gender: Foundations and Guidelines

Katie Seaborn, Madeleine Steeds, Ilaria Torre, Martina De Cet, Katie Winkle, Marcus Göransson

Journal ref CHI 2026 Full Paper
Abstract

The "gender" of intelligent agents, virtual characters, social robots, and other agentic machines has emerged as a fundamental topic in studies of people's interactions with computers. Perceptions of agent gender can help explain user attitudes and behaviours -- from preferences to toxicity to stereotyping -- across a variety of systems and contexts of use. Yet, standards in capturing perceptions of agent gender do not exist. A scoping review was conducted to clarify how agent gender has been operationalized -- labelled, defined, and measured -- as a perceptual variable. One-third of studies manipulated but did not measure agent gender. Norms in operationalizations remain obscure, limiting comprehension of results, congruity in measurement, and comparability for meta-analyses. The dominance of the gender binary model and latent anthropocentrism have placed arbitrary limits on knowledge generation and reified the status quo. We contribute a systematically-developed and theory-driven meta-level framework that offers operational clarity and practical guidance for greater rigour and inclusivity.

2603.26679 2026-03-31 cs.CY cs.AI cs.HC

AI Meets Mathematics Education: A Case Study on Supporting an Instructor in a Large Mathematics Class with Context-Aware AI

Jérémy Barghorn, Anna Sotnikova, Sacha Friedli, Antoine Bosselut

Abstract

Large-enrollment university courses face persistent challenges in providing timely and scalable instructional support. While generative AI holds promise, its effective use depends on reliability and pedagogical alignment. We present a human-centered case study of AI-assisted support in a Calculus I course, implemented in close collaboration with the course instructor. We developed a system to answer students' questions on a discussion forum, fine-tuning a lightweight language model on 2,588 historical student-instructor interactions. The model achieved 75.3% accuracy on a benchmark of 150 representative questions annotated by five instructors, and in 36% of cases, its responses were rated equal to or better than instructor answers. A post-deployment student survey (N = 105) indicated that students valued the alignment of the responses with the course materials and their immediate availability, while still relying on instructor verification for trust. We highlight the importance of hybrid human-AI workflows for safe and effective course support.

2603.26678 2026-03-31 cs.CY cs.AI econ.TH

Power Couple? AI Growth and Renewable Energy Investment

Luyi Gui, Tinglong Dai

Comments 32 pages, 5 figures, 11-page appendix

Abstract

AI and renewable energy are increasingly framed as a "power couple" -- the idea that surging AI electricity demand will accelerate clean-energy investment -- yet concerns persist that AI will instead entrench fossil-fuel carbon lock-in. We reconcile these views by modeling the equilibrium interaction between AI growth and renewable investment. In a parsimonious game, a policymaker invests in renewable capacity available to AI and an AI developer chooses capability; the equilibrium depends on scaling regimes and market incentives. When the market payoff to capability is supermodular and performance gains are near-linear in compute, developers push toward frontier scale even when the marginal megawatt-hour is fossil-based. In this regime, renewable expansion can primarily relax scaling constraints rather than displace fossil generation one-for-one, weakening incentives to build enough clean capacity and reinforcing fossil dependence. This yields an "adaptation trap": as climate damages rise, the value of AI-enabled adaptation increases, which strengthens incentives to enable frontier scaling while tolerating residual fossil use. When AI faces diminishing returns and lower scaling efficiency, energy costs discipline capability choices; renewable investment then both enables capability and decarbonizes marginal compute, generating an "adaptation pathway" in which climate stress strengthens incentives for clean-capacity expansion and can support a carbon-free equilibrium. A calibrated case study illustrates these mechanisms using observed magnitudes for investment, capability, and energy use. Decarbonizing AI is an equilibrium outcome: effective policy must keep clean capacity binding at the margin as compute expands.

2603.26676 2026-03-31 cs.CY cs.AI cs.HC

Evaluating Human-AI Safety: A Framework for Measuring Harmful Capability Uplift

Michelle Vaccaro, Jaeyoon Song, Abdullah Almaatouq, Michiel A. Bakker

Abstract

Current frontier AI safety evaluations emphasize static benchmarks, third-party annotations, and red-teaming. In this position paper, we argue that AI safety research should focus on human-centered evaluations that measure harmful capability uplift: the marginal increase in a user's ability to cause harm with a frontier model beyond what conventional tools already enable. We frame harmful capability uplift as a core AI safety metric, ground it in prior social science research, and provide concrete methodological guidance for systematic measurement. We conclude with actionable steps for developers, researchers, funders, and regulators to make harmful capability uplift evaluation a standard practice.

2603.26673 2026-03-31 cs.CY cs.AI

Can AI be a Teaching Partner? Evaluating ChatGPT, Gemini, and DeepSeek across Three Teaching Strategies

Talita de Paula Cypriano de Souza, Shruti Mehta, Matheus Arataque Uema, Luciano Bernardes de Paula, Seiji Isotani

Abstract

There are growing promises that Large Language Models (LLMs) can support students' learning by providing explanations, feedback, and guidance. However, despite their rapid adoption and widespread attention, there is still limited empirical evidence regarding the pedagogical skills of LLMs. This article presents a comparative study of popular LLMs, namely ChatGPT, DeepSeek, and Gemini, acting as teaching agents. An evaluation protocol was developed, focusing on three pedagogical strategies: Examples, Explanations and Analogies, and the Socratic Method. Six human judges conducted the evaluations in the context of teaching the C programming language to beginners. The results indicate that the LLMs exhibited similar interaction patterns in the pedagogical strategies of Examples and Explanations and Analogies. In contrast, for the Socratic Method, the models showed greater sensitivity to the pedagogical strategy and the initial prompt. Overall, ChatGPT and Gemini received higher scores, whereas DeepSeek obtained lower scores across the criteria, indicating differences in pedagogical performance across models.

2603.26670 2026-03-31 cs.IR cs.CL

SRAG: RAG with Structured Data Improves Vector Retrieval

Shalin Shah, Srikanth Ryali, Ramasubbu Venkatesh

Abstract

Retrieval Augmented Generation (RAG) provides the necessary informational grounding to LLMs in the form of chunks retrieved from a vector database or through web search. RAG could also use knowledge graph triples as a means of providing factual information to an LLM. However, the retrieval is only based on representational similarity between a question and the contents. The performance of RAG depends on the numeric vector representations of the query and the chunks. To improve these representations, we propose Structured RAG (SRAG), which adds structured information to a query as well as the chunks in the form of topics, sentiments, query and chunk types (e.g., informational, quantitative), knowledge graph triples and semantic tags. Experiments indicate that this method significantly improves the retrieval process. Using GPT-5 as an LLM-as-a-judge, results show that the method improves the score given to answers in a question answering system by 30% (p-value = 2e-13) (with tighter bounds). The strongest improvement is in comparative, analytical and predictive questions. The results suggest that our method enables broader, more diverse, and episodic-style retrieval. Tail risk analysis shows that SRAG attains very large gains more often, with losses remaining minor in magnitude.

2603.26669 2026-03-31 cs.IR cs.AI cs.LG

ReCQR: Incorporating conversational query rewriting to improve Multimodal Image Retrieval

Yuan Hu, ZhiYu Cao, PeiFeng Li, QiaoMing Zhu

Comments 4 pages, 3 figures

English abstract

With the rise of multimodal learning, image retrieval plays a crucial role in connecting visual information with natural language queries. Existing image retrievers struggle with processing long texts and handling unclear user expressions. To address these issues, we introduce the conversational query rewriting (CQR) task into the image retrieval domain and construct a dedicated multi-turn dialogue query rewriting dataset. Built on full dialogue histories, CQR rewrites users' final queries into concise, semantically complete ones that are better suited for retrieval. Specifically, we first leverage Large Language Models (LLMs) to generate rewritten candidates at scale and employ an LLM-as-Judge mechanism combined with manual review to curate approximately 7,000 high-quality multimodal dialogues, forming the ReCQR dataset. We then benchmark several SOTA multimodal models on the ReCQR dataset to assess their performance on image retrieval. Experimental results demonstrate that CQR not only significantly enhances the accuracy of traditional image retrieval models, but also provides new directions and insights for modeling user queries in multimodal systems.

2603.26667 2026-03-31 cs.IR cs.AI

M-RAG: Making RAG Faster, Stronger, and More Efficient

Sun Xu, Tongkai Xu, Baiheng Xie, Li Huang, Qiang Gao, Kunpeng Zhang

English abstract

Retrieval-Augmented Generation (RAG) has become a widely adopted paradigm for enhancing the reliability of large language models (LLMs). However, RAG systems are sensitive to retrieval strategies that rely on text chunking to construct retrieval units, which often introduce information fragmentation, retrieval noise, and reduced efficiency. Recent work has even questioned the necessity of RAG, arguing that long-context LLMs may eliminate multi-stage retrieval pipelines by directly processing full documents. Nevertheless, expanded context capacity alone does not resolve the challenges of relevance filtering, evidence prioritization, and isolating answer-bearing information. To this end, we propose M-RAG, a novel chunk-free retrieval strategy. Instead of retrieving coarse-grained textual chunks, M-RAG extracts structured meta-markers via a key-value (k-v) decomposition, each pairing a lightweight, intent-aligned retrieval key for retrieval with a context-rich information value for generation. Under this setting, M-RAG enables efficient and stable query-key similarity matching without sacrificing expressive ability. Experimental results on the LongBench subtasks demonstrate that M-RAG outperforms chunk-based RAG baselines across varying token budgets, particularly under low-resource settings. Extensive analysis further reveals that M-RAG retrieves more answer-friendly evidence with high efficiency, validating the effectiveness of decoupling retrieval representation from generation and highlighting the proposed strategy as a scalable and robust alternative to existing chunk-based methods.
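The separation between a short retrieval key and a richer value passed to the generator might look roughly like the sketch below. The marker contents, the `retrieve` function, and the Jaccard token overlap used for matching are all hypothetical; the abstract does not specify how keys are extracted or scored.

```python
def jaccard(a, b):
    """Token-set overlap, used here as an assumed stand-in for query-key similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

# Hypothetical meta-markers: a short key for matching, a longer value for generation.
markers = [
    {"key": "release date of product",
     "value": "The product was first released in March 2021 after a two-year beta."},
    {"key": "founder of company",
     "value": "The company was founded by two engineers who met at university."},
]

def retrieve(query, markers, top_k=1):
    """Rank markers by query-key similarity, then return the associated values."""
    ranked = sorted(markers, key=lambda m: jaccard(query, m["key"]), reverse=True)
    return [m["value"] for m in ranked[:top_k]]

context = retrieve("what is the release date", markers)
print(context[0])
```

Matching runs only over the short keys, while the text handed to the generator is the longer value, which is the decoupling the abstract describes.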

2603.28761 2026-03-31 astro-ph.CO

Cosmic Shear in Effective Field Theory at Two-Loop Order: Revisiting $S_8$ in Dark Energy Survey Data

Shi-Fan Chen, Joseph DeRose, Mikhail M. Ivanov, Oliver H. E. Philcox

Comments 13 pages including supplemental material, 2 + 8 figures

English abstract

Cosmic shear is a powerful probe of cosmological distances, matter abundance and clustering in the low-redshift Universe. Cosmological parameter extraction from cosmic shear data is limited by our understanding of baryonic astrophysics, which severely restricts the range of scales used in such analyses. We show that the remaining scales are largely perturbative and can be accurately described with two-loop effective field theory (EFT) predictions. We present the first consistent analysis of the public cosmic shear data from the DES-Y3 catalogs in EFT at the two-loop order, renormalizing small-scale sensitivity in cosmic-shear predictions via a lensing-counterterm expansion and accounting for the intrinsic alignments of galaxies with spin-2 EFT predictions. We constrain the lensing amplitude competitively with standard (empirically-modeled) methods, finding $S_8 = 0.783^{+0.038}_{-0.031}$ ($S_8 = 0.802^{+0.031}_{-0.026}$ with BAO). The perturbativity of cosmic shear suggests novel opportunities for testing new physics with ongoing and upcoming cosmic shear experiments like Roman, Euclid, and LSST. As an example, we derive matter clustering constraints within the dynamical dark energy model from a combination of our DES-EFT cosmic shear likelihood, early-universe CMB priors, DESI BAO, and supernovae data, finding $S_8 = 0.824\pm 0.029$, indicating no $S_8$ tension in the growth of cosmic structure regardless of the underlying cosmological model and expansion history.

2603.28756 2026-03-31 cs.MS

Fast Large-Scale Model-Based Iterative Tomography via Exploiting Mathematical Structure, Hierarchical Optimization, Smart Initialization, and Distributed GPU Computing

Dinesh Kumar, Jeffrey Donatelli

English abstract

Model-Based Iterative Reconstruction (MBIR) is important because direct methods, such as Filtered Back-Projection (FBP), can introduce significant noise and artifacts in sparse-angle tomography, especially for time-evolving samples. Although MBIR produces high-quality reconstructions through prior-informed optimization, its computational cost has traditionally limited its broader adoption. In previous work, we addressed this limitation by expressing the Radon transform and its adjoint using non-uniform fast Fourier transforms (NUFFTs), reducing computational complexity relative to conventional projection-based methods. We further accelerated computation by employing a multi-GPU system for parallel processing. In this work, we further accelerate our Fourier-domain framework by introducing four main strategies: (1) a reformulation of the MBIR forward and adjoint operators that exploits their multi-level Toeplitz structure for efficient Fourier-domain computation; (2) an improved initialization strategy that uses back-projected data filtered with a standard ramp filter as the starting estimate; (3) a hierarchical multi-resolution reconstruction approach that first solves the problem on coarse grids and progressively transitions to finer grids using Lanczos interpolation; and (4) a distributed-memory implementation using MPI that enables near-linear scaling on large high-performance computing (HPC) systems. Together, these innovations significantly reduce iteration counts, improve parallel efficiency, and make high-quality MBIR reconstruction practical for large-scale tomographic imaging. These advances open the door to near-real-time MBIR for applications such as in situ, in operando, and time-evolving experiments.
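Why Toeplitz structure helps in the Fourier domain can be seen in a one-dimensional sketch: a Toeplitz matrix-vector product can be computed by embedding the matrix in a circulant one, which a discrete Fourier transform diagonalizes. The code below is an assumption-laden illustration, not the paper's multi-level operator; it uses a naive O(n^2) DFT to stay dependency-free, where a real implementation would use an FFT library.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (an FFT would replace this in practice)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
           for j in range(n)]
    return [v / n for v in out] if inverse else out

def toeplitz_matvec(col, row, x):
    """Compute T @ x where T[i][j] = col[i-j] if i >= j else row[j-i].

    col and row are the first column and first row of T (col[0] == row[0]).
    """
    n = len(x)
    # First column of the 2n x 2n circulant matrix that contains T in its top-left block.
    c = col + [0] + row[:0:-1]
    xp = x + [0] * n  # zero-pad the input to length 2n
    # Circular convolution via pointwise multiplication in the Fourier domain.
    y = dft([a * b for a, b in zip(dft(c), dft(xp))], inverse=True)
    return [v.real for v in y[:n]]

col, row, x = [1, 2, 3], [1, 4, 5], [1, 0, -1]
fast = toeplitz_matvec(col, row, x)

# Direct product for comparison.
T = [[col[i - j] if i >= j else row[j - i] for j in range(3)] for i in range(3)]
direct = [sum(T[i][j] * x[j] for j in range(3)) for i in range(3)]
print(fast, direct)
```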

2603.28755 2026-03-31 cs.CY

Graphilosophy: Graph-Based Digital Humanities Computing with The Four Books

Minh-Thu Do, Quynh-Chau Le-Tran, Duc-Duy Nguyen-Mai, Thien-Trang Nguyen, Khanh-Duy Le, Minh-Triet Tran, Tam V. Nguyen, Trung-Nghia Le

Comments AI & Society journal

English abstract

The Four Books have shaped East Asian intellectual traditions, yet their multi-layered interpretive complexity limits their accessibility in the digital age. While traditional bilingual commentaries provide a vital pedagogical bridge, computational frameworks are needed to preserve and explore this wisdom. This paper bridges AI and classical philosophy by introducing Graphilosophy, an ontology-guided, multi-layered knowledge graph framework for modeling and interpreting The Four Books. Integrating natural language processing, multilingual semantic embeddings, and humanistic analysis, the framework transforms a bilingual Chinese-Vietnamese corpus into an interpretively grounded resource. Graphilosophy encodes linguistic, conceptual, and interpretive relationships across interconnected layers, enabling cross-lingual retrieval and AI-assisted reasoning while explicitly preserving scholarly nuance and interpretive plurality. Through an interactive interface, the system also enables non-expert users to trace the evolution of ethical concepts across borders and languages, ensuring that ancient wisdom remains a living resource for modern moral discourse rather than a static relic of the past. A preliminary user study suggests the system's capacity to enhance conceptual understanding and cross-cultural learning. By linking algorithmic representation with ethical inquiry, this research exemplifies how AI can serve as a methodological bridge, accommodating the ambiguity of cultural heritage rather than reducing it to static data. The source code and data are released at https://github.com/ThuDoMinh1102/confucian-texts-knowledge-graph.

2603.28754 2026-03-31 eess.SY cs.SY

Sparse State-Space Realizations of Linear Controllers

Yaozhi Du, Jing Shuang Li

Comments Submitted to 2026 CDC

English abstract

This paper provides a novel approach for finding sparse state-space realizations of linear systems (e.g., controllers). Sparse controllers are commonly used in distributed control, where a controller is synthesized with some sparsity penalty. Here, motivated by a modeling problem in sensorimotor neuroscience, we study a complementary question: given a linear time-invariant system (e.g., controller) in transfer function form and a desired sparsity pattern, can we find a suitably sparse state-space realization for the transfer function? This problem is highly nonconvex, but we propose an exact method to solve it. We show that the problem reduces to finding an appropriate similarity transform from the modal realization, which in turn reduces to solving a system of multivariate polynomial equations. Finally, we leverage tools from algebraic geometry (namely, the Gröbner basis) to solve this problem exactly. We provide algorithms to find real- and complex-valued sparse realizations and demonstrate their efficacy on several examples.
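The reduction to a similarity transform rests on a standard fact: replacing (A, B, C) with (T^{-1} A T, T^{-1} B, C T) leaves the transfer function C (sI - A)^{-1} B unchanged while altering the sparsity pattern. The sketch below checks this on a hand-picked 2x2 example; the matrices and the transform T are assumptions chosen for illustration, not taken from the paper, and no Gröbner basis computation is attempted.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Plain matrix product on nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transfer(A, B, C, s):
    """Evaluate C (sI - A)^{-1} B at a scalar point s (2-state, SISO case)."""
    sI_A = [[s - A[0][0], -A[0][1]], [-A[1][0], s - A[1][1]]]
    return matmul(matmul(C, inv2(sI_A)), B)[0][0]

# Modal realization of 1/(s+1) + 1/(s+2).
A = [[F(-1), F(0)], [F(0), F(-2)]]
B = [[F(1)], [F(1)]]
C = [[F(1), F(1)]]

# Hypothetical similarity transform.
T = [[F(1), F(1)], [F(0), F(1)]]
Ti = inv2(T)
A2, B2, C2 = matmul(matmul(Ti, A), T), matmul(Ti, B), matmul(C, T)

s = F(3)
g1, g2 = transfer(A, B, C, s), transfer(A2, B2, C2, s)
print(g1, g2, B2)
```

Here the transformed input matrix `B2` gains a zero entry while `A2` loses one, which hints at why finding a transform that meets a prescribed sparsity pattern on all matrices at once is a constrained polynomial problem.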

2603.28753 2026-03-31 cs.NI

Iran's January 2026 Internet Shutdown: Public Data, Censorship Methods, and Circumvention Techniques

Giuseppe Aceto, Valerio Persico, Antonio Pescapè

Comments 12 pages, 3 figures, 1 table

English abstract

This paper analyzes the Internet shutdown that occurred in Iran in January 2026 in the context of protests, focusing on its impact on the country's digital communication infrastructure and on information access and control dynamics. The scale, complexity, and nation-state nature of the event motivate a comprehensive investigation that goes beyond isolated reports, aiming to provide a unified and systematic understanding of what happened and how it was observed. The study is guided by a set of research questions addressing: the characterization of the shutdown via the timeline of the disruption events and post-event "new normal"; the detectability of the event, encompassing monitoring initiatives, measurement techniques, and precursory signals; and the interplay between censorship and circumvention, assessing both the imposed restrictions and the effectiveness of tools designed to bypass them. To answer these questions, we adopt a multi-source, multi-perspective methodology that integrates heterogeneous public data, primarily from grey literature produced by network measurement and monitoring initiatives, complemented by additional private measurements. This approach enables a holistic view of the event and allows us to reconcile and compare partial observations from different sources.

2603.28752 2026-03-31 cond-mat.mes-hall physics.optics quant-ph

Topological Optical Chirality Dichroism

Wojciech J. Jankowski, Giandomenico Palumbo, Robert-Jan Slager

Comments 7+5 pages, 2+1 figures

English abstract

We report on a universal topological dichroism of chiral three-dimensional systems in response to the chirality of light. We show that chiral topological invariants result in integer-quantized dichroic excitation rate differences. Moreover, we demonstrate that such topological effects arise more generally from coupling optical chirality to higher tensor Berry curvatures and Dixmier-Douady invariants of quantum states, including Hopf indices. We finally propose an experimental setup that leverages superchiral light as a smoking-gun probe of chiral band topologies in three-dimensional materials. Our findings establish an optical route for probing chiral electronic band topologies that remain unobserved to date.

2603.28749 2026-03-31 eess.SP physics.class-ph

Spatial Degrees of Freedom and Channel Strength for Antenna Systems

Mats Gustafsson, Yaniv Brick

English abstract

The number of spatial degrees of freedom (NDoF) and channel strength in antenna systems are examined within a geometric framework. Starting from a correlation-operator representation of the channel between transmitter and receiver regions, we analyze the associated eigenspectrum and relate the NDoF to its spectral transition (corner). We compare the spectrum-based effective NDoF and effective rank metrics, clarifying their behavior for both idealized and realistic eigenvalue distributions. In parallel, we develop geometry-based asymptotic estimates in terms of mutual shadow (view) measures and coupling strength. Specifically, we show that while the projected length or area predicts the number of usable modes in two- and three-dimensional settings, the coupling strength determines the average eigenvalue level. Canonical configurations of parallel lines and regions are used to derive closed-form asymptotic expressions for the effective NDoF, revealing significant deviations from the spectral corner in closely spaced configurations. The results illustrate that these deviations are physically grounded. The proposed theory and techniques are computationally efficient and form a toolbox for estimating the modal richness in near-field channels, with implications for array design, inverse problems, and high-capacity communication systems.
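The two spectrum-based metrics can be compared numerically. The abstract does not state formulas, so the sketch below assumes commonly used definitions: effective NDoF as (sum of eigenvalues)^2 divided by the sum of squared eigenvalues, and effective rank as the exponential of the Shannon entropy of the normalized spectrum. The example spectra are hypothetical.

```python
import math

def effective_ndof(eigs):
    """Assumed definition: (sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    return sum(eigs) ** 2 / sum(v * v for v in eigs)

def effective_rank(eigs):
    """Assumed definition: exp of the Shannon entropy of the normalized spectrum."""
    total = sum(eigs)
    probs = [v / total for v in eigs if v > 0]
    return math.exp(-sum(p * math.log(p) for p in probs))

# Idealized step spectrum: four equal modes followed by a sharp corner.
step = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
# Gradually decaying spectrum with no sharp corner.
decay = [1.0, 0.5, 0.25, 0.125]

ndof_step, rank_step = effective_ndof(step), effective_rank(step)
ndof_decay, rank_decay = effective_ndof(decay), effective_rank(decay)
print(ndof_step, rank_step, ndof_decay, rank_decay)
```

For the step spectrum both metrics recover the number of nonzero modes; for the decaying spectrum they disagree and both fall below the count of nonzero eigenvalues, which is the kind of behavior the comparison in the abstract concerns.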

2603.28748 2026-03-31 math.CO

Odd Hadwiger number and graph products

Henry Echeverría, Andrea Jiménez, Suchismita Mishra, Daniel A. Quiroz, Mauricio Yépez

Comments 12 pages, 6 figures, 2 tables

English abstract

The Odd Hadwiger number of a graph $G$ is the largest integer $r$ such that $G$ has a clique of size $r$ as an odd minor. In this paper, we investigate how large the Odd Hadwiger number of the product of two graphs is, for each of the four standard graph products: Cartesian, direct, lexicographic, and strong. We provide an optimal lower bound in the cases of the strong and lexicographic products.