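arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.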

2406.17438 2026-03-04 cs.CV

Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes

Qi Ma, Danda Pani Paudel, Ender Konukoglu, Luc Van Gool

Details

DOI: 10.52202/079017-2814

English abstract

Neural implicit functions have demonstrated significant importance in areas such as computer vision and graphics. Their advantages include the ability to represent complex shapes and scenes with high fidelity, smooth interpolation capabilities, and continuous representations. Despite these benefits, the development and analysis of implicit functions have been limited by the lack of comprehensive datasets and the substantial computational resources required for their implementation and evaluation. To address these challenges, we introduce "Implicit-Zoo": a large-scale dataset, requiring thousands of GPU training days, designed to facilitate research and development in this field. Our dataset includes diverse 2D and 3D scenes, such as CIFAR-10, ImageNet-1K, and Cityscapes for 2D image tasks, and the OmniObject3D dataset for 3D vision tasks. We ensure high quality through strict checks, refining or filtering out low-quality data. Using Implicit-Zoo, we showcase two immediate benefits, as it enables us to: (1) learn token locations for transformer models; (2) directly regress 3D camera poses of 2D images with respect to NeRF models. This in turn leads to improved performance in all three tasks of image classification, semantic segmentation, and 3D pose regression, thereby unlocking new avenues for research.
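
As a rough illustration of the "continuous representation" property mentioned in the abstract, the sketch below treats an image as a function from coordinates to colour and samples it at two resolutions. It is not taken from Implicit-Zoo: the names `make_implicit_image` and `render`, the sine activation, and the random (untrained) weights are all assumptions made for the example.

```python
import math
import random

def make_implicit_image(hidden=16, seed=0):
    """Return f(x, y) -> (r, g, b): a tiny coordinate MLP with random, untrained weights."""
    rng = random.Random(seed)
    # First layer: 2 inputs -> hidden units; second layer: hidden units -> 3 colour channels.
    w1 = [[rng.uniform(-3, 3) for _ in range(2)] for _ in range(hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(3)]

    def f(x, y):
        h = [math.sin(w[0] * x + w[1] * y + b) for w, b in zip(w1, b1)]
        # A sigmoid squashes each channel into the open interval (0, 1).
        return tuple(
            1 / (1 + math.exp(-sum(wi * hi for wi, hi in zip(row, h))))
            for row in w2
        )

    return f

def render(f, size):
    """Sample the continuous function on a size x size grid over [0, 1]^2."""
    return [[f(c / (size - 1), r / (size - 1)) for c in range(size)] for r in range(size)]

f = make_implicit_image()
low = render(f, 4)   # 4 x 4 sampling of the function
high = render(f, 8)  # 8 x 8 sampling of the same function, no retraining needed
```

Because the image lives in the weights rather than in a pixel grid, the same function can be queried at any resolution. A dataset of such functions would store one set of trained weights per image or scene.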

2403.06567 2026-03-04 cs.CV cs.IR

Leveraging Foundation Models for Content-Based Image Retrieval in Radiology

Stefan Denner, David Zimmerer, Dimitrios Bounias, Markus Bujotzek, Shuhan Xiao, Raphael Stock, Lisa Kausch, Philipp Schader, Tobias Penzkofer, Paul F. Jäger, Klaus Maier-Hein

2401.03175 2026-03-04 cs.CL cs.AI cs.LG

Part-of-Speech Tagger for Bodo Language using Deep Learning approach

Dhrubajyoti Pathak, Sanjib Narzary, Sukumar Nandi, Bidisha Som

Comments Accepted to Natural Language Engineering

2309.11896 2026-03-04 cs.CL cs.CY

Focal Inferential Infusion Coupled with Tractable Density Discrimination for Implicit Hate Detection

Sarah Masud, Ashutosh Bajpai, Tanmoy Chakraborty

Comments 23 pages, 6 Figures, 9 Tables. Accepted at NLE

2308.09497 2026-03-04 cs.CL cs.AI cs.LG

Predictive Authoring for Brazilian Portuguese Augmentative and Alternative Communication

Jayr Pereira, Rodrigo Nogueira, Cleber Zanchettin, Robson Fidalgo

2307.15931 2026-03-04 cs.LG

Dynamic Deep-Reinforcement-Learning Algorithm in Partially Observable Markov Decision Processes

Saki Omi, Hyo-Sang Shin, Namhoon Cho, Antonios Tsourdos

2303.08032 2026-03-04 cs.CL cs.LG

Verifying the Robustness of Automatic Credibility Assessment

Piotr Przybyła, Alexander Shvets, Horacio Saggion

Details

DOI: 10.1017/nlp.2024.54
Journal ref: Nat. lang. process. 31 (2025) 1134-1162

English abstract

Text classification methods have been widely investigated as a way to detect content of low credibility: fake news, social media bots, propaganda, etc. Quite accurate models (likely based on deep neural networks) help in moderating public electronic platforms and often cause content creators to face rejection of their submissions or removal of already published texts. Having the incentive to evade further detection, content creators try to come up with a slightly modified version of the text (known as an attack with an adversarial example) that exploits the weaknesses of classifiers and results in a different output. Here we systematically test the robustness of common text classifiers against available attacking techniques and discover that, indeed, meaning-preserving changes in input text can mislead the models. The approaches we test focus on finding vulnerable spans in text and replacing individual characters or words, taking into account the similarity between the original and replacement content. We also introduce BODEGA: a benchmark for testing both victim models and attack methods on four misinformation detection tasks in an evaluation framework designed to simulate real use-cases of content moderation. The attacked tasks include (1) fact checking and detection of (2) hyperpartisan news, (3) propaganda and (4) rumours. Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions, e.g. attacks on GEMMA being up to 27% more successful than those on BERT. Finally, we manually analyse a subset of adversarial examples and check what kinds of modifications are used in successful attacks.
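
The general idea of a character-level attack constrained by similarity, as described in the abstract, can be sketched as follows. This is a toy under stated assumptions, not BODEGA or any attack evaluated in the paper: the `toy_victim` classifier, the `LOOKALIKES` table, and the 0.9 similarity threshold are all hypothetical, and similarity is measured with the standard library's `difflib.SequenceMatcher` rather than a semantic metric.

```python
import difflib

# Hypothetical look-alike substitutions; real attacks use far richer candidate sets.
LOOKALIKES = {"o": "0", "l": "1", "e": "3", "a": "@"}

def toy_victim(text):
    """Stand-in classifier: flags text containing any blocklisted word."""
    blocklist = {"hoax", "miracle"}
    return any(word in blocklist for word in text.lower().split())

def similarity(a, b):
    """Character-level similarity in [0, 1] between two strings."""
    return difflib.SequenceMatcher(None, a, b).ratio()

def attack(text, min_similarity=0.9):
    """Try single-character swaps and return the first candidate that flips
    the victim's decision while staying similar to the original, else None."""
    original = toy_victim(text)
    for i, ch in enumerate(text):
        if ch.lower() in LOOKALIKES:
            candidate = text[:i] + LOOKALIKES[ch.lower()] + text[i + 1:]
            if toy_victim(candidate) != original and similarity(text, candidate) >= min_similarity:
                return candidate
    return None

adversarial = attack("this miracle cure works")
```

On this input the sketch swaps the `a` in "miracle", which a human still reads as the same word but the toy classifier no longer matches. The similarity check is what keeps the change small, which is the role the abstract assigns to comparing the original and replacement content.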

2301.00539 2026-03-04 cs.CL

Statistical Machine Translation for Indic Languages

Sudhansu Bala Das, Divyajoti Panda, Tapas Kumar Mishra, Bidyut Kr. Patra

Comments 32 pages, 1 figure, 4 tables

2208.07395 2026-03-04 cs.CL

Reproduction and Replication of an Adversarial Stylometry Experiment

Haining Wang, Patrick Juola, Allen Riddell

2202.09935 2026-03-04 cs.RO cs.HC

In the Arms of a Robot: Designing Autonomous Hugging Robots with Intra-Hug Gestures

Alexis E. Block, Hasti Seifi, Otmar Hilliges, Roger Gassert, Katherine J. Kuchenbecker

Comments 48 pages, 22 figures, 4 supplementary videos

2110.03427 2026-03-04 cs.LG cs.CL cs.SD eess.AS eess.SP

Is Attention always needed? A Case Study on Language Identification from Speech

Atanu Mandal, Santanu Pal, Indranil Dutta, Mahidas Bhattacharya, Sudip Kumar Naskar

Comments Accepted for publication in Natural Language Engineering

2109.13139 2026-03-04 cs.CV cs.CL

Multimodal Integration of Human-Like Attention in Visual Question Answering

Ekta Sood, Fabian Kögel, Philipp Müller, Dominike Thomas, Mihai Bace, Andreas Bulling

2109.13116 2026-03-04 cs.CV cs.CL

VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering

Ekta Sood, Fabian Kögel, Florian Strohm, Prajit Dhar, Andreas Bulling

Comments CoNLL 2021

2107.12220 2026-03-04 cs.LG cs.AI cs.CL cs.CV

Thought Flow Nets: From Single Predictions to Trains of Model Thought

Hendrik Schuff, Heike Adel, Ngoc Thang Vu

Comments 15 pages, 7 figures

2101.07679 2026-03-04 cs.RO cs.HC

The Six Hug Commandments: Design and Evaluation of a Human-Sized Hugging Robot with Visual and Haptic Perception

Alexis E. Block, Sammy Christen, Roger Gassert, Otmar Hilliges, Katherine J. Kuchenbecker

Comments 9 pages, 6 Figures, 2 Tables, ACM/IEEE Human-Robot Interaction (HRI) Conference 2021

2012.15834 2026-03-04 cs.LG cs.AI cs.IT math.DS math.IT

Loss Barcode: A Topological Measure of Escapability in Loss Landscapes

Serguei Barannikov, Daria Voronkova, Alexander Mironenko, Ilya Trofimov, Alexander Korotin, Grigorii Sotnikov, Evgeny Burnaev

1906.12314 2026-03-04 cs.AI

The Winnability of Klondike Solitaire and Many Other Patience Games

Charlie Blake, Ian P. Gent

Comments Published version, please cite the JAIR version published in Feb 2026, doi 10.1613/jair.1.17167

2603.02731 2026-03-04 cs.LG cs.AI

Practical FP4 Training for Large-Scale MoE Models on Hopper GPUs

Wuyue Zhang, Chongdong Huang, Chunbo You, Cheng Gu, Fengjuan Wang, Mou Sun

2603.02729 2026-03-04 cs.LG math.OC stat.ML

The power of small initialization in noisy low-tubal-rank tensor recovery

Zhiyu Liu, Haobo Geng, Xudong Wang, Yandong Tang, Zhi Han, Yao Wang

2603.02726 2026-03-04 cs.CV

Cross-view geo-localization, Image retrieval, Multiscale geometric modeling, Frequency domain enhancement

Hongying Zhang, ShuaiShuai Ma

2603.02724 2026-03-04 cs.SD cs.LG

Single Microphone Own Voice Detection based on Simulated Transfer Functions for Hearing Aids

Mathuranathan Mayuravaani, W. Bastiaan Kleijn, Andrew Lensen, Charlotte Sørensen

2603.02720 2026-03-04 cs.CV

TenExp: Mixture-of-Experts-Based Tensor Decomposition Structure Search Framework

Ting-Wei Zhou, Xi-Le Zhao, Sheng Liu, Wei-Hao Wu, Yu-Bang Zheng, Deyu Meng

2603.02712 2026-03-04 cs.CV cs.MM eess.IV

From "What" to "How": Constrained Reasoning for Autoregressive Image Generation

Ruxue Yan, Xubo Liu, Wenya Guo, Zhengkun Zhang, Ying Zhang, Xiaojie Yuan

2603.02711 2026-03-04 cs.AI

A Natural Language Agentic Approach to Study Affective Polarization

Stephanie Anneris Malvicini, Ewelina Gajewska, Arda Derbent, Katarzyna Budzynska, Jarosław A. Chudziak, Maria Vanina Martinez

Comments Accepted at ICAART 2026 (18th International Conference on Agents and Artificial Intelligence). The final published version is available in the conference proceedings (SCITEPRESS)

2603.02710 2026-03-04 cs.CV

MiM-DiT: MoE in MoE with Diffusion Transformers for All-in-One Image Restoration

Lingshun Kong, Jiawei Zhang, Zhengpeng Duan, Xiaohe Wu, Yueqi Yang, Xiaotao Wang, Dongqing Zou, Lei Lei, Jinshan Pan

Comments Project website: https://github.com/kkkls/MIM-DiT

2603.02704 2026-03-04 cs.CV cs.AI

Intelligent Pathological Diagnosis of Gestational Trophoblastic Diseases via Visual-Language Deep Learning Model

Yuhang Liu, Yueyang Cang, Wenge Que, Xinru Bai, Xingtong Wang, Kuisheng Chen, Jingya Li, Xiaoteng Zhang, Xinmin Li, Lixia Zhang, Pingge Hu, Qiaoting Xie, Peiyu Xu, Xianxu Zeng, Li Shi

Comments 29 pages, 3 figures

2603.02701 2026-03-04 cs.CL

Graph-GRPO: Stabilizing Multi-Agent Topology Learning via Group Relative Policy Optimization

Yueyang Cang, Xiaoteng Zhang, Erlu Zhao, Zehua Ji, Yuhang Liu, Yuchen He, Zhiyuan Ning, Chen Yijun, Wenge Que, Li Shi

2603.02695 2026-03-04 cs.LG

Addressing Missing and Noisy Modalities in One Solution: Unified Modality-Quality Framework for Low-quality Multimodal Data

Sijie Mai, Shiqin Han, Haifeng Hu

2603.02692 2026-03-04 cs.CV

FiDeSR: High-Fidelity and Detail-Preserving One-Step Diffusion Super-Resolution

Aro Kim, Myeongjin Jang, Chaewon Moon, Youngjin Shin, Jinwoo Jeong, Sang-hyo Park

Comments Accepted to CVPR 2026

2603.02691 2026-03-04 cs.CV

ReCo-Diff: Residual-Conditioned Deterministic Sampling for Cold Diffusion in Sparse-View CT

Yong Eun Choi, Hyoung Suk Park, Kiwan Jeon, Hyun-Cheol Park, Sung Ho Kang

Comments 10 pages, 4 figures. Submitted to MICCAI 2026