Unsupervised Multimodal Graph-based Model for Geo-social Analysis
Ehsaneddin Jalilian, Bernd Resch
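AI summary: This paper proposes an unsupervised multimodal graph model that integrates semantic and geographic information to improve content analysis in disaster management and public opinion monitoring.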
The systematic analysis of user-generated social media content, especially when enriched with geospatial context, plays a vital role in domains such as disaster management and public opinion monitoring. Although multimodal approaches have made significant progress, most existing models remain fragmented, processing each modality separately rather than integrating them into a unified end-to-end model. To address this, we propose an unsupervised, multimodal graph-based methodology that jointly embeds semantic and geographic information into a shared representation space. The proposed methodology comprises two architectural paradigms: a mono-graph (MonoGraph) model that jointly encodes both modalities, and a multi-graph (MultiGraph) model that models semantic and geographic relationships separately and then integrates them through multi-head attention. A composite loss, combining contrastive, coherence, and alignment objectives, guides the learning process to produce semantically coherent and spatially compact clusters. Experiments on four real-world disaster datasets demonstrate that our models consistently outperform existing baselines in topic quality, spatial coherence, and interpretability. Inherently domain-independent, the framework can be readily extended to diverse forms of multimodal data and a wide range of downstream analysis tasks.
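The abstract names three objectives in the composite loss but does not say how they are combined. As a minimal sketch, assuming the common choice of a weighted sum, the combination could look like the following. The names `LossWeights` and `composite_loss` and the default weights of 1.0 are hypothetical illustrations, not taken from the paper, and the three inputs stand in for loss values that the model would compute elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical weights; the abstract does not state how the terms are balanced."""
    contrastive: float = 1.0
    coherence: float = 1.0
    alignment: float = 1.0


def composite_loss(contrastive: float, coherence: float, alignment: float,
                   weights: LossWeights = LossWeights()) -> float:
    """Weighted sum of the three objectives named in the abstract (assumed form)."""
    return (weights.contrastive * contrastive
            + weights.coherence * coherence
            + weights.alignment * alignment)
```

Under this assumption, setting a weight to zero removes that objective, which is one simple way to run an ablation over the three terms.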