LMT: A Bayesian Framework for Causal Discovery from Textual Alarm Records in Manufacturing Systems
Xiaofeng Xiao, Jianhong Chen, Qiuzhuang Sun, Naichen Shi, Xubo Yue
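AI summary: Proposes the LMT framework, which combines semantic signals extracted by large language models with Poisson-process-based temporal evidence and uses a Bayesian approach to discover causal graphs from textual alarm records; it performs especially well in small-sample settings.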
Comments: 19 pages
Textual event records, such as alarm logs, have become an increasingly common data source in engineering and manufacturing systems. Beyond identifying correlations or recurring patterns, engineers are often interested in understanding which types of events causally trigger or influence other events during system operation. Textual event descriptions may contain semantic clues about such causal relationships, and recent large language models (LLMs) provide a promising tool for extracting these signals. However, relying solely on LLM-encoded textual information is insufficient for accurate causal discovery, since semantic patterns do not directly reveal causal mechanisms and may confuse causation with correlation or frequent sequential patterns. To address these challenges, we propose LMT, a Bayesian causal discovery framework for engineering event data that jointly leverages textual descriptions and timestamps. Specifically, LMT first uses LLMs to extract semantic causal signals from event descriptions and constructs a prior distribution over causal graphs among event types or event clusters. It then incorporates temporal evidence through a Poisson-process-based likelihood, allowing the LLM-informed prior to be refined by timestamp-based statistical evidence. By integrating textual and temporal information, LMT produces a causal graph that is both interpretable and data-supported. Simulation studies show that the proposed framework is effective across different settings and is especially advantageous in small-sample alarm-event scenarios.
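The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of refining a prior edge probability with Poisson-process timestamp evidence, not the LMT algorithm itself. Everything in it is an illustrative assumption: the function names, the fixed influence window, the piecewise-constant rate model (one rate inside windows that follow a cause event, another outside), the excitation-only restriction, and the BIC-style penalty used in place of a full marginal likelihood. The value `prior_prob` stands in for whatever edge probability an LLM-derived prior would supply.

```python
import bisect
import math


def covered_time(sorted_causes, window, horizon):
    """Length of the union of [a, a + window] intervals, clipped to [0, horizon]."""
    total, current_end = 0.0, 0.0
    for a in sorted_causes:
        start = max(a, current_end)
        end = min(a + window, horizon)
        if end > start:
            total += end - start
            current_end = end
    return total


def in_window(t, sorted_causes, window):
    """True if time t falls within `window` after the most recent cause event."""
    i = bisect.bisect_right(sorted_causes, t) - 1
    return i >= 0 and t - sorted_causes[i] <= window


def max_loglik(count, exposure):
    """Maximised homogeneous-Poisson log-likelihood: n * log(n / T) - n."""
    if count == 0 or exposure <= 0:
        return 0.0
    return count * math.log(count / exposure) - count


def edge_posterior(prior_prob, cause_times, effect_times, window, horizon):
    """Posterior probability of a hypothetical edge cause -> effect.

    prior_prob must lie strictly between 0 and 1 (assumed to come from an LLM).
    """
    causes = sorted(cause_times)
    n = len(effect_times)
    n_in = sum(in_window(t, causes, window) for t in effect_times)
    t_in = covered_time(causes, window, horizon)
    t_out = horizon - t_in

    # Null model: a single constant rate for the effect over [0, horizon].
    ll_null = max_loglik(n, horizon)

    # Edge model: separate rates inside and outside the post-cause windows.
    rate_in = n_in / t_in if t_in > 0 else 0.0
    rate_out = (n - n_in) / t_out if t_out > 0 else 0.0
    if rate_in > rate_out:
        ll_edge = max_loglik(n_in, t_in) + max_loglik(n - n_in, t_out)
    else:
        # Assumption: only excitation counts as evidence for an edge.
        ll_edge = ll_null

    # BIC-style penalty for the one extra rate parameter (an approximation).
    penalty = 0.5 * math.log(max(n, 1))
    log_odds = math.log(prior_prob / (1.0 - prior_prob)) + (ll_edge - ll_null) - penalty

    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    z = math.exp(log_odds)
    return z / (1.0 + z)


# Toy data: alarm A fires every 10 time units; alarm B follows 0.5 units later.
a_times = [10.0 * k for k in range(1, 10)]
b_follow = [a + 0.5 for a in a_times]
b_unrelated = [5.0 + 10.0 * k for k in range(10)]

print(edge_posterior(0.5, a_times, b_follow, window=1.0, horizon=100.0))
print(edge_posterior(0.5, a_times, b_unrelated, window=1.0, horizon=100.0))
```

In this toy setup the timestamps push a neutral prior of 0.5 towards 1 when B consistently follows A, and below 0.5 when B never falls inside A's windows. A stronger prior shifts the posterior in the same direction, which is the sense in which semantic and temporal evidence are combined.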