Data and knowledge dual-driven TransSeisNet for seismicity prediction

  • Abstract: As earthquake catalog data continue to grow, traditional statistical methods face substantial limitations in how fully they can exploit those data. Machine learning methods can process large volumes of data effectively, but they lack interpretability and relevant physical constraints. This paper proposes TransSeisNet, a model jointly driven by earthquake catalog data and statistical seismology knowledge, to address these problems. The model takes the Transformer as its core architecture and performs data-driven modeling on earthquake occurrence times; to strengthen physical plausibility, TransSeisNet embeds empirical seismological constraints in the output layer and the loss function. The model outputs seismicity parameters and forecasts future seismic activity. Experiments show that, compared with the traditional epidemic-type aftershock sequence (ETAS) model, TransSeisNet is markedly more accurate and stable in seismicity prediction, outperforming the ETAS baseline on both real and synthetic earthquake catalogs. These results offer a new approach to seismicity analysis.


    Abstract:
    Seismicity prediction remains a critical challenge in seismology, requiring a delicate balance between data-driven insights and domain-specific physical principles. Traditional statistical methods, such as the epidemic-type aftershock sequence (ETAS) model, have long served as fundamental tools for analyzing earthquake catalogs. However, these approaches struggle to fully utilize rapidly growing seismic data due to their reliance on simplified parametric assumptions and limited adaptability to complex spatiotemporal patterns. On the other hand, purely data-driven machine learning models, while capable of processing high-dimensional datasets, often produce predictions that lack physical interpretability and may violate established seismological laws. To bridge this gap, this study proposes TransSeisNet, a hybrid framework that synergizes the computational power of deep learning with the empirical rigor of statistical seismology. By directly embedding domain knowledge into the model architecture and optimization process, TransSeisNet achieves both high predictive accuracy and adherence to physical constraints, providing a robust solution for earthquake forecasting.
    Methodological framework
    The TransSeisNet architecture is based on the Transformer neural network paradigm, renowned for its ability to model long-term dependencies in sequential data through self-attention mechanisms. The model processes earthquake catalogs (continuous records of seismic events containing temporal, spatial, and magnitude information) to predict future seismic activity. Key innovations include: ① Physical constraint layer. The physical constraint layer is integrated into the output layer to enforce compliance with empirical seismological laws. For instance, the magnitude distribution of predicted events is normalized to follow the power-law relationship of the Gutenberg-Richter (GR) Law, ensuring model outputs conform to observed frequency-magnitude scaling. Additionally, temporal clustering patterns, such as the rapid aftershock decay described by the Omori-Utsu Law, are explicitly encoded to prevent non-physical predictions. ② Knowledge-guided loss function. The training objective combines conventional negative log-likelihood terms with regularization terms derived from statistical seismology. For example, deviations from the GR Law’s b-value or violations of Omori-Utsu decay parameters are penalized during optimization. This dual-objective approach ensures simultaneous optimization of data fidelity and physical consistency.
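The knowledge-guided loss described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' exact formulation: the maximum-likelihood b-value estimator (Aki, 1965) and the penalty weight lam_gr are assumptions.

```python
import math

def gr_bvalue(mags, m_min):
    """Maximum-likelihood b-value estimate (Aki, 1965):
    b = log10(e) / (mean(M) - Mmin)."""
    mean_m = sum(mags) / len(mags)
    return math.log10(math.e) / (mean_m - m_min)

def knowledge_guided_loss(nll, pred_mags, m_min, b_target=1.0, lam_gr=0.1):
    """Data negative log-likelihood plus a Gutenberg-Richter regularizer
    that penalizes deviation of the predicted catalog's b-value from a
    target value (dual-objective: data fidelity + physical consistency)."""
    b_pred = gr_bvalue(pred_mags, m_min)
    return nll + lam_gr * (b_pred - b_target) ** 2
```

An Omori-Utsu decay penalty would be added analogously, comparing fitted decay parameters of the predicted aftershock rate against empirical targets.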
    Integration of domain knowledge
    TransSeisNet systematically incorporates three pillars of statistical seismology: ① Gutenberg-Richter Law. It constrains the predicted event magnitude-frequency distribution to follow a power-law scaling, preventing unrealistic overprediction of large-magnitude events. ② ETAS model characteristics. The self-attention mechanism implicitly captures the ETAS model’s core premise — earthquakes can trigger subsequent events — by modeling temporal and spatial triggering probabilities. ③ Omori-Utsu decay. It regularizes the temporal decay of aftershock productivity to align with empirically observed trends, ensuring predicted aftershock sequences decay at rates consistent with historical observations.
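The first and third pillars have closed-form expressions that can be written down directly; the parameter defaults below are illustrative only, not values fitted in the study.

```python
def gr_count(m, a=4.0, b=1.0):
    """Gutenberg-Richter law: expected number of events with magnitude >= m,
    from log10 N(m) = a - b*m (a and b here are illustrative values)."""
    return 10.0 ** (a - b * m)

def omori_utsu_rate(t, K=100.0, c=0.1, p=1.1):
    """Omori-Utsu law: aftershock rate t days after the mainshock,
    n(t) = K / (t + c)**p (parameter values illustrative)."""
    return K / (t + c) ** p
```

Constraining predicted catalogs to match these curves is what prevents, respectively, overprediction of large-magnitude events and non-physical aftershock decay.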
    Experimental results
    TransSeisNet was rigorously evaluated on two real-world earthquake catalogs and one synthetic catalog: ① Southern California catalog, covering the San Jacinto fault zone (1981−2023) and containing over 12 000 events with magnitudes M≥2.0; ② Japan Meteorological Agency catalog, covering the Tohoku and Kanto regions (1990−2023) and containing over 20 000 events, with emphasis on subduction zone seismicity; ③ Synthetic catalog, generated using ETAS parameters to validate the model’s ability to recover known triggering dynamics. Performance highlights: ① Superior accuracy. TransSeisNet consistently outperformed the benchmark ETAS model in predicting the timing, location, and magnitude of seismic events across all catalogs. For instance, in the Southern California catalog, the model demonstrated a 30% improvement in likelihood-based evaluation metrics compared to ETAS. ② Enhanced stability. By incorporating physical constraints, TransSeisNet exhibited reduced sensitivity to data noise and outliers, producing stable predictions even during periods of high seismic activity (e.g., aftershock sequences following major earthquakes). ③ Generalization capability. The model maintained robust performance across diverse tectonic settings, including strike-slip fault systems (San Jacinto) and subduction zones (Japan), highlighting its adaptability to varying seismogenic regimes.
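A synthetic ETAS catalog of the kind used above is driven by the model's conditional intensity, which the triggering experiment asks TransSeisNet to recover. A minimal temporal-only sketch (parameter values hypothetical, not those used to generate the test catalog):

```python
import math

def etas_intensity(t, history, mu=0.1, K=0.05, alpha=1.5, c=0.01, p=1.1, m0=2.0):
    """Temporal ETAS conditional intensity lambda(t | history): a constant
    background rate mu plus an Omori-type triggering term for every past
    event (t_i, m_i), scaled exponentially by its magnitude above cutoff m0."""
    lam = mu
    for t_i, m_i in history:
        if t_i < t:  # only past events can trigger
            lam += K * math.exp(alpha * (m_i - m0)) / (t - t_i + c) ** p
    return lam
```

Simulating events from this intensity (e.g., by Ogata's thinning algorithm) yields a catalog with known triggering parameters against which recovered dynamics can be checked.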
    Model optimization and analysis
    Architectural depth: Comparative studies of model depth revealed that a six-layer Transformer configuration achieved optimal performance, balancing computational efficiency and predictive power. Shallower architectures (e.g., four layers) exhibited underfitting on complex sequences, while deeper models (more than eight layers) showed diminishing returns. Activation function selection: Experiments with ReLU, ELU, and Swish activations indicated minimal performance differences, though ELU marginally improved training stability due to its smooth gradient properties. Comparative analysis with machine learning baselines: TransSeisNet outperformed alternative machine learning architectures, including LSTM and GRU networks, which struggled to simultaneously capture long-term dependencies and enforce physical constraints.
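The three candidate activations can be written out directly; ELU's exponential negative branch is what gives the smooth gradients credited for its marginal stability gain (the alpha = 1.0 default is an assumption, not stated in the study).

```python
import math

def relu(x):
    """ReLU: hard zero for x < 0; the gradient jumps at the origin."""
    return max(0.0, x)

def elu(x, alpha=1.0):
    """ELU: exponential negative branch, so the gradient varies smoothly
    near zero instead of being clipped (alpha = 1.0 is an assumed default)."""
    return x if x > 0.0 else alpha * (math.exp(x) - 1.0)

def swish(x):
    """Swish (SiLU): x * sigmoid(x), smooth and non-monotonic near zero."""
    return x / (1.0 + math.exp(-x))
```

For negative inputs, ELU saturates toward -alpha rather than cutting to zero, which tends to keep gradient magnitudes better behaved during training.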
    Conclusion
    TransSeisNet represents a significant advancement in seismicity prediction by unifying data-driven machine learning with empirical seismological principles. Its dual knowledge-data-driven framework addresses limitations of both traditional statistical methods (e.g., rigid parametric assumptions) and purely machine learning approaches (e.g., lack of interpretability). The model’s success underscores the value of integrating domain knowledge into neural network design, particularly in geophysical applications where physical plausibility is paramount. Future work will focus on extending the framework to incorporate real-time geodetic data (e.g., GNSS measurements) and multi-physics simulations, further enhancing its utility for operational earthquake forecasting and hazard assessment.
