Li H R, Li X H, Niu C, Zhang Y, Liu J H, Tan X F. 2025. Small sample classification of natural earthquakes and blasting based on GA-XGBoost. Acta Seismologica Sinica, 47(2): 221–231. DOI: 10.11939/jass.20230147

Small sample classification of natural earthquakes and blasting based on GA-XGBoost

  • Received Date: November 05, 2023
  • Revised Date: March 12, 2024
  • Accepted Date: April 07, 2024
  • Available Online: March 21, 2025
    The rapid development of seismic networks and advances in monitoring equipment have made it possible to record many kinds of seismic events, including natural earthquakes and man-made blasting. Nuclear explosions can also be detected through seismic monitoring, which is a crucial part of verifying the Comprehensive Nuclear-Test-Ban Treaty. However, distinguishing natural seismic events from those caused by blasting is challenging. Both appear as fluctuating curves on seismic records and closely resemble each other, so manual identification is resource-intensive and prone to human error, which can lead to misjudgments and confusion in earthquake catalogs. This in turn can compromise the effectiveness of earthquake early warning systems and emergency response measures. Automated discrimination between seismic events originating from natural sources and those caused by blasting is therefore of great significance for both earth science research and national defense.

    Current automatic classification techniques rely predominantly on deep learning, which typically requires extensive labeled datasets for training. Sufficient high-quality data for nuclear explosion events are difficult to obtain because such events are rare, which limits the application of deep learning to this problem. This paper therefore focuses on discriminating natural earthquakes from blasting when only limited sample data are available. The test data consist of vertical-component, short-period recordings of natural earthquakes and nuclear explosions. The recordings are first detrended with the smoothness priors approach (SPA). Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is then used to extract a series of intrinsic mode functions, wavelet thresholding is applied to reduce noise, and the denoised components are reconstructed to generate the final signals. The preprocessed signals are augmented by translation and noise injection, yielding a final training set of 500 event signals per category.
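
The augmentation step described above (translation plus noise injection) can be illustrated with a minimal sketch. The abstract does not specify how the shift is applied or how much noise is added, so the circular shift, the Gaussian noise model, and the parameters `max_shift` and `snr_db` are all assumptions made for illustration, as are the function names.

```python
import random

def translate(signal, shift):
    """Circularly shift a list of samples by `shift` (positive = delay).

    Assumption: a circular shift; the paper may pad or crop instead.
    """
    n = len(signal)
    shift %= n
    return signal[-shift:] + signal[:-shift] if shift else list(signal)

def inject_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise at the requested signal-to-noise ratio (dB)."""
    power = sum(x * x for x in signal) / len(signal)
    noise_std = (power / (10 ** (snr_db / 10))) ** 0.5
    return [x + rng.gauss(0.0, noise_std) for x in signal]

def augment(signal, n_copies, max_shift, snr_db, seed=0):
    """Return `n_copies` randomly shifted, noise-injected variants of `signal`."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    out = []
    for _ in range(n_copies):
        shifted = translate(signal, rng.randint(-max_shift, max_shift))
        out.append(inject_noise(shifted, snr_db, rng))
    return out
```

Calling `augment(record, n_copies=10, max_shift=50, snr_db=20)` on one preprocessed record would produce ten variants of the same length; repeating this over the available records is one way a small set of events could be expanded toward a fixed number of signals per category.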

    Features are extracted from the energy spectrum, power spectrum, and cepstrum to form a high-dimensional small-sample dataset. The eXtreme Gradient Boosting (XGBoost) model performs well in small-sample classification tasks, so this study adopts it, together with its strategy of aggregating weak classifiers, to classify natural and blasting seismic events. XGBoost improves accuracy by applying a second-order Taylor expansion to the objective function, which retains more information about the target. Because the conventional XGBoost model is complex and has many parameters, this paper uses a genetic algorithm (GA) to optimize the three hyperparameters that most strongly affect classification accuracy: the number of iterations, the maximum tree depth, and the learning rate. The GA is independent of initial conditions, robust, and well suited to complex optimization problems. The GA-XGBoost model is built on these strengths.
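
As a rough illustration of how a GA can search the three hyperparameters named above, the following sketch evolves a small population with tournament selection, uniform crossover, and mutation. It is not the authors' implementation: the search ranges in `BOUNDS`, the population settings, and the operators are assumptions, and `toy_fitness` is a hypothetical stand-in for the quantity that would be maximized in practice, such as the cross-validated accuracy of an XGBoost classifier trained with the candidate settings.

```python
import random

# Search ranges are illustrative assumptions, not the paper's settings.
BOUNDS = {
    "n_estimators": (50, 500),     # number of boosting iterations (int)
    "max_depth": (2, 10),          # maximum tree depth (int)
    "learning_rate": (0.01, 0.3),  # shrinkage factor (float)
}
INT_KEYS = {"n_estimators", "max_depth"}

def random_individual(rng):
    """Draw one candidate setting uniformly within BOUNDS."""
    return {k: (rng.randint(lo, hi) if k in INT_KEYS else rng.uniform(lo, hi))
            for k, (lo, hi) in BOUNDS.items()}

def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in BOUNDS}

def mutate(ind, rng, rate=0.2):
    """Resample each gene within its bounds with probability `rate`."""
    fresh = random_individual(rng)
    return {k: (fresh[k] if rng.random() < rate else ind[k]) for k in BOUNDS}

def tournament(pop, scores, rng, k=3):
    """Return the fittest of k randomly chosen individuals."""
    picks = rng.sample(range(len(pop)), k)
    return pop[max(picks, key=lambda i: scores[i])]

def genetic_search(fitness, pop_size=20, generations=15, seed=0):
    """Maximize `fitness` and return (best_individual, best_score)."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        for ind, score in zip(pop, scores):
            if score > best_score:
                best, best_score = dict(ind), score
        children = [dict(best)]  # elitism: keep the best candidate so far
        while len(children) < pop_size:
            child = crossover(tournament(pop, scores, rng),
                              tournament(pop, scores, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    return best, best_score

def toy_fitness(params):
    """Hypothetical placeholder objective peaking at (200, 6, 0.1).

    In practice this would return a validation accuracy instead.
    """
    return -(((params["n_estimators"] - 200) / 450) ** 2
             + ((params["max_depth"] - 6) / 8) ** 2
             + ((params["learning_rate"] - 0.1) / 0.29) ** 2)
```

Running `genetic_search(toy_fitness)` returns the best candidate found and its score. Unlike a grid search, which evaluates every combination on a fixed lattice, the GA evaluates only `pop_size * generations` candidates, which is one plausible reason for a shorter runtime.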

    In tests on the high-dimensional small-sample dataset, the GA-XGBoost model achieved its highest classification accuracy, 94.094%, on the power spectrum feature set. With the power spectrum feature as input, GA-XGBoost was more accurate than both the LSTM and the GS (grid search)-XGBoost models. Compared with GS-XGBoost, it improved classification accuracy by 2.037% and cut the runtime from 409.26 seconds to 55.48 seconds, a reduction of more than 86%. However, the preprocessing and feature extraction process presented in this paper is relatively complex and requires professional expertise. Moreover, the paper uses a 1500-dimensional power spectrum feature, and different feature dimensions can affect the test results. The feature dimension should therefore be chosen to suit the specific data and tests.
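
To make the notion of a fixed-dimension power spectrum feature concrete, here is a minimal sketch that computes a one-sided periodogram and keeps a chosen number of bins. How the paper arrives at 1500 dimensions is not stated in the abstract, so the periodogram normalization, the truncate-or-zero-pad rule, and the function names are assumptions. A naive DFT is used only to keep the example dependency-free; an FFT routine would be used in practice.

```python
import cmath
import math

def power_spectrum(signal):
    """One-sided periodogram |X[k]|^2 / N, computed with a naive O(N^2) DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def fixed_length_features(signal, dim):
    """Keep the first `dim` spectral bins, zero-padding if fewer exist.

    Assumption: truncation/padding is only one way to fix the dimension.
    """
    spec = power_spectrum(signal)
    return (spec + [0.0] * dim)[:dim]

# Example: a sine with 4 cycles over 64 samples concentrates power in bin 4.
example = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
features = fixed_length_features(example, dim=16)
```

Changing `dim` changes how much of the frequency range reaches the classifier, which is one way the feature dimension can influence the results.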

    Although the tests verify that high-dimensional features such as the power spectrum, energy spectrum, and cepstrum are effective for classification, the optimal features may vary from dataset to dataset, so the most suitable features should be selected by testing on the available data. Finally, this study explores hyperparameter optimization with a genetic algorithm only; newer optimization algorithms may also be worth investigating for hyperparameter selection. Overall, the tests show that the GA-XGBoost model balances accuracy, stability, and efficiency, and that it is promising for small-sample classification tasks.

