A new deep neural network for phase picking with balanced speed and accuracy
Abstract: Deep neural networks (DNNs) have achieved good results in seismic phase picking. As highly complex machine learning models, however, they pay a high computational cost for their accuracy, and experiments indicate that phase picking does not require very high model complexity. Based on the characteristics of seismic waveforms, we therefore designed four improved DNN models of different complexity, from which a suitable model can be chosen according to the accuracy and speed required. We then compared these models with four existing deep learning models for arrival-time picking; the results show that our models achieve relatively high speed and accuracy. In addition, by applying several model compression techniques, the phase-picking model can be compressed to under 2.0 MB, which allows high-speed phase picking on low-power devices while keeping the loss of accuracy as small as possible.
Figure 6. Error statistics of the different DNN models for the P phase (a) and the S phase (b)
RE denotes networks in which conventional convolutions are replaced by depthwise separable convolutions; FP16 denotes 16-bit floating-point computation. The edges of each box are the upper and lower quartiles, the green line is the median, the triangle is the mean, and the short black lines are the data limits. The closer together the upper and lower quartiles and the closer the mean to zero, the better the result.
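The quantities drawn in each box of Figure 6 can be computed directly from a list of pick errors. The sketch below is only illustrative: the error values are hypothetical, the helper name `box_stats` is our own, and the "inclusive" quartile convention is an assumption, since the caption does not say which convention the figure uses.

```python
import statistics

def box_stats(errors):
    """Return the summary values shown in a box plot of pick errors (seconds)."""
    # Quartiles with linear interpolation between data points (assumed convention).
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,                      # height of the box
        "mean": statistics.fmean(errors),    # the triangle marker
    }

# Hypothetical pick errors: predicted minus manually picked arrival time, in seconds.
errors = [-0.2, -0.1, 0.0, 0.1, 0.2]
stats = box_stats(errors)
print(stats)
```

A model whose `iqr` is small and whose `mean` is close to zero would appear in the figure as a narrow box centred on zero.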
Figure 7. Effect of different noise on model prediction
RE denotes networks in which conventional convolutions are replaced by depthwise separable convolutions; FP16 denotes 16-bit floating-point computation. (a) Analysis of local-earthquake waveforms; (b) Analysis of failed P-phase picks for teleseismic events
Figure 8. ROC curves of the eight tested DNN models under different noise conditions (signal-to-noise ratios, SNR)
The larger the area under the ROC curve, the better the result. RE denotes networks optimized with depthwise separable convolutions; FP16 denotes 16-bit floating-point computation. (a) ROC curves for P-phase picking; (b) ROC curves for S-phase picking
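For readers unfamiliar with the metric, the following is a minimal sketch of how an ROC curve and its area can be obtained from model scores. It assumes each sample carries a binary label (phase present or not) and a score such as the network's output probability; the caption does not describe how the curves in Figure 8 were actually computed, and the data and function names here are hypothetical.

```python
def roc_curve(labels, scores):
    """Sweep a threshold over the scores and return (fpr, tpr) lists."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    fpr, tpr = [0.0], [0.0]
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples sharing the same score cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))

# Hypothetical labels (1 = phase present) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_curve(labels, scores)
print(auc(fpr, tpr))  # 8/9, about 0.889
```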
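Table 1 Seven-layer CNN

| Layer | Layer type | Kernel size | Number of features | Dilation rate | Activation | Regularization |
|---|---|---|---|---|---|---|
| 1 | Convolution | 3 | 32 | 1 | LeakyReLU | BN |
| 2 | Convolution | 3 | 64 | 1 | LeakyReLU | BN |
| 3 | Convolution | 3 | 128 | 1 | LeakyReLU | BN |
| 4 | Convolution | 3 | 128 | 1 | LeakyReLU | BN |
| 5 | Convolution | 3 | 128 | 2 | LeakyReLU | BN |
| 6 | Convolution | 3 | 128 | 4 | LeakyReLU | BN |
| 7 | Convolution | 3 | 128 | 8 | LeakyReLU | BN |
| 8 | Fully connected | - | 3 | - | None | None |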
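The dilation rates in Table 1 widen the span of input samples that each output sample can see. The sketch below computes that receptive field from the kernel sizes and dilation rates in the table. It assumes a stride of 1 and no pooling, which the table does not state, and the function name is our own.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field (in samples) of stacked stride-1 convolutions."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        # Each layer extends the field by (kernel - 1) * dilation samples.
        rf += (k - 1) * d
    return rf

kernels = [3] * 7
dilations = [1, 1, 1, 1, 2, 4, 8]            # from Table 1

rf_dilated = receptive_field(kernels, dilations)
rf_plain = receptive_field(kernels, [1] * 7)  # same stack without dilation

print(rf_dilated)  # 37
print(rf_plain)    # 15
```

Under these assumptions the dilated stack covers 37 input samples per output sample, against 15 for the same seven layers without dilation, at no extra cost in parameters.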
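Table 2 Three-layer CNN

| Layer | Layer type | Kernel size | Number of features | Activation | Regularization |
|---|---|---|---|---|---|
| 1 | Convolution | 5 | 32 | LeakyReLU | BN |
| 2 | Convolution | 5 | 64 | LeakyReLU | BN |
| 3 | Convolution | 5 | 128 | LeakyReLU | BN |
| 4 | Fully connected | - | 3 | None | None |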
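The layer lists in Tables 1 and 2 are enough to estimate the number of trainable parameters. The sketch below does so under assumptions that the tables do not state: a single-channel input, one-dimensional convolutions without bias terms, two trainable BN parameters per feature channel, and a final fully connected layer with bias that maps 128 features to 3 outputs. With these assumptions the totals coincide with the values listed for the two networks in Table 5. The function names are our own.

```python
def conv_bn_params(in_ch, out_ch, kernel):
    """Parameters of a bias-free 1-D convolution followed by batch normalization."""
    conv = kernel * in_ch * out_ch
    bn = 2 * out_ch            # scale and shift per channel
    return conv + bn

def cnn_params(kernel, features, in_ch=1, n_out=3):
    """Total trainable parameters of a conv stack plus a fully connected output."""
    total = 0
    for out_ch in features:
        total += conv_bn_params(in_ch, out_ch, kernel)
        in_ch = out_ch
    total += in_ch * n_out + n_out   # fully connected layer with bias
    return total

# Dilation does not change the number of weights, so it is not an argument here.
seven_layer = cnn_params(3, [32, 64, 128, 128, 128, 128, 128])  # Table 1
three_layer = cnn_params(5, [32, 64, 128])                      # Table 2

print(seven_layer)  # 229283
print(three_layer)  # 52195
```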
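Table 3 The encoder-decoder model (U-Net)

| Layer | Layer type | Kernel size | Number of features | Downsampling rate | Activation | Regularization |
|---|---|---|---|---|---|---|
| 1 | Convolution ×2 + pooling | 3 | 32 | 2 | LeakyReLU | BN |
| 2 | Convolution ×2 + pooling | 3 | 64 | 2 | LeakyReLU | BN |
| 3 | Convolution ×2 + pooling | 3 | 128 | 2 | LeakyReLU | BN |
| 4 | Convolution ×2 + pooling | 3 | 256 | 2 | LeakyReLU | BN |
| 5 | Transposed convolution + convolution ×2 | 3 | 256, 128 | 0.5 | LeakyReLU | BN |
| 6 | Transposed convolution + convolution ×2 | 3 | 128, 64 | 0.5 | LeakyReLU | BN |
| 7 | Transposed convolution + convolution ×2 | 3 | 64, 32 | 0.5 | LeakyReLU | BN |
| 8 | Fully connected | - | 3 | - | None | None |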
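Table 4 Bidirectional RNN used for phase picking

| Layer | Layer type | Kernel size | Number of features | Activation | Regularization |
|---|---|---|---|---|---|
| 1 | Convolution | 3 | 32 | LeakyReLU | BN |
| 2 | Convolution | 3 | 64 | LeakyReLU | BN |
| 3 | Convolution | 3 | 128 | LeakyReLU | BN |
| 4 | GRU (bidirectional) | - | 128 | tanh | None |
| 5 | GRU (bidirectional) | - | 128 | tanh | None |
| 6 | Fully connected | - | 3 | None | None |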
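Table 5 Performance statistics on CPU and GPU

| Network | Trainable parameters | Inference time for 15 samples on CPU (s) | Inference time for 15 samples on GPU (s) |
|---|---|---|---|
| 3-layer CNN | 52 195 | 0.12 | 0.0088 |
| 7-layer CNN | 229 283 | 0.51 | 0.025 |
| Encoder-decoder | 1 748 771 | 0.67 | 0.030 |
| CNN+RNN | 476 195 | 1.2 | 1.0 |
| WaveNet | 2 715 651 | 4.5 | 0.17 |
| 7-layer CNN (RE) | 78 758 | 0.19 | 0.021 |
| 7-layer CNN (RE, FP16) | 78 758 | 16 | 0.020 |
| WaveNet (RE) | 770 692 | 2.4 | 0.17 |

Note: RE denotes optimization with depthwise separable convolutions; FP16 denotes 16-bit floating-point computation.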
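The parameter counts in Table 5 give a rough lower bound on model size. The sketch below multiplies each count by the bytes per parameter (4 for 32-bit and 2 for 16-bit floating point). It is only an estimate of the weights themselves: it takes 1 MB as 10^6 bytes and ignores file-format overhead and any non-trainable values, none of which the table specifies. The function name is our own.

```python
# Trainable parameter counts taken from Table 5.
PARAMS = {
    "3-layer CNN": 52195,
    "7-layer CNN": 229283,
    "Encoder-decoder": 1748771,
    "CNN+RNN": 476195,
    "WaveNet": 2715651,
    "7-layer CNN (RE)": 78758,
    "WaveNet (RE)": 770692,
}

def weight_size_mb(n_params, bytes_per_param):
    """Size of the weights alone, in MB (assumed here to mean 10**6 bytes)."""
    return n_params * bytes_per_param / 1e6

for name, n in PARAMS.items():
    fp32 = weight_size_mb(n, 4)   # 32-bit floating point
    fp16 = weight_size_mb(n, 2)   # 16-bit floating point
    print(f"{name:18s} FP32 {fp32:6.2f} MB   FP16 {fp16:6.2f} MB")
```

On this estimate the WaveNet weights occupy about 10.9 MB in 32-bit precision, whereas WaveNet (RE) needs about 1.5 MB and the 7-layer CNN (RE) about 0.16 MB in 16-bit precision.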