LI Xihui, SHI Yuanyuan, ZHU Guanghu. Prediction of notifiable infectious diseases based on the combination of machine learning and time series methods[J]. Journal of Guilin University of Electronic Technology, 2024, 44(1): 87-92. DOI: 10.16725/j.1673-808X.2023184

Prediction of notifiable infectious diseases based on the combination of machine learning and time series methods

Abstract: Disease prediction is an important component of epidemic risk assessment. To explore the application of time series analysis and machine learning to forecasting infectious disease incidence trends, this paper uses monthly national data on notifiable infectious diseases in China from 2012 to 2022. Five models for predicting case counts were built: a traditional time series model (SARIMA), two machine learning models (SVR and a BP neural network), and two combinations of these (SARIMA-SVR and SARIMA-BPANN). Their predictive performance was then compared. For notifiable infectious diseases overall, the mean absolute percentage error (MAPE) of SARIMA-SVR was 6.85%, 7.48%, and 6.97% lower than that of the single SARIMA, SVR, and BP neural network models, respectively; for SARIMA-BPANN the corresponding reductions were 6.36%, 6.99%, and 6.48%. Similarly, for Class A and B infectious diseases and for Class C infectious diseases, the combined models were somewhat more accurate than the single models. These results indicate that the combined models SARIMA-SVR and SARIMA-BPANN have an advantage over single models in infectious disease prediction and can be extended to the forecasting and analysis of epidemic data.
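The comparison above rests on MAPE. As a minimal sketch of how that metric is usually computed, the following Python function uses the standard definition; the function name `mape` and the sample numbers are illustrative assumptions and are not taken from the paper or its data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Standard definition: the mean of |actual - predicted| / |actual|.
    Assumes every actual value is nonzero.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical monthly case counts and two hypothetical forecasts.
observed = [1000.0, 1200.0, 800.0]
single_model = [900.0, 1320.0, 720.0]    # each forecast is off by 10%
combined_model = [950.0, 1260.0, 760.0]  # each forecast is off by 5%

# Difference between the two MAPE values, in percentage points.
reduction = mape(observed, single_model) - mape(observed, combined_model)
```

Under this reading, a figure such as "6.85% lower" would be a difference between two MAPE values in percentage points, although the abstract does not say whether the reductions are absolute or relative.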
