Global or local modeling for XGBoost in geospatial studies upon simulated data and German COVID-19 infection forecasting.

0 Người đánh giá. Xếp hạng trung bình 0

Tác giả: Ximeng Cheng, Jackie Ma

Ngôn ngữ: eng

Ký hiệu phân loại:

Thông tin xuất bản: England : Scientific reports , 2025

Mô tả vật lý:

Bộ sưu tập: NCBI

ID: 712169

Thêm vào giỏ Liên kết toàn văn

Methods from artificial intelligence (AI) and, in particular, machine learning and deep learning, have advanced rapidly in recent years and have been applied to multiple fields including geospatial analysis. Due to the spatial heterogeneity and the fact that conventional methods can not mine large data, geospatial studies typically model homogeneous regions locally within the entire study area. However, AI models can process large amounts of data, and, theoretically, the more diverse the train data, the more robust a well-trained model will be. In this paper, we study a typical machine learning method XGBoost, with the question: Is it better to build a single global or multiple local models for XGBoost in geospatial studies? To compare the global and local modeling, XGBoost is first studied on simulated data and then also studied to forecast daily infection cases of COVID-19 in Germany. The results indicate that if the data under different relationships between independent and dependent variables are balanced and the corresponding value ranges are similar, i.e., low spatial variation, global modeling of XGBoost is better for most cases
otherwise, local modeling of XGBoost is more stable and better, especially for the secondary data. Besides, local modeling has the potential of using parallel computing because each sub-model is trained independently, but the spatial partition of local modeling requires extra attention and can affect results.

Tạo bộ sưu tập với mã QR