Estimation of Soil Organic Carbon Stocks Utilizing Machine Learning Algorithms and Multi-source Geospatial Data in Coastal Wetlands of Tianjin and Hebei, China
Rui YANG, Mingyue LIU, Yongbin ZHANG, Weidong MAN, Jingfen TONG, Dong LIU, Qingwen ZHANG, Caiyao KOU, Xiang LI, Yahui LIU, Di TIAN, Xuan YIN, Jiannan HE
Abstract
Coastal wetlands are crucial ‘blue carbon sinks’ and contribute significantly to regulating climate change. This study used 160 soil samples, 35 remote sensing features, and 5 geo-climatic variables to accurately estimate the soil organic carbon stocks (SOCS) in the coastal wetlands of Tianjin and Hebei, China. To reduce data redundancy, simplify the models, and improve model interpretability, Pearson correlation analysis (PsCA), Boruta, and recursive feature elimination (RFE) were employed to select features. Using the selected features, soil organic carbon density (SOCD) prediction models were built with the multivariate adaptive regression splines (MARS), extreme gradient boosting (XGBoost), and random forest (RF) algorithms, then applied to predict the spatial distribution of SOCD and estimate the SOCS of different wetland types in 2020. The results show that: 1) Different feature combinations have a significant influence on model performance. Models built on RFE-based feature combinations attained better prediction performance. RF has the best prediction accuracy (R² = 0.587, RMSE = 0.798 kg/m², MAE = 0.660 kg/m²). 2) Optical features are more important than radar and geo-climatic features in the MARS, XGBoost, and RF algorithms. 3) SOCS depends on both the SOCD and the area of each wetland type; aquaculture ponds have the highest SOCS, followed by marshes, salt pans, mudflats, and sand shores.
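The accuracy metrics and the density-to-stock relationship mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the sample values, and the wetland labels `type_a` and `type_b` are hypothetical. It also assumes that R² is the coefficient of determination and that a wetland type's stock is its mean SOCD multiplied by its area; the abstract specifies neither.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values.

    R2 is computed as the coefficient of determination, 1 - SS_res / SS_tot
    (an assumption; the abstract does not state which definition was used).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


def socs_tonnes(mean_socd_kg_m2, area_km2):
    """Carbon stock in tonnes: kg/m2 * km2 * 1e6 m2/km2 / 1e3 kg/t."""
    return mean_socd_kg_m2 * area_km2 * 1e6 / 1e3


# Hypothetical SOCD values (kg/m2), for illustration only.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]
r2, rmse, mae = regression_metrics(observed, predicted)

# Hypothetical wetland types: (mean SOCD in kg/m2, area in km2).
wetlands = {"type_a": (3.0, 100.0), "type_b": (4.0, 20.0)}
stocks = {name: socs_tonnes(socd, area) for name, (socd, area) in wetlands.items()}

# Rank wetland types from largest to smallest stock.
ranked = sorted(stocks, key=stocks.get, reverse=True)
```

In this made-up example `type_a` has the lower density but the larger stock because it covers five times the area, which shows how the SOCS ranking can differ from the SOCD ranking.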