Abstract:
Accurate extraction of surface water extent is a prerequisite for monitoring its dynamics. Although machine learning algorithms are widely applied to surface water mapping, most studies report only the algorithms' outputs; few systematically evaluate their applicability, and classification accuracy remains limited. In this study, we used Sentinel-2 imagery of the Songnen Plain in Northeast China, acquired during 2020–2021 and accessed via the Google Earth Engine (GEE) platform, to evaluate the performance of Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) for surface water classification. We optimized the classification process by incorporating automated training sample selection and integrating time series features. Validation with independent samples demonstrated the feasibility of automatic sample selection, yielding mean overall accuracies of 91.16%, 90.99%, and 90.76% for RF, SVM, and CART, respectively. After integrating time series features, the mean overall accuracies of the three algorithms improved by 4.51%, 5.45%, and 6.36%, respectively. In addition, spectral features such as the Modified Normalized Difference Water Index (MNDWI), shortwave infrared (SWIR), and the Normalized Difference Vegetation Index (NDVI) were identified as particularly important for surface water classification. This study establishes a more consistent framework for surface water mapping and offers new perspectives for improving and automating classification in the era of big and open data.
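
As an illustration of the spectral indices named above, the following minimal sketch computes MNDWI and NDVI from their standard normalized-difference definitions, MNDWI = (Green - SWIR) / (Green + SWIR) and NDVI = (NIR - Red) / (NIR + Red). The function names and the reflectance values are hypothetical and chosen only for illustration; the sketch does not reproduce the study's GEE workflow, and which Sentinel-2 bands the authors used for each term is not stated here.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def mndwi(green: float, swir: float) -> float:
    """Modified Normalized Difference Water Index from green and SWIR reflectance."""
    return normalized_difference(green, swir)


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)


# Hypothetical surface reflectance values (illustrative, not from the study).
water_pixel = {"green": 0.06, "red": 0.04, "nir": 0.02, "swir": 0.01}
vegetation_pixel = {"green": 0.08, "red": 0.05, "nir": 0.40, "swir": 0.20}

water_mndwi = mndwi(water_pixel["green"], water_pixel["swir"])
water_ndvi = ndvi(water_pixel["nir"], water_pixel["red"])
veg_mndwi = mndwi(vegetation_pixel["green"], vegetation_pixel["swir"])
veg_ndvi = ndvi(vegetation_pixel["nir"], vegetation_pixel["red"])
```

With these illustrative values the water pixel has a positive MNDWI and a negative NDVI, while the vegetation pixel shows the opposite signs, which is the kind of contrast a classifier can exploit.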