
Cross-dataset benchmarking of machine learning models for marine and atmospheric environmental prediction


by Xuehua Zhou, Hanming Zhang, Tiantian Du, Quanbo Yuan, Huijuan Wang

Accurate prediction of marine and atmospheric environmental variables is important for climate adaptation, ecosystem management, and operational decision-making, yet practitioners still lack clear guidance on which machine-learning models are reliable across heterogeneous environmental tasks. We therefore developed a unified, leakage-aware benchmark across nine datasets, of which seven passed quality checks for modeling, spanning chlorophyll-a, wind speed, hydrographic observations, biotoxins, and bathymetry, and compared representative linear, tree-based, and sequence models under a common evaluation framework. Results show strong heterogeneity across tasks and model classes: tree ensembles are robust baselines for tabular problems, LSTM-based recurrent sequence modeling is most useful when temporal structure is central, and predictive skill depends more on target structure and covariate quality than on model complexity alone. Within the observational settings represented in this benchmark, predominantly Chinese coastal/estuarine and regional marine datasets, plus one atmospheric reanalysis wind task and one global cast archive, quality-controlled chlorophyll-a is comparatively predictable, whereas event-driven biotoxins and bathymetry inversion remain difficult under the current predictors. These findings provide practical guidance for researchers and environmental monitoring practitioners working in similar data regimes, but they should not be assumed to transfer automatically to untested regions such as the North Atlantic, the Mediterranean, or tropical open-ocean systems without further validation.
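The abstract calls the benchmark "leakage-aware" but does not describe its splitting protocol. As an illustration only, the sketch below shows one common way to keep time-ordered environmental data leakage-free: forward-chaining splits in which every test block lies strictly after its training data, with an optional gap between them. The function name `forward_chaining_splits` and the `gap` parameter are assumptions made for this example, not the authors' method.

```python
from typing import Iterator, List, Tuple


def forward_chaining_splits(
    n_samples: int, n_folds: int, gap: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for time-ordered samples.

    Hypothetical helper: each test block lies strictly after its training
    block, and `gap` samples between them are dropped so that lagged
    features or autocorrelated neighbours cannot leak test information
    into training.
    """
    if n_folds < 1 or gap < 0:
        raise ValueError("n_folds must be >= 1 and gap must be >= 0")
    # Reserve one leading block that is only ever used for training.
    block = n_samples // (n_folds + 1)
    if block == 0 or block <= gap:
        raise ValueError("not enough samples for this many folds and gap")
    for k in range(1, n_folds + 1):
        test_start = k * block
        # The last fold absorbs any remainder samples.
        test_end = n_samples if k == n_folds else test_start + block
        train = list(range(0, test_start - gap))
        test = list(range(test_start, test_end))
        yield train, test


if __name__ == "__main__":
    # 20 time steps, 3 folds, 2 samples discarded before each test block.
    for train_idx, test_idx in forward_chaining_splits(20, 3, gap=2):
        print(len(train_idx), "train samples ->", test_idx)
```

Under a scheme like this, linear, tree-based, and sequence models can all be scored on the same folds, which is one way to get the kind of common evaluation framework the abstract describes. Spatially structured tasks such as bathymetry would need a different grouping (for example by region or cast) rather than by time.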