Abstract: A storm surge is an anomalous rise of the sea surface induced by intense atmospheric disturbances. Storm surges caused by tropical cyclones often pose severe threats to lives, property, and socio-economic activity in coastal areas, so accurate and timely prediction of storm surge floodplains is critical. Numerical models are currently the primary method for predicting storm surges, but high-resolution floodplain models require substantial investment in both research funding and computing time. Machine learning approaches, which rely on data-driven nonlinear mapping, have an advantage over conventional numerical models in research time and computational resource consumption. This paper uses the convolutional long short-term memory network (ConvLSTM) to predict storm surge floodplains in the Pearl River Estuary, Guangdong Province. A historical typhoon floodplain data set is constructed from numerical model products driven by reanalysis data and is used for model training, validation, and testing. Two data-driven prediction techniques are studied: autoregressive prediction based on the sea surface height field, and prediction based on the forecast wind field together with the initial sea surface height field. Of the two, the autoregressive model performs better. Tests of the autoregressive model show that ConvLSTM can predict floodplains with errors generally below 0.2 m from the sea surface height fields of the preceding few hours, even when the boundary conditions, topography, surface runoff, and atmospheric signals are unknown. Under these conditions, the larger errors occur mostly along the coast and on both sides of the river. An error analysis of the two models shows that adding wind field input to ConvLSTM does not significantly improve the model's prediction skill. Further study is required to determine how best to train data-driven prediction models with additional input features.
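
The autoregressive scheme described above can be illustrated with a minimal sketch: a one-step model receives the most recent sea surface height fields, and each predicted field is fed back as input for the next step. This is only an illustration under stated assumptions, not the implementation used in the paper. The names `autoregressive_rollout`, `linear_extrapolation`, and `mean_abs_error` are hypothetical, and the simple trend extrapolation merely stands in for a trained ConvLSTM.

```python
from typing import Callable, List

Field = List[List[float]]  # 2-D sea surface height grid in metres (assumed layout)


def autoregressive_rollout(
    step_model: Callable[[List[Field]], Field],
    history: List[Field],
    n_steps: int,
) -> List[Field]:
    """Predict n_steps fields, feeding each prediction back as input."""
    window = list(history)  # copy so the caller's history is not modified
    predictions = []
    for _ in range(n_steps):
        next_field = step_model(window)
        predictions.append(next_field)
        # Slide the input window: drop the oldest field, append the newest.
        window = window[1:] + [next_field]
    return predictions


def linear_extrapolation(window: List[Field]) -> Field:
    """Hypothetical stand-in for a trained ConvLSTM: continue the last trend."""
    prev, last = window[-2], window[-1]
    return [
        [2.0 * b - a for a, b in zip(row_prev, row_last)]
        for row_prev, row_last in zip(prev, last)
    ]


def mean_abs_error(pred: Field, truth: Field) -> float:
    """Mean absolute error over all grid points, in metres."""
    diffs = [abs(p - t) for rp, rt in zip(pred, truth) for p, t in zip(rp, rt)]
    return sum(diffs) / len(diffs)


# Two past fields on a 1 x 2 grid (illustrative values only, not paper data).
history = [[[0.0, 1.0]], [[0.5, 1.5]]]
forecast = autoregressive_rollout(linear_extrapolation, history, n_steps=2)
```

In such a rollout, errors can accumulate with lead time because later steps depend on earlier predictions, which is why the per-step error against a reference field is worth tracking.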