As global demand for clean energy continues to rise, wind power has become one of the most important renewable energy sources. However, wind power data often contain a high proportion of densely distributed anomalies, which not only degrade the accuracy of wind power forecasting models but may also mislead grid scheduling decisions and thereby jeopardize grid security. To address this issue, this paper proposes an adaptive-threshold robust regression model (the RPR model) for wind power data cleaning that combines the Random Sample Consensus (RANSAC) algorithm with polynomial linear regression. By expanding the wind speed and power data into polynomial features, the model lets linear regression capture the nonlinear relationship between wind speed and power. Wrapping this polynomial regression in RANSAC yields a robust polynomial regression model that tolerates anomalous data and improves the accuracy of data cleaning. During cleaning, the model first fits the raw data using randomly selected minimal sample sets, then dynamically adjusts the decision threshold based on the median and the median absolute deviation (MAD) of the residuals, so that anomalous data are identified and removed. This robustness allows the model to keep cleaning effectively even when the proportion of anomalous data is high, addressing a limitation of existing methods when anomalies are densely distributed. The proposed method was validated on real data from a wind farm operated by Longyuan Power.
Compared with other commonly used cleaning methods, namely the Bidirectional Change Point Grouping Quartile Statistical Model, the Principal Contour Image Processing Model, the DBSCAN Clustering Model, and the Support Vector Machine (SVM) Model, the proposed method delivered the largest improvement in data quality in the experiments. Specifically, it reduced the mean absolute error (MAE) of the wind power forecasting model by 72.1%, a larger reduction than the other methods achieved (37.3% to 52.7%). It also reduced the prediction error of the Convolutional Neural Network (CNN) + Gated Recurrent Unit (GRU) forecasting model, helping to maintain high prediction accuracy. The proposed adaptive-threshold robust regression model offers a new approach to wind power data cleaning that applies not only to conventional scenarios with a low proportion of anomalous data but also to complex datasets with a high proportion of dense anomalies.
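The cleaning procedure summarized in the abstract (polynomial feature expansion, fits on random minimal sample sets, and a threshold derived from the median and MAD of the residuals) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the polynomial degree, the number of trials, the threshold multiplier, or the rule used to rank candidate fits, so `degree=3`, `n_trials=200`, `k=3.0`, the 1.4826 MAD scale factor, and ranking candidates by their median absolute residual are all assumptions made here. The synthetic data (a cubic power curve with 30% of readings forced to zero) are likewise invented for illustration.

```python
import numpy as np

def mad_threshold(abs_resid, k=3.0):
    """Adaptive cutoff: median + k * scaled MAD of the absolute residuals.
    k and the 1.4826 normal-consistency factor are assumptions."""
    med = np.median(abs_resid)
    mad = np.median(np.abs(abs_resid - med))
    return med + k * 1.4826 * mad

def robust_polyfit(speed, power, degree=3, n_trials=200, k=3.0, seed=0):
    """RANSAC-style polynomial fit with a MAD-based inlier threshold.

    Each trial fits a polynomial to a random minimal sample (degree + 1
    points). Candidates are ranked by median absolute residual, which is
    an assumed scoring rule, not one taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = np.vander(speed, degree + 1)  # columns: speed^degree, ..., speed^0
    best_score, best_resid = np.inf, None
    for _ in range(n_trials):
        idx = rng.choice(len(speed), size=degree + 1, replace=False)
        coef, *_ = np.linalg.lstsq(X[idx], power[idx], rcond=None)
        abs_resid = np.abs(X @ coef - power)
        score = np.median(abs_resid)
        if score < best_score:
            best_score, best_resid = score, abs_resid
    # Points within the adaptive threshold of the best candidate are kept.
    mask = best_resid <= mad_threshold(best_resid, k)
    # Refit on the retained points only.
    coef, *_ = np.linalg.lstsq(X[mask], power[mask], rcond=None)
    return coef, mask

# Synthetic demo (hypothetical data): cubic power curve with noise,
# where the first 90 of 300 readings are forced to zero output.
rng = np.random.default_rng(1)
speed = rng.uniform(4.0, 12.0, 300)
power = speed**3 + rng.normal(0.0, 5.0, 300)
power[:90] = 0.0

coef, mask = robust_polyfit(speed, power)
cleaned_speed, cleaned_power = speed[mask], power[mask]
print(f"kept {mask.sum()} of {mask.size} points")
```

Ranking candidates by median residual is used here because an inlier count computed with a per-candidate adaptive threshold would favor poor fits, whose widely spread residuals inflate their own threshold. That ranking only works while anomalies are a minority of the data; the paper's handling of higher anomaly proportions is not described in the abstract and is not reproduced here.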
Wind power data cleaning using RANSAC-based polynomial and linear regression with adaptive threshold.
| Journal: | Scientific Reports | Impact factor: | 3.900 |
| Year: | 2025 | Volume/pages: | 2025 Feb 11; 15(1):5105 |
| doi: | 10.1038/s41598-025-89177-9 | ||
