From Offshore to Onshore: Scaling Up Wind Generation Forecasting in Germany
Project update and details on how it works
Project Page: GitHub
Germany's ambitious renewable energy transition, the Energiewende, has made the country a world leader in wind power integration. With onshore wind providing nearly a quarter of Germany's electricity in 2023, accurate forecasting of wind generation has become crucial for grid stability and market operations. Just a week after launching our offshore wind forecasting system, we're excited to announce that our predictive analytics platform now also covers onshore wind generation forecasts for all German transmission system operators (TSOs).
The Power of Modular Design
Our success in rapidly expanding from offshore to onshore wind forecasting stems from a fundamental design philosophy: building modular, scalable systems that combine statistical learning with domain-specific physics knowledge. This approach allows us to tackle new forecasting challenges without reinventing the wheel, while maintaining high accuracy through careful consideration of the underlying physical processes.
The system architecture consists of two main components: a DataOps pipeline for automated data collection and preprocessing, and an MLOps pipeline for model training and forecasting. Both are designed with extensibility in mind, making it straightforward to add new variables and regions to our forecasting capabilities.
The Data Foundation
Our DataOps pipeline automatically collects and processes data from multiple sources:
SMARD: Historical and real-time generation data
ENTSO-E: Cross-border power flows and system-wide data
Open-Meteo: High-resolution weather forecasts
For each TSO region, we monitor 3-4 strategic locations corresponding to major wind farm clusters. This gives us a comprehensive view of weather conditions affecting wind generation while keeping computational requirements manageable.
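One way to keep such a setup extensible is to hold the monitored locations in a small configuration structure. The sketch below is illustrative only: the Location class, the TSO_LOCATIONS mapping, the helper functions, the site names, and the coordinates are all placeholders rather than the project's actual sites. Only the four TSO names are real.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """A weather sampling point standing in for a wind farm cluster."""
    name: str
    latitude: float
    longitude: float


# Placeholder sites and coordinates, chosen only for illustration.
TSO_LOCATIONS = {
    "50Hertz": [
        Location("site_a", 53.9, 12.4),
        Location("site_b", 52.8, 13.9),
        Location("site_c", 51.9, 11.6),
    ],
    "Amprion": [
        Location("site_a", 51.6, 7.5),
        Location("site_b", 50.3, 7.2),
        Location("site_c", 49.6, 7.9),
    ],
    "TenneT": [
        Location("site_a", 54.5, 9.0),
        Location("site_b", 53.4, 7.8),
        Location("site_c", 52.4, 9.6),
        Location("site_d", 49.8, 11.2),
    ],
    "TransnetBW": [
        Location("site_a", 49.3, 9.7),
        Location("site_b", 48.6, 9.9),
        Location("site_c", 48.1, 8.4),
    ],
}


def weather_columns(tso, variables):
    """Build one feature column name per (location, variable) pair."""
    return [
        f"{tso.lower()}_{loc.name}_{var}"
        for loc in TSO_LOCATIONS[tso]
        for var in variables
    ]


def validate(locations_by_tso, low=3, high=4):
    """Check that every region has between `low` and `high` locations."""
    return all(low <= len(locs) <= high for locs in locations_by_tso.values())
```

With a layout like this, adding a region or a weather variable means adding an entry to the mapping or to the variable list, and the rest of the pipeline can stay unchanged.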
The Science Behind the Forecasts
What sets our approach apart is a feature engineering step that transforms raw weather data into physically meaningful predictors. Instead of simply feeding raw weather variables into our models, we calculate derived features that directly relate to wind power generation:
Wind shear profiles
Turbulence intensity
Wind power density
These engineered features capture the complex physics of wind energy conversion more effectively than raw measurements alone. However, determining the optimal combination of features and locations isn't straightforward. Data quality issues and collinearity between locations can impact model performance, requiring a careful statistical approach to feature selection.
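The three derived features listed above have standard textbook definitions, which can be sketched as follows. This is a minimal illustration rather than the project's implementation: the function names, the measurement heights of 10 m and 100 m, the constant air density, and the rolling window length are all assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value (assumed constant here)


def wind_shear_exponent(v_low, v_high, z_low=10.0, z_high=100.0):
    """Power-law shear exponent alpha, from v_high / v_low = (z_high / z_low) ** alpha.

    Wind speeds must be strictly positive; calm periods need separate handling.
    """
    v_low = np.asarray(v_low, dtype=float)
    v_high = np.asarray(v_high, dtype=float)
    return np.log(v_high / v_low) / np.log(z_high / z_low)


def turbulence_intensity(wind_speed, window=3):
    """Rolling standard deviation of wind speed divided by its rolling mean.

    Returns len(wind_speed) - window + 1 values, one per full window.
    """
    v = np.asarray(wind_speed, dtype=float)
    windows = sliding_window_view(v, window)
    return windows.std(axis=1) / windows.mean(axis=1)


def wind_power_density(wind_speed, air_density=AIR_DENSITY):
    """Kinetic power per unit rotor area in W/m^2: 0.5 * rho * v^3."""
    v = np.asarray(wind_speed, dtype=float)
    return 0.5 * air_density * v ** 3


# Small usage example with made-up wind speeds in m/s.
speed_10m = np.array([4.0, 5.0, 6.0, 5.5])
speed_100m = np.array([6.5, 8.0, 9.0, 8.5])
alpha = wind_shear_exponent(speed_10m, speed_100m)
ti = turbulence_intensity(speed_100m)
wpd = wind_power_density(speed_100m)
```

The cubic dependence in the power density is the main reason a derived feature can help: a linear model on raw wind speed would have to approximate that relationship, while the engineered feature supplies it directly.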
The Power of Automated Model Selection
Our MLOps pipeline employs a systematic approach to model selection and training. Rather than manually specifying feature combinations and model parameters, we treat this as a statistical optimization problem. The pipeline:
Samples from the combined space of model parameters and dataset configurations
Evaluates performance using recursive week-ahead forecasts
Selects the best-performing models based on root mean square error (RMSE) over multiple forecast horizons
Combines individual models into an ensemble for improved stability
This automated approach ensures we're using the most effective combination of features and model parameters for each specific forecasting task, while the ensemble method helps maintain accuracy across the entire forecast horizon.
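The four steps above can be illustrated with a small random-search sketch. It is not the project's pipeline: the ridge regression model, the synthetic data, the search ranges, the horizons of 24, 72, and 168 hours, and every function name are assumptions made for the example. It also uses the true future weather values, whereas a real system would use weather forecasts.

```python
import numpy as np

rng = np.random.default_rng(0)


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression; the intercept is not penalized."""
    Xb = np.column_stack([np.ones(len(X)), X])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def make_rows(y, exog, n_lags, cols):
    """Design matrix: n_lags past targets plus the selected weather columns."""
    rows = [np.concatenate([y[t - n_lags:t], exog[t, cols]])
            for t in range(n_lags, len(y))]
    return np.array(rows), y[n_lags:]


def recursive_forecast(coef, history, exog_future, n_lags, cols):
    """Forecast step by step, feeding each prediction back in as a lag."""
    window = list(history[-n_lags:])
    preds = []
    for x in exog_future:
        features = np.concatenate([[1.0], window[-n_lags:], x[cols]])
        preds.append(float(features @ coef))
        window.append(preds[-1])
    return np.array(preds)


def multi_horizon_rmse(preds, actual, horizons=(24, 72, 168)):
    """Average the RMSE computed over the first h steps for each horizon h."""
    return float(np.mean([np.sqrt(np.mean((preds[:h] - actual[:h]) ** 2))
                          for h in horizons]))


def sample_config(n_features):
    """Draw one point from the joint space of model and dataset settings."""
    k = int(rng.integers(1, n_features + 1))
    return {
        "n_lags": int(rng.integers(1, 25)),
        "alpha": float(10 ** rng.uniform(-3, 2)),
        "cols": sorted(rng.choice(n_features, size=k, replace=False).tolist()),
    }


def evaluate(cfg, y, exog, horizon=168):
    """Train on everything except the last week, then forecast it recursively."""
    y_train, exog_train = y[:-horizon], exog[:-horizon]
    X, target = make_rows(y_train, exog_train, cfg["n_lags"], cfg["cols"])
    coef = fit_ridge(X, target, cfg["alpha"])
    preds = recursive_forecast(coef, y_train, exog[-horizon:],
                               cfg["n_lags"], cfg["cols"])
    return multi_horizon_rmse(preds, y[-horizon:]), preds


# Synthetic hourly data: only weather column 0 actually drives the target.
n = 24 * 60
exog = rng.normal(size=(n, 4))
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + 1.5 * exog[t, 0] + 0.1 * rng.normal()

results = []
for _ in range(30):
    cfg = sample_config(exog.shape[1])
    score, preds = evaluate(cfg, y, exog)
    results.append((score, cfg, preds))

results.sort(key=lambda r: r[0])                    # lowest RMSE first
top = results[:5]                                   # keep the best candidates
ensemble = np.mean([r[2] for r in top], axis=0)     # simple average ensemble
ensemble_rmse = multi_horizon_rmse(ensemble, y[-168:])
```

Averaging the top candidates instead of keeping only the single best one trades a little peak accuracy for robustness, since a configuration that wins on one validation week may not win on the next.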
Weekly Retraining for Optimal Performance
To maintain forecast accuracy, our models undergo weekly retraining. This helps prevent model drift and ensures our predictions remain reliable as seasonal patterns change and new data becomes available. While the training process is computationally intensive and runs locally, the daily forecast updates are performed online through our GitHub-based infrastructure.
Looking Forward
The successful expansion to onshore wind forecasting demonstrates the scalability of our approach. Looking ahead, we plan to extend our forecasting capabilities to other generation types, building toward a comprehensive forecast of Germany's entire energy mix. This will include:
Solar power generation
Conventional power plants
Cross-border power flows
Electricity demand
Ultimately, our goal is to provide accurate forecasts of electricity market prices by combining these various components. The modular architecture we've developed makes this ambitious expansion feasible, allowing us to tackle each new variable systematically while maintaining high accuracy through our physics-informed machine learning approach.
By continuing to refine and expand our system, we aim to contribute to the stability and efficiency of Germany's evolving energy landscape and to ease the integration of renewable energy sources into the grid.