
MORE | Management of Real-time Energy Data - Incremental models for time-series data


Wind turbines generate the most power when the rotor faces directly into the wind. However, because of technical errors or malfunctioning sensors, this is not always the case. The condition in which a turbine does not face the wind is called yaw misalignment. Detecting and correcting yaw misalignment has a direct impact on the bottom line of wind farm companies.

In the past, physical models, machine learning models, and hybrids of the two approaches have been used to detect yaw misalignment. The problem has nevertheless proved difficult to solve with 5-10 minute aggregated data, which is the usual frequency at which wind farms store data. Data at higher frequency is therefore now being analysed to detect yaw misalignment. Similarly, very short-term forecasting for energy bidding requires high-frequency data.

More generally, there is a growing demand for machine learning algorithms on time-series data in domains such as finance and inventory management, where high-velocity or high-volume data needs to be analysed. Further complications arise: noisy streaming data must be pre-processed, models deployed on the edge face memory constraints, and AutoML models must be updated constantly because of concept drift in the incoming data and target variables.

To address some of these issues, we have developed the Streams and Incremental Learning (SAIL) library as part of the MORE project (www.more2020.eu), funded under the H2020 programme. SAIL includes a standard set of incremental machine learning algorithms and wraps existing libraries (River, Scikit-Multiflow, Scikit-Learn, Skorch) so that incremental models are exposed through the standard Scikit-Learn API, via the partial_fit function. SAIL also supports ensemble learning for incremental models and combinations of batch and incremental models. Elementary distributed computing for model selection has been set up with Ray.
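To make the partial_fit convention concrete, here is a minimal sketch in plain Python. It does not use SAIL or any of the libraries listed above. The class IncrementalLinearRegressor, the helpers make_batch and run_stream, and the synthetic data are all hypothetical and only illustrate the idea: the model is updated one mini-batch at a time, never revisits earlier data, and is scored on each batch before learning from it (test-then-train).

```python
import random


class IncrementalLinearRegressor:
    """Hypothetical toy model: linear regression trained by SGD on a stream."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, X):
        return [sum(w * x for w, x in zip(self.weights, row)) + self.bias
                for row in X]

    def partial_fit(self, X, y):
        # One SGD step per sample; earlier batches are never revisited.
        for row, target in zip(X, y):
            error = self.predict([row])[0] - target
            for j, x in enumerate(row):
                self.weights[j] -= self.learning_rate * error * x
            self.bias -= self.learning_rate * error
        return self  # returning self mirrors the Scikit-Learn convention


def make_batch(rng, size=20):
    """Synthetic stream (assumption): y = 2*x0 - x1 + 0.5 plus small noise."""
    X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(size)]
    y = [2.0 * a - 1.0 * b + 0.5 + rng.gauss(0, 0.01) for a, b in X]
    return X, y


def run_stream(n_batches=200, seed=0):
    rng = random.Random(seed)
    model = IncrementalLinearRegressor(n_features=2)
    batch_errors = []
    for _ in range(n_batches):
        X, y = make_batch(rng)
        # Test-then-train: score the batch before the model learns from it.
        preds = model.predict(X)
        batch_errors.append(sum(abs(p - t) for p, t in zip(preds, y)) / len(y))
        model.partial_fit(X, y)
    return model, batch_errors


model, batch_errors = run_stream()
print("first batch MAE:", round(batch_errors[0], 3))
print("last batch MAE:", round(batch_errors[-1], 3))
```

Because only the current mini-batch is held in memory, the same loop works for an unbounded stream. A real deployment would replace the toy class with an estimator that implements partial_fit in one of the referenced libraries.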

References:

  1. https://github.com/online-ml/River
  2. https://github.com/scikit-multiflow/scikit-multiflow
  3. https://github.com/scikit-learn/scikit-learn
  4. https://github.com/skorch-dev/skorch
  5. https://github.com/skejserjensen/ModelarDB
  6. https://github.com/IBM/sail
  7. https://github.com/ray-project/ray

01 Mar 2022