gainML: Preparation and Implementation

This document illustrates how to prepare the data, how to run the package, and what the resulting objects contain.

Data preparation

The analysis requires at least three turbine datasets (data frames): one each for the reference turbine, the baseline control turbine, and the neutral control turbine.

  • All data frames must have at least two columns, one for the timestamp and another for the turbine id. Their column numbers can be set through col.time and col.turb.
  • In addition to these two columns, the dataset of a reference turbine must include wind direction, power output, and air density, in that order.
  • In addition to these two columns, the dataset of a control turbine (both baseline and neutral) must include wind speed and power output, in that order.
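The column layout described above can be sketched with synthetic data. Everything in this sketch is an assumption made for illustration: the column names, the 10-minute sampling interval, the timestamps stored as text, the random values, and the placement of the timestamp and turbine id in columns 1 and 2. If your layout differs, adjust col.time and col.turb accordingly.

```r
# Hypothetical sample data: names and values are illustrative only.
set.seed(1)
n <- 6
time.stamp <- format(seq(as.POSIXct("2014-10-24 00:00:00", tz = "UTC"),
                         by = "10 min", length.out = n),
                     "%Y-%m-%d %H:%M:%S")

# Reference turbine: timestamp, turbine id, then wind direction,
# power output, and air density (in that order).
df.ref <- data.frame(time = time.stamp,
                     turb.id = 1,
                     wind.dir = runif(n, 0, 360),
                     power = runif(n, 0, 1000),
                     air.dens = runif(n, 1.1, 1.3))

# Control turbines: timestamp, turbine id, then wind speed and power output.
make.ctrl <- function(id) {
  data.frame(time = time.stamp,
             turb.id = id,
             wind.spd = runif(n, 3, 15),
             power = runif(n, 0, 1000))
}
df.ctrb <- make.ctrl(2)   # baseline control turbine
df.ctrn <- make.ctrl(3)   # neutral control turbine

str(df.ref)   # inspect the column order before passing the data on
```

A dataset this small only illustrates the layout; a real analysis needs data covering both periods.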

Implementation

To use the package, a user first needs to load the package (attach the package to the current R environment).

library(gainML)

Point estimation of gain

Once the package is loaded, a user can either (i) run the single function analyze.gain or (ii) run the underlying functions one by one (analyze.gain runs these same functions in sequence).

  • When using analyze.gain:

    # Analyze Gain in a Single Step
    point.res <- analyze.gain(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
                              p1.end = '2015-10-25', p2.beg = '2015-10-25',
                              p2.end = '2016-10-26', ratedPW = 1000, AEP = 300000,
                              pw.freq = pw.freq)
    
    point.res$gain.res$gain   #Provides the point estimate of gain
  • When using multiple functions:

    # Prepare Data
    data <- arrange.data(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
                         p1.end = '2015-10-25', p2.beg = '2015-10-25', p2.end = '2016-10-26')
    
    # Period 1 Analysis
    p1.res <- analyze.p1(data$train, data$test, ratedPW = 1000)
    
    # Period 2 Analysis
    p2.res <- analyze.p2(data$per1, data$per2, p1.res$opt.cov)
    
    # Quantify gain
    gain.res <- quantify.gain(p1.res, p2.res, ratedPW = 1000, AEP = 300000, pw.freq = pw.freq)
    
    gain.res$gain   #Provides the point estimate of gain
  • When using analyze.gain for free sector analysis:

    free.sec <- list(c(310, 50), c(150, 260))   #Defines the free sectors
    
    # Analyze Gain in a Single Step
    point.res <- analyze.gain(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
                              p1.end = '2015-10-25', p2.beg = '2015-10-25',
                              p2.end = '2016-10-26', ratedPW = 1000, AEP = 300000,
                              pw.freq = pw.freq, free.sec = free.sec)

    Note: free.sec is a list of vectors defining the free sectors. Each vector holds two scalars, the starting direction and the ending direction, ordered clockwise. For example, c(310, 50) runs clockwise from 310 degrees through north to 50 degrees.
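To make the clockwise convention concrete, the sketch below checks which wind directions fall inside each sector. The helper in.sector is hypothetical and not part of gainML, and treating the sector endpoints as inclusive is an assumption; the package may handle boundaries differently.

```r
# Hypothetical helper (not part of gainML): TRUE when direction `wd` (degrees)
# lies in the sector running clockwise from sec[1] to sec[2].
in.sector <- function(wd, sec) {
  wd <- wd %% 360
  if (sec[1] <= sec[2]) {
    wd >= sec[1] & wd <= sec[2]
  } else {
    # The sector wraps through north (0/360 degrees)
    wd >= sec[1] | wd <= sec[2]
  }
}

free.sec <- list(c(310, 50), c(150, 260))
wd <- c(0, 45, 100, 200, 320)

# One row per direction, one column per sector
sapply(free.sec, function(sec) in.sector(wd, sec))
```

With these inputs, 0, 45, and 320 degrees fall in the first sector, 200 degrees falls in the second, and 100 degrees falls in neither.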

For details about these functions, please refer to the package manual (in PDF format).

Interval estimation of gain (by using bootstrap)

Once the package is loaded, a user needs to run a series of functions as illustrated below.

  • Full sector analysis:

    # Prepare Data
    data <- arrange.data(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
                         p1.end = '2015-10-25', p2.beg = '2015-10-25', p2.end = '2016-10-26')
    
    # Period 1 Analysis
    p1.res <- analyze.p1(data$train, data$test, ratedPW = 1000)
    
    # Gain Analysis by Using Bootstrap
    n.rep <- 10   #Defines the number of replications.
    interval.res <- bootstrap.gain(df.ref, df.ctrb, df.ctrn, opt.cov = p1.res$opt.cov,
                                   n.rep = n.rep, p1.beg = '2014-10-24',
                                   p1.end = '2015-10-25', p2.beg = '2015-10-25',
                                   p2.end = '2016-10-26', ratedPW = 1000, AEP = 300000,
                                   pw.freq = pw.freq, write.path = NULL)
    
    sapply(interval.res, function(ls) ls$gain.res$gainCurve)   #Provides 10 gain curves
    sapply(interval.res, function(ls) ls$gain.res$gain)        #Provides 10 gain values
  • Free sector analysis:

    free.sec <- list(c(310, 50), c(150, 260))   #Defines the free sectors
    
    # Prepare Data
    data <- arrange.data(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
                         p1.end = '2015-10-25', p2.beg = '2015-10-25',
                         p2.end = '2016-10-26', free.sec = free.sec)
    
    # Period 1 Analysis
    p1.res <- analyze.p1(data$train, data$test, ratedPW = 1000)
    
    # Gain Analysis by Using Bootstrap
    n.rep <- 10   #Defines the number of replications.
    interval.res <- bootstrap.gain(df.ref, df.ctrb, df.ctrn, opt.cov = p1.res$opt.cov,
                                   n.rep = n.rep, free.sec = free.sec, p1.beg = '2014-10-24',
                                   p1.end = '2015-10-25', p2.beg = '2015-10-25',
                                   p2.end = '2016-10-26', ratedPW = 1000, AEP = 300000,
                                   pw.freq = pw.freq, write.path = NULL)
    
    sapply(interval.res, function(ls) ls$gain.res$gainCurve)   #Provides 10 gain curves
    sapply(interval.res, function(ls) ls$gain.res$gain)        #Provides 10 gain values

    Note: The only difference from the full sector analysis is that free.sec is defined and passed as an argument to both arrange.data and bootstrap.gain.
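One way to turn the replicated gain values into an interval estimate is a percentile interval. The sketch below is only an illustration under stated assumptions: mock.res is a made-up stand-in that mimics the ls$gain.res$gain structure used above, the numbers are random, and the 90% level is an arbitrary choice. The package manual should be consulted for the recommended way to form the interval.

```r
# Mock stand-in for the list returned by bootstrap.gain; the values are
# made up purely to illustrate the summary step.
set.seed(42)
mock.res <- lapply(1:10, function(i) {
  list(gain.res = list(gain = rnorm(1, mean = 0.02, sd = 0.005)))
})

# Collect one gain value per bootstrap replication
gains <- sapply(mock.res, function(ls) ls$gain.res$gain)

# Percentile interval (90% here; the level is an arbitrary choice)
ci <- quantile(gains, probs = c(0.05, 0.95))
ci
```

With only 10 replications the interval endpoints are coarse; a larger n.rep gives a more stable interval at the cost of a longer run time.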

Remarks

  • Period 1 analysis will take a significant amount of time, so its progress will be indicated in the R console.

  • A user needs to read in and store the long-term frequency data manually. To see the expected format, please refer to the pw.freq entry in the manual or, in the R console, run

    head(pw.freq)

Resulting Objects

The analysis outcome is returned by the quantify.gain function (the objects returned by analyze.gain and bootstrap.gain also include this outcome). The outcome includes:

  • Gain quantification: initial effect, offset, and gain with offset adjustment.

  • Bin-wise curves: the effect curve, offset curve, and gain curve, corresponding respectively to the three gain quantities above.

Please refer to the package manual for more details.