
LSTM/RNN Preprocessing on a MultiIndex DataFrame with Forecast Data

RNNs and LSTMs require a sequence to be defined for each feature data point. Forecast data (e.g. a weather forecast) is characterized by two timestamps: the time the forecast was calculated (dt_calc) and the time the forecast applies to (dt_fore).
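To make the setup concrete, here is a minimal sketch of a frame with that shape. It is an assumption rather than the original data: the values are reconstructed from the example outputs shown further down, and the hourly spacing of dt_fore and the constant positional_index of 0 are guesses based on those same tables.

```python
import pandas as pd

# Hypothetical sample data: two forecast runs (dt_calc), each covering
# five hourly forecast times (dt_fore). Values are reconstructed from the
# example outputs in this post, not taken from a real forecast.
runs = {
    "2019-07-02": ([2, 4, 6, 8, 10], [3, 5, 7, 9, 11]),
    "2019-07-04": ([12, 9, 8, 5, 3], [13, 8, 9, 4, 3]),
}

rows = []
for calc, (temp, temp_2) in runs.items():
    start = pd.Timestamp(calc)
    for i, (a, b) in enumerate(zip(temp, temp_2)):
        # One row per (calculation time, forecast time) pair.
        rows.append((start, start + pd.Timedelta(hours=i), 0, a, b))

data = pd.DataFrame(
    rows, columns=["dt_calc", "dt_fore", "positional_index", "temp", "temp_2"]
).set_index(["dt_calc", "dt_fore", "positional_index"])

print(data)
```

Each dt_calc value marks one forecast run, so any sequence built for the network should stay inside a single run rather than span two of them.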

Solution 1:

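We can define a generator function that groups the dataframe by the dt_calc column and applies a rolling window of size n within each group, so that no sequence spans two forecast runs. Each full window yields one length-n sequence per column. Windows shorter than n (at the start of each group) yield an empty list, which becomes a row of missing values that dropna() removes.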

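import pandas as pd

def seq(n):
    # Turn the index levels (dt_calc, dt_fore, ...) into columns for grouping.
    df = data.reset_index()
    # Walk the rolling windows of n rows inside each dt_calc group.
    for g in df.groupby('dt_calc', sort=False).rolling(n):
        # Full windows give one length-n array per column; short ones give an empty row.
        yield g[data.columns].to_numpy().T if len(g) == n else []

# Rows built from incomplete windows are all-NaN and are dropped.
pd.DataFrame(seq(2), index=data.index, columns=data.columns).dropna()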

# n=2
                                                    temp   temp_2
dt_calc    dt_fore             positional_index                  
2019-07-02 2019-07-02 01:00:00 0                  [2, 4]   [3, 5]
           2019-07-02 02:00:00 0                  [4, 6]   [5, 7]
           2019-07-02 03:00:00 0                  [6, 8]   [7, 9]
           2019-07-02 04:00:00 0                 [8, 10]  [9, 11]
2019-07-04 2019-07-04 01:00:00 0                 [12, 9]  [13, 8]
           2019-07-04 02:00:00 0                  [9, 8]   [8, 9]
           2019-07-04 03:00:00 0                  [8, 5]   [9, 4]
           2019-07-04 04:00:00 0                  [5, 3]   [4, 3]

# n=3
                                                       temp      temp_2
dt_calc    dt_fore             positional_index                        
2019-07-02 2019-07-02 02:00:00 0                  [2, 4, 6]   [3, 5, 7]
           2019-07-02 03:00:00 0                  [4, 6, 8]   [5, 7, 9]
           2019-07-02 04:00:00 0                 [6, 8, 10]  [7, 9, 11]
2019-07-04 2019-07-04 02:00:00 0                 [12, 9, 8]  [13, 8, 9]
           2019-07-04 03:00:00 0                  [9, 8, 5]   [8, 9, 4]
           2019-07-04 04:00:00 0                  [8, 5, 3]   [9, 4, 3]
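
Recurrent layers usually expect a 3-D array of shape (samples, timesteps, features) rather than list-valued cells. The following is a minimal sketch of one way to get there, not the method used above: it swaps the rolling iteration for NumPy's sliding_window_view, applied per dt_calc group. The helper name windows_per_run and the sample frame are hypothetical, and the sample values are reconstructed from the tables above.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

def windows_per_run(frame, n, level="dt_calc"):
    """Stack length-n windows from each forecast run into (samples, n, features)."""
    blocks = []
    for _, g in frame.groupby(level=level, sort=False):
        values = g.to_numpy()
        if len(values) < n:
            continue  # this run is too short to hold a full window
        # sliding_window_view returns (windows, features, n);
        # transpose to (windows, n, features).
        blocks.append(sliding_window_view(values, n, axis=0).transpose(0, 2, 1))
    if not blocks:
        return np.empty((0, n, frame.shape[1]))
    return np.concatenate(blocks)

# Hypothetical sample frame (assumed hourly dt_fore, positional_index fixed at 0).
idx = pd.MultiIndex.from_tuples(
    [(pd.Timestamp(d), pd.Timestamp(d) + pd.Timedelta(hours=h), 0)
     for d in ("2019-07-02", "2019-07-04") for h in range(5)],
    names=["dt_calc", "dt_fore", "positional_index"],
)
data = pd.DataFrame(
    {"temp": [2, 4, 6, 8, 10, 12, 9, 8, 5, 3],
     "temp_2": [3, 5, 7, 9, 11, 13, 8, 9, 4, 3]},
    index=idx,
)

X = windows_per_run(data, 3)
print(X.shape)  # (6, 3, 2)
print(X[0])     # first window of the 2019-07-02 run
```

With n=3, each five-row run contributes three windows, matching the six rows of the n=3 table. Because the windows are built group by group, none of them mixes values from two different calculation times.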
