LSTM/RNN Preprocessing on a MultiIndex DataFrame with Forecast Data
RNNs and LSTMs require the input to be arranged as sequences for each feature. Forecast data (e.g. weather forecasts) carry two timestamps: the time the forecast was calculated (dt_calc) and the time the forecast is valid for (dt_fore).
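To make the layout concrete, here is a minimal sketch of such a dataframe. The values are read back from the outputs shown further down; the first forecast hour of each run (00:00), the constant positional_index of 0, and the helper names `runs` and `records` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample: two forecast runs (dt_calc), five hourly steps each (dt_fore).
runs = {
    "2019-07-02": ([2, 4, 6, 8, 10], [3, 5, 7, 9, 11]),
    "2019-07-04": ([12, 9, 8, 5, 3], [13, 8, 9, 4, 3]),
}

records = []
for calc_day, (temp, temp_2) in runs.items():
    dt_calc = pd.Timestamp(calc_day)
    for hour, (t1, t2) in enumerate(zip(temp, temp_2)):
        records.append({
            "dt_calc": dt_calc,                              # when the forecast was computed
            "dt_fore": dt_calc + pd.Timedelta(hours=hour),   # time the forecast refers to
            "positional_index": 0,                           # assumed constant, as in the outputs
            "temp": t1,
            "temp_2": t2,
        })

# Three index levels, two feature columns
data = pd.DataFrame(records).set_index(["dt_calc", "dt_fore", "positional_index"])
print(data)
```

Each dt_calc value identifies one forecast run, so sequences should be built inside a run and never across two of them.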
Solution 1:
We can define a generator function that groups the dataframe by the dt_calc
column and applies a rolling window of size n within each group, so that no
sequence spans two forecast runs. Each complete window is collected into one
list per column, which yields the sequences.
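import pandas as pd

def seq(n):
    # Flatten the MultiIndex so that dt_calc becomes an ordinary column to group on
    df = data.reset_index()
    for g in df.groupby('dt_calc', sort=False).rolling(n):
        # Incomplete windows yield [] and become NaN rows, which dropna() removes below
        yield g[data.columns].to_numpy().T if len(g) == n else []

pd.DataFrame(seq(2), index=data.index, columns=data.columns).dropna()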
# n=2
                                                     temp    temp_2
dt_calc    dt_fore             positional_index
2019-07-02 2019-07-02 01:00:00 0                   [2, 4]    [3, 5]
           2019-07-02 02:00:00 0                   [4, 6]    [5, 7]
           2019-07-02 03:00:00 0                   [6, 8]    [7, 9]
           2019-07-02 04:00:00 0                  [8, 10]   [9, 11]
2019-07-04 2019-07-04 01:00:00 0                  [12, 9]   [13, 8]
           2019-07-04 02:00:00 0                   [9, 8]    [8, 9]
           2019-07-04 03:00:00 0                   [8, 5]    [9, 4]
           2019-07-04 04:00:00 0                   [5, 3]    [4, 3]
# n=3
                                                       temp      temp_2
dt_calc    dt_fore             positional_index
2019-07-02 2019-07-02 02:00:00 0                  [2, 4, 6]   [3, 5, 7]
           2019-07-02 03:00:00 0                  [4, 6, 8]   [5, 7, 9]
           2019-07-02 04:00:00 0                 [6, 8, 10]  [7, 9, 11]
2019-07-04 2019-07-04 02:00:00 0                 [12, 9, 8]  [13, 8, 9]
           2019-07-04 03:00:00 0                  [9, 8, 5]   [8, 9, 4]
           2019-07-04 04:00:00 0                  [8, 5, 3]   [9, 4, 3]
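Most LSTM/RNN layers expect a 3-D array of shape (samples, timesteps, features) rather than a dataframe holding lists. As a possible follow-up, and not part of the solution above, the same per-run windows can be stacked directly with NumPy. This is a sketch under a few assumptions: NumPy 1.20 or later (for `sliding_window_view`), a hypothetical helper named `to_sequences`, and a small `demo` frame that reuses the values from the outputs without the positional_index level.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

def to_sequences(frame, n, group_level="dt_calc"):
    """Stack every length-n window of each group into (samples, n, features)."""
    blocks = []
    for _, group in frame.groupby(level=group_level, sort=False):
        values = group.to_numpy()
        if len(values) < n:
            continue  # this run is too short to form a single window
        # Windowing along axis 0 gives (windows, features, n); swap the last two axes
        windows = sliding_window_view(values, window_shape=n, axis=0)
        blocks.append(windows.transpose(0, 2, 1))
    if not blocks:
        return np.empty((0, n, frame.shape[1]))
    return np.concatenate(blocks, axis=0)

# Illustrative frame: two forecast runs with five hourly steps each
dt_calc = pd.to_datetime(["2019-07-02"] * 5 + ["2019-07-04"] * 5)
dt_fore = dt_calc + pd.to_timedelta(list(range(5)) * 2, unit="h")
demo = pd.DataFrame(
    {"temp": [2, 4, 6, 8, 10, 12, 9, 8, 5, 3],
     "temp_2": [3, 5, 7, 9, 11, 13, 8, 9, 4, 3]},
    index=pd.MultiIndex.from_arrays([dt_calc, dt_fore], names=["dt_calc", "dt_fore"]),
)

X = to_sequences(demo, n=3)
print(X.shape)  # (6, 3, 2): 3 windows per run, 3 timesteps, 2 features
```

Because the windows are built one run at a time, none of them mixes values from different dt_calc groups, which matches the grouping used by the generator above.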