NumPy: Conditional np.where Replace
I have the following dataframe:
'customer_id','transaction_dt','product','price','units'
1,2004-01-02 00:00:00,thing1,25,47
1,2004-01-17 00:00:00,thing2,150,8
2,2004-01-29 00:00:00
Solution 1:
I think you can use:
tra = df['transaction_dt'].values[:, None]
idx = np.argmax(end_date_range.values > tra, axis=1)
sdr = start_date_range[idx]
m = df['transaction_dt'] < sdr
# where the transaction is earlier than the matched start, use the previous window's start instead
df["window_start_dt"] = np.where(m, start_date_range[idx - 1], sdr)
df['window_end_dt'] = end_date_range[idx]
print(df)
customer_id transaction_dt product price units window_start_dt \
0 1 2004-01-02 thing1 25 47 2004-01-01
1 1 2004-01-17 thing2 150 8 2004-01-01
2 2 2004-01-29 thing2 150 25 2004-01-01
3 3 2017-07-15 thing3 55 17 2017-06-21
4 3 2016-05-12 thing3 55 47 2016-04-27
5 4 2012-02-23 thing2 150 22 2012-02-18
6 4 2009-10-10 thing1 25 12 2009-10-01
7 4 2014-04-04 thing2 150 2 2014-03-09
8 5 2008-07-09 thing2 150 43 2008-07-08
9 5 2004-01-30 thing1 25 40 2004-01-01
10 5 2004-01-31 thing1 25 22 2004-01-01
11 5 2004-02-01 thing1 25 2 2004-01-31
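For readers who want to run the broadcasting step on its own, here is a minimal sketch. The post does not show how start_date_range and end_date_range are built, so the consecutive 30-day windows and the three-row sample frame below are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Assumed window boundaries (not from the original post):
# four consecutive 30-day windows starting 2004-01-01.
start_date_range = pd.date_range("2004-01-01", periods=4, freq="30D")
end_date_range = start_date_range + pd.Timedelta(days=30)

# Small sample frame, also an assumption.
df = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "transaction_dt": pd.to_datetime(["2004-01-02", "2004-01-17", "2004-02-05"]),
})

# Shape (n, 1) so the comparison broadcasts against the (k,) array of window ends.
tra = df["transaction_dt"].values[:, None]

# For each row, the position of the first window end strictly after the transaction.
idx = np.argmax(end_date_range.values > tra, axis=1)

df["window_start_dt"] = start_date_range[idx]
df["window_end_dt"] = end_date_range[idx]
print(df)
```

Note that np.argmax returns 0 for a row in which no window end is later than the transaction, so dates beyond the last window would need a separate check.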
Solution 2:
You can use numpy.where() like this:
numpy.where(df['transaction_dt'] <= df['window_start_dt'], *operation when True*, *operation when False*)
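As a concrete illustration of that pattern, the sketch below fills in both branches with columns. The prev_start_dt column is a hypothetical name introduced here to hold the value used when the condition is True; the three sample rows are likewise assumed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "transaction_dt": pd.to_datetime(["2004-01-30", "2004-01-31", "2004-02-01"]),
    "window_start_dt": pd.to_datetime(["2004-01-31", "2004-01-31", "2004-01-31"]),
    # Hypothetical column: the start of the preceding window.
    "prev_start_dt": pd.to_datetime(["2004-01-01", "2004-01-01", "2004-01-01"]),
})

# True where the transaction does not fall after the assigned window start.
cond = df["transaction_dt"] <= df["window_start_dt"]

# Take prev_start_dt where cond is True, otherwise keep the current value.
df["window_start_dt"] = np.where(cond, df["prev_start_dt"], df["window_start_dt"])
print(df[["transaction_dt", "window_start_dt"]])
```

Both branches are passed as full-length columns of the same dtype, so numpy.where picks element by element and the result stays a datetime column.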
Solution 3:
What about something like this?
# get argmax indices
idx = df.transaction_dt.apply(lambda x: np.argmax(end_date_range > x)).values
# define window_start_dt
df = df.assign(window_start_dt = start_date_range[idx])
# identify exceptions
mask = df.transaction_dt.le(df.window_start_dt)
# replace with shifted start_date_range
df.loc[mask, "window_start_dt"] = start_date_range[idx - 1][mask]
Output:
customer_id transaction_dt product price units window_start_dt
0 1 2004-01-02 thing1 25 47 2004-01-01
1 1 2004-01-17 thing2 150 8 2004-01-01
2 2 2004-01-29 thing2 150 25 2004-01-01
3 3 2017-07-15 thing3 55 17 2017-06-21
4 3 2016-05-12 thing3 55 47 2016-04-27
5 4 2012-02-23 thing2 150 22 2012-02-18
6 4 2009-10-10 thing1 25 12 2009-10-01
7 4 2014-04-04 thing2 150 2 2014-03-09
8 5 2008-07-09 thing2 150 43 2008-07-08
9 5 2004-01-30 thing1 25 40 2004-01-01
10 5 2004-01-31 thing1 25 22 2004-01-01
11 5 2004-02-01 thing1 25 2 2004-01-31
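The per-row apply above can be slow on large frames. One possible alternative, not part of the original answers, is numpy.searchsorted, which finds the same "first end date after x" index in a single vectorized call. This is a sketch that assumes end_date_range is sorted ascending; the 30-day windows and sample dates are made up for illustration.

```python
import numpy as np
import pandas as pd

# Assumed window boundaries (not from the original post).
start_date_range = pd.date_range("2004-01-01", periods=4, freq="30D")
end_date_range = start_date_range + pd.Timedelta(days=30)

transaction_dt = pd.Series(
    pd.to_datetime(["2004-01-02", "2004-01-31", "2004-02-05"])
)

# side="right" returns the first position whose end date is strictly
# greater than the transaction, matching np.argmax(end_date_range > x).
idx_sorted = np.searchsorted(
    end_date_range.values, transaction_dt.values, side="right"
)

# Same indices computed with the apply-based approach, for comparison.
idx_apply = transaction_dt.apply(lambda x: np.argmax(end_date_range > x)).values

print(idx_sorted, idx_apply)
```

The two differ only when a transaction is later than every end date: searchsorted then returns len(end_date_range), while argmax returns 0.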