
Pandas Row Analysis For Consecutive Dates

Following a 'chain' of rows and counting the consecutive months from a CSV file. Currently I am reading a CSV file with 5 columns of interest (based on insurance policies): CONTRAC

Solution 1:

Not sure if I totally understand your requirement, but does something like this work?

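# NumPy's 'Y' unit equals 365.2425 days; newer pandas versions reject np.timedelta64(1, 'Y').
df_contract['TOTAL_YEARS'] = (df_contract['END_DATE'] - df_contract['START_DATE']
                             ) / pd.Timedelta(days=365.2425)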

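# Cap cancelled contracts at one year (Python uses & for element-wise "and", not &&).
mask = (df_contract['CANCEL_FLAG'] == 1) & (df_contract['TOTAL_YEARS'] > 1)
df_contract.loc[mask, 'TOTAL_YEARS'] = 1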
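
For anyone who wants to try the idea in isolation, here is a minimal, self-contained sketch. The sample rows are made up for illustration, and measuring a year as 365.2425 days is an assumption (it matches NumPy's definition of the 'Y' unit); only the column names come from the answer above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
df_contract = pd.DataFrame({
    "START_DATE": pd.to_datetime(["2013-01-01", "2014-01-01", "2015-01-01"]),
    "END_DATE": pd.to_datetime(["2016-01-01", "2014-07-01", "2018-01-01"]),
    "CANCEL_FLAG": [0, 1, 1],
})

# Contract duration in (approximate) years; 365.2425 days per year is an assumption.
df_contract["TOTAL_YEARS"] = (
    df_contract["END_DATE"] - df_contract["START_DATE"]
) / pd.Timedelta(days=365.2425)

# Cancelled contracts that ran longer than a year are capped at one year.
mask = (df_contract["CANCEL_FLAG"] == 1) & (df_contract["TOTAL_YEARS"] > 1)
df_contract.loc[mask, "TOTAL_YEARS"] = 1

print(df_contract)
```

Using a boolean mask with .loc writes into df_contract directly, which avoids the chained-assignment pitfalls of indexing twice.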

Solution 2:

After a lot of trial and error I got it working!

This computes the time between the start of the first contract and the end of the last contract in a chain, and also the length of the chain.

Not the cleanest code by far, but it works:

import pandas as pd

test = 'START_DATE'  # date column that is followed back through the chain


# Predecessor ID (PID), start date, and contract ID (CID) for every contract.
df_short = df_policy[['OLD_CON_ID', test, 'CONTRACT_ID']].rename(
    columns={'OLD_CON_ID': 'PID', 'CONTRACT_ID': 'CID'})
# End date (PED) of every contract, keyed by contract ID.
df_test = df_policy[['CONTRACT_ID', 'END_DATE']].rename(
    columns={'CONTRACT_ID': 'CID', 'END_DATE': 'PED'})


df_copy1 = df_short.copy()
df_copy2 = df_short.copy()
df_copy2.rename(columns={'PID':'PPID','CID':'PID'}, inplace = True)

df_merge1 = pd.merge(df_short, df_copy2,
    how='left',
    on=['PID'])

# Contracts without a predecessor keep their own start date.
df_merge1['START_DATE_y'] = df_merge1['START_DATE_y'].fillna(df_merge1['START_DATE_x'])
df_merge1.rename(columns={'START_DATE_x':'1_EFF','START_DATE_y':'2_EFF'}, inplace=True)

The copy, merge, fillna, and rename steps are repeated until there are 5 merged dataframes, and then:

df_merged = pd.merge(df_merge5, df_test,
    how='right',
    on=['CID'])

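# NumPy's 'M' unit equals 30.436875 days; newer pandas versions reject np.timedelta64(1, 'M').
df_merged['TOTAL_MONTHS'] = ((df_merged['PED'] - df_merged['6_EFF']
                             ) / pd.Timedelta(days=30.436875))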

df_merged4 = df_merged[
    (df_merged['PED'] >= pd.to_datetime('2015-07-06'))
].copy()
df_merged4['CHAIN_LENGTH'] = (
    df_merged4.drop(['PED', '1_EFF', '2_EFF', '3_EFF', '4_EFF', '5_EFF'], axis=1)
    .apply(lambda row: len(pd.unique(row)), axis=1) - 3
)

Hopefully my code is understandable and will help someone in the future.
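
As a possible alternative to repeating the merge step by hand, the chain could also be followed in a loop. The sketch below is not the code from the answer: the sample data, the MAX_HOPS limit, and the FIRST_START column are made up for illustration, CHAIN_LENGTH is assumed to mean the number of contracts in the chain, and a month is assumed to be 30.436875 days. It also assumes CONTRACT_ID is unique.

```python
import pandas as pd

# Hypothetical sample: A -> B -> C form one chain, D stands alone.
df_policy = pd.DataFrame({
    "CONTRACT_ID": ["A", "B", "C", "D"],
    "OLD_CON_ID": [None, "A", "B", None],
    "START_DATE": pd.to_datetime(
        ["2013-01-01", "2014-01-01", "2015-01-01", "2015-03-01"]),
    "END_DATE": pd.to_datetime(
        ["2014-01-01", "2015-01-01", "2016-01-01", "2016-03-01"]),
})

MAX_HOPS = 5  # assumed upper bound on chain depth; also guards against cycles

by_id = df_policy.set_index("CONTRACT_ID")
prev_id = by_id["OLD_CON_ID"]   # contract ID -> predecessor ID
start = by_id["START_DATE"]     # contract ID -> start date

# Each row starts at its own contract and walks back one predecessor per pass.
first_id = df_policy["CONTRACT_ID"].copy()
chain_length = pd.Series(1, index=df_policy.index)

for _ in range(MAX_HOPS):
    pred = first_id.map(prev_id)
    # Only step back when the predecessor exists in the data.
    has_pred = pred.notna() & pred.isin(by_id.index)
    if not has_pred.any():
        break
    first_id = first_id.where(~has_pred, pred)
    chain_length = chain_length + has_pred.astype(int)

df_policy["FIRST_START"] = first_id.map(start)
df_policy["CHAIN_LENGTH"] = chain_length
df_policy["TOTAL_MONTHS"] = (
    df_policy["END_DATE"] - df_policy["FIRST_START"]
) / pd.Timedelta(days=30.436875)

print(df_policy)
```

Each pass of the loop does the work of one merge in the approach above, so raising MAX_HOPS handles longer chains without adding more intermediate dataframes.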
