R Foverlaps Equivalent In Python

I am trying to rewrite some R code in Python and cannot get past one particular bit of code. I've found the foverlaps function in R to be very useful when performing a time-based j

Solution 1:

Consider a straightforward merge followed by a subset using pandas.Series.between(). The merge pairs every row of one table with every row of the other that shares the same join-column values, and the subset then keeps only the rows whose time falls within the interval.

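import pandas as pd
df = pd.merge(table_A, table_B, on=['x', 'y'])
# inclusive='both' requires pandas >= 1.3; older versions take inclusive=True
df = df[df['time'].between(df['start_time'], df['end_time'], inclusive='both')]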
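As a minimal sketch of the merge-then-filter approach, the example below runs it on made-up data. The column names x, y, time, start_time and end_time follow the snippet above; the values and the extra label column are hypothetical. It uses plain comparison operators in place of .between() so that it does not depend on which form of the inclusive argument the installed pandas version accepts.

```python
import pandas as pd

# Hypothetical point-in-time observations
table_A = pd.DataFrame({
    'x': [1, 1, 2],
    'y': ['a', 'a', 'b'],
    'time': pd.to_datetime(['2016-01-01 00:05:00',
                            '2016-01-01 00:25:00',
                            '2016-01-01 00:10:00']),
})

# Hypothetical intervals; 'label' is only there to identify the matched interval
table_B = pd.DataFrame({
    'x': [1, 1, 2],
    'y': ['a', 'a', 'b'],
    'start_time': pd.to_datetime(['2016-01-01 00:00:00',
                                  '2016-01-01 00:20:00',
                                  '2016-01-01 00:30:00']),
    'end_time': pd.to_datetime(['2016-01-01 00:10:00',
                                '2016-01-01 00:30:00',
                                '2016-01-01 00:40:00']),
    'label': ['first', 'second', 'third'],
})

# Step 1: every A row is paired with every B row that has the same (x, y)
merged = pd.merge(table_A, table_B, on=['x', 'y'])

# Step 2: keep only pairs where time lies in [start_time, end_time]
in_interval = (merged['time'] >= merged['start_time']) & (merged['time'] <= merged['end_time'])
result = merged[in_interval].reset_index(drop=True)

print(result)
```

Note that the intermediate merge holds every key-matched pair before filtering, so its size grows with the number of rows per (x, y) group in both tables.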

However, one important point is that your dates must be cast to a datetime type. Your post currently shows the dates as strings, so .between() would compare them as text rather than chronologically. The code below assumes US-style dates with the month first (MM/DD/YYYY). You can either convert the types while reading the file:

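from datetime import datetime

# pd.datetime was removed in pandas 2.0, so use the standard-library datetime
dateparse = lambda x: datetime.strptime(x, '%m/%d/%Y %H:%M:%S')

# date_parser is deprecated as of pandas 2.0 in favor of date_format='%m/%d/%Y %H:%M:%S'
table_A = pd.read_csv('data.csv', parse_dates=[0], date_parser=dateparse, dayfirst=False)

table_B = pd.read_csv('data.csv', parse_dates=[0,1], date_parser=dateparse, dayfirst=False)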

Or after read in:

table_A['time'] = pd.to_datetime(table_A['time'], format='%m/%d/%Y %H:%M:%S')

for col in ['start_time', 'end_time']:
    table_B[col] = pd.to_datetime(table_B[col], format='%m/%d/%Y %H:%M:%S')
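To see why the conversion matters, here is a small sketch with one made-up row in the MM/DD/YYYY format assumed above. Compared as strings, the interval test fails across a year boundary because text is ordered character by character; once parsed, the same test succeeds. The values are hypothetical.

```python
import pandas as pd

fmt = '%m/%d/%Y %H:%M:%S'

# Hypothetical row: January 15, 2017 lies between December 1, 2016 and February 1, 2017
raw = pd.DataFrame({
    'time': ['01/15/2017 00:00:00'],
    'start_time': ['12/01/2016 00:00:00'],
    'end_time': ['02/01/2017 00:00:00'],
})

# String comparison: '01/...' sorts before '12/...', so the row is wrongly rejected
as_text = (raw['time'] >= raw['start_time']) & (raw['time'] <= raw['end_time'])

# Parse every column to datetime, then repeat the same test
parsed = pd.DataFrame({col: pd.to_datetime(raw[col], format=fmt) for col in raw.columns})
as_dates = (parsed['time'] >= parsed['start_time']) & (parsed['time'] <= parsed['end_time'])

print(bool(as_text.iloc[0]), bool(as_dates.iloc[0]))
```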
