R Foverlaps Equivalent In Python
I am trying to rewrite some R code in Python and cannot get past one particular bit of code. I've found the foverlaps function in R to be very useful when performing a time-based j
Solution 1:
Consider a straightforward merge followed by a subset using pandas.Series.between(). The merge pairs every row of table_A with every row of table_B that shares the same join-column values, and the subset then keeps only the rows whose time falls within the interval.
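import pandas as pd
df = pd.merge(table_A, table_B, on=['x', 'y'])
# inclusive='both' replaces the boolean inclusive=True, which newer pandas versions no longer accept
df = df[df['time'].between(df['start_time'], df['end_time'], inclusive='both')]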
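To make the merge-then-filter idea concrete, here is a minimal sketch on made-up data. The sample values and the helper name interval_join are assumptions for illustration only; the column names follow the snippet above. It uses explicit >= and <= comparisons, which should behave like an inclusive .between() without depending on the pandas version.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer above
table_A = pd.DataFrame({
    'x': [1, 1, 2],
    'y': ['a', 'a', 'b'],
    'time': pd.to_datetime(['2016-01-01 10:00:00',
                            '2016-01-01 12:30:00',
                            '2016-01-02 09:00:00']),
})
table_B = pd.DataFrame({
    'x': [1, 2],
    'y': ['a', 'b'],
    'start_time': pd.to_datetime(['2016-01-01 09:00:00', '2016-01-02 10:00:00']),
    'end_time': pd.to_datetime(['2016-01-01 11:00:00', '2016-01-02 11:00:00']),
})


def interval_join(events, intervals, keys):
    """Pair rows on the key columns, then keep rows whose time lies in [start_time, end_time]."""
    merged = pd.merge(events, intervals, on=keys)
    inside = (merged['time'] >= merged['start_time']) & (merged['time'] <= merged['end_time'])
    return merged[inside].reset_index(drop=True)


result = interval_join(table_A, table_B, ['x', 'y'])
print(result)
```

Note that the merge materializes every key match before filtering, so on large tables with many intervals per key the intermediate frame can become much bigger than the final result.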
However, one important point is that your dates should be cast to a datetime type. Your post currently shows string dates, which are compared as text and so break the .between() filter above. The code below assumes US dates with the month first, as MM/DD/YYYY. You can either convert the types while reading the file:
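from datetime import datetime
# pd.datetime has been removed from pandas, so use the standard library's datetime
dateparse = lambda x: datetime.strptime(x, '%m/%d/%Y %H:%M:%S')
# On pandas >= 2.0, date_parser is deprecated; pass date_format='%m/%d/%Y %H:%M:%S' instead
table_A = pd.read_csv('data.csv', parse_dates=[0], date_parser=dateparse, dayfirst=False)
table_B = pd.read_csv('data.csv', parse_dates=[0,1], date_parser=dateparse, dayfirst=False)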
Or convert them after reading the data in:
table_A['time'] = pd.to_datetime(table_A['time'], format='%m/%d/%Y %H:%M:%S')
table_B['start_time'] = pd.to_datetime(table_B['start_time'], format='%m/%d/%Y %H:%M:%S')
table_B['end_time'] = pd.to_datetime(table_B['end_time'], format='%m/%d/%Y %H:%M:%S')
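As a quick illustration of why the conversion matters, the sketch below runs the same .between() check on string dates and on parsed datetimes. The single row of data and the FMT constant are made up for the example. The interval crosses a year boundary, which is exactly where text comparison of MM/DD/YYYY strings goes wrong.

```python
import pandas as pd

FMT = '%m/%d/%Y %H:%M:%S'  # month-first US format, as assumed in the answer

# Hypothetical row: the time falls inside an interval that spans New Year
raw = pd.DataFrame({
    'time': ['01/01/2016 00:30:00'],
    'start_time': ['12/31/2015 23:00:00'],
    'end_time': ['01/01/2016 02:00:00'],
})

# Strings are compared character by character, so '01/...' sorts before '12/...'
as_strings = raw['time'].between(raw['start_time'], raw['end_time'])

# Parse every column to datetime64 before comparing
parsed = raw.copy()
for col in ['time', 'start_time', 'end_time']:
    parsed[col] = pd.to_datetime(parsed[col], format=FMT)

as_datetimes = parsed['time'].between(parsed['start_time'], parsed['end_time'])

print(bool(as_strings.iloc[0]), bool(as_datetimes.iloc[0]))
```

With string columns the row is wrongly rejected, while the parsed columns give the chronologically correct answer.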