Convert A Column In Pandas Of Hh:mm To Minutes
Solution 1:
I suggest you avoid row-wise calculations. You can use a vectorised approach with Pandas / NumPy:
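import numpy as np
import pandas as pd

df = pd.DataFrame({'time': ['02:32', '02:14', '02:31', '02:15', '02:28', '02:15',
                            '02:22', '02:16', '02:22', '02:14', np.nan]})
# Treat missing times as '00:00', then split into integer hour and minute columns
values = df['time'].fillna('00:00').str.split(':', expand=True).astype(int)
# Weight hours by 60 and minutes by 1, then sum across each row
factors = np.array([60, 1])
df['mins'] = (values * factors).sum(axis=1)
print(df)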
     time  mins
0   02:32   152
1   02:14   134
2   02:31   151
3   02:15   135
4   02:28   148
5   02:15   135
6   02:22   142
7   02:16   136
8   02:22   142
9   02:14   134
10    NaN     0
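Note that the approach above maps a missing time to 0 minutes, which cannot be told apart from a genuine '00:00'. If you would rather keep missing values as NaN, a minimal sketch along the same vectorised lines might look like this. The names parts, hours and minutes are only illustrative, and the shortened sample data is an assumption for the example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'time': ['02:32', '02:14', np.nan]})

# Split into hour and minute columns; the row with a missing time stays missing
parts = df['time'].str.split(':', expand=True)

# pd.to_numeric converts the digit strings to numbers and leaves missing values as NaN
hours = pd.to_numeric(parts[0])
minutes = pd.to_numeric(parts[1])

df['mins'] = hours * 60 + minutes
print(df)
```

Because of the NaN, the mins column comes out as floats rather than integers.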
Solution 2:
If you want to use split, you will need to use the str accessor, i.e. s.str.split(':').
However, I think that in this case it makes more sense to use apply:
import pandas as pd

df = pd.DataFrame({'Enroute_time_(hh mm)': ['02:32', '02:14', '02:31',
                                            '02:15', '02:28', '02:15',
                                            '02:22', '02:16', '02:22', '02:14']})

def convert_to_minutes(value):
    # Split 'hh:mm' into its two parts and combine them into total minutes
    hours, minutes = value.split(':')
    return int(hours) * 60 + int(minutes)

df['Enroute_time_(hh mm)'] = df['Enroute_time_(hh mm)'].apply(convert_to_minutes)
print(df)
#       Enroute_time_(hh mm)
#    0                   152
#    1                   134
#    2                   151
#    3                   135
#    4                   148
#    5                   135
#    6                   142
#    7                   136
#    8                   142
#    9                   134
Solution 3:
As I understand it, you have a DataFrame column that holds several timedeltas as strings. You want to extract the total minutes from each one, and then fill the NaN values with the median of those totals.
import pandas as pd
df = pd.DataFrame(
     {'hhmm' : ['02:32',
                '02:14',
                '02:31',
                '02:15',
                '02:28',
                '02:15',
                '02:22',
                '02:16',
                '02:22',
                '02:14']})
Your Timedeltas are not Timedeltas. They are strings. So you need to convert them first.
# Parsing with a time-only format assigns the default date 1900-01-01
df.hhmm = pd.to_datetime(df.hhmm, format='%H:%M')
# Subtract that default date so only the duration remains
df.hhmm = pd.to_timedelta(df.hhmm - pd.Timestamp(1900, 1, 1))
This gives you the following values (note the dtype, timedelta64[ns]):
0   02:32:00
1   02:14:00
2   02:31:00
3   02:15:00
4   02:28:00
5   02:15:00
6   02:22:00
7   02:16:00
8   02:22:00
9   02:14:00
Name: hhmm, dtype: timedelta64[ns]
Now that you have true timedeltas, you can use functions like total_seconds() and then calculate the minutes:
df.hhmm.dt.total_seconds() / 60
If that is not what you wanted, you can also use the following:
df.hhmm.dt.components.minutes
This gives you only the minutes component of the HH:MM string, as if you had split it (32 for '02:32', not 152).
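The difference between the two accessors is easy to miss, so here is a small sketch that puts them side by side. The sample durations and the variable names are made up for illustration:

```python
import pandas as pd

# Three illustrative durations, already in hh:mm:ss form
durations = pd.to_timedelta(pd.Series(['02:32:00', '00:45:00', '10:05:00']))

# Whole duration expressed in minutes
total_minutes = durations.dt.total_seconds() / 60

# Only the minutes field of each duration; the hours are ignored
minute_component = durations.dt.components.minutes

print(total_minutes.tolist())     # [152.0, 45.0, 605.0]
print(minute_component.tolist())  # [32, 45, 5]
```

For converting hh:mm to minutes, total_seconds() / 60 is usually the one you want.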
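To fill the NaN values, compute the minutes first and then fill that numeric result with its median:
mins = df.hhmm.dt.total_seconds() / 60
mins = mins.fillna(mins.median())
or, using only the minutes component:
mins = df.hhmm.dt.components.minutes
mins = mins.fillna(mins.median())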
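Putting the steps of this solution together, an end-to-end sketch could look like the following. It assumes a shortened sample column with one missing value, and the mins column name is only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'hhmm': ['02:32', '02:14', np.nan, '02:15']})

# Parse as datetimes (default date 1900-01-01), then subtract that date
# to get timedeltas; the missing entry becomes NaT
deltas = pd.to_datetime(df['hhmm'], format='%H:%M') - pd.Timestamp(1900, 1, 1)

# Total minutes per row; NaT turns into NaN
df['mins'] = deltas.dt.total_seconds() / 60

# Fill the gap with the median of the known values
df['mins'] = df['mins'].fillna(df['mins'].median())
print(df)
```

The median is taken over the non-missing rows only, since median() skips NaN by default.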