
Pandas: How To Create A Datetime Object From Week And Year?

I have a dataframe that provides two integer columns with the Year and Week of the year:

import pandas as pd
import numpy as np
L1 = [43,44,51,2,5,12]
L2 = [2016,2016,2016,2017,201

Solution 1:

Try this:

In [19]: pd.to_datetime(df.Year.astype(str), format='%Y') + \
             pd.to_timedelta(df.Week.mul(7).astype(str) + ' days')
Out[19]:
0   2016-10-28
1   2016-11-04
2   2016-12-23
3   2017-01-15
4   2017-02-05
5   2017-03-26
dtype: datetime64[ns]
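
To make clear what this expression computes, here is a minimal self-contained sketch. The sample Week and Year values are an assumption chosen to reproduce the first rows of the output above. The result is 1 January of the year plus Week * 7 days, so it inherits the weekday of 1 January rather than landing on a week start, and it is not an ISO week date.

```python
import pandas as pd

# Sample data (an assumption for illustration only).
df = pd.DataFrame({"Week": [43, 44, 51], "Year": [2016, 2016, 2016]})

# 1 January of each year, plus Week * 7 days.
jan1 = pd.to_datetime(df["Year"].astype(str), format="%Y")
approx = jan1 + pd.to_timedelta(df["Week"] * 7, unit="D")

# Show each date with its weekday; 1 January 2016 was a Friday,
# so every result in this sample is a Friday as well.
print(pd.DataFrame({"date": approx, "weekday": approx.dt.day_name()}))
```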

Initially I have timestamps in s

It's much easier to parse it from UNIX epoch timestamp:

df['Date'] = pd.to_datetime(df['UNIX_Time'], unit='s')

Timing for 10M rows DF:

Setup:

In [26]: df = pd.DataFrame(pd.date_range('1970-01-01', freq='1T', periods=10**7), columns=['date'])

In [27]: df.shape
Out[27]: (10000000, 1)

In [28]: df['unix_ts'] = df['date'].astype(np.int64)//10**9

In [30]: df
Out[30]:
                       date    unix_ts
0       1970-01-01 00:00:00          0
1       1970-01-01 00:01:00         60
2       1970-01-01 00:02:00        120
3       1970-01-01 00:03:00        180
4       1970-01-01 00:04:00        240
5       1970-01-01 00:05:00        300
6       1970-01-01 00:06:00        360
7       1970-01-01 00:07:00        420
8       1970-01-01 00:08:00        480
9       1970-01-01 00:09:00        540
...                     ...        ...
9999990 1989-01-05 10:30:00  599999400
9999991 1989-01-05 10:31:00  599999460
9999992 1989-01-05 10:32:00  599999520
9999993 1989-01-05 10:33:00  599999580
9999994 1989-01-05 10:34:00  599999640
9999995 1989-01-05 10:35:00  599999700
9999996 1989-01-05 10:36:00  599999760
9999997 1989-01-05 10:37:00  599999820
9999998 1989-01-05 10:38:00  599999880
9999999 1989-01-05 10:39:00  599999940

[10000000 rows x 2 columns]

Check:

In [31]: pd.to_datetime(df.unix_ts, unit='s')
Out[31]:
0         1970-01-01 00:00:00
1         1970-01-01 00:01:00
2         1970-01-01 00:02:00
3         1970-01-01 00:03:00
4         1970-01-01 00:04:00
5         1970-01-01 00:05:00
6         1970-01-01 00:06:00
7         1970-01-01 00:07:00
8         1970-01-01 00:08:00
9         1970-01-01 00:09:00
                  ...
9999990   1989-01-05 10:30:00
9999991   1989-01-05 10:31:00
9999992   1989-01-05 10:32:00
9999993   1989-01-05 10:33:00
9999994   1989-01-05 10:34:00
9999995   1989-01-05 10:35:00
9999996   1989-01-05 10:36:00
9999997   1989-01-05 10:37:00
9999998   1989-01-05 10:38:00
9999999   1989-01-05 10:39:00
Name: unix_ts, Length: 10000000, dtype: datetime64[ns]

Timing:

In [32]: %timeit pd.to_datetime(df.unix_ts, unit='s')
10 loops, best of 3: 156 ms per loop

Conclusion: I think 156 milliseconds for converting 10,000,000 rows is not that slow.

Solution 2:

As @Gianmario Spacagna mentioned, for dates from 2018 onward use %V with %G:

L1= [43,44,51,2,5,12,52,53,1,2,5,52]
L2= [2016,2016,2016,2017,2017,2017,2018,2018,2019,2019,2019,2019]
df = pd.DataFrame({"Week": L1, "Year": L2})
df['new'] = pd.to_datetime(df.Week.astype(str) + df.Year.astype(str).add('-1'), format='%V%G-%u')
print(df)

    Week  Year        new
0     43  2016 2016-10-24
1     44  2016 2016-10-31
2     51  2016 2016-12-19
3      2  2017 2017-01-09
4      5  2017 2017-01-30
5     12  2017 2017-03-20
6     52  2018 2018-12-24
7     53  2018 2018-12-31
8      1  2019 2018-12-31
9      2  2019 2019-01-07
10     5  2019 2019-01-28
11    52  2019 2019-12-23

Solution 3:

There is something fishy going on with weeks from 2019 onward. The ISO 8601 standard assigns 31 December 2018 to week 1 of 2019. The other approaches based on:

pd.to_datetime(df.Week.astype(str)+
                  df.Year.astype(str).add('-2') ,format='%W%Y-%w')

will give shifted results starting from 2019.
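
The shift can be reproduced with the standard library alone. The sketch below parses "week 1 of 2019, Monday" twice: once with %W and %Y (where week 1 begins on the first Monday of the calendar year) and once with the ISO directives %G, %V and %u. The input strings are illustrative, and the ISO directives assume Python 3.6 or later.

```python
from datetime import datetime

# %W counts weeks from the first Monday of the calendar year;
# days before that Monday belong to week 0.
non_iso = datetime.strptime("2019-1-1", "%Y-%W-%w")

# %G/%V/%u follow ISO 8601: week 1 is the week containing 4 January.
iso = datetime.strptime("2019-W1-1", "%G-W%V-%u")

print(non_iso.date())  # 2019-01-07
print(iso.date())      # 2018-12-31
```

The two results are one week apart because 1 January 2019 fell on a Tuesday, so the first Monday of the calendar year is 7 January while ISO week 1 already started on 31 December 2018.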

In order to be compliant with the ISO-8601 standard you would have to do the following:

import pandas as pd
import datetime

L1 = [52,53,1,2,5,52]
L2 = [2018,2018,2019,2019,2019,2019]
df = pd.DataFrame({"Week":L1,"Year":L2})
df['ISO'] = df['Year'].astype(str) + '-W' + df['Week'].astype(str) + '-1'
df['DT'] = df['ISO'].map(lambda x: datetime.datetime.strptime(x, "%G-W%V-%u"))
print(df)

It prints:

   Week  Year         ISO         DT
0    52  2018  2018-W52-1 2018-12-24
1    53  2018  2018-W53-1 2018-12-31
2     1  2019   2019-W1-1 2018-12-31
3     2  2019   2019-W2-1 2019-01-07
4     5  2019   2019-W5-1 2019-01-28
5    52  2019  2019-W52-1 2019-12-23

ISO year 2018 has only 52 weeks, so week 53 of 2018 is not a valid ISO week; in the output above it lands on the same Monday as week 1 of 2019.

You can verify this yourself at https://www.epochconverter.com/weeks/2019.
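
Whether a year has a week 53 can also be checked in code before building dates. This is a small sketch using only the standard library; the helper name iso_weeks_in_year is hypothetical and not part of pandas or datetime. It relies on the fact that 28 December always falls in the last ISO week of its year.

```python
from datetime import date

def iso_weeks_in_year(year):
    """Return the number of ISO weeks (52 or 53) in the given ISO year."""
    # 28 December is always in the last ISO week of its own year.
    return date(year, 12, 28).isocalendar()[1]

print(iso_weeks_in_year(2018))  # 52
print(iso_weeks_in_year(2020))  # 53
```

A check like this could be used to flag rows such as (2018, 53) instead of silently mapping them into the following year.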

Solution 4:

If you want to follow the ISO week date definition:

Weeks start with Monday. Each week's year is the Gregorian year in which the Thursday falls. The first week of the year, hence, always contains 4 January. ISO week year numbering therefore slightly deviates from the Gregorian for some days close to 1 January.
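
These rules can be checked directly with the standard library. The sketch below is only an illustration of the definition quoted above; date.fromisocalendar assumes Python 3.8 or later.

```python
from datetime import date

# 31 December 2018 is a Monday whose week contains Thursday 3 January 2019,
# so it belongs to ISO week 1 of ISO year 2019.
iso_year, iso_week, iso_weekday = date(2018, 12, 31).isocalendar()
print(iso_year, iso_week, iso_weekday)  # 2019 1 1

# The reverse direction: ISO year, ISO week, weekday (1 = Monday) -> date.
monday = date.fromisocalendar(2019, 1, 1)
print(monday)  # 2018-12-31

# 4 January always lies in ISO week 1 (checked here for a range of years).
jan4_in_week1 = all(date(y, 1, 4).isocalendar()[1] == 1 for y in range(2000, 2031))
print(jan4_in_week1)  # True
```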

The following sample code generates a sequence of 60 dates, starting from Sunday, 18 Dec 2016, and adds the appropriate columns.

It adds:

  • A "Date"
  • Week Day of the "Date"
  • Finds the Week Starting Monday of that "Date"
  • Finds the Year of the Week Starting Monday of that "Date"
  • Adds a Week Number (ISO)
  • Gets the Starting Monday Date, from Year and Week Number

Sample Code Below:

# Generate Some Dates
dft1 = pd.DataFrame(pd.date_range('2016-12-18', freq='D', periods=60))
dft1.columns = ['e_FullDate']
dft1['e_FullDateWeekDay'] = dft1.e_FullDate.dt.day_name().str.slice(0,3)


#Add a Week Start Date (Monday)
dft1['e_week_start'] = dft1['e_FullDate'] - pd.to_timedelta(dft1['e_FullDate'].dt.weekday,
                                                      unit='D')
dft1['e_week_startWeekDay'] = dft1.e_week_start.dt.day_name().str.slice(0,3)

#Add a Week Start Year
dft1['e_week_start_yr'] = dft1.e_week_start.dt.year

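#Add a Week Number of Week Start Monday
#(Series.dt.week was removed in pandas 2.0; dt.isocalendar().week is the replacement)
dft1['e_week_no'] = dft1['e_week_start'].dt.isocalendar().week.astype(int)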

#Add a Week Start generate from Week Number and Year
dft1['e_week_start_from_week_no'] = pd.to_datetime(dft1.e_week_no.astype(str)+
                  dft1.e_week_start_yr.astype(str).add('-1') ,format='%W%Y-%w')
dft1['e_week_start_from_week_noWeekDay'] = dft1.e_week_start_from_week_no.dt.day_name().str.slice(0,3)


with pd.option_context('display.max_rows', 999, 'display.max_columns', 0, 'display.max_colwidth', 9999):
    display(dft1)
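
As a variation on the code above, the round trip can also be sketched with the ISO year taken from Series.dt.isocalendar() (pandas 1.1 or later) and date.fromisocalendar (Python 3.8 or later) instead of the %W%Y-%w format. This is an assumption-laden sketch rather than the answer's own method; the date range is chosen to straddle the 2018/2019 boundary, and the variable names are illustrative.

```python
from datetime import date
import pandas as pd

# Three weeks of dates around the 2018/2019 year boundary.
dates = pd.Series(pd.date_range("2018-12-24", freq="D", periods=21))

# ISO year, ISO week and ISO weekday of each date.
iso = dates.dt.isocalendar()

# Monday of each date's week, computed directly from the date.
week_start = dates - pd.to_timedelta(dates.dt.weekday, unit="D")

# Monday rebuilt from (ISO year, ISO week) only.
rebuilt = pd.Series(pd.to_datetime([
    date.fromisocalendar(int(y), int(w), 1)
    for y, w in zip(iso["year"], iso["week"])
]))

# Both routes should agree for every row.
print((rebuilt == week_start).all())
```

Using the ISO year rather than the calendar year of the Monday matters for weeks such as the one starting 31 December 2018, which is week 1 of ISO year 2019.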

