How To Get Aggregate Of Data From Multiple Dates In Pandas?
I have the following data:
import pandas as pd
import numpy as np
df = pd.DataFrame(data={'name':['a', 'b', 'c', 'd', 'e', 'f'], 'vaccine_1':['2021-01-20',
Solution 1:
We can first melt the dataframe using DataFrame.melt, then count each (date, vaccine) pair with pd.crosstab:
out = df.filter(like='vaccine').melt(var_name='vaccine', value_name='date')
print(pd.crosstab(out['date'], out['vaccine']))
vaccine     vaccine_1  vaccine_2
date
2021-01-20          2          0
2021-02-20          1          0
2021-02-22          1          2
2021-02-23          1          0
2021-02-25          0          1
2021-03-22          0          1
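Since the data in the question is cut off, here is a minimal, self-contained sketch of the same melt-then-crosstab approach. The frame below is a made-up stand-in (not the asker's actual data), chosen so that the counts line up with the output above; the names long_df and counts are only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: the original frame is truncated in the question.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "vaccine_1": ["2021-01-20", "2021-01-20", "2021-02-20",
                  np.nan, "2021-02-22", "2021-02-23"],
    "vaccine_2": ["2021-02-22", "2021-02-22", "2021-02-25",
                  "2021-03-22", np.nan, np.nan],
})

# Long format: one row per (vaccine column, date) pair.
long_df = df.filter(like="vaccine").melt(var_name="vaccine", value_name="date")

# Cross-tabulate: rows are dates, columns are vaccines, cells are counts.
# Rows with a missing date are left out of the table.
counts = pd.crosstab(long_df["date"], long_df["vaccine"])
print(counts)
```

The result keeps the dates in the index; append .reset_index() if you would rather have date as an ordinary column.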
Solution 2:
Count each vaccine column separately:
(df.filter(like='vaccine')
.apply(pd.Series.value_counts)
.fillna(0)
.add_suffix('_total')
.rename_axis('date')
.reset_index())
         date  vaccine_1_total  vaccine_2_total
0  2021-01-20              2.0              0.0
1  2021-02-20              1.0              0.0
2  2021-02-22              1.0              2.0
3  2021-02-23              1.0              0.0
4  2021-02-25              0.0              1.0
5  2021-03-22              0.0              1.0
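The totals come out as floats because the per-column counts are aligned on the union of dates, and the gaps are NaN until fillna(0) runs. If whole numbers are preferred, one option is to cast after filling. This is a sketch on made-up stand-in data (the question's frame is truncated), and the name totals is only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: the original frame is truncated in the question.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "vaccine_1": ["2021-01-20", "2021-01-20", "2021-02-20",
                  np.nan, "2021-02-22", "2021-02-23"],
    "vaccine_2": ["2021-02-22", "2021-02-22", "2021-02-25",
                  "2021-03-22", np.nan, np.nan],
})

totals = (
    df.filter(like="vaccine")
      .apply(pd.Series.value_counts)  # one value_counts per column, aligned on date
      .fillna(0)                      # dates missing from a column count as 0
      .astype(int)                    # safe to cast once the NaNs are gone
      .add_suffix("_total")
      .rename_axis("date")
      .reset_index()
)
print(totals)
```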
Solution 3:
You could do a melt, get the value counts, then unstack to put the vaccines as headers:
(df.melt('name', value_name = 'Date')
.drop(columns='name')
.value_counts()
.unstack('variable', fill_value=0)
.add_suffix('_total')
# the last two steps are not necessary;
# indexes are a good thing
.rename_axis(columns=None)
.reset_index()
)
            vaccine_1_total  vaccine_2_total
Date
2021-01-20                2                0
2021-02-20                1                0
2021-02-22                1                2
2021-02-23                1                0
2021-02-25                0                1
2021-03-22                0                1
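For reference, here is a runnable sketch of the same chain that stops before the two optional steps, so Date stays in the index. It assumes a pandas version that has DataFrame.value_counts (1.1 or later) and uses made-up stand-in data, since the question's frame is truncated; the name wide is only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: the original frame is truncated in the question.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "vaccine_1": ["2021-01-20", "2021-01-20", "2021-02-20",
                  np.nan, "2021-02-22", "2021-02-23"],
    "vaccine_2": ["2021-02-22", "2021-02-22", "2021-02-25",
                  "2021-03-22", np.nan, np.nan],
})

wide = (
    df.melt("name", value_name="Date")    # columns: name, variable, Date
      .drop(columns="name")
      .value_counts()                     # counts per (variable, Date); NaN rows are skipped
      .unstack("variable", fill_value=0)  # vaccines become the column headers
      .add_suffix("_total")
)
print(wide)
```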