Split Dataframes In Groups And Sub-groups And Store The Output In A Csv File

December 19, 2023

Suppose I have a large dataframe like this:

A                  B       C
27/6/2017 4:00:00  928.04  4.83
27/6/2017 4:20:00  927.71  4.61
27/6/2017 4:40:00  928.22  4.49
27/6/2

Solution 1:

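You could use diff and pd.Timedelta to build the first-level groupby key: a new group starts whenever the gap between consecutive rows of A exceeds 4 hours. For the second level, df.B // x * x floors B to a multiple of x, which divides B into ranges of width x (here x = 100). This requires column A to be a datetime column, not strings.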
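To make the two keys easier to follow, here is a minimal sketch that evaluates each one on its own. The four-row frame is a made-up miniature of the question's data, and the names gap_starts_block, block_id and b_bin are illustrative only.

```python
import pandas as pd

# Hypothetical miniature of the question's data; A is already datetime64.
df = pd.DataFrame({
    "A": pd.to_datetime([
        "2017-06-27 04:00:00",
        "2017-06-27 04:20:00",
        "2017-06-27 16:20:00",
        "2017-06-27 16:40:00",
    ]),
    "B": [928.04, 898.74, 662.45, 732.02],
})

# diff() is the gap to the previous row (NaT for the first row).
# NaT > Timedelta evaluates to False, so the first row stays in block 0.
gap_starts_block = df["A"].diff() > pd.Timedelta(hours=4)

# cumsum() over the booleans increases by one at every large gap,
# turning the flags into a running block number.
block_id = gap_starts_block.cumsum()

# Floor-dividing by the bin width and multiplying back rounds each
# value down to the start of its 100-wide range.
b_bin = df["B"] // 100 * 100

print(block_id.tolist())  # [0, 0, 1, 1]
print(b_bin.tolist())     # [900.0, 800.0, 600.0, 700.0]
```

Note that b_bin is a float series, so a value such as 900.0 keeps its decimal point when it is formatted into a file name.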

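import pandas as pd

# Key 1: running block number, incremented when the time gap exceeds 4 hours.
# Key 2: B floored to a multiple of 100, i.e. its 100-wide range.
grps = [(df.A.diff() > pd.Timedelta(hours=4)).cumsum(), df.B // 100 * 100]
for i, g in df.groupby(grps):
    g.to_csv('{}_{}.csv'.format(*i))
    print(g)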

                    A       B     C
3 2017-06-27 05:00:00  898.74  3.81
4 2017-06-27 05:20:00  895.16  3.55
5 2017-06-27 05:40:00  895.05  3.40
6 2017-06-27 06:00:00  895.68  3.30 

                    A       B     C
0 2017-06-27 04:00:00  928.04  4.83
1 2017-06-27 04:20:00  927.71  4.61
2 2017-06-27 04:40:00  928.22  4.49 

                     A       B     C
7  2017-06-27 16:20:00  662.45  1.52
8  2017-06-27 16:40:00  639.98  1.48
13 2017-06-27 19:00:00  652.10  1.51
14 2017-06-27 19:20:00  638.58  1.68
15 2017-06-27 19:40:00  633.14  1.66
16 2017-06-27 20:00:00  654.66  1.45 

                     A       B     C
9  2017-06-27 17:40:00  732.02  1.79
10 2017-06-27 18:00:00  722.63  1.98
11 2017-06-27 18:20:00  713.26  1.79
12 2017-06-27 18:40:00  705.80  1.54
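For completeness, here is a self-contained sketch of the same approach that starts from the day-first strings shown in the question. Several things in it are assumptions rather than part of the original answer: the function name split_to_csv and its parameters, the explicit date format, casting the range key to int so the file names read 0_900.csv rather than 0_900.0.csv, and writing into a temporary directory. The five sample rows are taken from the data above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# A few rows in the question's raw format (day/month/year strings).
raw = pd.DataFrame({
    "A": ["27/6/2017 4:00:00", "27/6/2017 4:20:00", "27/6/2017 5:00:00",
          "27/6/2017 16:20:00", "27/6/2017 17:40:00"],
    "B": [928.04, 927.71, 898.74, 662.45, 732.02],
    "C": [4.83, 4.61, 3.81, 1.52, 1.79],
})


def split_to_csv(df, out_dir, max_gap_hours=4, bin_width=100):
    """Write one CSV per (time block, B range); return the paths written.

    Hypothetical helper: the name and parameters are illustrative.
    """
    df = df.copy()
    # Assumed format: day first, as the parsed output in the answer suggests.
    df["A"] = pd.to_datetime(df["A"], format="%d/%m/%Y %H:%M:%S")

    # First-level key: block number that increases after each long gap.
    gap = pd.Timedelta(hours=max_gap_hours)
    block = (df["A"].diff() > gap).cumsum().rename("block")

    # Second-level key: lower edge of the B range, cast to int for file names.
    b_bin = (df["B"] // bin_width * bin_width).astype(int).rename("b_bin")

    paths = []
    for (blk, lo), group in df.groupby([block, b_bin]):
        path = Path(out_dir) / f"{blk}_{lo}.csv"
        group.to_csv(path, index=False)
        paths.append(path)
    return paths


out_dir = tempfile.mkdtemp()
written = split_to_csv(raw, out_dir)
names = [p.name for p in written]
print(names)  # ['0_800.csv', '0_900.csv', '1_600.csv', '1_700.csv']
```

groupby sorts the keys by default, which is why the 800 range is written before the 900 range even though its rows come later in the data.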
