
How To Categorize A Range Of Values In Pandas Dataframe

Suppose I have the following DataFrame:

    Area
0  14.68
1  40.54
2  10.82
3   2.31
4  22.30

I want to categorize those values into ranges, like A: [1,10], B: [11,20], C...

Solution 1:

What works for me is using cat.codes as an index into the list a, after converting a to a NumPy array:

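import numpy as np
import pandas as pd

# bins is assumed to be defined already; its definition is not shown here
a = list('ABCDEF')
df['new'] = np.array(a)[pd.cut(df["Area"], bins = bins).cat.codes]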
print (df)
     Area new
0   14.68   B
1   40.54   C
2   10.82   A
3    2.31   A
4   22.30   C
5  600.00   F

catDf = pd.Series(np.array(a)[pd.cut(df["Area"], bins = bins).cat.codes], index=df.index)
print (catDf)
0    B
1    C
2    A
3    A
4    C
5    F
dtype: object
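
One caveat with this approach: pd.cut assigns the code -1 to any value that falls outside every bin, and indexing a NumPy array with -1 silently returns the last label. The following is a minimal sketch of a guard against that. The bin edges and the extra 5000.00 row are assumptions made for illustration (the original bins is not shown), so the labels differ slightly from the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical bin edges; the original `bins` is not shown in the answer.
bins = [0, 10, 20, 50, 100, 500, 1000]
labels = np.array(list("ABCDEF"))

# The 5000.00 row is added here to show an out-of-range value.
df = pd.DataFrame({"Area": [14.68, 40.54, 10.82, 2.31, 22.30, 600.00, 5000.00]})

# cat.codes holds the integer bin position, or -1 when a value is outside all bins.
codes = pd.cut(df["Area"], bins=bins).cat.codes.to_numpy()

# Look up the labels, then blank out the rows whose code was -1
# so they do not wrap around to the last label ("F").
df["new"] = pd.Series(labels[codes], index=df.index).where(codes >= 0)
print(df)
```

Without the where() mask, the 5000.00 row would be labeled F even though it lies outside every bin.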

Solution 2:

Assuming that bins is a global variable holding (low, high) pairs, you could do this:

def number_to_bin(number):
    ALPHABETS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    # bins is a sequence of inclusive (low, high) pairs
    for i, (low, high) in enumerate(bins):
        if low <= number <= high:
            return ALPHABETS[i]

df["Area"] = df["Area"].apply(number_to_bin)
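
For a self-contained version of this approach, the sketch below assumes bins is a list of inclusive (low, high) pairs. The pairs are hypothetical, modeled on the ranges in the question, and the result is written to a new column (named category here, also an assumption) rather than overwriting Area. Note that with integer ranges like [1,10] and [11,20], a value such as 10.82 falls in the gap between them and gets no label.

```python
import pandas as pd

# Hypothetical inclusive (low, high) ranges, modeled on the question's
# A: [1,10], B: [11,20]; the third range is an assumption.
bins = [(1, 10), (11, 20), (21, 50)]
ALPHABETS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def number_to_bin(number):
    """Return the letter of the first range containing number, else None."""
    for i, (low, high) in enumerate(bins):
        if low <= number <= high:
            return ALPHABETS[i]
    return None  # number lies outside every range


df = pd.DataFrame({"Area": [14.68, 40.54, 10.82, 2.31, 22.3]})
df["category"] = df["Area"].apply(number_to_bin)
print(df)
```

Because apply calls the function once per row, this is slower than pd.cut on large frames, but it handles arbitrary, non-contiguous ranges.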

Solution 3:

You can pass the labels to pd.cut directly, as follows.

Note: I am not sure which ranges you used, so the bin edges below are a guess:

pd.cut(df.Area, [1,10, 20, 50, 100], labels=['A', 'B', 'C', 'D'])

0    B
1    C
2    B
3    A
4    C
Name: Area, dtype: category
Categories (4, object): [A < B < C < D]
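
As a runnable sketch of the labels approach, using the same guessed edges as the call above: by default pd.cut builds right-closed intervals, so 10.82 lands in B, and any value outside the edges becomes NaN. The 600.0 row and the category column name are additions made here for illustration.

```python
import pandas as pd

# The 600.0 row is added to show what happens to an out-of-range value.
df = pd.DataFrame({"Area": [14.68, 40.54, 10.82, 2.31, 22.3, 600.0]})

# These edges define the intervals (1, 10], (10, 20], (20, 50], (50, 100].
edges = [1, 10, 20, 50, 100]
names = ["A", "B", "C", "D"]

df["category"] = pd.cut(df["Area"], bins=edges, labels=names)
print(df)

# With right=False the intervals become [1, 10), [10, 20), and so on,
# so a value sitting exactly on an edge moves up to the next label.
left_closed = pd.cut(pd.Series([10, 20]), bins=edges, labels=names, right=False)
print(left_closed.tolist())
```

The number of labels must be one less than the number of edges, since each label names the interval between two adjacent edges.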
