How To Get Row, Column Indices Of All Non-nan Items In Pandas Dataframe

How do I iterate over a DataFrame like the following and return the locations of the non-NaN values as (row, column) tuples?

df:
     0    1    2
0  NaN  NaN    1
1    1  NaN  NaN
2  NaN    2  NaN

Solution 1:

Assuming the order of the results does not matter, you can stack the non-null values and read the locations off the resulting index.

In [26]: list(df[df.notnull()].stack().index)
Out[26]: [(0L, '2'), (1L, '0'), (2L, '1')]

In [27]: df[df.notnull()].stack().index
Out[27]:
MultiIndex(levels=[[0, 1, 2], [u'0', u'1', u'2']],
           labels=[[0, 1, 2], [2, 0, 1]])

Furthermore, stack() drops NaN values by default here, so the notnull() mask is not needed:

In [28]: list(df.stack().index)
Out[28]: [(0L, '2'), (1L, '0'), (2L, '1')]
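
For readers on Python 3, the stack approach above can be sketched as a self-contained script. The DataFrame construction is an assumption (integer column labels, values taken from the question). The explicit notna() filter is there because newer pandas releases offer a stack implementation that may keep NaN entries instead of dropping them, so filtering makes the result independent of that behaviour.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's DataFrame (integer column labels).
df = pd.DataFrame(
    [[np.nan, np.nan, 1], [1, np.nan, np.nan], [np.nan, 2, np.nan]],
    columns=[0, 1, 2],
)

# stack() reshapes the frame into a Series indexed by (row label, column label).
stacked = df.stack()

# Filter explicitly so the result does not depend on whether stack() dropped NaN.
locations = stacked[stacked.notna()].index.tolist()

print(locations)
```

The tuples hold index labels, not positions, which is why the session above shows the column as the string '2'.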

Solution 2:

To get the non-null locations:

>>> import numpy as np
>>> np.argwhere(df.notnull().values).tolist()
[[0, 2], [1, 0], [2, 1]]

If you really want them as tuple pairs, just use a list comprehension:

>>> [tuple(pair) for pair in np.argwhere(df.notnull().values).tolist()]
[(0, 2), (1, 0), (2, 1)]

To get the null locations:

>>> np.argwhere(df.isnull().values).tolist()
[[0, 0], [0, 1], [1, 1], [1, 2], [2, 0], [2, 2]]
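
Note that np.argwhere reports integer positions, not index labels. When the DataFrame has non-default labels, the positions can be mapped back through df.index and df.columns. A minimal sketch, assuming hypothetical row labels 'a' to 'c' and column labels 'x' to 'z' that are not part of the original question:

```python
import numpy as np
import pandas as pd

# Hypothetical labels for illustration; the data layout follows the question.
df = pd.DataFrame(
    [[np.nan, np.nan, 1], [1, np.nan, np.nan], [np.nan, 2, np.nan]],
    index=["a", "b", "c"],
    columns=["x", "y", "z"],
)

# Integer (row, column) positions of the non-null cells.
positions = np.argwhere(df.notna().to_numpy())

# Translate each position into its (row label, column label) pair.
labels = [(df.index[r], df.columns[c]) for r, c in positions]

print(labels)  # [('a', 'z'), ('b', 'x'), ('c', 'y')]
```

Positions are what iloc expects, while labels are what loc expects, so which form to keep depends on how the locations are used afterwards.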

Solution 3:

A more direct way, using np.where:

import numpy as np
list(zip(*np.where(df.notnull())))

which gives:

[(0, 2), (1, 0), (2, 1)]
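
With a single argument, np.where returns one array of row positions and one array of column positions, and zip pairs them up. On Python 3 the resulting tuples hold NumPy integer scalars, which recent NumPy versions display as np.int64(...) rather than bare numbers. The sketch below, which assumes the same DataFrame as in the question, converts each array to plain Python ints first:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's DataFrame (default integer labels).
df = pd.DataFrame(
    [[np.nan, np.nan, 1], [1, np.nan, np.nan], [np.nan, 2, np.nan]]
)

# One array of row positions and one of column positions for the non-null cells.
rows, cols = np.where(df.notna().to_numpy())

# tolist() yields plain Python ints, so the tuples print without NumPy wrappers.
pairs = list(zip(rows.tolist(), cols.tolist()))

print(pairs)  # [(0, 2), (1, 0), (2, 1)]
```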
