Flexibly Select Pandas Dataframe Rows Using Dictionary

Suppose I have the following dataframe:

df = pd.DataFrame({'color':['red', 'green', 'blue'], 'brand':['Ford','fiat', 'opel'], 'year':[2016,2016,2017]})

brand color yea

Solution 1:

Yes, there is! You can build a query string using a simple list comprehension, and pass the string to query for dynamic evaluation.

# m is the dictionary of column/value filters, e.g. {'color': 'red', 'year': 2016}
query = ' and '.join([f'{k} == {repr(v)}' for k, v in m.items()])
# query = ' and '.join(['{} == {}'.format(k, repr(v)) for k, v in m.items()])
new_df = df.query(query)

print(query)
# "color == 'red' and year == 2016"

print(new_df)
  color brand  year
0   red  Ford  2016

For more on query (and eval), see my post here: Dynamic Expression Evaluation in pandas using pd.eval()
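
As a side note on the query approach: a plain `k == v` string breaks when a column name is not a valid Python identifier. The sketch below assumes pandas 0.25 or later, where query accepts backtick-quoted column names; the 'model year' column is a hypothetical rename used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'color': ['red', 'green', 'blue'],
    'brand': ['Ford', 'fiat', 'opel'],
    'model year': [2016, 2016, 2017],  # hypothetical column name containing a space
})
m = {'color': 'red', 'model year': 2016}

# Wrap every column name in backticks so names with spaces are accepted
# (assumes pandas >= 0.25); {v!r} is shorthand for repr(v).
query = ' and '.join(f'`{k}` == {v!r}' for k, v in m.items())
new_df = df.query(query)

print(query)
print(new_df)
```

Because the values are inserted with repr, this is only suitable for simple scalars such as strings and numbers.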


For better performance, and to handle column names that contain spaces or other characters that query cannot parse, use np.logical_and.reduce:

df[np.logical_and.reduce([df[k] == v for k,v in m.items()])] 

  color brand  year
0   red  Ford  2016
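
The one-liner above can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the name select_rows and the 'model year' column are hypothetical, and the empty-dictionary guard is an added assumption about the desired behavior (np.logical_and.reduce of an empty list returns a scalar True, which is not a usable row mask).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'color': ['red', 'green', 'blue'],
    'brand': ['Ford', 'fiat', 'opel'],
    'model year': [2016, 2016, 2017],  # hypothetical column name containing a space
})

def select_rows(frame, filters):
    """Return the rows of `frame` where every column in `filters` equals its value."""
    if not filters:
        return frame  # no conditions given: keep every row
    # One boolean Series per condition, combined element-wise with AND.
    masks = [frame[k] == v for k, v in filters.items()]
    return frame[np.logical_and.reduce(masks)]

result = select_rows(df, {'color': 'red', 'model year': 2016})
print(result)
```

Since the columns are looked up with frame[k] rather than parsed from a string, any column label works here, including ones with spaces.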

Solution 2:

With a single expression:

In [728]: df = pd.DataFrame({'color':['red', 'green', 'blue'], 'brand':['Ford','fiat', 'opel'], 'year':[2016,2016,2017]})

In [729]: d = {'color':'red', 'year':2016}

In [730]: df.loc[np.all(df[list(d)] == pd.Series(d), axis=1)]
Out[730]: 
  brand color  year
0  Ford   red  2016
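
To see why the single expression works, it can help to split it into steps. The sketch below is an unpacked version of the same idea, with intermediate names (subset, matches, mask) introduced only for illustration; it uses the DataFrame method .all(axis=1) in place of np.all.

```python
import pandas as pd

df = pd.DataFrame({'color': ['red', 'green', 'blue'],
                   'brand': ['Ford', 'fiat', 'opel'],
                   'year': [2016, 2016, 2017]})
d = {'color': 'red', 'year': 2016}

# Keep only the columns named in the dictionary, in the dictionary's key order.
subset = df[list(d)]

# pd.Series(d) is indexed by the same labels, so the comparison lines up
# each column with its target value and yields a boolean DataFrame.
matches = subset == pd.Series(d)

# A row is selected only if every one of its comparisons is True.
mask = matches.all(axis=1)
result = df.loc[mask]

print(matches)
print(result)
```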
