
How To Remove A Value From A List In A Pandas Dataframe?

I have created a dataframe:

testing_df = pd.DataFrame(test_array, columns=['transaction_id', 'product_id'])
# Split the product_id's for the testing data
testing_df.set_index(['

Solution 1:

I would do it before splitting:

Data:

In [269]: df
Out[269]:
                 product_id
transaction_id
1                       P01
2                   P01,P02
3               P01,P02,P09
4                   P01,P03
5               P01,P03,P05
6               P01,P03,P07
7               P01,P03,P08
8                   P01,P04
9               P01,P04,P05
10              P01,P04,P08

Answer:

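In [271]: df['product_id'] = df['product_id'].str.replace(r'\,*?(?:P04|P08)\,*?', '', regex=True) \
                                             .str.split(',')

(regex=True is needed on pandas 2.0 and later, where str.replace treats the pattern as a literal string by default.)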

In [272]: df
Out[272]:
                     product_id
transaction_id
1                         [P01]
2                    [P01, P02]
3               [P01, P02, P09]
4                    [P01, P03]
5               [P01, P03, P05]
6               [P01, P03, P07]
7                    [P01, P03]
8                         [P01]
9                    [P01, P05]
10                        [P01]
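
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame shown above by hand and applies the same replace-then-split step. The names `pattern`, `cleaned` and `edge` are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt by hand from the frame shown above.
df = pd.DataFrame(
    {"product_id": ["P01", "P01,P02", "P01,P02,P09", "P01,P03", "P01,P03,P05",
                    "P01,P03,P07", "P01,P03,P08", "P01,P04", "P01,P04,P05",
                    "P01,P04,P08"]},
    index=pd.Index(range(1, 11), name="transaction_id"),
)

pattern = r",*?(?:P04|P08),*?"

# Drop P04/P08 together with the comma in front of them, then split.
# regex=True: the pattern is a regular expression, not a literal string.
cleaned = df["product_id"].str.replace(pattern, "", regex=True).str.split(",")

# Edge case: an excluded code at the start of the string leaves its
# trailing comma behind, so the split produces an empty first item.
edge = pd.Series(["P04,P01"]).str.replace(pattern, "", regex=True).str.split(",")
```

In the sample data every row starts with P01, so the edge case never shows up there. If an excluded code can come first, filtering after the split is the safer route.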

Alternatively, you can change:

testing_df['product_id'] = testing_df['product_id'].apply(lambda row: row.split(','))

with:

testing_df['product_id'] = testing_df['product_id'].apply(lambda row: list(set(row.split(','))- set(['P04','P08'])))

Demo:

In [280]: df.product_id.apply(lambda row: list(set(row.split(',')) - set(['P04','P08'])))
Out[280]:
transaction_id
1               [P01]
2          [P01, P02]
3     [P09, P01, P02]
4          [P01, P03]
5     [P01, P03, P05]
6     [P07, P01, P03]
7          [P01, P03]
8               [P01]
9          [P01, P05]
10              [P01]
Name: product_id, dtype: object
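
Note that the set difference does not keep the original item order (see rows 3 and 6 in the demo) and also collapses duplicates. If a repeatable order matters, one option is to sort the result. This is only a sketch on a small hand-made Series; `s`, `exclude` and `result` are illustrative names, and it assumes plain lexical order is acceptable.

```python
import pandas as pd

s = pd.Series(["P01,P04,P05", "P09,P01,P02", "P01,P04,P08"], name="product_id")
exclude = {"P04", "P08"}

# sorted() turns the set difference into a list with a repeatable order.
result = s.apply(lambda row: sorted(set(row.split(",")) - exclude))
```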

Solution 2:

Store all the elements to be removed in a list, then remove them from each row's list in place:

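remove_results = ['P04', 'P08']
# Each cell holds a Python list, so it can be modified in place.
# Iterating over the column avoids relying on the index labels.
for products in testing_df['product_id']:
    for r in remove_results:
        if r in products:
            products.remove(r)  # removes the first occurrence of r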
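
A self-contained sketch of the in-place loop on a small hand-built frame (the `demo` frame and its data are made up for illustration). It keeps removing while the value is still present, because `list.remove` drops only the first match per call.

```python
import pandas as pd

# Hypothetical sample: each cell already holds a list of product ids.
demo = pd.DataFrame(
    {"product_id": [["P01", "P04"], ["P01", "P08", "P05", "P08"], ["P01"]]},
    index=pd.Index([1, 2, 3], name="transaction_id"),
)

remove_results = ["P04", "P08"]
for products in demo["product_id"]:
    for r in remove_results:
        # list.remove drops one match per call, so loop until none is left.
        while r in products:
            products.remove(r)
```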

Solution 3:

A list comprehension will likely be most efficient:

exc = {'P04', 'P08'}
df['product_id'] = [[i for i in L if i not in exc] for L in df['product_id']]

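Note that a Python-level loop is unavoidable here, because the column stores Python lists as object dtype and pandas has no vectorised way to filter them. apply + lambda, map + lambda, and an in-place solution all run a Python-level loop as well.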
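
As a runnable illustration of the list comprehension, here is a small sketch wrapped in a helper. `drop_values` is a hypothetical name, not a pandas function, and the frame is hand-made sample data.

```python
import pandas as pd


def drop_values(lists, exclude):
    """Return a new list of lists without any item found in `exclude`."""
    exclude = set(exclude)  # set membership tests are O(1) on average
    return [[item for item in items if item not in exclude] for items in lists]


df = pd.DataFrame(
    {"product_id": [["P01", "P04", "P05"], ["P01", "P03", "P08"], ["P04", "P01"]]},
    index=pd.Index([1, 2, 3], name="transaction_id"),
)
df["product_id"] = drop_values(df["product_id"], ["P04", "P08"])
```

Unlike the set-difference variant, this keeps the original order and any duplicates of the items that remain.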
