
Check For Words From List And Remove Those Words In Pandas Dataframe Column

I have a list as follows: remove_words = ['abc', 'deff', 'pls']. The following is the data frame I have, with the column name 'string': data['string'] 0 abc stack ove

Solution 1:

Try this:

In [98]: pat = r'\b(?:{})\b'.format('|'.join(remove_words))

In [99]: pat
Out[99]: '\\b(?:abc|def|pls)\\b'

In [100]: df['new'] = df['string'].str.replace(pat, '', regex=True)

In [101]: df
Out[101]:
               string              new
0  abc stack overflow   stack overflow
1              abc123           abc123
2          def comedy           comedy
3          definitely       definitely
4            pls lkjh             lkjh
5             pls1234          pls1234
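
For reference, here is a self-contained sketch of the same approach that can be run as-is. The sample data is reconstructed from the output above, and the word list uses 'def' to match the pattern shown in Out[99]. Two things are my own additions rather than part of the answer: re.escape (a precaution in case a word contains regex metacharacters such as '.' or '(') and the optional new_clean column, which tidies the whitespace left behind by the removed words.

```python
import re

import pandas as pd

# Sample data reconstructed from the output shown above (an assumption).
remove_words = ['abc', 'def', 'pls']
df = pd.DataFrame({'string': ['abc stack overflow', 'abc123', 'def comedy',
                              'definitely', 'pls lkjh', 'pls1234']})

# \b anchors each alternative at word boundaries, so 'abc123' is left alone.
# re.escape is a precaution for words containing regex metacharacters.
pat = r'\b(?:{})\b'.format('|'.join(map(re.escape, remove_words)))

# regex=True is needed on pandas 2.0 and later, where Series.str.replace
# treats the pattern as a literal string by default.
df['new'] = df['string'].str.replace(pat, '', regex=True)

# Optional: collapse runs of whitespace and trim the ends
# ('new_clean' is a hypothetical column name used for illustration).
df['new_clean'] = df['new'].str.replace(r'\s+', ' ', regex=True).str.strip()

print(df)
```

Note that the 'new' column keeps a leading space wherever a word was removed from the start of the string; the extra cleanup step is only needed if that matters for your data.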

Solution 2:

Totally taking @MaxU's pattern!

We can use pd.DataFrame.replace with the regex parameter set to True, passing a dictionary of dictionaries: the outer keys are column names, and each inner dictionary maps a pattern to its replacement for that column.

pat = '|'.join([r'\b{}\b'.format(w) for w in remove_words])

df.assign(new=df.replace(dict(string={pat: ''}), regex=True)['string'])

               string              new
0  abc stack overflow   stack overflow
1              abc123           abc123
2          def comedy           comedy
3          definitely       definitely
4            pls lkjh             lkjh
5             pls1234          pls1234
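
To illustrate how the nested dictionary scopes the replacement to one column, here is a minimal runnable sketch. The 'other' column and its values are hypothetical, added only to show that columns not named in the outer dictionary are left unchanged; re.escape is again a precaution and not part of the original answer.

```python
import re

import pandas as pd

remove_words = ['abc', 'def', 'pls']

# 'other' is a hypothetical second column, included to show it is not touched.
df = pd.DataFrame({'string': ['abc stack overflow', 'def comedy', 'pls1234'],
                   'other': ['abc untouched', 'def', 'pls']})

# One word-bounded alternative per word, joined with '|'.
pat = '|'.join(r'\b{}\b'.format(re.escape(w)) for w in remove_words)

# Outer key = column name; inner dict = {pattern: replacement}.
replaced = df.replace({'string': {pat: ''}}, regex=True)

# Select the single replaced column so assign() receives a Series.
result = df.assign(new=replaced['string'])

print(result)
```

Selecting replaced['string'] before passing it to assign keeps the example working when the frame has more than one column.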
