Python Pandas Error While Removing Extra White Space

I am trying to remove extra white space from a column in a data frame using the command below. The data frame has close to 8 million records. datt2.My_variable=datt2.My_variable.str.replace('\s+',

Solution 1:

Question: I am trying to clean a column in data frame of extra white space ... datt2.My_variable=datt2.My_variable.str.replace('\s+', ' ')

Please comment: do I understand your expression correctly?

pandas    Column       Column              DataSeries
DataFrame Name         DataSeries          Method
|-^-||----^-----|   |-------^-------||----------^-----------|
datt2.My_variable = datt2.My_variable.str.replace('\s+', ' ')

I'm pretty sure that using re.sub gives the same result as pandas' .str.replace(...), but without copying the whole column's data.

From the pandas doc: Series.str.replace(pat, repl, n=-1, case=True, flags=0) Replace occurrences of pattern/regex in the Series/Index with some other string. Equivalent to str.replace() or re.sub().
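To illustrate the equivalence the doc mentions, here is a minimal sketch on a tiny made-up Series (the sample values are hypothetical, not from the question). It assumes a pandas version that accepts the regex keyword (0.23 or later, so not the 0.19.2 used for the test further down); it is passed explicitly because newer releases no longer treat the pattern as a regular expression by default.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for datt2.My_variable.
s = pd.Series(["a  b", "c \t d", "e f"])

# Vectorized replacement over the whole column.
# regex=True is explicit: newer pandas versions default to a literal match.
vectorized = s.str.replace(r"\s+", " ", regex=True)

# The same replacement applied element by element with re.sub.
per_element = s.map(lambda value: re.sub(r"\s+", " ", value))

# Both approaches should produce identical strings.
print(vectorized.equals(per_element))
```

On a small sample like this both variants return the same strings; whether one of them uses less memory on 8 million rows is something you would have to measure on your own data.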


Try pure Python, for instance:

    import re

    # Replace each run of two or more whitespace characters with one blank, row by row.
    for idx in df.index:
        df.loc[idx, 'My_variable'] = re.sub(r'\s\s+', ' ', df.loc[idx, 'My_variable'])

Note: Consider using '\s\s+' instead of '\s+'. Using '\s+' also replaces ONE BLANK with ONE BLANK, which is useless.
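A small sketch of the difference between the two patterns, using re.subn to count how many substitutions each one performs. The sample string is made up for illustration.

```python
import re

# Hypothetical sample: one single blank, one double blank, one single tab.
text = "a b  c\td"

# '\s+' matches every whitespace run, including runs of length one.
one_or_more, n_one_or_more = re.subn(r"\s+", " ", text)

# '\s\s+' only matches runs of two or more whitespace characters.
two_or_more, n_two_or_more = re.subn(r"\s\s+", " ", text)

print(repr(one_or_more), n_one_or_more)
print(repr(two_or_more), n_two_or_more)
```

Note that the two patterns are not fully interchangeable: '\s\s+' does less work, but it leaves a single tab or newline untouched, whereas '\s+' turns it into a blank. Which one fits depends on whether such single characters occur in your column.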

Tested with Python 3.4.2 and pandas 0.19.2. Come back and flag your question as answered if this works for you, or comment why not.
