
Change With Nan If Values Stuck At A Single Value Over Time Using Python

As you can see below, my dataframe contains some identical consecutive values, i.e. 1, 2, and 3.

Date Value
0 2017-07-18 07:40:00 1
1 2017-07-18 07:45:00 1
2 2017-07-18 07:50:0

Solution 1:

You could group consecutive values using a custom grouping scheme, check which groups have a size greater than or equal to 3, and use the result to index the dataframe and set the rows of interest to NaN:

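# label each run of identical consecutive values with its own id
g = df.Value.diff().fillna(0).ne(0).cumsum()
# True for every row that belongs to a run of 3 or more values
m = df.groupby(g).Value.transform('size').ge(3)
# overwrite those rows with NaN
df.loc[m, 'Value'] = np.nan

                   Date   Value
0   2017-07-18-07:40:00     NaN
1   2017-07-18-07:45:00     NaN
2   2017-07-18-07:50:00     NaN
3   2017-07-18-07:55:00  2414.0
4   2017-07-18-08:00:00     2.0
5   2017-07-18-08:05:00     2.0
6   2017-07-18-08:10:00  4416.0
7   2017-07-18-08:15:00  4416.0
8   2017-07-18-08:20:00     NaN
9   2017-07-18-08:25:00     NaN
10  2017-07-18-08:30:00     NaN
11  2017-07-18-08:35:00  6998.0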

Where:

# df holds the original values here; df_ appears to be the frame after the NaN assignment
df.assign(grouper=g, mask=m, result=df_.Value)

                   Date  Value  grouper   mask  result
0   2017-07-18-07:40:00      1        0   True     NaN
1   2017-07-18-07:45:00      1        0   True     NaN
2   2017-07-18-07:50:00      1        0   True     NaN
3   2017-07-18-07:55:00   2414        1  False  2414.0
4   2017-07-18-08:00:00      2        2  False     2.0
5   2017-07-18-08:05:00      2        2  False     2.0
6   2017-07-18-08:10:00   4416        3  False  4416.0
7   2017-07-18-08:15:00   4416        3  False  4416.0
8   2017-07-18-08:20:00      3        4   True     NaN
9   2017-07-18-08:25:00      3        4   True     NaN
10  2017-07-18-08:30:00      3        4   True     NaN
11  2017-07-18-08:35:00   6998        5  False  6998.0
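For anyone who wants to run Solution 1 end to end, here is a minimal self-contained sketch. The function name `mask_stuck_values` and its `min_run` parameter are hypothetical, and the sample frame is reconstructed from the output shown above. It compares each row with the previous one via `shift()` instead of `diff()`, which should behave the same on numeric data and also works for non-numeric columns.

```python
import pandas as pd


def mask_stuck_values(df, column="Value", min_run=3):
    """Return a copy of df where runs of at least `min_run` identical
    consecutive values in `column` are replaced with NaN.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    out = df.copy()
    # A new run starts wherever the value differs from the previous row;
    # the cumulative sum of those change points labels each run.
    run_id = out[column].ne(out[column].shift()).cumsum()
    # Length of the run each row belongs to, aligned with the original index.
    run_len = out.groupby(run_id)[column].transform("size")
    # Cast to float so the column can hold NaN, then blank out long runs.
    out[column] = out[column].astype(float).mask(run_len.ge(min_run))
    return out


# Sample data reconstructed from the output shown above (an assumption).
df = pd.DataFrame({
    "Date": pd.date_range("2017-07-18 07:40:00", periods=12, freq="5min"),
    "Value": [1, 1, 1, 2414, 2, 2, 4416, 4416, 3, 3, 3, 6998],
})
result = mask_stuck_values(df)
print(result)
```

Working on a copy leaves the original frame untouched, so the masked and unmasked columns can be compared side by side.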

Solution 2:

Count how often each value occurs. The result is a Series, which needs a name so that it can be merged later:

counts = df['Value'].value_counts()
counts.name = '_'

Merge the selected values from the Series with the original dataframe:

keep = counts[counts < 3]
df.merge(keep, left_on='Value', right_index=True)[df.columns]
#                   Date  Value
#3  2017-07-18  07:55:00   2414
#4  2017-07-18  08:00:00      2
#5  2017-07-18  08:05:00      2
#6  2017-07-18  08:10:00   4416
#7  2017-07-18  08:15:00   4416
#11 2017-07-18  08:35:00   6998

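The result is a filtered dataframe: rows whose value occurs three or more times are dropped rather than set to NaN. Note that value_counts counts every occurrence of a value in the column, not only consecutive ones.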

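Merging a dataframe directly with a named Series requires pandas 0.24 or later. If you use an older version, you should upgrade, but here is a workaround that converts the Series to a dataframe first: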

df.merge(pd.DataFrame(keep), left_on='Value', right_index=True)[df.columns]
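Solution 2 drops rows, whereas the question asks for NaN. A variant that keeps every row might look like the sketch below. It is not the answer's own method: `Series.map` and `Series.where` are swapped in for the merge, and the sample frame is an assumption reconstructed from the outputs above. Like Solution 2, it counts all occurrences of a value, not only consecutive ones.

```python
import pandas as pd

# Sample data reconstructed from the outputs above (an assumption).
df = pd.DataFrame({
    "Date": pd.date_range("2017-07-18 07:40:00", periods=12, freq="5min"),
    "Value": [1, 1, 1, 2414, 2, 2, 4416, 4416, 3, 3, 3, 6998],
})

# Total number of occurrences of each value (not limited to consecutive rows).
counts = df["Value"].value_counts()

# Look up each row's value in the counts Series to get a per-row count.
per_row = df["Value"].map(counts)

# Keep all rows; values that occur three or more times overall become NaN.
masked = df.assign(Value=df["Value"].where(per_row < 3))
print(masked)
```

On this sample the result matches Solution 1, but the two would diverge as soon as a value repeats three times in non-adjacent rows.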
