I Need To Filter Contents Of My Text File

I have a text file that I want to loop through, slicing out some of its contents and storing them in a separate list. The text file contains: blu sre before we start start the process blah blah blah

Solution 1:

Regex is built for that: capture everything between the start marker and the end marker.

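import re

# Non-greedy capture of everything between the two markers;
# re.DOTALL lets "." match newlines so a block can span several lines.
part = re.compile(r"start the process(.*?)end the process", flags=re.DOTALL)
with open("my_file.text", "r") as file:
    data = file.read()

# findall already returns a list of the captured groups
results = part.findall(data)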

EDIT: updated the code based on @Xosrov's comment.
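To see what the pattern returns, here is a small sketch run against an in-memory string instead of a file. The sample text is made up for illustration (the real file contents are not shown in full), and the "end the process" marker is taken from the pattern in this answer.

```python
import re

# Hypothetical sample text; the real file contents are not shown in full.
sample = (
    "intro text\n"
    "start the process\nline A\nline B\nend the process\n"
    "noise in between\n"
    "start the process\nline C\nend the process\n"
)

pattern = re.compile(r"start the process(.*?)end the process", flags=re.DOTALL)

# findall returns the captured group for each match; strip() drops the
# newlines that surround each block.
blocks = [block.strip() for block in pattern.findall(sample)]
print(blocks)
```

Because the group is non-greedy (`.*?`), each match stops at the nearest end marker, so two separate blocks come back rather than one span running from the first start to the last end.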

Solution 2:

@Florian Bernard et al

My requirements have changed somewhat, as I am now working with a DataFrame. I want to loop through the DataFrame, slice the data by condition, and store all values between each start and stop index in an array or as one row of a new DataFrame. So if there are 4 occurrences of my start and stop markers, there should be 4 entries in my array or 4 rows in the new DataFrame.

NB: my DataFrame has just one column, containing text.

Here is the code I have so far:

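corpus = []
start = None  # index label of the most recent 'start' row, if any
# df is the one-column DataFrame described above
for index, row in df.iterrows():
    text = str(row['row'])
    if text.startswith('start'):
        start = index
    elif text.startswith('stop') and start is not None:
        # .loc slices by label and includes both endpoints
        corpus.append(df.loc[start:index])
        start = None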
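One way to get one entry per start/stop pair is to track whether the loop is currently inside a block. This is a minimal sketch, not a definitive solution. It assumes the column is named row, as in the snippet above, and that marker cells begin with the literal words start and stop. The helper slice_blocks and the sample frame are hypothetical.

```python
import pandas as pd

# Hypothetical one-column frame; the column name "row" follows the snippet above.
df = pd.DataFrame({"row": ["noise", "start", "a", "b", "stop", "noise",
                           "start", "c", "stop"]})

def slice_blocks(frame, column="row", start_marker="start", stop_marker="stop"):
    """Return one list of cell values per start..stop block (markers included)."""
    blocks = []
    current = None  # None means we are outside a block
    for value in frame[column]:
        text = str(value)
        if current is None:
            # Outside a block: only a start marker opens a new one.
            if text.startswith(start_marker):
                current = [text]
        else:
            # Inside a block: collect the cell, then close on a stop marker.
            current.append(text)
            if text.startswith(stop_marker):
                blocks.append(current)
                current = None
    return blocks

blocks = slice_blocks(df)

# One row per block in a new DataFrame, with the cells joined by spaces.
result = pd.DataFrame({"block": [" ".join(b) for b in blocks]})
print(result)
```

With two start/stop pairs in the sample frame, `result` has two rows. A start with no matching stop is silently dropped, which may or may not be what you want.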
