
Match A Pattern And Save To Variable Using Python

I have an output file containing thousands of lines of information. Every so often I find in the output file information of the following form: Input Orientation: ... content ...

Solution 1:

Regex isn't necessary here. All you need is good ol' indexing. Python strings have index and rindex methods that take a substring, find it in the text, and return the index of the first character of the match. index searches from the start of the string, while rindex searches from the end. Reading the Python documentation on slicing strings should get you familiar with the rest. The program could look something like this:

with open(input_file) as f:
    s = f.read()  # reads the file as one big string

# slice from the last 'Input' up to (not including) the last 'Distance'
last_block = s[s.rindex('Input'):s.rindex('Distance')]

The last line of that code finds the last occurrence of 'Input' in the file (rindex searches from the end and moves towards the front) and records that position as an integer. It then does the same with 'Distance', and uses those two integers to return only the portion of the string that lies between them. In the case of your example file it would return:

                                      Input orientation:
             ---------------------------------------------------------------------
             Center     Atomic      Atomic             Coordinates (Angstroms)
             Number     Number       Type             X           Y           Z
             ---------------------------------------------------------------------
                  1          6           0        Correct    Correct     Correct
                  2          1           0        Correct    Correct     Correct
                  3          1           0        Correct    Correct     Correct
                  4          1           0        Correct    Correct     Correct
                  5         17           0        Correct    Correct     Correct
                  6          9           0        Correct    Correct     Correct
             ---------------------------------------------------------------------

If you don't want the 'Input orientation' header, you can simply add to the result of rindex('Input') until you get the desired result. That could look like s[s.rindex('Input') + 19:s.rindex('Distance')], for instance.
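To make the slicing concrete, here is a minimal sketch run on a small in-memory string instead of a real file. The sample text and the variable names (sample, header, with_header, without_header) are made up for illustration. It also uses len(header) rather than a hard-coded offset, which is one way to avoid counting characters by hand; a final strip() removes the leftover newline.

```python
# Hypothetical sample text standing in for the real output file.
sample = (
    "Input orientation:\n"
    "old block\n"
    "Distance matrix (angstroms):\n"
    "Input orientation:\n"
    "new block\n"
    "Distance matrix (angstroms):\n"
)

header = "Input orientation:"
start = sample.rindex(header)    # position of the last header
end = sample.rindex("Distance")  # position of the last 'Distance'

# Block including the header line.
with_header = sample[start:end]

# Block without the header: skip len(header) characters, then trim whitespace.
without_header = sample[start + len(header):end].strip()

print(without_header)  # new block
```

Only the last block is returned, because both markers are located from the end of the string.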

It is also important to note that index and rindex raise a ValueError if the substring is not found. If that is not desired, you can use find and rfind, which return -1 instead.
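If you go the find/rfind route, the -1 result has to be checked before slicing, because -1 is itself a valid index (the last character) and would silently produce the wrong slice. Below is a rough sketch of that check. The function name last_block and its parameters are hypothetical, and returning None when a marker is missing is just one possible convention.

```python
def last_block(text, start_marker="Input", end_marker="Distance"):
    """Return the text from the last start_marker up to the last end_marker.

    Returns None if either marker is missing, or if the last end_marker
    appears before the last start_marker (assumed here to mean "no block").
    """
    start = text.rfind(start_marker)  # -1 if not found
    end = text.rfind(end_marker)      # -1 if not found
    if start == -1 or end == -1 or end < start:
        return None
    return text[start:end]


print(last_block("Input abc Distance"))  # Input abc
print(last_block("no markers here"))     # None
```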
