How To Slice Numbered Lists Into Sublists
I have opened a file and used readlines() and split() with the regex '\t' to remove tabs, which resulted in the following lists: ['1', 'cats', '--,'] ['2', 'chase', '--,'] ['3',
Solution 1:
Something like:
from itertools import groupby
with open('yourfile') as fin:
    # split each line into its whitespace-separated fields
    lines = (line.split() for line in fin)
    # group by consecutive ints: (position - line number) stays constant within a run
    grouped = groupby(enumerate(lines), lambda pair: pair[0] - int(pair[1][0]))
    # build sentences from words in groups (must run while the file is still open)
    sentences = [' '.join(el[1][1] for el in g) for k, g in grouped]
# ['cats chase dogs', 'the car is gray']
NB: This works based on your example data of:
example = [
["1", "cats", "--,"],
["2", "chase", "--,"],
["3", "dogs", "--,"],
["1", "the", "--,"],
["2", "car", "--,"],
["3", "is", "--,"],
["4", "gray", "--,"]
]
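To see why grouping on "position minus number" works, here is a minimal sketch that runs the same idea on the in-memory example instead of a file. The helper name `offset` is made up for illustration, and the sketch assumes every row starts with an integer that restarts at 1 for each new sentence.

```python
from itertools import groupby

example = [
    ["1", "cats", "--,"],
    ["2", "chase", "--,"],
    ["3", "dogs", "--,"],
    ["1", "the", "--,"],
    ["2", "car", "--,"],
    ["3", "is", "--,"],
    ["4", "gray", "--,"],
]


def offset(pair):
    """Return position minus row number; constant while numbers run consecutively."""
    idx, row = pair
    return idx - int(row[0])


# Rows 0-2 all give -1 and rows 3-6 all give 2, so groupby yields two groups.
sentences = [
    " ".join(row[1] for _, row in group)
    for _, group in groupby(enumerate(example), key=offset)
]
print(sentences)  # ['cats chase dogs', 'the car is gray']
```

Each time the numbering restarts, the offset jumps by the length of the previous sentence, so neighbouring sentences never share a key.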
Solution 2:
Choosing suitable data structures makes the job easier:
container = [["1", "cats", "--,"],
["2", "chase", "--,"],
["3", "dogs", "--,"],
["1", "the", "--,"],
["2", "car", "--,"],
["3", "is", "--,"],
["4", "gray", "--,"]]
With your lists nested in a container list as above, use a dictionary to store the output lists:
from collections import defaultdict
out = defaultdict(list)  # Initialize dictionary for output
key = 0  # Initialize key
for idx, word, _ in container:  # Unpack sublists
    if int(idx) == 1:  # Check if we are at start of new sentence
        key += 1  # Increment key for new sentence
    out[key].append(word)  # Add word to list
Gives:
{
1: ['cats', 'chase', 'dogs'],
2: ['the', 'car', 'is', 'gray']
}
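If you would rather end up with plain sublists than a dictionary, the same "start a new sentence when the number is 1" check can append to a list of lists. This is only a sketch: `split_on_restart` and `rows` are names invented here, and it assumes each row has exactly three fields with the number first.

```python
def split_on_restart(rows):
    """Split numbered rows into sublists of words, one per run starting at 1."""
    sublists = []
    for idx, word, _ in rows:
        # Open a new sublist at each restart, or if the data does not begin with 1.
        if int(idx) == 1 or not sublists:
            sublists.append([])
        sublists[-1].append(word)
    return sublists


container = [["1", "cats", "--,"],
             ["2", "chase", "--,"],
             ["3", "dogs", "--,"],
             ["1", "the", "--,"],
             ["2", "car", "--,"],
             ["3", "is", "--,"],
             ["4", "gray", "--,"]]

print(split_on_restart(container))
# [['cats', 'chase', 'dogs'], ['the', 'car', 'is', 'gray']]
```

Joining each sublist with `' '.join(...)` then gives the sentences as strings.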