Computing Frequencies In A Nested List
I'm trying to compute word frequencies with a dictionary, given a nested list. Each nested list is a sentence split into its words. I also want to delete proper nouns.
Solution 1:
Since your data is nested, you can flatten it with chain.from_iterable, like this:
from itertools import chain
from collections import Counter
print(Counter(chain.from_iterable(x)))
# Counter({'doing': 2, 'Kyle': 2, 'what': 1, 'timeis': 1, 'am': 1, 'Hey': 1, 'I': 1, 'are': 1, 'it': 1, 'you': 1, 'fine': 1})
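To make the flattening step concrete, here is a minimal self-contained sketch. The sample data in `sentences` is hypothetical (the question's actual list is not shown), so treat the values as an illustration only.

```python
from collections import Counter
from itertools import chain

# Hypothetical sample data: each inner list is one sentence split into words.
sentences = [
    ["Hey", "Kyle", "what", "are", "you", "doing"],
    ["I", "am", "doing", "fine", "Kyle"],
]

# chain.from_iterable yields every word of every inner list, in order.
flat_words = list(chain.from_iterable(sentences))

# Counter tallies how often each word occurs in the flattened sequence.
word_counts = Counter(flat_words)

print(flat_words)
print(word_counts["Kyle"])    # 2
print(word_counts["timeis"])  # 0: a Counter returns 0 for keys it has not seen
```

Passing `chain.from_iterable(sentences)` straight to `Counter` works too; the intermediate list is only there so the flattened result can be inspected.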
If you want to use a generator expression instead, you can do:
from collections import Counter
print(Counter(item for items in x for item in items))
If you want to do this without using Counter, you can use a normal dictionary, like this:
my_counter = {}
for line in x:
    for word in line:
        # get() returns 0 the first time a word is seen
        my_counter[word] = my_counter.get(word, 0) + 1
print(my_counter)
You can also use collections.defaultdict, like this:
from collections import defaultdict
# missing keys start at int(), which is 0
my_counter = defaultdict(int)
for line in x:
    for word in line:
        my_counter[word] += 1
print(my_counter)
Okay, if you simply want to convert the Counter object to a dict object (which I believe is not necessary at all, since Counter is a dict subclass: you can access key-value pairs, iterate over, delete from, and update a Counter object just like a normal dictionary), you can use bsoist's suggestion:
print(dict(Counter(chain.from_iterable(x))))
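The claim that a Counter already behaves like a dictionary can be checked with a short sketch. The word list below is hypothetical, and deleting 'Kyle' by hand only illustrates key removal; it is not a method for detecting proper nouns.

```python
from collections import Counter

# Hypothetical counts built from a small word list for illustration.
counts = Counter(["Kyle", "doing", "doing", "fine", "Kyle"])

is_dict = isinstance(counts, dict)  # Counter is a subclass of dict
kyle_count = counts["Kyle"]         # ordinary key lookup

del counts["Kyle"]                  # delete a key, as with any dict
counts.update(["fine", "you"])      # Counter.update adds to existing counts

remaining = dict(counts)            # plain dict, if one is really required

print(is_dict, kyle_count, remaining)
```

One difference worth knowing: `Counter.update` adds to the existing counts, whereas `dict.update` overwrites values.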
Solution 2:
The problem is that you are iterating over L twice.
Replace the inner loop:
for word in L:
with:
for word in listofWords:
Though, if you want to go "pythonic", check out @thefourtheye's solution.
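For reference, the corrected loop from Solution 2 might look like the sketch below. The original code is not shown here, so this assumes L is the nested list and listofWords is the outer loop variable; the data and the `frequencies` name are hypothetical.

```python
# Hypothetical data; L and listofWords follow the names used in Solution 2.
L = [
    ["what", "are", "you", "doing"],
    ["I", "am", "doing", "fine"],
]

frequencies = {}
for listofWords in L:
    # Iterate over the current sentence, not over L again.
    for word in listofWords:
        frequencies[word] = frequencies.get(word, 0) + 1

print(frequencies["doing"])  # 2
```

With `for word in L:` as the inner loop, each `word` would be a whole inner list rather than a string, and a list cannot be used as a dictionary key.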