
Computing Frequencies In A Nested List

I'm trying to compute the frequencies of words in a nested list using a dictionary. Each nested list is a sentence broken up into its words. Also, I want to delete proper nouns

Solution 1:

Since your data is nested, you can flatten it with chain.from_iterable, like this:

from itertools import chain
from collections import Counter
# x is the nested list of words from the question
print(Counter(chain.from_iterable(x)))
# Counter({'doing': 2, 'Kyle': 2, 'what': 1, 'timeis': 1, 'am': 1, 'Hey': 1, 'I': 1, 'are': 1, 'it': 1, 'you': 1, 'fine': 1})
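
To try this end to end, here is a minimal sketch with made-up data. The nested list `sentences` is a hypothetical stand-in for the `x` from the question, not the asker's actual input.

```python
from itertools import chain
from collections import Counter

# Hypothetical sample data: each inner list is one sentence split into words.
sentences = [
    ["Hey", "Kyle", "how", "are", "you"],
    ["I", "am", "fine", "Kyle"],
    ["how", "are", "things"],
]

# chain.from_iterable lazily yields every word from every inner list,
# so Counter receives one flat stream of words.
word_counts = Counter(chain.from_iterable(sentences))

print(word_counts["Kyle"])     # 2
print(word_counts["missing"])  # 0, a Counter returns 0 for keys it has not seen
```

Looking up an unseen word returns 0 rather than raising KeyError, which is one practical difference from a plain dict.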

If you want to use a generator expression, you can do:

from collections import Counter
print(Counter(item for items in x for item in items))

If you want to do this without using Counter, you can use a normal dictionary, like this:

my_counter = {}
for line in x:
    for word in line:
        # .get returns 0 the first time a word is seen
        my_counter[word] = my_counter.get(word, 0) + 1
print(my_counter)

You can also use collections.defaultdict, like this:

from collections import defaultdict
my_counter = defaultdict(int)
for line in x:
    for word in line:
        # missing keys default to int(), which is 0
        my_counter[word] += 1

print(my_counter)
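
All of the approaches so far should produce the same counts. A quick sanity check, again using a hypothetical `sentences` list in place of `x`:

```python
from collections import Counter, defaultdict
from itertools import chain

# Hypothetical sample data standing in for x.
sentences = [["a", "b", "a"], ["b", "c"], ["a"]]

with_chain = Counter(chain.from_iterable(sentences))
with_genexp = Counter(word for sentence in sentences for word in sentence)

plain = {}
with_default = defaultdict(int)
for sentence in sentences:
    for word in sentence:
        plain[word] = plain.get(word, 0) + 1  # .get supplies 0 for new words
        with_default[word] += 1               # int() supplies 0 for new words

# Convert everything to plain dicts so the comparison is purely on contents.
print(dict(with_chain) == dict(with_genexp) == plain == dict(with_default))  # True
```

Which one to pick is mostly a matter of taste; Counter is the shortest, while the plain dictionary needs no imports.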

Okay, if you simply want to convert the Counter object to a dict object, you can use bsoist's suggestion shown below. I believe this is not necessary at all, since Counter is a subclass of dict: you can access keys and values, iterate, delete, and update a Counter object just like a normal dictionary object.

# reuses chain and Counter imported in the first snippet
print(dict(Counter(chain.from_iterable(x))))
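
To illustrate that a Counter behaves like a dictionary, the sketch below checks the subclass relationship and deletes keys in place. The `proper_nouns` set is purely an assumption for illustration; the question does not say how proper nouns should be identified.

```python
from collections import Counter

# Hypothetical counts, standing in for Counter(chain.from_iterable(x)).
word_counts = Counter(["Kyle", "doing", "Kyle", "doing", "fine"])

print(isinstance(word_counts, dict))  # True: Counter subclasses dict

# Assumed, hand-written set of proper nouns to drop; detecting them
# automatically is outside what the question specifies.
proper_nouns = {"Kyle"}
for noun in proper_nouns:
    word_counts.pop(noun, None)  # the same dict.pop a plain dictionary has

print(sorted(word_counts.items()))  # [('doing', 2), ('fine', 1)]
```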

Solution 2:

The problem is that you are iterating over L twice.

Replace the inner loop:

for word in L:

with:

for word in listofWords:

Though, if you want to go "pythonic", check out @thefourtheye's solution.
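
The asker's original code is not reproduced here, so the following is only a guess at what the corrected loop might look like. The names `L` and `listofWords` come from the answer above; the sample data and the `frequencies` dictionary are assumptions.

```python
# Hypothetical nested list: each inner list is one sentence.
L = [["what", "are", "you", "doing"], ["I", "am", "doing", "fine"]]

frequencies = {}
for listofWords in L:         # outer loop: one sentence at a time
    for word in listofWords:  # inner loop: words of that sentence, not L again
        frequencies[word] = frequencies.get(word, 0) + 1

print(frequencies["doing"])  # 2
```

If the inner loop iterated over `L` again, `word` would be bound to an entire inner list rather than to a single string.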

