
Grouping On Items In A List In Python

I have 60 records with a column 'skillsList' (a list of skills) and a column 'IdNo'. I want to find out how many 'IdNo's have a skill in common. How can I do this in Python?

Solution 1:

You have to do it yourself. You can use a dictionary of skills, with each entry initialised to zero. Then iterate over your records and increment a skill's entry each time that skill is seen.
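A minimal sketch of this counting approach, assuming each record is a dict with the keys 'IdNo' and 'skillsList' (the sample records are made up for illustration, and the helper name `count_ids_per_skill` is hypothetical). It swaps the hand-initialised dictionary for `collections.Counter`, which starts every missing key at zero, and it deduplicates the skills within a record so each IdNo is counted at most once per skill.

```python
from collections import Counter

# Hypothetical sample records; the real data layout is an assumption.
records = [
    {"IdNo": 1, "skillsList": ["Training", "Powerpoint"]},
    {"IdNo": 2, "skillsList": ["Training"]},
    {"IdNo": 3, "skillsList": ["Powerpoint", "E-learning", "Training"]},
]


def count_ids_per_skill(records):
    """Return a Counter mapping each skill to the number of records listing it."""
    counts = Counter()
    for record in records:
        # set() guards against a skill being listed twice in one record.
        counts.update(set(record["skillsList"]))
    return counts


skill_counts = count_ids_per_skill(records)
print(skill_counts["Training"])  # 3
```

Any skill with a count of 2 or more is shared by several IdNos. This gives only the counts; it does not record which IdNos share the skill.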

Solution 2:

struct = [{'id': 1, 'skills': ['1', '2', '3']}]  # ... more records
for el in struct:
    # Report every record whose skill list contains skill '1'
    if '1' in el.get('skills', []):
        print('id %s has this skill' % el.get('id'))

Solution 3:

You can build an inverted index of skills: a dictionary in which each key is a skill name and each value is the set of IdNos that have that skill. That way you can also find out which IdNos have a given set of skills.

The code would look like this:

skills = {}
with open('filename.txt') as f:
    for line in f:
        # The first comma-separated field is the IdNo, the rest are skills
        items = [item.strip() for item in line.split(',')]
        idNo = items[0]
        skill_list = items[1:]
        for skill in skill_list:
            # Create the set the first time a skill is seen, then add the IdNo
            skills.setdefault(skill, set()).add(idNo)

Now you have a skills dictionary that would look like this (the IdNos are strings because they are read from a text file):

skills = {
    'Training': {'1', '2', '3'},
    'Powerpoint': {'1', '3', '4'},
    'E-learning': {'9', '10', '11'},
    # ...
}

Here you can see that IdNos 1, 3 and 4 have Powerpoint as a skill. If you want to know which IdNos have both the 'Training' and 'Powerpoint' skills, you can do:

skills['Powerpoint'].intersection(skills['Training'])

And if you want to know which IdNos have either the 'Training' or the 'Powerpoint' skill, you can do:

skills['Powerpoint'].union(skills['Training'])
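To get back to the original question (how many IdNos have a skill in common), the size of each set in the index is the count. The following is a rough, self-contained sketch rather than a definitive implementation: it builds the index from in-memory rows instead of a file, and the sample rows and the helper name `build_index` are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(rows):
    """Build an inverted index (skill -> set of IdNos) from (id_no, skills) pairs."""
    index = defaultdict(set)
    for id_no, skill_list in rows:
        for skill in skill_list:
            index[skill].add(id_no)
    return index


# Hypothetical sample data standing in for the 60 records.
rows = [
    ("1", ["Training", "Powerpoint"]),
    ("2", ["Training"]),
    ("3", ["Training", "Powerpoint"]),
    ("4", ["Powerpoint"]),
]
index = build_index(rows)

# Number of IdNos per skill, keeping only skills shared by at least two IdNos.
shared = {skill: len(ids) for skill, ids in index.items() if len(ids) > 1}
print(shared)
```

Using `defaultdict(set)` removes the need to check whether a skill is already in the dictionary before adding an IdNo to it.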
