Grouping On Items In A List In Python
I have 60 records with two columns: 'skillsList' (a list of skills) and 'IdNo'. I want to find out how many IdNos have a skill in common. How can I do this in Python?
Solution 1:
You have to do it yourself. You can use a dictionary of skills, with each entry initialized to zero. Then iterate over your records and increment a skill's entry each time it is seen.
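A minimal sketch of that counting approach, assuming each record is a dict with the keys 'IdNo' and 'skillsList' (the sample records are made up for illustration). It uses collections.Counter, a dict subclass in which missing keys count as zero, in place of a hand-initialized dictionary:

```python
from collections import Counter

# Hypothetical sample records; the real data would have 60 of these.
records = [
    {"IdNo": 1, "skillsList": ["Training", "Powerpoint"]},
    {"IdNo": 2, "skillsList": ["Training"]},
    {"IdNo": 3, "skillsList": ["Powerpoint", "E-learning"]},
]

skill_counts = Counter()
for record in records:
    # set() guards against a skill listed twice in one record
    skill_counts.update(set(record["skillsList"]))

# Print each skill with the number of IdNos that have it
for skill, count in skill_counts.most_common():
    print("%s: %d IdNo(s)" % (skill, count))
```

Any skill with a count above 1 is shared by more than one IdNo. This only gives counts; it does not record which IdNos have the skill.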
Solution 2:
struct = [
    {'id': 1, 'skills': ['1', '2', '3']},
    # ... more records
]
for el in struct:
    # check whether this record lists skill '1'
    if '1' in el.get('skills', []):
        print('id %s has this skill' % el.get('id'))
Solution 3:
You can build an inverted index of skills: a dictionary in which each key is a skill name and each value is the set of IdNos that have that skill. That way you can also find out which IdNos have some set of skills. The code would look like:
skills = {}
# each line of the file is assumed to look like: idNo, skill1, skill2, ...
with open('filename.txt') as f:
    for line in f:
        items = [item.strip() for item in line.split(',')]
        idNo = items[0]
        skill_list = items[1:]
        for skill in skill_list:
            if skill in skills:
                skills[skill].add(idNo)
            else:
                skills[skill] = {idNo}
Now you have a skills dictionary, which would look like:
skills = {
    'Training': {'1', '2', '3'},
    'Powerpoint': {'1', '3', '4'},
    'E-learning': {'9', '10', '11'},
    # ...
}
Now you can see that 1, 3 and 4 have Powerpoint as a skill. If you want to know which IdNos have both the 'Training' and 'Powerpoint' skills, you can do:
skills['Powerpoint'].intersection(skills['Training'])
And if you want to know which IdNos have either the 'Training' or the 'Powerpoint' skill, you can do:
skills['Powerpoint'].union(skills['Training'])
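To get back to the original question (how many IdNos share each skill), the same inverted index can be built from in-memory records and then queried. This is a rough sketch, assuming made-up sample records given as (IdNo, skillsList) pairs rather than read from a file:

```python
from collections import defaultdict

# Hypothetical sample data: (IdNo, skillsList) pairs.
records = [
    (1, ["Training", "Powerpoint"]),
    (2, ["Training"]),
    (3, ["Training", "Powerpoint"]),
    (4, ["Powerpoint"]),
]

# Inverted index: skill name -> set of IdNos that have it.
skills = defaultdict(set)
for id_no, skill_list in records:
    for skill in skill_list:
        skills[skill].add(id_no)

# Number of IdNos sharing each skill.
for skill, ids in sorted(skills.items()):
    print("%s: %d IdNo(s) %s" % (skill, len(ids), sorted(ids)))

both = skills["Powerpoint"] & skills["Training"]    # same as .intersection()
either = skills["Powerpoint"] | skills["Training"]  # same as .union()
print(sorted(both))
print(sorted(either))
```

The length of each set is the number of IdNos with that skill, and the `&` and `|` operators are shorthand for the intersection() and union() calls shown above.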