performance - Faster way to do a correspondence replace operation in Python?
I'm not sure I'm using the right term for this. I'd call it a merge operation, maybe? Simple matching?
I have two dictionaries (strictly speaking, two lists of dictionaries). One of them contains lists of tag IDs. The other one is a correspondence between tag IDs and tag names. I want to match the IDs and include the tag names in the first dict.
So, the first dictionary looks like this:

    >>> myjson
    [
        {"tags": ["1", "3"], "otherdata": "blah"},
        {"tags": ["2", "4"], "otherdata": "blah blah"}
    ]

The second dictionary looks like this:

    >>> tagnames
    [
        {"id": "1", "name": "bassoon"},
        {"id": "2", "name": "banjo"},
        {"id": "3", "name": "paw paw"},
        {"id": "4", "name": "foxes"}
    ]

To replace the tag IDs in myjson with the tag names, I am doing this:

    data = []
    for j in myjson:
        d = j
        d['tagnames'] = [i['name'] for i in tagnames for y in d['tags'] if y == i['id']]
        data.append(d)

My desired output is this:

    >>> data
    [
        {"tags": ["1", "3"], "otherdata": "blah", "tagnames": ["bassoon", "paw paw"]},
        {"tags": ["2", "4"], "otherdata": "blah blah", "tagnames": ["banjo", "foxes"]}
    ]

I'm getting the right output, but it seems slow. For every element in myjson it does a full iteration over every element in tagnames (is that m x n? n x n?), and that is slow. Maybe there is smarter syntax or some tricks for speeding this up? Can I walk the array once instead of n times?
Oooh, also, it would be cool if you could suggest a way to do the assignment with a slick map or functional approach rather than the outer for loop.
You want to transform the tagnames list into a dictionary:
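
    tagnames_map = {t['id']: t['name'] for t in tagnames}

Now you can find matching tag names much faster, because each lookup in the dictionary takes constant time on average instead of a scan over the whole tagnames list. Your code made in-place changes, so I'll simplify it to: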

    for d in myjson:
        d['tagnames'] = [tagnames_map[t] for t in tagnames_map.viewkeys() & d['tags']]

The dict.viewkeys() method returns a dictionary view object that acts as a set. We intersect this set against your list of tags, resulting in a set of the tags that are also listed in tagnames_map. By doing this you don't have to worry about tags that are missing from the map.
If you are using Python 3, use tagnames_map.keys() directly; in Python 3 the .keys(), .values() and .items() methods have been changed to return dictionary view objects.
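
As a rough sketch of the Python 3 form, using the sample data from the question: the intersection drops unknown IDs, but because its result is a set, the order of the names is not guaranteed to follow the order of the tags. The helper `names_in_order` and the ID "99" are made up for illustration; they are not part of the original code.

```python
# Sample data copied from the question.
myjson = [
    {"tags": ["1", "3"], "otherdata": "blah"},
    {"tags": ["2", "4"], "otherdata": "blah blah"},
]
tagnames = [
    {"id": "1", "name": "bassoon"},
    {"id": "2", "name": "banjo"},
    {"id": "3", "name": "paw paw"},
    {"id": "4", "name": "foxes"},
]

# Build the id -> name lookup once.
tagnames_map = {t["id"]: t["name"] for t in tagnames}

# Python 3: keys() is already a set-like view, so it can be intersected directly.
# "99" is a hypothetical id with no entry in the map; it is silently dropped.
known = tagnames_map.keys() & ["1", "3", "99"]


def names_in_order(tags, lookup):
    """Hypothetical helper: keep the record's own tag order, skip unknown ids."""
    return [lookup[t] for t in tags if t in lookup]


for d in myjson:
    d["tagnames"] = names_in_order(d["tags"], tagnames_map)
```

If the order of the names does not matter to you, the set intersection alone is enough; the list comprehension with a membership test is an alternative for when it does.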
If you wanted to make a copy instead, use d.copy():

    data = []
    for d in myjson:
        d = d.copy()
        d['tagnames'] = [tagnames_map[t] for t in tagnames_map.viewkeys() & d['tags']]
        data.append(d)

dict.copy() creates a shallow copy; mutable values are not copied, so the new dict just references the same values. You are not altering those values here, so that is fine.
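
To see what "shallow" means in practice, here is a small sketch with one record from the question. The variable names `original` and `copied` are only for illustration.

```python
original = {"tags": ["1", "3"], "otherdata": "blah"}
copied = original.copy()

# Adding a key to the copy leaves the original dict untouched.
copied["tagnames"] = ["bassoon", "paw paw"]

# The list stored under "tags" is shared between both dicts,
# so mutating it in place is visible through either one.
copied["tags"].append("4")
```

After this runs, `original` still has no "tagnames" key, but its "tags" list has gained "4". If you ever need to change the nested values independently, copy.deepcopy() from the standard library is one option.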
Running this against your sample input gives:

    >>> pprint(data)
    [{'otherdata': 'blah', 'tagnames': ['bassoon', 'paw paw'], 'tags': ['1', '3']},
     {'otherdata': 'blah blah', 'tagnames': ['banjo', 'foxes'], 'tags': ['2', '4']}]
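
The question also asked for a map-style version without the outer for loop. One possible sketch, in Python 3 and assuming the same sample data, builds a new dict per record instead of mutating the input. The function name `with_tagnames` is hypothetical.

```python
# Sample data copied from the question.
myjson = [
    {"tags": ["1", "3"], "otherdata": "blah"},
    {"tags": ["2", "4"], "otherdata": "blah blah"},
]
tagnames = [
    {"id": "1", "name": "bassoon"},
    {"id": "2", "name": "banjo"},
    {"id": "3", "name": "paw paw"},
    {"id": "4", "name": "foxes"},
]

# One pass over tagnames to build the lookup.
tagnames_map = {t["id"]: t["name"] for t in tagnames}


def with_tagnames(record):
    """Return a new dict: the record plus a 'tagnames' list (hypothetical helper)."""
    names = [tagnames_map[t] for t in record["tags"] if t in tagnames_map]
    # dict(record, tagnames=...) creates a shallow copy with one extra key.
    return dict(record, tagnames=names)


# One pass over myjson; each tag costs a single dictionary lookup.
data = list(map(with_tagnames, myjson))
```

Whether this reads better than the plain loop is a matter of taste. The speed-up comes from the dictionary lookup, not from map() itself.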