python - How to add data to a dict-like Item Field using ItemLoaders?


I'm using Scrapy's XPathItemLoader, but its API only documents adding values to an item field, not anything deeper :( I mean:

    def parse_item(self, response):
        loader = XPathItemLoader(response=response)
        loader.add_xpath('name', '//h1')

This will add the values found by the XPath to item.name, but how do I add them to item.profile['name']?

XPathItemLoader.add_xpath doesn't support writing to nested fields. You should construct your profile dict manually and write it via the add_value method (in case you still need to go with loaders). Or, you can write your own custom loader.

Here's an example using add_value:

    from scrapy.contrib.loader import XPathItemLoader
    from scrapy.item import Item, Field
    from scrapy.selector import HtmlXPathSelector
    from scrapy.spider import BaseSpider


    class TestItem(Item):
        # Dict-like field that will hold {link text: href}
        others = Field()


    class WikiSpider(BaseSpider):
        name = "wiki"
        allowed_domains = ["en.wikipedia.org"]
        start_urls = ["http://en.wikipedia.org/wiki/Main_Page"]

        def parse(self, response):
            hxs = HtmlXPathSelector(response)
            loader = XPathItemLoader(item=TestItem(), response=response)

            # Build the nested dict by hand, since add_xpath cannot do it
            others = {}
            crawled_items = hxs.select('//div[@id="mp-other"]/ul/li/b/a')
            for item in crawled_items:
                href = item.select('@href').extract()[0]
                name = item.select('text()').extract()[0]
                others[name] = href

            # Hand the finished dict to the loader as a single value
            loader.add_value('others', others)
            return loader.load_item()

Run it via: scrapy runspider <script_name> --output test.json

The spider collects all items under "Other areas of Wikipedia" on the main Wikipedia page and writes them to the dictionary field others.
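As for the other option mentioned above, a custom loader, here is a rough sketch of the part such a loader would need: turning a dotted path like 'profile.name' into nested dicts. This is plain Python and not part of Scrapy. The class name NestedValueCollector, the dotted-path convention, and the sample values are all made up for illustration.

```python
class NestedValueCollector:
    """Hypothetical helper, NOT a Scrapy class.

    Shows how dotted field paths such as 'profile.name' could be
    expanded into nested dicts while values are being collected.
    """

    def __init__(self):
        self._values = {}

    def add_value(self, path, value):
        # Split 'profile.name' into parent keys and the final key.
        *parents, leaf = path.split(".")
        node = self._values
        for key in parents:
            # Create intermediate dicts on demand.
            node = node.setdefault(key, {})
        # Collect values in a list under the final key.
        node.setdefault(leaf, []).append(value)

    def load(self):
        # Return everything collected so far.
        return self._values


collector = NestedValueCollector()
collector.add_value("profile.name", "Alice")  # sample data, assumed
collector.add_value("profile.url", "http://example.com/alice")
collector.add_value("title", "Example page")
result = collector.load()
print(result)
```

In a spider, the collected profile dict could then be passed on with loader.add_value('profile', result['profile']), the same way others is passed in the example above. The sketch does not handle a path whose parent key already holds a plain value.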

Hope that helps.

