python - How to add data to a dict-like Item Field using ItemLoaders?
I'm using Scrapy's XPathItemLoader, but its API only documents adding values to an item field, not anything deeper :( I mean:

    def parse_item(self, response):
        loader = XPathItemLoader(response=response)
        loader.add_xpath('name', '//h1')

This will add the values found by the XPath to Item.name, but how do I add them to Item.profile['name']?
XPathItemLoader.add_xpath doesn't support writing to nested fields. You should construct the profile dict manually and write it via the add_value method (in case you still need to go with loaders). Or, you can write your own custom loader.
Here's an example using add_value:
    from scrapy.contrib.loader import XPathItemLoader
    from scrapy.item import Item, Field
    from scrapy.selector import HtmlXPathSelector
    from scrapy.spider import BaseSpider


    class TestItem(Item):
        others = Field()


    class WikiSpider(BaseSpider):
        name = "wiki"
        allowed_domains = ["en.wikipedia.org"]
        start_urls = ["http://en.wikipedia.org/wiki/Main_Page"]

        def parse(self, response):
            hxs = HtmlXPathSelector(response)
            loader = XPathItemLoader(item=TestItem(), response=response)

            # Build the nested dict by hand: link text -> href
            others = {}
            crawled_items = hxs.select('//div[@id="mp-other"]/ul/li/b/a')
            for item in crawled_items:
                href = item.select('@href').extract()[0]
                name = item.select('text()').extract()[0]
                others[name] = href

            # Write the whole dict into the field in one call
            loader.add_value('others', others)
            return loader.load_item()

Run it via: scrapy runspider <script_name> --output test.json
The spider collects the links under "Other areas of Wikipedia" on the Wikipedia main page and writes them to the dictionary field others.
Hope that helps.
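As for the "build the dict manually" step in the general case, here is a minimal sketch in plain Python of a helper that writes a value under a dotted path such as 'profile.name'. The name set_nested and the dotted-path convention are hypothetical, not part of Scrapy; the sketch only assumes the item behaves like a mutable mapping, and a plain dict stands in for the item here.

```python
def set_nested(item, path, value):
    """Store value under a dotted path such as 'profile.name'.

    Hypothetical helper, not a Scrapy API. Assumes `item` supports
    `in`, item[key] and item[key] = value, like a dict does.
    """
    keys = path.split('.')
    target = item
    # Walk (and create, if missing) every intermediate dict
    for key in keys[:-1]:
        if key not in target:
            target[key] = {}
        target = target[key]
    # The last key receives the value itself
    target[keys[-1]] = value
    return item


# A plain dict stands in for the item; the values are placeholders
item = {}
set_nested(item, 'profile.name', 'Example name')
set_nested(item, 'profile.url', 'http://example.com/')
print(item)
```

A dict built this way could then be handed to the loader with add_value, the same way the others dict is in the spider.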