Issue
I'm getting started with Scrapy and wanted to follow a tutorial to create a spider.
This is my code so far:
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"

    # Scrapy looks for start_requests (plural); start_request would never be called.
    def start_requests(self):
        urls = [
            'http://quotes.toscrape.com/page/1/',
            'http://quotes.toscrape.com/page/2/'
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        # The page number is the second-to-last segment of the URL.
        page = response.url.split("/")[-2]
        filename = "quotes-%s.html" % page
        with open(filename, "wb") as f:
            f.write(response.body)
        self.log('saved file %s' % filename)
This is the output I get:
Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
>>> import scrapy
>>> self.log('saved file %s' % filename)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'self' is not defined
I'm quite new to this and don't get how I can solve this. Hope you can help me. :)
Solution
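You should not run the spider's lines one by one in the Python interpreter; self only exists inside the methods of the class, which is why the interpreter raises the NameError. Instead, start the spider with
scrapy crawl quotes
in the Anaconda prompt, from within the project directory.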
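To illustrate why the interpreter raised the NameError, here is a minimal sketch in plain Python that does not need Scrapy. The Demo class and the run_outside_class function are hypothetical stand-ins invented for this example (they are not part of Scrapy or of the original spider), and log here simply stores messages in a list instead of using Scrapy's logger.

```python
class Demo:
    """Hypothetical stand-in for a spider; not a real Scrapy class."""
    name = "quotes"

    def __init__(self):
        self.messages = []

    def log(self, message):
        # Assumption: store the message in a list instead of using Scrapy's logger.
        self.messages.append(message)

    def parse(self, filename):
        # Inside a method, self is the instance the method was called on.
        self.log('saved file %s' % filename)


def run_outside_class():
    # Mimics typing the line at the >>> prompt: no name self is in scope here.
    try:
        self.log('saved file %s' % 'quotes-1.html')
    except NameError as exc:
        return str(exc)
    return None


spider = Demo()
spider.parse('quotes-1.html')   # works: self is bound by the method call
error = run_outside_class()     # fails: self was never defined in this scope

print(spider.messages)
print(error)
```

The same line that works inside parse fails once it is taken out of the class, which matches the traceback in the question. When the spider is started through Scrapy, the framework creates the spider instance and calls its methods, so self is always bound.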
Answered By - reg202