Issue
When I run my script, the print statement outputs None. But when I run the same selector from the scrapy shell, I get the result I want.
What could be the reason for the different results?
The code is below:
import scrapy
from scrapy.crawler import CrawlerProcess

class TestSpiderSpider(scrapy.Spider):
    name = 'test_spider'
    allowed_domains = ['dvlaregistrations.direct.gov.uk']
    start_urls = ['https://dvlaregistrations.dvla.gov.uk/search/results.html?search=CO11CTD"&"action=index"&"pricefrom=0"&"priceto="&"prefixmatches="&"currentmatches="&"limitprefix="&"limitcurrent="&"limitauction="&"searched=true"&"openoption="&"language=en"&"prefix2=Search"&"super="&"super_pricefrom="&"super_priceto='
    ]

    def parse(self, response):
        price=response.css('div.resultsstrip p::text').get()
        print(price)
        print('---+---')
        all_prices = response.css('div.resultsstrip p::text')
        for element in all_prices:
            yield print(element.css('::text').get())

process = CrawlerProcess()
process.crawl(TestSpiderSpider)
process.start()
When run, this script prints None, but in the scrapy shell response.css('div.resultsstrip p::text').get() returns '£250', which is the value shown on the page.
Solution
You should edit your original post with the new url in your comment because the one in your question doesn't point to the same address.
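As a rough illustration of why the URL matters, the sketch below parses a shortened version of the question's URL with the standard library. It assumes the stray double quotes around each & are really present in the script; if they are, they become part of the parameter names and values, so the server sees a different query. The shortened parameter list and the variable names are only for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

BASE = "https://dvlaregistrations.dvla.gov.uk/search/results.html"

# A shortened version of the question's URL, with the stray quotes kept.
quoted_url = BASE + '?search=CO11CTD"&"action=index"&"pricefrom=0"&"priceto='
quoted_params = parse_qsl(urlsplit(quoted_url).query, keep_blank_values=True)
print(quoted_params)  # the quotes end up inside the names and values

# Building the query from a dict avoids hand-editing the separators.
params = {"search": "CO11CTD", "action": "index", "pricefrom": "0", "priceto": ""}
clean_url = BASE + "?" + urlencode(params)
print(clean_url)
```

The resulting clean_url could then be placed in start_urls in place of the hand-written string.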
Also, you are trying to extract the text from a selector that already points to a text node, which is why it returns None.
In the following line, your selector list already targets the ::text of the element:
all_prices = response.css('div.resultsstrip p::text')
That is why trying to extract the ::text again from each element doesn't work:
print(element.css('::text').get())
What would have worked is simply calling get on the element itself:
print(element.get())
Try this:
import scrapy
from scrapy.crawler import CrawlerProcess

class TestSpiderSpider(scrapy.Spider):
    name = 'test_spider'
    allowed_domains = ['dvlaregistrations.direct.gov.uk']
    start_urls = ["https://dvlaregistrations.dvla.gov.uk/search/results.html?search=CO11CTD&action=index&pricefrom=0&priceto=&prefixmatches=&currentmatches=&limitprefix=&limitcurrent=&limitauction=&searched=true&openoption=&language=en&prefix2=Search&super=&super_pricefrom=&super_priceto="]

    def parse(self, response):
        # Iterate over each result row, then select relative to that row.
        for row in response.css('div.resultsstrip'):
            plate = row.css('a::text').get()
            price = row.css('p::text').get()
            # Yield a dict so Scrapy treats it as a scraped item.
            yield {"plate": plate.strip(), "price": price.strip()}

process = CrawlerProcess()
process.crawl(TestSpiderSpider)
process.start()
output:
&limitprefix=&limitcurrent=&limitauction=&searched=true&openoption=&language=en&prefix2=Search&super=&super_pricefrom=&super_priceto=>
{'plate': 'CO02 CTO', 'price': '£250'}
2023-01-14 15:22:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://dvlaregistrations.dvla.gov.uk/search/results.html?search=CO11CTD&action=index&pricefrom=0&priceto=&prefixmatches=&currentmatches=&limitprefix=&limitcurrent=&limitauction=&searched=true&openoption=&language=en&prefix2=Search&super=&super_pricefrom=&super_priceto=>
{'plate': 'CO03 CTO', 'price': '£250'}
2023-01-14 15:22:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://dvlaregistrations.dvla.gov.uk/search/results.html?search=CO11CTD&action=index&pricefrom=0&priceto=&prefixmatches=&currentmatches=&limitprefix=&limitcurrent=&limitauction=&searched=true&openoption=&language=en&prefix2=Search&super=&super_pricefrom=&super_priceto=>
{'plate': 'CO04 CTO', 'price': '£250'}
2023-01-14 15:22:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://dvlaregistrations.dvla.gov.uk/search/results.html?search=CO11CTD&action=index&pricefrom=0&priceto=&prefixmatches=&currentmatches=&limitprefix=&limitcurrent=&limitauction=&searched=true&openoption=&language=en&prefix2=Search&super=&super_pricefrom=&super_priceto=>
{'plate': 'CO05 CTO', 'price': '£250'}
2023-01-14 15:22:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://dvlaregistrations.dvla.gov.uk/search/results.html?search=CO11CTD&action=index&pricefrom=0&priceto=&prefixmatches=&currentmatches=&limitprefix=&limitcurrent=&limitauction=&searched=true&openoption=&language=en&prefix2=Search&super=&super_pricefrom=&super_priceto=>
{'plate': 'B14 CCD', 'price': '£399'}
2023-01-14 15:22:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://dvlaregistrations.dvla.gov.uk/search/results.html?search=CO11CTD&action=index&pricefrom=0&priceto=&prefixmatches=&currentmatches=&limitprefix=&limitcurrent=&limitauction=&searched=true&openoption=&language=en&prefix2=Search&super=&super_pricefrom=&super_priceto=>
{'plate': 'B15 CCD', 'price': '£399'}
2023-01-14 15:22:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://dvlaregistrations.dvla.gov.uk/search/results.html?search=CO11CTD&action=index&pricefrom=0&priceto=&prefixmatches=&currentmatches=&limitprefix=&limitcurrent=&limitauction=&searched=true&openoption=&language=en&prefix2=Search&super=&super_pricefrom=&super_priceto=>
{'plate': 'B17 CCD', 'price': '£399'}
2023-01-14 15:22:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://dvlaregistrations.dvla.gov.uk/search/results.html?search=CO11CTD&action=index&pricefrom=0&priceto=&prefixmatches=&currentmatches=&limitprefix=&limitcurrent=&limitauction=&searched=true&openoption=&language=en&prefix2=Search&super=&super_pricefrom=&super_priceto=>
{'plate': 'B18 CCD', 'price': '£399'}
Answered By - Alexander