Issue
How can I crawl the website https://kateglo.com/?mod=dictionary&srch=all and extract the dt and dd tag values together?
I have tried and succeeded in extracting them, but the dt and dd values do not end up on the same line.
Sorry about my bad English.
Solution
There are at least two ways.
The first one is less reliable, because it pairs values by position only: just extract two lists and zip them:
dt_list = response.css('dt::text').getall()
dd_list = response.css('dd::text').getall()
# zip() returns an iterator in Python 3, so wrap it in list()
final_list = list(zip(dt_list, dd_list))
You'll get a list of tuples with corresponding dt and dd values.
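To see why pairing by position is fragile, here is a minimal sketch with made-up sample values standing in for the real page content (the terms and definitions below are hypothetical, not taken from kateglo.com). It assumes one dt on the page has no dd at all:

```python
# Hypothetical sample data: "beta" has no <dd>, so dd_list is one item short.
dt_list = ["alpha", "beta", "gamma"]
dd_list = ["first letter", "third letter"]

# zip() pairs by position and stops at the shorter list.
pairs = list(zip(dt_list, dd_list))
print(pairs)
```

Here "beta" gets the definition that belongs to "gamma", and "gamma" is dropped without any error, so one missing dd silently shifts every pair after it.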
The second one is more correct but requires reading the docs a bit. You should take a look at the XPath following-sibling axis.
In the end you'll get something like this:
dt_list = response.css('dt')
for dt in dt_list:
    dt_value = dt.css('::text').get()
    # Get the first dd that follows this dt
    dd_value = dt.xpath('./following-sibling::dd[1]/text()').get()
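If you want to check the pairing logic without running a spider, the same idea can be sketched with only the standard library. This is not Scrapy: it swaps in html.parser, and the class name DefinitionListParser and the sample HTML are made up for illustration. Each dd is attached to the nearest dt before it, so a dt with no dd gets an empty list instead of borrowing a later dd:

```python
from html.parser import HTMLParser


class DefinitionListParser(HTMLParser):
    """Collect [dt_text, [dd_text, ...]] entries in document order."""

    def __init__(self):
        super().__init__()
        self.pairs = []       # one [term, definitions] entry per <dt>
        self._current = None  # tag whose text is being collected: "dt", "dd" or None

    def handle_starttag(self, tag, attrs):
        if tag == "dt":
            self.pairs.append(["", []])
            self._current = "dt"
        elif tag == "dd" and self.pairs:
            # Attach this definition to the most recent term.
            self.pairs[-1][1].append("")
            self._current = "dd"

    def handle_endtag(self, tag):
        if tag in ("dt", "dd"):
            self._current = None

    def handle_data(self, data):
        if self._current == "dt":
            self.pairs[-1][0] += data
        elif self._current == "dd":
            self.pairs[-1][1][-1] += data


# Hypothetical sample markup, not copied from the real site.
SAMPLE_HTML = """
<dl>
  <dt>alpha</dt><dd>first letter</dd>
  <dt>beta</dt>
  <dt>gamma</dt><dd>third letter</dd><dd>a kind of ray</dd>
</dl>
"""

parser = DefinitionListParser()
parser.feed(SAMPLE_HTML)
result = [(term.strip(), [d.strip() for d in defs]) for term, defs in parser.pairs]
print(result)
```

Note that the following-sibling axis selects every later dd sibling, so for a dt without its own dd the XPath version would return the next term's definition. Keep that in mind if the page has such entries.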
So it goes.
Answered By - Michael Savchenko