Sunday, December 31, 2023

[FIXED] Python requests get doesn't return anything

Labels: python, python-requests, scrapy, web-crawler

Issue

I want to request this url and get its information:

https://search.codal.ir/api/search/v2/q?=&Audited=true&AuditorRef=-1&Category=1&Childs=false&CompanyState=0&CompanyType=-1&Consolidatable=true&IsNotAudited=false&Isic=232007&Length=12&LetterCode=%D9%86-10&LetterType=-1&Mains=true&NotAudited=false&NotConsolidatable=true&PageNumber=1&Publisher=false&Symbol=%D8%B4%D9%BE%D9%86%D8%A7&TracingNo=-1&search=true

It's a simple endpoint, as you will see if you open it. The code I use is:

import requests
req = requests.Session()
link12month = "https://search.codal.ir/api/search/v2/q?&Audited=true&AuditorRef=-1&Category=1&Childs=false&CompanyState=0&CompanyType=-1&Consolidatable=true&IsNotAudited=false&Isic=232007&Length=12&LetterCode=ن-10&LetterType=-1&Mains=true&NotAudited=false&NotConsolidatable=true&PageNumber=1&Publisher=false&Symbol=شپنا&TracingNo=-1&search=true"
response = req.get(link12month, verify=False,
                   headers={'Content-Type': 'application/xml; charset=utf-8'})
print(response.status_code, response.text)

But requests.get does not return anything: no error is raised and no response is received. I have already retrieved this page with Selenium, but Selenium is much slower than requests, and I want to request about 1000 pages (the same URL with minor changes).

If you know a way to make this work, please guide me. Is there another fast way to send these requests (for example, can Scrapy be used)? If so, please tell me how.

Solution

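Found the issue: you need to set a User-Agent header explicitly. By default requests identifies itself as python-requests/<version>, and this server apparently does not answer requests carrying that value: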

import requests

if __name__ == '__main__':
    req = requests.Session()
    link12month = "https://search.codal.ir/api/search/v2/q?&Audited=true&AuditorRef=-1&Category=1&Childs=false&CompanyState=0&CompanyType=-1&Consolidatable=true&IsNotAudited=false&Isic=232007&Length=12&LetterCode=ن-10&LetterType=-1&Mains=true&NotAudited=false&NotConsolidatable=true&PageNumber=1&Publisher=false&Symbol=شپنا&TracingNo=-1&search=true"
    response = req.get(link12month, headers={'Accept': 'application/xml; charset=utf-8', 'User-Agent': 'foo'})
    print(response.status_code, response.text)

You can also simplify your code a bit by calling requests.get directly instead of creating a Session:

import requests

if __name__ == '__main__':
    link12month = "https://search.codal.ir/api/search/v2/q?&Audited=true&AuditorRef=-1&Category=1&Childs=false&CompanyState=0&CompanyType=-1&Consolidatable=true&IsNotAudited=false&Isic=232007&Length=12&LetterCode=ن-10&LetterType=-1&Mains=true&NotAudited=false&NotConsolidatable=true&PageNumber=1&Publisher=false&Symbol=شپنا&TracingNo=-1&search=true"
    response = requests.get(link12month, headers={'Accept': 'application/xml; charset=utf-8', 'User-Agent': 'foo'})
    print(response.status_code, response.text)

Keep TLS certificate verification enabled if you can, that is, do not pass verify=False unless you have to. If you want JSON instead of XML, remove the 'Accept' header.
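Since the original symptom was a call that never came back, it may also help to set a timeout and check the status before reading the body. The following is a minimal sketch, not a tested client for this site: the helper name fetch_json and the 10-second timeout are assumptions made for illustration, and it relies on the statement above that the endpoint returns JSON when no 'Accept' header is sent.

```python
def fetch_json(session, url, timeout=10):
    """Fetch `url` and return the decoded JSON body.

    `session` is expected to be a requests.Session (anything with a
    compatible .get method works). The name and the default timeout
    are illustrative assumptions, not part of the original answer.
    """
    response = session.get(
        url,
        headers={"User-Agent": "foo"},  # no Accept header, so JSON is assumed
        timeout=timeout,  # requests has no default timeout and may wait forever
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.json()
```

With requests installed, a call such as fetch_json(requests.Session(), link12month) would send the request. Certificate verification stays on because verify is left at its default.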

You can also simplify the URL considerably by passing the query parameters as a dictionary through the params argument of requests.get; see the docs here: https://www.w3schools.com/python/ref_requests_get.asp
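As a rough sketch of the params approach, the query string from the question can be written as a dictionary, which also makes it easy to change PageNumber when requesting many pages. The helper names build_params, fetch_page and iter_pages are hypothetical, the stray empty parameter at the start of the original query string is assumed to be unnecessary, and the preview uses the standard library's urlencode, which should produce an encoding very close to what requests generates.

```python
from urllib.parse import urlencode

BASE_URL = "https://search.codal.ir/api/search/v2/q"


def build_params(page_number=1):
    """Return the query parameters of the original URL as a dictionary."""
    return {
        "Audited": "true",
        "AuditorRef": -1,
        "Category": 1,
        "Childs": "false",
        "CompanyState": 0,
        "CompanyType": -1,
        "Consolidatable": "true",
        "IsNotAudited": "false",
        "Isic": 232007,
        "Length": 12,
        "LetterCode": "ن-10",  # non-ASCII values are percent-encoded for you
        "LetterType": -1,
        "Mains": "true",
        "NotAudited": "false",
        "NotConsolidatable": "true",
        "PageNumber": page_number,
        "Publisher": "false",
        "Symbol": "شپنا",
        "TracingNo": -1,
        "search": "true",
    }


def fetch_page(session, page_number=1):
    """Request one page; `session` is expected to be a requests.Session."""
    return session.get(
        BASE_URL,
        params=build_params(page_number),
        headers={"User-Agent": "foo"},
        timeout=10,  # arbitrary value chosen for this sketch
    )


def iter_pages(session, first, last):
    """Yield the responses for pages first..last, reusing one session."""
    for page_number in range(first, last + 1):
        yield fetch_page(session, page_number)


# Preview of the encoded query string for page 2 (no network access needed).
query = urlencode(build_params(page_number=2))
print(BASE_URL + "?" + query)
```

With requests installed, iter_pages(requests.Session(), 1, 1000) would walk through the pages while reusing a single connection pool, which is usually faster than calling requests.get 1000 times.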

Answered By - Ash Oldershaw

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
