Issue
Code example:
async def download_page(session, url):
print(True)
async def downloader_init(session):
while True:
url = await download_queue.get()
task = asyncio.create_task(download_page(session, url))
print(task)
print(f"url: {url}")
async def get_urls(url):
while True:
try:
url = find_somewhere_url
await download_queue.put(url)
except NoSuchElementException:
break
return True
async def main():
async with aiohttp.ClientSession(headers=headers) as session:
get_urls_task = asyncio.create_task(get_urls(url))
downloader_init_task = asyncio.create_task(downloader_init(session))
asyncio.gather(get_urls_task, downloader_init_task)
if __name__ == "__main__":
asyncio.get_event_loop().run_until_complete(main())
Output:
<Task pending coro=<download_page() running at main.py:69>>
url: https://someurl.com/example
<Task pending coro=<download_page() running at main.py:69>>
url: https://someurl.com/example
<Task pending coro=<download_page() running at main.py:69>>
url: https://someurl.com/example
Why is the method download_page
not executed?
The strange thing is that the script just ends its work, there are no errors anywhere.
downloader_init
should work endlessly, but it does not.
In download_queue
, method get_urls
adds links as it finds them, after which it stops working.
downloader_init
should immediately execute as soon as a new link appears in the queue, but it starts its work only when get_urls
has completed its work.
Solution
Try this instead:
Note: Your problem wasn't with the task creation, it was because
there wasn't an await at the asyncio.gather
part.
import asyncio
import aiohttp
async def download_page(session, url):
# Dummy function.
print(f"session={session}, url={url}")
async def downloader_init(session):
while True:
url = await download_queue.get()
task = asyncio.create_task(download_page(session, url))
print(f"task={task}, url={url}")
async def get_urls(url):
while True:
try:
url = find_somewhere_url()
await download_queue.put(url)
except NoSuchElementException:
break
async def main():
async with aiohttp.ClientSession(headers=headers) as session:
get_urls_task = asyncio.create_task(get_urls(url))
downloader_init_task = asyncio.create_task(downloader_init(session))
# Use await here to make it finish the tasks.
await asyncio.gather(get_urls_task, downloader_init_task)
if __name__ == "__main__":
# Use this as it deals with the loop creation, shutdown,
# and other stuff for you.
asyncio.run(main()) # This is new in Python 3.7
Answered By - GeeTransit
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.