Issue
What's the standard way in Python to ensure that all concurrent tasks are completed before the event loop ends? Here's a simplified example:
import asyncio

async def foo(delay):
    print("Start foo.")  # Eg: Send message
    asyncio.create_task(bar(delay))
    print("End foo.")

async def bar(delay):
    print("Start bar.")
    await asyncio.sleep(delay)
    print("End bar.")  # Eg: Delete message after delay

def main():
    asyncio.run(foo(2))

if __name__ == "__main__":
    main()
Current output:
Start foo. # Eg: Send message
End foo.
Start bar.
Desired output:
Start foo. # Eg: Send message
End foo.
Start bar.
End bar. # Eg: Delete message after delay
I've tried to run all outstanding tasks after loop.run_until_complete(), but that doesn't work since the loop will have been terminated by then. I've also tried modifying the main function to the following:
async def main():
    await foo(2)
    tasks = asyncio.all_tasks()
    if len(tasks) > 0:
        await asyncio.wait(tasks)

if __name__ == "__main__":
    asyncio.run(main())
The output is correct, but it never terminates, since the coroutine main() is itself one of the tasks. The setup above is also how discord.py sends a message and deletes it after a period of time, except that it uses loop.run_forever() instead and so does not encounter this problem.
Solution
There is no standard way to wait for all tasks in asyncio (and similar frameworks), and in fact one should not try to. Speaking in terms of threads, a Task expresses both regular and daemon activities. Waiting for all tasks indiscriminately may cause an application to stall indefinitely.
A task that is created but never awaited is de facto a background/daemon task. In contrast, if a task should not be treated as background/daemon, it is the caller's responsibility to ensure it is awaited.
The simplest solution is for every coroutine to await and/or cancel all tasks it spawns.
async def foo(delay):
    print("Start foo.")
    task = asyncio.create_task(bar(delay))
    print("End foo.")
    await task  # foo is done here; it ensures the other task finishes as well
Since the entire point of async/tasks is to have cheap task switching, this is a cheap operation. It should also not affect any well-designed applications:
- If the purpose of a function is to produce a value, any child tasks should be part of producing that value.
- If the purpose of a function is some side-effect, any child tasks should be parts of that side-effect.
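For illustration, here is a minimal sketch of the first case, using hypothetical helper names: the child tasks exist only to produce the function's value, so they are awaited before returning.
import asyncio

async def fetch_part(i):
    # hypothetical child task that contributes to the result
    await asyncio.sleep(0.1)
    return i * 2

async def compute_total():
    # the child tasks are part of producing this function's value,
    # so they are awaited (via gather) before returning
    tasks = [asyncio.create_task(fetch_part(i)) for i in range(3)]
    parts = await asyncio.gather(*tasks)
    return sum(parts)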
For more complex situations, it can be worthwhile to return any outstanding tasks.
async def foo(delay):
    print("Start foo.")
    task = asyncio.create_task(bar(delay))
    print("End foo.")
    return task  # allow the caller to wait for our child tasks
This requires the caller to explicitly handle outstanding tasks, but gives prompt replies and the most control. The top-level task is then responsible for handling any orphan tasks.
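For example, a top-level coroutine might collect the returned task and await it before the loop ends; a minimal sketch building on the foo above:
async def main():
    # foo returns promptly; the outstanding child task is handed back to us
    pending = await foo(2)
    print("Doing other work while bar runs...")
    await pending  # the caller decides when to await (or cancel) it

if __name__ == "__main__":
    asyncio.run(main())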
For async programming in general, the structured concurrency paradigm encodes the idea of "handling outstanding tasks" in a managing object. In Python, this pattern has been encoded by the trio library as so-called Nursery objects.
import trio

async def foo(delay, nursery):
    print("Start foo.")
    # spawning a task via a nursery means *someone* awaits it
    nursery.start_soon(bar, delay)
    print("End foo.")

async def bar(delay):
    print("Start bar.")
    await trio.sleep(delay)
    print("End bar.")

async def main():
    # a task may spawn a nursery and pass it to child tasks
    async with trio.open_nursery() as nursery:
        await foo(2, nursery)

if __name__ == "__main__":
    trio.run(main)
While this pattern has been suggested for asyncio as TaskGroups, so far it has been deferred. Various ports of the pattern for asyncio are available via third-party libraries, however.
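For reference, newer Python versions (3.11 and later) do ship asyncio.TaskGroup in the standard library; a minimal sketch of the same pattern with it:
import asyncio

async def bar(delay):
    print("Start bar.")
    await asyncio.sleep(delay)
    print("End bar.")

async def main():
    # the task group awaits every task spawned in it before the block exits
    async with asyncio.TaskGroup() as tg:  # Python 3.11+
        print("Start foo.")
        tg.create_task(bar(2))
        print("End foo.")

if __name__ == "__main__":
    asyncio.run(main())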
Answered By - MisterMiyagi