Scraping Content Using Pyppeteer In Association With Asyncio
I've written a script in Python using pyppeteer together with asyncio to scrape the links of different posts from a landing page and eventually get the title of each p
Solution 1:
The problem is in the following lines:
tasks = [await browse_all_links(link, page) for link in linkstorage]
results = await asyncio.gather(*tasks)
The intention is for tasks to be a list of awaitable objects, such as coroutine objects or futures. The list is to be passed to gather, so that the awaitables can run in parallel until they all complete. However, the list comprehension contains an await, which means that it:
- executes each browse_all_links call to completion in series rather than in parallel;
- places the return values of the browse_all_links invocations into the list.
Since browse_all_links doesn't return a value, you are passing a list of None objects to asyncio.gather, which complains that it didn't get an awaitable object.
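The failure can be reproduced without a browser. The following is a minimal sketch, assuming a hypothetical stand-in for browse_all_links that only sleeps and returns nothing; the link values are placeholders, not the ones from the question.

```python
import asyncio


async def browse_all_links(link):
    # Hypothetical stand-in for the real coroutine: it does some
    # asynchronous work and, like the original, returns nothing.
    await asyncio.sleep(0)


async def main():
    links = ["a", "b"]  # placeholder links, assumed for illustration
    # The await inside the comprehension runs each call to completion,
    # so the list ends up holding return values (None), not awaitables.
    tasks = [await browse_all_links(link) for link in links]
    try:
        await asyncio.gather(*tasks)
    except TypeError as exc:
        # gather rejects None because it is not awaitable.
        return tasks, exc
    return tasks, None


tasks, error = asyncio.run(main())
print(tasks)
print(type(error).__name__)
```

Here tasks is a list of None values, and gather raises a TypeError as soon as it tries to wrap the first one.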
To resolve the issue, just drop the await from the list comprehension.
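A sketch of the corrected pattern, again using a hypothetical stand-in coroutine instead of real pyppeteer calls. The events list and the sleep duration are assumptions added only to make the interleaving visible.

```python
import asyncio

events = []  # records the order in which the coroutines start and finish


async def browse_all_links(link):
    # Hypothetical stand-in: the real coroutine would drive a pyppeteer page.
    events.append(f"start {link}")
    await asyncio.sleep(0.01)
    events.append(f"end {link}")


async def main():
    links = ["a", "b", "c"]  # placeholder links, assumed for illustration
    # No await here: the list holds coroutine objects, which gather
    # schedules together and then waits on until all have finished.
    tasks = [browse_all_links(link) for link in links]
    return await asyncio.gather(*tasks)


results = asyncio.run(main())
print(events)
print(results)
```

Every coroutine starts before any of them finishes, which is the behaviour the original code intended. The stand-in takes no page argument; the original passes the same page to every call, and concurrent navigation on one shared page may need separate handling.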