Imagine you have a file containing 100,000 URLs, and your goal is to send an HTTP request to each one, then print the status codes from the responses. How can you write code to perform this task as quickly and efficiently as possible?
Python offers several ways to do concurrent programming, including the standard threading library, the concurrent.futures module, coroutines with asyncio, and the asynchronous library grequests. Each of these approaches can handle a large volume of requests effectively, depending on the specific requirements and environment.
Let’s explore some techniques for sending 100k requests in Python.
Queue + Multithreading
Define a queue with a capacity of 400 and start 200 worker threads; each thread repeatedly takes a URL from the queue and requests it.
The main thread reads the URLs from the file, puts them into the queue, and then waits until every item in the queue has been taken and processed. The code looks like this:
import sys
from threading import Thread
from queue import Queue

import requests

concurrent = 200

def do_work():
    # Each worker thread keeps taking a URL from the shared queue,
    # fetching it, handling the result, and marking the task as done.
    while True:
        url = q.get()
        status, url = get_status(url)
        do_sth_with_result(status, url)
        q.task_done()
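The preview cuts the code off at this point. As a minimal sketch of how the rest of the script could look, assuming the URLs sit one per line in a file named urllist.txt and that “doing something with the result” simply means printing the status code:

def get_status(url):
    # Fetch one URL and return its HTTP status code, or "error" on failure.
    try:
        resp = requests.get(url, timeout=10)
        return resp.status_code, url
    except requests.RequestException:
        return "error", url

def do_sth_with_result(status, url):
    # For this example we just print the status code next to the URL.
    print(status, url)

if __name__ == "__main__":
    q = Queue(concurrent * 2)  # bounded queue of 400, matching the description above
    for _ in range(concurrent):
        t = Thread(target=do_work)
        t.daemon = True  # daemon threads exit when the main thread does
        t.start()
    try:
        for url in open("urllist.txt"):
            q.put(url.strip())
        q.join()  # block until every queued URL has been processed
    except KeyboardInterrupt:
        sys.exit(1)

Bounding the queue at twice the number of threads keeps the main thread from loading the entire file into memory at once, and marking the threads as daemons lets the program exit promptly on Ctrl-C.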