Welcome to akashITS (Akash IT Solutions). In this article, you will learn how to download a file over HTTP in Python.
How to download a file over HTTP in Python
Python offers several modules for downloading files from the web, including urllib from the standard library and the third-party requests package. Most of the examples below use requests.
Downloading a file over HTTP in Python can be achieved using the requests library, which simplifies working with HTTP requests. If you don’t have the library installed, you can install it with pip:
pip install requests
Example 1: Basic File Download
import requests

url = "https://images.brickset.com/sets/images/1-8.jpg"
response = requests.get(url)

# Check if the request was successful (status code 200)
if response.status_code == 200:
    with open("downloaded_file.jpg", "wb") as file:
        file.write(response.content)
    print("File downloaded successfully.")
else:
    print(f"Failed to download file. Status code: {response.status_code}")
In this example, we use requests.get to send a GET request to the specified URL. The response body (response.content) is then written to a local file opened with the open function in binary write mode ("wb").
Example 2: Handling Errors
It’s important to handle potential errors during the download process. For instance, the server might respond with a non-200 status code, indicating an error:
import requests

url = "https://example.com/xyz.jpg"
response = requests.get(url)

if response.status_code == 200:
    # The file extension matches the .jpg resource being requested
    with open("downloaded_file.jpg", "wb") as file:
        file.write(response.content)
    print("File downloaded successfully.")
else:
    print(f"Failed to download file. Status code: {response.status_code}")

# Failed to download file. Status code: 404
In this example, the script checks the status code before attempting to write the file. If the status code is not 200, it prints an error message.
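The status-code check does not cover failures that happen before any response arrives, such as DNS errors, refused connections, or timeouts. The following is a minimal sketch of a more defensive version. The helper name download_file and the 10-second timeout are assumptions made for illustration, not part of the original example.

```python
import requests


def download_file(url, dest_path, timeout=10):
    """Download url to dest_path; return True on success, False on failure.

    The name download_file and the default timeout are illustrative
    choices, not part of the requests API.
    """
    try:
        response = requests.get(url, timeout=timeout)
        # raise_for_status() raises requests.exceptions.HTTPError
        # when the server answers with a 4xx or 5xx status code
        response.raise_for_status()
    except requests.exceptions.RequestException as exc:
        # RequestException is the base class for the connection,
        # timeout, and HTTP errors raised by requests
        print(f"Failed to download {url}: {exc}")
        return False

    # The file is only created once the request has succeeded
    with open(dest_path, "wb") as file:
        file.write(response.content)
    return True
```

A call such as download_file("https://example.com/xyz.jpg", "xyz.jpg") would then return False and print the reason instead of raising, which can be convenient when downloading many files in a loop.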
Example 3: Downloading Large Files in Chunks
For large files, it’s more efficient to download them in chunks to avoid loading the entire file into memory at once. This can be accomplished by iterating over the response content in chunks.
import requests

url = "https://example.com/large_file.zip"
response = requests.get(url, stream=True)

if response.status_code == 200:
    with open("downloaded_large_file.zip", "wb") as file:
        # 8 KB chunks keep memory use low without the overhead of very small writes
        for chunk in response.iter_content(chunk_size=8192):
            file.write(chunk)
    print("Large file downloaded successfully.")
else:
    print(f"Failed to download large file. Status code: {response.status_code}")
Setting stream=True in the get call defers downloading the response body: only the headers are fetched up front, and the body is then read piece by piece with the iter_content method.
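When streaming, it can also help to use the response as a context manager so the connection is released even if the download is interrupted. The sketch below shows one way to do that. The function name download_in_chunks, the 64 KB chunk size, and the timeout are assumptions chosen for illustration.

```python
import requests


def download_in_chunks(url, dest_path, chunk_size=64 * 1024, timeout=10):
    """Stream url to dest_path and return the number of bytes written.

    The function name, chunk size, and timeout are illustrative
    assumptions, not values prescribed by requests.
    """
    bytes_written = 0
    # Leaving the with block closes the response and releases the
    # connection, even if an exception interrupts the download
    with requests.get(url, stream=True, timeout=timeout) as response:
        # Stop early on 4xx/5xx responses instead of saving an error page
        response.raise_for_status()
        with open(dest_path, "wb") as file:
            for chunk in response.iter_content(chunk_size=chunk_size):
                file.write(chunk)
                bytes_written += len(chunk)
    return bytes_written
```

The returned byte count gives the caller a simple way to report progress or to sanity-check the result, for example by comparing it with the size of the file on disk.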
Example 4: Downloading Multiple Files from a Web Page
The following script collects every link ending in mp4 from a web page and downloads each file in chunks. In addition to requests, it needs the beautifulsoup4 and html5lib packages.
import requests
from bs4 import BeautifulSoup

# specify the URL of the archive here
archive_url = "https://sample-videos.com/"


def get_video_links():
    # create response object
    r = requests.get(archive_url)

    # create beautiful-soup object
    soup = BeautifulSoup(r.content, 'html5lib')

    # find all links on the web page that actually carry an href attribute
    links = soup.find_all('a', href=True)

    # keep only the links ending with mp4
    video_links = [archive_url + link['href'] for link in links if link['href'].endswith('mp4')]

    return video_links


def download_video_series(video_links):
    '''iterate through all links in video_links and download them one by one'''
    for link in video_links:
        # obtain filename by splitting url and getting the last string
        file_name = link.split('/')[-1]

        print("Downloading file:%s" % file_name)

        # create response object
        r = requests.get(link, stream=True)

        # download started
        with open(file_name, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024 * 1024):
                if chunk:
                    f.write(chunk)

        print("%s downloaded!\n" % file_name)

    print("All videos downloaded successfully!")
    return


if __name__ == "__main__":
    # getting all video links
    video_links = get_video_links()

    # download all videos
    download_video_series(video_links)
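The script above builds each download URL by concatenating archive_url with the href, and derives the file name by splitting on "/". That works when every href is a plain relative path, but not for absolute links or URLs carrying a query string. A possible refinement using only the standard library is sketched below. The helper names resolve_link and filename_from_url and the fallback name download.bin are hypothetical.

```python
import posixpath
from urllib.parse import unquote, urljoin, urlsplit


def resolve_link(page_url, href):
    """Turn an href found on page_url into an absolute URL.

    urljoin handles relative paths, root-relative paths, and hrefs
    that are already absolute.
    """
    return urljoin(page_url, href)


def filename_from_url(url, default="download.bin"):
    """Return the last path segment of url, ignoring query and fragment.

    Falls back to default (an arbitrary illustrative name) when the
    URL path ends in a slash and has no file component.
    """
    path = urlsplit(url).path
    name = posixpath.basename(unquote(path))
    return name or default
```

For instance, resolve_link("https://sample-videos.com/", "/video/a.mp4") gives "https://sample-videos.com/video/a.mp4", and filename_from_url("https://example.com/files/clip.mp4?token=abc") gives "clip.mp4".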
Using urllib.request
The standard library's urllib.request module can also download a file, with no third-party package required:
import urllib.request

url = 'https://images.brickset.com/sets/images/2-8.jpg'
# save the image under a name whose extension matches the JPEG source
urllib.request.urlretrieve(url, 'new set.jpg')
print("file downloaded successfully")
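The Python documentation lists urlretrieve among the legacy interfaces that might become deprecated in the future. An alternative that stays within the standard library is to combine urllib.request.urlopen with shutil.copyfileobj, as in the sketch below. The function name download_with_urllib and the timeout value are assumptions made for illustration.

```python
import shutil
import urllib.request


def download_with_urllib(url, dest_path, timeout=10):
    """Stream url to dest_path using only the standard library.

    The function name and default timeout are illustrative choices.
    """
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses and
    # urllib.error.URLError for connection problems
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(dest_path, "wb") as file:
            # copyfileobj copies in fixed-size blocks, so the whole
            # body is never held in memory at once
            shutil.copyfileobj(response, file)
    return dest_path
```

Because the body is copied block by block, this variant is also suitable for large files, much like the stream=True approach with requests.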