Get all image links from a web page using Python
Whenever you visit a website, you find a variety of content, from text and photos to music and videos. Sometimes all you want to do is read the text and skim the material. In other cases, you might want to save the data on the page for future use.
Suppose you want to see every image link on a web page. Python can do this for you. In this tutorial, you will learn how to use Python to extract every image link from a web page.
Get all image links from the webpage
In essence, web scraping is a technique for gathering data from websites. Any kind of data can be collected, including text, images, audio, and video.
During web scraping, we retrieve the website’s underlying HTML code directly. The data we need from the page can then be extracted from this code.
Let’s apply this method in Python to extract image links from a web page.
Step 1: First we import the required modules:
import urllib.request
from urllib.parse import urlparse
from bs4 import BeautifulSoup
Step 2: Take the URL of the website (hard-coded here, though it could also be read from the user), open it with urlopen() from urllib.request, and parse the HTML source code with Beautiful Soup using the html.parser parser.
response = urllib.request.urlopen("https://www.satyug.edu.in/")
soup = BeautifulSoup(response, 'html.parser', from_encoding=response.info().get_param('charset'))
print(soup)
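Step 2 mentions taking the URL from the user, while the snippet above hard-codes it and does not handle a request that fails. The following is a minimal sketch of one way to wrap the download, using only the standard library. The function name fetch_html and the 10-second timeout are assumptions made for illustration, not part of the original code.

```python
import urllib.request


def fetch_html(url, timeout=10.0):
    """Return the raw bytes of the page at url, or None if it cannot be fetched.

    fetch_html is a hypothetical helper; the timeout value is an assumption.
    """
    try:
        # The with-block closes the connection once the body has been read.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except (OSError, ValueError) as exc:
        # URLError and HTTPError are subclasses of OSError, as are timeouts;
        # ValueError is raised for strings that are not URLs at all.
        print(f"Could not fetch {url}: {exc}")
        return None
```

To read the URL from the user, the call could look like fetch_html(input("Enter a URL: ").strip()). The returned bytes can be passed to BeautifulSoup in place of the response object.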
Step 3: Now, with a for loop, we find all the img tags on the page and print the src attribute of each one using Beautiful Soup.
for img in soup.find_all('img'):
    print(img.get('src'))
So our final code will be:
import urllib.request
from urllib.parse import urlparse
from bs4 import BeautifulSoup

response = urllib.request.urlopen("https://www.satyug.edu.in/")
soup = BeautifulSoup(response, 'html.parser', from_encoding=response.info().get_param('charset'))
for img in soup.find_all('img'):
    print(img.get('src'))
The output will be:
None
https://images.squarespace-cdn.com/content/v1/5c9f784af8135a660e96e75f/1553997059412-COA4BYPZXR4P1GJDIT26/manisha.jpg
None
https://images.squarespace-cdn.com/content/v1/5c9f784af8135a660e96e75f/1553997100051-7KG839JZ4KPXX9ATM9I1/pavi.jpg
None
https://images.squarespace-cdn.com/content/v1/5c9f784af8135a660e96e75f/1554053090130-Z8CGOIQSJVLEJ24GSMZ2/sidhant.jpg
None
https://images.squarespace-cdn.com/content/v1/5c9f784af8135a660e96e75f/1553997642821-HJAL30GTRB9AZQ1IALE9/nisha.jpg
None
Process finished with exit code 0
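The None lines in the output appear because img.get('src') returns None for any img tag that has no src attribute. Below is a minimal sketch of one way to skip those entries and turn relative paths into full URLs with urljoin from urllib.parse. The function name extract_image_links, the sample HTML, and the example.com addresses are hypothetical. The data-src fallback assumes a common lazy-loading convention and has not been checked against the site used above.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_image_links(html, base_url):
    """Return absolute image URLs found in html, skipping tags with no source.

    extract_image_links is a hypothetical helper written for this example.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for img in soup.find_all("img"):
        # Assumption: lazy-loaded images may keep their URL in data-src.
        src = img.get("src") or img.get("data-src")
        if not src:
            continue  # no usable source on this tag, so nothing to report
        # urljoin leaves absolute URLs unchanged and resolves relative ones.
        links.append(urljoin(base_url, src))
    return links


# Made-up sample page used only to demonstrate the function.
SAMPLE_HTML = """
<html><body>
  <img alt="no source">
  <img src="/static/logo.png">
  <img data-src="https://cdn.example.com/photo.jpg">
  <img src="team/anna.jpg">
</body></html>
"""

for link in extract_image_links(SAMPLE_HTML, "https://www.example.com/about/"):
    print(link)
```

For a live page, the html argument could be the response from urlopen() and base_url the address that was opened.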
So in this way, we can extract all the image links from a Web page.
I hope you like this article
Thanks