Get all image links from a web page using Python

Whenever you visit a website, you encounter a mix of content: text, photos, music, and videos. Sometimes all you want to do is read the text and skim the material. In other cases, you might want to save data from the page for future use.

Suppose you want to see every image link on a web page. Python makes this straightforward. In this tutorial, you will learn how to use Python to extract every image link from a web page.

Get all image links from the webpage

In essence, web scraping is a technique for gathering data from web pages. Any kind of data can be collected, including text, images, audio, and video.
During web scraping, we first retrieve the page’s underlying HTML code. The data we need can then be located in that code and extracted.

Let’s apply this method in Python to extract image links from a web page.

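Step 1: First, import the required modules. urllib is part of the standard library; BeautifulSoup comes from the third-party beautifulsoup4 package (pip install beautifulsoup4):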

import urllib.request
from bs4 import BeautifulSoup

Step 2: Open the URL with urlopen() from urllib.request, then pass the response to BeautifulSoup, which parses the HTML source code using the built-in html.parser. The URL is hard-coded in this example; substitute the page you want to scan.

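# Download the page; urlopen() returns a file-like response object
response = urllib.request.urlopen("https://www.satyug.edu.in/")
# Parse the HTML, passing along the charset declared in the HTTP headers (if any)
soup = BeautifulSoup(response, 'html.parser',
                     from_encoding=response.info().get_param('charset'))
# Print the parsed HTML to confirm the page was fetched
print(soup)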
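
Step 2 as written assumes the request always succeeds. A minimal sketch of a more defensive fetch follows. The helper name fetch_html, the 10-second timeout, and the User-Agent string are illustrative assumptions, not part of the original code.

```python
import urllib.error
import urllib.request


def fetch_html(url, timeout=10):
    """Return the page at `url` as text, or None if the request fails.

    fetch_html is a hypothetical helper; the timeout and User-Agent
    values are arbitrary choices for illustration.
    """
    try:
        request = urllib.request.Request(
            url, headers={"User-Agent": "image-link-tutorial/0.1"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Use the charset declared in the headers, falling back to UTF-8
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except (urllib.error.URLError, ValueError) as exc:
        # URLError covers most network and HTTP errors;
        # ValueError is raised for a URL without a recognizable scheme
        print(f"Could not fetch {url}: {exc}")
        return None


# Usage (requires network access):
#   html = fetch_html("https://www.satyug.edu.in/")
```

The returned string can be passed straight to BeautifulSoup(html, 'html.parser'); since the text is already decoded, the from_encoding argument is not needed in that case.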

Step 3: Now use a for loop to go through every <img> tag that Beautiful Soup finds in the page and print its src attribute, which holds the image link. For a tag that has no src attribute, get() returns None.

# find_all() is the current name of the older findAll() alias
for img in soup.find_all('img'):
    print(img.get('src'))
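
Printing every src directly also prints None for any tag that lacks the attribute. The sketch below collects the links into a list and skips those tags. It relies only on the get() method, so it accepts Beautiful Soup tags as well as plain dictionaries, which stand in for tags in the sample. The function name collect_src and the data-src fallback (an attribute some pages use for lazily loaded images) are assumptions for illustration.

```python
def collect_src(tags, fallback_attr="data-src"):
    """Return the image link of each tag, skipping tags that have none.

    `tags` is any iterable of objects with a dict-style get() method,
    for example the result of soup.find_all('img').
    collect_src and fallback_attr are hypothetical names.
    """
    links = []
    for tag in tags:
        # Prefer src; fall back to the lazy-loading attribute if src is absent
        link = tag.get("src") or tag.get(fallback_attr)
        if link:
            links.append(link)
    return links


# Plain dicts stand in for <img> tags here
sample = [{"src": "a.jpg"}, {"alt": "no link"}, {"data-src": "b.jpg"}]
print(collect_src(sample))
```

With a parsed page, the call would be collect_src(soup.find_all('img')).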

So our final code will be:

import urllib.request
from bs4 import BeautifulSoup

# Download the page and parse its HTML
response = urllib.request.urlopen("https://www.satyug.edu.in/")
soup = BeautifulSoup(response, 'html.parser',
                     from_encoding=response.info().get_param('charset'))
# Print the src attribute of every <img> tag (None if the tag has no src)
for img in soup.find_all('img'):
    print(img.get('src'))

The output will look similar to this (each None line comes from an <img> tag that has no src attribute):

None
https://images.squarespace-cdn.com/content/v1/5c9f784af8135a660e96e75f/1553997059412-COA4BYPZXR4P1GJDIT26/manisha.jpg
None
https://images.squarespace-cdn.com/content/v1/5c9f784af8135a660e96e75f/1553997100051-7KG839JZ4KPXX9ATM9I1/pavi.jpg
None
https://images.squarespace-cdn.com/content/v1/5c9f784af8135a660e96e75f/1554053090130-Z8CGOIQSJVLEJ24GSMZ2/sidhant.jpg
None
https://images.squarespace-cdn.com/content/v1/5c9f784af8135a660e96e75f/1553997642821-HJAL30GTRB9AZQ1IALE9/nisha.jpg
None

Process finished with exit code 0
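
The src values in this output are already absolute URLs, but pages often use relative paths such as /logo.png. A small sketch, assuming a hypothetical helper named absolute_links and the placeholder domain example.com, that drops the None entries, resolves each link against the page URL with urllib.parse.urljoin, and removes duplicates while keeping the original order:

```python
from urllib.parse import urljoin


def absolute_links(page_url, srcs):
    """Resolve each src against page_url, skipping None and duplicates.

    absolute_links is a hypothetical helper, not part of the original code.
    """
    seen = set()
    result = []
    for src in srcs:
        if not src:
            continue  # the tag had no src attribute
        full = urljoin(page_url, src)
        if full not in seen:
            seen.add(full)
            result.append(full)
    return result


# example.com is a placeholder domain
links = absolute_links(
    "https://example.com/gallery/index.html",
    [None, "/logo.png", "photos/a.jpg", "//cdn.example.com/b.jpg", "/logo.png"],
)
for link in links:
    print(link)
```

In the final code, the list passed in would be built from img.get('src') for each tag, with the page URL as the first argument.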

In this way, we can extract all the image links from a web page.
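
If Beautiful Soup is not installed, the same idea can be sketched with the standard library class html.parser.HTMLParser instead. This is a swapped-in technique rather than the approach used above; the class name ImgSrcParser and the sample HTML are assumptions for illustration.

```python
from html.parser import HTMLParser


class ImgSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.links.append(src)


# Sample markup standing in for a downloaded page
sample_html = '<p>Team</p><img src="a.jpg"><img alt="no src"><img src="b.jpg"/>'
parser = ImgSrcParser()
parser.feed(sample_html)
print(parser.links)
```

Beautiful Soup remains the more convenient choice for anything beyond a single tag type, since it builds a searchable tree rather than reporting tags one at a time.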

I hope you like this article

Thanks
