+4 votes
in Programming Languages by (60.0k points)
I want to get the text of the alt attribute from img tags using the Python package BeautifulSoup. I am using the find_all() method to find all img tags on a web page, but I am not sure how to get the text from the alt attribute.


From the following HTML code, I want output="hello".

<img src="imgfile.png" alt="hello">

1 Answer

+3 votes
by (75.9k points)
Best answer

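You need to call the find_all() method with the keyword argument alt=True, which returns only the img tags that have an alt attribute. You can then read the alt text from each returned tag.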
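As a minimal offline sketch of that idea, the snippet below parses the img tag from the question directly from a string instead of fetching a page. The second img tag (without alt) and the variable names html and alts are made up for illustration, and the built-in 'html.parser' is used here only so the sketch does not depend on html5lib being installed.

```python
from bs4 import BeautifulSoup

# HTML from the question, plus a hypothetical img tag without alt
# to show that alt=True filters it out
html = '<img src="imgfile.png" alt="hello"><img src="noalt.png">'

soup = BeautifulSoup(html, 'html.parser')

# alt=True matches only img tags that carry an alt attribute;
# the attribute value is then read with dictionary-style access
alts = [img['alt'] for img in soup.find_all('img', alt=True)]

print(alts)  # ['hello']
```

If you drop the alt=True filter, img['alt'] raises a KeyError for tags without the attribute; img.get('alt') returns None in that case instead.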

Here is a complete example that fetches a web page:

The code iterates over the tags returned by the find_all() method and reads the alt text from each one using "alt" as a key.

from bs4 import BeautifulSoup

import urllib.request



# define a user agent so that the request is not declined

user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'

headers = {'User-Agent': user_agent}

url = 'your_url_containing_images'  # placeholder: replace with the URL of the page to scan



# request to open the web page

request = urllib.request.Request(url, None, headers)

response = urllib.request.urlopen(request)

soup = BeautifulSoup(response, 'html5lib')



# find all img tags that have an alt attribute and select the alt text

for foo in soup.find_all('img', alt=True):