
How to Extract All Links From a Web Page

The easiest method is to use Beautiful Soup, a powerful library for parsing HTML content.

from urllib.request import urlopen
from bs4 import BeautifulSoup

# Download the page and parse its HTML
r = urlopen("https://www.wikipedia.org/")
bs = BeautifulSoup(r.read(), "html.parser")
r.close()

# Print the href attribute of every <a> tag
for link in bs.find_all("a"):
    print(link.get("href"))

To run this code, you must first install the beautifulsoup4 package (imported as bs4):

pip install beautifulsoup4
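
Note that link.get("href") returns None for anchor tags without an href attribute, and many links on a page are relative paths rather than full URLs. A small variation of the example above, sketched here using urllib.parse.urljoin, skips missing hrefs and resolves relative links against the page's base URL:

from urllib.parse import urljoin
from urllib.request import urlopen
from bs4 import BeautifulSoup

base = "https://www.wikipedia.org/"
r = urlopen(base)
bs = BeautifulSoup(r.read(), "html.parser")
r.close()

for link in bs.find_all("a"):
    href = link.get("href")          # None when the <a> tag has no href attribute
    if href:
        print(urljoin(base, href))   # turn relative paths into absolute URLs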

