Making a Link Extractor In Python

Learn how to build a link extractor in Python from scratch that collects all the links on a web page. A tool like this is useful for web scraping, SEO diagnostics, and the information-gathering phase of a penetration test. In this tutorial, we’ll be using requests to make HTTP requests, BeautifulSoup to parse HTML, and colorama to change the text color in the terminal. We’ll build two functions: is_valid(url), which checks whether a URL is valid, and get_all_website_links(url), which returns all the valid URLs found on a web page. We’ll also see how to tell internal links from external ones, using colorama to print each kind in its own color, and how to remove HTTP GET parameters from the URLs.
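To make the URL handling concrete before touching the network, here is a minimal sketch that uses only the standard library's urllib.parse. The body of is_valid is an assumption about what "valid" means here (the URL has both a scheme and a host), and strip_get_params and is_internal are hypothetical helper names chosen for illustration. Fetching the page with requests and parsing it with BeautifulSoup are deliberately left out.

```python
from urllib.parse import urljoin, urlparse


def is_valid(url):
    """Return True if url has both a scheme (e.g. https) and a host."""
    parsed = urlparse(url)
    return bool(parsed.scheme) and bool(parsed.netloc)


def strip_get_params(base_url, href):
    """Resolve href against base_url, then drop the query string and fragment."""
    parsed = urlparse(urljoin(base_url, href))
    return parsed._replace(query="", fragment="").geturl()


def is_internal(url, domain_name):
    """Return True if url points to the same host as domain_name."""
    return urlparse(url).netloc == domain_name


if __name__ == "__main__":
    # example.com and the hrefs below are placeholder values, not real data
    page_url = "https://example.com/blog/"
    domain_name = urlparse(page_url).netloc
    hrefs = ["/about?ref=nav", "post-1#comments",
             "https://other.example.org/x?id=3", "mailto:me@example.com"]
    for href in hrefs:
        url = strip_get_params(page_url, href)
        if not is_valid(url):
            continue  # skip things like mailto: links that have no host
        kind = "internal" if is_internal(url, domain_name) else "external"
        print(kind, url)
```

Comparing hosts exactly is the simplest possible rule; it treats a subdomain such as blog.example.com as external, which may or may not be what you want for your site.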