Professional project
Python code solution and explanation
requests, bs4, BeautifulSoup
Introduction
Do you find yourself checking the price of an Amazon product every single day? How can you save as much as possible on a pretty expensive product?
How about an amazon price tracker. Today we’re going to make a program that alerts you when the price of a product on Amazon changes, or you can choose a budget. This will scrape Amazon for the current price of a specific product you choose.
Solution
Let’s start by importing the required libraries. The requests
library in Python is a useful tool for making HTTP requests to websites and processing the response data, allowing users to easily retrieve and interact with data. beautifulsoup
is used for web scraping and parsing HTML and XML documents, allowing users to easily extract data for further analysis.
import requests
from bs4 import BeautifulSoup
The product I will be tracking is an Apple AirPods 3rd Generation. You can choose your own product if you like.
Amazon link:
https://www.amazon.ca/
Apple AirPods 3rd Generation link:
https://www.amazon.ca/Apple-AirPods-3rd-Generation-Lightning-Charging/dp/B0BDHB9Y8H/ref=sr_1_4?crid=2DRBZVXW2QDPN&keywords=airpods+pro+2&qid=1675996169&sprefix=airpods+pro+2%2Caps%2C155&sr=8-4
This is what the website will look like.
URL = "https://www.amazon.ca/Apple-AirPods-3rd-Generation-Lightning-Charging/dp/B0BDHB9Y8H/ref=sr_1_4?crid=2DRBZVXW2QDPN&keywords=airpods+pro+2&qid=1675996169&sprefix=airpods+pro+2%2Caps%2C155&sr=8-4"
Now we have to pass in a header. In an HTTP request, a header is a dictionary with key-value pairs that provide additional information about the request context and the requested resource. This includes the type of content being sent, the preferred response format, or authentication credentials.
You can see your own browser headers by going here:
http://myhttpheader.com/
To actually access the Amazon page, you need to pass along “User-Agent” and “Accept-Language” at minimum.
header = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36",
"Accept-Language": "en-US,en;q=0.9,fr;q=0.8,te;q=0.7"
}
Now we can get a request from the URL and save the text. Remember to pass along the header in the headers
parameter.
response = requests.get(URL, headers=header)
page = response.text
We can use BeautifulSoup to grab a hold of the document using the HTML parser.
soup = BeautifulSoup(page, "html.parser")
Now that we’ve arrived at the Amazon product page, how will we grab the price? We need to find a specific class with the price.
We can use Ctrl + Shift + C to navigate to the text you want. Once you click on it, you will see the HTML code corresponding to it. The price had a class "a-price
–hole"
.
Now we can use the soup.find() method, and insert the class name. Remember to add the underscore after class_.
rough_price = soup.find(class_="a-price-whole").text
If we print rough_price
, we get this:
199.
To remove the dot, we can split this number into 2 parts using .split() and specify where you want to split it by. Then we can grab the first item on that list.
price = rough_price.split(".")[0]
If we print price
, we get this:
199
We can add a dollar sign and save that to final_price
.
final_price = f"${price}"
If we print final_price
, we get this:
$199
Solution
To view my full code, please visit my GitHub repository:
https://github.com/Gursehaj-Singh/amazon-price-tracker/tree/main
import requests
from bs4 import BeautifulSoup
URL = "https://www.amazon.ca/Apple-AirPods-3rd-Generation-Lightning-Charging/dp/B0BDHB9Y8H/ref=sr_1_2_sspa?crid=" \
"2BWTPNX2T3NV3&keywords=airpods&qid=1673738882&sprefix=airpod%2Caps%2C176&sr=8-2-spons&psc=1&spLa=ZW5jcnlw" \
"dGVkUXVhbGlmaWVyPUEzUEYxQ1hFNjhYSzlUJmVuY3J5cHRlZElkPUEwNzkwMjc4QTZKQk1JRTMxRVVNJmVuY3J5cHRlZEFkSWQ9QTA2N" \
"DkyMjkzNEhYUEsyNkxQVVNMJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ=="
header = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
"Chrome/102.0.0.0 Safari/537.36",
"Accept-Language": "en-US,en;q=0.9,fr;q=0.8,te;q=0.7"
}
response = requests.get(URL, headers=header) # , headers=header
page = response.text
soup = BeautifulSoup(page, "html.parser")
rough_price = soup.find(class_="a-price-whole").text
price = rough_price.split(".")[0]
final_price = f"${price}"
print(final_price)
Further Steps
Now that we have been able to scrape the price from the website, you can now further work with it.
For example, you may want to email or receive notifications when the price changes. Or you can keep track of price changes over time, so you can determine the best time to buy a product based on its price history. Maybe even make a chart using matplotlib. You may also set up price drop alerts so that you’re notified when the price of a product drops below a certain threshold.
Hope you enjoyed my code.