Webscrapping with Beautiful Soup #6

Open
wants to merge 3 commits into
base: main
5 changes: 5 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,5 @@
{
    "githubPullRequests.ignoredPullRequestBranches": [
        "main"
    ]
}
14 changes: 14 additions & 0 deletions Level 5/Beginners/newegg.md
@@ -0,0 +1,14 @@
Web scraping
The internet holds a huge amount of data that can be used for many purposes.
To collect this data, we need to know how to scrape it from a website.
Web scraping is the process of extracting and collecting data
from websites and storing it in a database or on a local machine.

For this project I will be using:
BeautifulSoup
Selenium
Requests
It is also important to understand the basics of HTML: tags, ids, and classes. They help us target the content we need.
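
As a minimal sketch of how tags, ids, and classes target content, the example below parses a small HTML snippet with BeautifulSoup. The markup, the `results` id, and the product names are invented for illustration. The `item-cell`, `item-title`, and `price-current` class names are the ones this project's script looks for, and a real Newegg page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example; real pages will differ.
SAMPLE_HTML = """
<div id="results">
  <div class="item-cell">
    <a class="item-title" href="/p/1">Example Console</a>
    <ul><li class="price-current">$499.99</li></ul>
  </div>
  <div class="item-cell">
    <a class="item-title" href="/p/2">Example Controller</a>
    <ul><li class="price-current">$69.99</li></ul>
  </div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Target by id: find() returns the first matching element, or None.
results = soup.find(id="results")

# Target by tag and class: the keyword is class_ because `class`
# is a reserved word in Python.
cells = results.find_all("div", class_="item-cell")

# Target with CSS selectors: select() returns a list of matches.
titles = [a.get_text(strip=True) for a in soup.select("div.item-cell a.item-title")]
prices = [li.get_text(strip=True) for li in soup.select("li.price-current")]

print(len(cells), titles, prices)
```

Because `find()` returns `None` when nothing matches, it is worth checking its result before calling `.get_text()` on it.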

Newegg: This is an eCommerce website. Choose any product category, scrape at least 500 products, and save each product's name, price, shipping type/price, and volume discounts.
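
One way to reach the 500-product target is to keep requesting result pages until enough rows are collected. The sketch below is only an outline under stated assumptions: `collect_products`, `parse_price`, and `fake_fetch_page` are hypothetical helpers, not part of this project, and the fake fetcher returns invented rows in place of real requests. How Newegg actually paginates its listings is not covered here and would need to be checked against the site.

```python
import csv
import io
import re

# Matches a dollar amount such as "$45.99" or "$1,234.50".
PRICE_RE = re.compile(r"\$\d[\d,]*(?:\.\d{2})?")


def parse_price(text):
    """Return the first dollar amount found in text, or '' if there is none."""
    match = PRICE_RE.search(text)
    return match.group(0) if match else ""


def collect_products(fetch_page, minimum=500, max_pages=100):
    """Call fetch_page(1), fetch_page(2), ... until `minimum` rows are collected.

    fetch_page is a hypothetical callable that returns a list of row dicts
    for one result page, or an empty list when there are no more pages.
    """
    products = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            break  # ran out of pages before reaching the target
        products.extend(batch)
        if len(products) >= minimum:
            break
    return products


def fake_fetch_page(page):
    """Stand-in for a real scraper: 36 invented rows per page, 20 pages total."""
    if page > 20:
        return []
    return [
        {
            "Product_Name": f"Item {page}-{i}",
            "Price": parse_price("$1,234.50 (2 Offers)"),
            "Shipping Type": "Free Shipping",
            "Discount": "NaN",
        }
        for i in range(36)
    ]


products = collect_products(fake_fetch_page, minimum=500)

# Write to an in-memory buffer here; a real script would open a file instead.
buffer = io.StringIO()
csv_writer = csv.DictWriter(
    buffer, fieldnames=["Product_Name", "Price", "Shipping Type", "Discount"]
)
csv_writer.writeheader()
csv_writer.writerows(products)

print(len(products))
```

Passing the page fetcher in as an argument keeps the stopping logic separate from the scraping code, so the loop can be tried without touching the network.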

68 changes: 68 additions & 0 deletions Level 5/Beginners/newegg.py
@@ -0,0 +1,68 @@
'''Web scraping is the process of extracting and collecting data
from websites and storing it in a database or on a local machine.

I will be using the BeautifulSoup and Requests packages to scrape the Newegg site.

Newegg is an eCommerce website. I will be scraping products from the Gaming category
and saving the product name, price, shipping type/price, and volume discounts.

To start scraping, let's import requests and Beautiful Soup and get the website URL.
It is also important to understand the basics of HTML and CSS selectors. We target content
on a website using HTML tags, classes, and/or ids.'''

# Import libraries
import requests
from bs4 import BeautifulSoup
from csv import writer

# Declare the URL of the page to scrape
url = "https://www.newegg.com/PS5-Systems/SubCategory/ID-3762?cm_sp=Cat_PlayStation_1-_-VisNav-_-PS5-Systems"

# Use the requests get method to fetch the page at that URL
response = requests.get(url)

# Check the status code: 200 means the request succeeded; 4xx and 5xx codes indicate an error
# print(response.status_code)

# Use BeautifulSoup to parse the page. response.content is an attribute (not a function)
# that holds the raw bytes of the response body.

content = response.content
soup = BeautifulSoup(content, 'html.parser')
# print(soup)
#<title>Computer Parts, PC Components, Laptops, Gaming Systems, and more - Newegg.com</title>

# print(soup.title.get_text())
# Computer Parts, PC Components, Laptops, Gaming Systems, and more - Newegg.com

# print(soup.body)
# This gives the body of the entire page

# Now we want the data we are looking for. First we find the parent element of each
# product, which gives us access to all the fields we need.

sections = soup.find_all('div', class_='item-cell')
# find_all returns every product cell on the page; we loop over them below
# to get the product name, price, shipping type, and discount.
# The keyword is class_ (with a trailing underscore) because class is a reserved
# word in Python; it matches the HTML class attribute.

# Open a CSV file and write the scraped data into it
with open('newegg_products.csv', 'w', encoding='utf8', newline='') as f:
    # The writer is responsible for writing rows into the file
    thewriter = writer(f)
    # Create the header row for the file
    header = ['Product_Name', 'Price', 'Shipping Type', 'Discount']
    thewriter.writerow(header)
    for section in sections:
        product_name = section.find('a', class_='item-title').get_text()
        product_price = section.find('li', class_='price-current').get_text()
        # Keep the first 7 characters, e.g. "$584.80". Shorter prices keep a trailing
        # space, and prices of $1,000 or more would be cut off.
        price = product_price[:7]
        product_ship_type = section.find('li', class_='price-ship').get_text()
        product_discount = section.find('span', class_='price-save-percent')
        # Not every product has a discount; record 'NaN' when the element is missing
        discount = 'NaN' if product_discount is None else product_discount.get_text()
        # print(discount)
        info = [product_name, price, product_ship_type, discount]
        thewriter.writerow(info)


39 changes: 39 additions & 0 deletions Level 5/Beginners/newegg_products.csv
@@ -0,0 +1,39 @@
Product_Name,Price,Shipping Type,Discount
PlayStation 5 Digital Console– God of War™ Ragnarök Bundle,$584.80,Free Shipping,NaN
PS5 Bundle - Includes PS5 Console and an Additional DualSense 5 Controller,$698.94,Free Shipping,NaN
"omarando PS5 Console Carrying Storage Case ,Compatible PS5 CD-ROM and Digital Edition, Nylon Waterproof Material,Zinc Alloy zipper, Including PS5 Game Controller Protection Case (Black-Blue)",$45.99 ,Free Shipping,NaN
SONY PLAYSTATION 5 DIGITAL EDITION CONSOLE,$536.05,Free Shipping,5%
PlayStation 5 Console - Horizon Forbidden West Bundle,$678.98,Free Shipping,NaN
"omarando PS5 Console Carrying Storage Case ,Compatible PS5 CD-ROM and Digital Edition, Nylon Waterproof Material,Zinc Alloy zipper, Including PS5 Game Controller Protection Case (Black-Green)",$49.99 ,Free Shipping,NaN
PlayStation PS5 Console,$629.80,Free Shipping,NaN
PlayStation 5 Digital Console,$549.89,Free Shipping,NaN
PlayStation®5 Digital Edition – Horizon Forbidden West™ Bundle,$549.99,Free Shipping,8%
"PlayStation 5 Disc Console God of War Ragnarok Bundle with Extra Controller, Cefesfy",$723.65,Free Shipping,NaN
PlayStation 5 Disc Edition Horizon Forbidden West Bundle with Ozeal charging station,$689.00,Free Shipping,NaN
PS5 Bundle - Includes PlayStation 5 Console and an Additional Cosmic Red DualSense Controller,$688.90,Free Shipping,NaN
PlayStation 5 Disc Bundle + DualSense Wireless Controller + SpiderMan: Miles Morales,$749.95,Free Shipping,NaN
Sony PlayStation 5 PS5 Console Disc Blu-Ray US Version,$639.99,Free Shipping,NaN
Sony PlayStation 5 PS5 Console Digital,$568.60,Free Shipping,NaN
New PlayStation 5 Console - Call of Duty Modern Warfare II Bundle,$769.99,Free Shipping,NaN
PlayStation 5 Console,$654.75,Free Shipping,NaN
PlayStation 5 Console Disc Version With Extra DualSense Wireless Controller,$699.70,Free Shipping,NaN
PS5 Core with Extra Red DualSense Controller and Accessories Kit,$829.99,Free Shipping,NaN
"PS5 Bundle: Includes PlayStation 5 Digital Console, Additional Cosmic Red DualSense Controller and PS5 Media Remote",$679.95,Free Shipping,NaN
PlayStation 5 Digital Edition Horizon Forbidden West Bundle with Ozeal charging station,$669.00,Free Shipping,NaN
PS5 Core with Extra Blue Dualsense Controller and Accessories Kit,$829.99,Free Shipping,NaN
PlayStation 5 Disc Edition Horizon Forbidden West Bundle with Two DualSense Controllers and Mytrix Dual Controller Charger,$799.99,Free Shipping,NaN
"SONY Playstation 5 Disc Gaming Console Horizon Forbidden West Bundle, JAWFOAL Accessories",$749.99,Free Shipping,NaN
"YPINGK PS5 Vertical Stand with Cooling Fan and Dual Controller Charger - Indicator Lamps and 15 Game Slots, Fast Cooling through Metal Base, PS5 Console Compatible",$49.99 ,$4.08 Shipping,NaN
PlayStation 5 Digital Edition with Two DualSense Controllers and Mytrix Dual Controller Charger,$614.99,Free Shipping,7%
"Newest PlayStation 5 Digital Version Video Game Console God of War Ragnarök Bundle, Up to 120 fps, 4K UHD Blu-ray Player, HDMI, USB, w/Pearlite Tech. High Speed HDMI Cable",$679.99,Free Shipping,NaN
PlayStation 5 Disc Edition Call of Duty Modern Warfare II Bundle with Ozeal charging station,$789.00,Free Shipping,NaN
PS5 Core with Extra Purple Dualsense Controller and Accessories Kit,$829.99,Free Shipping,NaN
"omarando PS5 Console Carrying Storage Case ,Compatible PS5 CD-ROM and Digital Edition, Nylon Waterproof Material,Zinc Alloy zipper, Including PS5 Game Controller Protection Case(Black-White)",$45.99 ,Free Shipping,NaN
"PS5 Bundle- Includes PlayStation 5 Digital Console, Additional DualSense Controller, PlayStation Media Remote, and DualSense Charging station",$749.95,Free Shipping,NaN
"PS5 Deluxe bundle: PS5 Disc Version + Wireless Controller+Five Games (Marvel's Spider-Man: Miles Morales, Cyberpunk 2077, Hitman 3, Demon's Soul and Assassin’s Creed Valhalla) +Ozeal Charging Station",$999.00,Free Shipping,9%
"Sony PlayStation 5 Digital Edition Console, JAWFOAL HDMI Cable",$554.98,Free Shipping,8%
PS5 Bundle: PS5 Disc Console+DualSense Wireless Controller + Watch Dogs: Legion and SpiderMan: Miles Morales+Ozeal charging station for PS5,$799.00,Free Shipping,NaN
PlayStation 5 Disc Console and PlayStation 5 - DualSense Wireless Controller - Midnight Black,$729.90,Free Shipping,NaN
"PS5 Bundle - Includes Playstation 5 Digital Console, Additional DualSense Controller and HD Camera for PS5",$699.95,Free Shipping,NaN
PlayStation 5 Console with Miles Morales Game and Accessories,$879.99,Free Shipping,NaN
SONY,$770.00,Free Shipping,NaN