The repository contains support material for the lab classes of the Algorithmic Methods for Data Science course, taught as part of the Data Science master's degree at Sapienza University. The class was taught during the following years:

The material for earlier years is available under the 2015-2020 branch, taught during:

Introduction to Python Basic & Compound data types and Pandas using Rome's Municipality Open Data Portal

The lab provides a quick introduction to basic operations on Python data types and on Pandas Series and DataFrames. It concludes with some hands-on data engineering on a dataset (in CSV format) provided by the Open Data portal of the City of Rome. In particular, the lab uses the dataset listing all hotels, hostels and, more generally, all structures that can receive guests within Rome and that were active during January 2019.
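
As a rough illustration of the kind of operations involved, the following sketch loads a small CSV extract into a Pandas DataFrame, filters it, and aggregates it. The sample rows and the column names (`name`, `type`, `municipality`, `rooms`) are assumptions made up for this example; they are not the actual schema of the Open Data portal dataset.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real dataset uses its own columns and values.
SAMPLE_CSV = """name,type,municipality,rooms
Hotel Aurora,Hotel,I,40
Ostello Centrale,Hostel,I,12
Casa Vacanze Tevere,Holiday home,II,3
Hotel Appia,Hotel,VII,25
"""

# Read the CSV text into a DataFrame (pd.read_csv also accepts a file path).
structures = pd.read_csv(io.StringIO(SAMPLE_CSV))

# A single column is a Series; a boolean mask built from it filters the rows.
hotels = structures[structures["type"] == "Hotel"]

# Count the structures per type and sum the rooms per municipality.
count_by_type = structures["type"].value_counts()
rooms_by_municipality = structures.groupby("municipality")["rooms"].sum()

print(hotels)
print(count_by_type)
print(rooms_by_municipality)
```

With the real dataset, the inline text would be replaced by the path of the downloaded CSV file, and the column names adjusted to match it.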

The lab material is available as an IPython notebook:

Data Visualization based on Matplotlib and Pandas using Rome's Municipality Open Data Portal

The lab provides a quick introduction to the basic operations of Matplotlib, a Python 2D plotting library. To visualize certain aspects of the data, the lab uses the dataset listing all hotels, hostels and, more generally, all structures that can receive guests within Rome and that were active during January 2019.
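
A minimal sketch of such a visualization is shown below: a bar chart of the number of structures per type, drawn through the Pandas plotting interface on a Matplotlib axis. The counts are illustrative values, not figures from the dataset, and the `Agg` backend is selected only so that the script runs without a display (inside a notebook that line is not needed).

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative counts per structure type (not real data).
counts = pd.Series({"Hotel": 2, "Hostel": 1, "Holiday home": 1})

# Create a figure with one axis and draw the Series as a bar chart on it.
fig, ax = plt.subplots(figsize=(6, 4))
counts.plot(kind="bar", ax=ax, color="steelblue")

# Label the axes and give the plot a title.
ax.set_xlabel("Structure type")
ax.set_ylabel("Number of structures")
ax.set_title("Structures by type (illustrative data)")
fig.tight_layout()

# Render the figure as PNG into memory; a file path works here as well.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```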

The lab material is available as an IPython notebook:

Data Engineering with Pandas and Document Databases using Rome's Municipality Open Data Portal

The lab carries out a series of data engineering tasks that combine two datasets: the list of all hotels, hostels and, more generally, all structures that can receive guests within Rome that were active during January 2019, and the corresponding list for January 2018. The goal is to compare the growth between the two years.
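
One possible way to express such a comparison in Pandas is sketched below: count the structures per type in each year, align the two counts on the type, and take the difference. The two small DataFrames and their column names are hypothetical stand-ins for the 2018 and 2019 datasets.

```python
import pandas as pd

# Hypothetical extracts of the two yearly datasets (made-up names and types).
structures_2018 = pd.DataFrame(
    {
        "name": ["Hotel Aurora", "Ostello Centrale", "Hotel Appia"],
        "type": ["Hotel", "Hostel", "Hotel"],
    }
)
structures_2019 = pd.DataFrame(
    {
        "name": [
            "Hotel Aurora",
            "Ostello Centrale",
            "Hotel Appia",
            "Ostello Termini",
            "Casa Vacanze Tevere",
        ],
        "type": ["Hotel", "Hostel", "Hotel", "Hostel", "Holiday home"],
    }
)

# Building a DataFrame from two Series aligns them on the union of their indexes;
# a type that is missing in one year shows up as NaN, which is replaced with 0.
counts = pd.DataFrame(
    {
        "2018": structures_2018["type"].value_counts(),
        "2019": structures_2019["type"].value_counts(),
    }
).fillna(0).astype(int)

# Absolute growth per structure type between the two years.
counts["growth"] = counts["2019"] - counts["2018"]

print(counts)
```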

The lab then uses a document database, such as MongoDB, to store and organize the data.
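
The sketch below illustrates the idea under a few assumptions: flat rows are reshaped into nested documents (plain Python dictionaries), which is the form a document database stores. The field names, the database name `rome` and the collection name `structures` are hypothetical. The `store_documents` helper relies on the `pymongo` driver and a reachable MongoDB server, so it is defined here but not called.

```python
import json

# Hypothetical flat rows, for example as read from a CSV file.
rows = [
    {"name": "Hotel Aurora", "type": "Hotel", "municipality": "I", "rooms": "40"},
    {"name": "Ostello Centrale", "type": "Hostel", "municipality": "I", "rooms": "12"},
]


def to_document(row, year):
    """Reshape one flat row into a nested document."""
    return {
        "name": row["name"],
        "type": row["type"],
        "year": year,
        "location": {"city": "Rome", "municipality": row["municipality"]},
        "rooms": int(row["rooms"]),
    }


def store_documents(documents, uri="mongodb://localhost:27017"):
    """Insert the documents into MongoDB (needs pymongo and a running server)."""
    from pymongo import MongoClient  # imported lazily: optional dependency

    client = MongoClient(uri)
    collection = client["rome"]["structures"]  # hypothetical database/collection
    result = collection.insert_many(documents)
    return len(result.inserted_ids)


documents = [to_document(row, 2019) for row in rows]
print(json.dumps(documents[0], indent=2))
```

Nesting the municipality under `location` is only one possible layout; the point is that each structure becomes a self-contained document rather than a table row.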

The lab material is available as an IPython notebook:

Web Scraping using the ParkRun Web site

The lab focuses on retrieving data from web pages by examining their HTML documents with the BeautifulSoup library. The process of fetching documents from the World Wide Web and extracting data from them is known as web scraping; in this lab we use the ParkRun web site.
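
To give an idea of the parsing step, the sketch below extracts the rows of an HTML table with BeautifulSoup. The HTML snippet, the `results` table id and the runner names are invented for this example and do not reflect the actual markup of the ParkRun pages; in practice the HTML would first be downloaded from the site.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded results page.
HTML = """
<html><body>
<table id="results">
  <tr><th>Position</th><th>Runner</th><th>Time</th></tr>
  <tr><td>1</td><td>Alice Example</td><td>18:42</td></tr>
  <tr><td>2</td><td>Bob Example</td><td>19:05</td></tr>
</table>
</body></html>
"""

# Parse the document with Python's built-in HTML parser.
soup = BeautifulSoup(HTML, "html.parser")

# Locate the table by its id, then walk its rows, skipping the header row.
table = soup.find("table", id="results")
results = []
for tr in table.find_all("tr")[1:]:
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    results.append({"position": int(cells[0]), "runner": cells[1], "time": cells[2]})

print(results)
```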

The lab material is available as an IPython notebook: