Materials for the Essentials for Data Sciences course: relational databases, SQL and relational objects.
- This course is Python-based and uses the following Python packages:
  - SQLAlchemy for database access and the SQL language.
  - pandas for data access and presentation.
- The materials are developed and tested in:
  - Python >=3.7
  - Jupyter notebook
  - Google Colaboratory.
- Understand general database concepts.
- Understand relational database design (data model, types and representations of relationships, normal forms).
- Practice the SQL language (SELECT queries of growing complexity, table JOIN operations, data content and table structure modification commands).
- Work with an Object Relational Mapper (use data from a relational database in object-oriented code).
- Relation/table
- Keys, primary keys, prime attributes
- Database design anomalies
- Database normalisation
- Types of relationships
- Column data types
- Advantages/disadvantages of relational databases
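
The ideas of primary keys, one-to-many relationships and normalisation listed above can be sketched in a few lines. This is an illustration only, not taken from the course notebooks: it uses Python's built-in `sqlite3` module rather than SQLAlchemy so that it runs without extra packages, and the `department` and `employee` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Normalised design: the department name is stored once and
# employees refer to it through a foreign key (one-to-many).
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department (dept_id, name) VALUES (1, 'Genetics'), (2, 'Physics');
INSERT INTO employee (emp_id, name, dept_id) VALUES
    (1, 'Ada', 1), (2, 'Ben', 1), (3, 'Cy', 2);
""")

# Renaming a department touches exactly one row, so no update anomaly
# can leave some employees with the old name.
conn.execute("UPDATE department SET name = 'Genomics' WHERE dept_id = 1")
rows = conn.execute("""
    SELECT e.name, d.name
    FROM employee AS e
    INNER JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()

# The foreign key rejects an employee that points at a missing department.
try:
    conn.execute("INSERT INTO employee (emp_id, name, dept_id) VALUES (4, 'Di', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Had the department name been stored in every employee row instead, the rename would need to update several rows consistently, which is the kind of anomaly normalisation is meant to prevent.
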
- Downloading and connecting to the example database: Lecture
- Querying and selecting data (SELECT, LIMIT, AS, ORDER BY, DISTINCT, WHERE, IN, BETWEEN, LIKE): Lecture, Exercises
- Grouping and summarising (GROUP BY, HAVING, COUNT, SUM, AVG, MIN, MAX, GROUP_CONCAT): Lecture, Exercises
- Modification statements (UPDATE, INSERT, DELETE): Lecture, Exercises
- Data definition language (CREATE TABLE, DROP TABLE): Lecture
- Joining tables 1 (INNER JOIN, LEFT JOIN, CREATE TEMP TABLE): Lecture, Exercises
- Joining tables 2 (UNION, EXCEPT, INTERSECT, self joins, CROSS JOIN, subqueries, EXISTS): Lecture, Exercises
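
To give a flavour of the statements covered in these sessions, here is a minimal sketch that touches selecting, grouping, joining and subqueries. It is not the course's example database: the `customer` and `purchase` tables and their contents are hypothetical, and the standard-library `sqlite3` module stands in for SQLAlchemy to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE purchase (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'Ann', 'Leeds'), (2, 'Bob', 'York'), (3, 'Cat', 'Leeds');
INSERT INTO purchase VALUES (1, 1, 10.0), (2, 1, 30.0), (3, 2, 5.0);
""")

# Selecting: filter with WHERE, sort with ORDER BY, cap rows with LIMIT.
leeds = conn.execute(
    "SELECT name FROM customer WHERE city = ? ORDER BY name LIMIT 5", ("Leeds",)
).fetchall()

# Grouping: totals per customer, keeping only groups above a threshold.
big_spenders = conn.execute("""
    SELECT customer_id, COUNT(*) AS n, SUM(amount) AS total
    FROM purchase
    GROUP BY customer_id
    HAVING SUM(amount) > 20
""").fetchall()

# LEFT JOIN keeps customers without purchases; their count is 0.
counts = conn.execute("""
    SELECT c.name, COUNT(p.id) AS n_purchases
    FROM customer AS c
    LEFT JOIN purchase AS p ON p.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# Correlated subquery with NOT EXISTS: customers who never bought anything.
no_purchases = conn.execute("""
    SELECT name FROM customer AS c
    WHERE NOT EXISTS (SELECT 1 FROM purchase AS p WHERE p.customer_id = c.id)
""").fetchall()
```

The `?` placeholder in the first query passes the value separately from the SQL text, which is generally preferable to building queries by string concatenation.
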
- Building an object-oriented interface to a database: Practical
(This is one long session, but it cannot easily be split into smaller parts without major code repetition.)
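
As a rough idea of what an object-oriented interface to a database involves, the sketch below maps table rows to Python objects by hand. It is not the practical's code and it is not an ORM library: the course itself uses SQLAlchemy, whereas this example uses only the standard library, and the `Book` class, `BookRepository` class and `book` table are hypothetical names chosen for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Book:
    """Plain Python object mirroring one row of the hypothetical book table."""
    id: int
    title: str
    year: int


class BookRepository:
    """Hand-written mapping layer; an ORM would generate this SQL for you."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS book "
            "(id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER NOT NULL)"
        )

    def add(self, title: str, year: int) -> Book:
        # Insert a row and return it as an object carrying the generated key.
        cur = self.conn.execute(
            "INSERT INTO book (title, year) VALUES (?, ?)", (title, year)
        )
        return Book(id=cur.lastrowid, title=title, year=year)

    def get(self, book_id: int) -> Optional[Book]:
        # Look up one row by primary key; None if it does not exist.
        row = self.conn.execute(
            "SELECT id, title, year FROM book WHERE id = ?", (book_id,)
        ).fetchone()
        return Book(*row) if row else None

    def published_after(self, year: int) -> List[Book]:
        # Turn every matching row into a Book object.
        rows = self.conn.execute(
            "SELECT id, title, year FROM book WHERE year > ? ORDER BY year", (year,)
        )
        return [Book(*row) for row in rows]


repo = BookRepository(sqlite3.connect(":memory:"))
first = repo.add("Book A", 1970)
repo.add("Book B", 1999)
recent = repo.published_after(1980)
```

Calling code works with `Book` objects and never sees SQL, which is the separation an Object Relational Mapper provides in a more general and automated way.
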