Lecture 25 - Analysing Classic Video Games & Studios in SQL
In this session, you will practice your SQL skills by creating and querying a database of classic video games and their development studios. You will use SELECT
statements, aggregate functions, joins, and other SQL features to answer a series of questions.
pd.read_sql()
to execute your query and display the results as a pandas DataFrame.We will work with two tables: studios
and games
.
studios
table:
Column | Type | Description |
---|---|---|
studio_id |
INTEGER | Primary Key for the studio. |
studio_name |
TEXT | The name of the game development studio. |
country |
TEXT | The country where the studio is based. |
year_founded |
INTEGER | The year the studio was founded. |
games
table:
Column | Type | Description |
---|---|---|
game_id |
INTEGER | Primary Key for the game. |
title |
TEXT | The title of the video game. |
studio_id |
INTEGER | Foreign Key referencing studios(studio_id) . |
genre |
TEXT | The genre of the game (e.g., RPG, Platformer). |
release_year |
INTEGER | The year the game was released. |
units_sold_millions |
REAL | Units sold, in millions. |
# Drop tables if they already exist to ensure a clean slate
cursor.execute("DROP TABLE IF EXISTS games;")
cursor.execute("DROP TABLE IF EXISTS studios;")
# Create studios table
cursor.execute("""
CREATE TABLE studios (
studio_id INTEGER PRIMARY KEY,
studio_name TEXT NOT NULL,
country TEXT,
year_founded INTEGER
);
""")
# Create games table
cursor.execute("""
CREATE TABLE games (
game_id INTEGER PRIMARY KEY,
title TEXT NOT NULL,
studio_id INTEGER,
genre TEXT,
release_year INTEGER,
units_sold_millions REAL,
FOREIGN KEY (studio_id) REFERENCES studios(studio_id)
);
""")
print("Tables 'studios' and 'games' created successfully.")
# Insert data into studios table
studios_data = [
(1, 'Nintendo', 'Japan', 1889),
(2, 'Blizzard Entertainment', 'USA', 1991),
(3, 'Square Enix', 'Japan', 2003),
(4, 'CD Projekt Red', 'Poland', 1994),
(5, 'Rockstar Games', 'USA', 1998),
(6, 'FromSoftware', 'Japan', 1986)
]
cursor.executemany("INSERT INTO studios VALUES (?, ?, ?, ?)", studios_data)
# Insert data into games table
games_data = [
(1, 'The Legend of Zelda: Ocarina of Time', 1, 'Action-Adventure', 1998, 7.6),
(2, 'World of Warcraft', 2, 'MMORPG', 2004, 12.0),
(3, 'Final Fantasy VII', 3, 'RPG', 1997, 10.0),
(4, 'The Witcher 3: Wild Hunt', 4, 'RPG', 2015, 50.0),
(5, 'Super Mario 64', 1, 'Platformer', 1996, 11.9),
(6, 'Grand Theft Auto V', 5, 'Action-Adventure', 2013, 195.0),
(7, 'Diablo II', 2, 'Action RPG', 2000, 4.0),
(8, 'Chrono Trigger', 3, 'RPG', 1995, 2.63),
(9, 'Elden Ring', 6, 'Action RPG', 2022, 23.0),
(10, 'Red Dead Redemption 2', 5, 'Action-Adventure', 2018, 61.0)
]
cursor.executemany("INSERT INTO games VALUES (?, ?, ?, ?, ?, ?)", games_data)
conn.commit()
print("Data inserted successfully.")
SUM
and WHERE
ORDER BY
and LIMIT
LEFT JOIN
List all games and their studio names. Ensure that all games are listed, even if their studio is not in the studios
table (which is not the case in our sample data, but the query should be written to handle it).
GROUP BY
with an Aggregate FunctionGROUP BY
with HAVING
LIKE
CASE
Categorise games based on their sales: ‘Blockbuster’ for games with over 50 million units sold, ‘Hit’ for games with 10 to 50 million units sold (inclusive), and ‘Classic’ for games with under 10 million units sold. Show the game title
and its sales_category
.
WHERE
ClauseRANK
)