Lecture 11 - Introduction to SQL
sqlite3
.quit to exit the SQLite prompt!sqlite3 DLL and the command-line shell (sqlite-dll... and sqlite-tools...)C:\>sqlite and unzip the files in this folder, which will give you sqlite3.def, sqlite3.dll and sqlite3.exe filessqlite3 file to open the software, and you will see the SQLite promptEdit the system environment variablesEnvironment Variables...C:\sqlite to Path in both or either User variables and/or System variables, then you can use SQLite on terminal (command prompt)SQLite3 Editor in the extensions tab in VSCodeInstall to install the extensiongit diff Download it here or search for it in the extensions tab in VSCode
.sqlite or .db and open it with the SQLite3 Editorexample.db and open it with the extensionOpen to the Side – same thing)driver_id, driver_name, nationality, and victories:
driver_id is an integer, not null, and primary key (select these options in the editor)driver_name is a text variablenationality is a text variable toovictories is an integercommit to save the changesTEXT: Stores string values of any length using UTF-8 encodingINTEGER: Holds whole numbers (positive or negative)NUMERIC: Stores decimal numbers as floating-point valuesBLOB: Contains binary data preserved as inputNULL: Represents missing or undefined dataREAL, BOOLEAN, DATE, TIME, and DATETIMEPRIMARY KEY constraint is used to uniquely identify each row in a tableNULL values and are usually integersAUTOINCREMENT keyword
driver_id INTEGER PRIMARY KEY AUTOINCREMENTFOREIGN KEY constraint is used to link two tables together
PRIMARY KEY in another tableROWID to uniquely identify each row in a table, but it is not recommended
customers, products, and orderscustomers table will have a primary key customer_idproducts table will have a primary key product_idorders table will have a primary key order_idorders table will also have two foreign keys: customer_id and product_id| customer_id (PK) | first_name | last_name | |
|---|---|---|---|
| 1 | John | Doe | john.doe@email.com |
| 2 | Jane | Smith | jane.smith@email.com |
| 3 | Bob | Johnson | bob.j@email.com |
| product_id (PK) | product_name | price |
|---|---|---|
| 101 | Laptop | 999.99 |
| 102 | Smartphone | 599.99 |
| 103 | Wireless Mouse | 29.99 |
| order_id (PK) | customer_id (FK) | product_id (FK) | quantity | total_price |
|---|---|---|---|---|
| 1 | 1 | 101 | 1 | 999.99 |
| 2 | 2 | 102 | 2 | 1199.98 |
| 3 | 3 | 103 | 3 | 89.97 |
INSERT INTO commandVALUES to the bottom of the table;)Query Editor and paste the following code:INSERT INTO drivers VALUES (2, 'Michael Schumacher', 'German', 53);
INSERT INTO drivers VALUES (3, 'Sebastian Vettel', 'German', 91);
INSERT INTO drivers VALUES (4, 'Fernando Alonso', 'Spanish', 32);Or more efficiently:
INSERT INTO drivers (driver_id, driver_name, nationality, victories) VALUES
(2, 'Michael Schumacher', 'German', 53),
(3, 'Sebastian Vettel', 'German', 91),
(4, 'Fernando Alonso', 'Spanish', 32);▶️️ Execute ️to run the codeSELECT command*) from the drivers tableSELECT is probably the most used command in SQL and is used to extract data from a databaseSELECT to see the resultsClick on Query Editor and create a table called students with the following columns:
student_id (integer not null primary key)student_name (text)major (text)gpa (numeric)Add the following rows to the table:
student_id: 1, student_name: ‘John Doe’, major: ‘Computer Science’, gpa: 3.5
student_id: 2, student_name: ‘Jane Doe’, major: ‘Mathematics’, gpa: 3.8
Query the table to see the results
If the table already exists on your database, you can drop it with the following command:
PRAGMA table_info() commandPRAGMA table_info() command returns one row for each column in the tablecid: Column IDname: Column nametype: Data typenotnull: 1 if the column cannot contain NULL values, 0 otherwisedflt_value: Default value for the columnpk: 1 if the column is part of the primary key, 0 otherwiseCREATE TABLE command, as we saw beforeIF NOT EXISTS clause to avoid errors if the table already exists
CREATE TABLE IF NOT EXISTS table_name (column1_name data_type, ...);ALTER TABLE table_name ADD column_name data_type;ALTER TABLE table_name DROP COLUMN column_name;ALTER TABLE table_name RENAME COLUMN old_name TO new_name;DROP TABLE command
DROP TABLE table_name;DELETE FROM command
DELETE FROM table_name WHERE condition;students tableALTER TABLE students ADD COLUMN age INTEGER;ALTER TABLE students DROP COLUMN major;ALTER TABLE students RENAME COLUMN gpa TO grade;student_id = 2: DELETE FROM students WHERE student_id = 2;SUM() or AVG()AS is an alias which assigns a name to the aggregate statisticSELECT SUM(victories) AS sum_victories,
COUNT(*) AS num_rows,
AVG(victories) AS mean_victories,
MIN(victories) AS min_victories,
MAX(victories) AS max_victories
FROM drivers;SELECT command to get the average victories (a subquery)WHERE clause. See the reason why here. More about subqueries here.HAVING)GROUP BY to group data by a columnSELECT nationality,
SUM(victories) AS sum_victories,
AVG(victories) AS mean_victories,
MIN(victories) AS min_victories,
MAX(victories) AS max_victories
FROM drivers
GROUP BY nationality;ROUND() to round the valuesSELECT nationality,
SUM(victories) AS sum_victories,
ROUND(AVG(victories), 1) AS mean_victories,
MIN(victories) AS min_victories,
MAX(victories) AS max_victories
FROM drivers
GROUP BY nationality;HAVING to filter variables after aggregatingWHERE, which is used to filter variables before aggregatingSELECT nationality,
SUM(victories) AS sum_victories,
ROUND(AVG(victories), 1) AS mean_victories,
MIN(victories) AS min_victories,
MAX(victories) AS max_victories
FROM drivers
GROUP BY nationality
HAVING SUM(victories) > 50;HAVING must be used after GROUP BY and before ORDER BYHAVING is used to filter the results of the GROUP BY clauseWHERE is used before aggregating and HAVING is used after aggregating!| SQL Command | Pandas Equivalent |
|---|---|
SELECT |
df[['column1', 'column2']] |
INSERT INTO |
df.loc[new_index] = new_row or df = df.append(new_row, ignore_index=True) |
WHERE |
df[df['column'] == value] |
GROUP BY |
df.groupby('column') |
HAVING |
df.groupby('column').filter(lambda x: condition) |
ROUND() |
df['column'].round(decimals) |
| Aggregate Functions | df.groupby('column').agg({'col1': 'sum', 'col2': 'mean', 'col3': 'min', 'col4': 'max'}) |
Other Tools and then Export / Importsqlite3 example.db .mode csv .output example.csv SELECT * FROM drivers;sqlite3 example.db .mode csv .import example.csv drivers;students table and add the rowsSELECT driver_name, nationality, victories
FROM drivers
WHERE (nationality = 'German' OR nationality = 'British') AND victories > 50;
SELECT driver_name, victories
FROM drivers
WHERE victories > (SELECT AVG(victories) FROM drivers);