class: middle # .eight[CSET 155:] ## .eight[Database Design] --- class: middle # Table Design --- class: middle # Agenda 1. [ ] [Case Conventions](#case) 2. [ ] [Table Design Rules](#rules) 3. [ ] [More on Constraints](#constraints) 4. [ ] [Altering Tables](#alter) ??? 5. [ ] [Indexes](#indexes) --- class: center, middle # [Get the Code for Today](https://raw.githubusercontent.com/anthonydb/practical-sql/master/Chapter_07/Chapter_07.sql) (right click and "Save Link As..." or just keep open in new tab) --- name: case # Case Conventions - .eight[camelCase] - .eight[PascalCase] - .eight[snake_case] Depends on your company. And if you choose one, stick with it! --- # Case Conventions Values in SQL are case insensitive. ```sql CREATE TABLE customers; CREATE TABLE Customers; ``` The second query will .eleven[fail] because that table already exists. --- # Case Conventions If you need case sensitivity, use quotes: ```sql CREATE TABLE "Customers"; ``` --- # Problems with Quotes If you use quotes to create a case-sensitive table, you **always** have to use quotes to refer to it: ```sql SELECT * FROM customers; SELECT * FROM Customers; SELECT * FROM "Customers"; ``` The first two queries are the same. --- # Problems with Quotes You can also use quotes to escape reserved words and characters: ```sql CREATE TABLE "CREATE TABLE"; ``` --- class: center, middle ![i also like to live dangerously](https://media0.giphy.com/media/kYkQYXkO3XyRa/giphy.gif?cid=3640f6095c4f13f56671465a36c99ae1) --- name: rules # Rules to Avoid Danger 1. Stick with snake_case names 2. Use proper names, not cryptic abbreviations 3. Watch out for length limits on names 4. Pluralize your table names 5. Use constraints! --- name: constraints # More on Constraints - Rules created for a Table or a Column - Defined in a .eight[CREATE] or .eight[ALTER TABLE] statement - Checked on .eight[INSERT], .eight[UPDATE], .eight[DELETE], etc. [You read the documentation, right?](https://www.postgresql.org/docs/current/ddl-constraints.html) --- # More on Constraints Table or Column Constraint? - Data Types - [Primary Key](#pk) - [Foreign Key](#fk) - [Unique](#unique) - [Check](#check) - [Not Null](#null) --- name: pk # Primary Keys: Natural - Already exist, so no extra column is needed - Already have meaning Examples: License Plate, Book ISBN, Product SKU --- # Primary Keys: Surrogate - Intentionally don't have meaning - Can save space depending on data type Examples: Serial integers, UUID, or custom values --- # Primary Keys: Syntax .fourteen[Run Listing 7-1] - Column constraint syntax in the first statement - Table constraint syntax in the second .fourteen[Run Listing 7-2 to see a key violation error] What is the reason for the two different syntaxes? --- # Primary Keys: Composite Keys .fourteen[Run Listing 7-3] - Table constraint syntax allows for comma-separated list - Column constraint syntax is quicker and more readable for single column keys .fourteen[Run Listing 7-4 to see another key violation error] --- # Primary Keys: Surrogate Keys - If there isn't a natural key, use a surrogate - If you don't know which surrogate, use .eight[bigserial] .fourteen[Listing 7-5 shows use of bigserial] There are reasons to avoid auto-incrementing keys, but if you don't have problems, you have nothing to fix. --- name: fk # Foreign Keys: Syntax Just like Primary Keys, Foreign Keys help maintain .eight[referential integrity]. .fourteen[Run Listing 7-6 one query at a time.] How many keys do these tables have? --- # Foreign Keys: Syntax How many keys do these tables have? 1. First table has a natural primary key 2. Second table has a natural composite primary key 3. Second table has a foreign key Did you get an error? Should you have? --- # Foreign Keys: Cascading Delete - Attempting to insert a reference that doesn't exist should fail. - What about removing a reference already in use? ```sql fk_col datatype REFERENCES tableB (pk_col) ON DELETE CASCADE ``` --- name: check # Check - Like PKs, can be column or table constraints - Inserts must pass a logical expression: ```sql age date CHECK (age > '1900-01-01 00:00:00'), ``` ```sql CONSTRAINT grad_check CHECK (credits >= 120 AND tuition = 'PAID') ``` .fourteen[Run Listing 7-7] --- name: unique # Unique One difference between Primary Key and Unique: NULL ```sql CONSTRAINT unique_users UNIQUE (username) ``` .fourteen[Run Listing 7-8 to see an example] --- name: null # Not Null When you need values. ```sql CREATE TABLE orders ( ... quantity integer NOT NULL ); ``` .fourteen[Run Listing 7-9] But you also might want to look at [default values](https://www.postgresql.org/docs/current/ddl-default.html). --- name: alter # Altering Tables So you made a constraint and now you don't want to follow it: ```sql ALTER TABLE table_name DROP CONSTRAINT constraint_name; ``` Or ```sql ALTER TABLE table_name ALTER COLUMN col_name DROP NOT NULL; ``` .fourteen[Run Listing 7-10] --- # Altering Tables Quick note: .eleven[You can only add a constraint if the data already follows it.] Update the data first, then add your new rule. --- # Great Debate on Constraints - Business logic in the database? - Business logic in the code? - Both? --- class: middle, center # .eight[Who's ready to work with a million records?] --- name: indexes # Indexes Speed up queries by using a shortcut, like indexes in books. PostgreSQL offers a few types: - Balanced-Tree Index - Generalized Inverted Index - Generalized Search Tree Primary Keys and Unique constraints automatically create indexes. --- # Indexes 1. Right click [this link](https://raw.githubusercontent.com/anthonydb/practical-sql/master/Chapter_07/city_of_new_york.csv) and save the CSV file 2. It's data for 940,374 addresses in New York City 3. Use .fourteen[Listing 7-11] to create the table and import the file 4. Verify it worked with a quick .eight[SELECT *] query 5. Use .fourteen[Listing 7-12] to run some benchmark queries You should see a line with .eight[Execution time:] in the output. Record the time for each query. --- # Indexes Let's make it run a little faster: ```sql CREATE INDEX street_idx ON new_york_addresses (street); ``` .eleven[Note: EXPLAIN and CREATE INDEX are Postgres-specific.] 1. Run this command and Postgres will build the index from the data 2. Now rerun the benchmark queries again and record their new times 3. Be amazed. --- # Indexes - Add size and maintenance costs - Look at what kind of indexes exist on your DBMS - Use indexes on columns you'll use in JOINs - Use indexes on columns used in WHERE clauses - Use EXPLAIN to test performance, just in case