Week 6 - DML & DDL

Week 6 - DML & DDL

This week we'll learn how to use commands from DML & DDL, including INSERT, UPDATE, DELETE, REPLACE, CREATE, ALTER and DROP. These commands are very powerful, but I think you'll find them simple to use!

Here's what we're going to do today:

  1. Categories of SQL Commands
  2. Data Manipulation Language
    1. INSERT
    2. REPLACE
    3. UPDATE
    4. DELETE
  3. Data Definition Language
    1. CREATE
    2. ALTER
    3. DROP

How to read these examples

I'm using a little shorthand in the examples this week.

-- Curly braces mean "fill in the blank"
SELECT {column1}, {column2}
FROM {table}
-- Square braces mean "this is optional".
SELECT * FROM {table}
[WHERE 1 + 1 = 2]

Categories of SQL Commands

Last week we touched on the INSERT command, but, up to that point, we'd been working exclusively with the SELECT command.

Not only are there a bunch more SQL commands, there are several categories of commands.

SELECT is usually put in it's own category, called DQL, which stands for "Data Query Language".

DML & DDL

In addition to DQL, today we add two new categories of SQL commands to our toolkit: Data Manipulation Language (DML), and Data Definition Language (DDL)1.

If you're familiar with the term 'CRUD' (short for "Create, Read, Update, Delete"), you'll recognize that thus far, we've only learned how to do the 'R' - reading data (by selecting it to show in a results table).

Today we're learning how to, um, 'CUD', I guess.

Both DML & DDL have commands for creating, updating and deleting.

In DML, create rows (INSERT), update rows (UPDATE/REPLACE), and delete rows (DELETE).

In DDL, we learn how to do the same thing, but affecting things besides rows - today we'll focus on doing this with tables. We can create (CREATE), update (ALTER), and delete (DROP) all kinds of things - tables, databases, etc.

Data Manipulation Language

Data Manipulation Language (DML), is, as you might expect, a category of commands for, well, manipulating data. Think of it as our way of affecting rows in our database tables.

Using DML, you can add rows, remove rows, or change the data in rows.

INSERT

INSERT, as we saw last week, is how we add rows to a database table.

The basic syntax is:

INSERT INTO {table}
  ({column1}, {column2})
VALUES ({value1}, {value2});

We saw the variation in syntax where we can omit the column names (and the parentheses that surround them) if you provide a value for each column in the correct order.

INSERT INTO {table}
VALUES ({value1}, {value2});

You can also add values for specific columns (omitting others) if you specify which columns the values are intended for, like this:

INSERT INTO {table}
  ({column5}, {column3})
VALUES ({value5}, {value3});

You can add values for multiple rows by adding multiple sets of parentheses, separated by commas, after VALUES like this:

INSERT INTO {table}
  ({column1}, {column2})
VALUES 
  ({value1}, {value2}), 
  ({value1}, {value2}), 
  ({value1}, {value2});

Heads up: adding values for multiple rows like this can save time, but it can also be more work:

  • If there is an error in any one of your value sets, the entire statement will fail.
  • If you add more rows than the database allows for in a given statement, the entire statement will fail.

REPLACE

REPLACE works exactly the same as INSERT, with identical syntax (except for the command name):

REPLACE INTO {table}
  ({column1}, {column2})
VALUES 
  ({value1}, {value2});

The only difference between INSERT and REPLACE is that if a row already exists with a primary key identical to the one being inserted, an INSERT statement will fail, but a REPLACE statement will overwrite the old row.

If no matching primary key exists, then REPLACE will insert a net-new row.

UPDATE

UPDATE, well, it updates rows.

The basic syntax is:

UPDATE {table}
SET {column} = {value}
[WHERE {condition}];

UPDATE also accepts the ORDER BY and LIMIT clauses.

UPDATE {table}
SET {column} = {value}
[WHERE {condition}]
[ORDER BY {column}]
[LIMIT {value}];

There aren't a ton of use cases for this, as it's a bit of a 'shotgun' approach - but it would be useful for something like "mark the top ten albums as 'bestsellers'".

What's the difference between REPLACE and UPDATE?

UPDATE will not insert a new row, only update an existing one.

REPLACE replaces all values in a row, while UPDATE can target specific columns.

DELETE

DELETE, you guessed it, deletes rows.

The basic syntax is:

DELETE FROM {table}
[WHERE {condition}];

DELETE also accepts the ORDER BY and LIMIT clauses.

DELETE FROM {table}
[WHERE {condition}]
[ORDER BY {column}]
[LIMIT {value}];

DELETE from joined tables (in MySQL)

You can delete from joined tables in MySQL (meaning this will work in dBeaver), but not in SQLite (meaning this will not work in the browser demos).

Since we tend to join tables where the data has a relationship, it's not uncommon to want to delete data from tables that we might join.

Similarly, we may want to create a condition for our deletion using data from a table join.

In other words, you might want to join tables and delete data from one table, or both, depending on the circumstances.

The syntax for this is pretty straightforward - we simply specify the table(s) to target using the table alias. We reference the table alias between 'DELETE' and 'FROM', and then write our JOIN after 'FROM'.

The basic syntax is:

DELETE {alias1[, alias2]} 
    FROM {table1} {alias1}
    JOIN {table2} {alias2}
    ON {alias1}.{column} = {alias2}.{column}
    [WHERE {condition}];

Data Definition Language

Data Definition Language (DDL), is, as you might expect, a category of commands for, defining data. Where Data Manipulation Language allows us to add, change or remove rows, Data Definition Language provides us the tools to add, change or remove components of a database, and databases themselves.

Using DDL, you can add, remove, or change tables and databases2, and you can add things (like procedures, functions and views) that we haven't even learned about yet!

We're not going to do a deep dive on DDL today. After we come back from reading week, we'll start creating our own tables, but for now I just want us to take a look at the commands so they'll be familiar later.

In these examples below, we'll look at how DDL commands work with tables.

CREATE

CREATE, you guessed it, creates things (like tables).

The basic syntax is:

CREATE TABLE [IF NOT EXISTS] {table} (
    {column name} {data type} [NOT NULL] [DEFAULT {defaultvalue}] [AUTO_INCREMENT] [UNIQUE] [PRIMARY KEY],
    {column name} {data type}       
);

Okay, we've got a few things going on here, so let's break it down:

IF NOT EXISTS is optional (everything in the square brackets is optional in these syntax outlines).

If you omit IF NOT EXISTS, that doesn't mean your CREATE TABLE statement will overwrite an existing table. IF NOT EXISTS simply prevents the database from throwing a Table already exists error.

Inside the parentheses, you define the table's columns. Every column needs a name and a data type.

NOT NULL creates a rule, saying that the column will not accept null values. Anytime you try to insert a row, there MUST be non-null data being inserted into that column, or your insert will fail.

DEFAULT creates a rule, saying that if no value is provided for that column, it will get the specified value instead of being null.

AUTO_INCREMENT creates a rule, saying that if no value is provided for that column, it will be the value of that column from the previous row, plus one3. If it's the first row, it'll start at 1. Only one AUTO_INCREMENT column is allowed per table, and it must be set as a KEY.

UNIQUE creates a rule, saying that values in this column must be unique. Other rows cannot have a duplicate value in this column.

PRIMARY KEY declares this column as the primary key4. This also makes this column have the rules NOT NULL and UNIQUE. You don't need to specify NOT NULL and UNIQUE if you're defining a column as the primary key.

All these possible rules make up the "column definition". When I write '{definition}' in the future, it's referring to these properties.

ALTER

ALTER is a powerful command that can let us change a lot of things about a table. Today we'll just look at how it can help us add, change or remove columns.

The basic syntax for adding a column to an existing table:

ALTER TABLE {table}
ADD [COLUMN] {column} {data type} [{definition}];

The basic syntax for removing a column from an existing table:

ALTER TABLE {table}
DROP [COLUMN] {column};

Note that the word COLUMN is optional in the above statements. If you leave out COLUMN, the statement will do exactly the same thing. That said, I'd argue that being explicit about what you're doing increases the readability of the code, and that's always a good thing.

The basic syntax for altering a column, including the column name, in an existing table:

ALTER TABLE {table}
CHANGE 
  {old column name} {new column name} {data type} [{definition}];

The basic syntax for altering a column, not including the column name, in an existing table:

ALTER TABLE {table}
MODIFY {column} {data type} [{definition}];

Note that when you change or modify a column, you need to be explicit about the data type and definition. Once a column is changed or modified, it won't "remember" that it was NOT NULL (or whatever) before.

Every time you modify or change a column, you need to include the data type and the necessary definitions.

The basic syntax for dropping a column:

ALTER TABLE {table}
DROP COLUMN {column name};

It's almost too easy!

DROP

Dropping tables is easy! SO BE CAREFUL. It's really hard, and sometimes impossible, Opens in a new window to recover data from a dropped table.

The basic syntax for dropping a table:

DROP TABLE [IF EXISTS] {table};

Footnotes

1 Again, this is a matter of semantics, but some would omit DQL as a category entirely and consider SELECT to be a part of DML, since SELECT manipulates data en route to the results table. Others argue that this doesn't count as manipulating data, since results tables are temporary, and nothing in the database is affected by the query. And honestly, it makes me question whether I should even mention this stuff to you guys, because it's not going to help you remember this stuff even better, but I don't have the constitution to present information to you as if it's absolute truth when there are conflicting sources out there. Better, I think, for you to be confused in class, than to be given confidence in class, only to be confused once you're out on your own.
↑ Back to reference 1
2 I'm not gonna let you create, update or delete databases on my server, though. Get your own!
↑ Back to reference 2
3 ...by default. You can set up your AUTO_INCREMENT column to increment by other values.
↑ Back to reference 3
4 It's possible to make your primary key a combination of columns, but not by declaring it in the column definition. You have to use CONSTRAINT, but we're not going to learn about constraints yet - we'll save that until we're back from the break.
↑ Back to reference 4

Exercises!

  1. Create a new table in your database. Give it an auto-incrementing primary key, and one column each with the data type date, time, and decimal.
  2. Insert three rows, without explicitly adding a value to the primary key column.
  3. Replace one of the existing rows.
  4. Update one (and only one) of the other rows.
  5. Delete one row.
  6. Add a column. Make it so that this column will only accept whole numbers, and will not accept null values.
  7. Rename this column.
  8. Give this column a default value.
  9. Delete a column.
  10. Delete your table.