Week 9 - Views & Custom Functions

This week we'll cover creating views and custom functions in MySQL

Saving (things that aren't data) in the database
Views
Running Scripts instead of statements
What is a delimiter?
What is a function?
The syntax for creating a function
How to use custom functions
Deterministic vs Non-deterministic functions
Dropping a function
Altering a function

Saving (things that aren't data) in the database

For the next few weeks, we'll be looking at four things that you can save to your database that aren't tables, or rows or columns or data values.

Instead, they are custom tools for (ideally) making your database safer, simpler and easier to use.

This week we're looking at views & custom functions.

Views are essentially saved results tables that you can query.

Custom functions are just like the built-in scalar functions, like LOWER(), SUBSTR(), or ROUND(), except we get to define how the data is processed!

Next week, we'll look at triggers.

Triggers are things that automatically happen before or after data is changed in a table.

The week after, we'll look at stored procedures, which are programmed subroutines - real procedural programming!

Stored procedures are saved "scripts" that can run DML commands (CREATE, INSERT, UPDATE, DELETE, etc.) based on logic we create.

Views

What is a view?

Views are "virtual tables" - a saved query that returns a results table, which can then be used as if it were an existing table in the database.

You've probably noticed at this point that good database design requires splitting your data up into numerous tables, often requiring joins on three or more tables at a time. If you were required to write out these complex joins over and over again, that would be a lot of work!

That's where views come in - write your query once, and save it. Now you can not only easily fetch that data, but you can query the results table as if it were an existing database table.

Creating a VIEW

The syntax for creating a view is very simple:

CREATE VIEW {viewName} AS {query}

Let's create a view in our library table:

CREATE VIEW americanAuthors AS
SELECT * FROM authors
WHERE country IN ('USA', 'U.S.');

Then we can query the view in the same way we'd write a select statement, only using the view in place of a table:

SELECT * FROM americanAuthors
WHERE SUBSTR(first_name,1,1) = 'D';

So this is a new database table, right?

NOPE

A view is not a database table. A view is just a shortcut - a shortcut to running a query and getting back the results table.

Every time you interact with the view, you are running the saved query and getting fresh results. Views don't represent any stored data, just a stored query.

When to use views

You find yourself writing the same SELECT statements repeatedly.
You find yourself writing the same complex SELECT statements (i.e. with JOINs or complex WHERE clauses) occasionally.
You want to grant a user access to some of the columns in a table, but not all.

When not to use views

When you think you'll need to alter the table structure in the future, and you're worried you'll forget to update the views to match.

Do views impact performance?

Nope, views don't have an impact on performance, either good or bad. They're just stored queries. You can't index them, and they don't partition your tables. On the other hand, they don't eat up space or memory either, they're essentially no different than if you wrote out the queries long-hand.

The only consideration with views is whether they can make the database easier to use.

Can I use UPDATE, INSERT or DELETE with views?

You can use UPDATE, INSERT, and DELETE with some views.

There is a long list Opens in a new window of qualifications for a view that can can accept DML statements. What it boils down to is this: for a view to be "updateable", there must be a direct relationship between the rows in the view and the rows in the underlying table. This means:

No DISTINCT
No aggregate functions
No GROUP BY or HAVING
No outer joins

You can also only use DML on one of the underlying tables if your view contains an inner join.

And, of course all the constraints of the underlying table still apply.

Should I use UPDATE, INSERT or DELETE with views?

I'd argue that you should not. Whatever possible gains you might make in syntactical simplicity are outweighed by the risks of adding abstraction to your data manipulations.

Simpler to write? Maybe. Harder to keep safe and organized? Usually, yes.

How do I replace, alter or drop my view?

The syntax for replacing, altering and dropping views probably won't shock you...

Deleting a view:

DROP VIEW {viewName};

Altering a view:

ALTER VIEW {viewName}
AS {selectStatement};

Replacing a view (maybe slightly different than you'd expect):

CREATE OR REPLACE VIEW {viewName}
AS {selectStatement};

Running Scripts instead of statements

So far, the things we've been doing are SQL statements - telling our database engines to do one thing.

Today we're going to start talking about SQL scripts - telling our database engines to do multiple things.

This means we'll need to learn about delimiters, but first, we need to learn how to run scripts in DBeaver. Don't worry, it's just a different button!

You can highlight some code to run as a script, or you can press the 'run script' button, and DBeaver will execute all of the statements in your script (.sql document).

From now on, when you tell me "my code isn't working", the first thing I'm gonna ask is "did you run it as a script?"

What is a delimiter?

When we write in english, we write in sentences and paragraphs.

If you were to just write one sentence, you wouldn't really need something to say where the sentence stops, because there's only one sentence - it just ends

However, if you're writing a paragraph with multiple sentences, you need periods. This is how the reader knows where one sentence ends, and the next begins.

Periods are the default, but there are other ways to end a sentence, aren't there? There are exclamation points! And sometimes question marks? Or even ellipses...

In an SQL script, we have a default way of delimiting statements (or, in other words, showing where one statement ends and the next begins). We have the semicolon (;) as our default delimiter.

Today we'll be talking to the database about commands we want it to run later. These commands will have semicolons in them. However, when the database sees a semicolon, it will want to stop reading!

What we do to tackle this is to use the command DELIMITER to define a new delimiter symbol. Then, once we're done telling it about the commands we want it to run later, we'll use the command DELIMITER again to set the delimiter back to the default.

What is a function?

Very broadly, in programming a function is a block of code that does a job for us.

As you know by this point, SQL has lots of functions that are already available. They all have one thing in common - they return a single value (a single string, or number, or date, or time, etc.)

SQL has built-in scalar functions, that return one value, based on, at most, one row of data. Think of LOWER(), which we use to transform a string into all lower-case letters.

SQL also has built-in aggregate functions, which can take in a whole column of data. Think of SUM(), which adds up all the numbers in a column.

What is a custom function?

SQL also allows us to create our own functions. These functions can only be scalar functions. We get to decide what values (if any) can go into the function, and how they get transformed before they come out the other side, a.k.a get returned.

The syntax for creating a function

What is a parameter?

A parameter is a variable (a named placeholder) that we use to represent the value that will get input into the function.

Let's look at one of SQL's built-in functions:

SELECT UPPER('Simon');
-- Takes the input parameter, 
-- and outputs 'SIMON'

And of course, we can use the values from rows as our input paramters, with one value output for each row returned by the SELECT statement:

SELECT UPPER(first_name) FROM authors;

The input parameters in your custom functions are the placeholders for the values that get input between the parentheses when your function gets used in a SELECT statement. If your custom functions gets defined like this...

DELIMITER //
CREATE FUNCTION whoseWorld ( 
  personsName VARCHAR(255) 
)
RETURNS VARCHAR(263)
BEGIN
   RETURN CONCAT(personsName, '''s World');
END; //
DELIMITER ;

... you can use it in your SELECT statements like this:

SELECT whoseWorld(first_name) FROM authors;
-- Giovannino's World
-- Douglas's World
-- R.L.'s World
-- Denis's World
-- etc.

Notice in our whoseWorld() function, we're using the CONCAT() function. You can totally use functions within functions!

Okay, the punctuation nerd in me is looking at the whoseWorld() function and cringing, because if a name ends in the letter s, it shouldn't have an "'s" after it, just an apostrophe. Let's drop that function, and add to our function using an IF/ELSE statement!

Don't worry too much about picking up the IF/ELSE syntax this week - we'll learn all about how conditions & loops work in SQL next week.

SELECT whoseWorld(first_name) FROM authors;
-- Giovannino's World
-- Douglas' World
-- R.L.'s World
-- Denis' World
-- etc.

Much better!

That being said, our code is starting to look a little repetitious - writing and re-writing the whole formulation of the return value multiple times. Let's use a variable within the function to take care of that.

But first, let's take a little aside to see if we can use multiple parameters, and/or defaults for our parameters.

Multiple parameters

Can you have multiple input parameters for your custom function? Of course!

Default parameters

Want to set a default parameter in case someone forgets to pass in a value?

TOO BAD.

MySQL, unlike most other RDBMS' doesn't provide a way to set default values for parameters that don't get passed in - any missing parameter will trigger an error.

That being said, you can set null values to a default, but the null value still has to get passed in as an argument. I'll show you how to set a default for null values in a minute, after we discuss how to assign values to variables within the function with the SET command. But first:

The DECLARE command

Parameters aren't the only kind of variable we can use in our functions. We can also have "local" variables that have their values set not by an input, but within the function.

In order to use a local variable in your function, you have to give it a name and a data type. That's what the DECLARE statement is for. You have to do this first thing after the BEGIN command.

Okay, this may seem like we're making things more complicated... and we are. But with good reason! This will save us time, and help prevent errors, in the long run.

So, first we declare a variable called wrld. This holds the value that we ultimately output. This means we don't have to put our whole formula into a return statement, and we can manipulate our data in multiple steps before we output it.

Then we DECLARE and SET a variable called str. This holds a string that we were repeating in our code. Variables help reduce repetition, which means our code is easier to maintain (we only have to update it in one place), and easier to debug (either it's right or it's wrong, not a mix of the two).

Assigning a default to NULL arguments

Oh, by the way, an 'argument' is the value to pass into a function's parameter. I feel like I'd been writing code for a weirdly long time before I learned that distinction. Anyway...

Deterministic vs Non-deterministic functions

One way we can optimize the execution of our functions is by declaring them to be DETERMINISTIC.

A deterministic function is a function that always returns the same result based on the same inputs.

If you run the following SELECT statement, wait one minute, and then run it again, you'll get two different results, despite using the same input:

SELECT xHoursFromNow(2);

That makes xHoursFromNow() a non-deterministic function.

Custom functions are NON-DETERMINISTIC by default, despite the fact that most functions you write (basically all those that don't use NOW(), RAND() - which returns a random number, or similar) will be DETERMINISTIC.

How to use custom functions

I think you've picked up on this by now - you use them exactly the same as the built-in scalar functions you've already been using. You just write their name, and then an open parenthesis, and then any values you need to assign to the parameters. You get it.

Dropping a function

Okay, you've seen me do this, and I think you could've guessed it anyway:

DROP FUNCTION {functionName};

While you're developing a function, it might be handy to keep the DROP FUNCTION statement at the top of your script, so it automatically overwrites the existing definition of the function every time you run your script.

If you want to have it there the first time you run your script, you might want to add IF EXISTS, so that you won't get an error the first time you run your script:

DROP FUNCTION IF EXISTS {functionName};

Altering a function

You can't actually alter anything significant about a function after it's been created - just drop the function and recreate it with your alterations.

Finally, let's talk about your