Tag: sql

MySQL 8.0 joins Google Cloud SQL

This is the announcement blogpost and this is my overview taken from medium:

The one where MySQL 8.0 lands on Google Cloud SQL

Ada Doglace and Lily Grace (lilygrams)
Ada Doglace and Lily Grace (lilygrams). Photo by Anthony Ferrara.

There are many things that makes me happy. Puppies (see picture), food, wine and databases… (not particularly in that order). And things that makes me even happier such as a well designed schema and proper usage of ORM (Object Relational Mapping).

MySQL was the database I used to love to hate. It grew on me and the fact that long strides were made to make it more consistent and more modern kept me away from using other open source databases on a daily basis for my projects. And with grand expectation and admiration I am proud of the Cloud SQL team for achieving this milestone:

Today we are making available MySQL 8.0 on Cloud SQL. It’s not just a new version inside our managed databases portfolio, but it also comes with all Cloud SQL capabilities such as Automatic Storage Increase, High Availability, Cross-region Replication and PITR (point in time recovery). And best yet: It’s not a beta, it’s a GA launch! See the announcement.

You can start using today from scratch or migrate an existing MySQL database to Cloud SQL; as a way to minimize downtime you can use the External Replication feature from your Google Cloud Console or via the gcloud command-line tool.

MySQL 8.0 has a huge list of new features, and you can do a range of new operations and querying. You can even do silly things like Fibonacci sequence using SQL (it is true!), traverse a Binary Tree or even do the old fizz-buzz:

WITH RECURSIVE fizz_buzz (sequence, modulo_3, modulo_5) AS (
  SELECT 1, CAST('' AS CHAR(4)), CAST('' AS CHAR(5))
  UNION ALL
  SELECT sequence + 1,
         IF(MOD(sequence + 1, 3) = 0, 'Fizz', ''),
         IF(MOD(sequence + 1, 5) = 0, 'Buzz', '')
  FROM fizz_buzz
  WHERE sequence < 100
)
SELECT
       IF(CONCAT(modulo_3, modulo_5) = '', sequence, CONCAT(modulo_3, modulo_5)) AS fizzbuzz
FROM fizz_buzz;

On a more serious note, and enterprise worthy, you can now avoid sub-queries (or other N+1 problems), have better access control, and use Window Functions. There are plenty of new stuff, and this is a small list of talks on MySQL 8.0 I gave since the preview versions in 2017. One of my favorites is the following:

phpDay about MySQL 8.0 Features

At Google we define anyone that uses a keyboard to work on a technical aspect of a product a Technical Practitioner. It is a wider definition, however more inclusive to the DBAs, DevOps and SysAdmins alike.

My particular goal has always been to help the day-to-day practitioner do things they wouldn’t imagine their database was capable of. This is why I have the Office Hours. This is why I do those talks and should do more blog posts.

If you want to learn more about MySQL 8.0 I ask you to read my website: gabi.dev, where I posted several things on MySQL.

Happy Launch day!

Your SQL IS NOT JavaScript (neither PHP)

IS NULL, IS NOT NULL, truthiness among other assertions on MySQL and PostgreSQL

People like to complain about JavaScript, how can one thing be equal to the other, i.e. null == undefined evaluates to true unless you use the triple equals ===. Other dynamic typed languages have its peculiarities, such as Ruby where 0 == true evaluates to true, the reason for Ruby is that considers 0 as a value and any value evaluates to true.

How about SQL? The answer is… it depends.

Which database are you using?

Some may be more forgiving, like MySQL doing casts for you all over the place, or more strict like PostgreSQL where you can only compare the truthiness of something of the same type.

Assertion MySQL PostgreSQL
('A' = TRUE) IS TRUE 0 invalid input syntax for type boolean: “A”
('A' IS TRUE) IS TRUE 0 invalid input syntax for type boolean: “A”
(1 = TRUE) IS TRUE 1 operator does not exist: integer = boolean
(1 IS TRUE) IS TRUE 1 argument of IS TRUE must be type boolean, not type integer
('1' IS TRUE) IS TRUE 1 1
('0' IS FALSE) IS TRUE 1 1
(1 = '1') IS TRUE 1 1
(0 = '0') IS TRUE 1 1

Beyond the implications of wrong type comparison, as you can see on lines 1 through 4 where MySQL evaluates values as true or false, you should also worry about what the fact of a column being NULL may imply when comparing two columns.

Suddenly knowing if it is true or false doesn’t matter because you cornered yourself with a third possible value: NULL. By definition NULL is not a value neither a state, it should be considered “garbage”, and no column where you know its value and type would be garbage.

In this front both databases operate the same way:

Assertion SQL
(NULL = NULL) IS TRUE 0
(NULL = NULL) IS FALSE 0
(NULL IS NULL) IS TRUE 1
(0 = NULL) IS NULL 1
(1 = NULL) IS NULL 1
('A' = NULL) IS NULL 1
(TRUE = NULL) IS NULL 1
(FALSE = NULL) IS NULL 1
(0 IS NULL) IS TRUE 0
(0 IS NOT NULL) IS TRUE 0
(0 = NULL) IS TRUE 0
(0 = NULL) IS FALSE 0
(0 = NULL) IS NOT TRUE 1
(0 = NULL) IS NOT FALSE 1
('NULL' = NULL) IS TRUE 0
('NULL' IS NULL) IS TRUE 0

The highlighted parts are usually assumptions that people expect to behave differently, for example that zero is not equal to NULL. That catches many people by surprise. It shouldn’t catch you by surprise though, because 0 is a value. The default behavior of MySQL of casting NULL to 0, on a INTEGER NOT NULL column without a DEFAULT value taught a whole generation of developers that this assumption is true. The same applies when casting a string-based column that has no default value on a NOT NULL column to empty string.

Is that a spaceship operator?

While writing this post and taking a peek into MySQL documentation, I never noticed that a NULL-safe equal operator, <=>, existed, unfortunately the MySQL website only shows the documentation from 5.5 to 8.0, so I can’t be certain if this existed in prior releases.

SELECT
1 IS NOT NULL,
1 = NULL,
1 <=> NULL;
Assertion MySQL
1 IS NOT NULL 1
1 = NULL NULL
1 <=> NULL 0

The case where MySQL thinks it is PHP

Another behavior I discovered when doing some JOINs were the string and integer comparison. Both databases get SELECT 1 = '1'; as true, however, MySQL takes a step further:

SELECT 1 = '1a',
'1' * 3,
'1abc' + 4;
Assertion MySQL PostgreSQL
1 = '1a' 1 invalid input syntax for integer: “1a”
'1' * 3 3 3
'1abc' + 4 5 invalid input syntax for integer: “1abc”

Thus behaving like PHP. PHP is expected to change this behavior soon with this RFC. PostgreSQL, in this case, is forgiving in casting a string as an integer, only if there is an integer inside of the quotes. If you mix the integer with other characters, it throws an error as you can see above.

I Know!

No, I don’t. I thought I knew enough SQL. Apparently, I was wrong.

Independent of the language you are using, you should be aware that assertions that are true in your language may not be true in SQL and the other way as well.

Ramblings on optimizations, anti patterns and N+1

A lot of people ask me to teach them how to do query analysis and performance. The truth is: there isn’t a script to follow. The following paragraphs are a brain dump on what usually goes on my mind when I am debugging and analyzing.

Please comment on what you think I should focus on to cover here.

TL; DR;

  • It’s just a messy post with database-y stuff
  • This post doesn’t have a conclusion, it is just me laying down my thoughts on performance and optimizations.

Thoughts

Query performance is a really difficult subject to talk about. Mostly because because SQL is a declarative language, leaving it up to the Optimizer to decide which way is the best to retrieve the information needed and that is based in so many variables.

The most common problem regarding optimization I see, comes not from the Database itself, but how we handle the requests on the application layer, the following for instance would cause N+1 problems:

Code example:


users = User.all
users.each do |user|
puts "Name #{user.name}"
puts "Addresses: "
user.addresses.each do |address|
puts address.street
puts "#{address.city}#{address.state}"
end
end

view raw

n_plus_one.rb

hosted with ❤ by GitHub

Although seemingly innocent at first, this code could easily slow down performance on the database due to the amount of requests that would be made.

You also need to know about the intricacies of indexes, which one is the best, if you have a composite index, which should go first, and what happens if I only use one of the fields of a two column indexes in my search? Does it still uses the index somehow? Another rule of thumb is that if an index is a BTREE, on a single column, you can use it either ASC​ or DESC.

Or better yet: why my transactions are taking so long to complete? Does it have too many indexes on the table? Is any other query locking table X?

Even a single ​INNER JOIN could be highly costly if joining two large tables.

Why are you saving that JSON in a TEXT​ field? Since we are on the subject, you really need the JSON in the relational database and not in a document store?

You don’t need to port all your data from PostgreSQL/MySQL to MongoDB if you want to have MongoDB on your stack. Everything has its place, relational data on relational databases and non-relational data on non relational databases. I even find unfair benchmarks between a SQL database and a NoSQL one. They were made to solve different problems, you can’t possibly have the same use case for both of them.

No, it’s not ok to have category_1, category_2, ..., category_n as columns on your products table.

Avoid as much as possible nullable fields.

Relationships should also explicitly live on the RDBMS, not only on your model, if you have a user_id​ on your addresses​ table, tell the database so, naming it user_id doesn’t automatically create the foreign key.

You need:


ALTER TABLE addresses ADD CONSTRAINT `fk_user_id` FOREIGN KEY (`user_id`) REFERENCES users (`id`);

Or your migration should look something like this:


class CreateUsers < ActiveRecord::Migration[5.1]
def change
create_table :users do |t|
t.string :name
t.timestamps
end
end
end
class CreateAddresses < ActiveRecord::Migration[5.1]
def change
create_table :addresses do |t|
t.text :street
t.string :city
t.string :state
t.string :zipcode
t.integer :user_id
t.timestamps
end
add_foreign_key :addresses, :users
end
end

Line 24: adds to the table addresses​ a foreign key from users.

End

And you, what you think is missing in this blogpost? What do you want to get deeper on?