Two less common tricks for improving unexplained slow MySQL queries

Sakis Kasampallis | Nov 8, 2014 min read
Recently I faced an SQL performance issue. What I wanted to do was rather common: Apply the (set) difference operation on two tables.

In relational algebra, the difference operation applied on two tables A and B gives as a result a new table C that contains all the elements that are in table A but aren't in table B.

That's a very common operation. A common example is having the table Students and the table Grades. To find all Students that have not been graded yet you can use the difference operation. Or in terms of set theory and using \ as the notation of difference:

{1,2,3} ∖ {2,3,4} = {1}

Some RDBMS have the difference operation built-in using the EXCEPT keyword. So the query in this case would be something like:

But that's not the case for MySQL. MySQL does not support EXCEPT but we can get the same result using a LEFT (OUTER) JOIN:
OK, so I used a LEFT JOIN to find out that the query was painfully slow. One table had 700 thousand records and the other 130 thousand records. For a relational database that's not a big deal (only a few seconds, let's say maximum 3).

If you search on the Web for slow LEFT JOIN query you'll see that everyone recommends (a) adding indexes and (b) using the SQL optimizer. Well, I had already done both things without achieving my goal. I added the indexes using CREATE INDEX and then used the optimizer by adding EXPLAIN in front of my query and made sure that the indexes were used properly.

So, what's left? Actually there are two other important things to check. The first is to inspect the output of SHOW PROCESSLIST. This will show you the list of active processes on the server. When writing queries killing the SQL client because it crashed/became unresponsive due to a bad query is not unusual. But killing the client does not necessarily mean that the query is killed. There might still be orphan queries that eat the resources of your server but you have no control of them. You can kill them using KILL PROCESS_ID (replace PROCESS_ID with the actual ID of the orphan process).

The second thing that really impacts the performance of MySQL is when trying to join tables that use different collations. MySQL uses by default the legacy latin1_swedish_ci collation, so if one of the tables you are trying to join is using a different collation (for example utf8_unicode_ci, which makes much more sense as a default nowadays) the joins become terribly slow. Just make sure that all database tables use the same collation.

Happy querying!