SQL Statement Tips

This guide provides tips for using certain SQL statements.

Introduction

There are several techniques for improving use of certain SQL statements. SQLite includes some special case optimizations for the MIN() and MAX() aggregate functions. Where possible, you should use UNION ALL instead of UNION. And you should avoid using OFFSET for scrolling cursors.

Intended audience:

This document is intended to be used by Symbian platform licensees and third party application developers.

The MIN and MAX Optimization

SQLite includes some special case optimizations for the MIN() and MAX() aggregate functions. The normal way of evaluating a MIN() or MAX() aggregate function is to run the entire query, examine every row of the result set, and pick the largest or smallest value. Hence, the following two queries take roughly the same amount of work:

SELECT x FROM demo326 WHERE y>11.5;
SELECT min(x) FROM demo326 WHERE y>11.5;

The only difference in the above two SELECT statements is that the first returns every possible value for “x” whereas the second only returns the smallest. Both require about the same amount of time to run.

But there are some special cases where MIN() and MAX() are very fast. If the result set of a SELECT consists of only the MIN() function or the MAX() function and the argument to that function is an indexed column or the RowID and there is no WHERE or HAVING or GROUP BY clause then the query runs in logarithmic time. So these two queries are very quick:

SELECT min(x) FROM demo326;
SELECT max(x) FROm demo326;

Note that the result set must contain a single column. The following query is much slower:

SELECT min(x), max(x) FROM demo326;

If you need the results from this last query quickly, you can rewrite it as follows:

SELECT (SELECT min(x) FROM demo326), (SELECT max(x) FROM dem326);

In other words, break the query up into two subqueries both of which can be optimized using the MIN/MAX optimization.

Note also that the result set must not be an expression on MIN() or MAX() - it needs to be a plain MIN() or MAX() and nothing else. So the following query is slower:

SELECT max(x)+1 FROM demo326;

As before, achieve the same result quickly using a subquery:

SELECT (SELECT max(x) FROM demo326)+1;

Use UNION ALL in Place of UNION where Possible

SQLite, in accordance with the SQL standard, allows two or more SELECT statements to be combined using operators UNION, UNION ALL, INTERSECT, or EXCEPT. The first two, UNION and UNION ALL, are the subject of this section.

The UNION and UNION ALL operators do very nearly the same thing, but with one important difference. Both operators return the union of all rows from their left and right queries. The difference is that the UNION operator removes duplicates whereas the UNION ALL operator does not. To look it another way, the following two queries are equivalent:

SELECT * FROM tableA UNION SELECT * FROM tableB;
SELECT DISTINCT * FROM (SELECT * FROM tableA UNION ALL SELECT * FROM tableB);

When you look at it this way, you should clearly see that UNION is just UNION ALL with some extra work to compute the DISTINCT operation. You should also see that UNION ALL is noticeably faster than UNION and uses less temporary storage space too.

If you need uniqueness of output values then by all means use UNION. It is there for you and it works. But if you know in advance that your results will be unique or if you do not care, UNION ALL will almost always run faster.

Avoid Using OFFSET for Scrolling Cursors

A common design pattern is to show the results of a large query result in a scrolling window. The query result might contain hundreds or thousands of rows, but only a handful are shown to the user at one time. The user clicks the “Up” or “Down” buttons or drags a scrollbar to move up and down the list.

A common example of this is in media players where a user has requested to see all “albums” of a particular “genre”. There might be 200 such albums stored on the device, but the display window is only large enough to show 5 at a time. So the first 5 albums are displayed initially. When the user clicks the Down button the display scrolls down to the next 5 albums. When the user presses the Up button the display scrolls back up to the previous 5 albums.

The naïve approach for implementing this behavior is to keep the index of the topmost album currently displayed. When the user presses Up or Down this index is decremented or incremented by 5 and a query like this is run to refill the display:

SELECT title FROM album WHERE genre='classical' ORDER BY title LIMIT 5 OFFSET ?

The bound parameter on the offset field would be filled in with the index of the topmost album to be displayed and the query is run to generate five album titles. Presumably the album table is indexed in such a way that both the WHERE clause and the ORDER BY clause can be satisfied using the index so that no accumulation of results and sorting is required. Perhaps the index looks like this:

CREATE INDEX album_idx1 ON album(genre, title);

This approach works find as long as the offset value is small. But the time needed to evaluate this query grows linearly with the offset. So as the user scrolls down towards the bottom of the list, the response time for each click becomes longer and longer.

A better approach is to remember the top and bottom title currently being displayed. (The application probably has to do this already in order be able to display the titles). To scroll down, run this query:

SELECT title FROM album WHERE genre='classical' AND title>:bottom
    ORDER BY title ASC LIMIT 5;

And to scroll back up, use this query:

SELECT title FROM album WHERE genre='classical' AND title<:top
    ORDER BY title DESC LIMIT 5;

For the scrolling down case the addition of the “title>:bottom” term (where :bottom is a parameter which is bound to the title of the bottom element currently displayed) causes SQLite to jump immediately to the first entry past the current display. There is no longer a need for an OFFSET clause in the query, though we still include “LIMIT 5”. The same index will still work to optimize both the WHERE clause and the ORDER BY clause.

The scrolling up case is very similar, although in this case we are looking for titles that are less than the current top element. We have also added the “DESC” tag to the ORDER BY clause so that titles will come out in “descending” order. (The sort order is descending, but the order is ascending if you are talking about the order in which the titles are displayed in the scrolling window). As before, the same album_idx1 index is able handle both terms of the WHERE clause and the descending ORDER BY clause.

Both of these queries should be much faster than using OFFSET, especially when the OFFSET is large. OFFSET is convenient for use in ad hoc queries entered on a workstation, but it is rarely helpful in an embedded application. An indexing scheme such as described here is only slightly more complex to implement but is much faster from user's perspective.

Related concepts