Posts Tagged ‘row number’

Numbering items within groups in MySQL

In my previous post, I described computing row numbers for items within a result set. If items in the result set can be ordered into groups, you can use the same technique to number items within the groups. This is valuable as it allows you to get the top 3, top 5, top N … items within each group (note that GROUP BY isn’t appropriate here; we don’t want to aggregate items within a group, we want an ordering of all the items within a group).

Consider the following sample of data where we’ve done an ordering by the location and age fields (ORDER BY location DESC, age DESC). Every user’s email address is within a group (location) and within that group users are sorted by age in descending order.

SELECT mytable.*
FROM mytable
ORDER BY location DESC, age DESC

ordered data

By using two variables, one to keep track of the item number (@itm) and another to keep track of the current group (@grp), and incrementing the item number if the current group has not changed, else resetting it to 1, we get a numbering of each item within each group. In this example, the group item number tells us who is the 1st oldest, 2nd oldest, etc. user in each location.

SELECT

    
@itm := IF(@grp = location, @itm + 1, 1) AS group_item_number,
    
@grp := location,

    
mytable.*

FROM     mytable,
        (
SELECT @itm := 0) AS itm_num_init,
        (
SELECT @grp := NULL) AS grp_init

ORDER BY location DESC, age DESC

group item numbers

Row numbers in MySQL

Surprisingly straight-forward. Row numbers are useful is cases where you need the rank of a row within a result set.

SET @cur_row := 0;
SELECT mytable.*,
@cur_row := @cur_row + 1 AS row_num
FROM mytable
ORDER BY whatever;

The variable initialization can be done without a SET declaration by doing it as a JOIN:

SELECT mytable.*,
@cur_row := @cur_row + 1 AS row_num
FROM mytable, (SELECT @cur_row := 0) AS row_num_init
ORDER BY whatever;

Solution via Daniel Vassallo’s answer on StackOverflow. A comment mentioned the row numbers being calculated before ordering is done, but I haven’t found this to be the case.

While the above queries will give you the row numbers as expected, putting a HAVING or GROUP BY clause in the same statement might be dangerous. From Xaprb:

The result of assigning to a variable and using it in the same statement (in the HAVING, for example) depends on the query plan the server chooses, the phase of the moon, and probably other things too.

So, in such cases, a subquery is the safe bet:

SELECT sub.* FROM

    
(SELECT mytable.*,
            
@cur_row := @cur_row + 1 AS row_num
    
FROM mytable, (SELECT @cur_row := 0) AS row_num_init
    
ORDER BY whatever) AS sub

HAVING sub.row_num ... ;