SQL Window functions are the functions related to the ordering of elements
in each given partition.
In the last post,
I have introduced an example of getting top-3 elements
for each partition using the rank()
function.
row_number()
, rank()
and dense_rank()
are SQL Window functions
that number each element in ordering. They look similar.
In fact, the results of the functions are the same
if all the values of the ordering column are unique.
But, the results of each function become different if there are the tie values,
that is, the same elements in order.
By definition, row_number()
just numbers each row through the ordering.
Therefore, the result of row_number()
is different
even if the value of the ordering column is the same.
rank()
and dense_rank()
return the rank of the ordering.
Therefore, the resulting number of rank()
is the same
when the value of the ordering column is the same.
But the next rank number is different.
Assume that there are two top elements and the third one.
rank()
returns 3 for the third element but dense_rank()
returns 2
since the previous rank was 1.
Let’s see the differences by the examples. Assume that the contents of rankEx
are as follows:
part | value |
---|---|
1 | aaa |
1 | aa |
1 | aab |
1 | aaa |
2 | aaa |
2 | bbb |
2 | bbb |
2 | cc |
2 | cc |
The result of row_number()
is as follows:
1 | select *, |
part | value | num |
---|---|---|
1 | aa | 1 |
1 | aaa | 2 |
1 | aaa | 3 |
1 | aab | 4 |
2 | aaa | 1 |
2 | bbb | 2 |
2 | bbb | 3 |
2 | cc | 4 |
2 | cc | 5 |
The result of rank()
is as follows:
1 | select *, |
part | value | num |
---|---|---|
1 | aa | 1 |
1 | aaa | 2 |
1 | aaa | 2 |
1 | aab | 4 |
2 | aaa | 1 |
2 | bbb | 2 |
2 | bbb | 2 |
2 | cc | 4 |
2 | cc | 4 |
The result of dense_rank()
is as follows:
1 | select *, |
part | value | num |
---|---|---|
1 | aa | 1 |
1 | aaa | 2 |
1 | aaa | 2 |
1 | aab | 3 |
2 | aaa | 1 |
2 | bbb | 2 |
2 | bbb | 2 |
2 | cc | 3 |
2 | cc | 3 |
row_number()
is used instead of rank()
or dense_rank()
when the size of the results must be fixed. The statement WHERE rank = 1
does not guarantee the number of results is 1 for each partition if the rank
column is generated from rank()
function.
row_number()
can be used to generate the sequence number of each row.