Thursday, September 15, 2005

Top N

A Frequently Asked Question on many Oracle related forums is “How can I get the first/last N rows in my table?” Sometimes that’s not an easy question to answer. I always ask the poster what they mean by “first”. Some people mean the order the records were inserted, some people mean the top-N rows. I’ve found it’s always easier to get the poster to explain what they are trying to do instead of guessing.

First/Last – in an ordered set
This one is easy. You take advantage of inline views and Oracle’s pseudocolumn ROWNUM. People usually get tripped up when they use ROWNUM because while ROWNUM is relevant in all queries, in this type of query it only makes sense when coupled with an ORDER BY.

For example:

SQL> select * from xyz;

X Y Z
---------- ---------- -------------------
127 1 09/16/2005 14:46:50
148 22 09/17/2005 14:48:42
149 23 09/18/2005 14:48:42
150 24 09/19/2005 14:48:42
131 5 09/20/2005 14:46:50
151 25 09/20/2005 14:48:42
133 7 09/22/2005 14:46:50
152 26 09/21/2005 14:48:42
153 27 09/22/2005 14:48:42
154 28 09/23/2005 14:48:42
137 11 09/16/2005 14:46:50
155 29 09/24/2005 14:48:42
139 13 09/18/2005 14:46:50
156 30 09/15/2005 14:48:42
143 17 09/22/2005 14:46:50
145 19 09/24/2005 14:46:50
147 21 09/16/2005 14:48:42

17 rows selected.

SQL> select rownum, x, y, z from xyz;

ROWNUM X Y Z
---------- ---------- ---------- -------------------
1 127 1 09/16/2005 14:46:50
2 148 22 09/17/2005 14:48:42
3 149 23 09/18/2005 14:48:42
4 150 24 09/19/2005 14:48:42
5 131 5 09/20/2005 14:46:50
6 151 25 09/20/2005 14:48:42
7 133 7 09/22/2005 14:46:50
8 152 26 09/21/2005 14:48:42
9 153 27 09/22/2005 14:48:42
10 154 28 09/23/2005 14:48:42
11 137 11 09/16/2005 14:46:50
12 155 29 09/24/2005 14:48:42
13 139 13 09/18/2005 14:46:50
14 156 30 09/15/2005 14:48:42
15 143 17 09/22/2005 14:46:50
16 145 19 09/24/2005 14:46:50
17 147 21 09/16/2005 14:48:42


Here, ROWNUM does exactly what it’s supposed to do. It gives you a sequential number according to how the rows were pulled out of the table. There is obviously no logical order (to the human eye, anyway), but that’s how the rows come out. If you restrict by ROWNUM, you will certainly get a result, but probably not what you expect:

SQL> select * from xyz
2 where rownum < 6
3 /

X Y Z
---------- ---------- -------------------
127 1 09/16/2005 14:46:50
148 22 09/17/2005 14:48:42
149 23 09/18/2005 14:48:42
150 24 09/19/2005 14:48:42
131 5 09/20/2005 14:46:50


So what to do? Just put an ORDER BY in?


SQL> select * from xyz
2 where rownum < 6
3 order by y
4 /

X Y Z
---------- ---------- -------------------
127 1 09/16/2005 14:46:50
131 5 09/20/2005 14:46:50
148 22 09/17/2005 14:48:42
149 23 09/18/2005 14:48:42
150 24 09/19/2005 14:48:42


Nope. But we’re getting closer. You have to use your ordered query as a subquery and then apply the filter.

SQL> select * from (
2 select * from xyz
3 order by y
4 )
5 where rownum < 6
6 /

X Y Z
---------- ---------- -------------------
127 1 09/16/2005 14:46:50
131 5 09/20/2005 14:46:50
133 7 09/22/2005 14:46:50
137 11 09/16/2005 14:46:50
139 13 09/18/2005 14:46:50


Bingo. That’s what we wanted. Now suppose we wanted a “window” from 6 to 10:


SQL> select o.x, o.y, o.z from (
2 select rownum r, i.x, i.y, i.z from (
3 select x, y, z from xyz
4 order by y ) i
5 where rownum < 11 )o
6 where o.r > 5
7 /

X Y Z
---------- ---------- -------------------
143 17 09/22/2005 14:46:50
145 19 09/24/2005 14:46:50
147 21 09/16/2005 14:48:42
148 22 09/17/2005 14:48:42
149 23 09/18/2005 14:48:42


First/Last – in insertion order
Since there is no inherent way for you to figure out the order the rows were put into a table, this one is a little trickier. The only way you can positively indicate the order the rows were inserted into the table is if you explicitly mark each record. Sometimes this may be a “DATE_INSERTED” field or it may be some other field that indicates order. In my example, I have created a BEFORE INSERT trigger to populate the Primary Key (x) with a value from a sequence:


create trigger xyz_bi
before insert
on xyz
for each row
declare
pkval number;

begin
select xyz_id.nextval into pkval from dual;

:new.x := pkval;
end;


Therefore, I know my rows were inserted in the order according to X. I can then use my PK value to find out the last 10 rows inserted by:

SQL> select * from (
2 select * from xyz
3 order by x desc
4 )
5 where rownum < 11
6 /

X Y Z
---------- ---------- -------------------
156 30 09/15/2005 14:48:42
155 29 09/24/2005 14:48:42
154 28 09/23/2005 14:48:42
153 27 09/22/2005 14:48:42
152 26 09/21/2005 14:48:42
151 25 09/20/2005 14:48:42
150 24 09/19/2005 14:48:42
149 23 09/18/2005 14:48:42
148 22 09/17/2005 14:48:42
147 21 09/16/2005 14:48:42

10 rows selected.

4 comments:

Robert Vollman said...

Related links:

Niall:
http://www.jlcomp.demon.co.uk/faq/top_sql.html

Rajeev:
http://www.jlcomp.demon.co.uk/faq/top_10_per.html

Thiru:
http://www.orafaq.com/faq/Server_Utilities/SQL/faq24.htm

AskTom:
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:1212501913138

Barbara:
http://forums.oracle.com/forums/thread.jspa?messageID=1645٭

Jeff Hunter said...

Actually, no, you can't rely on ora_rowscn. For example:
SQL> l
1* select ora_rowscn, x, y, z from xyz
SQL> /

ORA_ROWSCN X Y Z
---------- ---------- ---------- ---------
423566 127 1 16-SEP-05
423566 148 22 17-SEP-05
423566 149 23 18-SEP-05
423566 150 24 19-SEP-05
423566 131 5 20-SEP-05
423566 151 25 20-SEP-05
423566 133 7 22-SEP-05
423566 152 26 21-SEP-05
423566 153 27 22-SEP-05
423566 154 28 23-SEP-05
423566 137 11 16-SEP-05
423566 155 29 24-SEP-05
423566 139 13 18-SEP-05
423566 156 30 15-SEP-05
423566 143 17 22-SEP-05
423566 145 19 24-SEP-05
423566 147 21 16-SEP-05

17 rows selected.

Anonymous said...

Jeff,

It was only 2 days ago I was googling for this, your tip explained Top N queries better. Since I am reading Top N for the second time I much clearer now. Thanks.

Ram.

Jeff Hunter said...

Thanks.