I started tracing the user's session and found that for every row in table T the query was executing function getCount (the names have been changed and the function simplified to clarify the example).
SELECT x, getcount()
FROM t
OK, seams reasonable enough, right. But inside getCount() was a query:
create or replace function getcount return integer
as
c integer;
begin
select count(*) into c from xyz where (code1='7' or code2='3');
return c;
end;
Ah, here's the query that showed up in the trace as being the top dog.
SELECT COUNT(*)
FROM
XYZ WHERE (CODE1='7' OR CODE2='3')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 2.47 8.55 57428 63059 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 2.47 8.55 57428 63059 0 1
"Hmm, let me just fix this and let the developer know how to fix his problem" I thought to myself, "he'll appreciate it."
So I suggested the following query for the fix:
select count(*) from (
select id from xyz where code1='7'
union
select id from xyz where code2='3'
)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.05 0.05 0 666 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.05 0.05 0 666 0 1
The number of logical reads was much better and the wall clock response time was much better as well. Perfectly suitable solution.
Well, proving that DBAs don't know anything about how to write code, the fix ended up being:
create or replace function getcount return integer
as
cursor c1 is select count(*) cnt from xyz where code1='7';
cursor c2 is select count(*) cnt from xyz where code2='3';
t integer := 0;
begin
for i in c1 loop
t := t+i.cnt;
end loop;
for i in c2 loop
t := t+i.cnt;
end loop;
return t;
end;
/
SELECT COUNT(*) CNT
FROM
XYZ WHERE CODE2='3'
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.03 0.03 0 387 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.03 0.03 0 387 0 1
SELECT COUNT(*) CNT
FROM
XYZ WHERE CODE1='7'
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.03 0.02 0 279 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.03 0.02 0 279 0 1
OK, that works. No reason to do something the easy way when you can write code and do it the cool way.
Addition: As Gary points out in the coments, there is a potential in the new function for records with both a '3' AND a '7' to be picked up twice. In my actual data, there is a business rule that makes sure that doesn't happen, but in general it is an issue.
5 comments:
Just to point out that
select count(*) into c from xyz where (code1='7' or code2='3');
doesn't necessarily equal
select sum(cnt) from (
select count(*) cnt from xyz where code1='7'
union
select count(*) cnt from xyz where code2='3' )
Firstly, it should be UNION ALL as if both counts returned 3, the UNION would give a single 3 row.
Even your eventual solution has a potential double counting issue if entries can have both
code1='7' AND code2='3'
create or replace function getcount return integer as
cursor c1 is select count(*) cnt from xyz where code1='7';
cursor c2 is select count(*) cnt from xyz where code2='3' and NVL(code1,'0')!='7';
t integer := 0;
begin
for i in c1 loop
t := t+i.cnt;
end loop;
for i in c2 loop
t := t+i.cnt;
end loop;
return t;
end getcount;
/
Yeah, you're right, that's the problem with over-simplifying the original logic. The original actually did a:
select count(*) from (
select * from xyz where code1='7'
union
select * from xyz where code2='3'
)
updated the main text to reflect that...
Does there really need to be a plsql function in the original query at all?
It looks like you fixed the performance of the plsql function but can invoking the plsql function be skipped?
The old Tom Kyte refrain of do it all in purely sql ( and oracle's extentions to sql syntax ) might apply here.
John Hurley said...
Does there really need to be a plsql function in the original query at all?
In this specific example, no, but the logic is a little more complex in the actual PL/SQL function.
test comment using openID
Post a Comment