So, I was working on tuning a query for a developer the other day and found something interesting. The query was of the form:
SELECT
a.id,
b.description,
some_funky_package_call(a.id, b.description)
FROM
tableA a,
tableB b
WHERE a.id = b.id
AND some_funky_package_call(a.id, b.description) <> 0
After I played around with it a bit, I found that the slowest part was using the function call in the WHERE clause. If the query included the package call in the WHERE clause, the query finished in over an hour. If the package call was not in the WHERE clause, the query finished in 5 minutes (but did not return the correct results).
No problem, we'll just use an inline view and filter later. So I came up with:
SELECT id, description, bal
FROM (
SELECT
a.id,
b.description,
some_funky_package_call(a.id, b.description) bal
FROM
tableA a,
tableB b
WHERE a.id = b.id
)
WHERE bal <> 0
Confident in my fondness for inline views, I ran the query fully expecting to get the results back in a few minutes. Except the query went on, and on, and on for a full 15 minutes before I killed it. I traced the session and found out the execution profile was not different than my original query.
Eventually, I came up with a solution using HAVING. My data is such that tableA.id is guaranteed to be unique, so I knew a.id, b.description would also be unique. I changed my query to:
SELECT id, description, sum(bal)
FROM (
SELECT
a.id,
b.description,
some_funky_package_call(a.id, b.description) bal
FROM
tableA a,
tableB b
WHERE a.id = b.id
)
GROUP BY id, description
HAVING sum(bal) <> 0
Since I know that only one row will be returned per a.id, I also knew that SUM() would be summing one row. The GROUP BY essentially was doing nothing, but the HAVING clause filtered after Oracle already figured out the result set.
Edit: See "
Interesting Optimizer Result, Part II" for the resolution.