I think it's because when you don't directly specify offset for lag function Oracle has no information regarding to how many rows to store in the windows. When you specify lag(x,3) it then knows than it have to store only the three last rows, but when you specify lag(x,N) it has to store in memory (or on disk) all the data. For your particular query you can tune it, by specifying partition clause, and thus limiting the size of the analytic window. If I've understood your query correctly than you can add "partition by id" to "fix" performance problem. Your query should look like this:
SELECT COUNT(*), code FROM ( SELECT id, code, CASE WHEN first_code =3D 'LVA' THEN 0 ELSE 1 END x FROM ( SELECT id, code, lag(code, rn-1, 'AAA') over (partition by id order by id, rn) first_code FROM source ) ) WHERE x =3D 1 GROUP BY code ORDER BY COUNT(*) desc, code
But I guess what you are really want to ask is: if for particular ID there is more than one row, give me the code of the first row, else give me value of current row. Like Tom Kyte says "Tune the question, not the query". So what you really need is first_value(code) over (partition by id order by id, rn) first_code.
On 4/13/05, Gints Plivna <gints.plivna@(protected)> wrote: > The problem seems to be in lag function with varying second argument, > i.e. how many rows to look back. I'v created a small test case that > worked similarly on my home 1.8GHz Celeron with 500Mb RAM and 10g, as > well as on work box with parameters I'v described in the first mail. > I'v switched from automatic sort/hash sizing to manual and gave almost > 400 M to session that performed select below. As you can see select > with lag(1) worked fine, but select with varying lag(x) I had to kill > after almost 2 hours. And the most frustrating thing is that lag had > to look back 1 or 0 row as you can see from table create script as > well as later. And then you can see that with constant lag(0) and > lag(1) also everything works fine. >=20
--=20 Edgar -- http://www.freelists.org/webpage/oracle-l