René Nyffenegger's collection of things on the web | |
René Nyffenegger on Oracle - Most wanted - Feedback
- Follow @renenyffenegger
|
Using SQL to select the value for the last row that was not null | ||
I find it a common problem with SQL when I deal with tables that contain NULL values and if I do a select
on that table, I don't want to have NULLs returned but rather the value of the last row that was not NULL.
For example, if I maintain a table that keeps track of how much certain items cost, I'd have something like:
create table item_price ( dt date, item varchar2(30), price number(6,2) );
Now, whenever the price for an item changes, a record is inserted that contains the date (dt) when item changed the price.
alter session set nls_date_format='dd.mm.yyyy'; insert into item_price values('14.05.2001','beans',15.4); insert into item_price values('17.05.2001','pears', 2.8); insert into item_price values('28.05.2001','beans',16.2); insert into item_price values('26.05.2001','pears', 3.1); insert into item_price values('03.06.2001','beans',14.9); insert into item_price values('01.06.2001','pears', 2.9);
Now, I am interested in the price of beans between 14.05 and 03.06. First, I create a table that contains a record
for each date in this range:
create table date_range as select to_date(rownum-1 + to_date('14.05.2001')) dt from all_objects where rownum-1 <= to_date('03.06.2001') - to_date('14.05.2001') ;
The desired result can now be obtained with the following select statement:
select dr_dt, sum(ip_price) over (partition by new_price) price from ( select dr.dt dr_dt, ip.price ip_price, sum(ip.price) over (order by dr.dt) new_price from date_range dr left join item_price ip on dr.dt=ip.dt where nvl(ip.item,'beans') = 'beans' ) order by dr_dt; DR_DT PRICE ---------- ---------- 14.05.2001 15.4 15.05.2001 15.4 16.05.2001 15.4 17.05.2001 15.4 18.05.2001 15.4 19.05.2001 15.4 20.05.2001 15.4 21.05.2001 15.4 22.05.2001 15.4 23.05.2001 15.4 24.05.2001 15.4 25.05.2001 15.4 26.05.2001 15.4 27.05.2001 15.4 28.05.2001 16.2 29.05.2001 16.2 30.05.2001 16.2 31.05.2001 16.2 01.06.2001 16.2 02.06.2001 16.2 03.06.2001 14.9
How and why does this work? The select statement's inner select statement returns:
DR_DT IP_PRICE NEW_PRICE ---------- ---------- ---------- 14.05.2001 15.4 15.4 15.05.2001 15.4 16.05.2001 15.4 18.05.2001 15.4 19.05.2001 15.4 20.05.2001 15.4 21.05.2001 15.4 22.05.2001 15.4 23.05.2001 15.4 24.05.2001 15.4 25.05.2001 15.4 27.05.2001 15.4 28.05.2001 16.2 31.6 29.05.2001 31.6 30.05.2001 31.6 31.05.2001 31.6 02.06.2001 31.6 03.06.2001 14.9 46.5
Since I have left outer joined, the result set returns a record for each date stored in date_range. The column ip_price returns the price for the dates where one is stored
in item_price. The column new_price does an «analytical» sum, that is it changes its value each time ip_price changes. This new_price can now be used to find out when the price has changed (or is new,
hence the name) with the
partition by new price clause.
Now, Gemes Tibor has sent me a mail concerning this and points out that this is problematic with negative or zero numbers and doesn't work with varchars at all. He writes:
[...]
So I added a new varchar2 type column to the example:
alter table item_price add price2 varchar2(100); update item_price set price2 = price; insert into item_price values('22.05.2001','beans',15.9, 'a'); Here is the rewritten select which works for varchar2 type columns as well: select dr_dt, max(ip_price) over (partition by pricechangeindicator) price from ( select dr.dt dr_dt, ip.price2 ip_price, count(ip.price2) over (order by dr.dt) pricechangeindicator from (select price2, dt from item_price where item='beans' ) ip, date_range dr where ip.dt(+) = dr.dt ) order by dr_dt;
This select results:
DR_DT PRICE ---------- ----- 15.05.2001 15.4 14.05.2001 15.4 16.05.2001 15.4 17.05.2001 15.4 18.05.2001 15.4 19.05.2001 15.4 20.05.2001 15.4 21.05.2001 15.4 23.05.2001 a 22.05.2001 a 24.05.2001 a 27.05.2001 a 26.05.2001 a 25.05.2001 a 28.05.2001 16.2 29.05.2001 16.2 30.05.2001 16.2 02.06.2001 16.2 01.06.2001 16.2 31.05.2001 16.2 03.06.2001 14.9
Finally, he also sends a solution for 10
select dr.dt dr_dt, --ip.price2 ip_price, last_value(ip.price2 ignore nulls) over (order by dr.dt) last_price from ( select price2, dt from item_price where item='beans' ) ip, date_range dr where ip.dt(+) = dr.dt;
which returns the same result set.
I'd like to give thanks to Gemes Tibor for giving me this input.
|