Let's take for instance the following data set: measurements taken for two different people at different times.
-----> Example SQL Code Listing-#1: WITH w_collected_data as ( select NULL as nm , NULL as dts , NULL as measure from dual where 1=0 ----------------------------------------------------------------------------------------------- union all select 'Liz', to_date('2011-05-12 08:12:00','YYYY-MM-DD HH24:MI:SS'), 31.4 from dual union all select 'Liz', to_date('2011-05-12 10:21:00','YYYY-MM-DD HH24:MI:SS'), 31.1 from dual union all select 'Bob', to_date('2011-05-12 10:23:00','YYYY-MM-DD HH24:MI:SS'), 23.5 from dual union all select 'Bob', to_date('2011-05-12 11:32:00','YYYY-MM-DD HH24:MI:SS'), 24.9 from dual union all select 'Liz', to_date('2011-05-12 12:16:00','YYYY-MM-DD HH24:MI:SS'), 30.7 from dual union all select 'Bob', to_date('2011-05-12 12:44:00','YYYY-MM-DD HH24:MI:SS'), 26.1 from dual union all select 'Bob', to_date('2011-05-12 13:18:00','YYYY-MM-DD HH24:MI:SS'), 27.3 from dual union all select 'Bob', to_date('2011-05-12 14:27:00','YYYY-MM-DD HH24:MI:SS'), 27.1 from dual union all select 'Liz', to_date('2011-05-12 14:36:00','YYYY-MM-DD HH24:MI:SS'), 29.5 from dual union all select 'Bob', to_date('2011-05-12 15:13:00','YYYY-MM-DD HH24:MI:SS'), 26.8 from dual union all select 'Liz', to_date('2011-05-12 17:01:00','YYYY-MM-DD HH24:MI:SS'), 29.2 from dual ----------------------------------------------------------------------------------------------- ) select * from w_collected_data order by nm, dts ; ------------------------- -- Results ------------------------- NM DTS MEASURE --- ------------------- ------- Bob 2011-05-12 10:23:00 23.5 Bob 2011-05-12 11:32:00 24.9 Bob 2011-05-12 12:44:00 26.1 Bob 2011-05-12 13:18:00 27.3 Bob 2011-05-12 14:27:00 27.1 Bob 2011-05-12 15:13:00 26.8 Liz 2011-05-12 08:12:00 31.4 Liz 2011-05-12 10:21:00 31.1 Liz 2011-05-12 12:16:00 30.7 Liz 2011-05-12 14:36:00 29.5 Liz 2011-05-12 17:01:00 29.2
Now let's suppose that we want to find the difference between one measurement to the next.
-----> Example SQL Code Listing-#2: WITH w_collected_data as ( select NULL as nm , NULL as dts , NULL as measure from dual where 1=0 ----------------------------------------------------------------------------------------------- union all select 'Liz', to_date('2011-05-12 08:12:00','YYYY-MM-DD HH24:MI:SS'), 31.4 from dual union all select 'Liz', to_date('2011-05-12 10:21:00','YYYY-MM-DD HH24:MI:SS'), 31.1 from dual union all select 'Bob', to_date('2011-05-12 10:23:00','YYYY-MM-DD HH24:MI:SS'), 23.5 from dual union all select 'Bob', to_date('2011-05-12 11:32:00','YYYY-MM-DD HH24:MI:SS'), 24.9 from dual union all select 'Liz', to_date('2011-05-12 12:16:00','YYYY-MM-DD HH24:MI:SS'), 30.7 from dual union all select 'Bob', to_date('2011-05-12 12:44:00','YYYY-MM-DD HH24:MI:SS'), 26.1 from dual union all select 'Bob', to_date('2011-05-12 13:18:00','YYYY-MM-DD HH24:MI:SS'), 27.3 from dual union all select 'Bob', to_date('2011-05-12 14:27:00','YYYY-MM-DD HH24:MI:SS'), 27.1 from dual union all select 'Liz', to_date('2011-05-12 14:36:00','YYYY-MM-DD HH24:MI:SS'), 29.5 from dual union all select 'Bob', to_date('2011-05-12 15:13:00','YYYY-MM-DD HH24:MI:SS'), 26.8 from dual union all select 'Liz', to_date('2011-05-12 17:01:00','YYYY-MM-DD HH24:MI:SS'), 29.2 from dual ----------------------------------------------------------------------------------------------- ) select nm , dts , measure , lag( measure ) over ( order by dts ) as previous , measure - lag( measure ) over ( order by dts ) as msr_increase from w_collected_data order by dts ; ------------------------- -- Results ------------------------- NM DTS MEASURE PREVIOUS MSR_INCREASE --- ------------------- ------- -------- ------------ Liz 2011-05-12 08:12:00 31.4 Liz 2011-05-12 10:21:00 31.1 31.4 -0.3 Bob 2011-05-12 10:23:00 23.5 31.1 -7.6 Bob 2011-05-12 11:32:00 24.9 23.5 1.4 Liz 2011-05-12 12:16:00 30.7 24.9 5.8 Bob 2011-05-12 12:44:00 26.1 30.7 -4.6 Bob 2011-05-12 13:18:00 27.3 26.1 1.2 Bob 2011-05-12 14:27:00 27.1 27.3 -0.2 Liz 2011-05-12 14:36:00 29.5 27.1 2.4 Bob 2011-05-12 15:13:00 26.8 29.5 -2.7 Liz 2011-05-12 17:01:00 29.2 26.8 2.4
Notice that the "lag(measure)" gives us the previous measure where the "over (order by dts)" defines how the rows are to be sorted when finding the previous row. The entire analytic function given by "lag(measure) over (order by dts)" returns the previous row's measure value if the rows were to be sorted by "dts". The first row has no previous value because there was no previous row to pull it from -- it's value is set to NULL.
In an analytic function the "over (order by *)" is only used to determine which row the value will be taken from -- it doesn't determine how the final output data-set will have its rows sorted; that is always determined in the "Order By" clause of the query.
Also, notice that this new analytic function, "lag(measure) over (order by dts)", can now be used in a calculation (the measure minus the previous measure) to show how much the measure changed from one row to the next.
Next we might realize that the Bob's measurements don't relate to the measurements of Liz. We would like to look at the two people's measurements seperately -- not inner-mixed. This means we will need to split-up the data by the "NM" column -- in analytic functions we do this via the "partition by" operator within the "over" clause.
-----> Example SQL Code Listing-#3: WITH w_collected_data as ( select NULL as nm , NULL as dts , NULL as measure from dual where 1=0 ----------------------------------------------------------------------------------------------- union all select 'Liz', to_date('2011-05-12 08:12:00','YYYY-MM-DD HH24:MI:SS'), 31.4 from dual union all select 'Liz', to_date('2011-05-12 10:21:00','YYYY-MM-DD HH24:MI:SS'), 31.1 from dual union all select 'Bob', to_date('2011-05-12 10:23:00','YYYY-MM-DD HH24:MI:SS'), 23.5 from dual union all select 'Bob', to_date('2011-05-12 11:32:00','YYYY-MM-DD HH24:MI:SS'), 24.9 from dual union all select 'Liz', to_date('2011-05-12 12:16:00','YYYY-MM-DD HH24:MI:SS'), 30.7 from dual union all select 'Bob', to_date('2011-05-12 12:44:00','YYYY-MM-DD HH24:MI:SS'), 26.1 from dual union all select 'Bob', to_date('2011-05-12 13:18:00','YYYY-MM-DD HH24:MI:SS'), 27.3 from dual union all select 'Bob', to_date('2011-05-12 14:27:00','YYYY-MM-DD HH24:MI:SS'), 27.1 from dual union all select 'Liz', to_date('2011-05-12 14:36:00','YYYY-MM-DD HH24:MI:SS'), 29.5 from dual union all select 'Bob', to_date('2011-05-12 15:13:00','YYYY-MM-DD HH24:MI:SS'), 26.8 from dual union all select 'Liz', to_date('2011-05-12 17:01:00','YYYY-MM-DD HH24:MI:SS'), 29.2 from dual ----------------------------------------------------------------------------------------------- ) select nm , dts , measure , lag( measure ) over (partition by nm order by dts) as previous , measure - lag( measure ) over (partition by nm order by dts) as msr_increase , round((dts - lag( dts ) over (partition by nm order by dts))*24*60) as min_between from w_collected_data order by nm, dts ; ------------------------- -- Results ------------------------- NM DTS MEASURE PREVIOUS MSR_INCREASE MIN_BETWEEN --- ------------------- ------- -------- ------------ ----------- Bob 2011-05-12 10:23:00 23.5 Bob 2011-05-12 11:32:00 24.9 23.5 1.4 69 Bob 2011-05-12 12:44:00 26.1 24.9 1.2 72 Bob 2011-05-12 13:18:00 27.3 26.1 1.2 34 Bob 2011-05-12 14:27:00 27.1 27.3 -0.2 69 Bob 2011-05-12 15:13:00 26.8 27.1 -0.3 46 Liz 2011-05-12 08:12:00 31.4 Liz 2011-05-12 10:21:00 31.1 31.4 -0.3 129 Liz 2011-05-12 12:16:00 30.7 31.1 -0.4 115 Liz 2011-05-12 14:36:00 29.5 30.7 -1.2 140 Liz 2011-05-12 17:01:00 29.2 29.5 -0.3 145
Now we see two seperate groups of data, one for Bob and one for Liz. In an analytic function we can 'partition by' mutliple values/columns and sort by multiple values/columns. This makes the analytic functions very powerful. For instance we might have a LAG function that looks like this: "lag(measure) over (partition by region, store, clerk order by product, sales_date)".
Additionally, the previous example added a column to show how many minutes were between the measurments using this LAG function: "lag(dts) over (partition by nm order by dts)".
Another feature of the LAG function is that you can specify how many rows to 'look-back'. The default is 1 -- or the previous row; but you can change that to 2 as shown in the following example. A third parameter can also be supplied as a defauld value (instead of NULL) for rows that don't have a previous value.
-----> Example SQL Code Listing-#4: WITH w_collected_data as ( select NULL as nm , NULL as dts , NULL as measure from dual where 1=0 ----------------------------------------------------------------------------------------------- union all select 'Liz', to_date('2011-05-12 08:12:00','YYYY-MM-DD HH24:MI:SS'), 31.4 from dual union all select 'Liz', to_date('2011-05-12 10:21:00','YYYY-MM-DD HH24:MI:SS'), 31.1 from dual union all select 'Bob', to_date('2011-05-12 10:23:00','YYYY-MM-DD HH24:MI:SS'), 23.5 from dual union all select 'Bob', to_date('2011-05-12 11:32:00','YYYY-MM-DD HH24:MI:SS'), 24.9 from dual union all select 'Liz', to_date('2011-05-12 12:16:00','YYYY-MM-DD HH24:MI:SS'), 30.7 from dual union all select 'Bob', to_date('2011-05-12 12:44:00','YYYY-MM-DD HH24:MI:SS'), 26.1 from dual union all select 'Bob', to_date('2011-05-12 13:18:00','YYYY-MM-DD HH24:MI:SS'), 27.3 from dual union all select 'Bob', to_date('2011-05-12 14:27:00','YYYY-MM-DD HH24:MI:SS'), 27.1 from dual union all select 'Liz', to_date('2011-05-12 14:36:00','YYYY-MM-DD HH24:MI:SS'), 29.5 from dual union all select 'Bob', to_date('2011-05-12 15:13:00','YYYY-MM-DD HH24:MI:SS'), 26.8 from dual union all select 'Liz', to_date('2011-05-12 17:01:00','YYYY-MM-DD HH24:MI:SS'), 29.2 from dual ----------------------------------------------------------------------------------------------- ) select nm , dts , measure , lag( measure, 2, 0 ) over (partition by nm order by dts) as previous from w_collected_data order by nm, dts ; ------------------------- -- Results ------------------------- NM DTS MEASURE PREVIOUS --- ------------------- ------- -------- Bob 2011-05-12 10:23:00 23.5 0 Bob 2011-05-12 11:32:00 24.9 0 Bob 2011-05-12 12:44:00 26.1 23.5 Bob 2011-05-12 13:18:00 27.3 24.9 Bob 2011-05-12 14:27:00 27.1 26.1 Bob 2011-05-12 15:13:00 26.8 27.3 Liz 2011-05-12 08:12:00 31.4 0 Liz 2011-05-12 10:21:00 31.1 0 Liz 2011-05-12 12:16:00 30.7 31.4 Liz 2011-05-12 14:36:00 29.5 31.1 Liz 2011-05-12 17:01:00 29.2 30.7
One last example. Here's a data-set that shows the ID numbers for 3 different runners passing through checkpoints 0-3. A LAG function is used to obtain the previous "clock" value, and then a calculation is made with that function to show the minutes between each check-point.
-----> Example SQL Code Listing-#5: WITH w_check_points as ( select NULL as id , NULL as ckpt , NULL as clock from dual where 1=0 ---------------------------------------------- union all select 12, 0, '00:00:00' from dual union all select 47, 0, '00:00:00' from dual union all select 83, 0, '00:00:00' from dual union all select 12, 1, '01:00:16' from dual union all select 47, 1, '00:43:27' from dual union all select 83, 1, '00:40:18' from dual union all select 12, 2, '02:12:47' from dual union all select 47, 2, '01:32:53' from dual union all select 83, 2, '01:49:27' from dual union all select 12, 3, '03:27:14' from dual union all select 47, 3, '02:25:04' from dual union all select 83, 3, '03:31:38' from dual ---------------------------------------------- ) select id , ckpt , clock , lag(clock) over (partition by id order by ckpt) as prev_clock , round ( ( to_date(clock, 'HH24:MI:SS') - to_date(lag(clock) over (partition by id order by ckpt), 'HH24:MI:SS') ) * 24 * 60 , 3 ) as min from w_check_points order by id, ckpt ; ------------------------- -- Results ------------------------- ID CKPT CLOCK PREV_CLOCK MIN --- ---- -------- ---------- --------- 12 0 00:00:00 12 1 01:00:16 00:00:00 60.267 12 2 02:12:47 01:00:16 72.517 12 3 03:27:14 02:12:47 74.45 47 0 00:00:00 47 1 00:43:27 00:00:00 43.45 47 2 01:32:53 00:43:27 49.433 47 3 02:25:04 01:32:53 52.183 83 0 00:00:00 83 1 00:40:18 00:00:00 40.3 83 2 01:49:27 00:40:18 69.15 83 3 03:31:38 01:49:27 102.183
very nice option , this articular is very user for best example.
ReplyDelete