Saturday, November 24, 2012

Overlapping Intervals (both numeric or date): Do they overlap? By how much?

Word has it that there is an undocumented "Overlaps" function in Oracle. The problem is, (after a few simple tests) it only works for date intervals and it only returns a true or false condition -- no actual values of how much overlap exists. Here's a simple example:

    
  select  case when  ( date'2012-02-25', date'2012-08-23' ) overlaps
                     ( date'2012-08-13', date'2012-11-13' ) 
               then 'It Overlaps'
               else 'nope'
          end as rslt
    from  dual ;


 So, being that it has very limited use, and that it is undocumented -- why even consider using it? Especially when there's a nice flexible supported construct that works just as well for numeric intervals as it does for date-time intervals.

Wasting no more time, then, here's the construct:

    
Given 2 intervals A and B (either numeric or date-time)in this form:

    A = ( A_bgn, A_end )
    B = ( B_bgn, B_end )

* If they overlap, the following will return a TRUE value; 
  if they don't overlap it will return FALSE.
  (This can be used in a WHERE clause or in a CASE statement.)
  
    >>>  least(A_end, B_end) > greatest(A_bgn, B_bgn)

* Similarly, this would test weather the intervals just touch:

    >>>  least(A_end, B_end) = greatest(A_bgn, B_bgn) 

* The following will give you the actual length of overlap. 
  For a date-time interval it will give you the number of days. 
  For a numeric interval it returns the numeric distance. 
  If the intervals don't overlap this will return a negative 
  value indicating how far apart are the endpoints.

    >>>  least(A_end, B_end) - greatest(A_bgn, B_bgn)

* A simple modification will return 0 if there is no overlap and 
  the length of the overlap if they actually do overlap.

    >>>  greatest(least(A_end, B_end) - greatest(A_bgn, B_bgn), 0)




So as a quick example, here are two queries that use the above constructs.
One for numeric intervals and one for date-time intervals.
    
   with
     w_data  as
       ( select  null  as A_bgn
              ,  null  as A_end
              ,  null  as B_bgn
              ,  null  as B_end
           from  dual  where 1=0
         --------------------------------------------
         union all  select  11, 13, 16, 20  from dual
         union all  select  12, 16, 14, 19  from dual
         union all  select  10, 25, 15, 20  from dual
         union all  select  12, 16, 16, 19  from dual
         --------------------------------------------
       )
   select  greatest ( least(A_end, B_end) - greatest(A_bgn, B_bgn), 0)  as intrvl_overlap
        ,             least(A_end, B_end) - greatest(A_bgn, B_bgn)      as diff
        ,  case when  least(A_end, B_end) > greatest(A_bgn, B_bgn)      then  'overlapping'  end  as reslt
        ,  case when  least(A_end, B_end) = greatest(A_bgn, B_bgn)      then  'touches'      end  as reslt2
     from  w_data ;


And ...

    
   with
     w_data  as
       ( select  null  as A_bgn
              ,  null  as A_end
              ,  null  as B_bgn
              ,  null  as B_end
           from  dual  where 1=0
         --------------------------------------------------------------------------------------------------------
         union all  select  to_date('0214~09:27','MMDD~HH24:MI'), to_date('0910~15:45','MMDD~HH24:MI')
                         ,  to_date('0923~09:45','MMDD~HH24:MI'), to_date('1008~11:07','MMDD~HH24:MI')  from dual
         union all  select  to_date('0214~09:27','MMDD~HH24:MI'), to_date('0825~15:45','MMDD~HH24:MI')
                         ,  to_date('0705~09:45','MMDD~HH24:MI'), to_date('1008~11:07','MMDD~HH24:MI')  from dual
         union all  select  to_date('0115~02:53','MMDD~HH24:MI'), to_date('1103~23:15','MMDD~HH24:MI')
                         ,  to_date('0305~03:10','MMDD~HH24:MI'), to_date('0312~15:10','MMDD~HH24:MI')  from dual
         union all  select  to_date('0411~08:53','MMDD~HH24:MI'), to_date('0416~13:30','MMDD~HH24:MI')
                         ,  to_date('0416~13:30','MMDD~HH24:MI'), to_date('0425~07:42','MMDD~HH24:MI')  from dual
         --------------------------------------------------------------------------------------------------------
       )
   select  greatest ( least(A_end, B_end) - greatest(A_bgn, B_bgn), 0)  as intrvl_overlap
        ,             least(A_end, B_end) - greatest(A_bgn, B_bgn)      as diff
        ,  case when  least(A_end, B_end) > greatest(A_bgn, B_bgn)      then  'overlapping'  end  as reslt
        ,  case when  least(A_end, B_end) = greatest(A_bgn, B_bgn)      then  'touches'      end  as reslt2
     from  w_data ;

Tuesday, November 20, 2012

Update Selected Values from a Look-Up Table

This is a quick little example to demonstrate how to (and how not to) update selected values in a table from multiple values stored in a look-up table. First, let's create a couple of tables. The first will be our main table that we need to update. The second will be a list of various keys (the NM column) that have different values that need to be updated in the main table:

    
   drop table main_data ;

   drop table new_values ;

   create table main_data as 
     select  null  as nm
          ,  null  as val  
       from  dual  where 1=0
     union all  select  'bob', 13  from dual
     union all  select  'sam', 14  from dual
     union all  select  'joe', 15  from dual
     union all  select  'jan', 16  from dual
     union all  select  'fay', 17  from dual
     union all  select  'liz', 18  from dual ;
   
   create table new_values as 
     select  null  as nm
          ,  null  as val
       from  dual  where 1=0
     union all  select  'sam', 2345  from dual
     union all  select  'liz', 7865  from dual ;


As we can see, we have two new values that need to replace the values (in the VAL column) of our MAIN_DATA table.

    
     select * from main_data ;

     >  NM   VAL
     >  ---  ---
     >  bob   13 
     >  sam   14 
     >  joe   15 
     >  jan   16 
     >  fay   17 
     >  liz   18 


     select * from new_values ;

     >  NM    VAL
     >  ---  ----
     >  sam  2345 
     >  liz  7865 


We would like to run one UPDATE statement that will use the NEW_VALUES table to find the corresponding values for "sam" and "liz" and replace the existing "14" and "18" values in our main table.

So here is our first attempt. We'll update our table by setting the value equal to a correlated sub-query. The WHERE clause correlates the main table alias "a" with the sub-query's alias "b". When the names match, we want the VAL column from our NEW_VALUES table to update the VAL column in our MAIN_DATA table.

    
   update  main_data a
      set  val = ( select  val 
                     from  new_values b
                    where  b.nm = a.nm
                 ) ;


Now we can look at the data to see what happened:

    
     select * from main_data ;

     >  NM    VAL
     >  ---  ----
     >  bob        
     >  sam  2345 
     >  joe        
     >  jan        
     >  fay        
     >  liz  7865 


Noticed how badly we failed. The two values were updated, but all the other values were set to NULL. This is because our UPDATE statement didn't have a WHERE clause -- it updated every value in the table. Since it didn't find values for all the other names... it set them all to NULL.

Now let's look at a proper update statement. Notice that this one has two correlated sub-queries -- both looking at the same NEW_VALUES table. This time we will only update the rows WHERE a look-up value "exists". In this second sub-query notice that we are selecting a zero -- any constant value will do (you can even put NULL there). The WHERE clause is just verifying that a row exists in the look-up table (exists meaning that the "where b.nm = a.nm" evaluates to true). We "rollback" the previous update and run the new one:

    
   rollback ;
   
   update  main_data a
      set  val = 
             ( select  val 
                 from  new_values b
                where  b.nm = a.nm
             ) 
    where  exists 
             ( select  0
                 from  new_values b
                where   b.nm = a.nm
             ) ;


This time we get the result that we wanted. Update the values that exist in the look-up table, and leave all the other values alone.

    
     select * from main_data ;
     
     >  NM    VAL
     >  ---  ----
     >  bob    13  
     >  sam  2345 
     >  joe    15  
     >  jan    16  
     >  fay    17  
     >  liz  7865 




My preferred way of doing this kind of update is the following -- it's updating a sub-query or an inline view. In this case you MUST have a unique constraint on the key of the the look-up table -- here, I've added a primary key which fills that requirement.

Notice that the two tables are joined on the proper key, and I select  both the target value and the look-up value. Then, outside the sub-query  we can SET the target value to the new look-up value (using the column aliases). Notice that only the two values will be set (no "where exists" clause is needed) because the JOIN only returns the two rows that have keys in both tables.

    
   rollback ;

   alter table new_values 
     add constraint new_values_pk 
     primary key (nm) ;
   
   update
     ( select  a.val as target_val
            ,  b.val as lookup_val
         from  main_data   a
         join  new_values  b  on ( a.nm = b.nm )
     )    
     set target_val = lookup_val ;