Friday, September 30, 2011

Oracle PIVOT and UNPIVOT to Invert Data

Note: Prior to going through this example, you may wish to make sure you understand how to use Oracle's basic PIVOT and UNPIVOT clauses.

Let's suppose we have some data that we would like to "Invert" -- the rows become the columns and the columns become the rows. Here's a simple example that has account numbers for rows and quarters as columns. Note that the "Inversion" is shown with the quarters as rows and the account numbers as the columns, and the measures have been retained to correspond with the same quarters and accounts:





So, here is our sample data and the first part of the solution. We'll do an UNPIVOT to bring back all the measures into a single column:

/* ===== code ===== */     
   
   WITH
     w_data  as
       ( select  11 acct, 23.29 q_1, 81.72 q_2, 16.72 q_3, 60.00 q_4  from dual union all
         select  35     , null     , 11.53    , 51.27    , 76.14      from dual union all
         select  44     , 53.63    , 38.58    , 29.76    , 47.12      from dual
       )
   select  *
     from  w_data
  UNPIVOT  include nulls
           (   m1   FOR   qtr  IN
             ( q_1   as   1
             , q_2   as   2
             , q_3   as   3
             , q_4   as   4
             )
           )
    order  by 1 ;


Notice that we created the QTR column and assigned the proper identifying values to the previous column names. We now have the two identifiers (ACCT and QTR) and one column that contains all the measures:





For the next step we could create a sub-query with the previous code and then modify that result with a PIVOT clause. However, we can have both a PIVOT and an UNPIVOT clause in the same query.

/* ===== code ===== */     
   
   WITH
     w_data  as
       ( select  11 acct, 23.29 q_1, 81.72 q_2, 16.72 q_3, 60.00 q_4  from dual union all
         select  35     , null     , 11.53    , 51.27    , 76.14      from dual union all
         select  44     , 53.63    , 38.58    , 29.76    , 47.12      from dual
       )
   select  *
     from  w_data
  UNPIVOT  include nulls
           (   m1   FOR   qtr  IN
             ( q_1   as   1
             , q_2   as   2
             , q_3   as   3
             , q_4   as   4
             )
           )
    PIVOT  ( sum(m1) 
           FOR  acct IN
             (  11   as  acct_11
             ,  35   as  acct_35
             ,  44   as  acct_44
             )
           ) 
    order  by 1 ;


Here in the added PIVOT clause we create the three new columns for the three account values. We must use an aggregate function on the measure (sum(m1),  max(m1), ...) because the PIVOT clause turns the query into an aggregate (group-by) query. The QTR column remains, but the ACCT column is replaced by the three, more specific, account columns. The data-set has now been "Inverted":



Oracle's UNPIVOT Clause -- Going Further

Let's start with a nice little data set that has some un-pivot potential:

/* ===== code ===== */     
   
   WITH
     w_data  as
       ( select  'NY' st, 1 rgn, 13 q1cnt, 245.25 q1amt, 13 q2cnt, 245.25 q2amt, 13 q3cnt, 245.25 q3amt, 13 q4cnt, 245.25 q4amt  from dual union all
         select  'NY'   , 2    , 43      , 760.34      , 38      , 842.11      , 39      , 425.98      , 54      , 571.53        from dual union all
         select  'NY'   , 3    , 34      , 743.21      , 38      , 627.24      , null    , 463.74      , 41      , 154.96        from dual union all
         select  'HI'   , 1    , 14      , 256.13      , 11      , 532.48      , 10      , 246.37      , 18      , 261.37        from dual union all
         select  'OR'   , 1    , 73      , 758.14      , 71      , 132.84      , 63      , 623.74      , 82      , 851.92        from dual union all
         select  'OR'   , 2    , null    , null        , 62      , 275.11      , 73      , 846.32      , 70      , 253.19        from dual union all
         select  'UT'   , 1    , 27      , 156.38      , 32      , 637.19      , 31      , 256.48      , 39      , 146.75        from dual union all
         select  'UT'   , 2    , 24      , 264.82      , 20      , 423.61      , 21      , null        , 18      , 367.88        from dual union all
         select  'MT'   , 1    ,  3      , 374.28      ,  8      , 125.35      ,  5      , 423.66      ,  7      , 536.14        from dual 
       )
   select * from w_data ;
   

Here we have a couple of identifying columns (State and Region) that don't really have anything to do with the Un-Pivot -- they will just need to be associated with any measure that appears in their row. We also have eight columns of measures, two measures for each quarter of the year (Count and Amount). Un-Pivoting the data could mean that we take all these measures and un-pivot them into one single measures column, or it could mean that we un-pivot the four "Count" columns into one measures column and the four "Amount" columns into another measures column.





For the first example, let's start with that last one mentioned -- we'll un-pivot the data into two measure columns. We'll have a single "CNT" column and a seperate "AMT" column for the measures. We'll also create a new "QTR" colum that will contain an identifier that tells us which quarter the two measures came from.

Here is the solution. See the explanation and results below:

/* ===== code ===== */     
   
   WITH
     w_data  as
       ( select  'NY' st, 1 rgn, 13 q1cnt, 245.25 q1amt, 13 q2cnt, 245.25 q2amt, 13 q3cnt, 245.25 q3amt, 13 q4cnt, 245.25 q4amt  from dual union all
         select  'NY'   , 2    , 43      , 760.34      , 38      , 842.11      , 39      , 425.98      , 54      , 571.53        from dual union all
         select  'NY'   , 3    , 34      , 743.21      , 38      , 627.24      , null    , 463.74      , 41      , 154.96        from dual union all
         select  'HI'   , 1    , 14      , 256.13      , 11      , 532.48      , 10      , 246.37      , 18      , 261.37        from dual union all
         select  'OR'   , 1    , 73      , 758.14      , 71      , 132.84      , 63      , 623.74      , 82      , 851.92        from dual union all
         select  'OR'   , 2    , null    , null        , 62      , 275.11      , 73      , 846.32      , 70      , 253.19        from dual union all
         select  'UT'   , 1    , 27      , 156.38      , 32      , 637.19      , 31      , 256.48      , 39      , 146.75        from dual union all
         select  'UT'   , 2    , 24      , 264.82      , 20      , 423.61      , 21      , null        , 18      , 367.88        from dual union all
         select  'MT'   , 1    ,  3      , 374.28      ,  8      , 125.35      ,  5      , 423.66      ,  7      , 536.14        from dual 
       )
   select  st,  rgn
        ,  qtr
        ,  cnt,  amt
     from  w_data
  UNPIVOT  include nulls
             (   (   cnt,   amt ) FOR  qtr  IN
               ( ( q1cnt, q1amt )  as   1 
               , ( q2cnt, q2amt )  as   2 
               , ( q3cnt, q3amt )  as   3 
               , ( q4cnt, q4amt )  as   4 
               )
             )
  --where  nvl( cnt, 0 ) < 30
    order  by 1, 2, 3 ;


Explanation:

The "UNPIVOT" clause is placed immediately after the "FROM" clause. If you have a "WHERE" clause it will follow the "UNPIVOT" clause as shown.

The "Include Nulls" phrase will include an output row even if all the newly created measure columns are NULL. Take these two words out of the query and observe how the result-set changes.

Inside the "UNPIVOT" clause we start by creating the set of new measure columns (CNT and AMT). Notice that because we are creating a set (more than one) of new measures we put them in parentesis. These columns will contain all the un-pivoted measures. After the keyword "FOR" we create a new column (QTR) that will contain an identifier that tells us which quarter the measures came from (which original set of columns the specific count and amount measures came from).

After the "IN" keyword we include a parenthesis enclosed list that associates each set of original measure columns with an identifying value. For example, the original columns (Q1CNT and Q1AMT) are assigned the identifier of "1" -- this is the value that will appear in the new QTR column allong side the measures from those two columns.

And, one last item, if you un-comment (remove the "--" from the beginning of) the where clause, then the result-set will filter out any records that have a CNT of 30 or greater.





Notice that in the following example, all eight measure columns are un-pivoted into one single measure column (VAL). The new "VAL_ID" column will contain a text identifier that not only tells which quarter the measure comes from, but also tells us whether it is a "Count" or an "Amount" value.

/* ===== code ===== */     
   
   WITH
     w_data  as
       ( select  'NY' st, 1 rgn, 13 q1cnt, 245.25 q1amt, 13 q2cnt, 245.25 q2amt, 13 q3cnt, 245.25 q3amt, 13 q4cnt, 245.25 q4amt  from dual union all
         select  'NY'   , 2    , 43      , 760.34      , 38      , 842.11      , 39      , 425.98      , 54      , 571.53        from dual union all
         select  'NY'   , 3    , 34      , 743.21      , 38      , 627.24      , null    , 463.74      , 41      , 154.96        from dual union all
         select  'HI'   , 1    , 14      , 256.13      , 11      , 532.48      , 10      , 246.37      , 18      , 261.37        from dual union all
         select  'OR'   , 1    , 73      , 758.14      , 71      , 132.84      , 63      , 623.74      , 82      , 851.92        from dual union all
         select  'OR'   , 2    , null    , null        , 62      , 275.11      , 73      , 846.32      , 70      , 253.19        from dual union all
         select  'UT'   , 1    , 27      , 156.38      , 32      , 637.19      , 31      , 256.48      , 39      , 146.75        from dual union all
         select  'UT'   , 2    , 24      , 264.82      , 20      , 423.61      , 21      , null        , 18      , 367.88        from dual union all
         select  'MT'   , 1    ,  3      , 374.28      ,  8      , 125.35      ,  5      , 423.66      ,  7      , 536.14        from dual 
       )
   select  st,  rgn
        ,  val_id,  val
     from  w_data
  UNPIVOT  include nulls
             (   val    FOR   val_id        IN
               ( q1cnt   as  'Qtr-1 Count'
               , q2cnt   as  'Qtr-2 Count'
               , q3cnt   as  'Qtr-3 Count'
               , q4cnt   as  'Qtr-4 Count'
               , q1amt   as  'Qtr-1 Amount'
               , q2amt   as  'Qtr-2 Amount'
               , q3amt   as  'Qtr-3 Amount'
               , q4amt   as  'Qtr-4 Amount'
               )
             )
    order  by 1, 2, 3 ;


/* --- Partial Results --- */

     --  ST  RGN  VAL_ID        VAL
     --  --  ---  ------------  -------
     --  HI  1    Qtr-1 Amount  256.13
     --  HI  1    Qtr-1 Count   14
     --  HI  1    Qtr-2 Amount  532.48
     --  HI  1    Qtr-2 Count   11
     --  HI  1    Qtr-3 Amount  246.37
     --  HI  1    Qtr-3 Count   10
     --  HI  1    Qtr-4 Amount  261.37
     --  HI  1    Qtr-4 Count   18
     --  MT  1    Qtr-1 Amount  374.28
     --  MT  1    Qtr-1 Count   3
     --  ...




This next example is very similar to the last one. In this query, there are two identifying columns created (VAL_QTR and VAL_TYP) -- the identifying information here is split between these two columns. The first tells us which quarter the measure came from, and the second column tells us whether it was a "Count" or an "Amount" value.

/* ===== code ===== */     
   
   WITH
     w_data  as
       ( select  'NY' st, 1 rgn, 13 q1cnt, 245.25 q1amt, 13 q2cnt, 245.25 q2amt, 13 q3cnt, 245.25 q3amt, 13 q4cnt, 245.25 q4amt  from dual union all
         select  'NY'   , 2    , 43      , 760.34      , 38      , 842.11      , 39      , 425.98      , 54      , 571.53        from dual union all
         select  'NY'   , 3    , 34      , 743.21      , 38      , 627.24      , null    , 463.74      , 41      , 154.96        from dual union all
         select  'HI'   , 1    , 14      , 256.13      , 11      , 532.48      , 10      , 246.37      , 18      , 261.37        from dual union all
         select  'OR'   , 1    , 73      , 758.14      , 71      , 132.84      , 63      , 623.74      , 82      , 851.92        from dual union all
         select  'OR'   , 2    , null    , null        , 62      , 275.11      , 73      , 846.32      , 70      , 253.19        from dual union all
         select  'UT'   , 1    , 27      , 156.38      , 32      , 637.19      , 31      , 256.48      , 39      , 146.75        from dual union all
         select  'UT'   , 2    , 24      , 264.82      , 20      , 423.61      , 21      , null        , 18      , 367.88        from dual union all
         select  'MT'   , 1    ,  3      , 374.28      ,  8      , 125.35      ,  5      , 423.66      ,  7      , 536.14        from dual 
       )
   select  st,  rgn
        ,  val_qtr, val_typ, val
     from  w_data
  UNPIVOT  include nulls
             (   val    FOR  ( val_qtr, val_typ  ) IN
               ( q1cnt   as  ( 'Qtr-1', 'Count'  )
               , q2cnt   as  ( 'Qtr-2', 'Count'  )
               , q3cnt   as  ( 'Qtr-3', 'Count'  )
               , q4cnt   as  ( 'Qtr-4', 'Count'  )
               , q1amt   as  ( 'Qtr-1', 'Amount' )
               , q2amt   as  ( 'Qtr-2', 'Amount' )
               , q3amt   as  ( 'Qtr-3', 'Amount' )
               , q4amt   as  ( 'Qtr-4', 'Amount' )
               )
             )
    order  by 1, 2, 3, 4 ;


/* --- Partial Results --- */

     --  ST  RGN  VAL_QTR  VAL_TYP  VAL
     --  --  ---  -------  -------  -------
     --  HI  1    Qtr-1    Amount   256.13
     --  HI  1    Qtr-1    Count    14
     --  HI  1    Qtr-2    Amount   532.48
     --  HI  1    Qtr-2    Count    11
     --  HI  1    Qtr-3    Amount   246.37
     --  HI  1    Qtr-3    Count    10
     --  HI  1    Qtr-4    Amount   261.37
     --  HI  1    Qtr-4    Count    18
     --  MT  1    Qtr-1    Amount   374.28
     --  MT  1    Qtr-1    Count    3
     --  ...




This final example shows us that we can create as many identifying columns as we wish to include; and that we can define their values to just about anything we would need.

/* ===== code ===== */     
   
   WITH
     w_data  as
       ( select  'NY' st, 1 rgn, 13 q1cnt, 245.25 q1amt, 13 q2cnt, 245.25 q2amt, 13 q3cnt, 245.25 q3amt, 13 q4cnt, 245.25 q4amt  from dual union all
         select  'NY'   , 2    , 43      , 760.34      , 38      , 842.11      , 39      , 425.98      , 54      , 571.53        from dual union all
         select  'NY'   , 3    , 34      , 743.21      , 38      , 627.24      , null    , 463.74      , 41      , 154.96        from dual union all
         select  'HI'   , 1    , 14      , 256.13      , 11      , 532.48      , 10      , 246.37      , 18      , 261.37        from dual union all
         select  'OR'   , 1    , 73      , 758.14      , 71      , 132.84      , 63      , 623.74      , 82      , 851.92        from dual union all
         select  'OR'   , 2    , null    , null        , 62      , 275.11      , 73      , 846.32      , 70      , 253.19        from dual union all
         select  'UT'   , 1    , 27      , 156.38      , 32      , 637.19      , 31      , 256.48      , 39      , 146.75        from dual union all
         select  'UT'   , 2    , 24      , 264.82      , 20      , 423.61      , 21      , null        , 18      , 367.88        from dual union all
         select  'MT'   , 1    ,  3      , 374.28      ,  8      , 125.35      ,  5      , 423.66      ,  7      , 536.14        from dual 
       )
   select  st,   rgn
        ,  qtr,  qnm,  qrn,  qwrd,  qna,  qnb
        ,  cnt,  amt
     from  w_data
  UNPIVOT  include nulls
             (   (   cnt,   amt )  FOR  ( qtr ,        qnm  ,  qrn  ,   qwrd  ,  qna ,  qnb ) IN
               ( ( q1cnt, q1amt )   as  (   1 , 'Quarter-1' ,   'I' ,   'One' , 0.00 , 0.25 )
               , ( q2cnt, q2amt )   as  (   2 , 'Quarter-2' ,  'II' ,   'Two' , 0.25 , 0.50 )
               , ( q3cnt, q3amt )   as  (   3 , 'Quarter-3' , 'III' , 'Three' , 0.50 , 0.75 )
               , ( q4cnt, q4amt )   as  (   4 , 'Quarter-4' ,  'IV' ,  'Four' , 0.75 , 1.00 )
               )
             )
    order  by 1, 2, 3 ;


/* --- Partial Results --- */

     --  ST  RGN  QTR  QNM        QRN  QWRD   QNA    QNB    CNT  AMT
     --  --  ---  ---  ---------  ---  -----  -----  -----  ---  -------
     --  HI  1    1    Quarter-1  I    One    0      0.25    14  256.13
     --  HI  1    2    Quarter-2  II   Two    0.25   0.5     11  532.48
     --  HI  1    3    Quarter-3  III  Three  0.5    0.75    10  246.37
     --  HI  1    4    Quarter-4  IV   Four   0.75   1       18  261.37
     --  MT  1    1    Quarter-1  I    One    0      0.25     3  374.28
     --  MT  1    2    Quarter-2  II   Two    0.25   0.5      8  125.35
     --  ...


Thursday, September 22, 2011

Oracle's UNPIVOT Clause -- A Simple Example


To "Un-Pivot" data generally means to take the values from multiple columns and place them in a single column.

Here we have a table that contains measures for each quarter of the year for various accounts. We would like to 'un-pivot' the data so that all the measures that are currently contained in the four columns ( Q1, Q2, Q3, and Q4 ) will be located in one single column. In order to not lose any information we will also need a new column that will contain an indicator that tells us which column (which quarter, in this case) the specific measure came from.




Here's the "UNPIVOT" query (it also contains the sample data above) that will do the trick:

/* ===== code ===== */     
   

   WITH
     w_data  as
       ( select  11 acct, 23.29 q1, 81.72 q2, 16.72 q3, 60.00 q4  from dual union all
         select  35     , null    , 11.53   , 51.27   , 76.14     from dual union all
         select  44     , 53.63   , 38.58   , 29.76   , 47.12     from dual
       )
   select  acct,  qtr,  m1
     from  w_data
  UNPIVOT  include nulls
           (   m1  FOR  qtr  IN
             ( q1   as   1
             , q2   as   2
             , q3   as   3
             , q4   as   4
             )
           )
    order  by 1, 2 ;



Explanation: 

We add the "UNPIVOT" clause immediately after the "FROM" clause. If we want a place-holder for each account and quarter (even if the measure is NULL) then we can include the phrase "Include Nulls" as shown. Inside the main parenthesis of the UNPIVOT clause we first create two new columns. Prior to the keyword "FOR" we create the column that will contain the measures (m1). After the keyword "FOR" we create the column that will contain the identifying values that tell us which original column the measures came from (qtr). Then, following the keyword "IN" we list the columns and idetifying values as shown above. The following is an English translation of what is being said in this "IN" clause:

          Values in col. Q1 will be placed in the new col. M1 and identified by a 1 in col. QTR.
          Values in col. Q2 will be placed in the new col. M1 and identified by a 2 in col. QTR.
          Values in col. Q3 will be placed in the new col. M1 and identified by a 3 in col. QTR.
          Values in col. Q4 will be placed in the new col. M1 and identified by a 4 in col. QTR.

And, of course, the value in the ACCT column will stay the same for each measure brought into the result-set. The result of this query is shown below:





The previous example shows how a simple un-pivot statement works. In more complex un-pivot queries, you can have as many columns as you wish in order to show the desired identifying values and you can pivot the multiple columns in your source data into multiple measure columns in the result set. (More to come.)