Friday, July 19, 2013

Aggregated Lists: Creating Aggregated List Strings - ListAgg() Function.


Oracle's ListAgg() Function -- Or Other Methods.

Given the following sample data shown below, we want to aggregate the elements into a single, comma separated list for each group. The position is given to define the order in which the elements will appear in the list. Also, the NULL elements should not be ignored -- they should show up as nothing between the commas that define their potential placement in the list.


   
  Convert groups of elements into
  comma separated, aggregated lists...

    GRP  POS  ELMNT
    ---  ---  -----
    A      1  bob
    A      2
    A      3
    A      4  joe
    A      5
    ---  ---  -----
    B      1  342
    B      2  0
    B      3  47
    ---  ---  -----
    C      1  # #
    ---  ---  -----
    D      1  big
    D      2  foot
    ---  ---  -----
    E      1  a
    E      2  b
    E      3
    E      4  c
    E      5  d
    E      6  e
    ---  ---  -----


  For each group above, combine all the
  elements into a single, comma separated
  list of elements in the order given by
  the position.
  
    GRP  LIST
    ---  -----------
    A    bob,,,joe,
    B    342,0,47
    C    # #
    D    big,foot
    E    a,b,,c,d,e
    ---  -----------
    



Here's The Data:

NOTE: The sample data set is given in a WITH clause. To run some of the queries below, you will need to paste the entire WITH clause onto the beginning of those queries.


   WITH 
     sample_data   as
       ( select  NULL  as grp
              ,  NULL  as pos
              ,  NULL  as elmnt
           from  dual  where 1=0
         --------------------------------------
         union select  'A', 1, 'bob'  from dual
         union select  'A', 2,  null  from dual
         union select  'A', 3,  null  from dual
         union select  'A', 4, 'joe'  from dual
         union select  'A', 5,  null  from dual
         union select  'B', 1, '342'  from dual
         union select  'B', 2, '0'    from dual
         union select  'B', 3, '47'   from dual
         union select  'C', 1, '# #'  from dual
         union select  'D', 1, 'big'  from dual
         union select  'D', 2, 'foot' from dual
         union select  'E', 1, 'a'    from dual
         union select  'E', 2, 'b'    from dual
         union select  'E', 3,  null  from dual
         union select  'E', 4, 'c'    from dual
         union select  'E', 5, 'd'    from dual
         union select  'E', 6, 'e'    from dual
         --------------------------------------
       )
   select  grp
        ,  pos
        ,  elmnt
     from  sample_data
    order  by  grp, pos ;
    



    
  The Data:
      
    GRP  POS  ELMNT
    ---  ---  -----
    A      1  bob
    A      2
    A      3
    A      4  joe
    A      5
    B      1  342
    B      2  0
    B      3  47
    C      1  # #
    D      1  big
    D      2  foot
    E      1  a
    E      2  b
    E      3
    E      4  c
    E      5  d
    E      6  e
    ---  ---  -----




Using ListAgg() as an Aggregate (Group-By) Function

Notice how the simple use of ListAgg() ignores the NULL values. To recognize the NULLs we first substitute some dummy value for the null "~"; then outside of the ListAgg() function we replace the dummy value with NULL. In this way we can overcome the limitation of Oracl's ListAgg() function.


   
   (-- Prepend the complete WITH clause data here to run the query. --)
   select  grp
        ,  ListAgg( elmnt, ',' ) within group (order by pos)  as list
     from  sample_data 
    group  by  grp ;




     
  First Results:
       
    GRP  LIST
    ---  ---------
    A    bob,joe
    B    342,0,47
    C    # #
    D    big,foot
    E    a,b,c,d,e
 



   (-- Prepend the complete WITH clause data here to run the query. --)
   select  grp
        ,  replace
             ( ListAgg(nvl(elmnt,'~'), ',') within group (order by pos)
             , '~', null 
             )  as list
     from  sample_data 
    group  by  grp ;




     
  Correct Results:
  
    GRP  LIST
    ---  ----------
    A    bob,,,joe,
    B    342,0,47
    C    # #
    D    big,foot
    E    a,b,,c,d,e
 



As a Side Note...

The ListAgg() function can be used as an "Analytic" function. The syntax is a little different from other analytic functions, but the results look like you would expect.


   
   (-- Prepend the complete WITH clause data here to run the query. --)
   select  grp
        ,  pos
        ,  elmnt
        ,  replace
             ( listagg(nvl(elmnt,'~'), ',') 
                 within group (order by pos) 
                 over ( partition by grp )
             , '~'
             , NULL 
             )  as list
     from  sample_data 
    order  by  grp, pos ;




    
  Results:

    GRP  POS  ELMNT  LIST
    ---  ---  -----  -----------
    A      1  bob    bob,,,joe,
    A      2         bob,,,joe,
    A      3         bob,,,joe,
    A      4  joe    bob,,,joe,
    A      5         bob,,,joe,
    ---  ---  -----  -----------
    B      1  342    342,0,47
    B      2  0      342,0,47
    B      3  47     342,0,47
    ---  ---  -----  -----------
    C      1  # #    # #
    ---  ---  -----  -----------
    D      1  big    big,foot
    D      2  foot   big,foot
    ---  ---  -----  -----------
    E      1  a      a,b,,c,d,e
    E      2  b      a,b,,c,d,e
    E      3         a,b,,c,d,e
    E      4  c      a,b,,c,d,e
    E      5  d      a,b,,c,d,e
    E      6  e      a,b,,c,d,e
    ---  ---  -----  -----------




Recursive WITH Instead of ListAgg()

Oracle's Recursive WITH (subquery factoring) can be used instead of the ListAgg() function. The recursive part of the WITH clause is the "w_agg" portion; the list of parameters that immediately follow the "w_agg" tells you that it's different from the other parts of a normal WITH clause. The pre-Union clause of the Recursive WITH clause returns the first element of each group allong with the maximum number of elements in each group. The post-Union clause joins the running data with the "w_agg" data to add on each additional element.


   
   WITH 
     sample_data   as
       ( select  NULL  as grp
              ,  NULL  as pos
              ,  NULL  as elmnt
           from  dual  where 1=0
         --------------------------------------
         union select  'A', 1, 'bob'  from dual
         union select  'A', 2,  null  from dual
         union select  'A', 3,  null  from dual
         union select  'A', 4, 'joe'  from dual
         union select  'A', 5,  null  from dual
         union select  'B', 1, '342'  from dual
         union select  'B', 2, '0'    from dual
         union select  'B', 3, '47'   from dual
         union select  'C', 1, '# #'  from dual
         union select  'D', 1, 'big'  from dual
         union select  'D', 2, 'foot' from dual
         union select  'E', 1, 'a'    from dual
         union select  'E', 2, 'b'    from dual
         union select  'E', 3,  null  from dual
         union select  'E', 4, 'c'    from dual
         union select  'E', 5, 'd'    from dual
         union select  'E', 6, 'e'    from dual
         --------------------------------------
       )
   , w_agg ( grp, pos, list, mpos ) as
       ( select  grp, pos
              ,  cast(elmnt as varchar2(2000))
              ,  mpos
           from  ( select  grp, pos, elmnt
                        ,  max(pos) over (partition by grp)  as mpos  
                     from  sample_data
                 )
          where  pos = 1
       UNION ALL
         select  a.grp
              ,  b.pos
              ,  a.list || ',' || b.elmnt
              ,  a.mpos
           from  w_agg        a
           join  sample_data  b  
             on  ( a.grp = b.grp  and  a.pos = (b.pos-1) ) 
       )
   select  grp, list
     from  w_agg
    where  pos = mpos
    order  by  grp ;



    
  Results:
      
    GRP  LIST
    ---  ----------
    A    bob,,,joe,
    B    342,0,47
    C    # #
    D    big,foot
    E    a,b,,c,d,e




A "Connect-By" Solution

The "Connect-By" feature of Oracle's SQL can also be used to process the solution.


   
   (-- Prepend the complete WITH clause data here to run the query. --)
   select  grp
        ,  list
     from  ( select  grp
                  ,  pos
                  ,  max(pos) over (partition by grp) as mpos
                  ,  substr(sys_connect_by_path(elmnt, ','),2) as list
               from  sample_data
            connect  by ( prior grp = grp  and  prior pos = pos-1 )
                     start with  pos = 1
           )
    where  pos = mpos
    order  by  grp ;




Another "Connect-By" Solution:

This solution assumes that you can just take the longest list in each group as the final answer. The previous solution explicitly found the final list that was constructed for each group ( pos = mpos ).


   
   (-- Prepend the complete WITH clause data here to run the query. --)
   select  grp
        ,  substr( max(sys_connect_by_path(elmnt, ',')), 2 ) as list
     from  sample_data
    group  by  grp
  connect  by ( prior grp = grp  and  prior pos = pos-1 )
           start with  pos = 1
    order  by  grp ;





That's it... See my other post on decomposing Aggregated Lists into individual elements.

No comments:

Post a Comment