De-ListAgg, DeListAgg, or Reversing Oracle's ListAgg() Function.
Given the following sample data set shown below, we want to decompose these comma delimited lists into separate values. Each row will become a group of rows containing the elements of its list. Notice that a list may have missing elemets (nothing between the commas). We will also assign a position sequence number to each element so that if needed, the aggregated list could be reconstructed in the same order from the resulting data.
Convert comma separated, aggregated lists... GRP LIST --- --------------------- A sam,joe,bob,ray B (who),(where),(why) C n,,,nulls, D *?%,,=+@#,^$~!*? E e,d,c,b,a F We are done now Into a list of single elements. GRP Position Element --- -------- ------------------ A 1 sam A 2 joe A 3 bob A 4 ray --- -------- ------------------ B 1 (who) B 2 (where) B 3 (why) --- -------- ------------------ C 1 n C 2 C 3 C 4 nulls C 5 --- -------- ------------------ D 1 *?% D 2 D 3 =+@# D 4 ^$~!*? --- -------- ------------------ E 1 e E 2 d E 3 c E 4 b E 5 a --- -------- ------------------ F 1 We are done now --- -------- ------------------ We will even keep track of "Position" as shown. In that way, we will preserve the order of the elements in the original aggregated list.
Here's The Data:
NOTE: The sample data set is given in a WITH clause. To run some of the queries below, you will need to paste the entire WITH clause onto the beginning of those queries.
WITH sample_data as ( select null as grp , null as list from dual where 1=0 -------------------------------------------------------- union all select 'A', 'sam,joe,bob,ray' from dual union all select 'B', '(who),(where),(why)' from dual union all select 'C', 'n,,,nulls,' from dual union all select 'D', '*?%,,=+@#,^$~!*?' from dual union all select 'E', 'e,d,c,b,a' from dual union all select 'F', 'We are done now' from dual -------------------------------------------------------- ) select grp , list from sample_data order by grp ;
Now let's see if we can solve this.
Using the "Translate()" function we can null-out the commas -- then by comparing the before and after lengths, we can determine how many elements there are in each list.
(-- Prepend the complete WITH clause data here to run the query. --) select grp , list , (length(list)-length(translate(list,'x,','x'))+1) as val_cnt from sample_data order by grp ;
Results: GRP LIST VAL_CNT --- ------------------- ------- A sam,joe,bob,ray 4 B (who),(where),(why) 3 C n,,,nulls, 5 D *?%,,=+@#,^$~!*? 4 E e,d,c,b,a 5 F We are done now 1
Expanding the Groups
We now want to create one row for each element in a group. The "ODCINumberList()" function generates a table with a row for each listed number -- in this example it creates 14 rows, more than we need for the given situation; the largest group contains only 5 elements. It does the job, but it would be nice to make this query work no matter what the size of the list (we'll get to that later). If we range-join to this table of numbers and only include the numbers that are equal-to or less-than the value count that we just found, then we get the proper number of rows for each group as shown. We'll call this the "Position" and we'll use it in the next step to cleanly extract the desired element from the list.
(-- Prepend the complete WITH clause data here to run the query. --) select a.grp , a.list , b.column_value as position from sample_data a join table(sys.ODCInumberList(1,2,3,4,5,6,7,8,9,10,11,12,13,14)) b on ( b.column_value <= (length(a.list)-length(translate(a.list,'~,','~'))+1) ) order by a.grp, b.column_value ;
Results: GRP LIST POSITION --- ------------------- -------- A sam,joe,bob,ray 1 A sam,joe,bob,ray 2 A sam,joe,bob,ray 3 A sam,joe,bob,ray 4 B (who),(where),(why) 1 B (who),(where),(why) 2 B (who),(where),(why) 3 C n,,,nulls, 1 C n,,,nulls, 2 C n,,,nulls, 3 C n,,,nulls, 4 C n,,,nulls, 5 D *?%,,=+@#,^$~!*? 1 D *?%,,=+@#,^$~!*? 2 D *?%,,=+@#,^$~!*? 3 D *?%,,=+@#,^$~!*? 4 E e,d,c,b,a 1 E e,d,c,b,a 2 E e,d,c,b,a 3 E e,d,c,b,a 4 E e,d,c,b,a 5 F We are done now 1
Extracting the Elements:
Now by using this newly found position, we can use the "Instr()" and "Substr()" functions to extract the given element from each list. We now have the desired solution -- this is one way to get the results.
(-- Prepend the complete WITH clause data here to run the query. --) select a.grp , a.list , b.column_value as position , substr ( a.list , instr( ','||a.list||',', ',', 1, b.column_value) , instr( ','||a.list||',', ',', 1, b.column_value+1) - instr( ','||a.list||',', ',', 1, b.column_value)-1 ) as elmnt from sample_data a join table(sys.ODCInumberList(1,2,3,4,5,6,7,8,9,10,11,12,13,14)) b on ( b.column_value <= (length(a.list)-length(translate(a.list,'~,','~'))+1) ) order by a.grp, b.column_value ;
Results: GRP LIST POSITION ELMNT --- ------------------- -------- ----------------- A sam,joe,bob,ray 1 sam A sam,joe,bob,ray 2 joe A sam,joe,bob,ray 3 bob A sam,joe,bob,ray 4 ray --- ------------------- -------- ----------------- B (who),(where),(why) 1 (who) B (who),(where),(why) 2 (where) B (who),(where),(why) 3 (why) --- ------------------- -------- ----------------- C n,,,nulls, 1 n C n,,,nulls, 2 C n,,,nulls, 3 C n,,,nulls, 4 nulls C n,,,nulls, 5 --- ------------------- -------- ----------------- D *?%,,=+@#,^$~!*? 1 *?% D *?%,,=+@#,^$~!*? 2 D *?%,,=+@#,^$~!*? 3 =+@# D *?%,,=+@#,^$~!*? 4 ^$~!*? --- ------------------- -------- ----------------- E e,d,c,b,a 1 e E e,d,c,b,a 2 d E e,d,c,b,a 3 c E e,d,c,b,a 4 b E e,d,c,b,a 5 a --- ------------------- -------- ----------------- F We are done now 1 We are done now --- ------------------- -------- -----------------
Extracting the Elements using Regular Expressions:
I much prefer to use the "RegExp_Substr()" function to extract the elements. The search patern ".?([^,]*)" searches for an optional comma followed by non-comma characters. It then returns the non-comma charaters as the element.
(-- Prepend the complete WITH clause data here to run the query. --) select a.grp , a.list , b.column_value as position , regexp_substr( a.list, ',?([^,]*)', 1, b.column_value, 'i', 1 ) as elmnt from sample_data a join table(sys.ODCInumberList(1,2,3,4,5,6,7,8,9,10,11,12,13,14)) b on ( b.column_value <= (length(a.list)-length(translate(a.list,'~,','~'))+1) ) order by a.grp, b.column_value ;
Using Connect-By to Generate the Positions
Instead of the "ODCINumberList()" function we can use the Connect-By functionality. This still has the same problem of having to hard-code an upper limit to the number of elements.
(-- Prepend the complete WITH clause data here to run the query. --) select a.grp , a.list , seq as position , regexp_substr( a.list, ',?([^,]*)', 1, seq, 'i', 1 ) as elmnt from sample_data a join ( select level seq from dual connect by level <= 500 ) b on ( regexp_count(a.list,',')+1 >= seq ) order by a.grp, seq ;
Improved "Connect-By" solution
The Connect-By portion of the query can be modified to fix the upper-limit issue. With this solution you don't need to consider the maximum number of elements in the lists or hard-code them into the query.
WITH sample_data as ( select null as grp , null as list from dual where 1=0 -------------------------------------------------------- union all select 'A', 'sam,joe,bob,ray' from dual union all select 'B', '(who),(where),(why)' from dual union all select 'C', 'n,,,nulls,' from dual union all select 'D', '*?%,,=+@#,^$~!*?' from dual union all select 'E', 'e,d,c,b,a' from dual union all select 'F', 'We are done now' from dual -------------------------------------------------------- ) select grp , level as pos , regexp_substr( ','||list, ',([^,]*)', 1, level, 'i', 1 ) as elmnt from sample_data CONNECT BY ( regexp_count(list,',')+1 >= level and prior grp = grp and prior dbms_random.value is not null ) order by grp, level ;
Results: GRP POS ELMNT --- --- ---------------- A 1 sam A 2 joe A 3 bob A 4 ray B 1 (who) B 2 (where) B 3 (why) C 1 n C 2 C 3 C 4 nulls C 5 D 1 *?% D 2 D 3 =+@# D 4 ^$~!*? E 1 e E 2 d E 3 c E 4 b E 5 a F 1 We are done now --- --- ----------------
Alternate Solution #1 -- Recursive WITH Clause
The list of numbers generated by the connect-by clause can be replaced with a recursive WITH clause (subquery factoring). This clause uses the RegExp_Count() function to determine the number of elements; it then starts with that number and counts down, producing a list of numbers for each group that will then be joined to the origial data.
WITH sample_data as ( select null as grp , null as list from dual where 1=0 -------------------------------------------------------- union all select 'A', 'sam,joe,bob,ray' from dual union all select 'B', '(who),(where),(why)' from dual union all select 'C', 'n,,,nulls,' from dual union all select 'D', '*?%,,=+@#,^$~!*?' from dual union all select 'E', 'e,d,c,b,a' from dual union all select 'F', 'We are done now' from dual -------------------------------------------------------- ) , w_rowgen ( grp, seq ) as ( select grp, regexp_count(list,',')+1 from sample_data UNION ALL select grp, seq-1 from w_rowgen where seq > 1 ) select a.grp , b.seq , regexp_substr( a.list, ',?([^,]*)', 1, b.seq, 'i', 1 ) as elmnt from sample_data a join w_rowgen b on ( a.grp = b.grp ) order by a.grp, b.seq ;
Alternate Solution #2 -- Fully use the Recursive WITH Clause
And, if we're going to use the Recursive-WITH, we might as well use it to extract the elements from each group's list.
WITH sample_data as ( select null as grp , null as list from dual where 1=0 -------------------------------------------------------- union all select 'A', 'sam,joe,bob,ray' from dual union all select 'B', '(who),(where),(why)' from dual union all select 'C', 'n,,,nulls,' from dual union all select 'D', '*?%,,=+@#,^$~!*?' from dual union all select 'E', 'e,d,c,b,a' from dual union all select 'F', 'We are done now' from dual -------------------------------------------------------- ) , w_rowgen ( grp, list, seq, elmnt ) as ( select grp , list , regexp_count(list,',')+1 , regexp_substr( list, '^.*?,?([^,]*)$' , 1, 1, 'i', 1 ) from sample_data UNION ALL select grp , list , seq-1 , regexp_substr( list, ',?([^,]*)', 1, seq-1, 'i', 1 ) from w_rowgen where seq > 1 ) select grp, seq, elmnt from w_rowgen order by grp, seq ;
--------------------------------------------------------------------- -- Update: (2016-04-07) -- -- The following, in my oppinion, is a better solution. There are -- slight changes from the previous version. It will now handle -- leading and trailing NULL values in the lists properly. --------------------------------------------------------------------- WITH sample_data as ( select null as grp , null as list from dual where 1=0 -------------------------------------------------------- union all select 'A', 'sam,joe,bob,ray' from dual union all select 'B', '(who),(where),(why)' from dual union all select 'C', 'n,,,nulls,' from dual union all select 'D', '*?%,,=+@#,^$~!*?' from dual union all select 'E', 'e,d,c,b,a' from dual union all select 'F', 'We are done now' from dual union all select 'm', '' from dual union all select 'n', ',' from dual union all select 'o', ',,' from dual union all select 'p', ',x,' from dual union all select 'q', ',aa,,bb' from dual union all select 'r', 'mmm,oo,' from dual -------------------------------------------------------- ) , w_rowgen (grp, list, seq, val) as ( select grp, list , regexp_count(list, ',')+1 , regexp_substr(list, ',?([^,]*)$', 1, 1, 'i', 1) from sample_data UNION ALL select grp, list , seq-1 , regexp_substr(list, '([^,]*),', 1, seq-1, 'i', 1) from w_rowgen where seq > 1 ) select grp, seq, val from w_rowgen order by grp, seq ;
Alternate Solution #3 -- Using Oracle's XML Functionality
Finally, we could use Oracle's XML functionality to extract the elements:
(-- Prepend the complete WITH clause data here to run the query. --) select a.grp , dense_rank() over ( partition by a.list order by rownum ) as seq , trim( x.column_value.extract( 'e/text()' ) ) as elmnt from sample_data a join table ( xmlsequence ( xmltype ( '<e><e>' || replace( a.list, ',', '</e><e>' ) || '</e></e>' ).extract ( 'e/e' ) ) ) x on ( 1=1 ) order by grp, seq ;
That's it... See my other post on creating Aggregated Lists from expanded elements.
I've been surfing online more than 2 hours today, yet I never found
ReplyDeleteany interesting article like yours. It's pretty worth enough for me.
In my view, if all web owners and bloggers made good content as
you did, the net will be much more useful than ever before.
Look into my page ... holztreppen
Thanks for the kind words!
ReplyDeleteThis is an excellent resource!
ReplyDeleteThis site was... how do I say it? Relevant!! Finally I've found something
ReplyDeletethat helped me. Kudos!
Look at my page :: klimatyzacja
Excellent work, Steve. You really helped me out.
ReplyDeleteReally really useful, this has helped me immensely. Thank you
ReplyDeleteGod bless you !! Amazing resource !!
ReplyDeleteFantastic article - many thanks Steve!
ReplyDelete