资讯专栏INFORMATION COLUMN

Oracle 19c New FeaturesLISTAGG函数功能增强

IT那活儿 / 1450人阅读
Oracle 19c New FeaturesLISTAGG函数功能增强

一. LISTAGG函数简介

LISTAGG函数是11g R2引入用于合并字符串(列转行),可替换wmsys.wm_concat功能,而且效率更好。wm_concat函数是自定义聚集函数,而且是undocument的,从12C开始已经去掉,从11g R2开始,有字符串合并的需求,最好使用LISTAGG替代,以获得更好的性能。 

但是,LISTAGG函数在11g R2中有个缺点,就是不能直接DISTINCT,在ORACLE 19C中,ORACLE给它增加了DISTINCT功能,这样可以剔除重复的字符串合并。19C的完整LISTAGG语法如下:

从语法图上可以看出,LISTAGG函数可以是普通的组函数,也可以用于分析函数,并且12C开始增加了OVERFLOW语法,用于字符串过长的处理。


二. 使用LISTAGG函数实现字符串合并

下例所示:

需求:对emp表,按照部门分组,按逗号合并部门员工名。

使用LISTAGG实现如下:

select deptno, listagg(ename,,) within group(order by deptno) as enames
  from emp
  group by deptno
order by deptno;
 
DEPTNO ENAMES
------------------------------------------------------------------------------
    10 CLARK,KING,MILLER
    20 ADAMS,FORD,JONES,SCOTT,SMITH
    30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD 

下面给DEPTNO=10的部门插入2个重复名字的员工,如下所示:

INSERT INTO emp VALUES (8000,KING,ANALYST,7782,to_date(1983-1-1,yyyy-mm-dd),2000,NULL,10);
INSERT INTO emp VALUES (9000,KING,MANADER,7782,to_date(1984-5-1,yyyy-mm-dd),2500,NULL,10);
COMMIT;

再次查询,发现有重复的数据:

SQL> select deptno, listagg(ename,,) within group(order by deptno) as enames
  2    from emp
  3    group by deptno
  4  order by deptno;
 
DEPTNO ENAMES
------ --------------------------------------------------------------------------------
    10 CLARK,KING,KING,KING,MILLER
    20 ADAMS,FORD,JONES,SCOTT,SMITH
    30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

  一般遇到这种有重复数据的情况,需要剔除重复数据,在19C之前,需要先剔除重复数据: 

--红色部分使用分析函数剔重,当然这里直接distinct也可以,结果与最前面的一致
select deptno, listagg(ename, ,) within group(order by deptno) as enames
  from (select deptno,
               ename,
               row_number() over(partition by deptno, ename order by empno) rn
          from emp)
 where rn = 1
 group by deptno
 order by deptno;
 
或者直接DISTINCT :
select deptno, listagg(ename, ,) within group(order by deptno) as enames
  from (select distinct deptno,
               ename
          from emp)
 group by deptno
 order by deptno;
 
DEPTNO ENAMES
-------------------------------------------------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

到了19C,LISTAGG提供了直接DISTINCT的功能,可以很简单地实现剔除重复数据,然后合并,如下所示:

select deptno, listagg(distinct ename,,) within group(order by deptno) as enames
  from emp
 group by deptno
 order by deptno;
 
DEPTNO ENAMES
-------------------------------------------------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

默认情况下包含所有数据(ALL不用写默认),也就是LISTAGG(ALL column),如下所示:

select deptno, listagg(all ename,,) within group(order by deptno) as enames
  from emp
 group by deptno
 order by deptno; 


三. WM_CONCAT函数与LISTAGG函数比较
果使用wm_concat,则是:
select deptno, wm_concat(distinct ename) as enames
  from emp
 group by deptno
 order by deptno;
 
DEPTNO ENAMES
------ --------------------------------------------------------------------------------
    10 CLARK,KING,MILLER
    20 ADAMS,FORD,JONES,SCOTT,SMITH
    30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

wm_concat这个undocument函数可以直接distinct,但是它与LISTAGG比缺点很明显,一是性能差,而是undocument函数,比较如下:

  • wm_concat可以distinct去除重复的(用于分析函数,distinct不能带order by),19C之前的listagg不可以,可以先剔除重复,然后做listagg,19C之后的listagg可以distinct。

  • wm_concat不能保证排序,listagg可以保证排序.

  • WMSYS.WM_CONCAT是undocument的函数,最好不要使用,要使用也用自定义分析函数


四. 使用ON OVERFLOW处理字符串长度溢出问题
LISTAGG函数在12.2开始,合并长度最多32767字节,依赖于MAX_STRING_SIZE参数,如下所示:
  • 如果MAX_STRING_SIZE=EXTEND,则对于VARCHAR2和RAW类型,最多返回32767字节

  • 如果MAX_STRING_SIZE=STANDARD,则对于VARCHAR2最多4000字节,对于RAW类型最多2000字节 

那么在合并的字符串超过限制,溢出时,默认报错,但是12.2引入了ON OVERFLOW可以截断处理,如下: 

ON OVERFLOW TRUNCATE默认对溢出数据后面用…(count),如下所示:

SELECT deptno, LISTAGG(ename, ,  ON OVERFLOW TRUNCATE) WITHIN GROUP (ORDER BY deptno) AS enames
FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))
GROUP BY deptno
ORDER BY deptno;
 
DEPTNO ENAMES
--------------------------------------------------------------------------------------------
    10 CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLAR
       省略
       ARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,
       CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,...(4334)
 
    20 ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAM
       S,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,AD
       AMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,
       省略
       ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,...(4334)
 
    30 ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLE
       N,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,AL
       LEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,
      省略
       LEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,
       ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,...(5333)

其他用法如下:

--1.改变省略的格式,换成~~~
SELECT deptno, LISTAGG(ename, , ON OVERFLOW TRUNCATE ~~~) WITHIN GROUP (ORDER BY deptno) AS enames
FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))
GROUP BY deptno
ORDER BY deptno;
 
--2.通过WINTHOUT COUNT省略掉计数
SELECT deptno, LISTAGG(ename, , ON OVERFLOW TRUNCATE WITHOUT COUNT) WITHIN GROUP (ORDER BY deptno) AS enames
FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))
GROUP BY deptno
ORDER BY deptno;
 
--3.默认格式ON OVERFLOW ERROW,溢出则报错ORA-01489: result of string concatenation is too long
 
SELECT deptno, LISTAGG(ename, , ON OVERFLOW ERROR) WITHIN GROUP (ORDER BY deptno) AS enames
FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))
GROUP BY deptno
ORDER BY deptno;


 五. 使用XMLAGG返回CLOB处理字符串合并超长需求
1. 针对超过字符串长度溢出的问题,如果要完整显示,可以使用如下方式:
  • 自定义wm_concat返回值类型为CLOB

  • 使用XMLAGG函数获取CLOB值

下面使用XMLAGG函数处理超长合并字符串问题,如下所示:

SELECT deptno,  RTRIM(xmlagg(xmlelement(c, ename || ,)
ORDER BY deptno).extract(//text()).getclobval(),
             ,) AS enames
FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))
GROUP BY deptno
ORDER BY deptno;


总结:ORACLE 19C针对LISTAGG增加了很多改进,在以后有字符串合并需求(列转行)的时候,要优先使用LISTAGG,而不是WM_CONCAT,对于自定义聚集函数,性能较差,而且功能也没有LISTAGG强大。


END

更多精彩干货分享

点击下方名片关注

IT那活儿

文章版权归作者所有,未经允许请勿转载,若此文章存在违规行为,您可以联系管理员删除。

转载请注明本文地址:https://www.ucloud.cn/yun/129862.html

相关文章

  • 19C DG Broker配置和测试

    19C DG Broker配置和测试 img{ display:block; margin:0 auto !important; width:100%; } body{ width:75%; ...

    IT那活儿 评论0 收藏2941
  • 聊聊 Java8 以后各个版本的新特性

    摘要:于是抽时间看了看以后各个版本的特性,做了一个总结。年和公开版本发布,取名为。此后对应版本就是,。发布,是一个重大版本更新。在此之后,就是每六个月发布一次新版本。以上和参考资料聊了一些关于的历史,下面我们看看各个版本有那些新特性。 【这是 ZY 第 11 篇原创技术文章】 某天在网上闲逛,突然看到有篇介绍 Java 11 新特性的文章,顿时心里一惊,毕竟我对于 Java 的版本认识...

    K_B_Z 评论0 收藏0
  • Java 10 新特性解密,引入类型推断机制,2018 年 3 月 20 日发布

    摘要:目标发布目前有两个主要功能针对局部变量类型推断这将删除大部分对象实例化所需的冗长的包含手动类型信息整合源树的库即不同的库将被合并成一个单一的存储库。特别是,承诺为局部变量实例化引入类型推断机制,并将现有的存储库合并到一个存储库中。 JDK 10 何时发布? JDK 10 是 Java 10 标准版的部分实现,将于 2018 年 3 月 20 日发布,改进的关键点包括一个本地类型推断、一...

    caspar 评论0 收藏0

发表评论

0条评论

IT那活儿

|高级讲师

TA的文章

阅读更多
最新活动
阅读需要支付1元查看
<