A Look at the Plus Version of group by

Published: 2023/12/19

Post No. 151 / 張俊紅

This post is about group by plus. Everyone knows group by, but what on earth is plus? It is the same idea as the iPhone Plus: an upgraded version. So what exactly is this plus? Let's take it step by step.

01|Introduction

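Let's start with a data requirement. I have a detail table that stores the transaction records of every shop, including the city, province, and region each shop belongs to. From this table I need the transaction volume over the most recent month for each shop, each city, each province, each region, and the whole country. How should I go about it?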

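The detail table is t; the queries below use its columns orderid, deal_date, area (region), province, city, and shop.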

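Think for a moment about how to meet this requirement. The simplest approach is to write five SQL statements, export the data, and process it in Excel. The five statements are as follows: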

Nationwide volume

select count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"

Volume by region

select area,count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area

Volume by province

select area,province,count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area,province

Volume by city

select area,province,city,count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area,province,city

Volume by shop

select area,province,city,shop,count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area,province,city,shop

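This approach gets the job done, but it is inefficient: we still have to merge the results in Excel, which is tedious. Can the results be merged in SQL instead, so that no Excel step is needed? Yes. The tools for that are union and union all, which stack query results vertically.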

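The difference between union and union all: the former removes duplicate rows from the combined result, while the latter returns every row from both sides.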
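To make that difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables a and b and their city values are made up for illustration and are not part of the original data.

```python
import sqlite3

# Hypothetical in-memory tables, only to contrast the two operators.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table a (city text);
    create table b (city text);
    insert into a values ('Beijing'), ('Shanghai');
    insert into b values ('Shanghai'), ('Shenzhen');
""")

# union removes duplicate rows from the combined result.
union_rows = conn.execute(
    "select city from a union select city from b order by city"
).fetchall()

# union all keeps every row from both sides, duplicates included.
union_all_rows = conn.execute(
    "select city from a union all select city from b order by city"
).fetchall()

print(union_rows)      # 'Shanghai' appears once
print(union_all_rows)  # 'Shanghai' appears twice
```

Since the five levels of aggregation never produce identical rows, union all is the natural choice here and avoids the extra de-duplication step.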

The code is as follows:

select null,null,null,null,count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"
union all
select area,null,null,null,count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area
union all
select area,province,null,null,count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area,province
union all
select area,province,city,null,count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area,province,city
union all
select area,province,city,shop,count(orderid) as sales
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area,province,city,shop

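You will have noticed that the statement above selects a lot of nulls. That is because union all requires the queries it joins to have the same number of columns. The result stacks the rows of all five levels into one table, with null in every column that a level does not group by.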
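To see the shape of that result on a small scale, here is a minimal sketch with Python's sqlite3 module. The five sample orders are invented, and the table is cut down to two levels (area and province) to keep it short; single quotes are used for the date strings because that is what SQLite expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; order 5 falls outside May and is filtered out.
conn.executescript("""
    create table t (orderid integer, deal_date text, area text, province text);
    insert into t values
        (1, '2019-05-03', 'East',  'Jiangsu'),
        (2, '2019-05-10', 'East',  'Jiangsu'),
        (3, '2019-05-12', 'East',  'Zhejiang'),
        (4, '2019-05-20', 'North', 'Hebei'),
        (5, '2019-06-01', 'North', 'Hebei');
""")

# Each branch pads the columns it does not group by with null,
# so that all branches return the same number of columns.
query = """
    select null as area, null as province, count(orderid) as sales
    from t where deal_date between '2019-05-01' and '2019-05-31'
    union all
    select area, null, count(orderid)
    from t where deal_date between '2019-05-01' and '2019-05-31'
    group by area
    union all
    select area, province, count(orderid)
    from t where deal_date between '2019-05-01' and '2019-05-31'
    group by area, province
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Python shows SQL null as None, so the nationwide row comes back as (None, None, 4).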

02|grouping sets

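Using union all is much simpler than exporting the results of five SQL statements and processing them in Excel, but the code above is long and redundant. Someone noticed this and came up with a better way to solve it: the plus version of group by that this post is about. Its real name is grouping sets. This plus version aggregates by different combinations of dimensions, for example by region; by region and province; by region, province, and city; and by region, province, city, and shop.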

Rewriting the union all statement above with grouping sets gives the following code:

select null,area,province,city,shop,count(orderid) as sales,grouping_id
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by null,area,province,city,shop
grouping sets(null,area,(area,province),(area,province,city),(area,province,city,shop))
order by grouping_id

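This code produces the same result as the union all version but is far more concise. The fields after group by are all the fields we want to group and aggregate by. grouping sets then lists the combinations of those fields to aggregate on, chosen to fit the requirement. A combination of several fields is wrapped in parentheses; a single field can stand on its own.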

The nationwide volume does not actually need any grouping, but to fit it into grouping sets we compute it with group by null.

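grouping_id gives the number of the grouping set each row belongs to: 1 for the first set, 2 for the second, and so on. We can filter on grouping_id to pick out the combination we need. For the nationwide volume, use grouping_id = 1; for the volume of each province, use grouping_id = 3. The same goes for the others. The exact values depend on the engine: Hive, for example, spells the column GROUPING__ID with two underscores and fills it with a bitmask over the group by columns rather than a running number, so check what your engine returns before filtering on it.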
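The mechanics of grouping sets can be sketched in plain Python: aggregate once per set and fill every column outside the set with None. This is only an illustration of the idea, not how any database implements it. The sample orders, the function name count_by_grouping_sets, and the running set number (which follows the numbering described above) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical May orders as (area, province, city) tuples.
orders = [
    ("East", "Jiangsu", "Nanjing"),
    ("East", "Jiangsu", "Suzhou"),
    ("East", "Zhejiang", "Hangzhou"),
    ("North", "Hebei", "Shijiazhuang"),
]
columns = ("area", "province", "city")


def count_by_grouping_sets(rows, columns, grouping_sets):
    """Count rows once per grouping set; columns outside the set become None."""
    result = []
    for set_no, keep in enumerate(grouping_sets, start=1):
        counts = Counter(
            tuple(value if col in keep else None for col, value in zip(columns, row))
            for row in rows
        )
        for key, sales in counts.items():
            # Append the count and the running number of the grouping set.
            result.append((*key, sales, set_no))
    return result


sets = [(), ("area",), ("area", "province"), ("area", "province", "city")]
result = count_by_grouping_sets(orders, columns, sets)
for line in result:
    print(line)
```

Filtering the output on the last element plays the role of filtering on grouping_id: keeping only rows whose set number is 3 leaves the per-province counts.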

03|cube

With grouping sets covered, let's look at another plus version: cube. It aggregates over every combination of the group by dimensions. Straight to the code:

select area,province,count(orderid) as sales,grouping_id
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area,province
with cube
order by grouping_id

The code above aggregates by region and province with cube. The result is as follows:

cube first aggregates over all the data, that is null,null; then over area,null; then over null,province; and finally over area,province.
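Put differently, cube expands to one grouping set for every subset of the group by columns, so n columns give 2^n sets. A small Python sketch of that expansion follows; the helper name cube_sets is made up for this example.

```python
from itertools import combinations


def cube_sets(columns):
    """Every subset of the columns, from the grand total up to the full list."""
    return [
        combo
        for size in range(len(columns) + 1)
        for combo in combinations(columns, size)
    ]


sets_for_two = cube_sets(("area", "province"))
print(sets_for_two)
# [(), ('area',), ('province',), ('area', 'province')]

# The number of sets doubles with each extra column.
print(len(cube_sets(("area", "province", "city", "shop"))))  # 16
```

The empty tuple stands for the grand total, the row where every grouping column is null.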

04|rollup

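Now for the last plus version: rollup. It is quite similar to cube, but it covers only some of the combinations of the group by dimensions. Using the same example as above, the code is as follows: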

select area,province,count(orderid) as sales,grouping_id
from t
where deal_date between "2019-05-01" and "2019-05-31"
group by area,province
with rollup
order by grouping_id

The final result is as follows:

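Compare the cube and rollup results closely and you will see that rollup lacks the null,province combination. That is the difference: rollup builds its combinations hierarchically, starting from the leftmost column, so with group by area,province it yields the grand total, area, and area,province.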
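That rule can be written down in a couple of lines: rollup keeps only the left-to-right prefixes of the group by columns. Here is a minimal Python sketch, with rollup_sets as a made-up helper name.

```python
def rollup_sets(columns):
    """Only the left-to-right prefixes of the columns, grand total first."""
    return [tuple(columns[:size]) for size in range(len(columns) + 1)]


two_levels = rollup_sets(("area", "province"))
print(two_levels)
# [(), ('area',), ('area', 'province')]

four_levels = rollup_sets(("area", "province", "city", "shop"))
print(four_levels)
```

For the four columns area, province, city, and shop, the prefixes are exactly the five levels from the original requirement: nationwide, region, province, city, and shop.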

The plus versions covered in this post are very practical. Once you are comfortable with them, they can save a lot of work.

You may also want to read:

What is the execution order of SQL?

Implementing pivot tables in SQL

Window functions you may not know about
