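Greenplum roaring bitmap and its business scenarios (similar to Alibaba Cloud RDS PG varbitx; applied to real-time profiling, audience selection, and pivoting over massive user bases)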
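Tags
PostgreSQL , Greenplum , varbitx , roaring bitmap , pilosa , varbit , hll , multi-stage aggregation
Background
Roaring bitmap is a bitmap library that combines a high compression ratio with good performance, and it is widely used (for example in Greenplum, ES, InfluxDB, and others):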
Roaring bitmaps are compressed bitmaps which tend to outperform conventional compressed bitmaps such as WAH, EWAH or Concise. They are used by several major systems such as Apache Lucene and derivative systems such as Solr and Elasticsearch, Metamarkets' Druid, LinkedIn Pinot, Netflix Atlas, Apache Spark, OpenSearchServer, Cloud Torrent, Whoosh, InfluxDB, Pilosa, Bleve, Microsoft Visual Studio Team Services (VSTS), and eBay's Apache Kylin.
《Roaring Bitmap - A better compressed bitset》
https://github.com/RoaringBitmap/CRoaring
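PostgreSQL has a built-in varbit data type, and Alibaba Cloud has extended it with additional operators on varbit:
《Introduction to the Alibaba Cloud RDS for PostgreSQL varbitx plugin and real-time profiling scenarios》
This allows Alibaba Cloud RDS PG to run real-time computation over massive profile data at lower cost and with higher performance:
《Alibaba Cloud RDS PostgreSQL varbitx in practice - streaming tags (burn-after-reading streaming batch computation) - trillion scale, audience selection by arbitrary tags, millisecond response》
《Building a real-time user profile recommendation system on Alibaba Cloud RDS PostgreSQL (varbitx)》
《Astonishing performance! A single RDS PostgreSQL instance supporting 200 billion - a real-time tag pivoting case》
For Greenplum, members of the community have likewise contributed a plugin that adds support for a roaringbitmap type.
The open-source code is here (thanks to the contributors):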
https://github.com/zeromax007/gpdb-roaringbitmap
(The current version does not push aggregation down to the compute nodes. It aggregates after a Gather Motion instead, so aggregation performance is poor.)

    postgres=# explain select rb_cardinality(rb_and_agg(bitmap)) from t1;
                                           QUERY PLAN
    ----------------------------------------------------------------------------------------
     Aggregate  (cost=1.05..1.07 rows=1 width=4)
       ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..1.05 rows=1 width=1254608)
             ->  Seq Scan on t1  (cost=0.00..1.01 rows=1 width=1254608)
    (3 rows)

    Time: 0.727 ms

Interested readers are encouraged to improve the aggregate code of roaringbitmap for Greenplum by switching it to multi-stage aggregation, so that aggregation runs on the compute nodes first.
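For ways to write custom distributed aggregates, see:
《PostgreSQL 10: principles and practice of custom parallel aggregate functions》
《Postgres-XC customized aggregate introduction》
《PostgreSQL aggregate function customize》
《Greenplum best practices - using the hll estimation plugin (and optimizing the hll distributed aggregate functions)》
The rest of this article briefly covers how to install roaringbitmap and what it can do.
Installation
1. First, you need a working Greenplum installation.
2. Then download gpdb-roaringbitmap

    git clone https://github.com/zeromax007/gpdb-roaringbitmap

3. Compile gpdb-roaringbitmap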
If $GPHOME is /usr/local/gpdb:

    gcc -march=native -O3 -std=c11 -Wall -Wpointer-arith -Wendif-labels -Wformat-security \
        -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fno-aggressive-loop-optimizations \
        -Wno-unused-but-set-variable -Wno-address -fpic -D_GNU_SOURCE \
        -I/usr/local/gpdb/include/postgresql/server \
        -I/usr/local/gpdb/include/postgresql/internal \
        -c -o roaringbitmap.o roaringbitmap.c

Or as follows, depending on where your header files are located:

    gcc -march=native -O3 -std=c11 -Wall -Wpointer-arith -Wendif-labels -Wformat-security \
        -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fno-aggressive-loop-optimizations \
        -Wno-unused-but-set-variable -Wno-address -fpic -D_GNU_SOURCE \
        -I/usr/local/gpdb/include/server \
        -I/usr/local/gpdb/include/internal \
        -c -o roaringbitmap.o roaringbitmap.c

Then link the object file into a shared library:

    gcc -O3 -std=gnu99 -Wall -Wpointer-arith -Wendif-labels -Wformat-security \
        -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fno-aggressive-loop-optimizations \
        -Wno-unused-but-set-variable -Wno-address -fpic -shared --enable-new-dtags \
        -o roaringbitmap.so roaringbitmap.o

4. Copy the .so file into the lib directory of the Greenplum installation on every gpdb node (all master, standby, segment, and mirror hosts).

    cp ./roaringbitmap.so /usr/local/gpdb/lib/postgresql/

5. On the MASTER node, connect to the database that needs roaringbitmap and run the following SQL script to install the types, operators, and functions.

    psql -f ./roaringbitmap.sql

Usage demo
1. Create a table that uses the roaringbitmap data type

    CREATE TABLE t1 (id integer, bitmap roaringbitmap);

2. Use rb_build to generate roaringbitmap data. The input is an integer array and the output is a roaringbitmap: for every value in the array, the bit at that position is set to 1.
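As a rough mental model of what rb_build does, the sketch below sets one bit per array element in a plain, uncompressed bitset. The names `bitset256`, `bs_build`, and `bs_test` are hypothetical and are not part of gpdb-roaringbitmap; the real type holds the same set of offsets in compressed form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBITS 256u  /* illustrative capacity only (assumption) */

/* Hypothetical uncompressed stand-in for a roaringbitmap value. */
typedef struct { uint64_t w[NBITS / 64]; } bitset256;

/* Analogue of rb_build: set the bit at every offset listed in vals[]. */
static bitset256 bs_build(const uint32_t *vals, size_t n) {
    bitset256 b;
    memset(&b, 0, sizeof b);
    for (size_t i = 0; i < n; i++) {
        assert(vals[i] < NBITS);
        b.w[vals[i] / 64] |= (uint64_t)1 << (vals[i] % 64);
    }
    return b;
}

/* Return 1 if the bit at offset pos is set, otherwise 0. */
static int bs_test(const bitset256 *b, uint32_t pos) {
    assert(pos < NBITS);
    return (int)((b->w[pos / 64] >> (pos % 64)) & 1u);
}
```

Building from the array {1,2,3,4,5,6,7,8,9,200} leaves exactly those ten offsets set and every other offset clear.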

    INSERT INTO t1 SELECT 1,RB_BUILD(ARRAY[1,2,3,4,5,6,7,8,9,200]);

    -- For each input row, set the bit at the position given by its value to 1,
    -- then aggregate everything into a single roaringbitmap
    INSERT INTO t1 SELECT 2,RB_BUILD_AGG(e) FROM GENERATE_SERIES(1,100) e;

3. Bit operations between two roaringbitmaps (OR, AND, XOR, ANDNOT). andnot ANDs the first argument with the NOT of the second argument, that is, andnot(c1,c2)==and(c1, not(c2))
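The four operators, and the andnot identity in particular, can be checked on ordinary machine words. This is only an illustration on a single 64-bit word; the function names are hypothetical and do not come from the plugin.

```c
#include <assert.h>
#include <stdint.h>

/* Word-level analogues of rb_and, rb_or, rb_xor and rb_andnot. */
static uint64_t w_and(uint64_t a, uint64_t b)    { return a & b; }
static uint64_t w_or(uint64_t a, uint64_t b)     { return a | b; }
static uint64_t w_xor(uint64_t a, uint64_t b)    { return a ^ b; }
/* andnot keeps the bits of a that are not set in b. */
static uint64_t w_andnot(uint64_t a, uint64_t b) { return a & ~b; }

/* Check andnot(a, b) == and(a, not(b)) for one pair of inputs. */
static void check_andnot_identity(uint64_t a, uint64_t b) {
    assert(w_andnot(a, b) == w_and(a, ~b));
}
```

For the sets {1,2,3} and {3,4,5}, AND leaves {3}, OR leaves {1,2,3,4,5}, XOR leaves {1,2,4,5}, and ANDNOT leaves {1,2}.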

    SELECT RB_OR(a.bitmap,b.bitmap) FROM (SELECT bitmap FROM t1 WHERE id = 1) AS a, (SELECT bitmap FROM t1 WHERE id = 2) AS b;

4. Aggregate operations that produce a new roaringbitmap (OR, AND, XOR, BUILD)

    SELECT RB_OR_AGG(bitmap) FROM t1;
    SELECT RB_AND_AGG(bitmap) FROM t1;
    SELECT RB_XOR_AGG(bitmap) FROM t1;
    SELECT RB_BUILD_AGG(e) FROM GENERATE_SERIES(1,100) e;

5. Cardinality, that is, the number of bits set to 1 in the roaringbitmap.
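On an uncompressed bitmap, cardinality is just a population count over all words. The sketch below shows that idea with hypothetical helper names; it says nothing about how the plugin or CRoaring computes the value internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count set bits in one word by clearing the lowest set bit each round. */
static unsigned popcount64(uint64_t x) {
    unsigned n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* Analogue of rb_cardinality over an uncompressed bitmap of nwords words. */
static uint64_t bs_cardinality(const uint64_t *words, size_t nwords) {
    uint64_t total = 0;
    assert(words != NULL || nwords == 0);
    for (size_t i = 0; i < nwords; i++)
        total += popcount64(words[i]);
    return total;
}
```

A bitmap holding the offsets {1,2,3,4,5,6,7,8,9,200} has cardinality 10 under this definition.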

    SELECT RB_CARDINALITY(bitmap) FROM t1;

6. Return the offsets (positions) of the bits that are set to 1 in a roaringbitmap.
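Conceptually, iterating a bitmap means walking the offsets in ascending order and emitting those whose bit is 1. A minimal sketch on an uncompressed bitmap follows; `bs_iterate` is a hypothetical name, and the real rb_iterate is a set-returning SQL function rather than a function that fills a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Analogue of rb_iterate: write the offsets of all set bits, in ascending
 * order, into out[]. Returns how many offsets were written (at most cap). */
static size_t bs_iterate(const uint64_t *words, size_t nwords,
                         uint32_t *out, size_t cap) {
    size_t n = 0;
    assert(out != NULL || cap == 0);
    for (size_t i = 0; i < nwords; i++)
        for (unsigned bit = 0; bit < 64; bit++)
            if (((words[i] >> bit) & 1u) && n < cap)
                out[n++] = (uint32_t)(i * 64 + bit);
    return n;
}
```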

    SELECT RB_ITERATE(bitmap) FROM t1 WHERE id = 1;

    postgres=# select rb_iterate(rb_build('{1,4,100}'));
     rb_iterate
    ------------
              1
              4
            100
    (3 rows)

7. Some bit-setting operations
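rb_flip inverts every bit in a range of offsets whose end is excluded. The sketch below mimics that half-open behaviour on an uncompressed bitmap; `bs_flip` and `NWORDS` are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NWORDS 4u   /* 256 illustrative bits (assumption) */

/* Analogue of rb_flip: invert every bit in the half-open range [start, end). */
static void bs_flip(uint64_t words[NWORDS], uint32_t start, uint32_t end) {
    assert(start <= end && end <= NWORDS * 64);
    for (uint32_t pos = start; pos < end; pos++)
        words[pos / 64] ^= (uint64_t)1 << (pos % 64);
}
```

Flipping [7, 10) on the set {1,2,3,4,5,100} turns on offsets 7, 8 and 9 and leaves offset 10 untouched; flipping the same range again restores the original set.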

    postgres=# select rb_iterate(rb_flip(rb_build('{1,2,3,100,4,5}'),7,10));
     rb_iterate
    ------------
              1
              2
              3
              4
              5
              7
              8
              9
            100
    (9 rows)

Built-in scalar functions

                                            List of functions
     Schema |         Name          | Result data type |       Argument data types       |  Type
    --------+-----------------------+------------------+---------------------------------+--------
     public | rb_and                | roaringbitmap    | roaringbitmap, roaringbitmap    | normal
     public | rb_and_cardinality    | integer          | roaringbitmap, roaringbitmap    | normal
     public | rb_andnot             | roaringbitmap    | roaringbitmap, roaringbitmap    | normal
     public | rb_andnot_cardinality | integer          | roaringbitmap, roaringbitmap    | normal
     public | rb_build              | roaringbitmap    | integer[]                       | normal
     public | rb_cardinality        | integer          | roaringbitmap                   | normal
     public | rb_equals             | boolean          | roaringbitmap, roaringbitmap    | normal
     public | rb_flip               | roaringbitmap    | roaringbitmap, integer, integer | normal
     public | rb_intersect          | boolean          | roaringbitmap, roaringbitmap    | normal
     public | rb_is_empty           | boolean          | roaringbitmap                   | normal
     public | rb_iterate            | SETOF integer    | roaringbitmap                   | normal
     public | rb_maximum            | integer          | roaringbitmap                   | normal
     public | rb_minimum            | integer          | roaringbitmap                   | normal
     public | rb_or                 | roaringbitmap    | roaringbitmap, roaringbitmap    | normal
     public | rb_or_cardinality     | integer          | roaringbitmap, roaringbitmap    | normal
     public | rb_rank               | integer          | roaringbitmap, integer          | normal
     public | rb_remove             | roaringbitmap    | roaringbitmap, integer          | normal
     public | rb_xor                | roaringbitmap    | roaringbitmap, roaringbitmap    | normal
     public | rb_xor_cardinality    | integer          | roaringbitmap, roaringbitmap    | normal

| Function | Input | Output | Description | Example |
| --- | --- | --- | --- | --- |
| rb_build | integer[] | roaringbitmap | Build a roaringbitmap tuple from an integer array. | rb_build('{1,2,3,4,5}') |
| rb_and | roaringbitmap,roaringbitmap | roaringbitmap | AND of two roaringbitmap tuples. | rb_and(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_or | roaringbitmap,roaringbitmap | roaringbitmap | OR of two roaringbitmap tuples. | rb_or(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_xor | roaringbitmap,roaringbitmap | roaringbitmap | XOR of two roaringbitmap tuples. | rb_xor(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_andnot | roaringbitmap,roaringbitmap | roaringbitmap | ANDNOT of two roaringbitmap tuples. | rb_andnot(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_cardinality | roaringbitmap | integer | Return the cardinality of a roaringbitmap tuple. | rb_cardinality(rb_build('{1,2,3,4,5}')) |
| rb_and_cardinality | roaringbitmap,roaringbitmap | integer | AND of two roaringbitmap tuples, returning the cardinality. | rb_and_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_or_cardinality | roaringbitmap,roaringbitmap | integer | OR of two roaringbitmap tuples, returning the cardinality. | rb_or_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_xor_cardinality | roaringbitmap,roaringbitmap | integer | XOR of two roaringbitmap tuples, returning the cardinality. | rb_xor_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_andnot_cardinality | roaringbitmap,roaringbitmap | integer | ANDNOT of two roaringbitmap tuples, returning the cardinality. | rb_andnot_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_is_empty | roaringbitmap | boolean | Check whether a roaringbitmap tuple is empty. | rb_is_empty(rb_build('{1,2,3,4,5}')) |
| rb_equals | roaringbitmap,roaringbitmap | boolean | Check whether two roaringbitmap tuples are equal. | rb_equals(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_intersect | roaringbitmap,roaringbitmap | boolean | Check whether two roaringbitmap tuples intersect. | rb_intersect(rb_build('{1,2,3}'),rb_build('{3,4,5}')) |
| rb_remove | roaringbitmap,integer | roaringbitmap | Remove the specified offset from a roaringbitmap tuple. | rb_remove(rb_build('{1,2,3}'),3) |
| rb_flip | roaringbitmap,integer,integer | roaringbitmap | Flip the specified range of offsets (end not included) in a roaringbitmap tuple. | rb_flip(rb_build('{1,2,3}'),7,10) -- flips the bits at positions 7 up to 10 (10 excluded) |
| rb_minimum | roaringbitmap | integer | Return the smallest offset in a roaringbitmap tuple. Return UINT32_MAX if the bitmap tuple is empty. | rb_minimum(rb_build('{1,2,3}')) -- returns the smallest position whose bit is set to 1 |
| rb_maximum | roaringbitmap | integer | Return the greatest offset in a roaringbitmap tuple. Return 0 if the bitmap tuple is empty. | rb_maximum(rb_build('{1,2,3}')) -- returns the greatest position whose bit is set to 1 |
| rb_rank | roaringbitmap,integer | integer | Return the number of offsets that are smaller than or equal to the specified offset. | rb_rank(rb_build('{1,2,3}'),3) -- counts how many bits at positions less than or equal to N are set to 1 |
| rb_iterate | roaringbitmap | SETOF integer | Expand a bitmap into a SETOF integer. | rb_iterate(rb_build('{1,2,3,100}')) |
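The rank, minimum and maximum rows in the table are easy to misread, especially their empty-bitmap return values. The sketch below restates the documented behaviour on an uncompressed bitmap. All names are hypothetical, and the linear scans are chosen for clarity, not because the plugin works this way.

```c
#include <assert.h>
#include <stdint.h>

#define NWORDS 4u
#define NBITS  (NWORDS * 64u)   /* 256 illustrative bits (assumption) */

static int bs_test(const uint64_t *w, uint32_t pos) {
    return (int)((w[pos / 64] >> (pos % 64)) & 1u);
}

/* rb_rank analogue: number of set bits at offsets <= pos. */
static uint32_t bs_rank(const uint64_t *w, uint32_t pos) {
    uint32_t n = 0;
    assert(pos < NBITS);
    for (uint32_t i = 0; i <= pos; i++)
        n += (uint32_t)bs_test(w, i);
    return n;
}

/* rb_minimum analogue: smallest set offset, UINT32_MAX when empty. */
static uint32_t bs_minimum(const uint64_t *w) {
    for (uint32_t i = 0; i < NBITS; i++)
        if (bs_test(w, i)) return i;
    return UINT32_MAX;
}

/* rb_maximum analogue: greatest set offset, 0 when empty. */
static uint32_t bs_maximum(const uint64_t *w) {
    for (uint32_t i = NBITS; i-- > 0; )
        if (bs_test(w, i)) return i;
    return 0;
}
```

Note that a maximum of 0 is ambiguous on its own: it is returned both for an empty bitmap and for a bitmap whose only set offset is 0, so an emptiness check is the safer test.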
Built-in aggregate functions

                                       List of functions
     Schema |          Name          | Result data type | Argument data types | Type
    --------+------------------------+------------------+---------------------+------
     public | rb_and_agg             | roaringbitmap    | roaringbitmap       | agg
     public | rb_and_cardinality_agg | integer          | roaringbitmap       | agg
     public | rb_build_agg           | roaringbitmap    | integer             | agg
     public | rb_or_agg              | roaringbitmap    | roaringbitmap       | agg
     public | rb_or_cardinality_agg  | integer          | roaringbitmap       | agg
     public | rb_xor_agg             | roaringbitmap    | roaringbitmap       | agg
     public | rb_xor_cardinality_agg | integer          | roaringbitmap       | agg

| Function | Input | Output | Description | Example |
| --- | --- | --- | --- | --- |
| rb_build_agg | integer | roaringbitmap | Build a roaringbitmap tuple from a set of integers. | rb_build_agg(1) |
| rb_or_agg | roaringbitmap | roaringbitmap | OR aggregate over a set of roaringbitmap tuples. | rb_or_agg(rb_build('{1,2,3}')) |
| rb_and_agg | roaringbitmap | roaringbitmap | AND aggregate over a set of roaringbitmap tuples. | rb_and_agg(rb_build('{1,2,3}')) |
| rb_xor_agg | roaringbitmap | roaringbitmap | XOR aggregate over a set of roaringbitmap tuples. | rb_xor_agg(rb_build('{1,2,3}')) |
| rb_or_cardinality_agg | roaringbitmap | integer | OR aggregate over a set of roaringbitmap tuples, returning the cardinality. | rb_or_cardinality_agg(rb_build('{1,2,3}')) |
| rb_and_cardinality_agg | roaringbitmap | integer | AND aggregate over a set of roaringbitmap tuples, returning the cardinality. | rb_and_cardinality_agg(rb_build('{1,2,3}')) |
| rb_xor_cardinality_agg | roaringbitmap | integer | XOR aggregate over a set of roaringbitmap tuples, returning the cardinality. | rb_xor_cardinality_agg(rb_build('{1,2,3}')) |
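To make the aggregate semantics concrete, the sketch below folds several uncompressed bitmaps with OR, AND or XOR and reports the cardinality of the result, which is what the `*_agg` and `*_cardinality_agg` pairs compute. `bs_agg`, `agg_op` and the flat row layout are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NWORDS 4u   /* 256 illustrative bits per bitmap (assumption) */

typedef enum { AGG_OR, AGG_AND, AGG_XOR } agg_op;

static unsigned popcount64(uint64_t x) {
    unsigned n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* Fold nrows bitmaps (stored back to back, NWORDS words each) into out[]
 * with the chosen operator and return the cardinality of the result. */
static uint64_t bs_agg(const uint64_t *rows, size_t nrows,
                       agg_op op, uint64_t out[NWORDS]) {
    uint64_t card = 0;
    assert(nrows > 0);
    for (size_t w = 0; w < NWORDS; w++) {
        uint64_t acc = rows[w];             /* the first row seeds the state */
        for (size_t r = 1; r < nrows; r++) {
            uint64_t v = rows[r * NWORDS + w];
            if (op == AGG_OR)       acc |= v;
            else if (op == AGG_AND) acc &= v;
            else                    acc ^= v;
        }
        out[w] = acc;
        card += popcount64(acc);
    }
    return card;
}
```

With the two demo rows from earlier ({1..9, 200} and {1..100}), the OR, AND and XOR aggregates have cardinalities 101, 9 and 92 respectively.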
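Example
《Astonishing performance! A single RDS PostgreSQL instance supporting 200 billion - a real-time tag pivoting case》
Background:
There are 2 billion bits and tens of millions of tags. That means tens of millions of rows, each holding a roaringbitmap made up of 2 billion bits.
The goal is to compute the cardinality of any combination of tags (rb_???_cardinality_agg).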
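For a sense of scale, the arithmetic below estimates the raw, uncompressed size of this data set. It assumes $10^{7}$ rows as the low end of "tens of millions"; it is a back-of-the-envelope illustration of why a compressed representation matters here, not a measurement.

```latex
\begin{aligned}
\text{bytes per uncompressed row} &= \frac{2\times10^{9}\ \text{bits}}{8\ \text{bits/byte}}
  = 2.5\times10^{8}\ \text{bytes} \approx 250\ \text{MB},\\
\text{bytes for } 10^{7}\ \text{rows} &= 2.5\times10^{8}\times10^{7}
  = 2.5\times10^{15}\ \text{bytes} \approx 2.5\ \text{PB}.
\end{aligned}
```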
Design:
The data is distributed by the tag column:

    create table tbl (tagid int primary key, bitmap roaringbitmap) distributed by (tagid) ;

SQL:
1. Count how many bits are set to 1 in the combined bitmap

    select rb_and_cardinality_agg(bitmap) from tbl where tagid in (?,......?);

2. Get the positions of the bits set in the combined bitmap

    select RB_ITERATE(rb) from (select rb_and_agg(bitmap) as rb from tbl where tagid in(1,2,3)) t;

Speeding it up
Because the roaringbitmap plugin for Greenplum does not currently support prefunc in its aggregates, all rows are gathered to the master node before aggregation, which inevitably hurts performance.

    postgres=# explain select rb_and_cardinality_agg(bitmap) from tbl where tagid in (1,2,3,4,5,6,7,8);
                                        QUERY PLAN
    -----------------------------------------------------------------------------------
     Aggregate  (cost=0.04..0.06 rows=1 width=4)
       ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..0.04 rows=1 width=32)
             ->  Seq Scan on tbl  (cost=0.00..0.00 rows=1 width=32)
                   Filter: tagid = ANY ('{1,2,3,4,5,6,7,8}'::integer[])
    (4 rows)

    postgres=# explain select RB_ITERATE(rb) from (select rb_and_agg(bitmap) as rb from tbl where tagid in(1,2,3)) t;
                                           QUERY PLAN
    -----------------------------------------------------------------------------------------
     Result  (cost=0.04..0.07 rows=3 width=32)
       ->  Aggregate  (cost=0.04..0.06 rows=1 width=32)
             ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..0.04 rows=1 width=32)
                   ->  Seq Scan on tbl  (cost=0.00..0.00 rows=1 width=32)
                         Filter: tagid = ANY ('{1,2,3}'::integer[])
    (5 rows)

To speed this up, a prefunc has to be implemented for these aggregate functions.
Greenplum supports two aggregate execution modes:
1. If only sfunc is configured, all relevant rows are gathered to the master node. The master feeds each row in turn to sfunc, together with the result of the previous sfunc call (optionally initcond for the first call), until every row has been processed. If a finalfunc is set, it is then applied to produce the final result.
2. If both sfunc and prefunc are configured, sfunc runs in parallel on the segment nodes. Each segment sends its partial result to the master, which calls prefunc to aggregate the partial results again. If a finalfunc is configured, that result is passed to finalfunc to produce the final output.
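The two modes can be contrasted in a small stand-alone simulation. In the sketch below a `uint64_t` stands in for a bitmap, `and_sfunc` plays the sfunc role and `and_prefunc` plays the prefunc role. The function names, the round-robin row placement and the all-ones initial state are assumptions for this illustration; the plugin itself starts from a NULL state rather than an identity value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Transition step (sfunc role): fold one row into the running state. */
static uint64_t and_sfunc(uint64_t state, uint64_t row) { return state & row; }

/* Combine step (prefunc role): merge two partial states. */
static uint64_t and_prefunc(uint64_t a, uint64_t b) { return a & b; }

/* Mode 1: every row travels to the master, which runs sfunc row by row. */
static uint64_t agg_single_stage(const uint64_t *rows, size_t n) {
    uint64_t state = UINT64_MAX;            /* identity element for AND */
    for (size_t i = 0; i < n; i++)
        state = and_sfunc(state, rows[i]);
    return state;
}

/* Mode 2: each segment folds its own rows, and the master merges only
 * nseg partial states. Rows are dealt round-robin to the segments here
 * as a stand-in for a real distribution key. */
static uint64_t agg_two_stage(const uint64_t *rows, size_t n, size_t nseg) {
    uint64_t result = UINT64_MAX;
    assert(nseg > 0);
    for (size_t seg = 0; seg < nseg; seg++) {
        uint64_t partial = UINT64_MAX;
        for (size_t i = seg; i < n; i += nseg)
            partial = and_sfunc(partial, rows[i]);
        result = and_prefunc(result, partial);
    }
    return result;
}
```

Both paths return the same value because AND is associative and commutative. The difference is how much data crosses the interconnect: in the first mode every bitmap is shipped to the master, while in the second only one partial bitmap per segment is.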
Optimization example:

    // bitmap AND combine function, used as the prefunc of rb_and_agg
    PG_FUNCTION_INFO_V1(rb_and_trans_pre);
    Datum rb_and_trans_pre(PG_FUNCTION_ARGS);

    Datum
    rb_and_trans_pre(PG_FUNCTION_ARGS)
    {
        MemoryContext aggctx;
        roaring_bitmap_t *r1;
        roaring_bitmap_t *r2;

        // We must be called as a transition routine or we fail.
        if (!AggCheckCallContext(fcinfo, &aggctx))
            ereport(ERROR,
                    (errcode(ERRCODE_DATA_EXCEPTION),
                     errmsg("rb_and_trans_pre outside transition context")));

        // Is the first argument a NULL?
        if (PG_ARGISNULL(0)) {
            r1 = setup_roaringbitmap(aggctx);
        } else {
            r1 = (roaring_bitmap_t *) PG_GETARG_POINTER(0);
        }

        // Is the second argument non-null?
        if (!PG_ARGISNULL(1)) {
            r2 = (roaring_bitmap_t *) PG_GETARG_POINTER(1);

            if (PG_ARGISNULL(0)) {
                // No state yet: start from a copy of the incoming partial state.
                r1 = roaring_bitmap_copy(r2);
            } else {
                // Merge the incoming partial state into the running state.
                roaring_bitmap_and_inplace(r1, r2);
            }
            roaring_bitmap_free(r2);
        }

        PG_RETURN_POINTER(r1);
    }

    CREATE OR REPLACE FUNCTION rb_and_trans_pre(internal, internal)
        RETURNS internal
        AS 'roaringbitmap.so', 'rb_and_trans_pre'
        LANGUAGE C IMMUTABLE;

    CREATE AGGREGATE rb_and_agg(roaringbitmap)(
        SFUNC = rb_and_trans,
        PREFUNC = rb_and_trans_pre,
        STYPE = internal,
        FINALFUNC = rb_serialize
    );

Once prefunc is implemented, the execution plan changes as follows: the first aggregation stage runs on the compute nodes, the second stage runs on the master, and efficiency improves markedly.

    postgres=# explain select RB_ITERATE(rb) from (select rb_and_agg(bitmap) as rb from tbl where tagid in(1,2,3)) t;
                                           QUERY PLAN
    ----------------------------------------------------------------------------------------
     Result  (cost=0.07..0.10 rows=3 width=32)
       ->  Aggregate  (cost=0.07..0.08 rows=1 width=32)
             ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.01..0.06 rows=1 width=4)
                   ->  Aggregate  (cost=0.01..0.01 rows=1 width=4)
                         ->  Seq Scan on tbl  (cost=0.00..0.00 rows=1 width=32)
                               Filter: tagid = ANY ('{1,2,3}'::integer[])
    (6 rows)

    postgres=# explain select rb_and_agg(bitmap) from tbl where tagid in (1,2,3,4,5,6,7,8);
                                        QUERY PLAN
    ----------------------------------------------------------------------------------
     Aggregate  (cost=0.07..0.08 rows=1 width=32)
       ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.01..0.06 rows=1 width=4)
             ->  Aggregate  (cost=0.01..0.01 rows=1 width=4)
                   ->  Seq Scan on tbl  (cost=0.00..0.00 rows=1 width=32)
                         Filter: tagid = ANY ('{1,2,3,4,5,6,7,8}'::integer[])
    (5 rows)

Summary
gpdb-roaringbitmap is a very good plugin that helps users efficiently select audiences by combinations of tags.
For now, prefunc still has to be implemented to get multi-stage aggregation; without it, rows can only be gathered to the master and aggregated there. This article includes an example.
References
《PostgreSQL (varbit, roaring bitmap) VS pilosa (bitmap library)》
《Roaring Bitmap - A better compressed bitset》
《Alibaba Cloud RDS PostgreSQL varbitx in practice - streaming tags (burn-after-reading streaming batch computation) - trillion scale, audience selection by arbitrary tags, millisecond response》
《Building a real-time user profile recommendation system on Alibaba Cloud RDS PostgreSQL (varbitx)》
《Introduction to the Alibaba Cloud RDS for PostgreSQL varbitx plugin and real-time profiling scenarios》
《Greenplum best practices - using the hll estimation plugin (and optimizing the hll distributed aggregate functions)》
《PostgreSQL 10: principles and practice of custom parallel aggregate functions》
《Postgres-XC customized aggregate introduction》
《PostgreSQL aggregate function customize》
https://github.com/RoaringBitmap/CRoaring
https://github.com/zeromax007/gpdb-roaringbitmap
《Astonishing performance! A single RDS PostgreSQL instance supporting 200 billion - a real-time tag pivoting case》