TPC-H Performance Testing on Hologres

Published: 2024/9/3
Summary: This article describes how to run performance tests on Hologres with the TPC-H dataset and provides reference results to help you choose an instance specification.

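Background

TPC-H is a decision-support benchmark defined by the Transaction Processing Performance Council (TPC) to simulate decision-support applications. It is widely used in academia and industry to evaluate the performance of decision-support systems. TPC-H is modeled on a real production environment and simulates the data warehouse of a sales system. It contains 8 tables, and the data volume can be set from 1 GB to 3 TB. The benchmark consists of 22 queries, and the main metric is the response time of each query, that is, the time from submitting the query to receiving its result. Taken together, the results reflect how well a system processes queries. For details, see the TPC-H documentation.

Dataset

The dataset contains 8 tables: LINEITEM, ORDERS, PARTSUPP, PART, CUSTOMER, SUPPLIER, NATION, and REGION. The original article shows the relationships between them in a diagram, which is not reproduced here.

Test details

Test data volume

The data volume directly affects the test results. The TPC-H data generator uses the scale factor (SF) to control how much data is generated; SF 1 corresponds to 1 GB.

Note: The data volume above covers raw data only and excludes space used by indexes and similar structures, so reserve extra space when you prepare the environment.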
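As a rough guide to how SF translates into table sizes, the sketch below prints approximate row counts per table for a given scale factor. The per-SF base counts are an assumption: they are the SF 100 row counts reported in the results section of this article divided by 100, and the LINEITEM figure is only approximate because that table does not scale exactly. The function name tpch_rows is illustrative.

```shell
#!/bin/sh
# Print approximate TPC-H row counts for a given scale factor (SF).
# Base counts per SF are assumptions derived from the SF 100 figures
# in the results section; NATION and REGION have a fixed size.
tpch_rows() {
    sf="$1"
    printf 'LINEITEM %d\n' $((6000000 * sf))   # approximate
    printf 'ORDERS %d\n'   $((1500000 * sf))
    printf 'PARTSUPP %d\n' $((800000 * sf))
    printf 'PART %d\n'     $((200000 * sf))
    printf 'CUSTOMER %d\n' $((150000 * sf))
    printf 'SUPPLIER %d\n' $((10000 * sf))
    printf 'NATION %d\n'   25
    printf 'REGION %d\n'   5
}

tpch_rows 1
```

Calling tpch_rows 100 gives a quick sanity check against the row counts listed for the 100 GB dataset.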

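Test environment

This test used a pay-as-you-go exclusive instance. Because it is for demonstration only, the compute resources were set to 8 cores and 32 GB of memory.

Test scenarios

The test consists of 3 parts:

  • OLAP query scenario: mainly uses column-oriented tables and runs the 22 TPC-H queries directly;
  • Key/Value point query scenario: mainly uses a row-oriented table; the orders table is stored in row format and queried with a primary key filter;
  • Basic environment preparation

    • This step prepares the data required by the OLAP query scenario and the Key/Value point query scenario.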

    Basic environment preparation

    1. Create an ECS instance

    Log in to Alibaba Cloud and create an ECS instance for generating data, importing data into Hologres, and running the client-side tests. Recommended configuration:

    • Instance type ecs.g6.4xlarge
    • CentOS 7.9
    • ESSD data disk, sized according to the amount of data you want to test
    • The ECS instance and the Hologres instance should be in the same region and VPC

    2. Create a Hologres instance

    • Log in to Alibaba Cloud, open the Hologres console, and click to add an engine instance
    • Choose a configuration and enter an instance name. For details, see the official documentation.

    3. Create a test database

    • After the instance is created, log in to it and create a database. This test names the database tpch_1sf. For detailed steps, see the official documentation.

    Generate TPC-H data

    1. Prepare the data generation tool

    • Connect to the ECS instance remotely
    • Update all packages
    yum update
    • Install git
    yum install git
    • Install gcc
    yum install gcc
    • Download the TPC-H data generation code
    git clone https://github.com/gregrahn/tpch-kit.git
    • Enter the data generator's source directory
    cd tpch-kit/dbgen
    • Compile the data generator
    make

    2. Generate data

    • After a successful build, you can list the generator's options with the following command.
    ./dbgen --help
    • This test generates only 1 GB of data, so run the following command.
    ./dbgen -vf -s 1
    • To generate more data, adjust the SF parameter. For example, the following command generates 1 TB of data.
    ./dbgen -vf -s 1000
    • In general, 32 CU can run TPC-H SF10, and 256 CU can run TPC-H SF50
    • After the data is generated, you can list the generated files with the following command. The tool produces 8 data files, one for each table in the dataset.
    ls | grep '\.tbl$'
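Before importing, it can be worth confirming that all 8 files are present and non-empty. The following is a minimal sketch, assuming the default dbgen file names (one <table>.tbl per table); check_tbl_files is a hypothetical helper and not part of tpch-kit.

```shell
#!/bin/sh
# Verify that all 8 TPC-H .tbl files exist and are non-empty in a directory.
# check_tbl_files is a hypothetical helper, not part of tpch-kit.
check_tbl_files() {
    dir="${1:-.}"
    missing=0
    for t in customer lineitem nation orders part partsupp region supplier; do
        f="$dir/$t.tbl"
        if [ -s "$f" ]; then
            # wc -l gives the number of rows dbgen wrote for this table
            printf '%s\t%s rows\n' "$t" "$(wc -l < "$f" | tr -d ' ')"
        else
            printf '%s\tMISSING or empty\n' "$t" >&2
            missing=1
        fi
    done
    return "$missing"
}

check_tbl_files . || echo "some .tbl files are missing; rerun dbgen" >&2
```

The function returns a non-zero status when any file is missing, so it can gate the import step in a larger script.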

    OLAP query scenario test

    Prepare data

    1. Create tables

    • This article uses psql to import data, so first install psql on the ECS instance with the following command
    yum install postgresql-server
    • After psql is installed, you can log in to the Hologres instance with the following command
    PGUSER=<AccessID> PGPASSWORD=<AccessKey> psql -p <Port> -h <Endpoint> -d <Database>
    • Once connected to Hologres through psql, you can create the tables with the following statements
    DROP TABLE IF EXISTS LINEITEM;

    BEGIN;
    CREATE TABLE LINEITEM (
        L_ORDERKEY INT NOT NULL,
        L_PARTKEY INT NOT NULL,
        L_SUPPKEY INT NOT NULL,
        L_LINENUMBER INT NOT NULL,
        L_QUANTITY DECIMAL(15,2) NOT NULL,
        L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL,
        L_DISCOUNT DECIMAL(15,2) NOT NULL,
        L_TAX DECIMAL(15,2) NOT NULL,
        L_RETURNFLAG TEXT NOT NULL,
        L_LINESTATUS TEXT NOT NULL,
        L_SHIPDATE TIMESTAMPTZ NOT NULL,
        L_COMMITDATE TIMESTAMPTZ NOT NULL,
        L_RECEIPTDATE TIMESTAMPTZ NOT NULL,
        L_SHIPINSTRUCT TEXT NOT NULL,
        L_SHIPMODE TEXT NOT NULL,
        L_COMMENT TEXT NOT NULL,
        PRIMARY KEY (L_ORDERKEY,L_LINENUMBER)
    );
    CALL set_table_property('LINEITEM', 'clustering_key', 'L_SHIPDATE,L_ORDERKEY');
    CALL set_table_property('LINEITEM', 'segment_key', 'L_SHIPDATE');
    CALL set_table_property('LINEITEM', 'distribution_key', 'L_ORDERKEY');
    CALL set_table_property('LINEITEM', 'bitmap_columns', 'L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_RETURNFLAG,L_LINESTATUS,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
    CALL set_table_property('LINEITEM', 'dictionary_encoding_columns', 'L_RETURNFLAG,L_LINESTATUS,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
    CALL set_table_property('LINEITEM', 'time_to_live_in_seconds', '31536000');
    COMMIT;

    DROP TABLE IF EXISTS ORDERS;

    BEGIN;
    CREATE TABLE ORDERS (
        O_ORDERKEY INT NOT NULL PRIMARY KEY,
        O_CUSTKEY INT NOT NULL,
        O_ORDERSTATUS TEXT NOT NULL,
        O_TOTALPRICE DECIMAL(15,2) NOT NULL,
        O_ORDERDATE timestamptz NOT NULL,
        O_ORDERPRIORITY TEXT NOT NULL,
        O_CLERK TEXT NOT NULL,
        O_SHIPPRIORITY INT NOT NULL,
        O_COMMENT TEXT NOT NULL
    );
    CALL set_table_property('ORDERS', 'segment_key', 'O_ORDERDATE');
    CALL set_table_property('ORDERS', 'distribution_key', 'O_ORDERKEY');
    CALL set_table_property('ORDERS', 'colocate_with', 'LINEITEM');
    CALL set_table_property('ORDERS', 'bitmap_columns', 'O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT');
    CALL set_table_property('ORDERS', 'dictionary_encoding_columns', 'O_ORDERSTATUS,O_ORDERPRIORITY,O_CLERK,O_COMMENT');
    CALL set_table_property('ORDERS', 'time_to_live_in_seconds', '31536000');
    COMMIT;

    DROP TABLE IF EXISTS PARTSUPP;

    BEGIN;
    CREATE TABLE PARTSUPP (
        PS_PARTKEY INT NOT NULL,
        PS_SUPPKEY INT NOT NULL,
        PS_AVAILQTY INT NOT NULL,
        PS_SUPPLYCOST DECIMAL(15,2) NOT NULL,
        PS_COMMENT TEXT NOT NULL,
        PRIMARY KEY(PS_PARTKEY,PS_SUPPKEY)
    );
    CALL set_table_property('PARTSUPP', 'distribution_key', 'PS_PARTKEY');
    CALL set_table_property('PARTSUPP', 'colocate_with', 'LINEITEM');
    CALL set_table_property('PARTSUPP', 'bitmap_columns', 'PS_PARTKEY,PS_SUPPKEY,PS_AVAILQTY,PS_COMMENT');
    CALL set_table_property('PARTSUPP', 'dictionary_encoding_columns', 'PS_COMMENT');
    CALL set_table_property('PARTSUPP', 'time_to_live_in_seconds', '31536000');
    COMMIT;

    DROP TABLE IF EXISTS PART;

    BEGIN;
    CREATE TABLE PART (
        P_PARTKEY INT NOT NULL PRIMARY KEY,
        P_NAME TEXT NOT NULL,
        P_MFGR TEXT NOT NULL,
        P_BRAND TEXT NOT NULL,
        P_TYPE TEXT NOT NULL,
        P_SIZE INT NOT NULL,
        P_CONTAINER TEXT NOT NULL,
        P_RETAILPRICE DECIMAL(15,2) NOT NULL,
        P_COMMENT TEXT NOT NULL
    );
    CALL set_table_property('PART', 'distribution_key', 'P_PARTKEY');
    CALL set_table_property('PART', 'colocate_with', 'LINEITEM');
    CALL set_table_property('PART', 'bitmap_columns', 'P_PARTKEY,P_SIZE,P_NAME,P_MFGR,P_BRAND,P_TYPE,P_CONTAINER,P_COMMENT');
    CALL set_table_property('PART', 'dictionary_encoding_columns', 'P_NAME,P_MFGR,P_BRAND,P_TYPE,P_CONTAINER,P_COMMENT');
    CALL set_table_property('PART', 'time_to_live_in_seconds', '31536000');
    COMMIT;

    DROP TABLE IF EXISTS CUSTOMER;

    BEGIN;
    CREATE TABLE CUSTOMER (
        C_CUSTKEY INT NOT NULL PRIMARY KEY,
        C_NAME TEXT NOT NULL,
        C_ADDRESS TEXT NOT NULL,
        C_NATIONKEY INT NOT NULL,
        C_PHONE TEXT NOT NULL,
        C_ACCTBAL DECIMAL(15,2) NOT NULL,
        C_MKTSEGMENT TEXT NOT NULL,
        C_COMMENT TEXT NOT NULL
    );
    CALL set_table_property('CUSTOMER', 'distribution_key', 'C_CUSTKEY');
    CALL set_table_property('CUSTOMER', 'colocate_with', 'LINEITEM');
    CALL set_table_property('CUSTOMER', 'bitmap_columns', 'C_CUSTKEY,C_NATIONKEY,C_NAME,C_ADDRESS,C_PHONE,C_MKTSEGMENT,C_COMMENT');
    CALL set_table_property('CUSTOMER', 'dictionary_encoding_columns', 'C_NAME,C_ADDRESS,C_PHONE,C_MKTSEGMENT,C_COMMENT');
    CALL set_table_property('CUSTOMER', 'time_to_live_in_seconds', '31536000');
    COMMIT;

    DROP TABLE IF EXISTS SUPPLIER;

    BEGIN;
    CREATE TABLE SUPPLIER (
        S_SUPPKEY INT NOT NULL PRIMARY KEY,
        S_NAME TEXT NOT NULL,
        S_ADDRESS TEXT NOT NULL,
        S_NATIONKEY INT NOT NULL,
        S_PHONE TEXT NOT NULL,
        S_ACCTBAL DECIMAL(15,2) NOT NULL,
        S_COMMENT TEXT NOT NULL
    );
    CALL set_table_property('SUPPLIER', 'distribution_key', 'S_SUPPKEY');
    CALL set_table_property('SUPPLIER', 'colocate_with', 'LINEITEM');
    CALL set_table_property('SUPPLIER', 'bitmap_columns', 'S_SUPPKEY,S_NAME,S_ADDRESS,S_NATIONKEY,S_PHONE,S_COMMENT');
    CALL set_table_property('SUPPLIER', 'dictionary_encoding_columns', 'S_NAME,S_ADDRESS,S_PHONE,S_COMMENT');
    CALL set_table_property('SUPPLIER', 'time_to_live_in_seconds', '31536000');
    COMMIT;

    DROP TABLE IF EXISTS NATION;

    BEGIN;
    CREATE TABLE NATION (
        N_NATIONKEY INT NOT NULL PRIMARY KEY,
        N_NAME text NOT NULL,
        N_REGIONKEY INT NOT NULL,
        N_COMMENT text NOT NULL
    );
    CALL set_table_property('NATION', 'distribution_key', 'N_NATIONKEY');
    CALL set_table_property('NATION', 'colocate_with', 'LINEITEM');
    CALL set_table_property('NATION', 'bitmap_columns', 'N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT');
    CALL set_table_property('NATION', 'dictionary_encoding_columns', 'N_NAME,N_COMMENT');
    CALL set_table_property('NATION', 'time_to_live_in_seconds', '31536000');
    COMMIT;

    DROP TABLE IF EXISTS REGION;

    BEGIN;
    CREATE TABLE REGION (
        R_REGIONKEY INT NOT NULL PRIMARY KEY,
        R_NAME TEXT NOT NULL,
        R_COMMENT TEXT
    );
    CALL set_table_property('REGION', 'distribution_key', 'R_REGIONKEY');
    CALL set_table_property('REGION', 'colocate_with', 'LINEITEM');
    CALL set_table_property('REGION', 'bitmap_columns', 'R_REGIONKEY,R_NAME,R_COMMENT');
    CALL set_table_property('REGION', 'dictionary_encoding_columns', 'R_NAME,R_COMMENT');
    CALL set_table_property('REGION', 'time_to_live_in_seconds', '31536000');
    COMMIT;
    • After the tables are created, you can check the result in psql with the following command
    tpch_1sf=# \dt
    • On success, the output looks like this
    tpch_1sf=# \dt
                   List of relations
     Schema |   Name   | Type  |       Owner
    --------+----------+-------+--------------------
     public | customer | table | tpch_1sf_developer
     public | lineitem | table | tpch_1sf_developer
     public | nation   | table | tpch_1sf_developer
     public | orders   | table | tpch_1sf_developer
     public | part     | table | tpch_1sf_developer
     public | partsupp | table | tpch_1sf_developer
     public | region   | table | tpch_1sf_developer
     public | supplier | table | tpch_1sf_developer
    (8 rows)
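The psql command above passes the connection settings on every invocation. As an optional convenience, libpq-based clients such as psql and pgbench also read PGHOST, PGPORT, PGDATABASE, PGUSER, and PGPASSWORD from the environment, so the values can be exported once per shell session. The sketch below assumes the same <Endpoint>, <Port>, <Database>, <AccessID>, and <AccessKey> values used in this article; holo_env is a hypothetical helper name.

```shell
#!/bin/sh
# Export libpq connection variables once so later psql/pgbench calls
# can omit -h/-p/-d and the PGUSER/PGPASSWORD prefixes.
# holo_env is a hypothetical helper; arguments are your own instance values.
holo_env() {
    if [ "$#" -ne 5 ]; then
        echo "usage: holo_env <Endpoint> <Port> <Database> <AccessID> <AccessKey>" >&2
        return 1
    fi
    export PGHOST="$1"
    export PGPORT="$2"
    export PGDATABASE="$3"
    export PGUSER="$4"
    export PGPASSWORD="$5"
}

# Example with placeholder values (replace with your own before use):
# holo_env my-endpoint.example.com 80 tpch_1sf my_access_id my_access_key
# psql -c '\dt'
```

Keeping the credentials in the environment rather than on each command line also keeps them out of copied command snippets, although they remain visible to the current shell session.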

    2. Import data

    • This test mainly uses COPY FROM STDIN to import data; for details, see the official documentation. This step loads the previously generated .tbl data files into the tables created in Hologres.
    • In the data generator's directory, you can import the data with a shell script like the following
    for i in *.tbl; do
        echo "$i"
        name="${i%.tbl}"    # table name is the file name without .tbl
        PGUSER=<AccessID> PGPASSWORD=<AccessKey> psql -p <Port> -h <Endpoint> -d <Database> -c "COPY $name from stdin with delimiter '|' csv;" < "$i"
    done
    • The data import is now complete
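The results section of this article reports import time per table. One way to collect comparable numbers is to wrap each import command in a small timer. The sketch below measures wall-clock seconds with date +%s; the helper name timed is an assumption, the commented psql call assumes the connection settings are already available in the environment, and the one-second resolution is coarser than the millisecond figures reported later.

```shell
#!/bin/sh
# Run a command and print "<label><TAB><elapsed seconds>" to stderr,
# preserving the command's exit status.
# timed is a hypothetical helper; date +%s gives whole-second resolution only.
timed() {
    label="$1"
    shift
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    printf '%s\t%d\n' "$label" $((end - start)) >&2
    return "$status"
}

# Example (not executed here; assumes PGHOST, PGUSER, etc. are exported):
# for f in *.tbl; do
#     name="${f%.tbl}"
#     timed "$name" psql -c "COPY $name from stdin with delimiter '|' csv;" < "$f"
# done
timed demo sleep 1
```

Because the timing line goes to stderr, the command's own output can still be redirected or piped without being mixed with the measurements.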

    3. Collect statistics

    • To help queries execute better, you can run the following statements in psql so that Hologres collects statistics on each table.
    vacuum region;
    vacuum nation;
    vacuum supplier;
    vacuum customer;
    vacuum part;
    vacuum partsupp;
    vacuum orders;
    vacuum lineitem;
    analyze nation;
    analyze region;
    analyze lineitem;
    analyze orders;
    analyze customer;
    analyze part;
    analyze partsupp;
    analyze supplier;

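    Run the queries

    • To collect query statistics conveniently, use the pgbench tool. You can install it with the following command. (If pgbench is already installed on the test machine, make sure its version is later than 9.6, preferably major version 13 or later; otherwise the tests below may run into compatibility problems.)
    yum install postgresql-contrib
    • For convenience, you can download the 22 SQL files from the following link

    tpch_data_tpch_query.zip

    • Then upload the file to the ECS instance
    • Log in to the ECS instance, go to the directory that contains the uploaded file, and decompress it with the following shell command
    unzip tpch_data_tpch_query.zip
    • The preparation is now complete and you can test with pgbench. The following command runs a single query
    PGUSER=<AccessID> PGPASSWORD=<AccessKey> pgbench -h <Endpoint> -p <Port> -d <Database> -c <Client_Num> -t <Query_Num> -n -f xxx.sql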
    • Parameters
    Option | Parameter | Description
    -h | Endpoint of the Hologres instance | See the Hologres console
    -p | Port of the Hologres instance | See the Hologres console
    -d | Name of the database in the specified Hologres instance |
    -c | Number of clients (concurrency) | Example: 1. This test measures query performance only, not concurrency, so set it to 1
    -t | Number of test queries each client runs | 50
    -f | SQL file to test | Example: 6.sql
    • You can also run the following shell script to execute all 22 queries in one batch and write the results to the file hologres_tpch_test.out
    rm -f hologres_tpch_test.out
    echo `date +"%Y-%m-%d %H:%M:%S"` begin >> ./hologres_tpch_test.out
    for i in {1..22}
    do
        PGUSER=<AccessID> PGPASSWORD=<AccessKey> pgbench -h <Endpoint> -p <Port> -d <Database> -c <Client_Num> -t <Query_Num> -n -f ./tpch_data_tpch_query/${i}.sql >> ./hologres_tpch_test.out
    done
    • Open hologres_tpch_test.out to see the query results. A sample follows

      • transaction type: the SQL file that was executed
      • latency average: the average time over the 3 runs of that SQL file
    2021-03-23 03:50:54 begin
    pghost: hgpostcn-cn-oew21c935002-cn-hangzhou.hologres.aliyuncs.com pgport: 80 nclients: 1 nxacts: 3 dbName: tpch_100
    transaction type: ./tpch_data_tpch_query/1.sql
    scaling factor: 1
    query mode: simple
    number of clients: 1
    number of threads: 1
    number of transactions per client: 3
    number of transactions actually processed: 3/3
    latency average = 76.936 ms
    tps = 12.997850 (including connections establishing)
    tps = 15.972757 (excluding connections establishing)
    ...
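To turn hologres_tpch_test.out into a per-query table, the two fields described above can be paired with awk. The following is a minimal sketch that assumes the output layout of the sample (each transaction type line is followed by one latency average line, with latency in milliseconds); summarize_latency is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<sql file><TAB><average latency in seconds>" for each query
# found in a pgbench output file.
# Assumes the line layout of the sample output above.
summarize_latency() {
    awk '
        # Remember which SQL file the following lines belong to
        /^transaction type:/ { file = $3 }
        # Convert the reported average from milliseconds to seconds
        /^latency average =/ { printf "%s\t%.3f\n", file, $4 / 1000 }
    ' "$1"
}

# Usage: summarize_latency hologres_tpch_test.out
```

The output is tab-separated, so it can be pasted into a spreadsheet or compared against the reference query times given later in this article, which are also in seconds.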

    The 22 TPC-H queries

    Q1

    select
        l_returnflag,
        l_linestatus,
        sum(l_quantity) as sum_qty,
        sum(l_extendedprice) as sum_base_price,
        sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
        sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
        avg(l_quantity) as avg_qty,
        avg(l_extendedprice) as avg_price,
        avg(l_discount) as avg_disc,
        count(*) as count_order
    from
        lineitem
    where
        l_shipdate <= date '1998-12-01' - interval '90' day
    group by
        l_returnflag,
        l_linestatus
    order by
        l_returnflag,
        l_linestatus;

    Q2

    select
        s_acctbal,
        s_name,
        n_name,
        p_partkey,
        p_mfgr,
        s_address,
        s_phone,
        s_comment
    from
        part,
        supplier,
        partsupp,
        nation,
        region
    where
        p_partkey = ps_partkey
        and s_suppkey = ps_suppkey
        and p_size = 15
        and p_type like '%BRASS'
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and r_name = 'EUROPE'
        and ps_supplycost = (
            select
                min(ps_supplycost)
            from
                partsupp,
                supplier,
                nation,
                region
            where
                p_partkey = ps_partkey
                and s_suppkey = ps_suppkey
                and s_nationkey = n_nationkey
                and n_regionkey = r_regionkey
                and r_name = 'EUROPE'
        )
    order by
        s_acctbal desc,
        n_name,
        s_name,
        p_partkey
    limit 100;

    Q3

    select
        l_orderkey,
        sum(l_extendedprice * (1 - l_discount)) as revenue,
        o_orderdate,
        o_shippriority
    from
        customer,
        orders,
        lineitem
    where
        c_mktsegment = 'BUILDING'
        and c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate < date '1995-03-15'
        and l_shipdate > date '1995-03-15'
    group by
        l_orderkey,
        o_orderdate,
        o_shippriority
    order by
        revenue desc,
        o_orderdate
    limit 10;

    Q4

    select
        o_orderpriority,
        count(*) as order_count
    from
        orders
    where
        o_orderdate >= date '1993-07-01'
        and o_orderdate < date '1993-07-01' + interval '3' month
        and exists (
            select
                *
            from
                lineitem
            where
                l_orderkey = o_orderkey
                and l_commitdate < l_receiptdate
        )
    group by
        o_orderpriority
    order by
        o_orderpriority;

    Q5

    select
        n_name,
        sum(l_extendedprice * (1 - l_discount)) as revenue
    from
        customer,
        orders,
        lineitem,
        supplier,
        nation,
        region
    where
        c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and l_suppkey = s_suppkey
        and c_nationkey = s_nationkey
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and r_name = 'ASIA'
        and o_orderdate >= date '1994-01-01'
        and o_orderdate < date '1994-01-01' + interval '1' year
    group by
        n_name
    order by
        revenue desc;

    Q6

    select
        sum(l_extendedprice * l_discount) as revenue
    from
        lineitem
    where
        l_shipdate >= date '1994-01-01'
        and l_shipdate < date '1994-01-01' + interval '1' year
        and l_discount between 0.06 - 0.01 and 0.06 + 0.01
        and l_quantity < 24;

    Q7

    set hg_experimental_enable_double_equivalent=on;

    select
        supp_nation,
        cust_nation,
        l_year,
        sum(volume) as revenue
    from
        (
            select
                n1.n_name as supp_nation,
                n2.n_name as cust_nation,
                extract(year from l_shipdate) as l_year,
                l_extendedprice * (1 - l_discount) as volume
            from
                supplier,
                lineitem,
                orders,
                customer,
                nation n1,
                nation n2
            where
                s_suppkey = l_suppkey
                and o_orderkey = l_orderkey
                and c_custkey = o_custkey
                and s_nationkey = n1.n_nationkey
                and c_nationkey = n2.n_nationkey
                and (
                    (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY')
                    or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE')
                )
                and l_shipdate between date '1995-01-01' and date '1996-12-31'
        ) as shipping
    group by
        supp_nation,
        cust_nation,
        l_year
    order by
        supp_nation,
        cust_nation,
        l_year;

    Q8

    set hg_experimental_enable_double_equivalent=on;

    select
        o_year,
        sum(case
            when nation = 'BRAZIL' then volume
            else 0
        end) / sum(volume) as mkt_share
    from
        (
            select
                extract(year from o_orderdate) as o_year,
                l_extendedprice * (1 - l_discount) as volume,
                n2.n_name as nation
            from
                part,
                supplier,
                lineitem,
                orders,
                customer,
                nation n1,
                nation n2,
                region
            where
                p_partkey = l_partkey
                and s_suppkey = l_suppkey
                and l_orderkey = o_orderkey
                and o_custkey = c_custkey
                and c_nationkey = n1.n_nationkey
                and n1.n_regionkey = r_regionkey
                and r_name = 'AMERICA'
                and s_nationkey = n2.n_nationkey
                and o_orderdate between date '1995-01-01' and date '1996-12-31'
                and p_type = 'STANDARD POLISHED TIN'
        ) as all_nations
    group by
        o_year
    order by
        o_year;

    Q9

    set hg_experimental_enable_double_equivalent=on;

    select
        nation,
        o_year,
        sum(amount) as sum_profit
    from
        (
            select
                n_name as nation,
                extract(year from o_orderdate) as o_year,
                l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
            from
                part,
                supplier,
                lineitem,
                partsupp,
                orders,
                nation
            where
                s_suppkey = l_suppkey
                and ps_suppkey = l_suppkey
                and ps_partkey = l_partkey
                and p_partkey = l_partkey
                and o_orderkey = l_orderkey
                and s_nationkey = n_nationkey
                and p_name like '%green%'
        ) as profit
    group by
        nation,
        o_year
    order by
        nation,
        o_year desc;

    Q10

    select
        c_custkey,
        c_name,
        sum(l_extendedprice * (1 - l_discount)) as revenue,
        c_acctbal,
        n_name,
        c_address,
        c_phone,
        c_comment
    from
        customer,
        orders,
        lineitem,
        nation
    where
        c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate >= date '1993-10-01'
        and o_orderdate < date '1993-10-01' + interval '3' month
        and l_returnflag = 'R'
        and c_nationkey = n_nationkey
    group by
        c_custkey,
        c_name,
        c_acctbal,
        c_phone,
        n_name,
        c_address,
        c_comment
    order by
        revenue desc
    limit 20;

    Q11

    select
        ps_partkey,
        sum(ps_supplycost * ps_availqty) as value
    from
        partsupp,
        supplier,
        nation
    where
        ps_suppkey = s_suppkey
        and s_nationkey = n_nationkey
        and n_name = 'GERMANY'
    group by
        ps_partkey
    having
        sum(ps_supplycost * ps_availqty) > (
            select
                sum(ps_supplycost * ps_availqty) * 0.0000010000
            from
                partsupp,
                supplier,
                nation
            where
                ps_suppkey = s_suppkey
                and s_nationkey = n_nationkey
                and n_name = 'GERMANY'
        )
    order by
        value desc
    limit 100;

    Q12

    select
        l_shipmode,
        sum(case
            when o_orderpriority = '1-URGENT'
                or o_orderpriority = '2-HIGH'
                then 1
            else 0
        end) as high_line_count,
        sum(case
            when o_orderpriority <> '1-URGENT'
                and o_orderpriority <> '2-HIGH'
                then 1
            else 0
        end) as low_line_count
    from
        orders,
        lineitem
    where
        o_orderkey = l_orderkey
        and l_shipmode in ('MAIL', 'SHIP')
        and l_commitdate < l_receiptdate
        and l_shipdate < l_commitdate
        and l_receiptdate >= date '1994-01-01'
        and l_receiptdate < date '1994-01-01' + interval '1' year
    group by
        l_shipmode
    order by
        l_shipmode;

    Q13

    select
        c_count,
        count(*) as custdist
    from
        (
            select
                c_custkey,
                count(o_orderkey)
            from
                customer left outer join orders on
                    c_custkey = o_custkey
                    and o_comment not like '%special%requests%'
            group by
                c_custkey
        ) as c_orders (c_custkey, c_count)
    group by
        c_count
    order by
        custdist desc,
        c_count desc;

    Q14

    select
        100.00 * sum(case
            when p_type like 'PROMO%'
                then l_extendedprice * (1 - l_discount)
            else 0
        end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
    from
        lineitem,
        part
    where
        l_partkey = p_partkey
        and l_shipdate >= date '1995-09-01'
        and l_shipdate < date '1995-09-01' + interval '1' month;

    Q15

    with revenue0(SUPPLIER_NO, TOTAL_REVENUE) as (
        select
            l_suppkey,
            sum(l_extendedprice * (1 - l_discount))
        from
            lineitem
        where
            l_shipdate >= date '1995-12-01'
            and l_shipdate < date '1995-12-01' + interval '3' month
        group by
            l_suppkey
    )
    select
        s_suppkey,
        s_name,
        s_address,
        s_phone,
        total_revenue
    from
        supplier,
        revenue0
    where
        s_suppkey = supplier_no
        and total_revenue = (
            select
                max(total_revenue)
            from
                revenue0
        )
    order by
        s_suppkey;

    Q16

    select
        p_brand,
        p_type,
        p_size,
        count(distinct ps_suppkey) as supplier_cnt
    from
        partsupp,
        part
    where
        p_partkey = ps_partkey
        and p_brand <> 'Brand#45'
        and p_type not like 'MEDIUM POLISHED%'
        and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
        and ps_suppkey not in (
            select
                s_suppkey
            from
                supplier
            where
                s_comment like '%Customer%Complaints%'
        )
    group by
        p_brand,
        p_type,
        p_size
    order by
        supplier_cnt desc,
        p_brand,
        p_type,
        p_size;

    Q17

    select
        sum(l_extendedprice) / 7.0 as avg_yearly
    from
        lineitem,
        part
    where
        p_partkey = l_partkey
        and p_brand = 'Brand#23'
        and p_container = 'MED BOX'
        and l_quantity < (
            select
                0.2 * avg(l_quantity)
            from
                lineitem
            where
                l_partkey = p_partkey
                and l_partkey in (
                    select p_partkey from part where p_brand = 'Brand#23' and p_container = 'MED BOX'
                )
        );

    Q18

    select
        c_name,
        c_custkey,
        o_orderkey,
        o_orderdate,
        o_totalprice,
        sum(l_quantity)
    from
        customer,
        orders,
        lineitem
    where
        o_orderkey in (
            select
                l_orderkey
            from
                lineitem
            group by
                l_orderkey
            having
                sum(l_quantity) > 300
        )
        and c_custkey = o_custkey
        and o_orderkey = l_orderkey
    group by
        c_name,
        c_custkey,
        o_orderkey,
        o_orderdate,
        o_totalprice
    order by
        o_totalprice desc,
        o_orderdate
    limit 100;

    Q19

    select
        sum(l_extendedprice * (1 - l_discount)) as revenue
    from
        lineitem,
        part
    where
        (
            p_partkey = l_partkey
            and p_brand = 'Brand#12'
            and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
            and l_quantity >= 1 and l_quantity <= 1 + 10
            and p_size between 1 and 5
            and l_shipmode in ('AIR', 'AIR REG')
            and l_shipinstruct = 'DELIVER IN PERSON'
        )
        or
        (
            p_partkey = l_partkey
            and p_brand = 'Brand#23'
            and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
            and l_quantity >= 10 and l_quantity <= 10 + 10
            and p_size between 1 and 10
            and l_shipmode in ('AIR', 'AIR REG')
            and l_shipinstruct = 'DELIVER IN PERSON'
        )
        or
        (
            p_partkey = l_partkey
            and p_brand = 'Brand#34'
            and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
            and l_quantity >= 20 and l_quantity <= 20 + 10
            and p_size between 1 and 15
            and l_shipmode in ('AIR', 'AIR REG')
            and l_shipinstruct = 'DELIVER IN PERSON'
        );

    Q20

    select
        s_name,
        s_address
    from
        supplier,
        nation
    where
        s_suppkey in (
            select
                ps_suppkey
            from
                partsupp
            where
                ps_partkey in (
                    select
                        p_partkey
                    from
                        part
                    where
                        p_name like 'forest%'
                )
                and ps_availqty > (
                    select
                        0.5 * sum(l_quantity)
                    from
                        lineitem
                    where
                        l_partkey = ps_partkey
                        and l_suppkey = ps_suppkey
                        and l_shipdate >= date '1994-01-01'
                        and l_shipdate < date '1994-01-01' + interval '1' year
                )
        )
        and s_nationkey = n_nationkey
        and n_name = 'CANADA'
    order by
        s_name;

    Q21

    select
        s_name,
        count(*) as numwait
    from
        supplier,
        lineitem l1,
        orders,
        nation
    where
        s_suppkey = l1.l_suppkey
        and o_orderkey = l1.l_orderkey
        and o_orderstatus = 'F'
        and l1.l_receiptdate > l1.l_commitdate
        and exists (
            select
                *
            from
                lineitem l2
            where
                l2.l_orderkey = l1.l_orderkey
                and l2.l_suppkey <> l1.l_suppkey
        )
        and not exists (
            select
                *
            from
                lineitem l3
            where
                l3.l_orderkey = l1.l_orderkey
                and l3.l_suppkey <> l1.l_suppkey
                and l3.l_receiptdate > l3.l_commitdate
        )
        and s_nationkey = n_nationkey
        and n_name = 'SAUDI ARABIA'
    group by
        s_name
    order by
        numwait desc,
        s_name
    limit 100;

    Q22

    select
        cntrycode,
        count(*) as numcust,
        sum(c_acctbal) as totacctbal
    from
        (
            select
                substring(c_phone from 1 for 2) as cntrycode,
                c_acctbal
            from
                customer
            where
                substring(c_phone from 1 for 2) in
                    ('13', '31', '23', '29', '30', '18', '17')
                and c_acctbal > (
                    select
                        avg(c_acctbal)
                    from
                        customer
                    where
                        c_acctbal > 0.00
                        and substring(c_phone from 1 for 2) in
                            ('13', '31', '23', '29', '30', '18', '17')
                )
                and not exists (
                    select
                        *
                    from
                        orders
                    where
                        o_custkey = c_custkey
                )
        ) as custsale
    group by
        cntrycode
    order by
        cntrycode;

    Key/Value point query scenario test

    Prepare data

    1. Create the table

    • Continue to use the database created for the OLAP query scenario. This test uses the orders table of the TPC-H dataset. After connecting to Hologres through psql, you can create the table with the following statements.
    Note: The point query scenario requires a row-oriented table, so you need to create a new table; the tables used in the OLAP query scenario cannot be reused.
    DROP TABLE IF EXISTS orders_row;

    BEGIN;
    CREATE TABLE public.orders_row (
        "o_orderkey" int8 NOT NULL,
        "o_custkey" int8,
        "o_orderstatus" bpchar(1),
        "o_totalprice" numeric(15,2),
        "o_orderdate" date,
        "o_orderpriority" bpchar(15),
        "o_clerk" bpchar(15),
        "o_shippriority" int8,
        "o_comment" varchar(79),
        PRIMARY KEY (o_orderkey)
    );
    CALL SET_TABLE_PROPERTY('public.orders_row', 'orientation', 'row');
    CALL SET_TABLE_PROPERTY('public.orders_row', 'clustering_key', 'o_orderkey');
    CALL SET_TABLE_PROPERTY('public.orders_row', 'time_to_live_in_seconds', '3153600000');
    CALL SET_TABLE_PROPERTY('public.orders_row', 'distribution_key', 'o_orderkey');
    COMMIT;

    2. Import data with COPY

    • This test mainly uses COPY FROM STDIN to import data; for details, see the official documentation. This step loads the previously generated .tbl data file into the table created in Hologres.
    • In the data generator's directory, you can import the data with a command like the following
    PGUSER=<AccessID> PGPASSWORD=<AccessKey> psql -p <Port> -h <Endpoint> -d <Database> -c "COPY public.orders_row from stdin with delimiter '|' csv;" < orders.tbl

    3. Import data with INSERT INTO

    • Alternatively, because the orders table was already loaded in the OLAP scenario, you can populate the row-oriented table from it with the following SQL statement instead of using COPY
    INSERT INTO public.orders_row SELECT * FROM public.orders;

    Query

    1. Generate the query statements

    • Queries in the Key/Value point query scenario typically have the following form
    SELECT column_a,column_b,...,column_x FROM table_x WHERE pk = value_x ;

    SELECT column_a,column_b,...,column_x FROM table_x WHERE pk IN ( value_a, value_b,..., value_x ) ;
    • You can use the following script to generate the required SQL. The script generates 2 SQL files

      • kv_query_single.sql: filters on a single value
      • kv_query_in.sql: filters on multiple values; the generated statement filters on 10 random values
    rm -rf kv_query
    mkdir kv_query
    cd kv_query
    echo '\set column_values random(1,99999999)
    select O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT from public.orders_row WHERE o_orderkey =:column_values;' >> kv_query_single.sql
    echo '\set column_values1 random(1,99999999)
    \set column_values2 random(1,99999999)
    \set column_values3 random(1,99999999)
    \set column_values4 random(1,99999999)
    \set column_values5 random(1,99999999)
    \set column_values6 random(1,99999999)
    \set column_values7 random(1,99999999)
    \set column_values8 random(1,99999999)
    \set column_values9 random(1,99999999)
    \set column_values10 random(1,99999999)
    select O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT from public.orders_row WHERE o_orderkey in(:column_values1,:column_values2,:column_values3,:column_values4,:column_values5,:column_values6,:column_values7,:column_values8,:column_values9,:column_values10);' >> kv_query_in.sql
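The ten \set lines in kv_query_in.sql follow a regular pattern, so the same file can also be produced by a loop, which makes it easy to vary the number of values in the IN list. The sketch below is one way to do that; gen_kv_in_query and its two parameters (number of values and upper bound of the random key) are hypothetical, and the defaults mirror the script above.

```shell
#!/bin/sh
# Write a pgbench script that looks up N random order keys with IN (...).
# gen_kv_in_query is a hypothetical helper; defaults mirror the script above.
#   $1 = number of values in the IN list (default 10)
#   $2 = upper bound passed to pgbench's random() (default 99999999)
gen_kv_in_query() {
    n="${1:-10}"
    max="${2:-99999999}"
    vars=""
    i=1
    while [ "$i" -le "$n" ]; do
        # One pgbench variable per value
        printf '\\set column_values%d random(1,%d)\n' "$i" "$max"
        # Append ":column_valuesN", comma-separated after the first entry
        vars="${vars}${vars:+,}:column_values$i"
        i=$((i + 1))
    done
    printf 'select O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT from public.orders_row WHERE o_orderkey in(%s);\n' "$vars"
}

# Print a 3-value variant to stdout as a quick check
gen_kv_in_query 3
```

To reproduce the file used in this article, redirect the default output: gen_kv_in_query 10 > kv_query_in.sql. Lowering the second argument to a value near the largest order key actually loaded would make more lookups hit existing rows, if that is what you want to measure.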

    2. Run the queries

    • The queries are run with pgbench. You can install it with the following command
    yum install postgresql-contrib
    • You can then run the stress test with pgbench. For the single-value filter scenario
    Note: run the following command in the directory where the SQL files were generated
    PGUSER=<AccessID> PGPASSWORD=<AccessKey> pgbench -h <Endpoint> -p <Port> -d <Database> -c <Client_Num> -t <Query_Num> -n -f kv_query_single.sql
    • For the multi-value filter scenario
    Note: run the following command in the directory where the SQL files were generated
    PGUSER=<AccessID> PGPASSWORD=<AccessKey> pgbench -h <Endpoint> -p <Port> -d <Database> -c <Client_Num> -t <Query_Num> -n -f kv_query_in.sql
    • Parameters
    Option | Parameter | Description
    -h | Endpoint of the Hologres instance | See the Hologres console
    -p | Port of the Hologres instance | See the Hologres console
    -d | Name of the database in the specified Hologres instance |
    -c | Number of clients (concurrency) | Example: 8
    -t | Number of test queries each client runs | 50
    -f | SQL file to test | Example: kv_query_single.sql
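For point queries, throughput is usually the figure of interest. pgbench reports it on lines that begin with tps =, so it can be pulled out of the output with awk. This is a minimal sketch that assumes the output format shown in the OLAP sample earlier in this article; because the wording of the tps lines differs between pgbench versions, the hypothetical helper extract_tps simply takes the last such line.

```shell
#!/bin/sh
# Read pgbench output on stdin and print the value of the last "tps =" line
# (in the sample output, that is the figure excluding connection setup).
# extract_tps is a hypothetical helper name.
extract_tps() {
    awk '/^tps = / { tps = $3 } END { if (tps != "") print tps }'
}

# Usage (connection settings as in the commands above):
# pgbench ... -n -f kv_query_single.sql | extract_tps
printf 'tps = 12.997850 (including connections establishing)\ntps = 15.972757 (excluding connections establishing)\n' | extract_tps
```

Running the pgbench command once per client count and recording the extracted value gives a simple concurrency-versus-throughput series for the single-value and multi-value scenarios.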

    Reference test results

    Test data volume:

    • This test is based on the TPC-H 100 GB dataset. The row counts are as follows
    Table | Rows
    LINEITEM | 600,037,902
    ORDERS | 150,000,000
    PARTSUPP | 80,000,000
    PART | 20,000,000
    CUSTOMER | 15,000,000
    SUPPLIER | 1,000,000
    NATION | 25
    REGION | 5

    Cluster specifications

    Compute resources | Storage capacity | Software version | Notes
    64 CU (CPU: 64 cores, memory: 256 GB) | 100 GB | r0.10.20 | Default cluster configuration; shard count: 40
    128 CU (CPU: 128 cores, memory: 512 GB) | 100 GB | r0.10.20 | Default cluster configuration; shard count: 80
    Test date: June 2021

    Test results

    Data import time

    • Import times are in seconds (s).
    • Import time is the time taken to load data into Hologres internal tables.
    • When importing with COPY, each table corresponds to one data file and no concurrent import is used.
    • The figures, measured on Hologres 64 CU, are shown in the following table
    Table | Rows | Data size | COPY (public network) | COPY (VPC network) | MaxCompute foreign table
    LINEITEM | 600,037,902 | 73.6 GB | 3,070.453 | 694.364 | 148.165
    ORDERS | 150,000,000 | 16.4 GB | 691.060 | 172.529 | 37.741
    PARTSUPP | 80,000,000 | 2.3 GB | 468.560 | 107.092 | 18.488
    PART | 20,000,000 | 11.3 GB | 96.342 | 24.020 | 8.083
    CUSTOMER | 15,000,000 | 2.3 GB | 95.190 | 22.937 | 10.363
    SUPPLIER | 1,000,000 | 132 MB | 5.057 | 1.803 | 1.503
    NATION | 25 | 2 KB | 0.580 | 0.584 | 0.747
    REGION | 5 | 0.375 KB | 0.168 | 0.153 | 0.430
    Total | | 106 GB | 4,427.410 | 1,023.482 | 225.52
    • Figure: data import time by table. Horizontal axis: table name; vertical axis: import time (s). Blue is COPY over the public network, green is COPY over the VPC network, and gray is import through a MaxCompute foreign table.
    • Lower values mean faster import

    • Because of network bandwidth, importing local files with COPY over the VPC network is markedly faster than over the public network (1,023.482 s versus 4,427.410 s in total, about 4.3x). Importing from MaxCompute is in turn markedly faster than importing local files with COPY (225.52 s versus 1,023.482 s over VPC, about 4.5x).
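Import times for tables of very different sizes are easier to compare as throughput. The sketch below divides a row count by an elapsed time using awk; rows_per_sec is a hypothetical helper, and the example call uses the LINEITEM figures for COPY over the VPC network from the table above.

```shell
#!/bin/sh
# Print import throughput in rows per second, rounded to the nearest integer.
# rows_per_sec is a hypothetical helper: $1 = rows, $2 = elapsed seconds.
rows_per_sec() {
    awk -v rows="$1" -v secs="$2" 'BEGIN {
        if (secs <= 0) {
            print "elapsed time must be positive" > "/dev/stderr"
            exit 1
        }
        printf "%.0f\n", rows / secs
    }'
}

# LINEITEM, COPY over the VPC network: 600,037,902 rows in 694.364 s
rows_per_sec 600037902 694.364
```

Applying the same calculation to each column of the table shows how much of the difference between methods comes from the transport rather than from table size.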

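    Query time

    • Query times are in seconds (s).
    • All query results are based on Hologres internal tables
    • The figures are shown in the following table
    TPC-H query | Hologres 64 CU | Hologres 128 CU
    1 | 3.120 | 2.150
    2 | 0.581 | 0.467
    3 | 1.735 | 1.005
    4 | 1.558 | 0.836
    5 | 2.921 | 1.917
    6 | 0.297 | 0.096
    7 | 2.006 | 1.029
    8 | 2.674 | 1.679
    9 | 5.298 | 2.796
    10 | 1.944 | 0.924
    11 | 0.397 | 0.297
    12 | 1.531 | 0.852
    13 | 1.741 | 0.971
    14 | 0.286 | 0.160
    15 | 0.293 | 0.177
    16 | 1.223 | 1.020
    17 | 1.405 | 0.607
    18 | 3.817 | 2.169
    19 | 1.400 | 0.622
    20 | 1.358 | 0.868
    21 | 4.164 | 2.047
    22 | 1.121 | 0.654
    Total | 40.870 | 23.343
    • Figure: query time by query. Horizontal axis: query number as used in this document; vertical axis: query execution time (s). Blue is the 64 CU instance and green is the 128 CU instance.
    • Lower values mean better TPC-H performance.
    • Query time falls close to linearly as the instance grows: doubling the compute resources from 64 CU to 128 CU reduces the total time from 40.870 s to 23.343 s, a speedup of about 1.75x.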
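The scaling behaviour can be quantified as speedup (time on the smaller instance divided by time on the larger one) and scaling efficiency (speedup divided by the resource ratio). The sketch below computes both with awk; scaling is a hypothetical helper, and the example call uses the totals from the table above with a resource ratio of 2 (64 CU to 128 CU).

```shell
#!/bin/sh
# Print speedup and scaling efficiency for two measured times.
# scaling is a hypothetical helper:
#   $1 = time on the smaller instance
#   $2 = time on the larger instance
#   $3 = resource ratio between the two instances (for example 2)
scaling() {
    awk -v t1="$1" -v t2="$2" -v ratio="$3" 'BEGIN {
        speedup = t1 / t2
        printf "speedup=%.2f efficiency=%.0f%%\n", speedup, 100 * speedup / ratio
    }'
}

# Totals for all 22 queries: 40.870 s on 64 CU, 23.343 s on 128 CU
scaling 40.870 23.343 2
```

The same function can be applied to individual rows of the table to see which queries benefit most from the larger instance.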

    Original article: https://developer.aliyun.com/article/785226
