本文共 14380 字,大约阅读时间需要 47 分钟。
从可靠性和使用便利性来讲单机RDBMS完胜N多各类数据库,但当数据量到了一定量之后,又不得不寻求分布式,列存储等等解决方案。citus是基于PostgreSQL的分布式实时分析解决方案,由于其只是作为PostgreSQL的扩展插件而没有动PG内核,所有随快速随PG主版本升级,可靠性也非常值得信任。
citus在支持SQL特性上有一定的限制,比如不支持跨库事务,不支持部分join和子查询的写法等等,做选型时需要留意(大部分的分布式系统对SQL支持或多或少都有些限制,不足为奇,按场景选型即可)。
citus主要适合下面两种场景
在实时数据分析场景,单位时间的数据增量会很大,本文实测一下citus的数据插入能力(更新,删除的性能类似)。
master
worker(8个)
软件的安装都比较简单,参考官方文档即可,这里略过。
listen_addresses = '*'port = 5432max_connections = 1100shared_buffers = 32GBeffective_cache_size = 96GBwork_mem = 16MBmaintenance_work_mem = 2GBmin_wal_size = 4GBmax_wal_size = 32GBcheckpoint_completion_target = 0.9wal_buffers = 16MBdefault_statistics_target = 100shared_preload_libraries = 'citus'checkpoint_timeout = 60minwal_level = replicawal_compression = onwal_level = replicawal_log_hints = onsynchronous_commit = off
选用sysbench-1.0.3的oltp_insert.lua作为测试用例,执行的SQL的示例如下:
INSERT INTO sbtest1 (id, k, c, pad) VALUES (525449452, 5005, '28491622445-08162085385-16839726209-31171823540-28539137588-93842246002-13643098812-68836434394-95216556185-07917709646', '49165640733-86514010343-02300194630-37380434155-24438915047')
但是,sysbench-1.0.3的oltp_insert.lua中有一个bug,需要先将其改正
i = sysbench.rand.unique()
==>
i = sysbench.rand.unique() - 2147483648
CREATE TABLE sbtest1( id integer NOT NULL, k integer NOT NULL DEFAULT 0, c character(120) NOT NULL DEFAULT ''::bpchar, pad character(60) NOT NULL DEFAULT ''::bpchar, PRIMARY KEY (id));CREATE INDEX k_1 ON sbtest1(k);
src/sysbench --test=src/lua/oltp_insert.lua \--db-driver=pgsql \--pgsql-host=127.0.0.1 \--pgsql-port=5432 \--pgsql-user=postgres \--pgsql-db=dbone \--auto_inc=0 \--time=10 \--threads=128 \--report-interval=1 \run
TPS为134030
-bash-4.1$ src/sysbench --test=src/lua/oltp_insert.lua --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-port=5432 --pgsql-user=postgres --pgsql-db=dbone --auto_inc=0 --time=20 --threads=128 --report-interval=5 runWARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.sysbench 1.0.3 (using bundled LuaJIT 2.1.0-beta2)Running the test with following options:Number of threads: 128Report intermediate results every 5 second(s)Initializing random number generator from current timeInitializing worker threads...Threads started![ 5s ] thds: 128 tps: 138381.74 qps: 138381.74 (r/w/o: 0.00/138381.74/0.00) lat (ms,95%): 2.07 err/s: 0.00 reconn/s: 0.00[ 10s ] thds: 128 tps: 134268.30 qps: 134268.30 (r/w/o: 0.00/134268.30/0.00) lat (ms,95%): 2.07 err/s: 0.00 reconn/s: 0.00[ 15s ] thds: 128 tps: 132830.91 qps: 132831.11 (r/w/o: 0.00/132831.11/0.00) lat (ms,95%): 2.07 err/s: 0.00 reconn/s: 0.00[ 20s ] thds: 128 tps: 132073.81 qps: 132073.61 (r/w/o: 0.00/132073.61/0.00) lat (ms,95%): 2.03 err/s: 0.00 reconn/s: 0.00SQL statistics: queries performed: read: 0 write: 2688192 other: 0 total: 2688192 transactions: 2688192 (134030.18 per sec.) queries: 2688192 (134030.18 per sec.) ignored errors: 0 (0.00 per sec.) reconnects: 0 (0.00 per sec.)General statistics: total time: 20.0547s total number of events: 2688192Latency (ms): min: 0.10 avg: 0.95 max: 88.80 95th percentile: 2.07 sum: 2554006.85Threads fairness: events (avg/stddev): 21001.5000/178.10 execution time (avg/stddev): 19.9532/0.01
此时CPU利用率90%,已经接近瓶颈。
-bash-4.1$ iostat sdc -xk 5...avg-cpu: %user %nice %system %iowait %steal %idle 69.12 0.00 20.56 0.15 0.00 10.17Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %utilsdc 0.00 25302.60 18.20 705.20 72.80 104019.20 287.79 5.96 8.21 0.81 58.48
CREATE TABLE sbtest1( id integer NOT NULL, k integer NOT NULL DEFAULT 0, c character(120) NOT NULL DEFAULT ''::bpchar, pad character(60) NOT NULL DEFAULT ''::bpchar, PRIMARY KEY (id));CREATE INDEX k_1 ON sbtest1(k);set citus.shard_count = 128;set citus.shard_replication_factor = 1;select create_distributed_table('sbtest1','id');
/bak/soft/sysbench-1.0.3/src/sysbench --test=/bak/soft/sysbench-1.0.3/src/lua/oltp_insert.lua \--db-driver=pgsql \--pgsql-host=127.0.0.1 \--pgsql-port=5432 \--pgsql-user=postgres \--pgsql-db=dbcitus \--auto_inc=0 \--time=10 \--threads=64 \--report-interval=1 \run
TPS为44637,远低于单机。
-bash-4.1$ /bak/soft/sysbench-1.0.3/src/sysbench --test=/bak/soft/sysbench-1.0.3/src/lua/oltp_insert.lua --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-port=5432 --pgsql-user=postgres --pgsql-db=dbcitus --auto_inc=0 --time=20 --threads=64 --report-interval=5 runWARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.sysbench 1.0.3 (using bundled LuaJIT 2.1.0-beta2)Running the test with following options:Number of threads: 64Report intermediate results every 5 second(s)Initializing random number generator from current timeInitializing worker threads...Threads started![ 5s ] thds: 64 tps: 44628.01 qps: 44628.01 (r/w/o: 0.00/44628.01/0.00) lat (ms,95%): 2.48 err/s: 0.00 reconn/s: 0.00[ 10s ] thds: 64 tps: 44780.80 qps: 44780.80 (r/w/o: 0.00/44780.80/0.00) lat (ms,95%): 2.48 err/s: 0.00 reconn/s: 0.00[ 15s ] thds: 64 tps: 44701.32 qps: 44701.72 (r/w/o: 0.00/44701.72/0.00) lat (ms,95%): 2.48 err/s: 0.00 reconn/s: 0.00[ 20s ] thds: 64 tps: 44801.41 qps: 44801.01 (r/w/o: 0.00/44801.01/0.00) lat (ms,95%): 2.48 err/s: 0.00 reconn/s: 0.00SQL statistics: queries performed: read: 0 write: 894715 other: 0 total: 894715 transactions: 894715 (44637.47 per sec.) queries: 894715 (44637.47 per sec.) ignored errors: 0 (0.00 per sec.) reconnects: 0 (0.00 per sec.)General statistics: total time: 20.0421s total number of events: 894715Latency (ms): min: 0.42 avg: 1.43 max: 203.28 95th percentile: 2.48 sum: 1277233.99Threads fairness: events (avg/stddev): 13979.9219/71.15 execution time (avg/stddev): 19.9568/0.01
性能瓶颈在master的CPU上,master生成执行计划消耗了大量CPU。
master的CPU利用率达到69%
[root@node1 ~]# iostat sdc -xk 5Linux 2.6.32-431.el6.x86_64 (node1) 2017年03月13日 _x86_64_ (32 CPU)...avg-cpu: %user %nice %system %iowait %steal %idle 50.61 0.00 17.80 0.00 0.00 31.59Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %utilsdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
worker的CPU利用率只有3%,IO也不高。
[root@node5 ~]# iostat sdc -xk 5Linux 2.6.32-431.el6.x86_64 (node5) 2017年03月13日 _x86_64_ (32 CPU)...avg-cpu: %user %nice %system %iowait %steal %idle 2.24 0.00 0.63 0.00 0.00 97.13Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %utilsdc 0.00 774.00 0.00 265.80 0.00 4159.20 31.30 0.25 0.96 0.01 0.38
既然性能瓶颈在master上,那可以多搞几个master,甚至每个worker都作为master。 这并不困难,只要把master上的元数据拷贝到每个worker上,worker就可以当master用了。
在8个worker上分别执行以下SQL:
CREATE TABLE sbtest1( id integer NOT NULL, k integer NOT NULL DEFAULT 0, c character(120) NOT NULL DEFAULT ''::bpchar, pad character(60) NOT NULL DEFAULT ''::bpchar, PRIMARY KEY (id));CREATE INDEX k_1 ON sbtest1(k);copy pg_dist_node from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_node to STDOUT"';copy pg_dist_partition from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_partition to STDOUT"';copy pg_dist_shard from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_shard to STDOUT"';copy pg_dist_shard_placement from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_shard_placement to STDOUT"';copy pg_dist_colocation from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_colocation to STDOUT"';
分别修改每个worker上的oltp_insert.lua中下面一行,使各个worker上产生的主键不容易冲突
i = sysbench.rand.unique() - 2147483648
worker2
i = sysbench.rand.unique() - 2147483648 + 1
worker3
i = sysbench.rand.unique() - 2147483648 + 2
...
worker8
i = sysbench.rand.unique() - 2147483648 + 7
在每个worker上准备测试脚本
/tmp/run_oltp_insert.sh:
#!/bin/bashcd /bak/soft/sysbench-1.0.3/bak/soft/sysbench-1.0.3/src/sysbench /bak/soft/sysbench-1.0.3/src/lua/oltp_insert.lua \--db-driver=pgsql \--pgsql-host=127.0.0.1 \--pgsql-port=5432 \--pgsql-user=postgres \--pgsql-db=dbcitus \--auto_inc=0 \--time=60 \--threads=64 \--report-interval=5 \run >/tmp/run_oltp_insert.log 2>&1
在每个worker上同时执行insert测试
[root@node1 ~]# for i in `seq 1 8` ; do ssh 192.168.0.18$i /tmp/run_oltp_insert.sh >/dev/null 2>&1 & done[10] 27332[11] 27333[12] 27334[13] 27335[14] 27336[15] 27337[16] 27338[17] 27339
在其中一个worker上的执行结果如下,QPS 2.5w
-bash-4.1$ cat /tmp/run_oltp_insert.logsysbench 1.0.3 (using bundled LuaJIT 2.1.0-beta2)Running the test with following options:Number of threads: 64Report intermediate results every 5 second(s)Initializing random number generator from current timeInitializing worker threads...Threads started![ 5s ] thds: 64 tps: 25662.78 qps: 25662.78 (r/w/o: 0.00/25662.78/0.00) lat (ms,95%): 6.67 err/s: 2.60 reconn/s: 0.00[ 10s ] thds: 64 tps: 26225.38 qps: 26225.38 (r/w/o: 0.00/26225.38/0.00) lat (ms,95%): 6.67 err/s: 7.00 reconn/s: 0.00[ 15s ] thds: 64 tps: 25996.42 qps: 25996.42 (r/w/o: 0.00/25996.42/0.00) lat (ms,95%): 6.79 err/s: 11.40 reconn/s: 0.00[ 20s ] thds: 64 tps: 25670.36 qps: 25670.36 (r/w/o: 0.00/25670.36/0.00) lat (ms,95%): 6.79 err/s: 18.60 reconn/s: 0.00[ 25s ] thds: 64 tps: 25620.89 qps: 25620.89 (r/w/o: 0.00/25620.89/0.00) lat (ms,95%): 6.79 err/s: 22.60 reconn/s: 0.00[ 30s ] thds: 64 tps: 25357.39 qps: 25357.39 (r/w/o: 0.00/25357.39/0.00) lat (ms,95%): 6.91 err/s: 33.40 reconn/s: 0.00[ 35s ] thds: 64 tps: 25247.67 qps: 25247.67 (r/w/o: 0.00/25247.67/0.00) lat (ms,95%): 6.91 err/s: 34.60 reconn/s: 0.00[ 40s ] thds: 64 tps: 25069.27 qps: 25069.27 (r/w/o: 0.00/25069.27/0.00) lat (ms,95%): 6.91 err/s: 41.00 reconn/s: 0.00[ 45s ] thds: 64 tps: 24796.27 qps: 24796.27 (r/w/o: 0.00/24796.27/0.00) lat (ms,95%): 7.04 err/s: 49.40 reconn/s: 0.00[ 50s ] thds: 64 tps: 24801.00 qps: 24801.00 (r/w/o: 0.00/24801.00/0.00) lat (ms,95%): 7.04 err/s: 47.40 reconn/s: 0.00[ 55s ] thds: 64 tps: 24752.83 qps: 24752.83 (r/w/o: 0.00/24752.83/0.00) lat (ms,95%): 7.04 err/s: 57.20 reconn/s: 0.00[ 60s ] thds: 64 tps: 24533.35 qps: 24533.35 (r/w/o: 0.00/24533.35/0.00) lat (ms,95%): 7.17 err/s: 63.60 reconn/s: 0.00SQL statistics: queries performed: read: 0 write: 1518786 other: 0 total: 1518786 transactions: 1518786 (25277.24 per sec.) queries: 1518786 (25277.24 per sec.) ignored errors: 1944 (32.35 per sec.) reconnects: 0 (0.00 per sec.)General statistics: total time: 60.0829s total number of events: 1518786Latency (ms): min: 0.47 avg: 2.53 max: 1015.04 95th percentile: 6.91 sum: 3835098.18Threads fairness: events (avg/stddev): 23731.0312/213.31 execution time (avg/stddev): 59.9234/0.02
CPU消耗了66%
-bash-4.1$ iostat sdc -xk 5Linux 2.6.32-431.el6.x86_64 (node5) 2017年03月13日 _x86_64_ (32 CPU)avg-cpu: %user %nice %system %iowait %steal %idle 47.09 0.00 18.35 0.47 0.00 34.10Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %utilsdc 0.00 4195.60 0.00 19787.60 0.00 95932.80 9.70 0.98 0.05 0.02 42.54
8台worker的总qps为214362
[root@node1 ~]# for i in `seq 1 8` ; do ssh 192.168.0.18$i grep queries: /tmp/run_oltp_insert.log ; done queries: 1518786 (25277.24 per sec.) queries: 1587323 (26412.68 per sec.) queries: 1700562 (28305.06 per sec.) queries: 1631516 (27151.82 per sec.) queries: 1615778 (26885.48 per sec.) queries: 1649236 (27449.03 per sec.) queries: 1621940 (26993.20 per sec.) queries: 1554917 (25890.71 per sec.)
在master上查询插入的记录数。
dbcitus=# select count(1) from sbtest1; count ---------- 12880058(1 行记录)时间:73.197 ms
查询是在128个分片上并行执行的,所以速度很快。
转载地址:http://cyoel.baihongyu.com/