Running TPCC test on two synced PgRocks nodes ("multimaster")<div dir="ltr" style="text-align: left;" trbidi="on">
<h2 style="text-align: left;">
1. Setup</h2>
Here we use a server with two Intel Xeon Silver 4114 CPUs, 256 GB RAM, a 1 TB NVMe drive (data) and a 120 GB SSD (system); the software is Oracle Linux 7.6, PostgreSQL 11.1, RocksDB 5.17.2, and HammerDB 3.1.<br />
<h2 style="text-align: left;">
2. Known limitations</h2>
So far we have not run the PgRocks sync engine on large-scale data inputs. But it is clear that a) RocksDB write locks critically limit the speed and b) our sync engine may not be very fast. That is why we do not expect a high NOPM ("new orders per minute") score. Instead, we are looking forward to challenging the sync engine's consistency and watching for sync errors.<br />
<br />
With the current hardware and software setup, we have been able to score ~500,000 NOPM when running "pure" single-master PostgreSQL 11.1, i.e. without PgRocks plugged in. These results can be found at <a href="http://pgrocks.com/tpc/">http://pgrocks.com/tpc/</a><br />
<h2 style="text-align: left;">
3. Running test</h2>
<u> a. Data preparation</u><br />
<br />
<br />
We use HammerDB to create the initial tables, indexes and PostgreSQL PL/PGSQL functions necessary for the test. We select the "100 warehouses" option when creating the dataset, and "40 virtual users" as the number of virtual CPUs. On our setup it takes about 5 minutes to get the initial data ready, after which the custom preparation begins.<br />
<br />
Because we are interested in calculating the combined NOPM metric, we will be syncing the "<span style="font-family: "courier new" , "courier" , monospace;"><b>orders</b></span>" table between the test nodes A and B. We prepare the custom data on node A (localhost, port 5432) and copy it to node B (localhost, port 9873). Then we start PgRocks on both nodes, switch on the sync engine, and run two instances of HammerDB simultaneously, one against each node:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPexVMvqS4G-ZLwc7Bw5fj9F0k6WS-BBlqVb4NLIn2IeFqoI7tlJZqvVuQWCHqsOTUF3YTNJ6wg3k8fb-xvmtNnCrSJbaTCA1N-pLmK9NjIaPq4F0vES-NTsV5m9vVR6UqfYt6ySY4nKA/s1600/schemeA.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="400" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPexVMvqS4G-ZLwc7Bw5fj9F0k6WS-BBlqVb4NLIn2IeFqoI7tlJZqvVuQWCHqsOTUF3YTNJ6wg3k8fb-xvmtNnCrSJbaTCA1N-pLmK9NjIaPq4F0vES-NTsV5m9vVR6UqfYt6ySY4nKA/s1600/schemeA.png" /></a></div>
<br />
<br />
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
Here is how we configure the two instances of HammerDB:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8CLCWJSOJZbfKgNjwV80OGgZc4Q-m8q4NkB5ovbjFI7ggD_peefVBS2xucmM5YwNYEYC6aVr0UXC8Fx0-Ehu6N3FTAznqi_RGWB3zLwuRs3j2oWCOAkYOKffEcC4_B5M_PDH5dk5Nl8M/s1600/prepare.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="707" data-original-width="722" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8CLCWJSOJZbfKgNjwV80OGgZc4Q-m8q4NkB5ovbjFI7ggD_peefVBS2xucmM5YwNYEYC6aVr0UXC8Fx0-Ehu6N3FTAznqi_RGWB3zLwuRs3j2oWCOAkYOKffEcC4_B5M_PDH5dk5Nl8M/s1600/prepare.png" /></a></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
And here, step by step, is how we prepare the custom data:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
1. We write table <b><span style="font-family: "courier new" , "courier" , monospace;">orders</span></b> to RocksDB. For that we use the "<span style="font-family: "courier new" , "courier" , monospace;">writing.pl</span>" script, which splits the task across 16 parallel workers and creates 16 new RocksDB stores. Before we run it, we load the PL/PGSQL library "<span style="font-family: "courier new" , "courier" , monospace;">lib_v3.sql</span>". Both the scripts and the library are found in the PgRocks GitHub repository at <a href="https://github.com/kl3eo/PgRocks">https://github.com/kl3eo/PgRocks</a></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">[postgres@alex6 db_v2]$ psql tpcc < lib_v3.sql</span></span><br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">[postgres@alex6 db_v2]$ ./writing.pl orders 5432 tpcc 3000000 1 16 <br /><br />[postgres@alex6 db_v2]$ du /opt/nvme/pgrocks/</span></span></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br />8304 /opt/nvme/pgrocks/rocksdb_7<br />11108 /opt/nvme/pgrocks/rocksdb_15<br />10332 /opt/nvme/pgrocks/rocksdb_14<br />8516 /opt/nvme/pgrocks/rocksdb_6<br />9644 /opt/nvme/pgrocks/rocksdb_13<br />9652 /opt/nvme/pgrocks/rocksdb_4<br />11524 /opt/nvme/pgrocks/rocksdb_16<br />8952 /opt/nvme/pgrocks/rocksdb_5<br />8084 /opt/nvme/pgrocks/rocksdb_8<br />8520 /opt/nvme/pgrocks/rocksdb_11<br />11128 /opt/nvme/pgrocks/rocksdb_2<br />10360 /opt/nvme/pgrocks/rocksdb_3<br />8952 /opt/nvme/pgrocks/rocksdb_12<br />11436 /opt/nvme/pgrocks/rocksdb_1<br />8100 /opt/nvme/pgrocks/rocksdb_9<br />8288 /opt/nvme/pgrocks/rocksdb_10<br />152904 /opt/nvme/pgrocks/<br /><br /><br />tpcc=# select count(*) from orders_v3_dna;<br /> count <br />---------<br /> 3000000<br />(1 row)</span></span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
So far we've written 3 million records from <b><span style="font-family: "courier new" , "courier" , monospace;">orders</span></b> to 16 RocksDB stores located at <span style="font-family: "courier new" , "courier" , monospace;">/opt/nvme/pgrocks/</span>, and created a new table <b><span style="font-family: "courier new" , "courier" , monospace;">orders_v3_dna</span></b> holding keys to all the RocksDB values that contain rows of the original table.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
2. Now we read the values from the RocksDB stores back into the PostgreSQL database "tpcc", into a new table <b><span style="font-family: "courier new" , "courier" , monospace;">orders_c0</span></b> which holds the rows of the old table plus two new columns, "key" and "mark"; the latter holds the number of the RocksDB store where the record can be found under the given "key".</div>
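<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
As a quick sanity check (a sketch, assuming the column names above), we can look at how the rows are spread over the 16 stores:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">-- each "mark" is a RocksDB store number; the counts should be roughly even<br />tpcc=# select mark, count(*) from orders_c0 group by mark order by mark;</span></span></div>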
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">[postgres@alex6 db_v2]$ ./reading.pl orders 5432 tpcc 1 1 8 && ./reading.pl orders 5432 tpcc 0 9 8 2>&1 &</span></span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
(We ran two 8-way parallel scripts consecutively in order to stay within our resource limits.)</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
3. We don't need the old table, so we drop it and rename the new one:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">drop table orders;<br />alter table orders_c0 rename to orders;</span></span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
4. We create a new index, change the owner, and replace the old table's indexes with new ones that are unique in combination with our "key" column:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">create index orders_c0_idx on orders (key,mark);<br />alter table orders owner to tpcc;<br />alter table orders_v3_dna owner to tpcc;<br />alter table orders add constraint orders_i1 primary key (key,o_w_id, o_d_id, o_id);<br />create unique index orders_i2 on orders (key,o_w_id,o_d_id,o_c_id,o_id);</span></span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
5. The data is now ready, but there are still changes to be made to the functions used by HammerDB:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">[postgres@alex6 db_v2]$ psql tpcc < neword.sql</span></span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
6. Now we dump the database "tpcc", then create and restore it on node B:<br />
<br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">[postgres@alex6 db_v2]$ pg_dump -c -o tpcc > /opt/nvme/csv/tpcc_pgrocks.out</span></span><br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">[postgres@alex6 db_v2]$ createdb -p 9873 tpcc</span></span><br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">[postgres@alex6 db_v2]$ psql -p 9873 tpcc < /opt/nvme/csv/tpcc_pgrocks.out</span></span><br />
<br />
<br />
That's it with the preparation. </div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
Now we are ready to begin the test, so we start the sync engine:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">[postgres@alex6 db_v2]$ cat s.sh<br />#!/bin/bash<br /><br />node server_pool_i.js >> server.log 2>&1 &<br />node server_pool_u.js >> server.log 2>&1 &</span></span><br />
<br /></div>
<div style="text-align: left;">
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"></span></span></div>
<div style="text-align: left;">
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"><br /></span></span></div>
<u> b. The test</u><br />
<br />
We run the test for 30 minutes with a 1-minute ramp-up.<br />
<u><br /></u>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaWEWwwdDx7cOsSFC4QHH_jyAl3BogyAN9fpN15cchzHmMbp59KUq6FzvxHABRneB1DplUpCfcGf7p7hI2R75clpmBLWdS4AUt5HLYsqNjpfKbv9VPg8sIMfwRJyWZpnc9WiEYn25neaI/s1600/2x10_30min.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="900" data-original-width="1600" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaWEWwwdDx7cOsSFC4QHH_jyAl3BogyAN9fpN15cchzHmMbp59KUq6FzvxHABRneB1DplUpCfcGf7p7hI2R75clpmBLWdS4AUt5HLYsqNjpfKbv9VPg8sIMfwRJyWZpnc9WiEYn25neaI/s640/2x10_30min.png" width="640" /></a></div>
<u><br /></u>
<br />
<div style="text-align: left;">
CPU utilization is at ~70%:</div>
<div style="text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQc3U4125tQ-_Tv8DXigZLU4XJFapXSUYKLQTOcpv-4TrHfbmCx9h6PJ3avbKhRir0om-feTiRdsDLu_B3K9qNgQd3tOVwhEZ3YYMVufgo7ctjR7m6NvvEgdnzhLjvYMDOYCIANl-mpS8/s1600/cpus_2x10_30min.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="900" data-original-width="1600" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQc3U4125tQ-_Tv8DXigZLU4XJFapXSUYKLQTOcpv-4TrHfbmCx9h6PJ3avbKhRir0om-feTiRdsDLu_B3K9qNgQd3tOVwhEZ3YYMVufgo7ctjR7m6NvvEgdnzhLjvYMDOYCIANl-mpS8/s640/cpus_2x10_30min.png" width="640" /></a></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
</div>
<div style="text-align: left;">
And as we foresaw, we didn't get really high scores:</div>
<div style="text-align: left;">
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9JeprAc3TWFfP1KeKtlKz9c6kR2okFRfrRHErDzvn7ITN_q75trNog4kNJvw-_if4NlNzzq9BvWxb5h7fDcPVTdcl4Q_CjB6bk-_uud6YUfTFYxkli-wJpzXW2pItNG4HYkf5cRuBE2I/s1600/res_2x10_30min_a.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="751" data-original-width="1186" height="405" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9JeprAc3TWFfP1KeKtlKz9c6kR2okFRfrRHErDzvn7ITN_q75trNog4kNJvw-_if4NlNzzq9BvWxb5h7fDcPVTdcl4Q_CjB6bk-_uud6YUfTFYxkli-wJpzXW2pItNG4HYkf5cRuBE2I/s640/res_2x10_30min_a.png" width="640" /></a></div>
</div>
<div style="text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
We achieved a total of >1,000 NOPM from the two nodes, i.e. ~16 new records per second (1,000 / 60 ≈ 16.7). In order to compare the TPCC result with simply pushing INSERTs as fast as possible, we run a simple script, "ins.pl":</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">[postgres@alex6 db_v2]$ ./ins.pl > ins.log.1 2>&1 & ./ins.pl 9873 > ins.log.2 2>&1 & ./ins.pl > ins.log.3 2>&1 & ./ins.pl 9873 > ins.log.4 2>&1 & ./ins.pl > ins.log.5 2>&1 & ./ins.pl 9873 > ins.log.6 2>&1 & ./ins.pl > ins.log.7 2>&1 & ./ins.pl 9873 > ins.log.8 2>&1 & ./ins.pl > ins.log.9 2>&1 & ./ins.pl 9873 > ins.log.10 2>&1 &</span></span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
When it's done, we have inserted 100,000 new records in 513 sec (RUN#2), which is ~195 rows per second. No errors were detected in the sync between the nodes; that is, the tables <b><span style="font-family: "courier new" , "courier" , monospace;">orders</span></b> on both nodes are identical and contain the same number of rows originating from each node:<br />
<br />
RUN#1<br />
Tini=1544250995<br />
Tfin=1544251501<br />
deltaT=506sec<br />
deltaR=99998<br />
v=198 r/s<br />
<br />
RUN#2 </div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
Tini=1544252195<br />
Tfin=1544252708<br />
deltaT=513s<br />
deltaR=100000<br />
v=195 r/s </div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">tpcc=# select to_timestamp(1544252195);<br /> to_timestamp <br />------------------------<br /> 2018-12-08 09:56:35+03<br />(1 row)<br /><br />tpcc=# select to_timestamp(1544252708);<br /> to_timestamp <br />------------------------<br /> 2018-12-08 10:05:08+03<br />(1 row)<br /><br />tpcc=# select count(*) from orders where to_timestamp(substr(key::text,2,10)::bigint) >= '2018-12-08 09:56:35' and to_timestamp(substr(key::text,2,10)::bigint) <= '2018-12-08 10:05:08';<br /> count <br />--------<br /> 100000<br />(1 row)<br /><br />tpcc=# select count(*) from orders where to_timestamp(substr(key::text,2,10)::bigint) >= '2018-12-08 09:56:35' and to_timestamp(substr(key::text,2,10)::bigint) <= '2018-12-08 10:05:08' and substr(key::text,0,2)::int = 1;<br /> count <br />-------<br /> 50000<br />(1 row)<br /><br />tpcc=# select count(*) from orders where to_timestamp(substr(key::text,2,10)::bigint) >= '2018-12-08 09:56:35' and to_timestamp(substr(key::text,2,10)::bigint) <= '2018-12-08 10:05:08' and substr(key::text,0,2)::int = 2;<br /> count <br />-------<br /> 50000<br />(1 row)</span></span></div>
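<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
These checks rely on the key layout used above: the leading digit of each key is the originating node number, and the next 16 digits are the epoch timestamp in microseconds at packing time (so their first 10 digits are whole seconds). A minimal decoding sketch, reusing the same substr() calls:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">-- originating node and creation time encoded in each key<br />tpcc=# select substr(key::text,0,2)::int as node_no,<br />       to_timestamp(substr(key::text,2,10)::bigint) as created_at<br />       from orders limit 3;</span></span></div>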
<h2 style="text-align: left;">
4. RocksDB compaction style and locking</h2>
<div style="text-align: left;">
We first used RocksDB with its default "level" style of compaction and the default number of six levels. This turned out to be an inappropriate configuration: although node sync produced no errors, we couldn't get more than 50 r/s of inserts in the first run, and with each new run the speed decreased dramatically, down to a mere 10 r/s after the fourth run.<br />
<br />
After trying various options we settled on a variant of "universal" style compaction with just the default L0 compaction, i.e.<br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">rocksdb_options_optimize_level_style_compaction(rocksdb_options, 0);</span></span><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"> </span></span><br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">rocksdb_options_set_create_if_missing(rocksdb_options, createIfMissing ? 1 : 0);<br />rocksdb_options_set_compression(rocksdb_options, 0); // no compression<br />rocksdb_options_set_write_buffer_size(rocksdb_options, 64 << 20); // 64MB<br />rocksdb_options_set_max_open_files(rocksdb_options, 300);<br />rocksdb_options_set_num_levels(rocksdb_options,1); // a single level, no L1+</span></span><br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">rocksdb_options_set_keep_log_file_num(rocksdb_options,2);<br />rocksdb_options_set_allow_concurrent_memtable_write(rocksdb_options, 1);<br />rocksdb_options_set_min_write_buffer_number_to_merge(rocksdb_options,1);<br />rocksdb_options_set_compaction_style(rocksdb_options,1); // 1 = universal compaction</span></span><br />
<br />
Selecting these options allowed us to avoid the appearance of thousands of small one-record-per-file SST files during compaction to L1, and this turned out to be the factor that kept the overall speed stable instead of dropping with each new run.<br />
<br />
We have a loop in our C API "rocksdb_open()" wrapper that waits for the previous writer's lock to be released:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"> /* retry while another writer still holds the store's lock: up to 1000 attempts, 10 ms apart */<br /> int counter = 1000;</span></span></div>
<div style="text-align: left;">
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"> while (err != NULL && counter > 0) {<br /> err = NULL;<br /> usleep(10000);<br /> rocksdb = rocksdb_open(rocksdb_options, name, &err);<br /> counter--;<br /> }<br /> <br /> if (err != NULL) {<br /> ereport(ERROR,<br /> (errcode(ERRCODE_NO_DATA),<br /> errmsg("[rocksdb], open error: %s", err)));<br /> } else {<br /> <br /> FILE *file = fopen("/tmp/report", "a+");<br /><br /> if (file)<br /> { <br /> fprintf(file, "%d\n", 1000-counter);<br /> fclose(file); <br /> }<br /> }</span></span></div>
<div style="text-align: left;">
<br />
<br /></div>
<div style="text-align: left;">
When we check the "/tmp/report" file (see excerpt below), we see that about 30% of the entries are non-zero, with a mean of 2.88 retries, i.e. 2.88 * 10 ms ≈ 29 ms. Averaged over all records (0.3 * 29 ms), that gives a mean delay of about 10 ms per record; with a resulting speed of still ~200 rows per second, the overall impact of locking must be heavy.<br />
... <br />
... <br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">0<br />5<br />0<br />0<br />0<br />0<br />0<br />0<br />2<br />0<br />0<br />2<br />0<br />0<br />1<br />5</span></span><br />
...<br />
... <br />
<br />
If we trim the code above and decrease the "usleep" argument to 5,000 (i.e. 5 ms), we may expect some growth in overall speed. Let's double-check:<br />
<br />
RUN#3<br />
Tini=1544267952<br />
Tfin=1544268445<br />
deltaT=493s<br />
deltaR=99999<br />
v = 203 r/s<br />
<br />
RUN#4<br />
Tini=1544269110<br />
Tfin=1544269618<br />
deltaT=508s<br />
deltaR=100000<br />
v = 197 r/s<br />
<br />
Alas, there is no conclusive evidence that our assumption was correct. But the speed no longer dropped with each new run, even as the total number of rows in <b><span style="font-family: "courier new" , "courier" , monospace;">orders</span></b> and in RocksDB kept growing:<br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">tpcc=# select count(*) from orders;<br /> count <br />---------<br /> 3399997<br />(1 row)</span></span><br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br /></span></span>
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">[postgres@alex6 db_v2]$ du /opt/nvme/pgrocks/<br />10824 /opt/nvme/pgrocks/rocksdb_7<br />13608 /opt/nvme/pgrocks/rocksdb_15<br />12884 /opt/nvme/pgrocks/rocksdb_14<br />11060 /opt/nvme/pgrocks/rocksdb_6<br />12196 /opt/nvme/pgrocks/rocksdb_13<br />12192 /opt/nvme/pgrocks/rocksdb_4<br />14084 /opt/nvme/pgrocks/rocksdb_16<br />11496 /opt/nvme/pgrocks/rocksdb_5<br />10640 /opt/nvme/pgrocks/rocksdb_8<br />11072 /opt/nvme/pgrocks/rocksdb_11<br />13672 /opt/nvme/pgrocks/rocksdb_2<br />12920 /opt/nvme/pgrocks/rocksdb_3<br />11496 /opt/nvme/pgrocks/rocksdb_12<br />13988 /opt/nvme/pgrocks/rocksdb_1<br />10664 /opt/nvme/pgrocks/rocksdb_9<br />10824 /opt/nvme/pgrocks/rocksdb_10<br />193624 /opt/nvme/pgrocks/</span></span><br />
<br />
(We may have lost 3 row inserts over the 4 runs due to a race condition while terminating the scripts, but the total number of new rows is the same on both nodes, as there were no errors during the sync process.)<br />
<br />
NB: CPU load averaged about 35% during the 10 parallel runs of "ins.pl".<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjLJETs10rYVNZH_sruKJaIRlvq6rZ3UspGiAgDz7nqkAbJx98_F2-uqTNPkwGlT9iTq4dBw8CE3UlR7_JBduxDnerwF3w24_uBzn8xDV0geD1L3eQgdX78hx67J2ziE1-DZjoF9pUGt4/s1600/10_us_ins_cpu.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="1024" data-original-width="1280" height="512" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjLJETs10rYVNZH_sruKJaIRlvq6rZ3UspGiAgDz7nqkAbJx98_F2-uqTNPkwGlT9iTq4dBw8CE3UlR7_JBduxDnerwF3w24_uBzn8xDV0geD1L3eQgdX78hx67J2ziE1-DZjoF9pUGt4/s640/10_us_ins_cpu.png" width="640" /></a></div>
<br />
<br /></div>
<h2 style="text-align: left;">
5. Conclusions</h2>
<div style="text-align: left;">
Here we have tested the consistency of PgRocks multimaster mode on a model with just two nodes running on a single server. While the overall speed was low, we have seen stable and error-free work of the sync engine under the stress of data inserts and updates. RocksDB locking and an imperfect choice of compaction style may have been the two main factors behind the low results of this test compared to single-master PostgreSQL.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<br /></div>
</div>
Using PgRocks to sync and rewind data on PostgreSQL nodes<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: left;">
<div style="text-align: left;">
<b>Introduction</b></div>
</div>
<br />
The following article discusses the usage of <a href="https://github.com/kl3eo/PgRocks">PgRocks</a> on hosts co-running the same web app (i.e. nodes), with the goal of providing a solution for syncing data between PostgreSQL tables on these nodes.<br />
<br />
The software we need on them is built for a web application running on Apache and connecting to a PostgreSQL database. We will assume a copy of our own corporate web application is running on each node.<br />
<br />
The data used by our app consists of local tables as well as 'global' tables, and only the latter must be shared in full by all nodes.<br />
<br />
So we are going to deal with this situation by syncing only these global tables to <i>all</i> of our PG nodes, while the rest of the tables contain locally significant data and are not kept in sync between the nodes. This approach is discussed further in this article.<br />
<br />
How to <a href="http://alexshevlakov.blogspot.ru/2018/04/how-to-set-up-postgresql-with-plugin-to.html">set up PgRocks</a> on a single Linux machine is described in a separate post. We compile and install PostgreSQL and RocksDB with PgRocks on each of the nodes according to that guide.<br />
<br />
Next we are concerned with the choice of network layer implementation, because we have to share the bottom layer of data between our nodes. This bottom layer of data is supposed to reside in a RocksDB store that is accessible from all the nodes for reading and writing. While there are several ways to set up such a RocksDB store, for simplicity in this article we assume a network file system (NFS) is sufficient to demonstrate the key principles of our approach (see Fig. 1).<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTDz0sHH-8Jz8FnyDM24Zj-PS_Wept_L16vPzdAltPeaKfv317p_lkKKJQgmvlRunNWYCS5_YVI1uvBK33DDvdj3stzSEgEWS5UKFKRJnPGSnjDs0cWUZu7CST-amjvglPgoW8pvApxBE/s1600/planB.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="360" data-original-width="720" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTDz0sHH-8Jz8FnyDM24Zj-PS_Wept_L16vPzdAltPeaKfv317p_lkKKJQgmvlRunNWYCS5_YVI1uvBK33DDvdj3stzSEgEWS5UKFKRJnPGSnjDs0cWUZu7CST-amjvglPgoW8pvApxBE/s400/planB.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1. The network layer implementation scheme.</td></tr>
</tbody></table>
<br />
Also for simplicity, the examples further in this article are limited to only two nodes.<br />
<br />
<div style="text-align: left;">
<b>1. Preparing initial Cache-0</b></div>
<br />
We will refer to all those tables that have to be identical on all nodes as <i>Cache-0</i>; our goal is to keep the Cache-0 tables synchronized with each other between the nodes.<br />
<br />
We assume that Cache-0 tables will normally be accessed by the users of our application through its web interface. Access to them by superusers via the console must not interfere with the web application.<br />
<br />
Before our web application is launched on the nodes, everything has to be prepared on each node.<br />
<br />
<i>First</i>, we assign a number to each PostgreSQL node through a new <span style="font-family: "courier new" , "courier" , monospace;">postgresql.conf</span> parameter, <i>node_number</i>, namely:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">node_number = 1</span><br />
<br />
and<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">node_number = 2</span><br />
...<br />
<br />
and so on, on all nodes, respectively.<br />
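<br />
Each node can then verify its own identity; a minimal sketch, using the <span style="font-family: "courier new" , "courier" , monospace;">rocks_get_node_number()</span> helper that appears in the trigger code later in this article:<br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">select rocks_get_node_number(); -- returns 2 when run on node 2</span></span></span><br />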
<br />
<i>Second</i>, we prepare <i>a PL/PGSQL library</i> that contains the functions we need for communication between the PostgreSQL nodes, the <i>sync engine</i> (see later) and the RocksDB store. We will discuss the details of these functions later in this article. Most importantly for now, this library contains functions for creating a RocksDB store, packing initial data into it and filling the Cache-0 from it. We have to deploy this library on all participating nodes.<br />
<br />
<i>Third</i>, on one of our nodes only, we execute functions from the library in order to create a new RocksDB store on a mounted NFS partition, pack our initial data into it and simultaneously create the initial <i>DNA (Data Numeric Audit) table</i> <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span> with keys to the data just packed into the RocksDB store.<br />
<br />
Here is what the version-one DNA table (i.e. <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span>) looks like after packing, e.g., the table '<span style="font-family: "courier new" , "courier" , monospace;">players</span>' into RocksDB store #1:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;"><b>select * from v1_dna_1;<br /><br /> </b> tab | rev | key <br />---------+-----+-------------------<br /> players | 1 | 11522687676247617<br /> players | 1 | 11522687676247896<br /> players | 1 | 11522687676247929</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">.... </span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">....</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">.... </span></span><br />
<br />
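The structure of the DNA table implied by this output, reconstructed as a sketch (the column types are assumptions based on how the columns are used below):<br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">create table v1_dna_1 (<br /> tab text, -- name of the source table<br /> rev int, -- revision number (see 'Data rewind')<br /> key bigint -- key of the packed row in the RocksDB store<br />);</span></span></span><br />
<br />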
<i>Fourth</i>, we clone this initial <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span> to all other nodes; that is, we send its text <span style="font-family: "courier new" , "courier" , monospace;">pg_dump</span> output and then restore it with <span style="font-family: "courier new" , "courier" , monospace;">psql</span> on each node.<br />
<br />
<i>Fifth</i>, we mount the NFS partition of our RocksDB store on all other nodes and then use our library on each of the nodes to create the initial Cache-0 tables and fill them with data from the RocksDB store.<br />
<br />
Now that we have identical <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span> and Cache-0 on each of the participating nodes, we are ready to start the sync engine and then to start the web app. <br />
<br />
<div style="text-align: left;">
<b>2. PL/PGSQL library</b></div>
<br />
This library contains a number of SQL functions for the PG nodes to communicate with the sync engine and with the RocksDB store; e.g. here is a function that packs data from the initial PG table into the RocksDB store:<br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">CREATE OR REPLACE FUNCTION _w_v1_dna(text, text, int)<br />RETURNS integer AS $$<br />DECLARE <br /> result integer := 0;<br />BEGIN<br />EXECUTE format('truncate tmp');<br />EXECUTE format('insert into tmp select * from %s where %s = %s',$1,$2,$3);<br />EXECUTE format('insert into v1_dna_%s (tab, rev, key) select ''%s'',1,row_to_csv_rocks(%s,tmp) from tmp',$3,$1,$3);<br />EXECUTE format('select rocks_close()');<br /> return result;<br />END;$$ LANGUAGE plpgsql;</span></span></span><br />
<br />
We have assumed here that the initial table has a column of type <span style="font-family: "courier new" , "courier" , monospace;">integer</span> that serves as the parameter for subdividing this table (see the second EXECUTE line in the code above); this column's name is the second and its filter value the third parameter of <span style="font-family: "courier new" , "courier" , monospace;">_w_v1_dna(text,text,int)</span>.<br />
<br />
We can send our initial data to different RocksDB stores based on the value of this parameter. We will further assume that for all initial data this parameter is set to 1, so that we have to deal only with RocksDB store #1 and <span style="font-family: "courier new" , "courier" , monospace;">v1_dna_1</span>. We also initialize our <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span> with the <span style="font-family: "courier new" , "courier" , monospace;">rev</span> column set to 1 for all new rows. (See how this revision works later in 'Data rewind'.)<br />
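<br />
For example, packing all rows of '<span style="font-family: "courier new" , "courier" , monospace;">players</span>' that have this parameter column equal to 1 into RocksDB store #1 would look as follows (a sketch; it assumes the helper table <span style="font-family: "courier new" , "courier" , monospace;">tmp</span> used inside the function already exists):<br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">select _w_v1_dna('players', 'club_id', 1);</span></span></span><br />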
<br />
As soon as the RocksDB store is created and <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span> is filled with new keys, we are ready to proceed and unpack the data into the Cache-0 table (see the following code). The resulting <span style="font-family: "courier new" , "courier" , monospace;">mytable_c0_1</span> differs from the initial table <span style="font-family: "courier new" , "courier" , monospace;">mytable</span> in an additional column <span style="font-family: "courier new" , "courier" , monospace;">key</span> that keeps the RocksDB store key:<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;"><br />if $1 = 'mytable' then<br />EXECUTE format('create table if not exists %s_c0_%s with oids as select key, d.* <br />from v1_dna_%s, rocks_csv_to_record(%s,v1_dna_%s.key) </span></span></span><br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;"><i>-- we have to describe our fields here</i></span></span></span><br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">d(<i>field1 field1_type, ...,..., fieldN fieldN_type</i>)</span></span></span><br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">-- <i>why we select only positive revisions from dna table here, see "Data rewind"</i></span></span></span><br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">where v1_dna_%s.rev > 0 and v1_dna_%s.tab = ''%s''',$1,$2,$2,$2,$2,$2,$2,$1);</span></span></span>
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">elsif $1 = '</span></span></span><span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;"><span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;"><span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">myothertable</span></span></span></span></span></span>'</span></span></span><br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">...</span></span></span><br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">...<br />end if;</span></span></span><br />
<br />
The resulting Cache-0 table gets an important trigger for CUD events occurring on it:<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;"><br />EXECUTE format('CREATE TRIGGER %s_c0_%s_i<br />BEFORE INSERT OR UPDATE OR DELETE <br />ON %s_c0_%s<br />FOR EACH ROW<br />EXECUTE PROCEDURE _w_new_row(''%s'',%s);',$1,$2,$1,$2,$1,$2);</span></span></span><br />
<br />
The procedure <span style="font-family: "courier new" , "courier" , monospace;">_w_new_row</span> called from this trigger acts depending on the type of event; here is an example of its code for INSERT:<br />
<br />
(...) <br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">IF (TG_OP = 'INSERT') THEN<br />EXECUTE 'select rocks_get_node_number()' into v;<br />IF (v = floor(NEW.key/10000000000000000)) THEN<br />drop table if exists tmp;<br />CREATE TEMP TABLE tmp on commit drop as select NEW.*;<br />ALTER TABLE tmp drop column key;<br />EXECUTE format('select row_to_csv_rocks(%s,tmp) from tmp',TG_ARGV[1]) into v;<br />EXECUTE format('select rocks_close()');<br />EXECUTE format('select cast(%s as text)',v) into vc;<br />payload := (SELECT TG_ARGV[1] || ',' || json_build_object('tab',TG_ARGV[0],'rev',1,'key',vc) );<br /> perform pg_notify('v1_dna_insert', payload); <br />EXECUTE format('insert into v1_dna_%s (tab, rev, key) values(''%s'',1,%s)',TG_ARGV[1],TG_ARGV[0],v);<br />NEW.key = v;<br />END IF;<br />RETURN NEW;</span></span></span><br />
<br />
(...)<br />
<br />
In this code, we first check whether the newly arrived row originates from our own node (i.e. from the web app); if it is really 'new', then it has to be packed into the RocksDB store. The sync engine will be notified of this new row via <span style="font-family: "courier new" , "courier" , monospace;">pg_notify</span> with a payload of a JSON object containing the new row's key, its revision number (which is '1', as it is for all new data) and the table name, i.e. all the data needed for the INSERT into <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span>. This payload is processed by the sync engine, which executes this INSERT on all other nodes.<br />
<br />
If the first check fails, this row must have been added to our local Cache-0 by the sync engine, and the data has already been packed into the RocksDB store by another originating node.<br />
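<br />
Put together, for a notification with payload '<span style="font-family: "courier new" , "courier" , monospace;">1,{"tab" : "players", "rev" : 1, "key" : "11522687676247617"}</span>' the sync engine effectively runs the following on every other node (a sketch; the key is taken from the <span style="font-family: "courier new" , "courier" , monospace;">v1_dna_1</span> sample above):<br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">insert into v1_dna_1 (tab, rev, key) values ('players', 1, 11522687676247617);<br />-- ...after which the node fetches this single row from the RocksDB store into its Cache-0</span></span></span><br />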
<br />
<div style="text-align: left;">
<b>3. Sync engine</b></div>
<br />
The sync engine is a dedicated TCP server that keeps connections to all the nodes. Once an INSERT or UPDATE event occurs on a node's Cache-0 table, the triggered procedure sends <span style="font-family: "courier new" , "courier" , monospace;">pg_notify</span> to the sync engine with a payload containing the new RocksDB record key, the revision and the table name. The sync engine then makes all other nodes write this information into their DNAs and fetch this single row from the RocksDB store into their Cache-0.<br />
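<br />
The same channel can be watched from a plain <span style="font-family: "courier new" , "courier" , monospace;">psql</span> session for debugging; a minimal sketch, with the channel name taken from the trigger code above:<br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">listen v1_dna_insert;<br />-- from now on, every INSERT on a Cache-0 table of this node surfaces in this session<br />-- as an asynchronous notification carrying the store number and the JSON payload</span></span></span><br />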
<br />
It is worth mentioning that UPDATE and INSERT both create a new record in the RocksDB store and in <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span>. The former data of the updated record is not removed from the RocksDB store, but its <span style="font-family: "courier new" , "courier" , monospace;">rev</span> value in <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span> changes to <span style="font-family: "courier new" , "courier" , monospace;">(-1)*rev</span>, while the record corresponding to the updated row gets <span style="font-family: "courier new" , "courier" , monospace;">(rev + 1)</span>.<br />
<br />
The sync engine also handles DELETE events, keeping all Cache-0 tables in sync with the originating node. However, even when a row gets deleted from the Cache-0, nothing at all changes in the RocksDB store. Only its <span style="font-family: "courier new" , "courier" , monospace;">rev</span> value in <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span> becomes '0', recording the fact that the data row has been deleted from the Cache-0.<br />
<br />
Handling the <span style="font-family: "courier new" , "courier" , monospace;">rev</span> values this way makes it possible to dig up all previous revisions of our data (see 'Data rewind').<br />
<br />
In terms of PostgreSQL <a href="https://www.postgresql.org/docs/current/static/different-replication-solutions.html">replication schemes</a>, the way the sync engine is supposed to work complies with the definition of "middleware for <span class="term">synchronous multimaster replication</span>". Moving the actually shared data away into an external store and limiting the direct traffic between nodes to <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span> updates reduces the communication overhead that is often regarded as characteristic of this scheme.<br />
<br />
We have developed the code for the sync engine based on <span style="font-family: "verdana" , sans-serif;">node.js</span>. The code for this server is at <a href="https://github.com/kl3eo/PgRocks">https://github.com/kl3eo/PgRocks</a>.<br />
<br />
<div style="text-align: left;">
<b><span class="term">4. Data consistency </span></b></div>
<br />
Building such a hybrid architecture, we have to be concerned with the known issues arising in a multi-node environment - e.g. double-spending transactions, <a href="https://blog.2ndquadrant.com/postgresql-anti-patterns-read-modify-write-cycles/">"read-modify-write" cycles</a>, etc. While PostgreSQL provides for its data consistency in many ways through MVCC and locking, we have to set up a mechanism of our own, based upon locking.<br />
<br />
First, we see that UPDATE and DELETE events occurring on Cache-0 tables may pose problems for us if two or more nodes simultaneously fire this kind of event on data rows sharing the same key.<br />
<br />
To avoid such problems we have to make sure that no actual writing to the RocksDB store is done by our "workhorse" function <span style="font-family: "courier new" , "courier" , monospace;">row_to_csv_rocks</span> unless the originating node has notified the sync engine about the forthcoming UPDATE. Upon receiving this notification, the sync engine checks the node list and applies appropriate locking on the rows of interest on all other nodes from the checklist. If no lock can be applied on a node, this node has to be dropped from the active node list. The sync engine sends back a confirmation of locking to the originating node, and only then will the trigger function on the Cache-0 table proceed and write a new row to the RocksDB store, perform pg_notify, and so on (see the example trigger code in 'PL/PGSQL library').<br />
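<br />
One way such a per-row lock could look on the other nodes is a plain <span style="font-family: "courier new" , "courier" , monospace;">SELECT ... FOR UPDATE NOWAIT</span>. This is a sketch of the idea only, not necessarily how the sync engine implements locking internally:<br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">-- lock the row about to be changed; NOWAIT makes an unobtainable lock fail fast,<br />-- so the sync engine can drop the unresponsive node from the active node list<br />select key from players_c0_5 where key = 11523440602279198 for update nowait;</span></span></span><br />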
<br />
A similar mechanism works when a DELETE event occurs, although in this case there is no writing to the RocksDB store, only locking of the Cache-0 record (or maybe even the whole table) while its <span style="font-family: "courier new" , "courier" , monospace;">v1_dna</span> is being synchronized and the Cache-0 table is updated.<br />
<br />
<div style="text-align: left;">
<b>
5. Data rewind</b></div>
<br />
Since the keys are constructed from the epoch time at the moment of packing, it is easy to see how we can bring the data in the Cache-0 table back to exactly what it was at a certain moment in the past. For that we add another column to our DNA table, named "<span style="font-family: "verdana" , sans-serif;">ancestor</span>", of type <span style="font-family: "courier new" , "courier" , monospace;">bigint</span>, which holds the key of the "time zero" record of the Cache-0 row, so that each updated row remembers the key of its ancestor with <span style="font-family: "courier new" , "courier" , monospace;">rev</span>=1. This modified version of the DNA table has four columns, and we call it <span style="font-family: "courier new" , "courier" , monospace;">v2_dna</span>.<br />
<br />
The mechanism of data rewind is as follows: the revision of the current data row is incremented by 1 each time the row is updated, while the previous revision changes its sign. All rows start with revision "1" in the DNA table. One UPDATE on a row in a Cache-0 table results in a new row with a new key in the DNA table, with a rev value of "2", while the DNA record that previously referred to the modified Cache-0 row changes its rev to (-1). When a record is deleted from the Cache-0 table, its <span style="font-family: "courier new" , "courier" , monospace;">rev</span> in the DNA table becomes "0".<br />
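<br />
In DNA terms, a single UPDATE therefore boils down to the following bookkeeping (a sketch, using the keys from the 'Mike Bryan' example below):<br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">-- flip the sign of the current revision of the old record...<br />update v2_dna_5 set rev = -rev where key = 11523440602279198;<br />-- ...and register the new record under a fresh key, with rev incremented and the same ancestor<br />insert into v2_dna_5 (tab, rev, key, ancestor)<br />values ('players', 2, 11523440812685302, 11523440602279198);</span></span></span><br />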
<br />
The following <span style="font-family: "courier new" , "courier" , monospace;">psql</span> output illustrates the process of data revision change occurring in the DNA table.<br />
<br />
We have our data in table 'players' and proceed with the filter 'club_id = 5' to create a Cache-0 table '<span style="font-family: "verdana" , sans-serif;">players_c0_5</span>':<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">test=# <b>select * from players where club_id = 5;</b><br /> name | aka | dob | weight | height | last_seen | club_id <br />----------------+-----------+------------+--------+--------+---------------------+---------</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;"> Petra Che | Baby | 1989-05-30 | 62.3 | 180 | 2017-12-04 11:20:00 | 5<br /> Mike Bryan | Joker | 1984-08-21 | 80 | 180 | 2018-04-08 06:25:00 | 5<br /> Alicia Silver | Checkmate | 1995-09-02 | 57.8 | 168 | 2017-05-24 09:05:00 | 5<br /> Fernan Ozy | Beast | 1967-12-12 | 92.7 | 177 | 2017-12-30 16:10:00 | 5<br /> Ivan Lebed | Ruso | 1959-01-10 | 77.4 | 180 | 2018-03-30 14:30:00 | 5<br />(<span style="font-family: "courier new" , "courier" , monospace;">5</span> rows)</span></span><br />
<br />
Initialize v2_dna:<br />
<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">test=# <b>select _i_v2_dna('players','club_id',4,6);</b><br /> _i_v2_dna <br />-----------<br /> 1<br />(1 row)</span></span><br />
<br />
Pack the data into the RocksDB store#5 and scan it:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">test=# <b>select _wc_v2_dna('players','club_id',4,6);</b><br />NOTICE: Executing _w_v2_dna('players','club_id',5):<br /> _wc_v2_dna <br />------------<br /> 0<br />(1 row)</span></span><br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">test=# <b>select *,</b></span></span><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;"><b><span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">'0x' || LPAD(to_hex(key), 16, '0') as hex_key</span></span> from v2_dna_5;</b><br /> tab | rev | key | ancestor | hex_key <br />---------+-----+-------------------+-------------------+--------------------<br /> players | 1 | 11523440602279181 | 11523440602279181 | 0x0028f08217b06d0d<br /> players | 1 | 11523440602279198 | 11523440602279198 | 0x0028f08217b06d1e<br /> players | 1 | 11523440602279215 | 11523440602279215 | 0x0028f08217b06d2f<br /> players | 1 | 11523440602279232 | 11523440602279232 | 0x0028f08217b06d40<br /> players | 1 | 11523440602279249 | 11523440602279249 | 0x0028f08217b06d51<br />(<span style="font-family: "courier new" , "courier" , monospace;">5</span> rows)</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">postgres@quercus:~$ <b>ldb --db=/tmp/rocksdb_5 scan --key_hex</b></span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">0x0028F08217B06D0D : Petra Che|Baby|"1989-05-30"|62.3|180|565701600000000|5|<br />0x0028F08217B06D1E : Mike Bryan|Joker|"1984-08-21"|80|180|576483900000000|5|<br />0x0028F08217B06D2F : Alicia Silver|Checkmate|"1995-09-02"|57.8|168|548931900000000|5|<br />0x0028F08217B06D40 : Fernan Ozy|Beast|"1967-12-12"|92.7|177|567965400000000|5|<br />0x0028F08217B06D51 : Ivan Lebed|Ruso|"1959-01-10"|77.4|180|575735400000000|5|</span></span><br />
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br /></span></span>
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br /></span></span>
Unpack the data from the RocksDB store#5 to Cache-0 table:<br />
<br />
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">test=# <b><span style="background-color: cyan;">select _p_c0('players',4,6,1);</span></b><br />NOTICE: Executing _checkout_c0('players',5,1):<br />NOTICE: table "players_c0_5" does not exist, skipping<br /> _p_c0 <br />-------<br /> 0<br />(1 row)<br /><br />test=# <b>select * from players_c0_5;</b><br /> key | name | aka | dob | weight | height | last_seen <br />-------------------+----------------+-----------+------------+--------+--------+---------------------<br /> 11523440602279181 | Petra Che | Baby | 1989-05-30 | 62.3 | 180 | 2017-12-04 11:20:00<br /> 11523440602279198 | Mike Bryan | Joker | 1984-08-21 | 80 | 180 | 2018-04-08 06:25:00<br /> 11523440602279215 | Alicia Silver | Checkmate | 1995-09-02 | 57.8 | 168 | 2017-05-24 09:05:00<br /> 11523440602279232 | Fernan Ozy | Beast | 1967-12-12 | 92.7 | 177 | 2017-12-30 16:10:00<br /> 11523440602279249 | Ivan Lebed | Ruso | 1959-01-10 | 77.4 | 180 | 2018-03-30 14:30:00<br />(<span style="font-family: "courier new" , "courier" , monospace;">5</span> rows)</span></span><br />
<br />
<br />
Update a record:<br />
<br />
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">test=# <b><span style="background-color: #b6d7a8;">update players_c0_5 set weight=80.2 where name = 'Mike Bryan';</span></b><br />NOTICE: table "tmp" does not exist, skipping<br />UPDATE 1<br />test=# <b>select * from v2_dna_5;</b><br /> tab | rev | key | ancestor <br />---------+-----+-------------------+-------------------<br /> players | 1 | 11523440602279181 | 11523440602279181<br /> players | 1 | 11523440602279215 | 11523440602279215<br /> players | 1 | 11523440602279232 | 11523440602279232<br /> players | 1 | 11523440602279249 | 11523440602279249<br /> players | 2 | 11523440812685302 | 11523440602279198<br /> players | -1 | 11523440602279198 | 11523440602279198<br />(<span style="font-family: "courier new" , "courier" , monospace;">6</span> rows)</span></span><span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br /></span></span><br />
<br />
Update the same record again:<br />
<br />
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">test=# <b><span style="background-color: #f1c232;">update players_c0_5 set height=181 where name = 'Mike Bryan';</span></b><br />NOTICE: table "tmp" does not exist, skipping<br />UPDATE 1<br />test=# <b>select * from v2_dna_5;</b><br /> tab | rev | key | ancestor <br />---------+-----+-------------------+-------------------<br /> players | 1 | 11523440602279181 | 11523440602279181<br /> players | 1 | 11523440602279215 | 11523440602279215<br /> players | 1 | 11523440602279232 | 11523440602279232<br /> players | 1 | 11523440602279249 | 11523440602279249<br /> players | -1 | 11523440602279198 | 11523440602279198<br /> players | 3 | 11523440850187058 | 11523440602279198<br /> players | -2 | 11523440812685302 | 11523440602279198<br />(<span style="font-family: "courier new" , "courier" , monospace;">7</span> rows)</span></span><span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"></span></span><br />
<br />
Delete this record and scan the RocksDB store: <br />
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br />test=# <b><span style="background-color: yellow;">delete from players_c0_5 where name = 'Mike Bryan';</span></b><br />DELETE 1</span></span><br />
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">test=# <b>select *,'0x' || LPAD(to_hex(key), 16, '0') as hex_key from v2_dna_5;</b><br /> tab | rev | key | ancestor | hex_key <br />---------+-----+-------------------+-------------------+--------------------<br /> players | 1 | 11523440602279181 | 11523440602279181 | 0x0028f08217b06d0d<br /> players | 1 | 11523440602279215 | 11523440602279215 | 0x0028f08217b06d2f<br /> players | 1 | 11523440602279232 | 11523440602279232 | 0x0028f08217b06d40<br /> players | 1 | 11523440602279249 | 11523440602279249 | 0x0028f08217b06d51<br /> players | -1 | 11523440602279198 | 11523440602279198 | 0x0028f08217b06d1e<br /> players | -2 | 11523440812685302 | 11523440602279198 | 0x0028f082243af7f6<br /> players | 0 | 11523440887503450 | 11523440602279198 | 0x0028f08228b09a5a<br /> players | -3 | 11523440850187058 | 11523440602279198 | 0x0028f08226773332<br />(<span style="font-family: "courier new" , "courier" , monospace;">8</span> rows)</span></span><br />
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br /></span></span>
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br /></span></span>
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">postgres@quercus:~$ <b>ldb --db=/tmp/rocksdb_5 scan --key_hex</b></span></span><br />
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">0x0028F08217B06D0D : Petra Che|Baby|"1989-05-30"|62.3|180|565701600000000|5|<br />0x0028F08217B06D1E : Mike Bryan|Joker|"1984-08-21"|80|180|576483900000000|5|<br />0x0028F08217B06D2F : Alicia Silver|Checkmate|"1995-09-02"|57.8|168|548931900000000|5|<br />0x0028F08217B06D40 : Fernan Ozy|Beast|"1967-12-12"|92.7|177|567965400000000|5|<br />0x0028F08217B06D51 : Ivan Lebed|Ruso|"1959-01-10"|77.4|180|575735400000000|5|<br />0x0028F082243AF7F6 : Mike Bryan|Joker|"1984-08-21"|80.2|180|576483900000000|<br />0x0028F08226773332 : Mike Bryan|Joker|"1984-08-21"|80.2|181|576483900000000|<br /> </span></span><span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;"></span></span> <br />
<br />
Now, to accomplish our task of rewinding the Cache-0 data to a certain moment in the past, all we have to do is select the keys that satisfy the given time clause and feed them to our <span style="font-family: "courier new" , "courier" , monospace;">rocks_csv_to_record</span> function:<br />
<br />
<span style="background-color: #eeeeee;"><span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">CREATE OR REPLACE FUNCTION _rewind_c0(text,int,timestamptz)<br />RETURNS integer AS $$<br />BEGIN<br />if ($1 = 'players') then<br />else <br />RETURN -1;<br />end if;<br />EXECUTE format('drop table if exists %s_c0_%s',$1,$2); <br />EXECUTE format('create temp table tmp on commit drop as select max(abs(rev)) as max, min(abs(rev)) as min, ancestor from v2_dna_%s where right(key::text,16)::bigint < EXTRACT(EPOCH FROM timestamptz ''%s'')*1000000 and tab = ''%s'' group by ancestor',$2,$3,$1);<br />if $1 = 'players' then<br />EXECUTE format('create unlogged table if not exists %s_c0_%s with oids as select key, d.* <br />from v2_dna_%s, tmp, rocks_csv_to_record(%s,v2_dna_%s.key) <br />d(name text, aka text, dob date, weight float, height int, last_seen timestamp<br />) where abs(v2_dna_%s.rev) = tmp.max and tmp.min != 0 and v2_dna_%s.ancestor = tmp.ancestor',$1,$2,$2,$2,$2,$2,$2);<br />end if;<br />EXECUTE format('select rocks_close()');<br />EXECUTE format('CREATE TRIGGER %s_c0_%s_i<br />BEFORE INSERT OR UPDATE OR DELETE <br />ON %s_c0_%s<br />FOR EACH ROW<br />EXECUTE PROCEDURE _w_new_row(''%s'',%s);',$1,$2,$1,$2,$1,$2);<br />RETURN 0;<br />END;$$ LANGUAGE plpgsql;</span></span></span><br />
<br />
Now run this function four times, at 30-second intervals, around the time we modified and then deleted our record:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: xx-small;">test=# <b>select _rewind_c0('players',5,'<span style="background-color: cyan;">2018-04-11 13:00:00</span>');</b><br /> _rewind_c0 <br />------------<br /> 0<br />(1 row)<br /><br />test=# <b>select * from players_c0_5;</b><br /> key | name | aka | dob | weight | height | last_seen <br />-------------------+----------------+-----------+------------+--------+--------+---------------------<br /> 11523440602279181 | Petra Che | Baby | 1989-05-30 | 62.3 | 180 | 2017-12-04 11:20:00<br /> 11523440602279198 | Mike Bryan | Joker | 1984-08-21 | <span style="background-color: cyan;">80</span> | <span style="background-color: cyan;">180</span> | 2018-04-08 06:25:00<br /> 11523440602279215 | Alicia Silver | Checkmate | 1995-09-02 | 57.8 | 168 | 2017-05-24 09:05:00<br /> 11523440602279232 | Fernan Ozy | Beast | 1967-12-12 | 92.7 | 177 | 2017-12-30 16:10:00<br /> 11523440602279249 | Ivan Lebed | Ruso | 1959-01-10 | 77.4 | 180 | 2018-03-30 14:30:00<br />(<span style="font-family: "courier new" , "courier" , monospace;">5</span> rows)</span></span><br />
<br />
<span style="font-size: xx-small;"><span style="font-family: "courier new" , "courier" , monospace;">test=# <b>select _rewind_c0('players',5,'<span style="background-color: #b6d7a8;">2018-04-11 13:00:30</span>');</b><br /> _rewind_c0 <br />------------<br /> 0<br />(1 row)<br /><br />test=# <b>select * from players_c0_5;</b><br /> key | name | aka | dob | weight | height | last_seen <br />-------------------+----------------+-----------+------------+--------+--------+---------------------<br /> 11523440602279181 | Petra Che | Baby | 1989-05-30 | 62.3 | 180 | 2017-12-04 11:20:00<br /> 11523440602279215 | Alicia Silver | Checkmate | 1995-09-02 | 57.8 | 168 | 2017-05-24 09:05:00<br /> 11523440602279232 | Fernan Ozy | Beast | 1967-12-12 | 92.7 | 177 | 2017-12-30 16:10:00<br /> 11523440602279249 | Ivan Lebed | Ruso | 1959-01-10 | 77.4 | 180 | 2018-03-30 14:30:00<br /> 11523440812685302 | Mike Bryan | Joker | 1984-08-21 | <span style="background-color: #b6d7a8;">80.2</span> | <span style="background-color: cyan;">180</span> | 2018-04-08 06:25:00<br />(<span style="font-family: "courier new" , "courier" , monospace;">5</span> rows)<br /><br />test=# <b>select _rewind_c0('players',5,'<span style="background-color: #f1c232;">2018-04-11 13:01</span>');</b><br /> _rewind_c0 <br />------------<br /> 0<br />(1 row)<br /><br />test=# <b>select * from players_c0_5;</b><br /> key | name | aka | dob | weight | height | last_seen <br />-------------------+----------------+-----------+------------+--------+--------+---------------------<br /> 11523440602279181 | Petra Che | Baby | 1989-05-30 | 62.3 | 180 | 2017-12-04 11:20:00<br /> 11523440602279215 | Alicia Silver | Checkmate | 1995-09-02 | 57.8 | 168 | 2017-05-24 09:05:00<br /> 11523440602279232 | Fernan Ozy | Beast | 1967-12-12 | 92.7 | 177 | 2017-12-30 16:10:00<br /> 11523440602279249 | Ivan Lebed | Ruso | 1959-01-10 | 77.4 | 180 | 2018-03-30 14:30:00<br /> 11523440850187058 | Mike Bryan | Joker | 1984-08-21 | <span style="background-color: #b6d7a8;">80.2</span> | <span style="background-color: #f1c232;">181</span> | 2018-04-08 06:25:00<br />(<span style="font-family: "courier new" , "courier" , monospace;">5</span> rows)<br /><br />test=# <b>select _rewind_c0('players',5,'<span style="background-color: yellow;">2018-04-11 13:01:30</span>');</b><br /> _rewind_c0 <br />------------<br /> 0<br />(1 row)<br /><br />test=# <b>select * from players_c0_5;</b><br /> key | name | aka | dob | weight | height | last_seen <br />-------------------+----------------+-----------+------------+--------+--------+---------------------<br /> 11523440602279181 | Petra Che | Baby | 1989-05-30 | 62.3 | 180 | 2017-12-04 11:20:00<br /> 11523440602279215 | Alicia Silver | Checkmate | 1995-09-02 | 57.8 | 168 | 2017-05-24 09:05:00<br /> 11523440602279232 | Fernan Ozy | Beast | 1967-12-12 | 92.7 | 177 | 2017-12-30 16:10:00<br /> 11523440602279249 | Ivan Lebed | Ruso | 1959-01-10 | 77.4 | 180 | 2018-03-30 14:30:00<br />(<span style="font-family: "courier new" , "courier" , monospace;">4</span> rows)</span></span><br />
<br />
<br />
It is clear that the correct working of the described algorithm depends on a consistent time measure and on the uniqueness of the keys assigned while packing the data into the RocksDB store. We have to either synchronize the clocks of all nodes to better than 1 microsecond, or make all nodes take the time from the same clock. The latter solution works fine with NTP time converted to epoch time on each node. As for the uniqueness of the keys, a simple check in the PgRocks code is enough to make sure that no two subsequent records are assigned the same key while packing data into the RocksDB store, even when rows arrive within 1 microsecond of each other.<br />
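<br />
As a quick sanity check (a sketch, not part of PgRocks itself), key uniqueness can be verified directly in the DNA table with plain SQL - an empty result means no two records share a key:<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">select key, count(*) from v2_dna_5 group by key having count(*) > 1;</span></span></b><br />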
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<b>
Conclusion</b></div>
<div style="text-align: left;">
</div>
In this article we have shown how a hybrid SQL-NoSQL architecture is implemented in a new plugin our team has developed over the past half year. The code of PgRocks is released under the GNU GPL license and can be found at https://github.com/kl3eo/PgRocks.<br />
<br />
There is a simple live demo at <a href="http://pgrocks.com/">http://pgrocks.com</a> that shows how two PostgreSQL nodes work in sync following the methods described in this article.<br />
<span class="term"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span class="term"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<br />
<br /></div>
Alexhttp://www.blogger.com/profile/12010600222763457685noreply@blogger.com0tag:blogger.com,1999:blog-702565810121406228.post-29195734666380180712018-04-03T04:17:00.000-07:002018-11-21T11:31:58.864-08:00How To Set Up PostgreSQL With Plugin To RocksDB On Linux<div dir="ltr" style="text-align: left;" trbidi="on">
<b>Introduction</b><br />
<br />
PostgreSQL is a well-known, "solid and mature" (C) SQL database management system. <br />
<br />
RocksDB is a persistent key-value store and C/C++ library.<br />
<br />
PgRocks is a plugin that uses a DNA-like PostgreSQL data structure to link a PostgreSQL database with a RocksDB store and easily migrate data between them.<br />
<br />
In this tutorial, you'll install and configure PgRocks, and by doing so you will set up a data bridge between PostgreSQL and RocksDB. <br />
<br />
This requires modifying the source code of both PostgreSQL and RocksDB, then compiling and installing them - or re-installing, if they are already on the system.<br />
<br />
At the end of this guide you will see examples of how the plugin works.<br />
<br />
When you're finished, you will understand how to flow data from PostgreSQL to RocksDB and back with the new plugin.<br />
<br />
<br />
<b>Prerequisites</b><br />
<br />
Before you begin this guide you'll need the following:<br />
<br />
<ul style="text-align: left;">
<li>a typical modern Linux system, such as CentOS 7, Ubuntu 16, or Debian 9;</li>
<li>root access (or a non-root user with sudo privileges) on this system;</li>
<li><span style="font-family: "courier new" , "courier" , monospace;">git</span> and optionally <span style="font-family: "courier new" , "courier" , monospace;">wget</span> installed on the system;</li>
<li>gcc compiler version 4.8 or later, and make.</li>
</ul>
<br />
<br />
<b>Step 1 - Preparing all necessary source code trees </b><br />
<br />
First, download the latest stable source code for PostgreSQL (<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">postgresql-xx.x</span></span></b>).<br />
<br />
(NB: Before selecting the version of PostgreSQL for this tutorial, check our GitHub repository <b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">https://github.com/kl3eo/PgRocks</span></span></b> for the latest supported version of PG. We try to stay in sync with the official PG.)<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7kx-1IdnApH2pfRaiq7Z6N9Vo1H_cZra8GufbAo0YdvaTwlS41V9wWST1BOUeXtIzpq3S26zB16tWpHurMhLGp7W_IrNi-44z1gBL-l7scZldfjc-jttQ3p6afaD-t6ZTpWg-mtSJ2ZA/s1600/github.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="567" data-original-width="611" height="296" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7kx-1IdnApH2pfRaiq7Z6N9Vo1H_cZra8GufbAo0YdvaTwlS41V9wWST1BOUeXtIzpq3S26zB16tWpHurMhLGp7W_IrNi-44z1gBL-l7scZldfjc-jttQ3p6afaD-t6ZTpWg-mtSJ2ZA/s320/github.png" width="320" /></a></div>
<br />
<br />
<br />
Unpack the source code tree into a new work directory, e.g. <span style="font-family: "courier new" , "courier" , monospace;">work_pgrocks</span>:<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">mkdir work_pgrocks<br />cd work_pgrocks<br />wget https://ftp.postgresql.org/pub/source/v<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">xx.x</span></span><span style="font-family: "courier new" , "courier" , monospace;"></span>/postgresql-<span style="font-family: "courier new" , "courier" , monospace;">xx</span>.<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">x</span></span>.tar.gz<br />tar zxvf postgresql-<span style="font-family: "courier new" , "courier" , monospace;">xx</span>.<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">x</span></span>.tar.gz</span></span></b><br />
<br />
In the same work directory (i.e. <span style="font-family: "courier new" , "courier" , monospace;">work_pgrocks</span>), get the code for RocksDB and PgRocks with <span style="font-family: "courier new" , "courier" , monospace;">git</span>:<br />
<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">git clone https://github.com/facebook/rocksdb<br />git clone https://github.com/kl3eo/PgRocks</span></span></b><br />
<br />
You have three new directories in your work directory: <span style="font-family: "courier new" , "courier" , monospace;">postgresql-xx.x</span>, <span style="font-family: "courier new" , "courier" , monospace;">rocksdb</span> and <span style="font-family: "courier new" , "courier" , monospace;">PgRocks</span>. <br />
<br />
In Steps 2 and 3 you will modify RocksDB and PostgreSQL with files from the PgRocks tree.<br />
<br />
<br />
<b>Step 2 - Patching the RocksDB tree, then compiling and installing the new RocksDB libraries</b><br />
<br />
First, locate the file with the code for patching RocksDB: the file named <span style="font-family: "courier new" , "courier" , monospace;">rocksdb_get2.patch</span> at the top level of the <span style="font-family: "courier new" , "courier" , monospace;">PgRocks</span> directory.<br />
<br />
Apply this patch:<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">cd rocksdb<br />git apply ../PgRocks/rocksdb_get2.patch</span></span></b><br />
<br />
Before you proceed with compilation of the new RocksDB libraries, check that all the libs required for RocksDB are installed on the system. Methods of their installation depending on the system are explained at <a href="https://github.com/facebook/rocksdb/blob/master/INSTALL.md">https://github.com/facebook/rocksdb/blob/master/INSTALL.md</a><br />
<br />
For example, to install the required libs on Debian or Ubuntu, run the following command as root (or with sudo prefix if run from a user account):<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">apt-get install libgflags-dev libsnappy-dev zlib1g-dev libbz2-dev liblz4-dev<span style="font-family: "courier new" , "courier" , monospace;"> </span>libzstd-dev libnuma-dev</span></span></b><br />
<br />
Now you're ready to compile and install (or possibly re-install) RocksDB. Go ahead with it:<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">make static_lib<br />make shared_lib</span></span></b><br />
<br />
If everything went fine, go ahead and install (as root):<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">make install</span></span></b><br />
<br />
Optionally, you may compile and install <span style="font-family: "courier new" , "courier" , monospace;">ldb</span>, a useful tool for working with a RocksDB store. Go to the <span style="font-family: "courier new" , "courier" , monospace;">tools</span> directory and run these commands:<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">g++ -std=gnu++11 ldb.cc -o ldb -lrocksdb<br />cp -a ldb /usr/local/bin</span></span></b><br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">cd ../.. </span></span></b><br />
<br />
Now you're ready to proceed to Step 3 and modify the code in PostgreSQL tree, then compile and install it.<br />
<br />
<b>Step 3 - Modifying the code in PostgreSQL tree, compiling and installing it</b><br />
<br />
Run the following commands to overwrite the source files in PostgreSQL tree with newer files of the same name from <span style="font-family: "courier new" , "courier" , monospace;">PgRocks/src</span> catalog:<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"><b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">cd postgresql-<span style="font-family: "courier new" , "courier" , monospace;">xx.x</span></span></span></b><br />cp -a ../PgRocks/src/* src/</span></span></b><br />
<br />
Then configure the PostgreSQL source code:<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">./configure</span></span></b><br />
<br />
This command creates a file named <span style="font-family: "courier new" , "courier" , monospace;">Makefile.global</span> in the <span style="font-family: "courier new" , "courier" , monospace;">src</span> directory. Open it in an editor and modify it, manually replacing only one line - the one that starts with "LIBS". Replace it with the corresponding line from the file named <span style="font-family: "courier new" , "courier" , monospace;">compile</span> at the top level of the PgRocks source tree. <br />
<br />
These lines only differ in libs required by <span style="font-family: "courier new" , "courier" , monospace;">gcc</span> to compile the code.<br />
<br />
If in your <span style="font-family: "courier new" , "courier" , monospace;">src/Makefi<span style="font-family: "courier new" , "courier" , monospace;">le.global</span></span> there is the line:<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"><br /> </span></span><br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">LIBS = -lpthread -lz -lreadline -lrt -lcrypt -ldl -lm</span></span></b><br />
<br />
then you have to replace it with this line:<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">LIBS = -lpthread -lz -lreadline -lrt -lcrypt -ldl -lm -lsnappy -lgflags -lbz2 -llz4 -lzstd -lnuma -lrocksdb</span></span></b><br />
<br />
After editing the <span style="font-size: small;"><span style="font-family: "courier new" , "courier" , monospace;">src/Makefile.global</span></span>, you are ready to compile PostgreSQL:<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">make</span></span></b><br />
<br />
Now that the compilation is complete and you're ready to install, please make sure that no older version of PostgreSQL exists on your system. <br />
<br />
If another version of PostgreSQL was installed automatically (and probably never used), remove it now; e.g. on Debian and Ubuntu run the following command as root:<br />
<b><br /><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">apt remove postgresql-10</span></span></b><br />
<br />
Now as root run the following command from the top level of PostgreSQL tree:<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">make install</span></span></b><br />
<br />
Make sure the newly installed libs will be found by the linker:<br />
<b><br /><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">ldconfig</span></span></b><br />
<br />
Only if you're installing PostgreSQL for the very first time should you run the following commands, also as root:<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"><br /><b>adduser postgres<br />mkdir /usr/local/pgsql/data<br />chown postgres /usr/local/pgsql/data<br />su - postgres<br />/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data</b></span></span><b></b><br />
<b><br /></b>
Start the new PostgreSQL daemon as user <b>postgres</b> to see if everything is fine:<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">su - postgres <br />/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >logfile 2>&1 &<br />/usr/local/pgsql/bin/psql template1</span></span></b><br />
<br />
If you are able to connect to the daemon with this command, you will see this prompt:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">template1=#</span></span><br />
<br />
You're ready to make the new functions visible to PostgreSQL in the next Step 4.<br />
<br />
<br />
<b>Step 4 - Making new internal functions visible by PostgreSQL</b><br />
<br />
Open the file named <span style="font-size: small;"><span style="font-family: "courier new" , "courier" , monospace;">pg_catalog.pg_proc.txt</span></span>, found at the top level of the <span style="font-family: "courier new" , "courier" , monospace;">PgRocks</span> directory, in an editor; select, copy, and then paste <i>everything</i> into the prompt you opened at the previous Step. This will result in many lines of output like this:<br />
<br />
(...)<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">drop table if exists jar;<br />DROP TABLE<br />delete from pg_catalog.pg_proc where proname='rocks_get_node_number';<br />DELETE 0<br />create table jar as select * from pg_catalog.pg_proc where oid = 89;<br />SELECT 1</span></span><br />
<br />
(...)<br />
<br />
If you made mistakes while copy-pasting, repeat the whole Step 4 from beginning to end: copy and paste ALL the lines from <span style="font-size: small;"><span style="font-family: "courier new" , "courier" , monospace;">pg_catalog.pg_proc.txt</span></span> into the prompt.<br />
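<br />
To verify that the registration worked, you can list the newly visible functions (a hedged check - it assumes the rocks naming used throughout this tutorial):<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">select proname from pg_catalog.pg_proc where proname like '%rocks%';</span></span></b><br />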
<br />
This is it, and now you are ready to start using PgRocks.<br />
<br />
<br />
<b>Step 5 - Creating a test suite for the plugin and running examples</b><br />
<br />
In this Step you will see how the plugin works. <br />
<br />
You will use only three of the new PostgreSQL/PgRocks functions: one for writing data to RocksDB, one for reading from it, and one for closing the open connection.<br />
<br />
-- First, create a test database as user <b>postgres</b>.<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">su - postgres <br />/usr/local/pgsql/bin/createdb test</span></span></b><br />
<br />
-- Second, create a table named <span style="font-family: "courier new" , "courier" , monospace;">players</span> there and populate it with test data:<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">/usr/local/pgsql/bin/psql test</span></span><br /><br /><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">create table players(name text, aka text, dob date, weight float, height int, last_seen timestamp); </span></span></b><br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">CREATE TABLE</span></span><b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"><br /><br />insert into players values ('Peter Stevens','Boss', '1956-06-30', 85.4,169,'2017-10-25 17:30');<br />insert into players values ('Mike Palmer','Hippy', '1988-12-06', 75.5,184,'2018-02-28 14:20');<br />insert into players values ('Dorothy Brown','Miss D', '1992-07-12', 64.3,172,'2018-03-15 09:25');<br />insert into players values ('Linda Chambers','Jamaica', '1987-03-10', 57.7,180,'2018-04-01 14:00');<br />insert into players values ('Claude Moulin','French', '1944-10-17', 68.1,170,'2017-09-15 12:30');</span></span></b><br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"><b>select * from players;</b> </span></span><br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">name | aka | dob | weight | height | last_seen <br />----------------+---------+------------+--------+--------+---------------------<br /> Peter Stevens | Boss | 1956-06-30 | 85.4 | 169 | 2017-10-25 17:30:00<br /> Mike Palmer | Hippy | 1988-12-06 | 75.5 | 184 | 2018-02-28 14:20:00<br /> Dorothy Brown | Miss D | 1992-07-12 | 64.3 | 172 | 2018-03-15 09:25:00<br /> Linda Chambers | Jamaica | 1987-03-10 | 57.7 | 180 | 2018-04-01 14:00:00<br /> Claude Moulin | French | 1944-10-17 | 68.1 | 170 | 2017-09-15 12:30:00<br /><br />(5 rows)<b><br /> </b></span></span><br />
<br />
-- Third, create another table named <span style="font-family: "courier new" , "courier" , monospace;">venues</span> and also populate it with test data:<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">create table venues (address text, capacity int);<br />insert into venues values ('Baker St. 221b', 330);<br />insert into venues values ('Downing St. 10', 50);<br />insert into venues values ('Marylebone Rd', 1000);</span></span></b><br />
<br />
-- Fourth, create an empty "DNA" table <span style="font-family: "courier new" , "courier" , monospace;">v1_dna_1</span> (v1 stands for "version one") for RocksDB "store #1":<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"><b>create table v1_dna_1 (tbl text, rev int, key bigint);</b></span></span><br />
<br />
-- Fifth, pack the data from the two test tables into a new RocksDB store located at <span style="font-family: "courier new" , "courier" , monospace;">/tmp/rocksdb_1</span> on your file system - i.e. into store #1.<br />
<br />
For this you will use the new internal PostgreSQL/PgRocks function <span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: small;"><b>row_to_json_rocks</b></span></span>, which creates a new store #1 if there is none, and writes all rows from the given PG table into this store as JSON data.<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">insert into v1_dna_1 (tbl, rev, key) select 'players', 1, row_to_json_rocks(1,players) from players;<br />insert into v1_dna_1 (tbl, rev, key) select 'venues', 1, row_to_json_rocks(1,venues) from venues;<br />select rocks_close();</span></span></b><br />
<br />
Make sure to close the RocksDB connection after each batch of write operations by calling the function <b><span style="font-family: "courier new" , "courier" , monospace;">rocks_close</span></b>.<br />
<br />
N.B. You have inserted "1" into <span style="font-family: "courier new" , "courier" , monospace;">rev</span> column of DNA-table. By doing this you assume that all the rows are of revision one. You won't need the data revisions yet.<br />
<br />
The function <span style="font-family: "courier new" , "courier" , monospace;"><b>row_to_json_rocks</b></span> just used takes two arguments, (1) the number of the store and (2) the name of the table.<br />
<br />
It returns a key of type <span style="font-family: "courier new" , "courier" , monospace;">bigint</span> whose first digit is the number of the node (1 by default) and whose remaining digits are the <i>epoch time</i> in microseconds, measured at the moment the record is packed into the store.<br />
<br />
These keys are guaranteed to be unique and as such they are assigned to the values in the RocksDB store.<br />
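<br />
For illustration, such a key can be decoded with plain SQL (a sketch based on the layout just described, using the first key from the listing below - integer division by 10^16 yields the node digit, and the remainder is the microsecond epoch time):<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">select 11522687676247617 / 10000000000000000 as node,<br />to_timestamp((11522687676247617 % 10000000000000000) / 1000000.0) as packed_at;</span></span></b><br />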
<br />
You will keep these keys in the column <span style="font-family: "courier new" , "courier" , monospace;">key</span> of the DNA table:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"><b>select * from v1_dna_1;<br /><br /> </b> tbl | rev | key <br />---------+-----+-------------------<br /> players | 1 | 11522687676247617<br /> players | 1 | 11522687676247896<br /> players | 1 | 11522687676247929<br /> players | 1 | 11522687676247951<br /> players | 1 | 11522687676247972<br /> venues | 1 | 11522687676275189<br /> venues | 1 | 11522687676275254<br /> venues | 1 | 11522687676275269<br /> </span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">(8 rows)</span></span><br />
<br />
-- Sixth, drop the tables <span style="font-family: "courier new" , "courier" , monospace;">players</span> and <span style="font-family: "courier new" , "courier" , monospace;">venues</span> - since you keep all the data in RocksDB, you don't need the original tables any more.<br />
<br />
<br />
<b><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">drop table players;<br />drop table venues;</span></span></b><br />
<br />
<br />
-- Seventh, take a look at how our data is stored in RocksDB. <br />
<br />
You will need the <span style="font-family: "courier new" , "courier" , monospace;">bigint</span> keys printed as <span style="font-family: "courier new" , "courier" , monospace;">hexadecimal</span> (16 digits, since a bigint is 8 bytes) for searching with <span style="font-family: "courier new" , "courier" , monospace;">ldb</span>:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"><b>select '0x' || LPAD(to_hex(key), 16, '0') as hex_key from v1_dna_1;</b> </span></span><br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">hex_key <br />--------------------<br /> 0x0028efd2c9ccee41<br /> 0x0028efd2c9ccef58<br /> 0x0028efd2c9ccef79<br /> 0x0028efd2c9ccef8f<br /> 0x0028efd2c9ccefa4<br /> 0x0028efd2c9cd59f5<br /> 0x0028efd2c9cd5a36<br /> 0x0028efd2c9cd5a45<br />(8 rows)</span></span><br />
<br />
Quit the psql shell and look at our data packed as JSON in our RocksDB store:<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"><br /><b>ldb --db=/tmp/rocksdb_1 scan --key_hex</b><span style="font-size: xx-small;"> </span></span></span><br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">0x0028EFD2C9CCEE41 : {"name":"Peter Stevens","aka":"Boss","dob":"1956-06-30","weight":85.4,"height":169,"last_seen":"2017-10-25T17:30:00"}<br />0x0028EFD2C9CCEF58 : {"name":"Mike Palmer","aka":"Hippy","dob":"1988-12-06","weight":75.5,"height":184,"last_seen":"2018-02-28T14:20:00"}<br />0x0028EFD2C9CCEF79 : {"name":"Dorothy Brown","aka":"Miss D","dob":"1992-07-12","weight":64.3,"height":172,"last_seen":"2018-03-15T09:25:00"}<br />0x0028EFD2C9CCEF8F : {"name":"Linda Chambers","aka":"Jamaica","dob":"1987-03-10","weight":57.7,"height":180,"last_seen":"2018-04-01T14:00:00"}<br />0x0028EFD2C9CCEFA4 : {"name":"Claude Moulin","aka":"French","dob":"1944-10-17","weight":68.1,"height":170,"last_seen":"2017-09-15T12:30:00"}<br />0x0028EFD2C9CD59F5 : {"address":"Baker St. 221b","capacity":330}<br />0x0028EFD2C9CD5A36 : {"address":"Downing St. 10","capacity":50}<br />0x0028EFD2C9CD5A45 : {"address":"Marylebone Rd","capacity":1000}</span></span><br />
<br />
<br />
You may now search the data in this store by the key:<br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">ldb --db=/tmp/rocksdb_1 get 0x0028EFD2C9CCEF8F --key_hex</span></span></b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"> </span></span><br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">{"name":"Linda Chambers","aka":"Jamaica","dob":"1987-03-10","weight":57.7,"height":180,"last_seen":"2018-04-01T14:00:00"}</span></span><br />
<br />
-- Eighth, get the data back into the table <span style="font-family: "courier new" , "courier" , monospace;">players</span> dropped earlier - suppose you need it again.<br />
<br />
You will use the counterpart function <span style="font-family: "courier new" , "courier" , monospace;"><b>rocks_json_to_record</b></span>, which also takes two arguments - (1) the number of the store and (2) the key - and returns the stored record:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"><b>create table players as select d.* from v1_dna_1, rocks_json_to_record(1,v1_dna_1.key) d(name text, aka text, dob date, weight float, height int, last_seen timestamp) where v1_dna_1.tbl = 'players';</b><br /> </span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">SELECT 5<b><br /><br />select * from players;</b> </span></span><br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">name | aka | dob | weight | height | last_seen <br />----------------+---------+------------+--------+--------+---------------------<br /> Peter Stevens | Boss | 1956-06-30 | 85.4 | 169 | 2017-10-25 17:30:00<br /> Mike Palmer | Hippy | 1988-12-06 | 75.5 | 184 | 2018-02-28 14:20:00<br /> Dorothy Brown | Miss D | 1992-07-12 | 64.3 | 172 | 2018-03-15 09:25:00<br /> Linda Chambers | Jamaica | 1987-03-10 | 57.7 | 180 | 2018-04-01 14:00:00<br /> Claude Moulin | French | 1944-10-17 | 68.1 | 170 | 2017-09-15 12:30:00<br /> </span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">(5 rows)</span></span><br />
<br />
-- Ninth, create a temporary table in PostgreSQL from the RocksDB store, do your calculations, and let it drop on commit, for example:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">CREATE OR REPLACE FUNCTION sql_main_sum_field(int,text,text)<br />RETURNS float AS $$<br />DECLARE <br /> result float := 0;<br />BEGI<span style="font-family: "courier new" , "courier" , monospace;">N</span> <br /> EXECUTE format('create temp table temp_table on commit drop as select d.* from v1_dna_<span style="font-family: "courier new" , "courier" , monospace;">%s</span>, rocks_json_to_record(<span style="font-family: "courier new" , "courier" , monospace;">%<span style="font-family: "courier new" , "courier" , monospace;">s</span></span>,v1_dna_<span style="font-family: "courier new" , "courier" , monospace;">%s</span>.key) d(%s text) where v1_dna_<span style="font-family: "courier new" , "courier" , monospace;">%<span style="font-family: "courier new" , "courier" , monospace;">s</span></span>.tbl = ''%s'' ',$1,$1,$1,$<span style="font-family: "courier new" , "courier" , monospace;">3</span>,$1,$<span style="font-family: "courier new" , "courier" , monospace;">2</span>);<br /> EXECUTE format('select sum(%s::float) from temp_table',$<span style="font-family: "courier new" , "courier" , monospace;">3</span>) into result;<br /> return result;<br /><br />END;$$ LANGUAGE plpgsql;</span></span><br />
<br />
<b><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">select sql_main_sum_field(1,'venues','capacity') as total;</span></span></b><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;">total <br />-------<br /> 1380<br />(1 row)</span></span><br />
<br />
<br />
<b>Conclusion</b><br />
<br />
In this tutorial, you've explored PgRocks, a plugin for flowing data between PostgreSQL and RocksDB.<br />
<br />
You've also seen how it is possible to enhance PostgreSQL with new internal functions for migrating its native-format data to external storage.<br />
<br />
You may further explore PgRocks and take a look at the live demo at <a href="http://pgrocks.com/">http://pgrocks.com</a>, which shows how PgRocks can be used for syncing PostgreSQL nodes over the network.</div>
Alexhttp://www.blogger.com/profile/12010600222763457685noreply@blogger.com0tag:blogger.com,1999:blog-702565810121406228.post-74474480769972483792011-09-23T10:00:00.000-07:002011-09-23T10:00:56.392-07:00<div dir="ltr" style="text-align: left;" trbidi="on">
I bought a new Amazon Kindle 3G and thought I could read some Turkish clips in order to learn the language. But there are so many verbal forms and cases for both nouns and adjectives - one could say, as many as in our Russian tongue - that I soon found it really hard. Then I searched the net for an inline Kindle Turkish-to-English dictionary. There are some "online" ones and some "Windows" ones, but I was unhappy not to find a good mobi-formatted one. So I tried to make one myself, and after some days I succeeded.<br />
<br />
First of all, I downloaded a Babylon setup and extracted the much-needed BLG file. Then I used UnpackBLG.exe under wine to extract a TXT file (option Terms - Outside). This TXT was processed by two Perl scripts: one (A) removing empty lines, tab-delimiting the entries, and substituting correct UTF symbols for the bad ones in the UnpackBLG.exe output; the other (B) splitting pipe-delimited verbal forms and cases into separate lines:<br />
<br />
<br />
A:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;">
$line1=~ s/\n/\t/g if ($line1 =~ /(.+)\n/);<br />
$line1=~ s/þ/Å/g;#s cedilla, shekure<br />
$line1=~ s/ý/ı/g;#i grave<br />
$line1=~ s/ü/ü/g;#u dotted<br />
$line1=~ s/ö/ö/g;#o dotted<br />
$line1=~ s/ð/Ä/g;#g capped<br />
$line1=~ s/â/â/g;#a capped<br />
$line1=~ s/ç/ç/g;#c cedilla, cocuk</div>
<div style="font-family: "Courier New",Courier,monospace;">
$line1=~ s/Þ/Å/g;#s cedilla, shekure<br />
$line1=~ s/Ý/Ä°/g;#i grave<br />
$line1=~ s/Ü/Ã/g;#u dotted<br />
$line1=~ s/Ö/Ã/g;#o dotted<br />
$line1=~ s/Ð/Ä/g;#g capped<br />
$line1=~ s/Â/â/g;#a capped<br />
$line1=~ s/Ç/Ã/g;#c cedilla, cocuk</div>
<div style="font-family: "Courier New",Courier,monospace;">
$line1=~ s/é/é/g;#French e</div>
<br />
B:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;">
if ($line1 =~ /(.+) \| (.+)/) {<br />
my ($pipes, $meaning) = split(' ', $line1);<br />
my @list = split(' \| ',$pipes);<br />
<br />
my $counter = 0;<br />
foreach my $line (@list) {<br />
my $str = '';<br />
if(!$counter) {<br />
$str = $line.' '.$meaning."\n";<br />
} else {<br />
$str = $line.' '.$list[0]."\n";<br />
}<br />
$counter++;<br />
print OUTF $str unless ($str =~ /\//);<br />
}<br />
} else {<br />
<br />
print OUTF $line1;<br />
}</div>
<br />
I manually removed a couple of trash lines infused by <b>UnpackBLG</b> - those that <b>tab2opf.py</b> couldn't swallow:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">python tab2opf.py -utf turk.txt 2>&1>err</span><br />
<br />
At last, I got an .opf file and 138 XML files of about 2 MB each to feed to mobigen:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">wine mobigen/mobigen.exe turk.opf -unicode</span> <br />
<br />
On my 2-core desktop it took an hour, the HDD rattling while I feared for it, but finally it worked, thank God.<br />
<br />
Here is the 40MB zipped file - <a href="http://82.146.44.218/turk.zip">http://82.146.44.218/turk.zip </a><br />
<br />
<br />
Put it directly into the Kindle docs directory via USB (no conversion required). Then choose it as the primary dictionary for Turkish texts. You would usually need to push the button twice: first to reduce the form to the infinitive, then to get its meaning - the handiest way, I think.<br />
<br />
By the way, do you know how to clip web sites to your Kindle? Look <a href="http://www.klip.me/">where</a> I clip Turkish newspapers for reading on my Kindle.<br />
<br /></div>
Alexhttp://www.blogger.com/profile/12010600222763457685noreply@blogger.com0tag:blogger.com,1999:blog-702565810121406228.post-71226897920204285202011-04-24T09:54:00.000-07:002011-04-24T09:54:10.225-07:00A semaphore for a pair of servers<div dir="ltr" style="text-align: left;" trbidi="on">Suppose an Internet project runs on a pair of independent servers at different providers, with different domain names; on one of them (the slow one - <b>slow</b>) an .htaccess in the root redirects to the second (the fast one - <b>fast</b>). [This is exactly the case, by the way, with our tennis club, where <b>fast</b> is <a href="http://tennismatchmachine.com/">tennismatchmachine.com</a> and <b>slow</b> is <a href="http://motivation.ru/">motivation.ru</a>.] Every three hours the slow server pulls a database snapshot from the fast one and loads it locally. How does the <b>slow</b> server <i>reliably</i> learn that the <b>fast</b> server is down and cancel the redirect? And how does it later restore the redirect? Let's solve the problem:<br />
<br />
<b>A.</b> On three hosts, including <b>fast</b>, we put an unpretentious little script <i>reply_self</i>:<br />
<br />
<div style="color: #3d85c6;"><span style="color: #0b5394; font-family: "Courier New",Courier,monospace; font-size: small;">#!/usr/bin/perl<br />
<br />
use CGI;<br />
$query = new CGI;<br />
print $query->header;<br />
my $me=`/bin/hostname`;<br />
chop($me); # strip the trailing newline from the hostname<br />
my $addr=$ENV{'REMOTE_ADDR'};<br />
print $me."!";<br />
print $addr."\n";</span> <span style="font-size: small;"><br />
</span></div><span style="font-size: small;"><br />
</span><br />
<span style="font-size: small;">B. А на <b>slow</b> каждую минутe запускаем нечто вроде :</span><br />
<br />
<span style="color: #0b5394; font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">#!/usr/bin/perl</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $tainted = 0;</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $tested_host = 'tennismatchmachine.com';</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">$test_host_url = 'http://'.$tested_host.'/scripts/reply_self';</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $first_link_ip = 'onnnemoreee.ru';</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $first_link_test_url = 'http://'.$first_link_ip.'/protected/reply_self';</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $second_link_ip = 'anotttheer.ru';</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $second_link_test_url = 'http://'.$second_link_ip.'/scripts/reply_self';</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $user = 'wget';</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $passwd = 'dumbasspassword';</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">local $SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">&Connect(0);</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">unless ($tainted) {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> </span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> &Connect(1);</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> unless ($tainted) {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> &Connect(2);</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> unless ($tainted) {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> sleep(30);</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> &Connect(0);</span><br 
style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> }</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> } else {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> sleep(30);</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> &Connect(0); </span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> }</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">}</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">if (!(-f './tmm_flag_down') && $tainted && $tainted ne '3') {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("echo > ./tmm_flag_down");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("date >> ./tmm_changed");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("echo 'DOWN' >> ./tmm_changed");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("echo '==============' >> ./tmm_changed");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("./if_mail_tmm_changed.sh");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("/bin/rm -f ./public_html/.htaccess");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("cd public_html;/bin/ln -s ./.htaccess.home ./.htaccess;cd ..;");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">} elsif (-f './tmm_flag_down' && $tainted eq '3') {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("/bin/rm -f ./tmm_flag_down");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("date >> ./tmm_changed");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("echo 'UP' >> ./tmm_changed");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("echo '==============' >> ./tmm_changed");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("./if_mail_tmm_changed.sh");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("/bin/rm -f ./public_html/.htaccess");</span><br 
style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system("cd public_html;/bin/ln -s ./.htaccess.tmm ./.htaccess;cd ..;");</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">}</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">sub Connect {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $p = shift;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $test_url;</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">if ($p eq '0') {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> $test_url = $test_host_url;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">} elsif ($p eq '1') {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> $test_url = $first_link_test_url;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">} elsif ($p eq '2') {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> $test_url = $second_link_test_url;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">}</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> </span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $wget_cmd = 'wget --timeout=5 --http-user '.$user.' --http-passwd '.$passwd.' -O - '.$test_url;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">eval {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> </span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">alarm 5;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">system($wget_cmd.' 2>/dev/null 1>/dev/null');</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">open (IN, '-|', $wget_cmd.' 
2>&1')</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> or die ("Can't wget");</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">my $r = '';</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> while (!eof(IN)) {</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> my $q = readline (*IN);</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> $q =~ /(\S+)!(\d+\.\d+\.\d+\.\d+)/;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> </span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> if ($1 && $r ne $1) {</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> $r = $1;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> $tainted = ($1 =~ /^tennismatchmachine/) ? 3 : 4;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> last if ($tainted eq '3');</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> </span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> }</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;"> }</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">close (IN);</span><br style="font-family: "Courier New",Courier,monospace;" /><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">alarm 0;</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">};</span><br style="font-family: "Courier New",Courier,monospace;" /><span style="font-family: "Courier New",Courier,monospace;">} #end Connect</span></span><br />
Voila!</div>Alexhttp://www.blogger.com/profile/12010600222763457685noreply@blogger.com0tag:blogger.com,1999:blog-702565810121406228.post-8921521365783809532011-03-22T03:53:00.000-07:002011-03-22T03:53:06.459-07:00Building a Server Pool without Cisco<div dir="ltr" style="text-align: left;" trbidi="on">A simple and original scheme that I implemented for my clients.<br />
<br />
<i>A. The setup: </i><br />
<br />
There is a local network of 25 workstations ("clients" in the diagram) that use a Linux server (<b>unix1</b>) as a gateway to the Internet.<br />
<br />
This box runs:<br />
<ul style="text-align: left;"><li>a) <i>squid</i> (a proxy server) - for browsing the Internet; </li>
<li>b) <i>Cyrus IMAP</i> - for storing corporate mail; it receives mail directly from the local network (LAN) via <i>sendmail</i>, and <i>fetchmail</i> periodically delivers mail to it from the provider's POP3. </li>
<li>c) <i>sendmail</i> - for sending mail from the LAN to the provider's relay server, or to its own <i>Cyrus</i> (see above); </li>
<li>d) the caching DNS server <i>named</i> - needed by both <i>squid</i> and <i>sendmail</i>; </li>
<li>e) the HTTP server <i>apache</i> - so that files can be downloaded from our web node from outside; </li>
<li>f) a <i>Samba</i> server, so that files can be put into the <i>apache</i> folders from the LAN.</li>
</ul><br />
Server <b>unix1</b>, and through it the whole LAN, connects to the Internet via two providers:<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNUKfI8VnmMSTJXUJOhKKLV7xBwPx6nQyiaG-KeqQSC8BcMzhzr1L6S2BzmNrgyCw1AR9_xtX0uc6jt3j1PY-vJOJePGvp4tE6YmWWc0tBjt2mEAariz2zJ5o6pFlMlhibNLIQrcGRtcg/s1600/unix_channels_tp_link.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><br />
</a></div><br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtthowqADrVcFTVpm0pVGuS4sPgzZg_58-T1C4RyVP9mSdQizaXEbLVvdpt0wNLhjLNpCchrLMcauWxaBUpzIPtVXaGb2sQObwTW76lZwms4MkeXNUgWO39aBtGJQkQMaRQVRgtB0w9ns/s1600/unix_channels_tp_link2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="178" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtthowqADrVcFTVpm0pVGuS4sPgzZg_58-T1C4RyVP9mSdQizaXEbLVvdpt0wNLhjLNpCchrLMcauWxaBUpzIPtVXaGb2sQObwTW76lZwms4MkeXNUgWO39aBtGJQkQMaRQVRgtB0w9ns/s320/unix_channels_tp_link2.jpg" width="320" /></a></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNUKfI8VnmMSTJXUJOhKKLV7xBwPx6nQyiaG-KeqQSC8BcMzhzr1L6S2BzmNrgyCw1AR9_xtX0uc6jt3j1PY-vJOJePGvp4tE6YmWWc0tBjt2mEAariz2zJ5o6pFlMlhibNLIQrcGRtcg/s1600/unix_channels_tp_link.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><br />
Traffic is split across the channels with <i>iptables/netfilter</i> so that the key services go over the more reliable but expensive cable channel, while <i>squid</i> traffic and bulk external downloads go over the wireless channel, which has no traffic cap but is less stable and slower. For this, <b>unix1</b> uses two network interfaces: one (eth0) connects directly to the cable modem, and the second (eth1) connects to the Wimax device through the shared LAN switch.<br />
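<br />
A minimal sketch of how such a split can be wired up with a firewall mark and a second routing table; the squid system user, the wireless gateway address and the table number below are assumptions for illustration, not the rules from the production box:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;"><span style="font-size: small;">#!/usr/bin/perl<br />
# Sketch: send squid's outgoing traffic over the wireless uplink (eth1),<br />
# leaving everything else on the default (cable) route.<br />
# The uid, gateway and table number are assumptions for illustration.<br />
use strict;<br />
use warnings;<br />
<br />
my @cmds = (<br />
&nbsp;&nbsp;&nbsp;&nbsp;# mark packets generated locally by the squid user<br />
&nbsp;&nbsp;&nbsp;&nbsp;'iptables -t mangle -A OUTPUT -m owner --uid-owner squid -j MARK --set-mark 2',<br />
&nbsp;&nbsp;&nbsp;&nbsp;# route marked packets through table 2<br />
&nbsp;&nbsp;&nbsp;&nbsp;'ip rule add fwmark 2 table 2',<br />
&nbsp;&nbsp;&nbsp;&nbsp;# table 2 defaults to the wireless gateway on eth1<br />
&nbsp;&nbsp;&nbsp;&nbsp;'ip route add default via 192.168.1.1 dev eth1 table 2',<br />
);<br />
for my $c (@cmds) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;system($c) == 0 or die "Failed: $c\n";<br />
}<br />
</span></div>
In real life one would also SNAT the marked traffic to eth1's address and relax rp_filter; that is left out here for brevity.<br />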
<br />
And so this whole setup has been running successfully for several years; occasionally, though, trouble strikes: a hard drive dies, or a provider quietly goes down for an indefinite time. To minimize the consequences there is protection in place: a full backup of the hard drive with the ability to boot and run from a second, spare disk, plus a set of scripts that watch both channels around the clock and switch traffic depending on provider availability.<br />
<br />
However, there is as yet no protection against a complete failure of the <b>unix1</b> server itself!<br />
<br />
This is where the very words <i>Server Pool</i> come to mind: not to spread huge but uniform traffic across a cluster, but to duplicate the functions of the main server and, should it break, restore all working processes immediately.<br />
<br />
<i>B. Server Pool</i><br />
<br />
So, we buy the necessary hardware and assemble a backup server <b>unix2</b> in a 1U case, then install on it exactly the same software that runs on the main server. It pays to think right away about which addresses to assign to the network interfaces: obviously, the IPs should be picked from the same WAN and LAN segments as on the main server. To make this possible, a simple hub is added to the wiring diagram to mix both servers <b>unix1</b> and <b>unix2</b> at the input of the cable modem.<br />
<br />
One could, of course, stop right here. Indeed, with two <i>identical</i> servers running (but with different IP addresses), if the main one breaks there is always a way to move all its functions onto the spare within some amount of time. For example, write a script that changes the IP addresses on <b>unix2</b> to those of the broken <b>unix1</b>, then reboot, and everything should seemingly work again. There is a catch, though: the switch keeps a table of MAC address mappings in memory, so a real takeover also requires flushing that table on the switch somehow. If the switch can do this remotely, for example via SNMP, wonderful. Otherwise, after changing the IPs you would also have to reboot the switch. You could additionally try changing the MAC addresses on <b>unix2</b>, but such a scheme still has plenty of drawbacks; just consider synchronizing the servers before and after the switchover! That is a nontrivial task, if you think it through.<br />
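<br />
For completeness, a minimal sketch of such a takeover, assuming iputils <i>arping</i> is installed and that unix1's old address was 192.168.4.201 on eth0 (both assumptions for illustration). Sending unsolicited (gratuitous) ARP updates the neighbours' ARP caches and lets the switch relearn the port, which can sometimes spare the switch reboot:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;"><span style="font-size: small;">#!/usr/bin/perl<br />
# Sketch: take over the failed server's IP on unix2 and announce it<br />
# with gratuitous ARP so the modem and the switch refresh their caches.<br />
# The address and interface are assumptions for illustration.<br />
use strict;<br />
use warnings;<br />
<br />
my $ip&nbsp; = '192.168.4.201';&nbsp;&nbsp; # unix1's old address (assumed)<br />
my $dev = 'eth0';<br />
<br />
system("ip addr add $ip/24 dev $dev") == 0 or die "Can't add $ip\n";<br />
# -U: unsolicited (gratuitous) ARP, updates the neighbours' caches<br />
system("arping -U -I $dev -c 3 $ip");<br />
</span></div>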
<br />
Of course, it would be nice to connect both servers through some device that would monitor them and switch traffic from the main server to the spare, the way it is done with the Health Monitoring Probe on expensive Cisco Catalyst models. Then we would not have to change the addresses on the spare server at all. But if there is no such device and we do not want to change addresses, then obviously, after replacing the server, we would have to rewrite the browser, Outlook and similar settings on every workstation in the LAN, which is certainly possible too, but highly undesirable.<br />
<br />
So then? Off to the store for a Cisco..? $3000??<br />
<br />
No. For our case $200 is enough: we buy a small load balancer (LB) with two WAN ports. It does not matter whether it can switch the channels by itself; it is enough that this can be done through its administrator web interface. We also add a third network card (eth2, the 10.0.0.0 segment) to each server if it is not there already; better still, pick motherboards with three NICs from the start.<br />
<br />
We point the LB's WAN ports at these extra server interfaces: WAN1 -> <b>unix1</b>, WAN2 -> <b>unix2</b>. We connect one of the LB's eight LAN ports to the LAN switch and assign the LB an address from our LAN segment.<br />
<br />
Now we rewrite the settings on the LAN clients: the LB address becomes the gateway, while the proxy and mail server address is set to a fictitious IP from the 172.16.0.0 segment. This address is unreachable on the servers themselves, but <i>netfilter</i> will catch the packets sent to it and redirect them to its own ports:<br />
<br />
<span style="font-size: x-small;">$IPTABLES -A POSTROUTING -o eth1 -p tcp -m tcp -d 172.16.0.1 -j SNAT --to-source 192.168.4.201 -t nat<br />
$IPTABLES -A PREROUTING -s 10.0.0.202 -i eth2 -p tcp -m tcp --dport 3128 -d 172.16.0.1 -j REDIRECT --to-ports 3128 -t nat<br />
$IPTABLES -A PREROUTING -s 10.0.0.202 -i eth2 -p tcp -m tcp --dport 993 -d 172.16.0.1 -j REDIRECT --to-ports 993 -t nat<br />
$IPTABLES -A PREROUTING -s 10.0.0.202 -i eth2 -p tcp -m tcp --dport 25 -d 172.16.0.1 -j REDIRECT --to-ports 25 -t nat</span><br />
<br />
All that remains is to run scripts on <b>unix2</b> that watch <b>unix1</b> through both interfaces eth0 and eth1 (thereby performing the health monitoring probe themselves) and, on detecting that the main server is down, switch the LB traffic from WAN1 to WAN2 with a wget script.<br />
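<br />
A minimal sketch of such a watcher, in the same wget-script spirit; the probe URLs, the LB admin URL, its credentials and the form field are assumptions for illustration, since every LB model exposes its own web interface:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;"><span style="font-size: small;">#!/usr/bin/perl<br />
# Sketch: probe unix1 through both channels; if both probes fail,<br />
# tell the LB through its web admin interface to switch WAN1 -> WAN2.<br />
# URLs, credentials and form fields are assumptions for illustration.<br />
use strict;<br />
use warnings;<br />
<br />
sub alive {<br />
&nbsp;&nbsp;&nbsp;&nbsp;my ($url) = @_;<br />
&nbsp;&nbsp;&nbsp;&nbsp;return system("wget -q -O /dev/null --timeout=10 --tries=1 $url") == 0;<br />
}<br />
<br />
while (1) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;my $cable&nbsp;&nbsp;&nbsp; = alive('http://192.168.4.201/');&nbsp;&nbsp; # unix1, cable side<br />
&nbsp;&nbsp;&nbsp;&nbsp;my $wireless = alive('http://192.168.5.201/');&nbsp;&nbsp; # unix1, wireless side<br />
&nbsp;&nbsp;&nbsp;&nbsp;if (!$cable && !$wireless) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# hypothetical admin URL of the LB; real devices differ<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;system('wget -q -O /dev/null --post-data="wan=2" ' .<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;'http://admin:secret@192.168.4.254/switch.cgi');<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;last;<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;sleep 30;<br />
}<br />
</span></div>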
<br />
<br />
<br />
</div>Alexhttp://www.blogger.com/profile/12010600222763457685noreply@blogger.com0tag:blogger.com,1999:blog-702565810121406228.post-970180354296508752011-03-02T23:16:00.000-08:002011-03-02T23:16:46.785-08:00Filezilla and Co. - A survey of managed FTP software for Linux. D. Motivation File Browse<div dir="ltr" style="text-align: left;" trbidi="on"><span style="font-size: large;"><b>4. Motivation File Browse</b></span><br />
<br />
GPL license<br />
February 2011 <br />
<br />
The program is a trimmed-down variant of the GPL release of my perl program tmm (http://sourceforge.net/projects/tclub). It works through a javascript interface; admins and clients see different windows of the same program, similar but with different sets of functions. For example, an admin can create a client login and set or change a password.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO9GdscsbyxkMkAX-sIV5yo3QQV3GRmIyO2XeNw7_5BCKtTso9j9K0cCsbh_qIE3ifX5ZpM1uj9UtS-DQP8wS4Y5rfVLdTJTOSsGBgHl01YSm-c-H3X55SzayIYKLUp12V_ndi834qhHc/s1600/Screenshot-11.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO9GdscsbyxkMkAX-sIV5yo3QQV3GRmIyO2XeNw7_5BCKtTso9j9K0cCsbh_qIE3ifX5ZpM1uj9UtS-DQP8wS4Y5rfVLdTJTOSsGBgHl01YSm-c-H3X55SzayIYKLUp12V_ndi834qhHc/s320/Screenshot-11.png" width="320" /> </a></div><div class="separator" style="clear: both; text-align: center;"><br />
</div>In this original program web users have <i>no home directories at all</i>; instead there are download "tickets", which are stored in a database. So, having logged in, a client sees a table of his tickets. Each ticket lists its expiry date, the file name, a link that starts the download, and the comments of the admin who created the ticket.<br />
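<br />
A minimal sketch of what such a ticket lookup might look like with DBI on top of PostgreSQL; the table and column names here are assumptions for illustration, not the actual tmm schema:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;"><span style="font-size: small;">#!/usr/bin/perl<br />
# Sketch: list a client's still-valid download tickets.<br />
# Table and column names are assumptions, not the real tmm schema.<br />
use strict;<br />
use warnings;<br />
use DBI;<br />
<br />
my $dbh = DBI->connect('dbi:Pg:dbname=tmm', 'tmm', 'secret',<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ RaiseError => 1 });<br />
<br />
my $sth = $dbh->prepare(q{<br />
&nbsp;&nbsp;&nbsp;&nbsp;SELECT filename, expires, comment<br />
&nbsp;&nbsp;&nbsp;&nbsp;FROM&nbsp;&nbsp; tickets<br />
&nbsp;&nbsp;&nbsp;&nbsp;WHERE&nbsp; login = ? AND expires > now()<br />
&nbsp;&nbsp;&nbsp;&nbsp;ORDER&nbsp; BY expires<br />
});<br />
$sth->execute($ARGV[0] || 'client1');<br />
<br />
while (my ($file, $expires, $comment) = $sth->fetchrow_array) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;print "$file\tvalid until $expires\t$comment\n";<br />
}<br />
$dbh->disconnect;<br />
</span></div>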
<br />
File control is done through Samba; the program itself offers no extra filesystem functions, except perhaps the ability to delete temporary files automatically after they have been downloaded successfully.<br />
<br />
The HTTP workhorse is the popular GPL package perlball. User data is stored and processed in a PostgreSQL database; the javascript interface is built on the Mootools libraries.<br />
<br />
Below is a report on head-to-head "battle" tests of Motivation File Browse against GoAnywhere Services:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;"><span style="font-size: small;"><b>GoAnywhere Services vs. Motivation File Browse</b><br />
<br />
1. Test conditions<br />
<br />
A test file (37 MB) was downloaded; at a speed of 100 KByte/sec the download takes ~6 minutes.<br />
<br />
During the download I staged the following "disruptions":<br />
<br />
A. Brought the download interface down and back up (ifconfig);<br />
B. Restarted the network (network restart);<br />
C. Toggled a block on packets to the download port (iptables).<br />
<br />
==============<br />
1. GA Services has two HTTPS interfaces:<br />
</span></div><div style="font-family: "Courier New",Courier,monospace;"><span style="font-size: small;">a) Enhanced<br />
b) Basic<br />
<br />
Work in the Enhanced interface starts with launching a Java applet. The applet starts up fine in my Firefox 3.6 browser with the Java update installed, but only on the first run.<br />
</span></div><div style="font-family: "Courier New",Courier,monospace;"><span style="font-size: small;">Attempts to switch the interface from Enhanced to Basic and back to Enhanced in the browser after the applet has started almost always end with the Enhanced applet window failing to reappear; a browser restart is required.<br />
<br />
Moreover, in the Enhanced interface a download in progress cannot be paused; the applet simply has no such function. You can only cancel the download entirely by reloading the applet.<br />
<br />
<br />
NB: Once a download has started, refreshing the browser page (i.e. restarting the applet) cancels the download!<br />
<br />
Resilience to interference is zero. In other words, any of the "disruptions" listed above kills the download and requires restarting the applet.<br />
<br />
The Basic interface does not require the Java applet and relies on the Firefox Download Manager; however, both the disruptions above and a mere attempt to pause the download through the Download Manager kill the transfer. It looks like the server does not support Range requests.<br />
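<br />
(An easy way to check that hypothesis is to ask for the first bytes of the file with a Range header; a server with resume support answers 206 Partial Content, otherwise plain 200. A minimal sketch with LWP; the URL is a placeholder:)<br />
<br />
#!/usr/bin/perl<br />
# Sketch: does the server honour Range requests (needed for resume)?<br />
# The URL is a placeholder for the real download link.<br />
use strict;<br />
use warnings;<br />
use LWP::UserAgent;<br />
<br />
my $url = shift || 'https://server.example/download/file.bin';<br />
my $ua&nbsp; = LWP::UserAgent->new(timeout => 30);<br />
<br />
my $resp = $ua->get($url, 'Range' => 'bytes=0-99');<br />
<br />
if ($resp->code == 206) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;print "206 Partial Content: resume is supported\n";<br />
} else {<br />
&nbsp;&nbsp;&nbsp;&nbsp;printf "%s: the server ignored the Range header\n", $resp->status_line;<br />
}<br />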
<br />
=============<br />
My program:<br />
<br />
The download uses the Firefox Download Manager. A download can be paused and resumed in the Download Manager without any problems, in 100% of cases. Moreover, only one scenario (taking the download channel down and bringing it back up), and only in a single episode at that, caused the download to be cancelled; in all other staged "disruptions" the download successfully resumed from the point where the interference had stopped it.<br />
</span></div><div style="font-family: "Courier New",Courier,monospace;"><span style="font-size: small;"><br />
</span></div><div style="font-family: "Courier New",Courier,monospace;"><span style="font-size: small;"><br />
</span></div><div style="font-family: "Courier New",Courier,monospace;"><br />
</div><div style="font-family: "Courier New",Courier,monospace;"><br />
</div><br />
</div>Alexhttp://www.blogger.com/profile/12010600222763457685noreply@blogger.com1tag:blogger.com,1999:blog-702565810121406228.post-21025445422325222802011-02-24T02:31:00.000-08:002011-02-24T02:31:19.649-08:00Filezilla and Co. - A survey of managed FTP software for Linux. C. GoAnywhere Services<div dir="ltr" style="text-align: left;" trbidi="on"><span style="font-size: large;"><b>3. GoAnywhere Services</b></span><br />
<br />
<a href="http://www.goanywheremft.com/products/services">http://www.goanywheremft.com/products/services</a><br />
version 2.0.2 (February 2011)<br />
<br />
<b>Price:</b> licenses from $1495<br />
<b>OS:</b> Linux, Windows, MacOS, AIX, HP-UX, Solaris<br />
<br />
This package from Linoma Software (USA) is offered as a standalone module in an integrated corporate suite that also includes GoAnywhere Director and GoAnywhere Gateway. It is a Java-based interface to the FTP/SFTP/HTTP/HTTPS servers that ship with the package. A free 30-day license is available. <br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQ4srt3AjXHJBOzw7KInCCrTFZ-6bbqKmnTUIcL6cRTBTfBm5HOSx20GmY9J3f2v2D5FsrlFw2Q1e60UqJzR7aaiRw2aLbDLNMnaEfs1JXBbOhhEQwqXeS6WJeFlH79QTH9MJAxqOqvMI/s1600/ga_start_page.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQ4srt3AjXHJBOzw7KInCCrTFZ-6bbqKmnTUIcL6cRTBTfBm5HOSx20GmY9J3f2v2D5FsrlFw2Q1e60UqJzR7aaiRw2aLbDLNMnaEfs1JXBbOhhEQwqXeS6WJeFlH79QTH9MJAxqOqvMI/s320/ga_start_page.png" width="320" /></a></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_fn4PrjTj6HV9cxpOWvPIa4B2lWTWlllpLujjlq2RagRGKwplMZ7Vfb9ZKPBe0xocXuEPVfJNQEKwaoDoBUp5ZBAfKeccBdm_7B60LumdjcaeaI0zA7E9W9ZIxaj6o9PDeTJD0h2KsK4/s1600/ga1_ftp_log.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><br />
</a></div><br />
<b>Window</b><br />
<br />
<br />
The administrator web interface is built as a menu of Java applets running on Apache Tomcat. The program has the pleasant look that only Java gives. Tomcat ships with the package, so there is no need to install it separately. Easy installation and a well-thought-out initial configuration show this software at its best.<br />
<br />
<b>Users</b><br />
<br />
Separate registries are kept for users with administrator rights and for web users. The whole standard range of access-rights configuration is provided, and editing the settings of web users is done at a high level. The File Manager, the mechanism for delivering downloadable files into web users' folders, is implemented very nicely: users with administrator rights place files into web users' folders through the Upload function.<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_fn4PrjTj6HV9cxpOWvPIa4B2lWTWlllpLujjlq2RagRGKwplMZ7Vfb9ZKPBe0xocXuEPVfJNQEKwaoDoBUp5ZBAfKeccBdm_7B60LumdjcaeaI0zA7E9W9ZIxaj6o9PDeTJD0h2KsK4/s1600/ga1_ftp_log.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_fn4PrjTj6HV9cxpOWvPIa4B2lWTWlllpLujjlq2RagRGKwplMZ7Vfb9ZKPBe0xocXuEPVfJNQEKwaoDoBUp5ZBAfKeccBdm_7B60LumdjcaeaI0zA7E9W9ZIxaj6o9PDeTJD0h2KsK4/s320/ga1_ftp_log.png" width="320" /></a></div><br />
<br />
<b>Downloads</b><br />
<br />
Download logs are kept continuously, and it is possible to go back to the logs of all previous sessions and to filter them by user name, user IP, file name, and every protocol event: login, file transfer and so on. Each row of the log table expands to show all available details of that journal entry. <br />
<br />
<i>Summary: an excellent program that makes the most of the advantages of Linux and Unix-like systems.</i><br />
<br />
<br />
<br />
</div>Alexhttp://www.blogger.com/profile/12010600222763457685noreply@blogger.com0tag:blogger.com,1999:blog-702565810121406228.post-74238051928300264732011-02-24T00:37:00.000-08:002011-02-24T00:37:43.049-08:00Filezilla and Co. - A survey of managed FTP software for Linux. B. WingFTP<div dir="ltr" style="text-align: left;" trbidi="on"><span style="font-size: large;"><b>2. Wing FTP Server</b></span><br />
<br />
<a href="http://www.wftpserver.com/">http://www.wftpserver.com</a><br />
version 3.7.5 (23/Jan/2011)<br />
<br />
<br />
<b>Price:</b> from $1600 per Linux site with an unlimited number of clients<br />
<b>OS:</b> Windows, Linux, MacOS, Solaris<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizIT8ifgJiv-qFzh1vwOkQZL_BOIZi9HPyCi-cvLFv8ishWcF5Vu1xiTd1H-jvaucJ0vWw5MC5g3QPsTEiLdNkdjN1K9x6bbWsZQcBzdf0_vvPxy2AiJ-XGSjoCtkOA8R0gQsr7KiIXCw/s1600/WingFTP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizIT8ifgJiv-qFzh1vwOkQZL_BOIZi9HPyCi-cvLFv8ishWcF5Vu1xiTd1H-jvaucJ0vWw5MC5g3QPsTEiLdNkdjN1K9x6bbWsZQcBzdf0_vvPxy2AiJ-XGSjoCtkOA8R0gQsr7KiIXCw/s320/WingFTP.png" width="320" /></a></div><b>Окно</b><br />
<br />
So: a free 30-day license to start with. The administrative interface is opened in a browser (http://127.0.0.1:5466). It supports 13(!) languages, though sadly no Russian yet. The web interface is intuitive: expanding menus on the left and, on the right, a window where you can, for example, watch the download logs. The smooth, exceptionally polished web interface of the admin program even includes a lua console for those who like typing script commands in that language.<br />
<br />
<b>Users</b><br />
<br />
Control over FTP clients is simple yet comprehensive: names, passwords, home directories, permissions are naturally all there and work as they should. An elaborate structure is offered in the settings for IP and file masks and for service access times. For convenience, <i>domains</i> are implemented as a structural element of resource access. <br />
<br />
<br />
<b>Downloads</b><br />
<br />
Download logs can only be viewed within the current server session; for example, I found no way here to see who downloaded what yesterday. You can, of course, hunt down the needed log manually. Some projects, however, specifically call for scalable log search with filtering by file name, user name and the like, and all of that is convenient to do right in the admin's working window, when the feature is provided.<br />
<br />
<i>Summary: a very advanced program for controlling the users of the FTP and HTTP resources you provide; the configuration is undoubtedly far more complex than Filezilla's, which, however, may to some extent "scare people off".</i> <br />
<br />
<br />
<br />
<br />
</div>Alexhttp://www.blogger.com/profile/12010600222763457685noreply@blogger.com1tag:blogger.com,1999:blog-702565810121406228.post-39248336447487963682011-02-23T22:35:00.000-08:002011-02-23T22:35:32.834-08:00Filezilla and Co. - A survey of managed FTP software for Linux. A. Filezilla<div dir="ltr" style="text-align: left;" trbidi="on"><span style="font-size: large;"><b>1. Filezilla Server</b></span><br />
<br />
<b>Price:</b> $0 (GPL license, ships as open source, compiles with the Windows development toolchain)<br />
<b>OS:</b> Windows (Linux under Wine)<br />
<br />
A very easy package to install under Linux. It launches with:<br />
wine /home/alex/FileZilla_Server-0_9_37.exe and then proceeds without a stop through all the prompts to the desired result:<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIaLVWhlJ89wWqe1UmJbusigYqDmlLGWZKf3bj2AuQOnkoSiBZtYwolt1cDT1UJOzRsSDiXSMxrYGoXg-ZmEyS0LXWxbdEMLig5-Vc-sED8AK6w3_gYc_-CMvbsFSqLhT5pzYuin6PlFA/s1600/Filezilla.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIaLVWhlJ89wWqe1UmJbusigYqDmlLGWZKf3bj2AuQOnkoSiBZtYwolt1cDT1UJOzRsSDiXSMxrYGoXg-ZmEyS0LXWxbdEMLig5-Vc-sED8AK6w3_gYc_-CMvbsFSqLhT5pzYuin6PlFA/s320/Filezilla.png" width="320" /></a></div><br />
<b>Window</b><br />
<br />
The server on port 21 and the administrative interface start separately as .exe files; the interface window shows the usual FTP log as well as the status of current downloads (in the lower frame of the window).<br />
<br />
<b>Users</b><br />
<br />
Creating an FTP user account and/or changing a password or group is very simple.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgW-7xlNfn7qt8ad5ouKUQbSg4VuFBccQ-h2K6aoFwm-zP4WtHOKWN2azvaLKRTNAladhS_Ytw0jaL2Cbsrjd_mHl6uZifPo93Z24jBe-aoakekZekdDueCClMX7nKXRbXraSvKQ-WE-9U/s1600/filezilla2_users.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgW-7xlNfn7qt8ad5ouKUQbSg4VuFBccQ-h2K6aoFwm-zP4WtHOKWN2azvaLKRTNAladhS_Ytw0jaL2Cbsrjd_mHl6uZifPo93Z24jBe-aoakekZekdDueCClMX7nKXRbXraSvKQ-WE-9U/s320/filezilla2_users.png" width="320" /></a></div><br />
<b>Downloads</b><br />
<br />
Unfortunately, there is no way to set time limits or other attributes on the files offered for download. Moreover, log viewing in the administrative window is quite limited and there is no filtering.<br />
<br />
<i>Summary: a worthy program, suitable in most cases for light and medium FTP projects that do not need any special download-control features.</i><br />
<br />
<br />
</div>Alexhttp://www.blogger.com/profile/12010600222763457685noreply@blogger.com0