PostgreSQL Synchronization Tool

View: 891    Dowload: 0   Comment: 0   Post by: hanhga   Category: PostgreSQL   Fields: Education - Schools

Download  postgresql-synchronization.zip (35.72 KB)

You need to Sign In to download the file postgresql-synchronization.zip
If you do not have an account then Sign up for free here


As I'm using rsync on a regular basis, I thought why isn't there a tool for databases that simply does nothing more than synchronizing two databases on demand - no sophisticated triggers or additional changes of the database. psync is a simple application that synchronizes two PostgreSQL databases. The concept can be adapted or even encapsulated so that other databases can also be used.

The idea of psync is to use two database connections and update the tables and their content like this:

  1. fetch all tables of the source and destination databases
  2. for each table in the source, do:
    1. create table if not exist on destination
    2. sync contents of table
  3. for each table in destination, do:
    1. delete table if not exists on the source

Although the algorithm looks simple, there are many problems to overcome.

  1. Duplicating tables
  2. Handling of special columns, e.g., oid for large objects
  3. Constraints on tables
  4. Special structures of extensions

Duplicating Tables

In order to duplicate a table, we need to find out which columns exist and which primary keys we have and so on. With this information, we can assemble a CREATE TABLE command that creates the table. In PostgreSQL, the information is stored in a table named information_schema.columns: column_name, is_nullable, and data_type. Primary keys are a little bit more complicated as we need two tables information_schema.table_constraints and information_schema.key_column_usage. See PSync.PrimaryKeys for the whole SQL statement.

Large Object Identifiers

The current implementation of psync focuses on the oid data type for handling large objects. Large objects store binary data and are identified using an integer. Tables with an oid column store only references to a large object. Most replication solutions can't handle large objects.

Certainly, we can't just copy the identifiers from the source to the destination, we need to compare the data itself. Always copying the whole object is not a solution as they may be big and this will waste network bandwidth and time when they are equal. The solution used by psync is hash values (currently the MD5 algorithm is used). In order to keep the traffic low, these hashes are calculated by the database using a custom function without transferring any data. psync compares these values and copies the object on demand.

The PostgreSQL function for calculating an MD5 hash value is (original source):

as $$
 fd        integer;
 size      integer;
 hashval   text;
 INV_READ  constant integer := 262144; -- 0x40000 from libpq-fs.h
 SEEK_SET  constant integer := 0;
 SEEK_END  constant integer := 2;
 IF id is null THEN
 fd   := lo_open(id, INV_READ);
 size := lo_lseek(fd, 0, 2);
 PERFORM lo_lseek(fd, 0, 0);
 hashval := md5(loread(fd, size));
 PERFORM lo_close(fd);
 RETURN hashval;
language plpgsql stable strict;
comment on FUNCTION md5(id oid) is 'Calculates the md5 sum of a large object.';

The process gets even more complex as there maybe more than one oid column in a table. The following code segment shows the process of constructing the SQL query for retrieving the hash values:

bool first;
string _select = "select";
string _from = " from";
string _where = " where";

// Compose the select block
first = true;
foreach(int oid in oids) {
  if (!first)
    _select += ",";
  // Get the md5 hash value for the oid-column
  _select += string.Format(" md5({0})", result.ColumnName(oid));
  first = false;

// Data from the table we are currently working on
_from += table;

// The row identified by the primary keys of the table
first = true;
foreach(string key in primaryKeys) {
  if (!first)
    _where += " and";
  _where += string.Format(" {0}='{1}'", key, result[row, result.ColumnIndex(key)]);
  first = false;
command = _select + _from + _where;

After building the SQL command, the results are compared:

Result r1, r2;
try {
  r1 = dst.Exec(command);
  if (r1.Rows > 0) {
    r2 = src.Exec(command);

    // Compare large objects
    for(int c=0; c<r1.Columns; c++) {
      if (r1[0, c] != r2[0, c]) {
        // md5 hashes are different
        int id = CopyOid(result.GetInt(row, oids[c]));
        command = string.Format("update {0} set {1}='{2}'{3}", 
                                table, result.ColumnName(oids[c]), id, _where);

  else {
    // Row missing on destination
    command = string.Format("insert into {0} values(", table);
    first = true;
    for(int col=0; col<result.Columns; col++) {
      if (!first)
        command += ",";
      if (oids.Contains(col)) {
        int id = CopyOid(result.GetInt(row, oid));
        command += string.Format(" '{0}'", id);
        command += string.Format(" '{0}'", result[row, col]);
      first = false;
    command += ")";
} catch(PostgreSQLException e) {
  // Large object id exist on destination but does not reference an object
  foreach(int oid in oids) {
    int id = CopyOid(result.GetInt(row, oid));
    command = string.Format("update {0} set {1}='{2}'{3}", 
                            table, result.ColumnName(oid), id, _where);

Using the Code

The code is implemented in two parts. The npq namespace implements a low-level wrapper around libpq (class PG) together with a high-level interface for easier use.

Example use of the library:

PostgreSQL db = new PostgreSQL("hostaddr='' port='5432' requiressl='1' " + 
                               "user='XXX' password='XXX' dbname='XXX'");
Console.WriteLine("Server {0} Protocol {1}", db.Version, db.Protocol);

Result result = db.Exec("select * from some_table");
Console.WriteLine("Result {0}", result.Valid);
Console.Write("{0,5} ", "Nr");
for(int f=0; f<result.Columns; f++)
  Console.Write("| {0,15}:{1,8}", result.ColumnName(f), result.ColumnType(f));
for(int n=0; n<result.Rows; n++) {
  Console.Write("{0,5} ", n);
  for(int f=0; f<result.Columns; f++)
    Console.Write("| {0,24}", result[n, f]);
// You'll need to explicitly call Dispose to free associated ressources

Handling of large objects is done like this:

LargeObject lo = new LargeObject(db);
byte[] tmp = new byte[1024];
int s = 0;
FileStream fs = new FileStream(result[n, 0]+".jpg", FileMode.Create);
while ((s = lo.Read(tmp, 1024)) > 0)
  fs.Write(tmp, 0, s);

The code was developed under Linux using Mono and MonoDevelop. It should aslo work under Windows but was not tested. Linux users must add the following configuration (npq.dll.config) to correctly map the library.

  <dllmap dll="libpq.dll" target="libpq.so" os="!windows" />

PostgreSQL Synchronization Tool

This article introduces a method of synchronizing two PostgreSQL databases. Although, this seems to be an easy task, no product (slony, londiste, ...) really satisfied the needs within the maps.bremen.de project. Either they have special prerequisites that didn't apply for our problem, or they didn't support synchronizing of large objects. Large objects are used to store tiles of a street/aerial map within PostgreSQL. My GIS-server queries the database and gets the tiles out. By using this construction, we are getting a flexible infrastructure for updating and maintaining different versions of the maps. Everything was working fine until the service needs to be spread over three servers. How can we easily synchronize the databases? I really found no working solution that was clean and easy to use.

Posted on 03-04-2014 


To comment you must be logged in members.

Files with category

  • Accessing: MySQL Database using MySQL C API

    Accessing: MySQL Database using MySQL C API

    View: 2677    Download: 2   Comment: 0

    Category: PostgreSQL     Fields: Other

    This article was written for beginners, developers that do not know how to access MySql database using MySQL C API and would like to have a small introduction.

  • Full Database Abstraction Layer Generator

    Full Database Abstraction Layer Generator

    View: 2806    Download: 0   Comment: 0

    Category: PostgreSQL     Fields: Other

    This article presents yet another Data Abstraction Layer generator (popular topic). The DAL structure is directly inspired from the DotNetNuke, DAL document and the SQL generation is done using SQLpp. The generator takes care of pretty much...

  • Code .NET 2.0; Build Java; Run Linux

    Code .NET 2.0; Build Java; Run Linux

    View: 2286    Download: 0   Comment: 0

    Category: PostgreSQL     Fields: Other

    This article is in the Product Showcase section for our sponsors at CodeProject. These reviews are intended to provide you with information on products and services that we consider useful and of value to developers. This is a showcase review for...

  • Postgres Database Backup/Restore From C#

    Postgres Database Backup/Restore From C#

    View: 3327    Download: 3   Comment: 0

    Category: PostgreSQL     Fields: Other

    This article provides Postgres database backup/restore functionality. By using this application we can check whether a Postgres database is installed or not in a system. If a Postgres database is installed then the Postgres databases list is...

  • Installing Redmine on Windows in production

    Installing Redmine on Windows in production

    View: 7093    Download: 0   Comment: 0

    Category: PostgreSQL     Fields: Other

    This is a very simple step by step guide to install Redmine on Windows server in production. This guide is applicable to IIS 7+ based servers, which include Windows Server 2008, 2008 R2, Windows Vista and Windows 7. Latter two should not be used for...

  • A SQL Management Console for MSSQL 2000 & 2005, MySQL 5.0,...

    A SQL Management Console for MSSQL 2000 & 2005, MySQL 5.0,...

    View: 3612    Download: 2   Comment: 0

    Category: PostgreSQL     Fields: Other

    Since MSSQL is not holding the lion share of the market yet, integration/interaction with other RDBMS like PL/Oracle, MySQL, FireBird, DB2, is inevitable. With the class System.Data.Common in ADO .NET 2.0, different data providers can write the...

  • Creating an Interactive Map in ASP.NET 2.0 Using SharpMap

    Creating an Interactive Map in ASP.NET 2.0 Using SharpMap

    View: 4749    Download: 5   Comment: 0

    Category: PostgreSQL     Fields: Other

    Showing maps on the web is becoming more and more common. The most well-known ones are Google Maps and Microsoft's Virtual Earth. Creating your own websites with interactive maps require you to get hold on the map-data and convert this data to...

  • DevShot - Database Snapshots for Developers

    DevShot - Database Snapshots for Developers

    View: 1161    Download: 0   Comment: 0

    Category: PostgreSQL     Fields: Other

    A few weeks back, I got an idea of writing a simple tool for switching database states between coding cycles. I thought this would be a good tool for anyone who is working on database related projects and have to switch between database states since...

File suggestion for you
File top downloads
Codetitle - library source code to share, download the file to the community
Copyright © 2018. All rights reserved. codetitle Develope by Vinagon .Ltd