Setting up a Postgres test cluster in Vagrant


Most of the time it's fine to test things out on your local machine and its local installation of Postgres, but often you want more flexibility: the ability to quickly reset to a known starting point, and the ability to try out different and more complex server architectures.

That is where Vagrant comes in. I use it to quickly set up Postgres clusters at different versions and different configurations.

You can find all of these in my repo at

For example, here is a setup that lets you build a 3-node cluster with 1 leader and 2 followers, all on Postgres 9.4 and using replication slots:

If you want to try it out, all you need to do (once you have Vagrant installed) is:
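The commands look something like this (the directory name is a placeholder for whichever cluster config you cloned from the repo):

```shell
# from the directory containing the Vagrantfile for the chosen cluster config
vagrant up          # builds and provisions all of the nodes
vagrant status      # confirm they are all running
```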

Then if you want to reset the cluster, you just do a ‘vagrant destroy’ followed by a ‘vagrant up’ again and you have a clean starting point.

You could also try out the same config on Postgres 10 (beta2), which is the latest version right now, using the config here:


But what if the config you want isn’t there and you want to set up your own? Fortunately, it’s really easy to do.

Let’s take a look at the most basic configuration of one master and one read-only slave, using the new Postgres 10 beta release:

The first part is the Vagrantfile. That tells Vagrant which nodes to configure, what networking they need, and the name of the script that will set up the software once the VM is up and running.

This is the only important part (that you need to know about):

That sets up 2 servers, one called master and one called slave. If you wanted to add another node, all you would need to do is add something like the slave config section again, e.g.

and put it before the final end. You just need to change the IP address, the hostname, and the “scripts/” script if you want to use a different script to set it up.
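As a sketch, an extra node section might look like this (the node name, IP and script path here are placeholders, assuming the usual multi-machine Vagrantfile structure):

```ruby
  # hypothetical third node; change the IP, hostname and script to suit
  config.vm.define "slave2" do |slave2|
    slave2.vm.hostname = "slave2"
    slave2.vm.network "private_network", ip: "192.168.50.13"
    slave2.vm.provision "shell", path: "scripts/provision_slave.sh"
  end
```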


In the master dir, you have the postgresql.conf and the pg_hba.conf files for the master (the slave doesn’t need them because it will get them from the master when we start it up as a slave). These just allow replication connections in pg_hba.conf and set a couple of parameters that differ from the defaults in postgresql.conf:

Most of the defaults in Postgres are pretty good, so for this basic setup we can leave everything else as is.
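The exact files live in the repo; a minimal sketch of the kind of settings involved (the values and network range here are illustrative, not copied from the repo):

```
# postgresql.conf (excerpt)
listen_addresses = '*'
wal_level = replica
max_wal_senders = 5
max_replication_slots = 5
hot_standby = on

# pg_hba.conf (excerpt) - allow replication connections from the private network
host  replication  postgres  192.168.50.0/24  trust
```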


So the only other part of the whole configuration is the provisioning scripts. There is one for the master and one for the slave node, each run once its VM has been set up. The order is: master VM, then the master provisioning script; slave VM, then the slave provisioning script.

The two scripts are very similar. They first download the yum repos and install the necessary software:
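A sketch of that step (the exact repo RPM URL and package names depend on the OS and Postgres version, so treat these as illustrative):

```shell
# install the PGDG yum repo, then the Postgres 10 packages
sudo yum install -y https://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/pgdg-centos10-10-2.noarch.rpm
sudo yum install -y postgresql10-server postgresql10-contrib
```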


Next it creates some directories and links, then initializes a Postgres cluster. Once that is done, Postgres is shut down so that the conf files can be copied over, and it’s started up again:
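Something along these lines (paths and service names assumed from a standard PGDG yum install, not copied from the repo):

```shell
# initialize the cluster, then swap in our config files
sudo /usr/pgsql-10/bin/postgresql-10-setup initdb
sudo systemctl start postgresql-10
sudo systemctl stop postgresql-10    # stop so the conf files can be replaced
sudo cp /vagrant/master/postgresql.conf /vagrant/master/pg_hba.conf /var/lib/pgsql/10/data/
sudo systemctl start postgresql-10
```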


The slave script is the same up to the point where it installs the software, but the actual initializing of the database is unnecessary, as it is all handled by the pg_basebackup command, which copies the master database over and sets up replication in one handy step!
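That call looks something like this (run as the postgres user; the master’s IP and the data directory are illustrative):

```shell
# clone the master and write a recovery.conf (-R) so this node starts as a standby
pg_basebackup -h 192.168.50.11 -U postgres -D /var/lib/pgsql/10/data -X stream -R
```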


Finally there is some tidying up by creating a link to the data directory and also priming the recovery.conf file with a trigger file in case it needs to be easily promoted later.
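The end result is a recovery.conf along these lines (the master’s address and the trigger file path are illustrative):

```
standby_mode = 'on'
primary_conninfo = 'host=192.168.50.11 port=5432 user=postgres'
trigger_file = '/tmp/promote_me'   # touch this file to promote the standby
```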


That really is it.

Once vagrant up finishes, you can connect easily with ‘psql -U postgres -h’ to test it out and start playing.

Categories: PostgreSQL, Tutorial

Failover testing of a Postgres cluster


Testing failover to one of 2 slaves and reattaching to the new master.

Starting config

Post failover config



You will need Vagrant and Virtualbox to run this tutorial but that should be all.



Turn off the master.
Promote slave1 to master.
Attach slave2 as a follower to slave1.
Attach the old master as a follower to slave1 (without rebuilding).




Check the cluster state and replication:
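On the current leader, for example (connection details assumed from the Vagrant setup):

```shell
# who is streaming from this node?
psql -U postgres -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"
```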


Stop postgres on the master:
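Something like (service name assumed from a PGDG yum install):

```shell
# on the master
sudo systemctl stop postgresql-10
```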


Promote slave1
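Using pg_ctl, for instance (binary and data directory paths assumed from a PGDG yum install):

```shell
# on slave1: promote the standby to a writable primary
sudo -u postgres /usr/pgsql-10/bin/pg_ctl promote -D /var/lib/pgsql/10/data
```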


Connect to slave1:

From your local machine – check the status of slave1:
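A quick check (slave1’s IP is illustrative): pg_is_in_recovery() should now return false, since slave1 is a primary.

```shell
psql -U postgres -h 192.168.50.12 -c "SELECT pg_is_in_recovery();"
```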

Check replication slots:
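That is a query against the pg_replication_slots view:

```shell
psql -U postgres -c "SELECT slot_name, active FROM pg_replication_slots;"
```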

There are none there. Let’s create some for our other hosts so that we don’t lose any WAL files (you may not want to do this on a production system, as it will force the master to keep hold of WAL files until the new slaves can get them; it is necessary if we want to attach the old master straight back in, though):
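A sketch of the slot creation on slave1, the new leader (the slot names just need to match the primary_slot_name each follower will use):

```shell
psql -U postgres -c "SELECT pg_create_physical_replication_slot('slave2');"
psql -U postgres -c "SELECT pg_create_physical_replication_slot('master');"
```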

(The old master will still be called master so that we can see what has changed, but it won’t be the leader in the new config.)


OK, so now we have a new master and the cluster is back up and running. Now it’s time to plug the first follower (slave2) back in so that we have a standby for failover.

Connect to slave2 and stop postgres:


Edit the recovery.conf file to point to the new leader (slave1), and set recovery_target_timeline = ‘latest’.
Note that primary_slot_name = ‘slave2’ controls which replication slot this node will use.
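The resulting recovery.conf on slave2 looks something like this (slave1’s address is illustrative):

```
standby_mode = 'on'
primary_conninfo = 'host=192.168.50.12 port=5432 user=postgres'   # slave1, the new leader
primary_slot_name = 'slave2'
recovery_target_timeline = 'latest'
```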

Start postgres back up again:

Now check that the new follower is pointing at the leader and successfully getting data:

(on slave1 – the leader)

As you can see the ‘slave2’ replication slot is now active and connected.
You can further check this by creating something on slave1 and seeing it appear on slave2.


Now it’s time to put the old master back into use. We don’t want to use it as the master again, because we already have a leader and switching would mean more downtime, so we will put it in as another follower:

Doing that can be tricky because failing over could have put the cluster on a different timeline. This is easy to get around though by telling the old master to use the latest timeline.

Connect to the master node and create a recovery.conf file as above:

Start the instance back up and check that the replication is connected and receiving changes:

As you can see, everything is connected and changes are being passed on.

All that’s left now is a bit of tidying up. The ‘master’ node still has replication slots configured and these need to be removed.
It’s simple enough to do:
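On the old master, now a follower, its stale slots can be dropped (slot names assumed from the original config):

```shell
psql -U postgres -c "SELECT pg_drop_replication_slot('slave1');"
psql -U postgres -c "SELECT pg_drop_replication_slot('slave2');"
```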

What is Apache Cassandra?


Cassandra is a fast distributed database.
It has several defining features:

  • Built-in high availability – any node can handle read and write requests, and your data is replicated to x nodes, so regardless of which node (or even which data center) goes down, you will still have access to read and write your data.
  • Linear scalability – doubling the number of (identical) nodes should double the write performance. It’s basically as simple as that, as all nodes can handle all operations and there is no central control.
  • Predictable performance (i.e. doubling the number of identical nodes should double the write throughput).
  • No single point of failure – nodes can go down and come back up without the front-end application becoming aware of it.
  • Multiple data centers catered for, and taken advantage of, as standard.
  • Built to run on commodity hardware – you can run it on lots of $1,000 servers rather than one or two $100,000 servers.
  • Easy to manage operationally – the system is designed to need very little ops input.

Cassandra is not a drop in replacement for RDBMS – it has a different data model and requires thinking of your data store in a different way.

The basic topology of a Cassandra cluster is that of a hash ring. There are no masters, slaves, or dedicated replicas; each node is essentially equal and capable of performing any operation that another node can perform. There are no config servers either; again, that function is shared by all nodes.

The stored data is partitioned around the ring. Each node owns a piece of the ‘token’, which is a range of hash values. The primary key (or partition key) for a given table is hashed, and each possible hash value falls within some node’s piece of the token. This means that you can easily tell which node owns a given piece of data, by hashing the required value and comparing the result to the token ownership.
While any piece of data is owned by one node, the data is replicated to RF=n servers (where RF is the ‘replication factor’ set at keyspace creation), so multiple nodes will have the data and could answer queries on it.
All nodes hold data and answer queries.

The CAP tradeoff refers to it being impossible to be both consistent and highly available (during periods of network partition). If you insist on consistency then your service will be unable to respond if it can’t reach all nodes and if you insist on high availability then you must risk inconsistent data if you can’t reach a particular node.
Latency between data centers also makes consistency impractical. Cassandra chooses Availability and partition tolerance over consistency.

However – you can tune availability vs consistency in Cassandra to produce a fully consistent system, a fully HA-focused cluster, or, more commonly, a tradeoff that sits somewhere in the middle.


Data is replicated automatically and without user intervention. The only thing that you need to do is pick the number of servers that you would like your data replicated to. This is called the replication factor. Data is always replicated to each replica automatically. If a machine is down, missing data is replayed via a ‘hinted handoff’ when that machine becomes available.

Consistency level.

In Cassandra, consistency can be set on a per-query basis. That means that if one query requires data to be returned very fast, you might set your consistency level low (‘ONE’ being the lowest level), but if you need to make sure that a write is replicated and will be consistent then you could set that query to a high level of consistency (‘ALL’ being the highest). Common options are ‘ALL’, ‘ONE’ and ‘QUORUM’ (which is more than half the replicas). The consistency level determines how many nodes must respond with OK for a given query.

Read and write is faster with lower consistency levels.
Availability is negatively affected by raising consistency levels.
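In cqlsh, for instance, the consistency level is set before running a query (the keyspace and table names here are made up for illustration):

```shell
cqlsh -e "CONSISTENCY QUORUM;
          SELECT * FROM shop.orders WHERE order_id = 42;"
```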

Multi DC

Cassandra has strong support for multiple data centers. A DC is a logical separation in Cassandra which should map to physically separate machines, and which, along with ‘racks’ that group machines together, determines which nodes have data replicated to them from a particular owner.
A typical usage example would be to write to the local DC and replicate asynchronously to other DCs for high availability.
Replication factor is set per keyspace, per data center.
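For example, a keyspace with a different replication factor per data center might be created like this (the keyspace and DC names are hypothetical):

```shell
cqlsh -e "CREATE KEYSPACE shop
          WITH replication = {'class': 'NetworkTopologyStrategy',
                              'dc_eu': 3, 'dc_us': 2};"
```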

Categories: Cassandra

Relational Databases and Big Data workloads.


This intro to Cassandra is taken from the DataStax course. I don’t necessarily agree with everything – particularly their take on what a traditional RDBMS can and can’t do – but I have included their view here for completeness.

Cassandra is designed for ‘Big Data’ workloads. In order to understand the characteristics of Big Data, let’s first define ‘Small Data’:

This would typically be a volume of data that would fit on one machine, where an RDBMS is typically fine and able to handle the number of operations and the quantity of data. The system will support a number of concurrent users in the hundreds. It fully supports ACID.

When you want to scale such a system, you are going to do it vertically first – with a bigger host, more RAM or processors.

Can Relational databases support big data?


Categories: Big Data, Cassandra

Get all of the tables linked by a Foreign Key


Here is a handy little PostgreSQL query that will list all of the tables linked by a foreign key to the table YOUR_TABLE.
The output will include the keys on each side.
This can be very useful if you want to build up a map of your database through the relationships between the tables.

select confrelid::regclass, af.attname as fcol,
       conrelid::regclass, a.attname as col
from pg_attribute af,
     pg_attribute a,
     (select conrelid, confrelid, conkey[i] as conkey, confkey[i] as confkey
      from (select conrelid, confrelid, conkey, confkey,
                   generate_series(1, array_upper(conkey, 1)) as i
            from pg_constraint
            where contype = 'f') ss) ss2
where af.attnum = confkey
  and af.attrelid = confrelid
  and a.attnum = conkey
  and a.attrelid = conrelid
  and conrelid::regclass = 'YOUR_TABLE'::regclass;

Categories: PostgreSQL