What is apache Cassandra?

Cassandra is a fast distributed database.
It has several defining features:

  • Built in high availability. – Any node can handle read and write requests and your data is replicated to x nodes so regardless of which node (or even a data center) goes down, you will still have access to read and write your data.
  • Linear Scalability. – Doubling the number of (identical) nodes should double the write performance. Its basically as simple as that was all nodes can handle all operations and there is no central control.
  • Predictable performance. (i.e. doubling the number of identical nodes should double the write throughput)
  • no single point of failure. -nodes can go down and come back up without the front end application becoming aware of it.
  • Multiple Data Centres catered for and taken advantage of out as standard.
  • Built to run on commodity hardware – so you can run it on lots of $1000 servers rather than 1 or 2 $100000 servers.
  • Easy to manage operationally. – The system is designed to need very little ops input.

Continue reading “What is apache Cassandra?”

Relational Databases and Big Data workloads.

This intro to Cassandra is taken from the DataStax course. I don’t necessarily agree with everything – particularly their take on what a traditional RDBMS can and can’t do but I have included their view here for completeness.

Cassandra is designed for ‘Big Data’ workloads. Im order to understand the characteristics of Big Data, lets first define ‘Small Data’:

This would typically be a volume of storage that would fit on 1 machine and a RDBMS is typically fine and able to handle the number of operations and the quantity of data. The system will support a number of concurrent users in the hundreds. It fully supports ACID.

When you want to scale such a system, you are going to do it vertically first – with a bigger host, more RAM or processors.

Can Relational databases support big data?

Continue reading “Relational Databases and Big Data workloads.”

Get all of the tables linked by a Foreign Key

Here is a handy little PostgreSQL query that will list all of the tables linked by a foreign key to the table YOUR_TABLE.
The output will include the keys on each side.
This can be very useful if you want to build up a map of your database through the relationships between the tables.

select confrelid::regclass, af.attname as fcol, conrelid::regclass, a.attname as col
from pg_attribute af,
pg_attribute a,
(select conrelid,confrelid,conkey[i] as conkey,
confkey[i] as confkey from (select conrelid,confrelid,conkey,confkey, generate_series(1,array_upper(conkey,1)) as i from pg_constraint where contype = ‘f’) ss) ss2
where af.attnum = confkey and af.attrelid = confrelid and a.attnum = conkey and a.attrelid = conrelid AND (conrelid::regclass = ‘YOUR_TABLE’::regclass);

How to kill all sessions in Postgres so that they stay dead.

Postgres Connections

Have you ever tried to drop or rename a postgres database or even create another database from one as a template, only to be confronted by the dreaded message:

postgres=# create database test with template a_database;
ERROR: source database “a_database” is being accessed by other users
DETAIL: There are 40 other sessions using the database.

So, we kill those sessions off with something like the below SQL that will kill all sessions that are connected to that database:

SELECT pg_terminate_backend(pg_stat_activity.pid)
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()pg_stat_activity.datname = ‘TARGET_DB’
AND pg_stat_activity.datname = ‘A_DATABASE’;

Job done, now we can create our database… But wait!

postgres=# create database test with template a_database;
ERROR: source database “a_database” is being accessed by other users
DETAIL: There are 40 other sessions using the database.

Someone has beaten us too it and reconnected!

If you are like me, the only logical solution is to try to do the above statements really really fast. Unfortunately, I normally lose, so I need a way to stop sessions from reconnecting to the database that doesn’t involve changing config files etc.

Fortunately, the simple solution is to disallow all connections to that database and we can do it with this line:

update pg_database set datallowconn = false where datname = ‘a_database’;

Once you are finished, you just need to set it back to True and the database will be open for connections again.

How to make a Decision

How to Make a Decision
How to Make a Decision

I am slightly embarrassed to admit that I have often avoided making decisions throughout my life. That’s not to say that I haven’t made things happen but I have often avoided putting my foot down and choosing a path one way or the other, preferring instead to allow circumstances to make the decision myself. That’s not to say that I haven’t subconsciously pushed circumstances one way or another, but it’s not the same and does not give you the same benefits as being decisive and telling the world what you want.

The thing is that an argument could be made that the whole point of life is to make decisions (if we take any spiritual questions out of the equation). The biological definition of life is that ‘life is irritable’ meaning that the environment does something and life does something back. This is true for all life – a plant decides on some level to move towards the light and if it were to stop doing it, it would die. The same is true for us on a much more complex level. If we were to stop making decisions, we would not get up, eat or drink and we would soon die.

Continue reading “How to make a Decision”