What is Apache Cassandra?

Cassandra is a fast distributed database.
It has several defining features:

  • Built-in high availability. – Any node can handle read and write requests, and your data is replicated to x nodes, so whichever node (or even a whole data center) goes down, you can still read and write your data.
  • Linear scalability. – Doubling the number of (identical) nodes should double the write performance. It's basically as simple as that, because all nodes can handle all operations and there is no central control.
  • Predictable performance. (i.e. doubling the number of identical nodes should double the write throughput)
  • No single point of failure. – Nodes can go down and come back up without the front-end application becoming aware of it.
  • Multiple data centres are catered for, and taken advantage of, as standard.
  • Built to run on commodity hardware – so you can run it on lots of $1000 servers rather than 1 or 2 $100000 servers.
  • Easy to manage operationally. – The system is designed to need very little ops input.
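
To make the replication point above concrete, here is a minimal Python sketch of a toy ring of nodes. It is an illustration only, not Cassandra's actual partitioner or replication strategy: the class name ToyRing, the node names, the key "user:42" and the use of MD5 for tokens are all assumptions made for this example. It shows that with data replicated to x = 3 nodes, a key is still reachable after one of its replicas goes down.

```python
import hashlib
from bisect import bisect_right


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is used only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Toy model of a node ring; NOT Cassandra's real partitioner or replication strategy."""

    def __init__(self, nodes, replication_factor):
        if replication_factor > len(nodes):
            raise ValueError("replication factor cannot exceed the number of nodes")
        self.replication_factor = replication_factor
        # Sort nodes by ring position so replicas can be found by walking clockwise.
        self.ring = sorted((token(n), n) for n in nodes)
        self.down = set()  # names of nodes currently considered unavailable

    def replicas(self, key):
        """Return the nodes that hold `key`: the next nodes clockwise from its token."""
        tokens = [t for t, _ in self.ring]
        start = bisect_right(tokens, token(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.replication_factor)]

    def live_replicas(self, key):
        """Return the replicas of `key` that are still up."""
        return [n for n in self.replicas(key) if n not in self.down]


# Hypothetical four-node cluster with every key replicated to three nodes.
ring = ToyRing(["node1", "node2", "node3", "node4"], replication_factor=3)
owners = ring.replicas("user:42")
ring.down.add(owners[0])  # simulate losing one replica
print(ring.live_replicas("user:42"))  # two of the three replicas remain
```

Because every node can compute the same placement, any node can route a request to a live replica, which is the intuition behind "no central control" in the list above.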

Relational Databases and Big Data workloads.

This intro to Cassandra is taken from the DataStax course. I don’t necessarily agree with everything, particularly their take on what a traditional RDBMS can and can’t do, but I have included their view here for completeness.

Cassandra is designed for ‘Big Data’ workloads. In order to understand the characteristics of Big Data, let’s first define ‘Small Data’:

This would typically be a volume of data that fits on one machine, where an RDBMS is perfectly able to handle both the number of operations and the quantity of data. Such a system supports concurrent users numbering in the hundreds, and it fully supports ACID.

When you want to scale such a system, you are going to do it vertically first – with a bigger host, more RAM or processors.

Can relational databases support Big Data?

What is an Oracle Service?

Oracle services were a feature introduced in Oracle 10g. Their function is to simplify workload management by allowing you to group applications that share traits such as thresholds, priorities and attributes. The Oracle database is presented as a service and so you always have at least one service running. It is good practice to create …

How to modify Voting disks in ASM

The process for modifying a votedisk that is stored on ASM differs from the process when the votedisk is stored on some other medium. When your voting disks are stored on ASM you can move them off to non ASM storage with the replace command: $ crsctl replace votedisk [path_to_new_votedisk] (this can be done from …
