Saturday 29 August 2015

Backing Up and Restoring Data in Cassandra

 Cassandra backs up data by taking a snapshot of all on-disk data files (SSTable files) stored in the data directory. You can take a snapshot of all keyspaces, a single keyspace, or a single table while the system is online.

Using a parallel ssh tool (such as pssh), you can snapshot an entire cluster. This provides an eventually consistent backup. Although no one node is guaranteed to be consistent with its replica nodes at the time a snapshot is taken, a restored snapshot resumes consistency using Cassandra's built-in consistency mechanisms.
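As a rough sketch of the cluster-wide snapshot described above, the function below runs nodetool snapshot on every node through pssh. The function name snapshot_cluster and the hosts file are illustrative assumptions, not part of Cassandra. It also assumes pssh is installed (some distributions name it parallel-ssh) and that nodetool is on each node's PATH.

```shell
#!/usr/bin/env bash
# Sketch: snapshot one keyspace on every node listed in a hosts file.
# Assumptions: pssh is installed locally, the hosts file lists one node
# per line, and nodetool is on the PATH of each remote node.

snapshot_cluster() {
    local hosts_file="$1"   # hypothetical file, e.g. cluster_hosts.txt
    local keyspace="$2"     # keyspace to snapshot, e.g. mykeyspace

    # -h: file of target hosts; -i: print each node's output inline.
    pssh -h "$hosts_file" -i "nodetool snapshot $keyspace"
}

# Example call:
#   snapshot_cluster cluster_hosts.txt mykeyspace
```

Each node takes its own snapshot independently, which is why the result is only eventually consistent across the cluster.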
 
After a system-wide snapshot is performed, you can enable incremental backups on each node to back up data that has changed since the last snapshot. Each time an SSTable is flushed, a hard link to it is created in a /backups subdirectory of the data directory (provided JNA is enabled).

Taking a snapshot

Run the nodetool snapshot command, specifying the hostname, JMX port, and keyspace.
 
$ nodetool -h hostname -p jmx_port snapshot mykeyspace

 For example:

$ nodetool -h localhost -p 7199 snapshot test


The snapshot is created in the data_directory_location/keyspace_name/table_name/snapshots/snapshot_name directory. Each snapshot directory contains numerous .db files that hold the data as it was at the time of the snapshot.

For example:

Packaged installs:

/var/lib/cassandra/data/mykeyspace/mytable/snapshots/23939834298/mykeyspace.db

Tarball installs:

install_location/data/data/mykeyspace/mytable/snapshots/23939834298/mykeyspace.db
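To see which snapshots exist for a keyspace, you can walk the directory layout shown above. This is a minimal sketch: list_snapshots is a made-up helper name, and it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find both do).

```shell
#!/usr/bin/env bash
# Sketch: list every snapshot directory under one keyspace.
# Layout assumed: data_dir/keyspace/table/snapshots/snapshot_name

list_snapshots() {
    local data_dir="$1"   # e.g. /var/lib/cassandra/data
    local keyspace="$2"   # e.g. mykeyspace

    # Relative to the keyspace directory, a snapshot sits exactly three
    # levels down: table_name/snapshots/snapshot_name.
    find "$data_dir/$keyspace" -mindepth 3 -maxdepth 3 -type d \
        -path '*/snapshots/*' | sort
}

# Example call:
#   list_snapshots /var/lib/cassandra/data mykeyspace
```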

Deleting snapshot files


To delete all snapshots for a node, run the nodetool clearsnapshot command. For example:
$ nodetool -h localhost -p 7199 clearsnapshot
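If you only want to remove one snapshot rather than all of them, nodetool clearsnapshot accepts a -t option with the snapshot name, as far as I know. The wrapper below is only a sketch with a made-up function name, so check nodetool help clearsnapshot on your version before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: remove a single named snapshot from one keyspace on this node.
# Assumption: your nodetool version supports "clearsnapshot -t <name>".

clear_named_snapshot() {
    local snapshot_name="$1"   # e.g. 23939834298
    local keyspace="$2"        # e.g. mykeyspace

    # "--" separates nodetool options from the keyspace argument.
    nodetool -h localhost -p 7199 clearsnapshot -t "$snapshot_name" -- "$keyspace"
}

# Example call:
#   clear_named_snapshot 23939834298 mykeyspace
```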

Enabling incremental backups

Edit the cassandra.yaml configuration file on each node in the cluster and change the value of incremental_backups to true.
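If you have many nodes, editing the file by hand gets tedious. Here is a small sketch that flips the setting with sed. The function name is my own, and it assumes the file already contains an incremental_backups line. The file is read when the node starts, so restart the node afterwards.

```shell
#!/usr/bin/env bash
# Sketch: set incremental_backups to true in a cassandra.yaml file.
# Assumption: the file already has a line starting "incremental_backups:".

enable_incremental_backups() {
    local yaml="$1"   # path to cassandra.yaml on this node

    # -i.bak edits in place and keeps the original as cassandra.yaml.bak.
    sed -i.bak 's/^incremental_backups:.*/incremental_backups: true/' "$yaml"

    # Print the resulting line so the change can be checked by eye.
    grep '^incremental_backups:' "$yaml"
}

# Example call (path depends on your install):
#   enable_incremental_backups /etc/cassandra/cassandra.yaml
```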


Restoring from a snapshot

1. Shut down the node.

2. Clear all files in the commitlog directory:
    
 Packaged installs: /var/lib/cassandra/commitlog
 Tarball installs: install_location/data/commitlog


3. Delete all *.db files in the data_directory_location/keyspace_name/table_name directory, but DO NOT delete the /snapshots and /backups subdirectories. Here data_directory_location is /var/lib/cassandra/data for packaged installs and install_location/data/data for tarball installs.

4. Locate the most recent snapshot folder in this directory:
data_directory_location/keyspace_name/table_name/snapshots/snapshot_name

5. Copy its contents into this directory: data_directory_location/keyspace_name/table_name

6. If using incremental backups, copy all contents of this directory:
data_directory_location/keyspace_name/table_name/backups

7. Paste them into this directory: data_directory_location/keyspace_name/table_name

8. Restart the node.
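The file-copying part of the restore steps above can be sketched for a single table as follows. The function restore_table is a hypothetical helper, not a Cassandra tool. It assumes the node is already shut down and the commitlog directory already cleared, and it leaves the restart to you because those commands depend on how Cassandra was installed.

```shell
#!/usr/bin/env bash
# Sketch: restore one table from a snapshot plus any incremental backups.
# Run only while the node is stopped and after clearing the commitlog.

restore_table() {
    local table_dir="$1"       # data_directory_location/keyspace_name/table_name
    local snapshot_name="$2"   # directory name under table_dir/snapshots

    local snapshot_dir="$table_dir/snapshots/$snapshot_name"
    if [ ! -d "$snapshot_dir" ]; then
        echo "no such snapshot: $snapshot_dir" >&2
        return 1
    fi

    # Delete *.db files in the table directory itself; -maxdepth 1 keeps
    # find out of the snapshots and backups subdirectories.
    find "$table_dir" -maxdepth 1 -type f -name '*.db' -delete

    # Copy the snapshot contents into the table directory.
    cp -Rp "$snapshot_dir/." "$table_dir/"

    # If incremental backups exist, copy them in as well.
    if [ -d "$table_dir/backups" ]; then
        cp -Rp "$table_dir/backups/." "$table_dir/"
    fi
}

# Example call:
#   restore_table /var/lib/cassandra/data/mykeyspace/mytable 23939834298
```

Copying rather than moving leaves the snapshot and backup files in place, so the restore can be repeated if something goes wrong.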
