How To Backup/Restore ModelOp Center's Databases

Prerequisites

It is assumed you already have the MongoDB Command Line Database Tools and the PostgreSQL client tools (psql and pg_dump) installed.

Download MongoDB Command Line Database Tools

https://www.postgresql.org/download/

Additionally, if you are using AWS or Kubernetes, it is assumed you have the AWS CLI and kubectl installed, respectively.

Assumptions

It is assumed that you are comfortable working in a terminal and familiar with command-line tools. It is additionally assumed that the database in question is only small to medium in size and that the machine you are working from has a decent amount of free local storage (30+ GB). Otherwise, you will need access to the AWS CLI/S3, or another object storage interface that supports stdin/stdout streaming and can hold large amounts of data. Finally, it is assumed that you know, or have access to, all of the necessary credentials for the databases and for any other cloud services that must be accessed.

Overview

The process for backing up and restoring both databases is fairly simple in concept. Assuming the database isn't too large, you can dump it directly into a local dump file using that database's specific tools, and then restore another instance of the same database type from that dump. There are some variations of this process that will be discussed later, including uploading the dump file directly to S3 and restoring from S3; however, the overall procedure remains the same.

It is important to note that these commands can stream unencrypted data over the network. If that is a concern, there are a few ways to address it. If you are working exclusively within an AWS VPC, running the commands from an EC2 node keeps all of your data inside your private AWS network. If you are port forwarding a database instance from Kubernetes, the traffic is already secured by kubectl's encrypted tunnel. Alternatively, if SSL/TLS is configured for the database connection, simply set the environment variable PGSSLMODE=require for Postgres and add the command-line flag --ssl for Mongo. This allows you to transfer data securely across the network.
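For example (a sketch only; substitute your own connection parameters and credentials, and note that the authentication database is assumed to be admin):

PGSSLMODE=require pg_dump dbname -h hostname -p port -U username -W > dumpfile.sql

mongodump --ssl --host hostname --port port --username username --password password --authenticationDatabase admin --out ./dump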

PostgreSQL

This section discusses the various ways to dump and restore a Postgres DB instance. The tools used to back up a Postgres DB are the pg_dump and psql utilities. Additionally, the AWS CLI is used in the discussion of S3 uploads of, and restores from, the dump file.

Everything should be done as the postgres user, or another database user with superuser privileges.

Local Dump File

Here we will just be saving the dump file to local storage from a remote host.

This only works if your local device has enough storage space for the database dump file; otherwise, the later methods, including but not limited to S3 uploads, are necessary.

Dump a file from the remote database to your current working directory with (replacing all parameters with the necessary credentials):

pg_dump dbname -h hostname -p port -U username -W > dumpfile.sql

This will dump the single specified database into the file. If you would like to dump everything from a Postgres instance, including all databases and global objects such as roles, run this instead:

pg_dumpall -h hostname -p port -U username -W > dumpfile.sql

To restore the database, run:

psql dbname -h hostname -p port -U username -W < dumpfile.sql

Or, if you dumped with pg_dumpall, run:
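For example (a sketch; pg_dumpall output recreates the individual databases itself, so connect to an existing maintenance database such as postgres):

psql postgres -h hostname -p port -U username -W < dumpfile.sql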

Pipe dump file directly to new database

If the database you want to restore to is already up and running, you can simply run the command:
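For example (a sketch, assuming the source and target are on different hosts; since -W would prompt for a password on both sides of the pipe, it may be cleaner to use a ~/.pgpass file or the PGPASSWORD environment variable instead):

pg_dump dbname -h source_hostname -p port -U username | psql dbname -h target_hostname -p port -U username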

The database must still be created on the new instance before piping over the dump; additionally, this method does not appear to work with pg_dumpall.

Backup and restore to and from S3

To dump to an S3 bucket (using your local machine as an intermediary pipe), run:
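For example (a sketch; my-bucket is a placeholder for your S3 bucket name, and the AWS CLI must already be configured with credentials that can write to it):

pg_dump dbname -h hostname -p port -U username -W | aws s3 cp - s3://my-bucket/dumpfile.sql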

To restore a database from an S3 bucket, run:
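Again a sketch, using the same placeholder bucket name:

aws s3 cp s3://my-bucket/dumpfile.sql - | psql dbname -h hostname -p port -U username -W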

MongoDB

This section discusses the various ways to dump and restore a MongoDB instance. The tools used to back up and restore a MongoDB database are the mongodump and mongorestore utilities. As with Postgres, the AWS CLI is used in the discussion of S3 uploads of, and restores from, the Mongo dump file.

Local Dump File

Here we will just be saving the dump file to local storage from a remote host.

Dump the remote databases to your current working directory with (replacing all parameters with the necessary credentials):
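For example (a sketch; the authentication database is assumed to be admin, and the dump is written to a ./dump directory):

mongodump --host hostname --port port --username username --password password --authenticationDatabase admin --out ./dump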

To restore the database, run:
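For example (same assumptions as above):

mongorestore --host hostname --port port --username username --password password --authenticationDatabase admin ./dump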

Pipe dump file directly to new database

If the database you want to restore to is already up and running, you can simply run the command:
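For example (a sketch using mongodump's archive mode to stream the dump over stdout/stdin; the source and target connection details are placeholders):

mongodump --host source_hostname --port port --username username --password password --authenticationDatabase admin --archive | mongorestore --host target_hostname --port port --username username --password password --authenticationDatabase admin --archive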

Backup and restore to and from S3

To dump to an S3 bucket (using your local machine as an intermediary pipe), run:
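For example (a sketch; my-bucket is a placeholder, and --gzip simply keeps the archive smaller):

mongodump --host hostname --port port --username username --password password --authenticationDatabase admin --archive --gzip | aws s3 cp - s3://my-bucket/mongo-dump.archive.gz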

To restore a database from an S3 bucket, run:
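Again a sketch, using the same placeholders:

aws s3 cp s3://my-bucket/mongo-dump.archive.gz - | mongorestore --host hostname --port port --username username --password password --authenticationDatabase admin --archive --gzip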

Special Cases

Kubernetes

If your database is running in a Kubernetes cluster as a pod or deployment, you will have to port forward the instance in order to access it. This is done with a command of the form:
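For example (a sketch; the namespace, service name, and ports are placeholders, using PostgreSQL's default port 5432 here):

kubectl port-forward -n namespace service/database-service 5432:5432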

This will return output similar to:
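For the PostgreSQL example above, the output would look something like:

Forwarding from 127.0.0.1:5432 -> 5432
Forwarding from [::1]:5432 -> 5432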

Now you should be able to access the database locally using the forwarded local port and localhost as the host parameter. For further information, please visit Use Port Forwarding to Access Applications in a Cluster (https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/).

References and further reading