Migrating Mysql database to Postgresql with pgloader

In this demo, we show how to migrate a sample MySQL database to PostgreSQL in k8s using a pgloader job. The overall processes are:

  1. ‘helm install’ a MySQL db and a PostgreSQL db in a k8s cluster (GKE).
  2. Load a simple employee data set to MySQL db.
  3. Create a pgloader job reading MySQL employees db and moving the data to postgres db employees schema.

First, we use helm install MySQL and Postgresql from bitnami. The simple point we want to demo is running pgloader in K8S job, so we won’t discuss different helm charts here. Then we load a sample dataset, employees, to the MySQL database. There is another dataset (DVD rental) to try if you don’t have a MySQL dataset ready for migration.

To make it easy, we deploy MySQL and PostgreSQL in the “default” namespace.

After the helm chart deployed, we see the pods are running and services are all up. Take a note of the service names of both databases. We will need them when creating the K8S job.

Next, we download the employee dataset (here) and load it to the MySQL container. For this demo, I use kubectl port-forward to connect the MySQL service to my localhost. This way I can load the sample dataset on my local machine. To access the Mysql service, we need the default Mysql root password. The bitnami helm chart gives you a hint on how to get the password.

The content in the employee db looks like this.

We check the content in the postgres db and find nothing there. The pgloader has many options if there are data in the target db. For example, truncate or drop data before data migration.

Once the data is loaded to our source db, we can prepare the k8s job. The job will need the source and the target db hostnames, user names, and user passwords. We already have the hostnames (the services). We can use refer the passwords from secrets and put them in env.

The pgloader job template looks like this. Well, this is what this demo about. Just this single job yaml file to move your MySQL db to postgres! Inside the job yaml, I use K8S service FQDN(service.namespace.svc.cluster) to address the db hostnames. If the target and/or source db are/is outside of K8S, you will need external services to address hostnames.

In the job template, we use Dimitri’s (the author of pgloader) docker image and the command-line options. You can use the command file too. Store the command file in k8s configMap and mount the configMap as Volume in Job to simplify the command. By the way, you can see the credentials are from secrets and stored in “env”. This is not the best practice of storing credentials. The passwords may be exposed in logs. Next step, we just deploy the job and trace the log.

It takes a while to move the data. In the meantime, you can check the tables in postgres.

The following are the log output from GKE logs.

pgloader is a good tool to migrate MySQL. As we saw in the past, MySQL to PostgreSQL migration is difficult from many perspectives. For more details, go to Dimitri's GitHub and readthedoc for all the options.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Yuwei Sung

Yuwei Sung

A data nerd started from data center field engineer to cloud database reliability engineer.