Since travel to deliver content and build awareness is on hold for a while (COVID-19), I am missing the interactions I would have with people at random times during a conference: the exchange of ideas and the impromptu problem solving.
The hallway track has always been my favorite part of a conference: the networking, the exchange of ideas, the feedback about products. All of that is harder to replicate without the talks and social setting that a conference provides.
I decided to experiment and find new ways to interact with “my” audience, and that is where the idea of Office Hours came in. The idea is not originally mine, but I like that I can have a technical conversation with someone who has questions about databases (not just about Cloud SQL), and it also provides a window of opportunity for you to give me feedback on our products.
Examples of things we can talk about:
Best Practices for Migrations
Where to store your data
Should I put my Server on Kubernetes?
How do I migrate my data to the cloud?
The pilot started with 30-minute sessions and the availability to talk to up to 10 people weekly (5h/week). However, I think it is more productive to change to up to 3h/week, talking to 9 people in 20-minute slots across more time zones.
Write down your questions and add them to the booking tool.
If lengthy context is needed to understand your problem, please add that information in the booking tool. However, do not send me:
PII – Personally Identifiable Information
Your SQL dump
Your intellectual property
Database credentials
Have a defined scope of what you want to talk about; I can’t solve everything in 20 minutes.
Do not double-book; other people should also be able to take advantage of this.
This is not a guaranteed consultancy agreement; it is just people talking informally about tech problems and possible solutions. The information shared and explained is guidance: you are responsible for weighing your options, and if you act on any advice, the outcome is your responsibility.
This is a post based on recent tutorials I published, with the goal of discussing how to prepare your current MySQL instance to be configured as an External Primary Server with a Replica/Follower on Google Cloud Platform.
First, I want to talk about the jargon used here. I will be using primary to represent the external “master” server, and replica to represent the “slave” server. Personally, I prefer the terms leader/follower, but primary/replica currently seems to be more common in the industry. At some point the word slave will appear, but only because it is the keyword embedded in the server to represent a replica.
The steps given here are in the context of a VM running a one-click install of WordPress acquired through the Google Marketplace (formerly known as Launcher).
To prepare for replication, you need to configure your primary to meet some requirements:
server-id must be configured; binary logging must be enabled; GTID must be enabled and enforced. Tutorial.
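As a minimal sketch, assuming a standard MySQL 5.6/5.7 installation, those requirements map to settings like these in my.cnf (the file location and the values shown are illustrative placeholders, not taken from the tutorial):

```ini
# my.cnf on the primary (location varies by installation)
[mysqld]
server-id                = 1          # any unique, non-zero ID
log-bin                  = mysql-bin  # enables binary logging
gtid_mode                = ON         # enables GTID
enforce_gtid_consistency = ON         # enforces GTID-safe statements
```

After a restart, SELECT @@server_id, @@log_bin, @@gtid_mode, @@enforce_gtid_consistency; confirms the running values.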
A dump file must be generated using the mysqldump command with specific options (each one is described later in this post).
The steps above are also necessary if you are migrating from another cloud or on-prem.
Why split the application and database and use a service like Cloud SQL?
First, you will be able to use your application server for what it was mainly designed to do: serve requests for your WordPress application (and for the purposes of this post, it doesn’t much matter whether you are using nginx or Apache).
Databases are heavy; their deadly sin is gluttony. They tend to occupy as much memory as they can to make lookups fairly fast. Once you are faced with this reality, sharing resources with your application is not a good idea.
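If you want to see that appetite on your own server, one quick check (a standard MySQL system variable; the division just converts bytes to MB):

```sql
-- How much memory InnoDB is allowed to claim for caching data and indexes.
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;
```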
Next, you may say: I could use Kubernetes! Yes, you could, but just because you can do something doesn’t mean you should. Configuring stateful applications inside Kubernetes is a challenge, and the fact that pods can be killed at any moment may pose a threat to your data consistency if it happens mid-transaction. There are solutions on the market that run MySQL on top of Kubernetes, but that would be a totally different discussion.
You also don’t need to use Cloud SQL; you can set up your database replicas, or even the primary, on another VM (still a win compared with putting the database and application together), but in this scenario you are perpetually at risk of hitting the limits of your finite hardware capabilities.
Finally, Cloud SQL offers 99.95% availability and is curated by Google’s SRE team. That means you can focus your efforts on what really matters — developing your application — and not spend hours, or even days, setting up servers. Other persuasively convenient features include PITR (Point-in-Time Recovery) and High Availability in case a failover is necessary.
Setting up the replica on GCP
Accessing the SQL menu in your Google Cloud Console will give you a listing of your current Cloud SQL instances. From there, do the following:
Click on the Migrate Data button
Once you have familiarized yourself with the steps shown on the screen, click on Begin Migration
In the Data source details section, fill out the form as follows:
Name of data source: Any valid name for a Cloud SQL instance that will represent the primary server name
Public IP address of source: The IP address of the primary
Port number of source: The port number for the primary, usually 3306
MySQL replication username: The username associated with the replication permissions on the primary (the sketch after this form shows one way to create such a user)
MySQL replication password: The password for the replication username
Database version: Choose between MySQL 5.6 and MySQL 5.7. If you are not sure which version you are running, execute SELECT @@version; on your primary server and you will have the answer.
(Optional) Enable SSL/TLS certification: Upload or enter the Source CA Certificate
Click on Next
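In case a dedicated replication user does not exist yet, here is a minimal sketch for creating one on the primary (the username, host pattern, and password below are placeholders, not values from the tutorial):

```sql
-- Run on the primary; restrict the host pattern ('%') if you can.
CREATE USER 'replication_user'@'%' IDENTIFIED BY '[PASSWORD]';
GRANT REPLICATION SLAVE ON *.* TO 'replication_user'@'%';
FLUSH PRIVILEGES;
```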
The next section, Cloud SQL read replica creation, will allow you to choose:
Read replica instance ID: Any valid name for a Cloud SQL instance that will represent the replica server name
Location: Choose the Region and then the Zone in which your instance will be provisioned.
Machine Type: Choose a machine type for your replica; this can be modified later! In some cases it is recommended to choose a higher instance configuration than the one you will keep after replication synchronization finishes
Storage type: Choose between SSD and HDD; for higher performance, choose SSD
Storage capacity: This can be from 10GB up to 10TB. The checkbox for Enable automatic storage increases means that whenever you are near capacity, space will be incrementally increased. All increases are permanent
SQL Dump File: The dump generated containing the binary logging position and GTID information.
(Optional) More options can be configured by clicking on Show advanced options like Authorized networks, Database flags, and Labels.
Once you’ve filled out this information, click on Create.
The following section, Data synchronization, will display the previously selected options as well as the Outgoing IP Address, which must be added to your proxy, firewall, or whitelist so that the replica is able to connect and fetch replication data. Once you are sure your primary can be accessed using the specified credentials and the IP has been whitelisted, you can click on Next. After that, replication will start.
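Once replication starts, you can sanity-check it from any machine that can reach the replica. A minimal sketch, assuming the stock mysql client and the replica IP shown on its instance page ([REPLICA_IP] is a placeholder):

```sh
# Show the three standard replication health fields on the replica.
mysql -h [REPLICA_IP] -u root -p -e "SHOW SLAVE STATUS\G" \
  | grep -E 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'
```

Slave_IO_Running and Slave_SQL_Running should both read Yes, and Seconds_Behind_Master should trend toward 0.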
If you want to see this feature in action, check out this video from Google Cloud Next 2018.
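The dump itself is generated with mysqldump. Combining the flags described below with the gzip/gsutil upload mentioned at the end of this post, a minimal sketch of the whole pipeline looks like this (every bracketed value is a placeholder explained in the list that follows; --ignore-table takes the database.view form):

```sh
mysqldump -h [MASTER_IP] -P [MASTER_PORT] -u [USERNAME] -p \
    --databases [DBS] \
    --hex-blob \
    --skip-triggers \
    --master-data \
    --order-by-primary \
    --no-autocommit \
    --default-character-set=utf8 \
    --ignore-table=[DB].[VIEW] \
    --single-transaction \
    --set-gtid-purged=ON \
  | gzip \
  | gsutil cp - gs://[BUCKET]/[PATH_TO_DUMP]
```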
-h: the hostname or IPv4 address of the primary, which replaces [MASTER_IP]
-P: the port of the primary server; the [MASTER_PORT] value will usually be 3306
-u: takes the username passed in [USERNAME]
-p: informs that a password will be given
--databases: a space-separated list of the databases to be included in the dump. Keep in mind [DBS] should not include the sys, performance_schema, information_schema, and mysql schemas
--hex-blob: necessary for dumping binary columns, whose types could be BINARY, BLOB, and others
--skip-triggers: recommended for the initial load; you can import the triggers at a later moment
--master-data: according to the documentation: “It causes the dump output to include a CHANGE MASTER TO statement that indicates the binary log coordinates (file name and position) of the dumped server”
--order-by-primary: dumps each table’s rows in primary key order
--no-autocommit: encloses each table dump between SET autocommit=0 and COMMIT statements
--default-character-set: informs the default character set
--ignore-table: must list each view to be ignored on import; for multiple views, use this option multiple times (the query after this list shows one way to find them). Views can be imported later, after the promotion of the replica is done
--single-transaction: a START TRANSACTION is sent to the database so the dump will contain the data up to that point in time
--set-gtid-purged: writes the state of the GTID information into the dump file and disables binary logging when the dump is loaded into the replica
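One way to find the views that need an --ignore-table entry is a query like this against information_schema (the system-schema filter mirrors the [DBS] note above):

```sql
-- Lists every user-created view in database.view form for --ignore-table.
SELECT CONCAT(TABLE_SCHEMA, '.', TABLE_NAME) AS view_to_ignore
FROM information_schema.VIEWS
WHERE TABLE_SCHEMA NOT IN ('sys', 'performance_schema', 'information_schema', 'mysql');
```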
After that, the result is compressed with gzip and uploaded to a bucket on Google Cloud Storage with gsutil cp - gs://[BUCKET]/[PATH_TO_DUMP], where [BUCKET] is the bucket you created on GCS and [PATH_TO_DUMP] is the path where the file will be saved.
Be aware that no DDL operations should be performed on the database while the dump is being generated; otherwise you might end up with inconsistencies.
See something wrong in this tutorial? Please don’t hesitate to message me through the comments or the contact page.