Tuesday, September 11, 2012

MySQL Cluster to InnoDB Replication Howto


In this blog post I will show you how to setup a replication from MySQL Cluster  (ndbcluster) to a regular MySQL Server (InnoDB). If you want to understand the concepts, check out part 7 of our free MySQL Cluster training.

First of all we start with a MySQL Cluster looking like this, and what we want to do is to setup replication server to the Reporting Server (InnoDB slave).




MySQL Cluster is great at scaling large numbers of write transactions or shorter key-based read querise, but not so good at longer reporting or analytical queries. I generally recommend people to limit analytical/reporting queries on the MySQL Cluster, in order to avoid slowing down the realtime access to the cluster. A great way of doing that is to replicate the MySQL Cluster data to a standalone MySQL Server.  

To achieve that, we will need a replication server. All data written into NDBCLUSTER is sent as events to the replication server. A MySQL Server can be turned into a replication server by specifying --log-bin. The replication server then produces a binary log, which can be replicated to a standalone InnoDB. 

(NOTE: For redundancy, it is possible to have 2 replication servers. We will cover that in a separate blog.)


Replication Layer Configuration
In the my.cnf of the replication server you should have the following:
[mysqld]
...
#REPLICATION SPECIFIC - GENERAL
#server-id must be unique across all mysql servers participating in replication.
server-id=101
#REPLICATION SPECIFIC - MASTER
log-bin=binlog
binlog-do-db=user_db_1
binlog-do-db=user_db_2
binlog-do-db=mysql
expire-logs-days=5
...
You may want to skip  the binlog-do-db=.., if you want to replicate all databases, but, if you want to replicate a particular database, make sure you also replicate the mysql database in order to get some very important data on the slave.

Restart the replication server for the settings to have effect.
Grant access to the Slave:

GRANT REPLICATION SLAVE ON *.* TO 'repl'@'ip/hostname of mysqld m' 
  IDENTIFIED BY 'repl';

InnoDB Slave Configuration
The first requirement on the InnoDb slave is that it must use the mysqld binary that comes from the MySQL Cluster package. If you already have a MySQL 5.5 installed that is not clustered, you need to upgrade it to the Cluster version of it. E.g, by doing:

sudo rpm -Uvh MySQL-Cluster-server-gpl-7.2.7-1.el6.x86_64.rpm
sudo rpm -Uvh MySQL-Cluster-client-gpl-7.2.7-1.el6.x86_64.rpm

The InnoDB slave should have the following:

[mysqld]
...
binlog-format=mixed
log-bin=binlog
log-slave-updates=1
slave-exec-mode=IDEMPOTENT
expire-logs-days=5
...

If you want the InnoDb to further replicate to a set of slaves, then you should set log-slave-updates=1 otherwise you can set it to 0 (log-slave-updates=0). Thatt is all, restart the slave.

You must also create the following table on the Innodb Slave:

use mysql;
CREATE TABLE `ndb_apply_status` (
`server_id` int(10) unsigned NOT NULL,
`epoch` bigint(20) unsigned NOT NULL,
`log_name` varchar(255) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`start_pos` bigint(20) unsigned NOT NULL,
`end_pos` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`server_id`) USING HASH)
ENGINE=INNODB DEFAULT CHARSET=latin1;

Then do CHANGE MASTER:
CHANGE MASTER TO MASTER_HOST='ip/hostname of the replication server', MASTER_USER='repl', MASTER_PASSWORD='repl';

Staging the InnoDB Slave with Data
Now you need to stage the InnoDB slave with data. What you need to do is to disable traffic to NDBCLUSTER in order to get a consistent snapshot of the data. As there are no clusterwide table locks in NDBCLUSTER you have two options:
  1. Block the Loadbalancer from sending any transactions to MySQL Cluster
  2. Make all SQL nodes READ ONLY, by locking all tables on ALL MySQL servers (if you use NDBAPI applications, then option 1) or shutting down the applications is the the only option):  FLUSH TABLES WITH READ LOCK;
So on all MySQL Servers in the Access Layer do:
FLUSH TABLES WITH READ LOCK;

Ensure, by looking at the replication server,  that no writes are made to the NDBCLUSTER by looking at the SHOW MASTER STATUS:
mysql> show master status;
+---------------+-----------+--------------+------------------+
| File          | Position  | Binlog_Do_DB | Binlog_Ignore_DB |
+---------------+-----------+--------------+------------------+
| binlog.000008 | 859092065 |              |                  |
+---------------+-----------+--------------+------------------+
1 row in set (0.00 sec) 
Run the SHOW MASTER STATUS; a couple of times until you see the Position not changing any more

Then RESET the replication server, so you have a good clean slate to start from:
mysql> RESET MASTER;

Now use mysqldump two times to get  :
  1. one dump with the schema
  2. another dump with the data 
mysqldump --no-data --routines --trigggers > schema.sql
mysqldump --no-create-info --master-data=1 > data.sql

Of course you can dump out only the databases you are interested in.


When the dumps have finished, you can enable traffic to NDBCLUSTER again. You can do on ALL SQL nodes:

UNLOCK TABLES

Point is that you can enable traffic to NDBCLUSTER again.

Now, change the ENGINE=ndbcluster to ENGINE=innodb in schema.sql:

sed -i.bak 's#ndbcluster#innodb#g' schema.sql

Copy the schema.sql and data.sql to the slave, and load in the dump file to the InnoDb slave.

Finally you can start replication, on the InnoDB slave you can now do:

START SLAVE;
SHOW SLAVE STATUS \G

And hopefully all will be fine :)

10 comments:

Unknown said...

Hi,

I have 2 sqlnodes (A&B), and i want to replicate mysql cluster to an innodb database.
I set the replication on innodb database to replicate on sqlnode A. But when I using insert query on sqlnode B it isn't replicated to innodb database. What is the problem?

Kelly Hans said...

You need to inset in Node A, Nobe B is for query only

Johan Andersson said...

Hi Akhmad,

If A and B are connected to MySQL Cluster, and you are replicating from A or B to the INNODB database (C) then that should work.

Does it work if you insert on A?
Is it really connected to the Cluster?
Is the table you want to replicate from cluster created with engine=NDB?


thanks,
johan

Unknown said...

Thanks All,

My problem was solved. It's caused because the default binlog_format in my caluster was 'statement'. After I set it to 'row' my replication is works. Thanks for your comments..

Anonymous said...

Hi,

I was wondering if there is any possibility to make a consistent backup from the cluster without having to lock all the sql nodes.
We currently have a lot of traffic and locking the cluster down would be something we would not want to risk.

Johan Andersson said...

Hi,

No it is not possible. It is unfortunately only possible to stage slaves using mysqldump, which then requires no writes to MySQL Cluster.
Another possibility is to use ndb backup (ONLINE) and use ndb_restore and restore into csv files. However, ndb_restore supports only a handful data types (like INT, CHAR, VARCHAR) when writing to the csv files.

So mysqldump is your best bet, unfortunately.

Keith said...

hi johan, when importing the mysql database it's giving an error on

CREATE TABLE `ndb_index_stat_sample`

Error Code: 1071. Specified key was too long; max key length is 767 bytes

Is replicating the mysql database important?
What is the best solution for this?
Thanks.

Johan Andersson said...

Hi Keith,

Pewh, looks like a new bug (Error Code: 1071. Specified key was too long; max key length is 767 bytes
).

Try and remove the mysql database from the replication.
On the slave you can set (in my.cnf):
replicate-ignore-db=mysql
or on the master:
binlog-ignore-db=mysql
Use whatever that is most convenient for you (slave filtering may be enough since there are not usually that many changes happening on the mysql database.

Let me know what happens!
Thanks
Johan

Keith said...

Is there any benefit of having "ndbinfo", "performance_schema" and "tmp" databases replicated?

Johan Andersson said...

No, so you can disable them too.
-j