How to fix MySQL Replication Error 1236

We have some websites that use two database servers in master-master replication. Recently one of the servers died and had to be resurrected. (They are cloud based, and SoftLayer doesn’t seem to have its cloud offering nailed down yet. Every week one of them goes down and is unresponsive until tech support does some magic.)

After the dead server was brought back up, the other server would not connect to it. Its slave status showed “Slave_IO_Running: No” and “Seconds_Behind_Master: NULL”, which means the slave I/O thread was not running and nothing was being replicated.
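Those two fields come from SHOW SLAVE STATUS, which prints a few dozen others along with them. Here is a minimal sketch for pulling out only the ones that matter for this problem. The function name slave_summary is made up for illustration, and the usage line assumes the mysql client can log in without prompting for a password.

```shell
# Print only the replication fields relevant to this diagnosis.
# Reads the output of `SHOW SLAVE STATUS\G` on standard input.
slave_summary() {
    # Anchor on the start of the line so Relay_Master_Log_File is not matched.
    grep -E '^ *(Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master|Master_Log_File|Read_Master_Log_Pos):' \
        | sed 's/^ *//'
}

# Usage (assumption: credentials are in ~/.my.cnf or similar):
#   mysql -e 'SHOW SLAVE STATUS\G' | slave_summary
```

Master_Log_File and Read_Master_Log_Pos are the file and position the slave is asking the master for, so they are worth noting down before going any further.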

First, I went to the MySQL log file, which for this server is found at /var/log/mysql.log, and looked at the last few messages by running “tail /var/log/mysql.log” from the command prompt. This revealed the error number (server_errno=1236). It also had the following info:
Got fatal error 1236: ‘Client requested master to start replication from impossible position’ from master when reading data from binary log

Just before this entry, mysql.log shows the file and position that the slave is trying to read. So, with this data in hand, I headed over to the master server and took a look at the bin logs. They are located in /var/lib/mysql/. I inspected the file in question with the mysqlbinlog utility, using the following command. Obviously, you’ll have to replace the position and file name with the position and file indicated in your mysql.log.
mysqlbinlog --offset=128694 mysql-bin.000013

And, this is what I saw here, among other things:
Warning: this binlog was not closed properly. Most probably mysqld crashed writing it.
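A likely reason for the “impossible position” wording is that the slave is asking for a position past the end of the file the master actually has on disk, which fits a bin log that was cut short by a crash. A rough way to check that on the master is to compare the requested position with the file size in bytes. This is only a sketch under that assumption. The function name position_is_valid is hypothetical, and the path in the usage line combines the directory and file name mentioned above.

```shell
# Succeeds (exit 0) when the requested position is not past the end of the bin log.
# $1 = path to the bin log on the master, $2 = position the slave asked for.
# Returns 2 when the file cannot be read.
position_is_valid() {
    [ -r "$1" ] || return 2
    # Arithmetic expansion strips any padding that wc adds on some systems.
    size=$(( $(wc -c < "$1") ))
    [ "$2" -le "$size" ]
}

# Usage with the file and position from this post:
#   position_is_valid /var/lib/mysql/mysql-bin.000013 128694 \
#       || echo "requested position is past the end of the file"
```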

Well, that explains things! When the server crashed, the bin log was not closed properly. This is easy to fix. Going back to the slave server, I stopped the slave, reset the bin log position and started the slave again.
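-- Stop replication before changing the coordinates
STOP SLAVE;
-- Reset the position so the slave reads from the start of the file
CHANGE MASTER TO MASTER_LOG_POS = 0;
-- Point at the bin log that follows the one that was not closed properly
CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.000014';
START SLAVE;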

I simply pointed the slave to the start of the next bin log. It started right up with no problem.
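The name of the next bin log is just the broken one with its number bumped by one. If you want to script that step, here is a small sketch. The function name next_binlog is made up for illustration, and it assumes the usual six-digit numeric suffix seen in the file names above.

```shell
# Print the name of the bin log that follows the given one,
# e.g. the file after mysql-bin.000013.
next_binlog() {
    base=${1%.*}      # everything before the last dot
    num=${1##*.}      # the numeric suffix
    # Strip leading zeros so the shell does not read the number as octal.
    num=$(printf '%s' "$num" | sed 's/^0*//')
    printf '%s.%06d\n' "$base" $(( ${num:-0} + 1 ))
}

# Usage:
#   next_binlog mysql-bin.000013
```

Before pointing the slave at the result, it is worth confirming on the master that a file with that name really exists in /var/lib/mysql/.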