
We have a MySQL 5.0.77 master-slave replication setup. Replication had not been running properly for the past few weeks and was giving duplicate entry error 1062. The `SET GLOBAL sql_slave_skip_counter` option didn't help, so I had to skip error 1062 by adding it to the /etc/my.cnf file, and then it reported a "table doesn't exist" error in one particular database.
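For reference, the error-skipping change described above is usually made with the `slave-skip-errors` option in my.cnf (a sketch; section names can vary by distribution, and skipping errors can silently leave the slave inconsistent with the master):

```ini
# /etc/my.cnf on the slave
[mysqld]
# Skip duplicate-key errors (1062) instead of stopping the SQL thread.
# Use with caution: skipped statements are lost, not reconciled.
slave-skip-errors = 1062
```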

I then took a mysqldump of that database and restored it on the slave last weekend. After that, both the slave I/O and SQL threads started running fine, and it looked like replication was back on track. The Seconds_Behind_Master value was very high at first, but it had been decreasing over the past 4 days.

When I checked the slave replication status today, I found that Seconds_Behind_Master has kept increasing since this morning. I stopped the slave I/O thread once, and Seconds_Behind_Master became NULL; after I restarted the I/O thread, the value came back and kept on increasing.

I also see one process that has been running since morning:

system user | gss-app | Connect | 9535736 | copy to tmp table | ALTER TABLE `queue_clicks` ADD INDEX (puid)

Please help me to fix this issue. Thanks.

mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 203.x.x.x
                Master_User: replication
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000990
        Read_Master_Log_Pos: 185674180
             Relay_Log_File: mysqld-relay-bin.000224
              Relay_Log_Pos: 9286354
      Relay_Master_Log_File: mysql-bin.000774
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 472142385
            Relay_Log_Space: 112995681998
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: 9533355
1 row in set (0.00 sec)
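For scale, the Seconds_Behind_Master value shown above converts to roughly 110 days of lag (plain arithmetic on the reported value, no MySQL involved):

```python
# Convert the reported Seconds_Behind_Master into days/hours for readability.
seconds_behind = 9_533_355  # value from SHOW SLAVE STATUS above

days, rem = divmod(seconds_behind, 86_400)   # 86,400 seconds per day
hours = rem // 3_600

print(f"{days} days, {hours} hours behind")  # → 110 days, 8 hours behind
```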
Manny0606
  • MySQL 5.0 reached its end of life some time ago, and the latest MySQL 5.0 release is 5.0.96 (you have MySQL 5.0.77). Many replication bugs have been fixed since MySQL 5.0.77, and replication performance has improved quite a bit since even MySQL 5.1. There are many reasons for MySQL replication delays in older versions of MySQL. – Stefan Lasiewski Mar 18 '15 at 01:01

3 Answers


I wouldn't worry about it as long as the I/O and SQL threads are running and Relay_Master_Log_File is catching up with Master_Log_File. I believe the delay comes from the fact that your total relay log space is huge, approximately 105 GiB (Relay_Log_Space: 112995681998). Considering that the slave is executing log 000774 while the master is on 000990, there are a total of 216 binary logs of roughly 500 MiB each (105 GiB / 216) waiting to be replayed on the slave.
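The backlog estimate above can be reproduced with a little arithmetic on the fields from the question's status output (a sketch using those exact values):

```python
# Estimate the replication backlog from SHOW SLAVE STATUS fields.
relay_log_space = 112_995_681_998  # Relay_Log_Space, in bytes
master_log = 990                   # numeric suffix of Master_Log_File (mysql-bin.000990)
exec_log = 774                     # numeric suffix of Relay_Master_Log_File (mysql-bin.000774)

pending_logs = master_log - exec_log           # binlogs still to be replayed
avg_log_mib = relay_log_space / pending_logs / 1024**2

print(f"~{relay_log_space / 1024**3:.0f} GiB across {pending_logs} logs "
      f"(~{avg_log_mib:.0f} MiB each)")
```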

My advice is to keep an eye on Relay_Master_Log_File and make sure it is going up and catching up with Master_Log_File. I also see that the master host is on a public IP address; is this replication taking place over a public network or a slow WAN? That could be introducing delay. The faster the link, the better.

Daniel t.
  • Thanks for your response! Yes, the Relay_Master_Log_File is catching up with the Master_Log_File. Today the Relay_Log_Space is about 74G. I believe the slave will catch up with the master in a week or so. – Manny0606 Mar 18 '15 at 00:53
  • Hi Daniel, the replication finally did catch up with the master server two days ago. There were a few errors from partially completed statements that I had to run manually, and I kept an eye on the replication several times every day. Now it is 0 seconds behind master. I tested that replication is working fine by creating new tables in the master database and verifying the same were created on the slave server. All is fine. – Manny0606 Apr 15 '15 at 00:05

I've experienced the exact same issue before.

Please make the following changes to rectify "Seconds_Behind_Master" increasing once the slave has started.

First, check the "trx" variables on the slave server:

  • show variables like '%trx%';

If the value of "innodb_flush_log_at_trx_commit" is "1", as below:

  • Variable_name | Value |

  • innodb_api_trx_level | 0 |

  • innodb_flush_log_at_trx_commit | 1 |

  • innodb_use_global_flush_log_at_trx_commit | ON |

then change this value to "2":

  • set global innodb_flush_log_at_trx_commit = 2;

Once you've changed this in your MySQL client, also change it in your my.cnf file so that the value persists after MySQL restarts.

  • vi /etc/mysql/my.cnf

search for "innodb_flush_log_at_trx_commit=1"

and change its value to "2" as well.

This should resolve your "Seconds_Behind_Master" issue.
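The persistent form of the change described above would look roughly like this in my.cnf (a sketch; note that with a value of 2, InnoDB syncs the log to disk only about once per second, so up to a second of committed transactions can be lost on an OS crash. That durability trade-off is the price of the speedup):

```ini
# /etc/mysql/my.cnf
[mysqld]
# 1 = flush and sync the log on every commit (safest, slowest)
# 2 = write on commit, sync ~once per second (faster, small loss window)
innodb_flush_log_at_trx_commit = 2
```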


When you stop the I/O thread and the SQL thread catches up with the available relay log data, Seconds_Behind_Master turns to NULL, and hence it does not reflect the real delay under these circumstances. You can find the right ways to interpret the values of Seconds_Behind_Master under different conditions here: https://scalegrid.io/blog/mysql-tutorial-understanding-the-seconds-behind-master-value/
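As a rough illustration of that NULL behaviour, here is a small sketch that parses `SHOW SLAVE STATUS\G`-style output and treats NULL as "unknown" rather than as zero lag. The function and sample text are hypothetical, not part of any MySQL client library:

```python
def parse_seconds_behind(status_text: str):
    """Extract Seconds_Behind_Master from SHOW SLAVE STATUS \\G output.

    Returns an int, or None when the server reports NULL (e.g. when the
    SQL thread has consumed all relay logs while the I/O thread is stopped).
    """
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Seconds_Behind_Master":
            value = value.strip()
            return None if value == "NULL" else int(value)
    return None  # field not present in the output

sample = """
       Slave_IO_Running: Yes
      Slave_SQL_Running: Yes
  Seconds_Behind_Master: 9533355
"""
print(parse_seconds_behind(sample))                         # 9533355
print(parse_seconds_behind("Seconds_Behind_Master: NULL"))  # None
```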