mysqlguy.net

Faster way to find the next valid position in a relay log?

Submitted by jay on January 11, 2008 - 11:25am

I'm working on an implementation of a mysql slave precache tool as described by Paul Tuckfield at last year's User conference.

I can easily tell where the SQL Thread is currently is reading from in the local relay log, but it's a harder problem to solve when I want to scan ahead in that log some number of binlog entries and find a valid position. I can simply add to the position some number of bytes, but then I have to stop and loop through calls to 'mysqlbinlog' until I get a position that doesn't throw an error.

This turns out to be a pretty expensive operation, and I'm having a problem where by the time I find a good position and can start my precacher, the slave is already past me!

Does anyone have any clever ways to find good binary log positions besides this (this is in ruby, in case you're wondering):

def mysqlbinlog_cmd( position, length )
   "mysqlbinlog --start-pos #{position.pos} --stop-pos #{position.pos.to_i + length} #{position.file} 2>&1"
end

def find_next_valid_position( start_position )
    string = %x{#{mysqlbinlog_cmd( start_position, 1)}}
    until string !~ /Error in Log_event/ and string !~ /Unknown event/
        start_position = start_position + 1
        string = %x{#{mysqlbinlog_cmd( start_position, 1)}}
    end

    # If we give a position off the end of the file, mysqlbinlog outputs 
    # one line and then exits.  Check for that here.
    return nil if string.split(/\n/).length == 1

    return start_position
end

Trackback URL for this post:

http://mysqlguy.net/trackback/2

Why not just only work off of

Why not just only work off of valid positions?  Each entry in the binary log output will contain both the beginning and end position:

# at 669690
#071026 18:29:32 server id 1  end_log_pos 670619        Query   thread_id=231   exec_time=0     error_code=0

The # at 669690 is the current position, and the end_log_pos 670619 is where it ends in the binary log.  It does get a bit more complex with transactions for end_log_pos, but that is the general way I would do it.

For example, say you know the position the slave is currently on, then you could use the start-position as the beginning and then calculate a good stop-position based on how far in the future you want to be.  stop-position doesn't have to be exact like start-position does.  You could then get the last event based on that.  If the slave is at 659877 and you want to be 10000 bytes ahead for example, in shell:

mysqlbinlog --start-pos=659877 --stop-pos=669877  file-bin.000023 | grep "^# at" | tail -n 1

jay's picture

Thanks, that's a good tip.  I

Thanks, that's a good tip.  I didn't spend a lot of time examining the log format. 

I suggest seeing how the

I suggest seeing how the SQL_SLAVE_SKIP_COUNTER variable works.

-Sheeri K. Cabral

I've already done it

Jay,

I've already written what you're trying to do, but it's unreleased.  (And I had a bunch of email conversations with Paul and others about it).  I haven't yet released it for the same reason Paul hasn't released his.  However, I'd love for you to try it out.  Shoot me an email at baron at xaprb dot com.

Warning!

Comment abuse is not tolerated on this site, besides all the comments are moderated, so don't bother posting comments that are not on topic, only for increasing the SEO of your site, or are outright spam.  If you've got something intelligent to contribute, by all means, post a link to your blog.  

About Me

Jay Janssen
Yahoo!, Inc.
jayj at yahoo dash inc dot com
MySQL
High Availability
Global Load Balancing
Failover
View Jay Janssen on Twitter  View Jay Janssen's LinkedIn profile View Jay Janssen's Facebook profile