mysqlguy.net

innodb_io_capacity

I've been doing some performance testing on modern hardware, comparing InnoDB plugin 1.0.4 with stock InnoDB. I'm running a sysbench transactional test (reads and writes) with 200M rows in my table (table size is around 46G, RAM is 16G, buffer pool is set to 12G).

I was puzzled to see the InnoDB plugin perform decently, but not as well as I expected: I was getting about 6100 RW operations a second (individual statements within transactions). Then I compared it to stock InnoDB and, shockingly, got about 7K ops. I thought about what I had tuned differently in the plugin and came up with innodb_io_capacity.

I/O Thread delay trick

I was debugging some replication delay monitoring metrics and noticed that the test I was running (sysbench --test=oltp prepare) to generate replication data was far outstripping the slave. The SQL thread was caught up to the I/O thread, but the I/O thread was way behind the master.
 
    Replicating from:           a2.db.bcp.re1.yahoo.com
    Master:                     a2_db_bcp_re1.000166/138395515
    Slave I/O:          Yes     a2_db_bcp_re1.000165/802640907  ???
    Slave Relay:        Yes     a2_db_bcp_re1.000165/802030586  596K
  198 secs
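
The byte figures in that output can be checked by hand. Below is a minimal Python sketch, assuming the lag is simply the difference between byte offsets when both positions are in the same binary log file. The helper names (parse_position, byte_lag) are made up for illustration, and treating "different files" as "unknown" is my guess at why the I/O row shows ???.

```python
def parse_position(text):
    """Split 'name.000165/802640907' into (file sequence number, byte offset)."""
    file_name, offset = text.split("/")
    sequence = int(file_name.rsplit(".", 1)[1])
    return sequence, int(offset)


def byte_lag(ahead, behind):
    """Bytes between two positions, or None when they sit in different files.

    Assumption: without knowing the size of the earlier file, the distance
    across a file boundary cannot be computed from positions alone.
    """
    ahead_seq, ahead_off = parse_position(ahead)
    behind_seq, behind_off = parse_position(behind)
    if ahead_seq != behind_seq:
        return None
    return ahead_off - behind_off


# Positions copied from the status output above.
master = "a2_db_bcp_re1.000166/138395515"
slave_io = "a2_db_bcp_re1.000165/802640907"
slave_relay = "a2_db_bcp_re1.000165/802030586"

print(byte_lag(master, slave_io))              # None: master has moved on to file 000166
print(byte_lag(slave_io, slave_relay) // 1024)  # 596, matching the 596K shown
```

So the SQL side is only about 596K behind the I/O thread, while the I/O thread is at least a whole binary log file behind the master.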
 

 

In this case, the I/O thread was getting further and further behind as sysbench did bulk inserts into my master. My theory is that a lot of relatively small binary log records simply don't transfer efficiently. That leaves the SQL thread idle some of the time waiting for the I/O thread, and leads to inefficient replication.
 
   I poked around the replication options manual page, looking for something to help and found this:  slave_compressed_protocol
 

Re: Eventually Consistent Relational Database?

In response to Eric Day's post on "Eventually Consistent Relational Database?"... I started posting a comment there, but I realized I have my own blog for this sort of thing. :)

I've been thinking the same thing, and it's nice to hear I'm not the only one. This is a neglected area of "cloud" development, mostly because it's a big scary problem. Everyone says "use NoSQL", but if we had strategies/systems to give us EC RDBMS solutions, nobody would use key/value storage (except where it actually made sense). NoSQL is a big golden hammer nowadays. It works, but it sure takes a lot of effort to code stuff the storage layer should be able to handle (joins, etc.).

The power of RSS

Thanks a ton to Xarb, who reminded me of Pipes in his blog post about filtering out fluffy Planet MySQL authors.


I should have remembered Pipes, since I work at Yahoo and was called in to help with their DB on the first full day after launch, when it couldn't handle the traffic. But hey, sometimes things just don't come to mind.

So you want to talk about Single points of failure, eh?

In reply to Arjen's post about Single points of failure:


Arjen, you are absolutely right. It doesn't matter how over-engineered a storage solution is (I'm thinking of a giant dual-headed Netapp with redundant everything). After you've paid a few hundred K for that, you still have a single point of failure. Is it a highly unlikely point of failure? Sure, but it's still a point of failure.

Let's take it a step further. At Yahoo we're beyond thinking about how to make a single node redundant (be it for storage, networking, or even a simple webserver); we consider entire datacenters to be single points of failure. What does that mean?

About Me

Jay Janssen
Yahoo!, Inc.
jayj at yahoo dash inc dot com

MySQL
High Availability
Global Load Balancing
Failover
