Friday, September 04, 2009

Pick Two

There's a saying in the computer business, "Good, Fast, Cheap. Pick Two." This saying represents the trade offs we have to make every day on the front lines of IT.

Recently, we have started playing around with MySQL Enterprise Monitor (MEM). MEM is MySQL's offering for an event driven monitoring application similar to Oracle's Enterprise Manager. There are four pieces to MEM; the monitoring agent that gathers that db statistics, the query analyzer that allows you to "trace" queries, the connection proxy that allows you to redirect connections, and the data repository.

Our first stab at configuring MEM was putting the monitoring agent, query analyzer, and query proxy on the host being monitored with the repository db on a different machine. This actually worked pretty well until we had some big queries run through the query proxy. CPU utilization spiked because both the query proxy and the database had to handle the same set of data.

So MySQL Support suggested we move the proxy, query analyzer, and monitoring agent to a separate host. This "middle-tier" would handle the extra CPU load we were generating and keep our db host performance consistent. This solution worked very well, except we weren't getting host related statistics like CPU utilization on the DB host.

No problem, right? We moved the monitoring agent to the db host and kept the proxy and query analyzer on the middle-tier. Now we had our host related statistics, except we couldn't get the query analyzer to work because it has to have a monitoring agent present in order to work properly.

The only solution MySQL Support can offer us at this point is to have two agents monitoring the same database; one on the middle-tier host and one on the db host. OK, we'll do what needs to be done, but I have two fundamental problems with that configuration. First, my database background tells me that duplication will come back and bite you in the ass someday. My other very practical issue is now we are storing twice as much data in the repository for 10% more information.

So while the concept of MEM is great, the actual implementation is a compromise.

Note: Some of the above configurations are not currently supported by MySQL. Use at your own risk.

2 comments:

arjen said...

Its architecture is a nuisance and has been since day one, since it does not take into account what people actually need. You don't really want to monitor your MySQL database server separate from the machine beneath it, other services, your network, and a possible front end.

There are other options. We use them, and provide proactive support rather than emergency/reactive.

ryan said...

We run the proxy, agent, and Quan all on the same box. One key difference though, we don't flow any production traffic through Quan under normal circumstances. We will only point a subsection of our cluster to hit the Quan port in order to get a sample of live traffic.

In non-production (test, stage) environments we throw everything through Quan except for performance testing. This helps us catch errors in the early stage before they hit production, and the testing level of activity is never enough to cause problems.

This provides very little overhead, but with most of the benefit from using MEM. May not work for everyone though.