Approximate Times to Update an RDS Instance

Here’s a quick overview of an upgrade, a couple of months ago, of an RDS instance from type db.t2.medium to type db.r3.large. In addition to changing the instance type, we upgraded the disk from 64GB to 100GB and applied a new parameter group. The disk increase took by far the longest: we clocked in at ever so slightly over an hour for the disk increase, while the instance upgrade took only a bit over 16 minutes. [Read More]

Why You Should Point Staging to Your Production Database

I have been thinking about this topic recently, as I’ve just started working with a new infrastructure, and one of the things I noticed is that the staging database is a separate copy of the production database. It also happens to be a very old copy that has drifted, certainly in data and possibly in schema. It was likely put in place to avoid the perceived risk of new code affecting production data in some critical way, and to provide a safe, solid environment in which to examine changes before they hit production. [Read More]

Percona Live 2012 – The Etsy Shard Architecture

I attended Percona Conf in Santa Clara last week. It was a great three days of people dropping expert MySQL knowledge. I learned a lot and met some great people. I gave a talk about the Etsy shard architecture and got a lot of good feedback and questions. A lot of people run active/passive, but we run everything active/active master-master. This helps us keep a warm buffer pool on both sides, and if one side goes out we only lose about half of the queries until it’s pulled and they start hashing to the other side. [Read More]
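As a rough illustration of the active/active idea described above, here is a minimal sketch of hashing users across the two sides of a shard pair. The host names and hashing scheme are assumptions for illustration, not Etsy's actual code:

```python
# Hypothetical sketch: users hash to a shard pair, then to one of its two
# active/active masters, so both sides keep a warm buffer pool.

SHARD_PAIRS = [
    ("db-shard1-a", "db-shard1-b"),
    ("db-shard2-a", "db-shard2-b"),
]

def pick_host(user_id: int) -> str:
    """Normal operation: split users between the two writable masters."""
    pair = SHARD_PAIRS[user_id % len(SHARD_PAIRS)]
    side = (user_id // len(SHARD_PAIRS)) % 2
    return pair[side]

def pick_host_degraded(user_id: int, dead: str) -> str:
    """If one side is pulled, its users start hashing to the survivor."""
    pair = SHARD_PAIRS[user_id % len(SHARD_PAIRS)]
    return pair[0] if pair[1] == dead else pair[1]
```

Since roughly half of each pair's users normally hash to each side, pulling one master loses only that half of queries until they re-hash to the surviving side.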

Distributed MySQL Sleuthing on the Wire

Intro Oftentimes you need to know what MySQL is doing right now, and if you are handling heavy traffic you probably have multiple instances of it running across many nodes. I’m going to start by showing how to take a tcpdump capture on one node and a few ways to analyze it, then go into how to take a distributed capture across many nodes for aggregate analysis. Taking the Capture The first thing you need to do is take a capture of the interesting packets. [Read More]
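A typical single-node capture might look like the following (the interface name is an assumption; `-s 65535` grabs full packets so long queries aren't truncated):

```shell
# Capture MySQL wire traffic on the standard port to a pcap file.
sudo tcpdump -i eth0 -s 65535 'port 3306' -w mysql.pcap

# One common way to analyze it offline: pt-query-digest from Percona
# Toolkit understands tcpdump output of the MySQL protocol.
tcpdump -r mysql.pcap -s 65535 -x -n -q -tttt | \
  pt-query-digest --type tcpdump
```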

Using Mongo and Map Reduce on Apache Access Logs

Introduction With more and more traffic pouring into websites, it has become necessary to come up with creative ways to parse and analyze large data sets. One popular way to do that lately is MapReduce, a framework used across distributed systems to help make sense of large data sets. There are lots of implementations of the map/reduce framework, but an easy way to get started is with MongoDB. [Read More]
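MongoDB's mapReduce takes JavaScript map and reduce functions; the same shape of computation can be sketched in plain Python to show what the two phases do over access-log lines (the log format and counting-by-status choice are illustrative assumptions):

```python
# Minimal map/reduce sketch over Apache common-log lines: map emits
# (status, 1) pairs, reduce sums the values per key.
import re
from collections import defaultdict

LOG_RE = re.compile(r'\S+ \S+ \S+ \[[^\]]+\] "(?:\S+) (\S+) [^"]*" (\d{3})')

def map_line(line):
    """Emit a (status_code, 1) pair for each parsed log line."""
    m = LOG_RE.match(line)
    if m:
        yield m.group(2), 1

def reduce_counts(pairs):
    """Sum emitted values per key, like a reduce function."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

logs = [
    '127.0.0.1 - - [10/Oct/2011:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2011:13:55:37 -0700] "GET /missing HTTP/1.0" 404 209',
    '127.0.0.1 - - [10/Oct/2011:13:55:38 -0700] "GET /index.html HTTP/1.0" 200 2326',
]
counts = reduce_counts(p for line in logs for p in map_line(line))
# counts maps each status code to how often it appeared
```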

Investigating Data in Memcached

Intro Almost every company I can think of uses Memcached at some layer of their stack; however, until recently I hadn’t found a great way to take a snapshot of the keys in memory and their associated metadata, such as expire time, LRU time, the value size, and whether it’s been expired or flushed. The tool to do this is called peep. Installing Peep The main thing about installing peep is that you have to compile memcached with debugging symbols. [Read More]

Using DBIx::Class to Version Your Schema

Intro to DBIx::Class In my opinion DBIx::Class is one of the best ORM solutions out there. Not only can it model your database, including mapping out any foreign key relationships, it can also be used as a canonical point of reference for your schema. This means we can use it not only as an application-layer interface to the database, but also to define a versioned database structure in code: if you add a field to your result class, you can generate a DDL and the alter statements just by running a simple script. [Read More]
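The post itself uses Perl's DBIx::Class, but the core idea of versioned schemas — a version table plus ordered DDL steps applied in sequence — can be sketched in a language-neutral way. This is a hand-rolled Python/SQLite analogue with hypothetical names, not the DBIx::Class mechanism itself:

```python
# Concept sketch of versioned schema upgrades: each version maps to the
# DDL that brings the database up to it, tracked in a schema_version table.
import sqlite3

MIGRATIONS = {
    1: ["CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"],
    2: ["ALTER TABLE users ADD COLUMN email TEXT"],  # like adding a field
}

def current_version(db):
    db.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = db.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0

def upgrade(db):
    """Apply, in order, every migration newer than the recorded version."""
    for version in sorted(MIGRATIONS):
        if version > current_version(db):
            for ddl in MIGRATIONS[version]:
                db.execute(ddl)
            db.execute("INSERT INTO schema_version VALUES (?)", (version,))
    db.commit()

db = sqlite3.connect(":memory:")
upgrade(db)
cols = [r[1] for r in db.execute("PRAGMA table_info(users)")]
```

Running `upgrade` twice is harmless, since already-recorded versions are skipped — the same property the generated DBIx::Class alter scripts aim for.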

Quick DB Setups with MySQL Sandbox

Introduction There are various reasons to set up quick “sandbox” instances of MySQL. You can use them to test different types of replication (such as master-master or various slave topologies), to test your code against different versions of MySQL, or to set up per-developer instances of MySQL, where each person has their own database running on a different port so they can tear down and set up the DB easily or make schema changes without affecting other team members. [Read More]
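For a sense of how quick this is, a sandbox is built from a stock MySQL tarball with one command (the tarball name and version here are assumptions):

```shell
# Build a throwaway instance; MySQL Sandbox derives the port from the
# version (e.g. 5.5.25 -> 5525) so several versions can coexist.
make_sandbox mysql-5.5.25-linux-x86_64.tar.gz

# Each sandbox gets its own control scripts under $HOME/sandboxes:
~/sandboxes/msb_5_5_25/use    # mysql client wired to this instance
~/sandboxes/msb_5_5_25/stop

# Replication topologies are also one command:
make_replication_sandbox mysql-5.5.25-linux-x86_64.tar.gz
```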

MySQL Snapshots using LVM

Introduction LVM snapshots are a quick and easy way to take a backup of your MySQL server’s data files. The idea is to acquire a read lock on all tables, flush all server caches to disk, create a snapshot of the volume containing the MySQL data directory, and then unlock the tables again. This gives an atomic snapshot that takes only milliseconds. A MySQL instance can then be launched to point at this data directory, and you can perform a mysqldump to get data for setting up slaves, or you can take a completely destroyed DB and use the data directory to restore it to this point in time. [Read More]
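The lock-snapshot-unlock sequence sketched above might look like this (volume group, volume names, and sizes are assumptions; note the read lock is released if the mysql session disconnects, so it must stay open while the snapshot is created):

```shell
# In a mysql session that stays open for the whole sequence:
#   mysql> FLUSH TABLES WITH READ LOCK;

# From another shell, while the lock is held, snapshot the volume
# holding the MySQL data directory:
lvcreate --snapshot --size 10G --name mysql-snap /dev/vg0/mysql-data

# Back in the open mysql session:
#   mysql> UNLOCK TABLES;

# Mount the snapshot read-only to copy the data directory off for a
# slave, a mysqldump from a second mysqld, or a point-in-time restore:
mount -o ro /dev/vg0/mysql-snap /mnt/mysql-snap
```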