Slides from my talk at Percona 2013
Apr 23 2013
Slides from my talk at Percona 2013
Apr 15 2012
I attended Percona Conf in Santa Clara last week. It was a great 3 days of lots of people dropping expert MySQL knowledge. I learned a lot and met some great people.
I gave a talk about the Etsy shard architecture, and had a lot of good feedback and questions. A lot of people do active/passive, but we run everything in active/active master-master. This helps us keep a warm buffer pool on both sides, and if one side goes out we only lose about half of the queries until its pulled and they start hashing to the other side. Some details on how this works in the slides below.
Jan 09 2012
Oftentimes you need to know what MySQL is doing right now and furthermore if you are handling heavy traffic you probably have multiple instances of it running across many nodes. I’m going to start by showing how to take a tcpdump capture on one node, a few ways to analyze that, and then go into how to take a distributed capture across many nodes for aggregate analysis.
The first thing you need to do is to take a capture of the interesting packets. You can either do this on the MySQL server or on the hosts talking to it. According to this percona post this command is the best way to capture mysql traffic on the eth0 interface and write it into mycapture.cap for later analysis:
% tcpdump -i eth0 -w mycapture.cap -s 0 "port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2" tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes 47542 packets captured 47703 packets received by filter 60 packets dropped by kernel
The next step is to take a look at your captured data. One way to do this is with tshark, which is the command line part of wireshark. You can do yum install wireshark or similar to install it. Usually you want to do this on a different host than the one taking traffic since it can be memory and CPU intensive.
You can then use it to reconstruct the mysql packets like so:
% tshark -d tcp.port==3306,mysql -T fields -R mysql.query -e frame.time -e ip.src -e ip.dst -e mysql.query -r mycapture.cap
This will give you the time, source IP, destination IP, and query but this is still really raw output. Its a nice start but we can do better. Percona has released the Percona Toolkit which includes some really nice command line tools (including what used to be in Maatkit).
The one we’re interested in here is pt-query-digest
It has tons of options and you should read the documentation, but here’s a few I’ve used recently.
Lets say you want to get the top tables queried from your tcpdump
% tcpdump -r mycapture.cap -n -x -q -tttt | pt-query-digest --type tcpdump --group-by tables --order-by Query_time:cnt \ --report-format profile --limit 5 reading from file mycapture.cap, link-type EN10MB (Ethernet) # Profile # Rank Query ID Response time Calls R/Call Apdx V/M Item # ==== ======== ============= ===== ====== ==== ===== ==================== # 1 0x 0.3140 6.1% 674 0.0005 1.00 0.00 shard.images # 2 0x 0.8840 17.1% 499 0.0018 1.00 0.03 shard.activity # 3 0x 0.1575 3.1% 266 0.0006 1.00 0.00 shard.listing_images # 4 0x 0.1680 3.3% 265 0.0006 1.00 0.00 shard.connection_edges_reverse # 5 0x 0.0598 1.2% 254 0.0002 1.00 0.00 shard.listing_translations # MISC 0xMISC 3.5771 69.3% 3534 0.0010 NS 0.0 <86 ITEMS>
Note the tcpdump options I used this time, which the tool requires to work properly when passing –type tcpdump. I also grouped by tables (as opposed to full queries) and ordered by the count (the Calls column). It will stop at your –limit and group the rest into MISC so be aware of that.
You can remove the –order-by to sort by response time, which is the default sort order, or provide other attributes to sort on. We can also change the –report-format, for example to header:
% tcpdump -r mycapture.cap -n -x -q -tttt | pt-query-digest --type tcpdump --group-by tables --report-format header reading from file mycapture.cap, link-type EN10MB (Ethernet) # Overall: 5.49k total, 91 unique, 321.13 QPS, 0.30x concurrency _________ # Time range: 2012-01-08 15:52:05.814608 to 15:52:22.916873 # Attribute total min max avg 95% stddev median # ============ ======= ======= ======= ======= ======= ======= ======= # Exec time 5s 3us 114ms 939us 2ms 3ms 348us # Rows affecte 316 0 13 0.06 0.99 0.29 0 # Query size 3.64M 18 5.65k 694.98 1.09k 386.68 592.07 # Warning coun 0 0 0 0 0 0 0 # Boolean: # No index use 0% yes, 99% no
If you set the –report-format to query_report you will get gobs of verbose information that you can dive into and you can use the –filter option to do things like getting slow queries:
% tcpdump -r mycapture.cap -n -x -q -tttt | \
pt-query-digest --type tcpdump --filter '($event->{No_index_used} eq "Yes" || $event->{No_good_index_used} eq "Yes")'
Now that we’ve taken a look at capturing and analyzing packets from one host, its time to dive into looking at our results across the cluster. The main trick is that tcpdump provides no option to stop capturing – you have to explicitly kill it. Otherwise we’ll just use dsh to send our commands out. We’ll assume you have a user that can hop around in a password-less fashion using ssh keys – setting that up is well outside the scope of this article but there’s plenty of info out there on how to do that.
There’s a few ways you can let a process run on a “timeout” but I’m assuming we don’t have any script written or tools like bash timeout or the one distributed in coreutils available.
So we’re going off the premise that you will background the process and kill it after a sleep by grabbing its pid:
( /path/to/command with options ) & sleep 5 ; kill $!
Simple enough, except we’ll want to capture the output on each host, so we need to ssh the output back over to the target using a pipe to grab the stdout. This means that $! will return the pid of our ssh command instead of our tcpdump command. We end up having to do a little trick to kill the right process, since the capture won’t be readable if we kill ssh command that is writing the output. We’ll need to kill tcpdump and to do that we can look at the parent pid of the ssh process, ask pkill (similar to pgrep) for all of the processes that have this parent, and finally kill the oldest one, which ends up being our tcpdump process.
Then end result looks like this if I were to run it across two machines:
% dsh -c -m web1000,web1001 \
'sudo /usr/sbin/tcpdump -i eth0 -w - -s 0 -x -n -q -tttt "port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2" | \
ssh dshhost "cat - > ~/captures/$(hostname -a).cap" & sleep 10 ; \
sudo pkill -o -P $(ps -ef | awk "\$2 ~ /\<$!\>/ { print \$3; }")'
So this issues a dsh to two of our hosts (you can make a dsh group with 100 or 1000 hosts though) and runs the command concurrently on each (-c). We issue our tcpdump on each target machine and send the output to stdout for ssh to then cat back to a directory on the source machine that issued the dsh. This way we have all of our captures in one directory with each file named with the target name of each host the tcpdump was run. The sleep is how long the dump is going to run for before we then kill off the tcpdump.
The last piece of the puzzle is to get these all into one file and we can use the mergecap tool for this, which is also part of wireshark:
% /usr/sbin/mergecap -F libpcap -w output.cap *.cap
And then we can analyze it like we did above.
http://www.mysqlperformanceblog.com/2011/04/18/how-to-use-tcpdump-on-very-busy-hosts
http://stackoverflow.com/questions/687948/timeout-a-command-in-bash-without-unnecessary-delay
http://www.xaprb.com/blog/2009/08/18/how-to-find-un-indexed-queries-in-mysql-without-using-the-log/
Just to clarify this command a bit more, particularly how the kill part works since that was the trickiest part for me to figure out.
When we run this
$ dsh -c -m web1000,web1001 \
'sudo /usr/sbin/tcpdump -i eth0 -w - -s 0 -x -n -q -tttt "port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2" | \
ssh dshhost "cat - > ~/captures/$(hostname -a).cap" & sleep 10 ; \
sudo pkill -o -P $(ps -ef | awk "\$2 ~ /\<$!\>/ { print \$3; }")'
on the server the process list looks something like
user 12505 12504 0 03:12 ? 00:00:00 bash -c sudo /usr/sbin/tcpdump -i eth0 -w - -s 0 -x -n -q -tttt "port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2" | ssh myhost.myserver.com "cat - > /home/etsy/captures/$(hostname -a).cap" & sleep 5 ; sudo pkill -o -P $(ps -ef | awk "\$2 ~ /\<$!\>/ { print \$3; }")
pcap 12506 12505 1 03:12 ? 00:00:00 /usr/sbin/tcpdump -i eth0 -w - -s 0 -x -n -q -tttt port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2
user 12507 12505 0 03:12 ? 00:00:00 ssh myhost.myserver.com cat - > ~/captures/web1001.cap
So $! is going to return the pid of the ssh process, 12507. We use awk to find the process matching that, and then print the parent pid out, which is then passed to the -P arg of pkill. If you use pgrep to look at this without the -o you’d get a list of the children of 12505, which are 12506 and 12507. The oldest child is the tcpdump command and so adding -o kills that guy off.
So if we were only running the command on one host we could use something much simpler
ssh dbhost01 '(sudo /usr/sbin/tcpdump -i eth0 -w - -s 0 port 3306) & sleep 10; sudo kill $!' | cat - > output.cap
Jan 18 2010
With more and more traffic pouring into websites, it has become necessary to come up with creative ways to parse and analyze large data sets. One of the popular ways to do that lately is using MapReduce which is a framework used across distributed systems to help make sense of large data sets. There are lots of implementations of the map/reduce framework but an easy way to get started is by using MongoDB. MongoDB is a scalable and high performant document oriented database. It has replication, sharding, and mapreduce all built in which makes it easy to scale horizontally.
For this article we’ll look a common use case of map/reduce which is to help analyze your apache logs. Since there is no set schema in a document oriented database, its a good fit for log files since its fairly easy to import arbitrary data. We’ll look at getting the data into a format that mongo can import, and writing a map/reduce algorithm to make sense of some of the data.
Mongo is easy to install with detailed documentation here. In a nutshell you can do
$ mkdir -p /data/db $ curl -O http://downloads.mongodb.org/osx/mongodb-osx-i386-latest.tgz $ tar xzf mongodb-osx-i386-latest.tgz
At this point its not a bad idea to put this directory somewhere like /opt and adding its bin directory to your path. That way instead of
./mongodb-xxxxxxx/bin/mongod &
You can just do
mongodb &
In any case, start up the daemon one of those two ways depending how you set it up.
Apache access files can vary in the information reported. The log format is easy to change with the LogFormat directive which is documented here. In any case these logs that I’m working with are not out of the box apache format. They look something like this
Jan 18 17:20:26 web4 logger: networks-www-v2 84.58.8.36 [18/Jan/2010:17:20:26 -0500] "GET /javascript/2010-01-07-15-46-41/97fec578b695157cbccf12bfd647dcfa.js HTTP/1.1" 200 33445 "http://www.channelfrederator.com/hangover/episode/HNG_20100101/cuddlesticks-cartoon-hangover-4" "Mozilla/5.0 (Windows; U; Windows NT 6.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7" www.channelfrederator.com 35119
We want to take this raw log and convert it to a JSON structure for importing into mongo, which I wrote a simple perl script to iterate through the log and parse it into sensible fields using a regular expression
#!/usr/bin/perl
use strict;
use warnings;
my $logfile = "/logs/httpd/remote_www_access_log";
open(LOGFH, "$logfile");
foreach my $logentry (<LOGFH>) {
chomp($logentry); # remove the newline
$logentry =~ m/(\w+\s\w+\s\w+:\w+:\w+)\s #date
(\w+)\slogger:\s # server host
([^\s]+)\s # vhost logger
(?:unknown,\s)?(-|(?:\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3},?\s?)*)\s #ip
\[(.*)\]\s # date again
\"(.*?)\"\s # request
(\d+)\s #status
([\d-]+)\s # bytes sent
\"(.*?)\"\s # referrer
\"(.*?)\"\s # user agent
([\w.-]+)\s? # domain name
(\d+)? # time to server (ms)
/x;
print <<JSON;
{"date": "$5", "webserver": "$2", "logger": "$3", "ip": "$4", "request": "$6", "status": "$7", "bytes_sent": "$8", "referrer": "$9", "user_agent": "$10", "domain_name": "$11", "time_to_serve": "$12"}
JSON
}
Again my regular expression probably won’t quite work on your logs, though you may be able to take bits and pieces of what I’ve documented above to use for your script. On my logs that script outputs a bunch of lines that look like this
{"date": "18/Jan/2010:17:20:26 -0500", "webserver": "web4", "logger": "networks-www-v2", "ip": "84.58.8.36", "request": "GET /javascript/2010-01-07-15-46-41/97fec578b695157cbccf12bfd647dcfa.js HTTP/1.1", "status": "200", "bytes_sent": "33445", "referrer": "http://www.channelfrederator.com/hangover/episode/HNG_20100101/cuddlesticks-cartoon-hangover-4", "user_agent": "Mozilla/5.0 (Windows; U; Windows NT 6.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7", "domain_name": "www.channelfrederator.com", "time_to_serve": "35119"}
And we can then import that directly to MongoDB. The creates a collection called weblogs in the logs database, from the file that has the output of the above JSON generator script
$ mongoimport --type json -d logs -c weblogs --file weblogs.json
We can also take a look at them and verify they loaded by running the find command, which dumps out 10 rows by default
$ mongo > use logs; switched to db logs > db.weblogs.find()
So for this example what I am looking for is how many hits are going to each domain. My servers handle a bunch of different domain names and that is one thing I’m outputting into my logs that is easy to examine
The basic command line interface to MondoDB is a kind of javascript interpreter, and the mongo backend takes javascript implementations of the map and reduce functions. We can type these directly into the mongo console. The map function must emit a key/value pair and in this example it will output a ’1′ each time a domain is found
> map = "function() { emit(this.domain_name, {count: 1}); }"
And so basically what comes out of this is a key for each domain with a set of counts, something like
{"www.something.com", [{count: 1}, {count: 1}, {count: 1}, {count: 1}]}
This is send to the reduce function, which sums up all those counts for each domain
> reduce = "function(key, values) { var sum = 0; values.forEach(function(f) { sum += f.count; }); return {count: sum}; };"
Now that you’ve setup map and reduce functions, you can call mapreduce on the collection
> results = db.weblogs.mapReduce(map, reduce)
...
>results
{
"result" : "tmp.mr.mapreduce_1263861252_3",
"timeMillis" : 9034,
"counts" : {
"input" : 1159355,
"emit" : 1159355,
"output" : 92
},
"ok" : 1,
}
This gives us a bit of information about the map/reduce operation itself. We see that we inputted a set of data and emmitted once per item in that set. And we reduced down to 92 domain with a count for each. A result collection is given, and we can print it out
> db.tmp.mr.mapreduce_1263861252_3.find() > it
You can type the ‘it’ operator to page through multiple pages of data. Or you can print it all at once like so
> db.tmp.mr.mapreduce_1263861252_3.find().forEach( function(x) { print(tojson(x));});
{ "_id" : "barelydigital.com", "value" : { "count" : 342888 } }
{ "_id" : "barelypolitical.com", "value" : { "count" : 875217 } }
{ "_id" : "www.fastlanedaily.com", "value" : { "count" : 998360 } }
{ "_id" : "www.threadbanger.com", "value" : { "count" : 331937 } }
Nice, we have our data aggregated and the answer to our initial problem.
As usual CPAN comes to the rescue with a MongoDB driver that we can use to interface with our database in a scripted fashion. The mongo guys have also done a great job of supporting it in a bunch of other languages which makes it easy to interface with if you are using more than just perl
#!/bin/perl
use MongoDB;
use Data::Dumper;
use strict;
use warnings;
my $conn = MongoDB::Connection->new("host" => "localhost", "port" => 27017);
my $db = $conn->get_database("logs");
my $map = "function() { emit(this.domain_name, {count: 1}); }";
my $reduce = "function(key, values) { var sum = 0; values.forEach(function(f) { sum += f.count; }); return {count: sum}; }";
my $idx = Tie::IxHash->new(mapreduce => 'weblogs', 'map' => $map, reduce => $reduce);
my $result = $db->run_command($idx);
print Dumper($result);
my $res_coll = $result->{'result'};
print "result collection is $res_coll\n";
my $collection = $db->get_collection($res_coll);
my $cursor = $collection->query({ }, { limit => 10 });
while (my $object = $cursor->next) {
print Dumper($object);
}
The script is straightforward. Its just using the documented perl interface to make a run_command call to mongo, which passes the map and reduce javascript functions in. It then prints the results similar to how we did on the command line earlier.
We’ve gone over a very simple example of how MapReduce can work for you. There are lots of other ways you can put it to good use such as distributed sorting and searching, document clustering, and machine learning. We also took a look at MongoDB which has great uses for schema-less data. It makes it easy to scale since it has built in replication and sharding capabilities. Now you can put map/reduce to work on your logs and find a ton of information you couldn’t easily get before.
http://github.com/mongodb/mongo
http://www.mongodb.org/display/DOCS/MapReduce
http://apirate.posterous.com/visualizing-log-files-with-mongodb-mapreduce
http://search.cpan.org/~kristina/MongoDB-0.27/
http://kylebanker.com/blog/2009/12/mongodb-map-reduce-basics/
Jun 11 2009
Almost every company I can think of uses Memcached at some layer of their stack, however I haven’t until recently found a great way to take a snapshot of the keys in memory and their associated metadata such as expire time, LRU time, the value size and whether its been expired or flushed. The tool to do this is called peep
The main thing about installing peep is that you have to compile memcached with debugging symbols. Not really a big deal though. First thing you’ll want to do is install libevent
$ wget http://monkey.org/~provos/libevent-1.4.11-stable.tar.gz $ tar xzvf libevent-1.4.11-stable.tar.gz $ cd libevent-1.4.11-stable $ ./configure $ make $ sudo make install
Now grab memcached
$ wget http://memcached.googlecode.com/files/memcached-1.2.8.tar.gz $ tar xzvf memcached-1.2.8.tar.gz $ cd memcached-1.2.8 $ CFLAGS='-g' ./configure --enable-threads $ make $ sudo make install
Note the configure line on this one sets the -g flag which enables debug symbols. Now move this directory somewhere standard like /usr/local/src because we’ll need to reference it when installing peep
$ sudo mv memcached-1.2.8 /usr/local/src
Ok, now for peep, which is written in ruby. If you don’t have rubygems you need that and the ruby headers. Just use your package management system for this. I happened to be on a CentOS box so I ran
$ sudo yum install rubygems.noarch $ sudo yum install ruby-devel.i386
Finally you can install peep
$ sudo gem install peep -- --with-memcached-include=/usr/local/bin/memcached-1.2.8
Finally you can actually use this thing, you just give it the running memcached process pid
$ sudo peep --pretty `cat /var/run/memcached/memcached.pid`
Here’s a snippet of what peep can show us in its “pretty” mode
time | exptime | nbytes | nsuffix | it_f | clsid | nkey | key | exprd | flushd
485 | 785 | 17925 | 10 | link | 25 | 64 | "post.ordered-posts1-03afdb" | true | false
537 | 721 | 16991 | 10 | link | 24 | 63 | "post.ordered-posts1-03bd6" | true | false
240 | 3684 | 434 | 8 | link | 9 | 22 | "channel.channel.105.v1" | false | false
241 | 3687 | 27286 | 10 | link | 27 | 55 | "post.post_count-fec35129" | false | false
538 | 4022 | 3223 | 9 | link | 17 | 55 | "post.post_count-2ff57a7" | false | false
538 | 4024 | 17169 | 10 | link | 25 | 55 | "post.post_count-2ba928998d" | false | false
241 | 3686 | 10763 | 10 | link | 22 | 55 | "post.post_count-3879a24011" | false | false
25 | 320 | 8874 | 9 | link | 22 | 22 | "channel.posterframes.4" | true | false
The above is not the easiest thing to analyze, especially if you have a lot of data in cache. But we can easily load it in to MySQL so that we can run queries on it.
First create the db and permissions
mysql> create database peep; mysql> grant all on peep.* to peep@'localhost' identified by 'peep4u';
Then a table to store the output of the peep snapshot
mysql> CREATE TABLE `entries` ( `lru_time` int(11) DEFAULT NULL, `expires_at_time` int(11) DEFAULT NULL, `value_size` int(11) DEFAULT NULL, `suffix_size` int(11) DEFAULT NULL, `it_flag` varchar(255) DEFAULT NULL, `class_id` int(11) DEFAULT NULL, `key_size` int(11) DEFAULT NULL, `key_name` varchar(255) DEFAULT NULL, `is_expired` varchar(255) DEFAULT NULL, `is_flushed` varchar(255) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Now run peep and output the data in “ugly” format into a file
$ sudo peep --ugly `cat /var/run/memcached/memcached.pid` > peep.out
And now you can load it into your entries table
mysql> load data local infile '/tmp/peep.out' into table entries fields terminated by ' | ' lines terminated by '\n'; Query OK, 177 rows affected (0.01 sec) Records: 177 Deleted: 0 Skipped: 0 Warnings: 0
I’m only loading a small dev install of memcahed but if you are running this on production you’d be importing many more rows. Luckily load data infile is sufficiently optimized to input large datasets. Somewhat unfortunately however, peep makes memcached block while its taking its snapshot, which can take a bit of time in production.
In any case, not that interesting with dev data but you can get lots of interesting numbers out of this. In my case it looks like most of the stuff in my cache is expired already.
mysql> select is_expired, count(*) as num, sum(value_size) as value_size from entries group by is_expired; +------------+-----+------------+ | is_expired | num | value_size | +------------+-----+------------+ | false | 1 | 415 | | true | 176 | 2392792 | +------------+-----+------------+ 2 rows in set (0.01 sec)
Or selecting by grouping slabs and displaying their sizes
mysql> select class_id as slab_class, max(value_size) as slab_size from entries group by slab_class; +------------+-----------+ | slab_class | slab_size | +------------+-----------+ | 1 | 8 | | 2 | 15 | | 9 | 445 | | 12 | 1100 | | 13 | 1402 | | 14 | 1710 | | 15 | 2174 | | 16 | 2409 | | 17 | 3223 | | 20 | 6154 | | 21 | 8395 | | 22 | 10763 | | 23 | 13395 | | 24 | 16991 | | 25 | 17925 | | 26 | 23320 | | 27 | 33361 | | 28 | 38824 | | 29 | 44904 | +------------+-----------+ 19 rows in set (0.00 sec)
Although there are a lot of tools such as Cacti that will display graphs of your Memcached usage, there are not many tools that will actually show you specifics of whats in memory. This tool can be used for a variety of reasons, but its especially useful to example the keys that you have in memory and the size of their values, as well as if they’re expired and what the individuals slabs are holding.
May 17 2009
In my opinion DBIx::Class is one of the best ORM solutions out there. Not only can it model your database, including mapping out any foreign key relationships, it can also be used as a canonical point of reference for your schema. This means we can use it not only as an application layer interface to the database, but can also define a versioned database structure using the code, where if you add a field into your result class, you can generate a version of alter statements and a DDL just by running a simple script. Of course this set of alterations can later be applied to your stage and production environments.
It’s possible to produce your schema classes with DBIx::Class::Schema::Loader if you have a pre-existing database and are initially generating your DBIC classes, but this article is going to show going from the ground up how to build your database from the code defining the fields and tables as well as the relations, which match to foreign key constraints when the database definition files are generated.
This example will be a pretty simple schema but enough to demonstrate generating a couple of tables with a foreign key constraint. Since a lot of people are doing social networking stuff these days we can assume a common use case would be a user who has a bunch of photos. So we’ll need to model a user table and a photo table, but first things first, lets create a few basic things, assuming we’ll call our application MySocialApp
mkdir -p MySocialApp/lib/MySocialApp/Schema/Result cd MySocialApp
We need to create a schema file at lib/MySocialApp/Schema.pm that inherits from the base DBIx::Class::Schema
package MySocialApp::Schema;
use base qw/DBIx::Class::Schema/;
use strict;
use warnings;
our $VERSION = '0.00001';
__PACKAGE__->load_namespaces();
__PACKAGE__->load_components(qw/Schema::Versioned/);
__PACKAGE__->upgrade_directory('sql/');
1;
So here all we are doing is extending the base Schema and loading the Versioned component. We’re setting the directory where DDL and Diff files will be generated into the sql directory. We’re also invoking load_namespaces which tells it to look at the MySocialApp::Schema::Result namespace for our result classes by default.
Here I will define what the database is going to look like
package MySocialApp::Schema::Result::User;
use Moose;
use namespace::clean -except => 'meta';
extends 'DBIx::Class';
__PACKAGE__->load_components(qw/Core/);
__PACKAGE__->table('user');
__PACKAGE__->add_columns(
user_id => {
data_type => "INT",
size => 11,
is_nullable => 0,
is_auto_increment => 1,
extra => { unsigned => 1 },
},
username => {
data_type => "VARCHAR",
is_nullable => 0,
size => 255,
},
);
__PACKAGE__->set_primary_key('user_id');
__PACKAGE__->has_many(
'photos',
'MySocialApp::Schema::Result::Photo',
{ "foreign.fk_user_id" => "self.user_id" },
);
1;
This is relatively straightforward. I’m creating a user table that has an auto incrementing primary key and a username field. I’m also defining a relationship that says a user can have many photos. Lets create the photo class.
package MySocialApp::Schema::Result::Photo;
use Moose;
use namespace::clean -except => 'meta';
extends 'DBIx::Class';
__PACKAGE__->load_components(qw/Core/);
__PACKAGE__->table('photo');
__PACKAGE__->add_columns(
photo_id => {
data_type => "INT",
size => 11,
is_nullable => 0,
is_auto_increment => 1,
extra => { unsigned => 1 },
},
url => {
data_type => "VARCHAR",
is_nullable => 0,
size => 255,
},
fk_user_id => {
data_type => "INT",
size => 11,
is_nullable => 0,
extra => { unsigned => 1 },
},
);
__PACKAGE__->set_primary_key('photo_id');
__PACKAGE__->belongs_to(
'user',
'MySocialApp::Schema::Result::User',
{ 'foreign.user_id' => 'self.fk_user_id' },
);
1;
Same basic thing here, a photo class with a photo_id primary key, a url field, and an fk_user_id field that keys into the user table. Each photo belongs to a user, and this relationship will define our foreign key constraint when the schema is generated.
We have the main DBIx::Class pieces in place to generate the database, but we’ll need a couple of scripts to support our versioned database. One script will generate the schema based on the version before it, introspecting which alterations have been made and producing a SQL diff file to alter the database. The other script will look at the database to see if it needs upgrading, and run the appropriate diff files to bring it up to the current version.
First the schema and diff generation script which we’ll call script/gen_schema.pl
use strict;
use FindBin;
use lib "$FindBin::Bin/../lib";
use Pod::Usage;
use Getopt::Long;
use MySocialApp::Schema;
my ( $preversion, $help );
GetOptions(
'p|preversion:s' => \$preversion,
) or die pod2usage;
my $schema = MySocialApp::Schema->connect(
'dbi:mysql:dbname=mysocialapp;host=localhost', 'mysocialapp', 'mysocialapp4u'
);
my $version = $schema->schema_version();
if ($version && $preversion) {
print "creating diff between version $version and $preversion\n";
} elsif ($version && !$preversion) {
print "creating full dump for version $version\n";
} elsif (!$version) {
print "creating unversioned full dump\n";
}
my $sql_dir = './sql';
$schema->create_ddl_dir( 'MySQL', $version, $sql_dir, $preversion );
This script will be run anytime we change something in the Result files. You give it the previous schema version, and it will create a diff between that and the new version. Before running this you’ll update the $VERSION variable in lib/MySocialApp/Schema.pm so that it knows a change has been made.
The next script is the upgrade script, we’ll call it script/upgrade_db.pl
use strict;
use FindBin;
use lib "$FindBin::Bin/../lib";
use MySocialApp::Schema;
my $schema = MySocialApp::Schema->connect(
'dbi:mysql:dbname=mysocialapp;host=localhost', 'mysocialapp', 'mysocialapp4u'
);
if (!$schema->get_db_version()) {
# schema is unversioned
$schema->deploy();
} else {
$schema->upgrade();
}
This script checks to see if any diffs need to be applied, and applies them if the version held by the database and the version in your Schema.pm file differ, bringing the database up to the correct schema version.
Note, in these scripts I’ve hardcoded the DB info which really should go into a configuration file.
We need to create the database that our tables will be created in. In the connect calls above we’re using this user and password to connect to our database. I’m using MySQL for the example so this would be done on the MySQL command prompt
mysql> create database mysocialapp; Query OK, 1 row affected (0.00 sec) mysql> grant all on mysocialapp.* to mysocialapp@'localhost' identified by 'mysocialapp4u'; Query OK, 0 rows affected (0.01 sec)
Now its time to deploy our initial schema into MySQL. But for the first go round we also have to create the initial DDL file, this way when we make changes in the future it can be compared against the Schema result classes to see what changes have been made. We can do this by supplying a nonexistent previous version to our gen_schema.pl script
$ perl script/gen_schema.pl -p 0.00000 Your DB is currently unversioned. Please call upgrade on your schema to sync the DB. creating diff between version 0.00001 and 0.00000 No previous schema file found (sql/MySocialApp-Schema-0.00000-MySQL.sql) at /home/jgoulah/perl5/lib/perl5/DBIx/Class/Storage/DBI.pm line 1685.
And we can see the DDL file now exists
$ ls sql/ MySocialApp-Schema-0.00001-MySQL.sql
Then we need to deploy to MySQL for the first time so we run the upgrade script
$ perl script/upgrade_db.pl Your DB is currently unversioned. Please call upgrade on your schema to sync the DB.
Now check out what its created in MySQL
mysql> use mysocialapp; Database changed mysql> show tables; +----------------------------+ | Tables_in_mysocialapp | +----------------------------+ | dbix_class_schema_versions | | photo | | user | +----------------------------+ 3 rows in set (0.00 sec)
There is our photo table, our user table, and also a dbix_class_schema_versions table. This last table just keeps track of what version the database is. You can see we are in sync with the Schema class by selecting from that table and also when this version was installed.
mysql> select * from dbix_class_schema_versions; +---------+---------------------+ | version | installed | +---------+---------------------+ | 0.00001 | 2009-05-17 21:59:57 | +---------+---------------------+ 1 row in set (0.00 sec)
The really great thing is that we have created the tables -and- constraints based on our schema result classes. Check out our photo table
mysql> show create table photo\G
*************************** 1. row ***************************
Table: photo
Create Table: CREATE TABLE `photo` (
`photo_id` int(11) unsigned NOT NULL auto_increment,
`url` varchar(255) NOT NULL,
`fk_user_id` int(11) unsigned NOT NULL,
PRIMARY KEY (`photo_id`),
KEY `photo_idx_fk_user_id` (`fk_user_id`),
CONSTRAINT `photo_fk_fk_user_id` FOREIGN KEY (`fk_user_id`) REFERENCES `user` (`user_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
Lets say we want to add a password field to our user table. I’d open up the lib/MySocialApp/Schema/Result/User.pm file and add a section for my password field to the add_columns definition so now it looks like:
__PACKAGE__->add_columns(
user_id => {
data_type => "INT",
size => 11,
is_nullable => 0,
is_auto_increment => 1,
extra => { unsigned => 1 },
},
username => {
data_type => "VARCHAR",
is_nullable => 0,
size => 255,
},
password => {
data_type => "VARCHAR",
is_nullable => 0,
size => 255,
},
);
Then we update the lib/MySocialApp/Schema.pm file and update to the next version so it looks like
our $VERSION = '0.00002';
To create the DDL and Diff for this version we run the gen_schema script with the previous version as the argument
$ perl script/gen_schema.pl -p 0.00001 Versions out of sync. This is 0.00002, your database contains version 0.00001, please call upgrade on your Schema. creating diff between version 0.00002 and 0.00001
If you look in the sql directory there are two new files. The DDL is named MySocialApp-Schema-0.00002-MySQL.sql and the diff is called MySocialApp-Schema-0.00001-0.00002-MySQL.sql and has the alter statement
$ cat sql/MySocialApp-Schema-0.00001-0.00002-MySQL.sql -- Convert schema 'sql/MySocialApp-Schema-0.00001-MySQL.sql' to 'MySocialApp::Schema v0.00002':; BEGIN; ALTER TABLE user ADD COLUMN password VARCHAR(255) NOT NULL; COMMIT;
Now we can apply that with our upgrade script
$ perl script/upgrade_db.pl Versions out of sync. This is 0.00002, your database contains version 0.00001, please call upgrade on your Schema. DB version (0.00001) is lower than the schema version (0.00002). Attempting upgrade.
and see that the upgrade has been made in MySQL
mysql> describe user; +----------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------+------------------+------+-----+---------+----------------+ | user_id | int(11) unsigned | NO | PRI | NULL | auto_increment | | username | varchar(255) | NO | | | | | password | varchar(255) | NO | | | | +----------+------------------+------+-----+---------+----------------+ 3 rows in set (0.00 sec)
So now we’ve seen a really easy way that we can maintain our database schema from our ORM code. We have a versioned database that is under revision control, and can keep our stage and production environments in sync with our development code. With this type of setup its also easy to maintain branches with different database changes and merge them into the mainline, and it also ensures that the database you are developing on is always in sync with the code.
Jan 25 2009
There are various reasons to setup quick “sandbox” instances of MySQL. You can use them to test different types of replication (such as master-master or various slave topologies), to test your code against different versions of MySQL, or to setup instances of MySQL on a per developer basis where each person has their own database running on a different port so they can breakdown/setup the DB easily or make schema changes without affecting other team members. A perfect tool to do all of these things easily is MySQL Sandbox.
To use MySQL sandbox effectively you need two things, the MySQL sandbox tool itself, and a MySQL tarball that the sandbox script can use to setup instances.
You can download the latest MySQL sandbox version like so (2.0.12 as of this writing):
wget http://launchpad.net/mysql-sandbox/mysql-sandbox-2.0/2.0/+download/mysql_sandbox_2.0.12.tar.gz
You can download the MySQL tarball from any MySQL mirror. Its important to get the non RPM, Intel C/C++ compiled, glibc-2.3 version (look for a tar.gz file with “icc” in the filename) for example if you want version 5.1:
wget ftp://mirror.anl.gov/pub/mysql/Downloads/MySQL-5.1/mysql-5.1.30-linux-i686-icc-glibc23.tar.gz
First you need to extract the MySQL Sandbox tool and change directory to it:
$ tar xzvf mysql_sandbox_2.0.12.tar.gz $ cd mysql_sandbox_2.0.12
The easiest and quickest way to create an instance is:
$ ./make_sandbox /path/to/mysql-X.X.XX-osinfo.tar.gz
where mysql-X.X.XX-osinfo.tar.gz is the MySQL tarball we just downloaded. And you’re done.
However, this will put the sandbox in a directory under your home directory ($HOME/sandboxes/msb_X_X_XX), which may or may not suit your purposes. It sets it up with default users, passwords, ports, and directory name. Lets fine tune things a bit.
I want to put my instance into a partition I created called /mnt/mysql_sandboxes. I’ve created a subdirectory in there called tarballs, which holds the MySQL tarball that we downloaded above which MySQL Sandbox will extract for setup. Since I’m installing version 5.1.30 I want to call the directory that houses the MySQL data files 5.1.30_single, but you can call it anything you like. I’ll create a default user named jgoulah and a password goulah. By default it sets the port to the version number without the dots (5130 in this case) so we’ll give it a custom port so that it listens on 10000 instead.
mysql_sandbox_2.0.12 $ ./make_sandbox \ /mnt/mysql_sandboxes/tarballs/mysql-5.1.30-linux-i686-icc-glibc23.tar.gz \ --upper_directory=/mnt/mysql_sandboxes/ --sandbox_directory=5.1.30_single \ --db_user=jgoulah --db_password=goulah --sandbox_port=10000
Here’s the output:
unpacking /mnt/mysql_sandboxes/tarballs/mysql-5.1.30-linux-i686-icc-glibc23.tar.gz
Executing ./low_level_make_sandbox \
--basedir=/mnt/mysql_sandboxes/tarballs/5.1.30 \
--sandbox_directory=msb_5_1_30 \
--install_version=5.1 \
--sandbox_port=5130 \
--no_ver_after_name \
--upper_directory=/mnt/mysql_sandboxes/ \
--sandbox_directory=5.1.30_single \
--db_user=jgoulah \
--db_password=goulah \
--basedir=/mnt/mysql_sandboxes/tarballs/5.1.30 \
--sandbox_port=10000 \
--my_clause=log-error=msandbox.err
The MySQL Sandbox, version 2.0.12 16-Oct-2008
(C) 2006,2007,2008 Giuseppe Maxia, Sun Microsystems, Database Group
installing with the following parameters:
upper_directory = /mnt/mysql_sandboxes/
sandbox_directory = 5.1.30_single
sandbox_port = 10000
datadir_from = script
install_version = 5.1
basedir = /mnt/mysql_sandboxes/tarballs/5.1.30
my_file =
operating_system_user = jgoulah
db_user = jgoulah
db_password = goulah
my_clause = log-error=msandbox.err
prompt_prefix = mysql
prompt_body = [\h] {\u} (\d) > '
force = 0
no_ver_after_name = 1
verbose = 0
load_grants = 1
no_load_grants = 0
do you agree? ([Y],n) y
loading grants
. sandbox server started
installation options saved to current_options.conf.
To repeat this installation with the same options,
use ./low_level_make_sandbox --conf_file=current_options.conf
----------------------------------------
Your sandbox server was installed in /mnt/mysql_sandboxes//5.1.30_single
Its now installed and started up, we can see that the process is running with the correct options:
$ ps -ef | grep mysql | grep jgoulah jgoulah 11128 1 0 13:48 pts/3 00:00:00 /bin/sh /mnt/mysql_sandboxes/tarballs/5.1.30/bin/mysqld_safe --defaults-file=/mnt/mysql_sandboxes//5.1.30_single/my.sandbox.cnf jgoulah 11203 11128 0 13:48 pts/3 00:00:00 /mnt/mysql_sandboxes/tarballs/5.1.30/bin/mysqld --defaults-file=/mnt/mysql_sandboxes//5.1.30_single/my.sandbox.cnf --basedir=/mnt/mysql_sandboxes/tarballs/5.1.30 --datadir=/mnt/mysql_sandboxes//5.1.30_single/data --user=jgoulah --log-error=/mnt/mysql_sandboxes//5.1.30_single/data/msandbox.err --pid-file=/mnt/mysql_sandboxes//5.1.30_single/data/mysql_sandbox10000.pid --socket=/tmp/mysql_sandbox10000.sock --port=10000
And we can connect to it on the port 10000:
$ mysql -u jgoulah --protocol=TCP -P 10000 -pgoulah
You can also go into the directory where we’ve installed this and there are some convenience scripts:
$ cd /mnt/mysql_sandboxes/5.1.30_single/
You can run the use script to connect into mysql (same thing we just did above except we don’t have to remember our port, user, or pass):
$ ./use
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.1.30 MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql [localhost] {jgoulah} ((none)) >
Stop the instance:
$ ./stop
Or start it back up again:
$ ./start . sandbox server started
There are a few other scripts which you can experiment with in this directory which are documented here
The nice thing about this tool is it will also setup replicated instances of MySQL with a single command. This allows you to test your application under a replicated environment, or even test different replication topologies including multiple slaves or multi-master replication.
We’ll use similar options as the single instance above, except we’ll use port 11000 this time (the slaves get port + 1, …, port + n where n is number of slaves). We’ll put the install into /mnt/mysql_sandboxes/5.1.30_replicated. Note this time we use the make_replication_sandbox script:
mysql_sandbox_2.0.12 $ ./make_replication_sandbox \ /mnt/mysql_sandboxes/tarballs/mysql-5.1.30-linux-i686-icc-glibc23.tar.gz \ --upper_directory=/mnt/mysql_sandboxes/ \ --replication_directory=5.1.30_replicated --sandbox_base_port=11000 installing and starting master installing slave 1 installing slave 2 starting slave 1 . sandbox server started starting slave 2 . sandbox server started initializing slave 1 initializing slave 2 replication directory installed on /mnt/mysql_sandboxes//5.1.30_replicated
Now we have a master and two slaves going. Note that the command to setup replicated sandboxes will not let us specify the user as we did with the single instance, but two users are created by default:
User: root@localhost Password: msandbox
User: msandbox@% Password: msandbox
You can run the use script as shown above, or connect directly to the master:
$ mysql -u msandbox --protocol=TCP -P 11000 -pmsandbox
Create a database:
mysql> create database jg_repl_test; mysql> exit;
Connect to one of the slaves:
$ mysql -u msandbox --protocol=TCP -P 11001 -pmsandbox mysql> show databases like '%jg_%'; +------------------+ | Database (%jg_%) | +------------------+ | jg_repl_test | +------------------+
And we can see that it has replicated across.
There are various options you can give the make_replication_sandbox command, for example you can give it the –master_master option to setup a multi master instance.
We’ve seen how to create a single instance of MySQL and also a replicated master with two slaves. Each of these takes only a few seconds to setup once you have the directory layout decided. There really isn’t a much easier way to setup scratch instances of MySQL for all kinds of different purposes.
Jan 22 2009
LVM snapshots are a quick and easy way to take a backup of your MySQL server’s data files. The idea is to acquire a read lock on all tables, flush all server caches to disk, and create a snapshot of the volume containing the MySQL data directory, then unlock the tables again. This gives an atomic snapshot that takes only milliseconds. A mysql instance can then be launched to point at this data directory and you can perform a mysqldump to get data for setting up slaves, or you can take a completely destroyed DB and use the data directory to restore it to this point in time.
Figure out the partition you want to put the LVM on, in our case we’ll use /data2
$ df -h | grep data2 /dev/sda2 371G 195M 352G 1% /data2
We’ll need to run fdisk on it. If you have data on here, save it to somewhere else!
$ sudo fdisk /dev/sda2 Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel Building a new DOS disklabel. Changes will remain in memory only, until you decide to write them. After that, of course, the previous content won't be recoverable. The number of cylinders for this disk is set to 49932. There is nothing wrong with that, but this is larger than 1024, and could in certain setups cause problems with: 1) software that runs at boot time (e.g., old versions of LILO) 2) booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK) Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite) Command (m for help): n Command action e extended p primary partition (1-4) p Partition number (1-4): 1 First cylinder (1-49932, default 1): Using default value 1 Last cylinder or +size or +sizeM or +sizeK (1-49932, default 49932): Using default value 49932 Command (m for help): t Selected partition 1 Hex code (type L to list codes): 8e Changed system type of partition 1 to 8e (Linux LVM)
You can review the change with the ‘p’ command:
Command (m for help): p Disk /dev/sda2: 410.7 GB, 410704680960 bytes 255 heads, 63 sectors/track, 49932 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sda2p1 1 49932 401078758+ 8e Linux LVM
You might need to reboot here, so go ahead and run sudo shutdown -r now
Now unmount the disk we want to create the lvm on and create a physical volume
$ sudo umount /dev/sda2 $ sudo pvcreate /dev/sda2 Physical volume "/dev/sda2" successfully created
You can see our new physical volume with pvdisplay
$ sudo pvdisplay "/dev/sda2" is a new physical volume of "382.50 GB" --- NEW Physical volume --- PV Name /dev/sda2 VG Name PV Size 382.50 GB Allocatable NO PE Size (KByte) 0 Total PE 0 Free PE 0 Allocated PE 0 PV UUID myxfgQ-alfA-y8sX-af4a-ifoA-3S29-YD3tuK
Now create a volume group on the physical volume, since we’re using this for mysql we’ll call it mysql
$ sudo vgcreate mysql /dev/sda2 Volume group "mysql" successfully created
And you can view it
$ sudo vgdisplay --- Volume group --- VG Name mysql System ID Format lvm2 Metadata Areas 1 Metadata Sequence No 1 VG Access read/write VG Status resizable MAX LV 0 Cur LV 0 Open LV 0 Max PV 0 Cur PV 1 Act PV 1 VG Size 382.50 GB PE Size 4.00 MB Total PE 97919 Alloc PE / Size 0 / 0 Free PE / Size 97919 / 382.50 GB VG UUID 0VZxkR-873N-E6Vj-H6SV-FCWO-3jHF-ykHRiX
Create a logical volume on the volume group, we’ll put it on our mysql volume group and call it maindb. We want it to be about half (or less) of the entire amount of space we have to make a backup, in this case thats about half of 382GB so we’ll go with 190:
$ sudo lvcreate --name maindb --size 190G mysql Logical volume "maindb" created
And you can view it:
$ sudo lvdisplay --- Logical volume --- LV Name /dev/mysql/maindb VG Name mysql LV UUID 4hlcMf-cs5a-xjYU-yxjE-Fx2f-Pngv-z0RAEH LV Write Access read/write LV Status available # open 0 LV Size 190.00 GB Current LE 48640 Segments 1 Allocation inherit Read ahead sectors auto - currently set to 256 Block device 253:0
Now we have to put an actual filesystem on it:
$ sudo mkfs.ext3 /dev/mysql/maindb
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
24903680 inodes, 49807360 blocks
2490368 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
1520 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
And now we can actually re mount it on the /data2 folder:
$ sudo mount /dev/mysql/maindb /data2
And add it to fstab, making sure to comment out the old entry
#LABEL=/data2 /data2 ext3 defaults 1 2 /dev/mysql/maindb /data2 ext3 rw,noatime 0 0
Note that we do not have to create the other logical volume on the volume group that is used for the snapshot, as the lvmbackup tool will take care of it for us.
Now we want to move our mysql installation into this folder:
$ sudo mkdir -p /data2/var/lib $ sudo mv /var/lib/mysql/ /data2/var/lib/
We need to ensure the my.cnf has settings pointing to this new directory:
[mysqld] datadir=/data2/var/lib/mysql socket=/data2/var/lib/mysql/mysql.sock log-bin = /data2/var/lib/mysql/bin.log [mysql.server] user=mysql basedir=/data2/var/lib [client] socket=/data2/var/lib/mysql/mysql.sock
We can keep the log and pid outside of the LVM if we want:
[mysqld_safe] log-error=/var/log/mysqld.log pid-file=/var/run/mysqld/mysqld.pid
Now you can start mysqld:
sudo /usr/bin/mysqld_safe &
Grab the latest mylvmbackup utility:
$ wget http://www.lenzg.org/mylvmbackup/mylvmbackup-current.tar.gz
Extract and install, its just a script so there is no compilation necessary:
$ tar xzvf mylvmbackup-0.11.tar.gz $ cd mylvmbackup-0.11 $ sudo make install
You’ll also need some perl modules:
*Config::IniFiles
*DBI
*DBD::mysql
*Sys::Syslog
*Date::Format
Now we need to setup the config file for mylvmbackup, located at /etc/mylvmbackup.conf
There are lots of options but at the very least we need to set:
[mysql]
user=root
password={your root pass here}
socket=/data2/var/lib/mysql/mysql.sock
[lvm]
vgname=mysql
lvname=maindb
lvsize=190G
[misc]
innodb_recover=1
[tools]
lvcreate=/usr/sbin/lvcreate
lvremove=/usr/sbin/lvremove
lvs=/usr/sbin/lvs
You may also want to change the backupdir parameter, as this is where the tarball of your database backup is stored, so you’ll need the space for that.
You’ll want to make sure this backup directory exists, so if you are using the default:
$ sudo mkdir -p /var/tmp/mylvmbackup/backup
And also the mount directory (this is where the LVM snapshot is mounted so a mysqldump can be taken:
$ sudo mkdir -p /var/tmp/mylvmbackup/mnt
Now you can simply run the mylvmbackup command as root. Heres what the output looks like:
$ sudo mylvmbackup 20090122 13:26:47 Info: Connecting to database... 20090122 13:26:47 Info: Flushing tables with read lock... 20090122 13:26:47 Info: Taking position record... 20090122 13:26:47 Info: Running: /usr/sbin/lvcreate -s --size=190G --name=maindb_snapshot /dev/mysql/maindb File descriptor 3 (socket:[101615]) leaked on lvcreate invocation. Parent PID 7209: /usr/bin/perl Logical volume "maindb_snapshot" created 20090122 13:26:47 Info: DONE: taking LVM snapshot 20090122 13:26:47 Info: Unlocking tables... 20090122 13:26:47 Info: Disconnecting from database... 20090122 13:26:47 Info: Mounting snapshot... 20090122 13:26:47 Info: Running: /bin/mount -o rw /dev/mysql/maindb_snapshot /var/tmp/mylvmbackup/mnt/backup 20090122 13:26:47 Info: DONE: mount snapshot 20090122 13:26:47 Info: Running: /bin/mount -o bind,ro /tmp/mylvmbackup-backup-20090122_132647_mysql-9zMEJA/pos /var/tmp/mylvmbackup/mnt/backup-pos 20090122 13:26:47 Info: DONE: bind-mount position directory 20090122 13:26:47 Info: Recovering InnoDB... 20090122 13:26:47 Info: Running: echo 'select 1;' | mysqld_safe --socket=/tmp/mylvmbackup.sock --pid-file=/var/tmp/mylvmbackup_recoverserver.pid --datadir=/var/tmp/mylvmbackup/mnt/backup --skip-networking --skip-grant --bootstrap --skip-ndbcluster --skip-slave-start Starting mysqld daemon with databases from /var/tmp/mylvmbackup/mnt/backup STOPPING server from pid file /var/tmp/mylvmbackup_recoverserver.pid 090122 13:26:47 mysqld ended 20090122 13:26:47 Info: DONE: InnoDB recovery on snapshot 20090122 13:26:47 Info: Copying my.cnf... 20090122 13:26:47 Info: Taking actual backup... 20090122 13:26:47 Info: Creating tar archive /var/tmp/mylvmbackup/backup/backup-20090122_132647_mysql.tar.gz 20090122 13:26:47 Info: Running: cd '/var/tmp/mylvmbackup/mnt' ; '/bin/tar' cvf - backup/ backup-pos/backup-20090122_132647_mysql.pos backup-pos/backup-20090122_132647_mysql_my.cnf /bin/gzip --stdout --verbose --best -> /var/tmp/mylvmbackup/backup/backup-20090122_132647_mysql.tar.gz.INCOMPLETE-uqO2w7 backup/ backup/test backup/var/ backup/var/lib/ backup/var/lib/mysql/ backup/var/lib/mysql/bin.index backup/var/lib/mysql/test/ backup/var/lib/mysql/ibdata1 backup/var/lib/mysql/ib_logfile1 /bin/tar: backup/var/lib/mysql/mysql.sock: socket ignored backup/var/lib/mysql/mysql/ backup/var/lib/mysql/mysql/time_zone_transition.MYI backup/var/lib/mysql/mysql/func.frm backup/var/lib/mysql/mysql/procs_priv.MYD backup/var/lib/mysql/mysql/db.MYI backup/var/lib/mysql/mysql/proc.frm backup/var/lib/mysql/mysql/proc.MYD backup/var/lib/mysql/mysql/proc.MYI backup/var/lib/mysql/mysql/time_zone_leap_second.MYI backup/var/lib/mysql/mysql/help_category.MYI backup/var/lib/mysql/mysql/tables_priv.frm backup/var/lib/mysql/mysql/user.MYI backup/var/lib/mysql/mysql/help_relation.MYD backup/var/lib/mysql/mysql/columns_priv.frm backup/var/lib/mysql/mysql/columns_priv.MYI backup/var/lib/mysql/mysql/time_zone_name.frm backup/var/lib/mysql/mysql/help_keyword.MYI backup/var/lib/mysql/mysql/help_relation.MYI backup/var/lib/mysql/mysql/host.frm backup/var/lib/mysql/mysql/user.frm backup/var/lib/mysql/mysql/procs_priv.MYI backup/var/lib/mysql/mysql/help_relation.frm backup/var/lib/mysql/mysql/time_zone_transition_type.MYD backup/var/lib/mysql/mysql/help_keyword.frm backup/var/lib/mysql/mysql/time_zone_transition_type.MYI backup/var/lib/mysql/mysql/time_zone.MYD backup/var/lib/mysql/mysql/tables_priv.MYD backup/var/lib/mysql/mysql/host.MYI backup/var/lib/mysql/mysql/help_category.frm backup/var/lib/mysql/mysql/user.MYD backup/var/lib/mysql/mysql/time_zone_leap_second.MYD backup/var/lib/mysql/mysql/time_zone.MYI backup/var/lib/mysql/mysql/time_zone_name.MYD backup/var/lib/mysql/mysql/time_zone_transition.MYD backup/var/lib/mysql/mysql/tables_priv.MYI backup/var/lib/mysql/mysql/columns_priv.MYD backup/var/lib/mysql/mysql/db.MYD backup/var/lib/mysql/mysql/time_zone_transition_type.frm backup/var/lib/mysql/mysql/help_topic.MYI backup/var/lib/mysql/mysql/procs_priv.frm backup/var/lib/mysql/mysql/time_zone_name.MYI backup/var/lib/mysql/mysql/func.MYD backup/var/lib/mysql/mysql/time_zone.frm backup/var/lib/mysql/mysql/help_topic.frm backup/var/lib/mysql/mysql/time_zone_transition.frm backup/var/lib/mysql/mysql/func.MYI backup/var/lib/mysql/mysql/help_keyword.MYD backup/var/lib/mysql/mysql/help_category.MYD backup/var/lib/mysql/mysql/db.frm backup/var/lib/mysql/mysql/time_zone_leap_second.frm backup/var/lib/mysql/mysql/help_topic.MYD backup/var/lib/mysql/mysql/host.MYD backup/var/lib/mysql/jgtest/ backup/var/lib/mysql/jgtest/test2.MYD backup/var/lib/mysql/jgtest/test2.frm backup/var/lib/mysql/jgtest/db.opt backup/var/lib/mysql/jgtest/test2.MYI backup/var/lib/mysql/jgtest/test.frm backup/var/lib/mysql/bin.000001 backup/var/lib/mysql/ib_logfile0 backup/lost+found/ backup-pos/backup-20090122_132647_mysql.pos backup-pos/backup-20090122_132647_mysql_my.cnf 99.2% 20090122 13:26:48 Info: DONE: create tar archive 20090122 13:26:48 Info: Cleaning up... 20090122 13:26:48 Info: LVM Usage stats: 20090122 13:26:48 Info:LV VG Attr LSize Origin Snap% Move Log Copy% Convert 20090122 13:26:48 Info: maindb_snapshot mysql swi-a- 190.00G maindb 0.00 Logical volume "maindb_snapshot" successfully removed
And now you should have a tarball in your backup directory, with all the files needed to restore:
$ ls /var/tmp/mylvmbackup/backup backup-20090122_132647_mysql.tar.gz