Reclaim Your Hard Drive - Saving Tons of Space with MySQL InnoDB tables

Drupal 7 uses InnoDB tables. InnoDB provides many benefits, but can cause some unexpected headaches. One headache for me is that, by default, MySQL tells InnoDB to create one file on your system, called ibdata1, to hold ALL the data from EVERY InnoDB table you have on your MySQL server. This file never shrinks in size; it only expands to contain new data. If you delete something from MySQL or drop a table, the space that table was using is reallocated for other new data. This isn't a bad thing, especially for those who have a lot of drive space, and not many databases that are altered or dropped quite frequently.

I develop a lot of sites on my little MacBook Air (with a 128GB SSD), so I often download database snapshots from live and testing environments, empty out the tables on my local environment, then import the database dumps. Can you spot the problem here?

Using Daisy Disk I just noticed that my ibdata1 file had grown to more than 10 GB, and my Air's drive only had about 5 GB free space!

So, after reading through MySQL's InnoDB Engine documentation and this answer on Stack Overflow, I found that it's not too hard to change MySQL to keep data tables in their own files, and delete the files after the tables are deleted (thus saving me a ton of space). It just takes a little time and annoyance.

Here's how to do it, roughly:

  1. Export/dump all your databases. (In my case, I didn't do this, since I could just grab them all from production or development servers.) If you have a good backup and restoration system in place, you shouldn't need to fret too much about this part, but if you don't, you'll probably need to spend a bit of time dumping each database or writing a script to do this for you.
  2. Drop (delete) all databases, except for the mysql database, and information_schema, if it exists.
  3. Shut down MySQL.
  4. Delete the ibdata1 file and any ib_logfile log files (I just had ib_logfile0 and ib_logfile1).
  5. Add innodb_file_per_table under the [mysqld] heading in your my.cnf file.
  6. Start MySQL.
  7. Import all your databases.

After doing this, my 'mysql' directory with all my databases only took up about 3 GB (there are a few large databases I regularly work with... but +/-3 GB is a lot less painful than 10+ GB!

I also took this opportunity to flush out some other testing databases that I had on my local computer for Drupal 4.7 (really!), 5, 6, 7 and 8 testing. It's easy enough to create a new database when the need arises, and with drush, it's easier than ever to create and sync databases and files for my Drupal sites.

On most of the production servers I manage, I don't worry about setting innodb_file_per_table, because there are often only one, two or three databases, and they aren't constantly changing like on my local computer—they only grow over time, so the ever-increasing size of the ibdata1 file isn't concerning to me.

Comments

You don't actually need to dump the databases, you could convert them all to myisam format first (Drupal doesn't do foreign keys, so that's fine for Drupal data) using the mysql_convert_table_format utility, then make the file_per_table config change, stop MySQL, remove iblogfile* and ibdata* and start MySQL again.

You may want to also tweak the OS and MySQL settings for open file cache, so you don't get a performance hit for suddenly opening a lot more actual database files.

Also, MySQL *still* put some data in the ibdata file even if you use file_per_table, and you must not *ever* do backups by simply copying the table files. MyISAM lets you get away with that, InnoDB will not.

Some very good points, especially concerning backups; I only use full database dumps because I don't want to risk screwing up something with MySQL's internal file layout.

Are there any performance concerns when switching from single file to multi-file? Does that cause more read/writes to the drive?

It shouldn't be a concern for most people, and the space savings are probably worth it for most people. However, if all the InnoDB tables on your system are write-heavy, don't have many deletes, and are fairly busy, it may help the system perform slightly more efficiently to have one file.

Using innodb_file_per_table and setting innodb_flush_method to O_DIRECT will usually speed up the DB. I haven't found any downsides to running with innodb_file_per_table
http://www.dbasquare.com/kb/mysql-configuration-innodb_file_per_table/