Beware of Your Log Files

A little anecdote today about log files and SSD wear: As you might have noticed, I’ve recently done a lot of disk drive benchmarking. The iostat command is a great tool to check how much data is written to a block device over time, and just because I was curious, I had a look at how much data had been written to the main drive of one of my servers since I rebooted it 4 days ago. When I looked, I got a shocking number: 400 GB! In 4 days! That is impossible, I thought at first, perhaps iostat is giving me wrong numbers. So I had a closer look.

After experimenting a bit with writing a defined amount of data to the disk and measuring the process with iostat, I came to the conclusion that iostat does show the correct amount of data read from and written to a block device. So where were the 100 GB a day coming from?
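For readers who want to cross-check a number like this without iostat, here is a minimal Python sketch that reads the kernel's per-device counters from /proc/diskstats on Linux. It is not what I used for the measurements above. The function names and the device name "sda" are my own assumptions for illustration. The kernel reports the "sectors written" counter in 512-byte units regardless of the drive's physical sector size.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts in 512-byte sectors


def bytes_written(diskstats_text, device):
    """Return the bytes written to `device` since boot.

    `diskstats_text` is the content of /proc/diskstats. The columns are:
    major, minor, name, reads completed, reads merged, sectors read,
    ms spent reading, writes completed, writes merged, sectors written, ...
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9]) * SECTOR_BYTES
    raise ValueError(f"device {device!r} not found")


def report(device="sda", path="/proc/diskstats"):
    """Print the total written to `device` since boot ("sda" is an assumed name)."""
    with open(path) as f:
        total = bytes_written(f.read(), device)
    print(f"{device}: {total / 1e9:.1f} GB written since boot")
```

Calling `report("sda")` twice, once before and once after writing a file of known size, should show a difference close to that size, which is essentially the experiment described above.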

To find out, I went from virtual machine to virtual machine running on that server and quickly discovered which VM was responsible. I then used the Linux find command to see which files on that VM had changed in the last day. Not too many, actually, just a couple of small document files and a log file. However, that log file had a size of around 8 GB and was slowly but constantly growing. In addition, that log file was synchronized via rsync from another VM on another server every two hours. And while it didn’t grow very quickly, rsync, at least in its default mode, rewrites the complete file on the receiving side whenever it receives changes from the other host. In other words, it wrote 8 GB every two hours. In 24 hours, that’s around 100 GB of data written to flash for absolutely nothing. I normally wouldn’t mind too much, but we are talking about 3 TB of data every month written to a flash drive for nothing. True, the flash drive in question is specified to take at least 150 TBW over its lifetime, and current drives can take even more. But writing 3 TB a month or 36 TB a year to the drive for no benefit is a bit much.
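The arithmetic behind these numbers is simple enough to put into a few lines of Python. This is just a back-of-the-envelope sketch. The function name is made up for illustration, and I assume a 30-day month, a 365-day year and 1 TB = 1000 GB.

```python
def write_volume_gb(file_gb, interval_hours):
    """Return (per_day, per_month, per_year) in GB written if a file of
    `file_gb` GB is rewritten completely every `interval_hours` hours."""
    per_day = file_gb * 24 / interval_hours
    per_month = per_day * 30   # assumption: 30-day month
    per_year = per_day * 365   # assumption: 365-day year
    return per_day, per_month, per_year


per_day, per_month, per_year = write_volume_gb(file_gb=8, interval_hours=2)
rated_tbw = 150  # endurance rating of the drive mentioned in the text

print(f"{per_day:.0f} GB per day")             # 96 GB per day
print(f"{per_month / 1000:.1f} TB per month")  # 2.9 TB per month
print(f"{per_year / 1000:.0f} TB per year")    # 35 TB per year
print(f"{per_year / 1000 / rated_tbw:.0%} of the rated TBW per year")
```

The exact results are slightly below the rounded figures in the text, since 8 GB every two hours is 96 GB rather than 100 GB a day.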

I have to admit it is my own fault: the log file only grew out of proportion because I set the log level to debug some time ago to analyze a problem and never bothered to turn debug logging off again. An interesting lesson learnt: Just because rsync tells you it has transported a few MB over the network does not mean it has written the same amount to the local drive. If the files that were modified on the remote host and then synchronized to the local host are much larger, the amount of data written can easily be an order of magnitude higher, or more.
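To catch a forgotten debug log like this earlier, the find step I described above could be turned into a small periodic check. The following Python sketch lists regular files below a directory that were modified within the last day and exceed a size threshold. The function name, the default threshold and the example path are assumptions for illustration, not part of my actual setup.

```python
import os
import stat
import time


def recently_modified(root, max_age_hours=24, min_size_bytes=0):
    """Yield (path, size) for regular files under `root` that were modified
    within the last `max_age_hours` and are at least `min_size_bytes` big."""
    cutoff = time.time() - max_age_hours * 3600
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is not readable, skip it
            if (stat.S_ISREG(st.st_mode)
                    and st.st_mtime >= cutoff
                    and st.st_size >= min_size_bytes):
                yield path, st.st_size
```

For example, `list(recently_modified("/var/log", min_size_bytes=10**9))` would return all files under /var/log that changed in the last 24 hours and are larger than 1 GB. Any such file that is also part of an rsync job is a candidate for the kind of silent rewrite described here.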

One thought on “Beware of Your Log Files”

  1. I try to keep my logs small. But I am using `rsync` a lot. It’s good to keep in mind that it writes the whole file even if only bits have changed, which can be troublesome depending on the media.
    Thanks for the memory refresh.
