Howdy, guys. I wanted to make a quick thread to educate everyone on the importance of having regular, steady backups available for your server.
So let's say you've got a wiki on a server, and everything's going great. You've got a decent userbase and you've built up a nice amount of content. Then something happens. Maybe the database crashes for some reason. Whatever the reason for it, you've now lost your database and can't recover it.
This is where the prepared webmaster will clear the bugged database and recreate it from a backup.
This is where the unprepared webmaster will spend hours in vain trying to recover data until finally giving up and having to admit to his users that the data is gone.
Not sure about you guys, but I like to be the former. It's very important to maintain a system of backups, and I'll be going over some tips about creating a system for making regular backups. Note that you need shell access to your server for this. If you don't have it, you can usually get it enabled by your hosting company if you open a support request.
1. Create a script to back up your database regularly.
Your first step should be to write a script which can run as a cron job to back up the DB. If you're too lazy to do this, send me an email. I've created a modular script which you can set up with your site's configuration that will provide you with a stable backup system.
Things to consider in your script:
A) Your script should not be in your web directory. To run automatically, it needs access to the password for your MySQL user, and you don't want to publicize that information.
B) Consider keeping multiple backups, in case an underlying issue occurs but remains undetected for a little while. My recommendation is as follows:
Low-traffic ( < 20 edits per day ): Back up at least twice a week, keeping two weeks' worth of backups.
Mid-traffic ( 20-100 edits per day ): Back up 3-4 times a week, and keep at least 5 backups at one time.
High-traffic ( > 100 edits per day ): Back up every day, and keep a full week's worth of backups.
Naturally, the more edits you have per day, the more often you want to back up the DB. The goal is to keep the gap between your newest backup and your live data as small as possible.
C) Be sure to lock your database before backing it up. If you open LocalSettings.php, you can put the database into lock mode automatically by setting $wgReadOnly. You should set it with a message to the users explaining that you are performing maintenance and that the wiki is temporarily read-only. Here's a sample message:
$wgReadOnly = "The database is currently locked from editing while we perform routine maintenance. Check back in about fifteen minutes and it should be unlocked again. We apologize for the inconvenience.";
D) Plan a solid location for your backups. If possible, try to store your backups on a remote server, or on a separate drive from your database. It's very counterproductive to keep your DB and backups in the same location on the same drive. If there's a filesystem failure, you lose the backups, too! I'M WORKING WITH SOMEONE WHO HAS THIS PROBLEM; DO NOT REPEAT HIS MISTAKE!
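To make points A, B and D a bit more concrete, here's a rough sketch of what such a script could look like. This is not my modular script, just a minimal illustration. The database name, directory, retention count, REMOTE target and the --run switch are all placeholders I made up for the example. I've also assumed the MySQL credentials live in a ~/.my.cnf file that only you can read (chmod 600) instead of in the script itself. It leaves out the $wgReadOnly locking from point C, so you'd still need to handle that.

```shell
#!/bin/sh
# Sketch of a dump-and-rotate backup script. Every name here is a placeholder.
set -eu

DB_NAME="${DB_NAME:-wikidb}"                    # hypothetical database name
BACKUP_DIR="${BACKUP_DIR:-$HOME/wiki-backups}"  # keep this outside the web directory
KEEP="${KEEP:-7}"                               # how many backups to retain
REMOTE="${REMOTE:-}"                            # optional scp target, e.g. user@host:backups/

# Print a timestamped file name, e.g. wikidb-20240131-0730.sql.gz
backup_name() {
    printf '%s-%s.sql.gz\n' "$1" "$(date -u +%Y%m%d-%H%M)"
}

# Delete everything except the newest $2 backups in directory $1.
# The timestamp format makes alphabetical order match chronological order.
prune_backups() {
    ls -1 "$1"/*.sql.gz 2>/dev/null | sort -r | tail -n +"$(($2 + 1))" |
    while IFS= read -r old; do
        rm -f -- "$old"
    done
}

run_backup() {
    mkdir -p "$BACKUP_DIR"
    chmod 700 "$BACKUP_DIR"
    target="$BACKUP_DIR/$(backup_name "$DB_NAME")"
    # Assumption: mysqldump picks up the user and password from ~/.my.cnf
    mysqldump "$DB_NAME" > "$target.partial"
    gzip -c "$target.partial" > "$target"
    rm -f "$target.partial"
    # Copy the finished backup off this machine if a remote target is set
    if [ -n "$REMOTE" ]; then
        scp -q "$target" "$REMOTE"
    fi
    prune_backups "$BACKUP_DIR" "$KEEP"
}

# Only dump when asked to, so the functions can be tried out safely first.
if [ "${1:-}" = "--run" ]; then
    run_backup
fi
```

The dump is written to a .partial file first so that a failed mysqldump never leaves a half-written file that looks like a real backup, and pruning only happens after a new backup has been written successfully.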
2. Test your script.
Once you've made the script, test it on a development server or a local install. Don't test it on your production server, as you run the risk of a small bug destroying everything.
3. Automate the script.
Once you've made the script, you need to automate it. If you're running a Microsoft IIS server, you'll need to research how to do that for yourself. (Sorry, but I don't advocate or support Windows.)
Linux distros usually allow each user to set their own crontab file via the crontab command. You can type crontab -e to edit the file with your default text editor. If you don't understand how a crontab works, this article goes in depth about how to format a crontab file.
Choose a time when your editing traffic is the lowest. Usually, the optimal time will be 7-10 AM UTC, as that is 2-5 AM EST in America. If the bulk of your traffic doesn't come from the US, identify your heaviest area of traffic, and tailor your database locking to non-peak traffic times for that region.
If you have root access, you can view the per-user crontab files under /var/spool/cron/ (on Debian-based distros, /var/spool/cron/crontabs/), but it is not recommended to edit them directly. Use crontab -e instead. Do not place your job in the system cron (/etc/crontab) unless you know what you're doing: entries there run as whichever user the line names, which is usually root rather than the user who owns the files. If the script runs as root and manipulates any files, root will gain ownership of them.
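If the crontab format is new to you, this little helper shows how the backup frequencies from step 1 might translate into crontab lines. Treat it as a sketch: the 08:00 hour, the choice of weekdays and the script path are assumptions of mine, and cron uses the server's local time, so check your server's timezone before copying the hour.

```shell
#!/bin/sh
# Print a crontab line for a traffic tier ($1) and a command ($2).
# The hour (08:00 server time) and the weekdays are example choices only.
cron_line() {
    tier=$1
    cmd=$2
    case "$tier" in
        low)  days='1,4' ;;      # twice a week: Monday and Thursday
        mid)  days='1,3,5' ;;    # three times a week: Monday, Wednesday, Friday
        high) days='*' ;;        # every day
        *)    echo "unknown tier: $tier" >&2; return 1 ;;
    esac
    # Fields: minute hour day-of-month month day-of-week command
    printf '0 8 * * %s %s\n' "$days" "$cmd"
}

# Example: print a daily entry for a hypothetical script path
cron_line high "$HOME/bin/wiki-backup.sh"
```

Paste the printed line into the editor that crontab -e opens. The command part is a placeholder for however you invoke your own script.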
4. Create your initial backup.
Run the script yourself! But again, don't do it during peak hours. You'll inconvenience your users, and could even cause some to think your site is broken. MediaWiki's read-only message is easy to miss at the top of edit forms, and it provides no indication other than a notice. Users can still edit the form and even click the save button, but the edit will be refused and will not be saved.
Once you have your initial backup, test its integrity and even try to import it into a local installation to ensure that it's working properly.
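A quick first pass at an integrity test could look something like this. It's only a sketch, and it assumes your script writes gzip-compressed mysqldump output. It checks that the archive itself is intact and that the dump ends with the "-- Dump completed" comment that mysqldump writes as its last line by default. The function name is my own invention.

```shell
#!/bin/sh
# Basic sanity check for a gzip-compressed mysqldump file ($1).
check_backup() {
    file=$1
    # gzip -t verifies the compressed data without extracting it
    if ! gzip -t "$file" 2>/dev/null; then
        echo "corrupt archive: $file" >&2
        return 1
    fi
    # A dump that was cut short will be missing mysqldump's closing comment
    if gzip -dc "$file" | tail -n 5 | grep -q '^-- Dump completed'; then
        echo "ok: $file"
    else
        echo "incomplete dump: $file" >&2
        return 1
    fi
}
```

This only tells you the file is complete, not that it restores cleanly. For the real test, import it into a scratch database on your local install, for example with gzip -dc backup.sql.gz | mysql wiki_test (both names are placeholders), and click around the wiki.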
5. Monitor your backups regularly.
Pay attention to your script. It's a good idea to check your mail every time you log in, just to be sure that there are no errors. And just to clarify, I'm talking about the account's mail on the server. Again, I can't help IIS users, but Linux distros usually store their mail in a file located in /var/mail/(username). I always read mine with:
tail -n 100 /var/mail/(username) | less
That will pull the last 100 lines from the mail file, and display them in a pager in which you can scroll up or down with the arrow keys. By default, cron will automatically email the user any output produced by the jobs in their crontab, errors included, which can be used for debugging.
This is good practice in general. After any major updates or package upgrades, test your script to make sure that it's still functional.
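Reading mail only helps if the script fails loudly. As an extra safety net, you could also check that a recent backup actually exists. Here's a minimal sketch, assuming your backups are files ending in .sql.gz in a single directory. The function name and that naming scheme are my assumptions.

```shell
#!/bin/sh
# Succeeds if directory $1 holds a *.sql.gz file modified within the last $2 days.
backup_is_fresh() {
    recent=$(find "$1" -type f -name '*.sql.gz' -mtime -"$2" 2>/dev/null | head -n 1)
    [ -n "$recent" ]
}
```

You could call it from a second cron job, something like backup_is_fresh "$HOME/wiki-backups" 2 || echo "no recent backup found", so that cron mails you when the newest backup is more than two days old. Adjust the number of days to match how often you back up.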
6. Relax.
Now you have nothing to fear. A database corruption can't stop you!
Some of you may be wondering why I took the time to write this out. Well, it's because I've noticed a disturbing trend of webmasters not backing their sites up, which can and WILL lead to trouble in the end. It doesn't matter how long your site runs reliably; it will fail one day. Computers are not perfect. The disaster with WiKirby losing its database should serve as enough of a cautionary tale. In fact, Zelda Wiki lost some data, too, but because I was prepared, we only lost a week's worth (and that was my own fault, anyway, because I failed to take a necessary step after an upgrade).
So remember, people. Be prepared, and don't get caught with your pants down. Never dismiss the importance of backups.