We love designing and developing websites, but what really drives us is solving problems and cultivating strong relationships with our clients
Automatically backup mysql database to Amazon S3 using django (Python) script
By : rama
Many of us use Amazon S3 to backup their mysql databases.As we do it so often we thought of writing an automated script which does this task of taking the backup of a mysql database and then moving it to the Amazon S3 . Below is the django script .
1: def upload_db_to_Amazons3():
2: """ function which uploadd mysql database to amazon s3"""
3:
4: AWS_ACCESS_KEY_ID = settings.AWS_ACCESS_KEY_ID
5: AWS_SECRET_ACCESS_KEY = settings.AWS_SECRET_ACCESS_KEY
6: BUCKET_NAME = settings.BUCKET_NAME
7: MYSQL_DUMP_PATH = settings.MYSQL_DUMP_PATH
8: DATABASE_NAME = settings.DATABASE_NAME
9: ARCHIVE_NAME = settings.ARCHIVE_NAME
10: .....
11:
12: proc1 = subprocess.Popen(settings.MYSQL_DUMP_PATH+" --no-data -u root -pwelcome -x --databases doloto",shell=True,stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
13: #create data file
14: proc3 = subprocess.Popen(settings.MYSQL_DUMP_PATH+" --no-create-info -u root -pwelcome -x --databases doloto",shell=True,stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
15: ...
16:
17: t1.write(proc1.communicate()[0])
18: t2.write(proc3.communicate()[0])
19:
20: #create tar.gz for the above two files
21: tar = tarfile.open( (os.path.join(os.curdir, settings.ARCHIVE_NAME+"_archive.tar.gz")), "w|gz")
22: tar.add(t1.name,ARCHIVE_NAME+"_struct.sql")
23: tar.add(t2.name,ARCHIVE_NAME+"_data.sql")
24: ........
25: #upload file to Amazon S3
26: tardata = open(os.path.join(os.curdir, settings.ARCHIVE_NAME+"_archive.tar.gz") , "rb").read()
27: response = conn.put(BUCKET_NAME,settings.ARCHIVE_NAME+"_archive.tar.gz",S3.S3Object(tardata))
28:
29: upload_db_to_Amazons3()
Download the script by using the below link
1) http://uswaretech.com/blog/wp-content/uploads/2009/02/mysql_to_amazons3.zip
Things to do before using the script:
1) You need to put down Amazon S3 library for python in the sys.path .you can download the python S3 library by following the below link.(http://developer.amazonwebservices.com/connect/entry.jspa?externalID=134)
2)For a given project keep the script mysql_to_amazons3.py at the same level as that of settings.py.
3) Define the following variables in settings.py
1: #MySQL information
2: #MYSQL_DUMP_PATH should be the path to the mysqldump executable file
3: #To know where mysqldump executable is present in your local system
4: #use the command "which mysqldump".
5: #database name, user_name and password will be taken from settings.DATABASE_NAME,DATABASE_USER etc...
6: MYSQL_DUMP_PATH = "/usr/bin/mysqldump"
7:
8:
9: #Amazon S3 credentials
10: #Bucket_Name The name of the bucket in Amazon S3
11: AWS_ACCESS_KEY_ID = ''
12: AWS_SECRET_ACCESS_KEY = ''
13: BUCKET_NAME = 'tempstore'
14:
15: #Archive name with out any extension (i.e):
16: #what name do you want for the file which is uploaded to AMazon S3
17: #please give the name without any extension
18: #Also note that a copy of this file will be stored in the directory where the script resides
19: ARCHIVE_NAME = ""
a.MYSQL_DUMP_PATH= “/usr/bin/mysqldumpâ€
mysqldump is the command using which we take database backup.Here you should mention path to this mysqldump executable.Read the comments above that variable for more details.
b.AWS_ACCESS_KEY_ID = ''
Amazon access key which amazon will provide after signing up for their webservices.
c.AWS_SECRET_ACCESS_KEY = ''
Amazon secret key which amazon will provide after signing up for their webservices.
d.BUCKET_NAME = 'tempstore'
Bucket name to which to which you want to store the database archive.More details about what a bucket means at http://tinyurl.com/5nlrkz
e.ARCHIVE_NAME = " "
Name of archive file.What name do you want to give to this archive?.
f.Please note that the script will take the database name, database user,database password from already existing settings variables DATABASE_NAME , DATABASE_USER and DATABASE_PASSWORD
Running the script:
1) python mysql_to_amazons3.py to upload the database archive of settings.DATABASE_NAME to Amazon S3 into the respective bucket.
2) It also places the archive in the directory where the script resides for your convenience.
3)The archive contains two files ***_data.sql which contains database data, ***_struct.sql which contains database structure.We intentionally separate the data and structure as it is a good practice.
4) if you want to run this script as a cron job use crontab –e for setting up the cron job.
How to run this script in python?
if you want to run this script in python then do the following steps
1)Remove the top 4 lines of code in above script.
2) In upload_db_to_Amazons3() function replace all the capital variables with their corresponding values.
3)Run the script.
How does the script work?
1.It uses subprocess.Popen (pipe open) functions to first execute the commands (mysqldump - - - ).
2.It then reads the output from the pipes to create a .tar.gz archive.
3.It then uses Amazon S3 web services API to upload the above file to Amazon S3.
Do you have any problems with the script?
if you have any problems with the script or any other feedback please let me know in comments.
Resources:
2)http://developer.amazonwebservices.com/connect/entry.jspa?externalID=134
3)http://www.holovaty.com/code/update_s3.py
4)http://www.djangosnippets.org/snippets/580/
Comments
Great script. But I had to fix it up a little bit. On my system (CentOS 5.2) the backups ended up missing a few kilobytes at the end of the files. Looks like that tar doesn't wail for mysqldump to finish. I resolved this by adding t1.flush() and t2.flush() between the lines 18 and 20
From amazon s3 document,the object have a 5GB size limited,if we use this tool to backup to S3,Is that a problem ?
Reactions
Personnaly, I just have PhpMyAdmin do the backup for me.
This comment was originally posted on Noupe
This is what I use – Perl script to backup mysql databases
This comment was originally posted on Noupe
Nice list list of tips. Number 6 and #7 is handy.
This comment was originally posted on Noupe
Thank you noupe!! Nice list of resources to be used in different applications.
This comment was originally posted on Noupe
This list is incomplete imo, i’d reather see examples of using diff for db backups. This doesnt scale at all.
This comment was originally posted on Noupe
Dump a remote db locally in one easy line:
ssh user@server “/usr/bin/mysqldump -u user -p password database_name” | dd of=/where/you/want/the/dump.sql
This comment was originally posted on Noupe
Not a single one of these options will scale past a small to moderately sized database.
This comment was originally posted on Noupe
Currently I just use #11 but I’m going to look into #1 and #8.
Thanks for reminding me that I need to figure out a nice way to do this.
This comment was originally posted on Noupe
Currently I use zmanda mysql backup. I use the comunity version
This comment was originally posted on Noupe
We use version control to backup our database, saving us from having to rotate the dumps. This means we can go back in time as much as we like.
We use Git right now, but any SCM would do actually.
You do need to initialize a Git repository at /path/to/dbbackups first. Then add a cron tab that calls the backup script.
This comment was originally posted on Noupe
if you’re stuck using mySQL…. sorry
This comment was originally posted on Noupe
Wow, I was kind of expecting more than just mysqldump, mysqldump, and mysqldump. Has any tried replicating the Mysql DB as a backup solution?
This comment was originally posted on Noupe
brilliant. Thanks for a useful list
@john woah, what do you have against MySQL?
This comment was originally posted on Noupe
In many circumstances you would want to force the MYSQL dump to continue past any error with ‘-f’ so that you get the full database even if it contains oddness. Otherwise there is a chance it will crash out and give you only a fraction of your data.
This comment was originally posted on Noupe
My MySql databases are on a hosting provider, here’s the tricks I use.
1) Via ssh, connect to host, issue mysqldump or your dump script(s), have them dump to a non-quota’d location, like /tmp, or /dev/shm (if there’s enough ram on the remote box). My scripts use:
-a -Q -q -l –add-drop-table –add-locks –complete-insert -uusername -ppassword database_name > database_name.sql
2) Use rsync over ssh to pull down the dump(s) to the local machine. If you already have an older version of the dump(s), rsync will only transfer what it needs to update your copy. Use compression, too (-z) if you like. Remove these temporary dumps from the remote server.
3) Now that your local copy of the dump(s) are up to date, here’s the real magic. Why keep multiple copies of backups? Similar to rsync, what you want to do is only store the delta of these backups since the last copy. For this, use the simple old code revisioning utility, RCS. Check in your latest backup using the ‘ci’. Use the -m option to automatically add a comment to this revision’s check in, so that way it doesn’t prompt you for it. RCS will now diff the current backup against the previous version, and then record just the diff info, or the delta.
Works great. Just remember to retire older revisions to keep it from growing out of hand some day.
What’s the benefits of this?
Using RCS’s ‘co’ command, I can check out any revision still kept in the repository. RCS uses all the diffs to re-construct the backup that was checked in at that specific revision. So I can jump back 30 days if I need to find a table’s previously known good condition.
This comment was originally posted on Noupe
this is some good information. Thanks
This comment was originally posted on Noupe
Wow, I was kind of expecting more than just mysqldump, mysqldump, and mysqldump. Has any tried replicating the Mysql DB as a backup solution??
This comment was originally posted on Noupe
@Sarah
Replication is a HA solution, not a backup solution.
a “delete * from tablename” will wipe out your “backup” too.
This comment was originally posted on Noupe
Thanks! You’re saving my life again
This comment was originally posted on Noupe
I am still rookie in working with databases, very useful.
This comment was originally posted on Noupe
actually i scacely use Mysql except in CMS,but CMS sets mySQL already ,
This comment was originally posted on Noupe
Nice one. Web developers and admin can use these to backup their online database.
This comment was originally posted on Noupe
- How to use pep8.py to write better Django code
- Screencast: Django Tutorial Part 1
- How and why to use pyflakes to write better Python
- Getting started with South for Django DB migrations
- A brief overview of Vagrant
- Writing jQuery plugins using Coffeescript
- Behind the Scenes: Request to Response
- Using SQLite Database with Android
- Haml for Django developers
- Coffeescript for Python programmers
- rails
- django
- linkroundup
- django opinion
- opinion
- business
- API
- appengine
- python
- satire
- startup
- Uncategorized
- marketing
- personal
- rambling
- search
- interviews
- seo-interviews
- 5startupideas
- ideas
- seo
- tips
- forms
- paypal
- utilities
- datetime
- web2.0
- Amazon
- algorithms
- presentations
- products
- pinax
- satchmo
- ecommerce
- microsoft
- yahoo
- book
- tutorial
- models
- aggreagtion
- meta
- India
- apps
- about
- CSS
- Design
- wordpress
- test slug
- vim
- urls
- reviews
- javascript
- xmpp
- emacs
- Typography
- Grid Theory
- Color Theory
- iphone
- android
- titanium
- mobile applications
- CSS3
- Browser Compatibility
- mobile
- jobs
- lamson
- django setup
- files
- upload
- jsTree
- hierarchical view
- web page
- Treeview
- coffeescript
- request
- response
- South
- django south
- django migration
- --fake
- screencasts
- February 2012
- January 2012
- December 2011
- October 2011
- September 2011
- July 2011
- June 2011
- April 2011
- February 2011
- January 2011
- December 2010
- November 2010
- October 2010
- September 2010
- June 2010
- April 2010
- March 2010
- January 2010
- December 2009
- November 2009
- October 2009
- September 2009
- August 2009
- July 2009
- June 2009
- April 2009
- March 2009
- February 2009
- November 2008
- October 2008
- June 2008
- May 2008
- April 2008
If your Django webservers are running on EC2, you can use EBS snapshots for mySQL backups