Could you create an S3 FTP file backup/transfer solution on top of Amazon's Simple Storage Service, without the normal administration headache?
FTP (File Transfer Protocol) is a fast and convenient way to transfer large files over the Internet. You might, at some point, have configured an FTP server and used block storage, NAS, or a SAN as your backend. But using this kind of storage requires infrastructure support and can cost you a fair amount of both time and money.
Could an S3 FTP solution work better? Since AWS's reliable and competitively priced infrastructure is just sitting there waiting to be used, I was curious to see whether it could give us what we need without the administration headache.
Why S3 FTP?
Amazon S3 is reliable and accessible, that’s why.
- Amazon S3 provides infrastructure that’s “designed for durability of 99.999999999% of objects.”
- Amazon S3 is built to provide “99.99% availability of objects over a given year.”
- You pay for exactly what you need, with no minimum commitments or up-front fees.
- With Amazon S3, there’s no limit to how much data you can store or when you can access it.
NOTE: FTP is not a secure protocol and should not be used to transfer sensitive data. You might consider using the SSH File Transfer Protocol (sometimes called SFTP) for that.
Using S3 FTP: object storage as filesystem
SAN, iSCSI, and local disks are block storage devices. That means block storage volumes are attached directly to a machine running an operating system that drives your filesystem operations. S3, by contrast, is built for object storage. That means interactions occur at the application level via an API or command-line tools: you can't mount S3 directly on your operating system.
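To make the distinction concrete, here's a minimal sketch of application-level access using the AWS CLI (this assumes the CLI is installed and configured; your_s3_bucket_name is a placeholder):
aws s3 cp backup.tar.gz s3://your_s3_bucket_name/backup.tar.gz   # every write is an API call (PUT), not a filesystem call
aws s3 ls s3://your_s3_bucket_name/                              # listing is an API call (LIST), not a directory read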
So to mount S3 on our server and use it as the FTP server's storage backend, we'll need some help. s3fs will let us mount a bucket as a local filesystem with read/write access. On s3fs-mounted file systems, we can simply use cp, mv, ls, and all the other basic Unix file management commands to manage resources, just as we would on locally attached disks. s3fs is a FUSE-based file system; FUSE (Filesystem in Userspace) enables fully functional filesystems to run in a userspace program.
So it seems that we’ve got all the pieces for an S3 FTP solution. How will it actually work?
Installing s3fs
1. Install the packages we’ll need:
yum install gcc libstdc++-devel gcc-c++ fuse fuse-devel curl-devel libxml2-devel mailcap
2. Download and compile s3fs:
wget http://s3fs.googlecode.com/files/s3fs-1.74.tar.gz
tar xvzf s3fs-1.74.tar.gz   # extract the archive before changing into it
cd s3fs-1.74/
sudo ./configure --prefix=/opt
sudo make
sudo make install
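If the build succeeded, you can sanity-check the install before moving on (the /opt/bin path follows from the --prefix=/opt flag above):
/opt/bin/s3fs --version   # should report s3fs 1.74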
3. Configure AWS credentials:
echo AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs
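If you'll eventually mount more than one bucket, s3fs also accepts per-bucket credentials in the same file, one bucket per line. A sketch (the bucket names here are placeholders):
bucket-one:AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY
bucket-two:ANOTHER_ACCESS_KEY_ID:ANOTHER_SECRET_ACCESS_KEY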
4. Mount your S3 Bucket:
sudo mkdir -p /home/ec2/s3bucket
sudo /opt/bin/s3fs your_s3_bucket_name /home/ec2/s3bucket/ -o allow_other
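It's worth confirming that the mount actually took before building anything on top of it. A quick check might look like this (the last command assumes you have the AWS CLI configured):
df -hT /home/ec2/s3bucket                  # should show a fuse/s3fs filesystem
sudo touch /home/ec2/s3bucket/test.txt     # writes through to the bucket
aws s3 ls s3://your_s3_bucket_name/        # test.txt should appear here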
Now copy some files to the /home/ec2/s3bucket directory so you'll have something to upload to your S3 bucket.
5. Install vsftpd and configure it to use the S3-mounted bucket as storage:
sudo yum install vsftpd
# open the configuration file:
sudo vi /etc/vsftpd/vsftpd.conf
Adjust the configuration to read like this (substitute the public IP address of your FTP server for xxx…):
anonymous_enable=NO
local_enable=YES
chroot_local_user=YES
#Passive support
pasv_enable=YES
pasv_min_port=15393
pasv_max_port=15592
# pasv_address must be the public IP address of the FTP server:
pasv_address=xxx.xxx.xxx.xxx
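One operational note: your EC2 instance's security group must allow inbound FTP traffic on both the control port and the passive range configured above, or clients will connect and then hang on transfers. A sketch using the AWS CLI (sg-0123456789abcdef0 is a placeholder for your security group ID):
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 21 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 15393-15592 --cidr 0.0.0.0/0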
6. Add an ftpuser account and set its home directory to /home/ec2/s3bucket/:
sudo useradd -d /home/ec2/s3bucket/ -s /sbin/nologin ftpuser
sudo passwd ftpuser
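Two follow-ups before connecting, both typical of Amazon Linux rather than universal: vsftpd's PAM setup usually rejects accounts whose shell isn't listed in /etc/shells, and the daemon needs a restart to pick up the new configuration:
echo /sbin/nologin | sudo tee -a /etc/shells   # let PAM accept the nologin shell, if it rejects the login otherwise
sudo service vsftpd restart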
Finally, you can connect to the server via FTP using the username and password created earlier. Now whatever files you copy to your s3bucket directory will be automatically uploaded to Amazon S3. Voila! An S3 FTP server!
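As a quick end-to-end check, here's roughly what a session might look like from a client machine with the stock ftp command (the IP address is a placeholder; use your pasv_address value):
ftp xxx.xxx.xxx.xxx
# log in as ftpuser with the password you set
ftp> passive                # enable passive mode if your client doesn't default to it
ftp> put backup.tar.gz      # lands in /home/ec2/s3bucket/, and therefore in S3
ftp> ls
ftp> quit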
If you want to deepen your understanding of how S3 works, this is your go-to course: AWS Storage Fundamentals – Simple Storage Service (S3)