Copy a folder in Amazon S3 using the C# API

I was looking for examples of how to copy a folder to another location using the Amazon S3 API for C#.  I couldn't find anything, so I wrote my own.  First the code, then the explanation.

This code takes two S3 paths and copies all files from the source to the destination.  An example would be:

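CopyFolder("/my-bucket/thing1/i-am-source",
           "/my-bucket/thing2/i-am-destination");

And here is the method itself:

// Needs: using System.Text; using Amazon; using Amazon.S3; using Amazon.S3.Model;
// accessKeyID and secretAccessKeyID are fields on the containing class.
public bool CopyFolder(string source, string destination)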
{
    var client = AWSClientFactory.CreateAmazonS3Client(accessKeyID, secretAccessKeyID);

    var strippedSource = source;
    var strippedDestination = destination;

    // process source: drop one leading and one trailing slash
    if (strippedSource.StartsWith("/"))
        strippedSource = strippedSource.Substring(1);
    if (strippedSource.EndsWith("/"))
        strippedSource = strippedSource.Substring(0, strippedSource.Length - 1);

    var sourceParts = strippedSource.Split('/');
    var sourceBucket = sourceParts[0];

    var sourcePrefix = new StringBuilder();
    for (var i = 1; i < sourceParts.Length; i++)
    {
        sourcePrefix.Append(sourceParts[i]);
        sourcePrefix.Append("/");
    }

    // process destination: drop one leading and one trailing slash
    if (strippedDestination.StartsWith("/"))
        strippedDestination = strippedDestination.Substring(1);
    if (strippedDestination.EndsWith("/"))
        strippedDestination = strippedDestination.Substring(0, strippedDestination.Length - 1);

    var destinationParts = strippedDestination.Split('/');
    var destinationBucket = destinationParts[0];

    var destinationPrefix = new StringBuilder();
    for (var i = 1; i < destinationParts.Length; i++)
    {
        destinationPrefix.Append(destinationParts[i]);
        destinationPrefix.Append("/");
    }

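    // list the "files" and "folders" directly under the source prefix
    // (a single ListObjects call returns at most 1,000 keys; paging is not handled here)
    var listObjectsResult = client.ListObjects(new ListObjectsRequest()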
        .WithBucketName(sourceBucket)
        .WithPrefix(sourcePrefix.ToString())
        .WithDelimiter("/"));

    // copy each file
    foreach (var file in listObjectsResult.S3Objects)
    {
        var request = new CopyObjectRequest();
        request.SourceBucket = file.BucketName;
        request.SourceKey = file.Key;
        request.DestinationBucket = destinationBucket;
        request.DestinationKey = destinationPrefix + file.Key.Substring(sourcePrefix.Length);
        // makes every copied object publicly readable; change or remove to suit
        request.CannedACL = S3CannedACL.PublicRead;
        var response = (CopyObjectResponse)client.CopyObject(request);
    }

    // copy subfolders
    foreach (var folder in listObjectsResult.CommonPrefixes)
    {
        var actualFolder = folder.Substring(sourcePrefix.Length);
        actualFolder = actualFolder.Substring(0, actualFolder.Length - 1);
        CopyFolder(strippedSource + "/" + actualFolder, strippedDestination + "/" + actualFolder);
    }

    return true;
}

How S3 Stores Objects

So first, a bit on S3. S3 isn't actually a file system; it is an object store. Each object in the store has a key. You can use this key to mimic a traditional file path by giving it a file system-like structure. When browsing a "directory" in S3, you're really just browsing objects whose keys have /'s in them. Don't believe me? Here's the Amazon S3 FAQ on the subject:

How do I mimic a typical file system on Amazon S3?

You can mimic a file system hierarchy by using the Prefix and Delimiter parameters when you list a bucket. When you store your objects, create key names that correspond to a typical file system path.

For Example: the bucket "my-application" could include the following keys:

  • john/settings/conf.txt
  • jane/settings/conf.txt

Understanding this, we can look at a typical S3 "folder" and break it down into a bucket/key combination:

Full Path:   my-bucket/thing1/i-am-source
Bucket:      my-bucket
Key:         thing1/i-am-source

The key represents the entire file path. When mapping a key to a file system, the key has a "prefix" that represents the folder structure above our target folder.  For example, i-am-source resides inside thing1, meaning:

Prefix:    thing1/
Folder:   i-am-source
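
The breakdown above can be sketched in a few lines of plain C#. This is only an illustration and is not part of the CopyFolder method; S3PathSketch, SplitBucketAndKey and SplitPrefixAndFolder are names made up for this example, and it assumes a compiler with tuple support (C# 7 or later). Unlike CopyFolder, it trims any number of leading or trailing slashes rather than just one.

```csharp
using System;

public static class S3PathSketch
{
    // Splits "/my-bucket/thing1/i-am-source/" into a bucket name and a key.
    // Leading and trailing slashes are ignored.
    public static (string Bucket, string Key) SplitBucketAndKey(string path)
    {
        var trimmed = path.Trim('/');
        var slash = trimmed.IndexOf('/');
        if (slash < 0)
            return (trimmed, string.Empty);   // bucket only, no key
        return (trimmed.Substring(0, slash), trimmed.Substring(slash + 1));
    }

    // Splits a "folder" key such as "thing1/i-am-source" into the parent
    // prefix ("thing1/") and the folder name ("i-am-source").
    public static (string Prefix, string Folder) SplitPrefixAndFolder(string key)
    {
        var slash = key.LastIndexOf('/');
        if (slash < 0)
            return (string.Empty, key);       // folder sits at the bucket root
        return (key.Substring(0, slash + 1), key.Substring(slash + 1));
    }
}
```

Calling SplitBucketAndKey("/my-bucket/thing1/i-am-source") gives the bucket my-bucket and the key thing1/i-am-source, and passing that key to SplitPrefixAndFolder gives the prefix thing1/ and the folder i-am-source.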

Performing Folder Copy

My code copies the files in the source path to the destination path, then recursively traverses the "subdirectories" to complete the copy.  The first part of the code breaks the full path into bucket and prefix parts for both the source and the destination.  Getting this portion right took the most effort.  Once the bucket and prefix were determined, I was able to use the ListObjects API to retrieve the "files" and "folders" in the current source path.

S3's ListObjects API, as described in the List Objects API documentation, queries a bucket and returns the S3Objects as well as the CommonPrefixes.  You can specify a prefix to search as well as a delimiter.  The delimiter "causes keys that contain the same string between the prefix and the first occurrence of the delimiter to be rolled up into a single result element in the CommonPrefixes collection." In short, CommonPrefixes are the "directories" of the current folder.
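
To make the roll-up concrete, here is a rough in-memory imitation of what the prefix and delimiter do. It does not call S3 at all and is not how the service is implemented; ListingSketch and its List method are hypothetical names for this sketch, it assumes a non-empty delimiter, and it needs tuple support (C# 7 or later).

```csharp
using System;
using System.Collections.Generic;

public static class ListingSketch
{
    // In-memory imitation of a prefix/delimiter listing. This is not the S3 API,
    // only an illustration of how keys are grouped.
    public static (List<string> Objects, List<string> CommonPrefixes) List(
        IEnumerable<string> keys, string prefix, string delimiter)
    {
        var objects = new List<string>();
        var commonPrefixes = new List<string>();

        foreach (var key in keys)
        {
            // Keys outside the prefix are not part of the listing.
            if (!key.StartsWith(prefix, StringComparison.Ordinal))
                continue;

            // Look for the delimiter in the part of the key after the prefix.
            var index = key.IndexOf(delimiter, prefix.Length, StringComparison.Ordinal);
            if (index < 0)
            {
                objects.Add(key);   // no delimiter left: a "file" in this "folder"
                continue;
            }

            // Roll everything up to and including the delimiter into one entry.
            var rolledUp = key.Substring(0, index + delimiter.Length);
            if (!commonPrefixes.Contains(rolledUp))
                commonPrefixes.Add(rolledUp);
        }

        return (objects, commonPrefixes);
    }
}
```

Given the made-up keys thing1/i-am-source/index.html, thing1/i-am-source/pics/a.jpg and thing1/i-am-source/videos/c.mp4, calling List with the prefix "thing1/i-am-source/" and the delimiter "/" puts the first key in Objects and rolls the other two up into thing1/i-am-source/pics/ and thing1/i-am-source/videos/.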

Say we have the following:

  • /my-bucket/thing1/i-am-source/index.html
  • /my-bucket/thing1/i-am-source/pics
  • /my-bucket/thing1/i-am-source/videos

and we make this call (note the trailing slash on the prefix, which is what CopyFolder builds):

var listObjectsResult = client.ListObjects(new ListObjectsRequest()
 .WithBucketName("my-bucket")
 .WithPrefix("thing1/i-am-source/")
 .WithDelimiter("/"));

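The call would result in the following (shown relative to the prefix; the API returns full keys, e.g. thing1/i-am-source/pics/):

S3Objects
index.html

CommonPrefixes
pics/
videos/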

The code copies the object to the destination path and then calls CopyFolder once for each of the two subdirectories.
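
The two string manipulations behind that step, working out where a copied object lands and which subfolder name to recurse into, can be pulled out into small helpers. This is a sketch of the same logic CopyFolder uses inline, not a replacement for it; KeyMappingSketch, ToDestinationKey and SubfolderName are hypothetical names, and both prefixes are assumed to end with "/".

```csharp
using System;

public static class KeyMappingSketch
{
    // Rewrites a source key so that it sits under the destination prefix.
    // Both prefixes are expected to end with "/".
    public static string ToDestinationKey(
        string sourceKey, string sourcePrefix, string destinationPrefix)
    {
        if (!sourceKey.StartsWith(sourcePrefix, StringComparison.Ordinal))
            throw new ArgumentException("Key is not under the source prefix.", nameof(sourceKey));

        // Swap the source prefix for the destination prefix, keep the rest.
        return destinationPrefix + sourceKey.Substring(sourcePrefix.Length);
    }

    // Turns a common prefix such as "thing1/i-am-source/pics/" into the
    // subfolder name "pics" used for the recursive call.
    public static string SubfolderName(string commonPrefix, string sourcePrefix)
    {
        return commonPrefix.Substring(sourcePrefix.Length).TrimEnd('/');
    }
}
```

With the source prefix thing1/i-am-source/ and the destination prefix thing2/i-am-destination/, the key thing1/i-am-source/index.html maps to thing2/i-am-destination/index.html, and the common prefix thing1/i-am-source/pics/ yields the subfolder name pics.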

That's basically all there is to it. There may be better ways of doing this, perhaps even batch operations.  For what it's worth, I couldn't find any code examples of doing "folder copy" using the API.  So feel free to leave comments or links if there are better ways.
