
Migrating Object Storage data with Rclone

Reviewed on 10 May 2021 | Published on 20 March 2019
  • compute
  • rclone
  • object storage

Rclone provides a modern alternative to rsync. It can communicate with any S3-compatible cloud storage provider, as well as other storage platforms, and can be used to migrate data from one bucket to another, even if those buckets are in different regions.

Requirements:

  • A server or local machine with SSH access, from which you will run rclone
  • An API access key and secret key for each Object Storage provider involved
  • Source and destination Object Storage buckets

Installing Rclone

  1. Connect to your server as root via SSH.

  2. Update the APT package cache and upgrade the software already installed on the instance:

    apt update && apt upgrade -y
  3. Download and install Rclone with the following sequence of commands:

    wget https://downloads.rclone.org/rclone-current-linux-amd64.zip
    apt install zip
    unzip rclone-current-linux-amd64.zip
    cd rclone*linux-amd64/
    mv rclone /usr/bin/
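The download URL above targets amd64. On other architectures the archive name differs; the right suffix can be derived from `uname -m` — a minimal sketch (the mapping below is an assumption about rclone's current file-naming scheme):

```shell
#!/bin/sh
# Map the output of `uname -m` to the architecture suffix used in
# rclone's download filenames (assumed naming scheme).
rclone_arch() {
  case "$1" in
    x86_64)  echo "amd64" ;;
    aarch64) echo "arm64" ;;
    armv7l)  echo "arm-v7" ;;
    *)       echo "unknown" ;;
  esac
}

echo "https://downloads.rclone.org/rclone-current-linux-$(rclone_arch "$(uname -m)").zip"
```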

Configuring Rclone

  1. Begin rclone configuration with the following command:

    $> rclone config

    If you do not have any existing remotes, the following output displays:

    2021/01/18 16:03:28 NOTICE: Config file "/root/.config/rclone/rclone.conf" not found - using defaults
    No remotes found - make a new one
    n) New remote
    s) Set configuration password
    q) Quit config

    If you have previously configured rclone you may see a slightly different output. However, that does not affect the following steps.

  2. Type n to make a new remote. You are then prompted to type a name - here we type remote-sw-paris:

    n/s/q> n
    name> remote-sw-paris

    The following output displays:

    Type of storage to configure.
    Enter a string value. Press Enter for the default ("").
    Choose a number from below, or type in your own value
     1 / 1Fichier  \ "fichier"
     2 / Alias for an existing remote  \ "alias"
     3 / Amazon Drive  \ "amazon cloud drive"
     4 / Amazon S3 Compliant Storage Provider (AWS, Alibaba, Ceph, Digital Ocean, Dreamhost, IBM COS, Minio, Tencent COS, etc)  \ "s3"
     5 / Backblaze B2  \ "b2"
     6 / Box  \ "box"
     7 / Cache a remote  \ "cache"
     8 / Citrix Sharefile  \ "sharefile"
     9 / Dropbox  \ "dropbox"
    10 / Encrypt/Decrypt a remote  \ "crypt"
    11 / FTP Connection  \ "ftp"
    12 / Google Cloud Storage (this is not Google Drive)  \ "google cloud storage"
    13 / Google Drive  \ "drive"
    14 / Google Photos  \ "google photos"
    15 / Hubic  \ "hubic"
    16 / In memory object storage system.  \ "memory"
    17 / Jottacloud  \ "jottacloud"
    18 / Koofr  \ "koofr"
    19 / Local Disk  \ "local"
    20 / Mail.ru Cloud  \ "mailru"
    21 / Mega  \ "mega"
    22 / Microsoft Azure Blob Storage  \ "azureblob"
    23 / Microsoft OneDrive  \ "onedrive"
    24 / OpenDrive  \ "opendrive"
    25 / OpenStack Swift (Rackspace Cloud Files, Memset Memstore, OVH)  \ "swift"
    26 / Pcloud  \ "pcloud"
    27 / Put.io  \ "putio"
    28 / QingCloud Object Storage  \ "qingstor"
    29 / SSH/SFTP Connection  \ "sftp"
    30 / Sugarsync  \ "sugarsync"
    31 / Tardigrade Decentralized Cloud Storage  \ "tardigrade"
    32 / Transparently chunk/split large files  \ "chunker"
    33 / Union merges the contents of several upstream fs  \ "union"
    34 / Webdav  \ "webdav"
    35 / Yandex Disk  \ "yandex"
    36 / http Connection  \ "http"
    37 / premiumize.me  \ "premiumizeme"
    38 / seafile  \ "seafile"
    Storage>
  3. Type s3 and hit enter to confirm this storage type. The following output displays:

    Choose your S3 provider.
    Enter a string value. Press Enter for the default ("").
    Choose a number from below, or type in your own value
     1 / Amazon Web Services (AWS) S3  \ "AWS"
     2 / Alibaba Cloud Object Storage System (OSS) formerly Aliyun  \ "Alibaba"
     3 / Ceph Object Storage  \ "Ceph"
     4 / Digital Ocean Spaces  \ "DigitalOcean"
     5 / Dreamhost DreamObjects  \ "Dreamhost"
     6 / IBM COS S3  \ "IBMCOS"
     7 / Minio Object Storage  \ "Minio"
     8 / Netease Object Storage (NOS)  \ "Netease"
     9 / Scaleway Object Storage  \ "Scaleway"
    10 / StackPath Object Storage  \ "StackPath"
    11 / Tencent Cloud Object Storage (COS)  \ "TencentCOS"
    12 / Wasabi Object Storage  \ "Wasabi"
    13 / Any other S3 compatible provider  \ "Other"
    provider>
  4. Type Scaleway and hit enter to confirm this S3 provider. The following output displays:

    Get AWS credentials from runtime (environment variables or EC2/ECS meta data if no env vars).
    Only applies if access_key_id and secret_access_key is blank.
    Enter a boolean value (true or false). Press Enter for the default ("false").
    Choose a number from below, or type in your own value
     1 / Enter AWS credentials in the next step  \ "false"
     2 / Get AWS credentials from the environment (env vars or IAM)  \ "true"
    env_auth>
  5. Type false and hit enter, to be able to enter your credentials in the next step.

    The following output displays:

    AWS Access Key ID.
    Leave blank for anonymous access or runtime credentials.
    Enter a string value. Press Enter for the default ("").
    access_key_id>
  6. Enter your API Access Key and hit enter.

    The following output displays:

    AWS Secret Access Key (password)
    Leave blank for anonymous access or runtime credentials.
    Enter a string value. Press Enter for the default ("").
    secret_access_key>
  7. Enter your API Secret Key and hit enter.

    The following output displays:

    Region to connect to.
    Enter a string value. Press Enter for the default ("").
    Choose a number from below, or type in your own value
     1 / Amsterdam, The Netherlands  \ "nl-ams"
     2 / Paris, France  \ "fr-par"
    region>
  8. Enter your chosen region and hit enter. Here we choose fr-par.

    The following output displays:

    Endpoint for Scaleway Object Storage.
    Enter a string value. Press Enter for the default ("").
    Choose a number from below, or type in your own value
     1 / Amsterdam Endpoint  \ "s3.nl-ams.scw.cloud"
     2 / Paris Endpoint  \ "s3.fr-par.scw.cloud"
    endpoint>
  9. Enter your chosen endpoint and hit enter. Here we choose s3.fr-par.scw.cloud.

    The following output displays:

    Canned ACL used when creating buckets and storing or copying objects.
    This ACL is used for creating objects and if bucket_acl isn't set, for creating buckets too.
    For more info visit https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl
    Note that this ACL is applied when server side copying objects as S3
    doesn't copy the ACL from the source but rather writes a fresh one.
    Enter a string value. Press Enter for the default ("").
    Choose a number from below, or type in your own value
     1 / Owner gets FULL_CONTROL. No one else has access rights (default).
       \ "private"
     2 / Owner gets FULL_CONTROL. The AllUsers group gets READ access.
       \ "public-read"
       / Owner gets FULL_CONTROL. The AllUsers group gets READ and WRITE access.
     3 | Granting this on a bucket is generally not recommended.
       \ "public-read-write"
     4 / Owner gets FULL_CONTROL. The AuthenticatedUsers group gets READ access.
       \ "authenticated-read"
       / Object owner gets FULL_CONTROL. Bucket owner gets READ access.
     5 | If you specify this canned ACL when creating a bucket, Amazon S3 ignores it.
       \ "bucket-owner-read"
       / Both the object owner and the bucket owner get FULL_CONTROL over the object.
     6 | If you specify this canned ACL when creating a bucket, Amazon S3 ignores it.
       \ "bucket-owner-full-control"
    acl>
  10. Enter your chosen ACL and hit enter. Here we choose private (1).

    The following output displays:

    The storage class to use when storing new objects in S3.
    Enter a string value. Press Enter for the default ("").
    Choose a number from below, or type in your own value
     1 / Default  \ ""
     2 / The Standard class for any upload; suitable for on-demand content like streaming or CDN.  \ "STANDARD"
     3 / Archived storage; prices are lower, but it needs to be restored first to be accessed.  \ "GLACIER"
    storage_class>
  11. Enter your chosen storage class and hit enter. Here we choose STANDARD (2).

    The following output displays:

    Edit advanced config? (y/n)
    y) Yes
    n) No (default)
    y/n>
  12. Type n and hit enter. A summary of your config displays:

    Remote config
    --------------------
    [remote-sw-paris]
    type = s3
    provider = Scaleway
    env_auth = false
    access_key_id = <ACCESS-KEY>
    secret_access_key = <SECRET-KEY>
    region = fr-par
    endpoint = s3.fr-par.scw.cloud
    acl = private
    storage_class = STANDARD
    --------------------
    y) Yes this is OK (default)
    e) Edit this remote
    d) Delete this remote
    y/e/d>
  13. Type y to confirm that this remote config is OK, and hit enter.

    The following output displays:

    Current remotes:
    Name                 Type
    ====                 ====
    remote-sw-paris      s3
    e) Edit existing remote
    n) New remote
    d) Delete remote
    r) Rename remote
    c) Copy remote
    s) Set configuration password
    q) Quit config
    e/n/d/r/c/s/q> q
  14. Type q to quit the config, and hit enter.

  15. If you want to transfer data to or from a bucket in a different region from the one you just set up, repeat steps 1-14 to create a new remote in the required region, entering the appropriate region and endpoint at steps 8 and 9. Similarly, you may wish to set up a new remote for a different object storage provider.
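The wizard ultimately just writes the remote definition shown in the summary to rclone's config file, so for scripted setups the same remote can be created by writing that file directly — a sketch assuming the default config path, with the placeholder keys from the summary above:

```shell
#!/bin/sh
# Append the remote definition produced by the wizard directly to
# rclone's config file (default path assumed; <ACCESS-KEY> and
# <SECRET-KEY> are placeholders for your real API keys).
CONF="${RCLONE_CONFIG:-$HOME/.config/rclone/rclone.conf}"
mkdir -p "$(dirname "$CONF")"
cat >> "$CONF" <<'EOF'
[remote-sw-paris]
type = s3
provider = Scaleway
env_auth = false
access_key_id = <ACCESS-KEY>
secret_access_key = <SECRET-KEY>
region = fr-par
endpoint = s3.fr-par.scw.cloud
acl = private
storage_class = STANDARD
EOF
```

rclone also ships a non-interactive `rclone config create` subcommand that achieves the same result without editing the file by hand.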

For further information, please refer to the official Rclone S3 Object Storage documentation. Official documentation also exists for other storage backends.

Migrating data

There are two commands that can be used to migrate data from one backend to another.

  • The copy command copies data from source to destination.
rclone copy --progress <SOURCE_BACKEND>:<SOURCE_PATH> <DEST_BACKEND>:<DEST_PATH>

For example, the following command copies data from a bucket named my-first-bucket in the remote-sw-paris remote backend that we previously set up, to another bucket named my-second-bucket in the same remote backend. The --progress flag allows us to follow the progress of the transfer:

rclone copy --progress remote-sw-paris:my-first-bucket remote-sw-paris:my-second-bucket
  • The sync command copies data from one backend to another, but also deletes files/objects in the destination that are not present in the source:
rclone sync --progress <SOURCE_BACKEND>:<SOURCE_PATH> <DEST_BACKEND>:<DEST_PATH>

For example, the following command copies data from a bucket named my-first-bucket in the remote-sw-paris remote backend that we previously set up, to another bucket named my-third-bucket in a different remote backend, which we configured for the nl-ams region and named remote-sw-ams. It also deletes any data present in my-third-bucket that isn’t also present in my-first-bucket:

rclone sync --progress remote-sw-paris:my-first-bucket remote-sw-ams:my-third-bucket
Note:

This migration may incur costs from the provider you are migrating from, since egress bandwidth may be billed.

There are other commands, such as move, which copies data to the destination and then deletes it from the source backend.
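Because sync deletes destination objects, it is worth previewing a migration before running it; rclone's `--dry-run` flag does exactly that. The sketch below assembles the command lines as strings so they can be printed and reviewed first (the `migrate_cmd` helper is illustrative, not part of rclone; the remote and bucket names are the ones from the examples above):

```shell
#!/bin/sh
# Illustrative helper: build rclone migration commands as strings so
# they can be reviewed (or logged) before being executed.
migrate_cmd() {
  verb="$1"; src="$2"; dst="$3"
  echo "rclone $verb --progress $src $dst"
}

# Preview the sync first with --dry-run, then run the real transfer:
migrate_cmd "sync --dry-run" "remote-sw-paris:my-first-bucket" "remote-sw-ams:my-third-bucket"
migrate_cmd "sync"           "remote-sw-paris:my-first-bucket" "remote-sw-ams:my-third-bucket"
```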

Transferring data to C14 Cold Storage

When you copy or sync, you can choose the storage class your data is stored under at the destination.

At Scaleway Elements you can choose from two classes:

  • STANDARD: The Standard class for any upload; suitable for on-demand content like streaming or CDN.
  • GLACIER: Archived storage for long-term retention; prices are lower, but data must be restored before it can be accessed.

If the storage class is not specified, the data will be transferred as STANDARD by default.

To transfer data to C14 Cold Storage class, add

--s3-storage-class=GLACIER

to your command, as follows:

rclone copy --progress --s3-storage-class=GLACIER <SOURCE_BACKEND>:<SOURCE_PATH> <DEST_BACKEND>:<DEST_PATH>

You can verify the storage class of the transferred data by accessing your bucket on the Scaleway Elements console.
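Besides the console, the storage class can also be checked from the command line: `rclone lsjson` lists objects as JSON and, for S3 remotes, includes a Tier field (an assumption about recent rclone versions). The sketch below uses a hard-coded sample in place of live output:

```shell
#!/bin/sh
# Extract the storage class ("Tier") from rclone lsjson output. The
# sample JSON below stands in for the real output of:
#   rclone lsjson remote-sw-paris:my-first-bucket
sample='[{"Path":"backup.tar","Size":1024,"Tier":"GLACIER"}]'
echo "$sample" | grep -o '"Tier":"[A-Z]*"'
```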