Design considerations and our recommendations for data protection

Arnaud de Bermingham

When you build your infrastructure with Scaleway, it’s important to take a few simple rules into account to limit the risk of data loss, whatever the cause. Data protection is a shared responsibility between provider and customer.

Data loss can be caused by, for example, a hardware failure, a network failure, hacking, malicious acts or the destruction of physical infrastructure.

Certain precautions need to be taken depending on the product type, namely for bare metal, Infrastructure as a Service or Platform as a Service.

In the interest of transparency, we would like to clarify and elaborate on the means used by Scaleway, our design recommendations and the responsibility of each party with regard to the data we store and process.

The concept of regions and AZs

Regarding the location of data, it is important to distinguish between three key concepts of the public cloud: the region, the availability zone and the data center.

  • A region includes several availability zones (AZ), ideally three, within a geographical area of about 200 km. A region is also a unique network that is dissociated from (not interconnected with) other regions, with the exception of Amsterdam, which for historical reasons is also a peering location for the Paris region. At Scaleway, Paris, Amsterdam and Warsaw are regions.
  • An Availability Zone (AZ) is made up of one or more data centers situated in a geographical area of about 5 km, with a maximum internal latency of 1.4 ms, and situated at least 50 km from any other availability zone in the same region. At Scaleway, the fr-par-1 availability zone contains our DC2 and DC3 data centers, and the fr-par-2 availability zone contains our DC5 data center. The fr-par-3 availability zone will soon be made available with our DC4 data center.
  • A data center (DC) is the physical location of an availability zone.

Customers can choose the region and availability zone when ordering infrastructure products (IaaS). The physical fault domain is the availability zone and the network failure area is the region.

As a customer, you are responsible for the redundancy and the management of the services that run on top of your infrastructure products. The highest level of redundancy is obtained by developing your application across several distinct regions.

Customers can choose only the region when ordering platform products (PaaS). In this case, the fault domain corresponds to the region and is therefore essentially linked to the network. Redundancy and service management are the responsibility of Scaleway. In other words, the cloud provider, which operates PaaS services across several AZs in the same region, is responsible for them.
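The fault-domain rules described above can be summarized in a small model. The following Python sketch is illustrative only: the class and function names are hypothetical, and the zone-to-data-center mapping is limited to what this article states for the Paris region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilityZone:
    name: str
    region: str
    data_centers: tuple[str, ...]


# Mapping taken from this article; other zones are deliberately left out.
ZONES = {
    "fr-par-1": AvailabilityZone("fr-par-1", "Paris", ("DC2", "DC3")),
    "fr-par-2": AvailabilityZone("fr-par-2", "Paris", ("DC5",)),
}


def fault_domain(product_type: str, zone_name: str) -> str:
    """Return the fault domain a product is exposed to.

    IaaS products are scoped to an availability zone,
    PaaS products are scoped to a whole region.
    """
    zone = ZONES[zone_name]
    if product_type == "iaas":
        return zone.name
    if product_type == "paas":
        return zone.region
    raise ValueError(f"unknown product type: {product_type!r}")


print(fault_domain("iaas", "fr-par-1"))  # fr-par-1
print(fault_domain("paas", "fr-par-2"))  # Paris
```

An instance ordered in fr-par-1 is thus exposed to the loss of that zone, whereas a PaaS product is exposed to a failure affecting the Paris region as a whole.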

This is why an ideal public cloud design is usually based on three availability zones in the same region. At Scaleway, we fully subscribe to this logic. Indeed, with three availability zones, the distribution of a PaaS product across different AZs allows for a high level of redundancy and availability.

In the interest of transparency, we cannot claim a perfect implementation of this ideal public cloud design.

To date, not all of our regions are made up of three availability zones. This has no impact on IaaS products. For PaaS products, the level of availability and disaster resilience is not as optimal as with a three-zone design. We have long been aware of this issue, but we have always categorically refused to compromise by having multiple availability zones, clusters or virtual data centers in the same physical data center.

To avoid misleading our customers, we systematically recommend that they build their infrastructure across multiple regions. This is the most elegant way to ensure a redundant, high availability service.

In 2021, we will add three new availability zones to our three current regions. This project has already been validated and investments secured. Our PaaS software stack is designed with this in mind, and will be redeployed accordingly by the end of the year.

Bare metal products

When you use bare metal products, we do not, and cannot, have control of your infrastructure and data.

Nevertheless, here are our recommendations to minimize the risks:

"RAID" storage is NOT a backup or a guarantee of data durability. Scaleway does not guarantee any backup of your data and cannot even physically do it for you.

You must, at the very least, set up a remote backup system, in accordance with basic IT security rules and standards. Moreover, we strongly recommend that our customers distribute their sensitive data across several servers located in different data centers, or even with different providers, as part of a DRP (Disaster Recovery Plan) or BCP (Business Continuity Plan).

Backup solutions:

  • Solution 1: we offer an FTP Dedibackup replicated backup space for all Dedibox server customers. There are two versions available: 100GB free of charge, and 750GB for 4.99 € excl. tax/month. At Scaleway, this data is stored in our Object Storage, with a high level of redundancy and durability (see Object Storage chapter). However, Dedibackup has very limited functionality and security, and should be considered as technologically outdated in 2021.
  • Solution 2: we strongly recommend the use of Object Storage combined with long term regional archiving on C14 Cold Storage. For example, if your server is located in DC3 (in the Paris region), we recommend storing your backup datasets on Object Storage in the Amsterdam or Warsaw region. This solution is inexpensive, easy to implement and offers extremely high durability.
  • Solution 3: the perfect solution in a multi-cloud approach is to store your backup datasets with a different provider, in a geographical location that is sufficiently far away from your primary server.

Important design considerations:

  • Dedibackup is based on Scaleway Object Storage. Although Object Storage is a regional product, due to the lack of three availability zones in the Paris region, data is currently mainly stored in the fr-par-2 availability zone (DC5). If your Bare Metal server is located in DC5, we recommend that you use solution 2 from this list. Also, for DC5 Bare Metal customers, we will soon allow you to choose the storage region of your Dedibackup.
  • To view the physical location of your server in our data centers, simply log in to the account management section of the console or contact our technical support team.

A note about RPN-SAN:

RPN-SAN is a turnkey block storage solution, managed by Scaleway and designed for Dedibox Bare Metal servers running on the RPN private network. The product is available in two versions - a "Basic" version and a "High-Availability" version.

Important design considerations:

  • The "Basic" version of RPN-SAN has no redundancy and is therefore only suitable for customers with specific needs, who are aware of the related constraints. The data is stored on a single server, on a RAID hard disk array. This storage should be considered temporary and used only as such. The data is not duplicated across multiple servers or data centers, and there is no backup on Scaleway's side.
  • The “High-Availability” version of RPN-SAN has geographical redundancy. Data is synchronized between two different data centers via a dedicated fiber optic network.

The backup process:

  • RPN-SAN requires a backup process just like a bare metal server, on the server side, following one of the three methods described above.
  • Even with the High-Availability version, stored data can, although this is unlikely, be irretrievably lost. It must therefore be backed up just like the local storage of a bare metal server.

IaaS Products

Scaleway Instances and Block Storage

Scaleway Elements instances have local storage and can also benefit from optional remote storage with Block Storage.

There has long been confusion about Instances. In order to avoid incidents that we see all too often, let’s take a closer look at them.

Instances are not and will never be "VPS" (Virtual Private Server) products. They do not work in the same way, and data storage does not follow the same logic. Remember that a state-of-the-art Public Cloud client infrastructure should normally be designed to scale horizontally, over a large number of instances. As each instance has a limited lifespan, the number of instances running simultaneously at a given time is based on an application's usage. Persistent data is normally stored mainly in Object Storage but also in Block Storage.

The public cloud is not suitable for monolithic and non-distributed applications, and only bare metal server products or the private cloud can meet their requirements.

Confusion about the service level of instances has existed in the cloud world for a long time, but with a little effort, it is possible to better understand how it differs from older models (VPS):

This translates to potentially 74.40 hours of unavailability over a 1 month period.

Another source can be found here.
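The 74.40-hour figure above corresponds to 10% of the 744 hours in a 31-day month, which implies an availability level of 90% per instance. That percentage is inferred here from the arithmetic rather than quoted from a service level agreement. A minimal sketch of the calculation, with a hypothetical function name:

```python
def max_downtime_hours(availability_percent: float, days_in_month: int = 31) -> float:
    """Hours of unavailability per month allowed by an availability level."""
    hours_in_month = days_in_month * 24
    return hours_in_month * (100.0 - availability_percent) / 100.0


print(round(max_downtime_hours(90.0), 2))  # 74.4
print(round(max_downtime_hours(99.9), 2))  # 0.74
```

The gap between the two results illustrates why a single instance should not be treated as a highly available unit: availability comes from running several instances, not from any one of them.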

Important design considerations:

  • So-called Local storage is temporary. This storage has the same lifetime as the instance, and can be moved by Scaleway to our Object Storage for future use when the instance is destroyed (Power-Off / Stop functionality). This local storage can be backed up as a "snapshot" and duplicated as many times as you like. It is based on SSDs and NVMe using high-performance RAID arrays that have, by definition, a limited lifetime and write rate. It is neither redundant nor backed up by Scaleway, and should never be used to store persistent and/or important data.
  • Local storage therefore works precisely as its name suggests. It is physically attached to the hypervisor that hosts the instance, without any abstraction layer. We do not know how to perform hot migration on this type of storage, so in the event of maintenance or a software or hardware incident on the hypervisor, data may be permanently lost.
  • Block Storage is highly available, high-performance persistent storage. It is made up of multiple redundant clusters offering triple replication of your data. As an IaaS product, its use, resilience and lifespan are scoped to an availability zone, as is the case for the instance.
  • Block Storage cannot be used in any other availability zone than that of the instance due to latency issues.

Best practices and backup techniques:

The golden rule - each type of storage has its own use.

  • Local storage should be used for operating systems (the images), logs, temporary transactional data, temporary files, and datasets requiring high performance calculations.
  • Block Storage should be used for storing user data, databases and content.
  • Object Storage should be used for all data that needs to be delivered and that is persistent.
  • If local storage contains persistent data, although strongly discouraged, at the very least a snapshot should be taken regularly and saved using Object Storage in another region and kept in C14 cold storage.
  • Similarly, even though Block Storage is redundant and replicated three times, it is physically hosted in the same availability zone as the instance. This is why we strongly recommend performing regular snapshots and creating a dataset using Object Storage in another region and the C14 Cold Storage.
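The golden rule above can be expressed as a simple lookup. This is only a sketch: the category labels and the helper name are hypothetical and simply restate the recommendations of this section.

```python
from enum import Enum


class Storage(Enum):
    LOCAL = "Local storage"
    BLOCK = "Block Storage"
    OBJECT = "Object Storage"


# Data categories follow the list above; the keys are hypothetical labels.
RECOMMENDED = {
    "operating_system": Storage.LOCAL,
    "logs": Storage.LOCAL,
    "temporary_files": Storage.LOCAL,
    "user_data": Storage.BLOCK,
    "database": Storage.BLOCK,
    "delivered_content": Storage.OBJECT,
}


def needs_offsite_snapshot(storage: Storage, persistent: bool) -> bool:
    """Local and Block Storage live in a single availability zone, so
    persistent data kept on them should be snapshotted to another region."""
    return persistent and storage in (Storage.LOCAL, Storage.BLOCK)


print(RECOMMENDED["database"].value)                  # Block Storage
print(needs_offsite_snapshot(Storage.BLOCK, True))    # True
print(needs_offsite_snapshot(Storage.LOCAL, False))   # False
```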

A note about snapshots:

  • Scaleway keeps the snapshots of local volumes in Object Storage. Block volume snapshots are kept in the same cluster.
  • In the short term, we plan to give our customers direct access to these snapshots via an Object Storage bucket in order to facilitate backup or lifecycle actions by the customer.

PaaS Products

Object Storage

Object Storage is an S3-compatible data storage system that offers a high level of functionality and resilience. This product is entirely managed by Scaleway, meaning we have full responsibility for the durability of the data stored in Object Storage.

Important design considerations:

  • Object Storage is currently available across our three regions. We recommend using this product as primary storage for your cloud applications.
  • Our platform uses erasure coding 6+3.
  • Our regions currently do not consist of the three availability zones necessary for maximum resilience. In other words, data availability can be impacted in case of the total destruction of an availability zone. With the implementation of the fr-par-3 availability zone in the near future, Object Storage will be resilient to the total destruction of any availability zone in the same region.
  • Our Object Storage supports data life cycles as well as service classes, including Glacier class. Transferring stored data to our C14 Cold Storage can be done very easily and quickly at extremely low costs.
  • Our C14 Cold Storage is physically hosted separately from other availability zones, in the DC4 Datacenter (fallout shelter), which benefits from extremely high physical and fire protection standards. It is currently only available in the Paris region.
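To illustrate why three availability zones matter for a 6+3 erasure coding scheme, here is a small worked example. It relies on the usual property of such schemes (any 6 of the 9 fragments are enough to rebuild an object) and on the assumption that fragments are spread as evenly as possible across zones. The actual placement policy is not described in this article, so this is a sketch rather than a description of our platform.

```python
import math

DATA_FRAGMENTS = 6    # the "6" in 6+3
PARITY_FRAGMENTS = 3  # the "3" in 6+3


def storage_overhead(data: int = DATA_FRAGMENTS, parity: int = PARITY_FRAGMENTS) -> float:
    """Raw capacity consumed per unit of user data."""
    return (data + parity) / data


def survives_zone_loss(zones: int, data: int = DATA_FRAGMENTS,
                       parity: int = PARITY_FRAGMENTS) -> bool:
    """True if losing the most loaded zone still leaves `data` fragments.

    Assumption: fragments are spread as evenly as possible across zones.
    """
    worst_case_lost = math.ceil((data + parity) / zones)
    return worst_case_lost <= parity


print(storage_overhead())      # 1.5
print(survives_zone_loss(3))   # True: 3 fragments lost, 6 remain
print(survives_zone_loss(2))   # False: up to 5 fragments lost, 4 remain
```

Under these assumptions, an object remains readable after the loss of one zone out of three, but not after the loss of one zone out of two, which is consistent with the resilience statement above.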

Best practices and backup techniques:

  • Ideally, your applications should be designed to store data using Object Storage in two different regions simultaneously. This is also the only technical solution that provides redundancy with respect to the region's network. This applies regardless of the public cloud provider you use. Besides being simple to use and requiring minimal development work during the build of your application, this solution is also extremely reliable in terms of both availability and durability.
  • Within a multi-cloud approach, the first solution is to also use a region of another S3-compatible public cloud provider that is far enough away from the Scaleway region you have chosen.
  • If the dual-region or multi-cloud solution cannot be implemented, we simply recommend backing up your buckets regularly. This is easy to do with Scaleway: just create a life cycle rule duplicating your S3 buckets in C14 Cold Storage. This solution is very simple, reliable and takes only a few minutes to set up.
  • C14 Cold Storage is the most reliable and resilient product on the market for your backups, delivering the highest market standards at a very low price. At Scaleway, we offer the first 75GB every month and additional gigabytes are charged at less than €0.002 per month. Once the data is written, the physical device is disconnected from the electricity supply, protecting your data from potential software or human error.
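The dual-region recommendation above can be sketched as follows. To keep the example self-contained, it uses an in-memory stand-in instead of a real S3-compatible client. The `ObjectStore` interface, the `InMemoryStore` class and `put_everywhere` are hypothetical names; in a real application each store would wrap a client bound to one region's endpoint.

```python
from typing import Protocol


class ObjectStore(Protocol):
    def put(self, bucket: str, key: str, body: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for an S3-compatible client bound to a single region."""

    def __init__(self, region: str) -> None:
        self.region = region
        self.objects: dict[tuple[str, str], bytes] = {}

    def put(self, bucket: str, key: str, body: bytes) -> None:
        self.objects[(bucket, key)] = body


def put_everywhere(stores: list[ObjectStore], bucket: str, key: str, body: bytes) -> None:
    """Attempt the write in every region, then raise if any write failed."""
    failures = []
    for store in stores:
        try:
            store.put(bucket, key, body)
        except Exception as exc:  # collect the error and keep trying other regions
            failures.append(exc)
    if failures:
        raise RuntimeError(f"{len(failures)} of {len(stores)} regional writes failed")


paris = InMemoryStore("Paris")
amsterdam = InMemoryStore("Amsterdam")
put_everywhere([paris, amsterdam], "my-bucket", "report.csv", b"a,b\n1,2\n")
print(paris.objects == amsterdam.objects)  # True
```

Attempting every region before reporting an error means that a failure in one region does not prevent the object from being stored in the other. How to reconcile a partial write (retry, queue, alert) is left to the application.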

Scaleway Elements Database as a Service (DBaaS)

Our Managed Database is based on Scaleway Elements instances. As with all other cloud providers on the market, our product has the same scope as our instances, and is therefore tied to a single availability zone.

Important design considerations:

  • Our standard DBaaS products offer Backup/Restore and database export capabilities through Scaleway Object Storage.
  • The HA version distributes the two nodes that make up your database cluster across different racks on different hypervisors, in the same availability zone. This option makes your database resilient to a hypervisor or instance crash.
  • We perform cross-region backups by default. Paris DBaaS are backed up in Amsterdam, Amsterdam in Paris, and Warsaw in Paris. Users can access these backups via the console or the API through a pre-signed link valid for 24 hours.

Best practices and backup techniques:

  • Ideally, for a multi-cloud approach, we recommend the simultaneous or failover use of several database clusters, across several regions or with different cloud providers. However, in practice, this is almost impossible to implement with relational databases.
  • A good practice is to always have a DBaaS cluster on standby, configured and ready to use in another region, and ready to receive the most recent dump from the main cluster. Consider using FQDNs with a low TTL (60 seconds) from our domains service to quickly and easily change the destination cluster of your applications.
  • Don't forget to regularly (several times a day) archive your DBaaS database dump in C14 Cold Storage.
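As a rough planning aid for the practices above, the sketch below restates the default cross-region backup destinations and estimates how much data could be lost between two dumps. The function name is hypothetical, and the estimate assumes dumps are evenly spaced over the day.

```python
# Default cross-region backup destinations, as listed in this section.
BACKUP_REGION = {
    "Paris": "Amsterdam",
    "Amsterdam": "Paris",
    "Warsaw": "Paris",
}


def worst_case_data_loss_hours(dumps_per_day: int) -> float:
    """Longest gap between two dumps, assuming they are evenly spaced."""
    if dumps_per_day < 1:
        raise ValueError("at least one dump per day is required")
    return 24 / dumps_per_day


print(BACKUP_REGION["Warsaw"])          # Paris
print(worst_case_data_loss_hours(4))    # 6.0
```

With four dumps a day, a standby cluster restored from the latest dump could be missing up to six hours of writes. Increase the dump frequency if that window is too large for your application.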

Kubernetes Kapsule

Kapsule is an instance orchestrator. As such, it has the same scope as an instance, and therefore the same availability zone.

Our recommendations are identical to those for our Elements Instances. Ideally, we advise you to use two Kubernetes clusters in two different regions, or even with two separate cloud providers, and to use our multi-cloud load-balancing product in front of your different clusters.
