
How to use scratch storage on H100 GPU Instances

Reviewed on 04 January 2024 | Published on 18 September 2023

Scaleway H100 GPU Instances are equipped with additional scratch storage. This form of temporary local storage operates differently from our regular local storage.

Scratch storage temporarily accommodates data during computational or data processing tasks. It is commonly used for storing intermediate results, processing input data, or holding output data, before that data is moved to more permanent storage.

Unlike conventional storage, scratch storage lacks features like snapshots, backups, or restores. Furthermore, it is not designed for downloading images; its main function is to serve as a cache. Typically, data in scratch storage is deleted once the computation or processing task concludes.

Scaleway’s H100 GPU Instances use NVMe disks for their scratch storage, which are fine-tuned for high-speed data access. This design ensures fast read and write speeds, which is crucial for applications demanding extensive data processing capabilities.

Note

Scratch storage does not survive once the server is stopped: doing a full stop/start cycle will erase the scratch data. However, doing a simple reboot or using the stop in place function will keep the data.
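
Because a full stop/start cycle erases scratch data, a common pattern is to copy input data onto scratch storage before a job and copy results back to persistent storage afterwards. The following is a minimal sketch of that pattern, not an official Scaleway tool. The function names `stage_in` and `stage_out` are hypothetical, and the `/scratch` mount point shown in the usage note is an assumption to verify on your own Instance.

```python
import shutil
from pathlib import Path


def stage_in(dataset: Path, scratch_root: Path) -> Path:
    """Copy a dataset directory onto scratch storage and return its new path."""
    target = scratch_root / dataset.name
    # dirs_exist_ok lets a re-run refresh an existing copy (Python 3.8+).
    shutil.copytree(dataset, target, dirs_exist_ok=True)
    return target


def stage_out(results: Path, persistent_root: Path) -> Path:
    """Copy results from scratch to persistent storage before the Instance is stopped."""
    target = persistent_root / results.name
    shutil.copytree(results, target, dirs_exist_ok=True)
    return target
```

For example, assuming scratch storage is mounted at `/scratch`, `stage_in(Path("/data/my-dataset"), Path("/scratch"))` would return `/scratch/my-dataset`. Run `stage_out` before any full stop/start cycle.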

What can I use scratch storage for?

Fetching a large image can take several minutes, which delays the start of model training. This delay is particularly problematic in AI workloads, where even a 400 GB dataset is considered relatively small.

To address this issue, we have implemented scratch storage. Unlike traditional storage, scratch storage does not require you to download the entire image (as a result, it cannot be used for backup restoration). Scratch storage can feed data to the GPU at a significantly faster rate, and H100 GPU Instances come with a substantial amount of it, which makes data input faster and more efficient.

Note

The maximum possible size for scratch storage is:

  • for H100-1-80G Instances: 3 TB
  • for H100-2-80G Instances: 6 TB

How can I add scratch storage to my GPU Instance using the Scaleway CLI or console?

Scratch storage is automatically added when creating H100-1-80G and H100-2-80G Instances.

How can I add scratch storage to my GPU Instance when using the API?

You need to add an extra volume, for example:

"volumes":{"1":{"name":"scratch-volume","volume_type":"scratch","size":3000000000000}}

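The `size` field in the example above is expressed in bytes, using decimal units (3 TB = 3 × 10^12 bytes). As an illustration only, the sketch below builds the same `volumes` fragment in Python and checks the requested size against the maximums listed earlier on this page. The helper name `scratch_volumes` is hypothetical, and the rest of the server creation request is left out on purpose: refer to the Instance API reference for the full request.

```python
import json

# Maximum scratch sizes per Instance type, in decimal terabytes (from this page).
MAX_SCRATCH_TB = {"H100-1-80G": 3, "H100-2-80G": 6}
BYTES_PER_TB = 10**12


def scratch_volumes(commercial_type: str, size_tb: float) -> dict:
    """Return the "volumes" fragment for a scratch volume of size_tb terabytes."""
    if commercial_type not in MAX_SCRATCH_TB:
        raise ValueError(f"no scratch storage limit known for {commercial_type}")
    max_tb = MAX_SCRATCH_TB[commercial_type]
    if not 0 < size_tb <= max_tb:
        raise ValueError(f"size must be between 0 and {max_tb} TB for {commercial_type}")
    return {
        "1": {
            "name": "scratch-volume",
            "volume_type": "scratch",
            "size": int(size_tb * BYTES_PER_TB),  # the API example uses bytes
        }
    }


if __name__ == "__main__":
    print(json.dumps({"volumes": scratch_volumes("H100-1-80G", 3)}))
```

Validating the size locally is a design choice of this sketch: it catches a mistyped value before any request is sent.
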
How can I add scratch storage to my GPU Instance using Terraform?

# Scratch volume (3000 GB is the maximum for H100-1-80G)
resource "scaleway_instance_volume" "scratch_volume" {
  size_in_gb = 3000
  type       = "scratch"
}

# GPU Instance with the scratch volume attached
resource "scaleway_instance_server" "myserver" {
  type                  = "H100-1-80G"
  image                 = "ubuntu_jammy_gpu_os_12"
  additional_volume_ids = [scaleway_instance_volume.scratch_volume.id]
}

See also

  • How to use NVIDIA MIG technology with Kubernetes
  • How to use the preinstalled environment
© 2023-2024 – Scaleway