H100 GPU instances

Maximize the performance of your AI workloads with H100 GPU instances, supporting up to 8 GPUs per instance.

Say goodbye to bottlenecks

Have you ever reached the limit of your GPU when playing with the latest AI model? H100 SXM instances use NVLink interconnect technology, enabling faster GPU-to-GPU communication. Benchmarks show up to 30% higher compute performance compared to traditional setups.

Accelerate inference workloads up to 30 times

Accelerate your model serving workloads thanks to Transformer Engine 30x faster for AI inference and new data formats.

Maximize GPU utility up to your needs

Choose 1, 2, or 8 H100 GPUs per instance to match your compute requirements. Need more granularity? The second generation of Secure Multi-Instance GPU (MIG) lets you partition a GPU into isolated instances of the right size — maximizing utilization across workloads of all sizes.

Technical specifications

Characteristics	H100 PCIe	H100 SXM ^NEW
GPU	NVIDIA H100 PCIe Tensor Core	NVIDIA H100 Tensor Core
GPU memory	up to 2 x 80GB HBM2e	up to 8 x 80GB HBM3
Processor	up to 48 vCPUs AMD Epyc Zen 4	up to 128 vCPUs Xeon Platinum 8452Y
Processor Frequency	2.7 Ghz	2 Ghz
Memory	up to 480 GB of RAM	up to 960 GB of RAM
Memory type	DDR5	DDR5
Bandwidth	up to 20 Gbps	up to 20 Gbps
Storage	Block Storage for the boot and up to 6TB of Scratch Storage NVMe	Block Storage for the boot and up to 12.8TB of Scratch Storage NVMe

Choose the right GPU

Secure your H100 GPU Instance resource for months or years

Talk with an expert today

Estimate the Instance cost

Choose your plan

Estimated cost

Option	Value	Price
Zone	Paris 2
Instance	1x	0€
Volume	10GB	0€
Flexible IPv4	Yes	0.004€

Total
Hourly	0€
Daily	0€
Weekly	0€
Monthly	0€

Give the H100 GPU Instances a try today

Boost Innovation Sustainably: 100% Renewable Energy, 50% Less Power

DC5 PAR2 Paris

DC5 is one of Europe's greenest data centers, powered entirely by renewable wind and hydro energy (GO-certified) and cooled with ultra-efficient free and adiabatic cooling. With a PUE of 1.16 (vs. the 1.55 industry average), it slashes energy use by 30-50% compared to traditional data centers.

Get more details

WAW2 Warsaw

WAW2 runs on 100% wind power (GO-certified) and uses a combination of direct free cooling, free chilling, immersion systems, and air conditioning to optimize system cooling. With a PUE of 1.32—better than the industry average—it minimizes energy consumption for maximum efficiency.

Get more details

Discover Scaleway's environmental commitments

Customer success stories

"Execution difference is 40% in favor of using the H100 PCIe GPUs"

Sovereign AI specialists Golem.ai took a deep technical dive into the topic and shared their findings on our blog. “After running a hundred tests in total between Replicate.com and the H100 de Nvidia hosted by Scaleway, we conclude that the execution difference is 40% in favor of using the H100s,” says Golem.ai’s Kevin Baude.

Read the full article

Numerous AI applications and use cases

Natural Language Processing

Understands, interprets, and generates human language in a way that is both meaningful and contextually relevant.
Thanks to models and algorithms specialized in:

Text classification
Machine translation
Entailment prediction
Named entity recognition
Sequence-to-sequence, like BERT for text extraction
Text similarity search, like BERT to find semantic similarities
Language modeling

Choose your instance's format

Instance Name	Number of GPU	TFLOPs FP16 Tensor Cores	VRAM	Prices
H100-1-80GB	1 H100 PCIe Tensor Core	Up to 1,513 teraFLOPS	80GB	€2.73/hour
H100-2-80G	2 H100 PCIe Tensor Core	Up to 3,026 teraFLOPS	2 x 80GB	€5.46/hour
H100-SXM-2-80G^{/coming soon}	2 H100 Tensor Core	Up to 3,958 teraFLOPS	2 x 80GB	€6.018/hour
H100-SXM-4-80G^{/coming soon}	4 H100 Tensor Core	Up to 7,916 teraFLOPS	4 x 80GB	€11.61/hour
H100-SXM-4-80G^/NEW	8 H100 Tensor Core	Up to 15,832 teraFLOPS	8 x 80GB	€23.028/hour

Enjoy the simplicity of a pre-configured AI environment

Optimized GPU OS Image

Benefit from a ready-to-use Ubuntu image to launch your favorite deep learning containers (pre-installed NVIDIA driver and Docker environment).

Learn more

Enjoy your favorite Jupyter environment

Easily launch your favorite JupyterLab or Notebook thanks to the pre-installed Docker environment

Learn more

Choose your AI containers among multiple registries

Access multiple container registries: your own build containers, Scaleway AI containers, NVIDIA NGC registry and any other registry

Learn more

NVIDIA Enterprise AI software at your disposal

Access hundreds of AI softwares optimized by Nvidia to maximise the efficiency of your GPUs and boost your productivity. Among hundreds of softwares developed by NVIDIA and tested by leaders of their industry, harness the efficiency of

NVIDIA Nemo for LLM fine-tuning,
NVIDIA TAO for computer vision,
NVIDIA TRITON for inference

Learn more

Deploy and Scale your infrastructure with Kubernetes

Frequently asked questions

What is included in the instance price?

3TB of Scratch Storage are included in the instance price, but any Block Storage provisioned by you, is at your expense.
For redundancy and thus security reasons we strongly recommend that you provision extra Block Storage volume, as Scratch Storage is ephemeral storage that disappears when you switch off the machine. Scratch Storage purpose is to speed up the transfer of your data sets to the gpu.
How to use Scratch storage then? Follow the guide

Whats the difference between H100-1-80G and H100-2-80G?

These are 2 formats of the same instance embedding NVIDIA H100 PCIe Tensor Core.

H100-1-80G embeds 1 GPU NVIDIA H100 PCIe Tensor Core, offering a GPU memory of 80GB
H100-2-80G embeds 2 GPUs NVIDIA H100 PCIe Tensor Core, offering a GPU memory of 2 times 80GB. This instance enables faster time to train for bigger Transformers models that scale 2 GPUs at a time. Thanks to the PCIe Board Factor, the servers of the H100 PCIe GPU instance are made of 2 GPUs. By launching a H100-2-80G instance format, the user benefits from a fully dedicated server with 2 GPUs.

What is the environmental impact of the H100 PCIe instance?

NVIDIA announced the H100 to enable companies to slash costs for deploying AI, "delivering the same AI performance with 3.5x more energy efficiency and 3x lower total cost of ownership, while using 5x fewer server nodes over the previous generation."
What inside the product can confirm this announcement?

The thinner engraving of the chip reduces the surface and thus the energy required to power the chip
Thanks to innovations like the new data format FP8 (8bits) more calculations are done with the same amount of consumption resulting in time and energy optimization

In addition, at Scaleway we decided to localize our H100 PCIe instances in the adiabatic Data Center DC5. With a PUE (Power User Effectiveness) of 1.15 (average is usually 1.6) this datacenter saves between 30% and 50% electricity compared with a conventional data centre.
Stay tuned for our benchmarks on the topic!

How can I use MIG to get the most out of my GPU?

NVIDIA Multi-Instance GPU (MIG) is a technology introduced by NVIDIA to enhance the utilization and flexibility of their data center GPUs, specifically designed for virtualization and multi-tenant environments. It allows a single physical GPU to be partitioned into up to seven smaller Instances, each of which operates as an independent MIG partition with its own dedicated resources, such as memory, compute cores, and video outputs.
Read the dedicated documentation to use MIG technology on your GPU instance

How to choose the right GPU for my workload?

There are many criteria to take into account to choose the right GPU instance:

Workload requirements
Performance requirements
GPU type
GPU memory
CPU and RAM
GPU driver and software compatibility
Scaling

For more guidance read the dedicated documentation on that topic

What is NVlink?

NVIDIA NVLink is a high-speed interconnect technology developed by NVIDIA that allows for faster data transfer between GPUs and between GPUs and CPUs.
It is designed to significantly increase the bandwidth and reduce the latency of data transfers compared to traditional PCIe (Peripheral Component Interconnect Express) connections.
This is particularly beneficial in high-performance computing (HPC) and data center environments where multiple GPUs are used in parallel to accelerate computations.
NVLink enables better scalability and efficiency in deep learning, scientific simulations, and other data-intensive applications by providing a more efficient way to share data and workload between GPUs

H100 GPU instances

Say goodbye to bottlenecks

Accelerate inference workloads up to 30 times

Maximize GPU utility up to your needs

Technical specifications

Estimate the Instance cost

Choose your plan

Estimated cost

Boost Innovation Sustainably: 100% Renewable Energy, 50% Less Power

DC5 PAR2 Paris

WAW2 Warsaw

Customer success stories

Numerous AI applications and use cases

Natural Language Processing

Automatic Speech Recognition

Generative AI

Computer vision

Recommender

Choose your instance's format

Enjoy the simplicity of a pre-configured AI environment

Optimized GPU OS Image

Enjoy your favorite Jupyter environment

Choose your AI containers among multiple registries

NVIDIA Enterprise AI software at your disposal

Deploy and Scale your infrastructure with Kubernetes

Frequently asked questions