
Versatile usage

With fast, efficient image and video decoding and rendering, as well as Deep Learning model training and inference, the L4 GPU Instance covers a versatile range of needs.

Affordability without compromise

A cost-effective alternative to higher-priced GPUs, the L4 GPU Instance delivers solid performance on a limited budget, making it well suited to startups and small-scale projects.

AI Video Excellence

The L4 GPU Instance delivers excellent video generation, decoding, and pre- and post-processing, empowering industries like fashion, architecture, gaming, and advertising to create coherent visual content.

Boost Innovation Sustainably: 100% Renewable Energy, 50% Less Power

DC5 PAR2 Paris

DC5 is one of Europe's greenest data centers, powered entirely by renewable wind and hydro energy (GO-certified) and cooled with ultra-efficient free and adiabatic cooling. With a PUE of 1.16 (vs. the 1.55 industry average), it slashes energy use by 30-50% compared to traditional data centers.


WAW2 Warsaw

WAW2 runs on 100% wind power (GO-certified) and uses a combination of direct free cooling, free chilling, immersion systems, and air conditioning to optimize system cooling. With a PUE of 1.32—better than the industry average—it minimizes energy consumption for maximum efficiency.
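For context, PUE (Power Usage Effectiveness) is the ratio of total facility energy to the energy consumed by the IT equipment alone, so a lower value means less overhead. Here is a minimal sketch of the arithmetic, using the figures quoted above:

```python
# PUE = total facility energy / IT equipment energy (1.0 would be perfect).
DATACENTER_PUE = {"DC5": 1.16, "WAW2": 1.32, "industry average": 1.55}

for name, pue in DATACENTER_PUE.items():
    # Overhead (cooling, power distribution, ...) per kWh of IT load.
    print(f"{name}: PUE {pue} -> {pue - 1:.2f} kWh overhead per kWh of IT load")
```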

L4 GPU technical specifications

  • GPU: NVIDIA L4 Tensor Core GPU
  • GPU Memory: 24GB GDDR6 (300 GB/s)
  • Processor: 8 vCPUs AMD EPYC 7413
  • Processor frequency: 2.65 GHz
  • Memory: 48 GB of RAM
  • Memory type: DDR4
  • Network Bandwidth: 2.5 Gbps
  • Storage: Block Storage
  • Cores: 4th-generation Tensor Cores, 3rd-generation RT Cores
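
To verify these specifications from inside a running Instance, here is a minimal sketch, assuming PyTorch with CUDA support is installed; on an L4-1-24G it should report one NVIDIA L4 with roughly 24 GB of VRAM:

```python
import torch

# List each visible GPU with its name and memory capacity.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GiB VRAM")
```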

Ideal use cases with the L4 GPU Instance

Accelerate image and video generation affordably

If you’ve put a text-to-image model into production and want to optimize the cost of your infrastructure without sacrificing performance, the L4 GPU Instance is a serious candidate.


The L4 GPU Instance generates a 256x256px image at 14,445.1 pixels/second:

  • It’s 56% faster than a Render Instance (9,278.3 pixels/second)
  • It’s 6.3% faster than a T4 GPU Instance (13,583.3 pixels/second)
  • It’s 8.5% faster than a V100 PCIe GPU (16G) Instance (13,314.6 pixels/second)
  • It’s 8.2% faster than a V100 SXM2 GPU (16G) Instance (13,348.8 pixels/second)
  • And it’s almost as fast as an A100 SXM 40GB Instance (-1.6%; 14,681.1 pixels/second)

And with 50 percent more memory capacity, the L4 enables larger image generation, up to 1024x768, which wasn’t possible on the previous GPU generation (T4).

Source: the model was tested with InvokeAI, a popular open-source framework for image generation and modification. On top of it, Cloud Mercato created invokeai-benchmark, a handy tool that makes our tests and methodology more easily reproducible.
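
To make the pixels-per-second metric concrete, here is a minimal measurement sketch, assuming a CUDA-enabled Instance with Hugging Face diffusers installed; the model ID is an illustrative assumption, and since the source figures were produced with InvokeAI via invokeai-benchmark, absolute numbers will differ:

```python
import time

import torch
from diffusers import StableDiffusionPipeline

# Assumption: a Stable Diffusion v1 checkpoint stands in for the InvokeAI setup.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

width, height = 256, 256
start = time.perf_counter()
pipe("an astronaut riding a horse", width=width, height=height)
elapsed = time.perf_counter() - start

# The metric quoted above: pixels generated per second of wall-clock time.
print(f"{width * height / elapsed:.1f} pixels/second")

# The quoted speedups are plain throughput ratios, e.g. L4 vs Render:
l4, render = 14445.1, 9278.3
print(f"{100 * (l4 / render - 1):.0f}% faster")  # ~56%
```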

Estimate the GPU cost

To estimate your cost, choose an Instance format, a Block Storage volume (minimum 10 GB), and whether you need a Flexible IPv4. You need a Flexible IP if you want an Instance with a public IPv4; skip it if you already have one available on your account, or if you don’t need an IPv4.

Give the L4 GPU Instance a try today

Scale your infrastructure effortlessly

Choose your Instance's format

With four flexible formats offering 1, 2, 4, or 8 GPUs, you can easily scale your infrastructure according to your specific requirements.

Instance name | Number of GPUs | TFLOPS (FP16 Tensor Cores) | VRAM | Price per hour | Price per minute
L4-1-24G | 1x NVIDIA L4 Tensor Core GPU | 242 TFLOPS | 24GB | €0.75/hour | €0.0125/min
L4-2-24G | 2x NVIDIA L4 Tensor Core GPU | 484 TFLOPS | 2x 24GB | €1.5/hour | €0.025/min
L4-4-24G | 4x NVIDIA L4 Tensor Core GPU | 968 TFLOPS | 4x 24GB | €3/hour | €0.05/min
L4-8-24G | 8x NVIDIA L4 Tensor Core GPU | 1936 TFLOPS | 8x 24GB | €6/hour | €0.1/min
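
As a worked example of the table above, the GPU Instance cost scales linearly with the hourly rate and usage time. A minimal sketch (Block Storage and the Flexible IP are billed separately, as noted in the FAQ below):

```python
# EUR per hour, taken from the pricing table above.
HOURLY_RATES = {
    "L4-1-24G": 0.75,
    "L4-2-24G": 1.50,
    "L4-4-24G": 3.00,
    "L4-8-24G": 6.00,
}

def estimate_eur(instance: str, hours: float) -> float:
    """GPU Instance cost for the given usage duration, excluding storage and IP."""
    return HOURLY_RATES[instance] * hours

# One L4-1-24G running around the clock for a 30-day month:
print(f"{estimate_eur('L4-1-24G', 24 * 30):.2f} EUR")  # 540.00 EUR
```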

Build and monitor a flexible and secure cloud infrastructure powered by GPUs

(Diagram: Kubernetes with a dedicated control plane)

Benefit from a complete cloud ecosystem

Kubernetes Kapsule

Match any growth in resource needs effortlessly with an easy-to-use managed Kubernetes, with a dedicated control plane available for high-performance container management.
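
For illustration, here is a minimal sketch of claiming an L4 GPU from a Kapsule cluster with the official Kubernetes Python client; the pod name, image, and namespace are assumptions, and it presumes a kubeconfig for the cluster and a GPU node pool are already in place:

```python
from kubernetes import client, config

config.load_kube_config()  # assumes the Kapsule kubeconfig is set up locally

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="l4-smoke-test"),  # illustrative name
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda",
                image="nvidia/cuda:12.2.0-base-ubuntu22.04",
                command=["nvidia-smi"],
                # The standard Kubernetes way to request one GPU on the node.
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}
                ),
            )
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```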


Load Balancer

Distribute workloads across multiple servers with a Load Balancer to ensure continued availability and prevent any single server from being overloaded.


Frequently asked questions

What is included in the Instance price?

Our GPU Instance prices include the vCPUs and RAM needed for optimal performance.
To launch an L4 GPU Instance, you will also need to provision a minimum amount of Block Storage and a Flexible IP at your own expense.
If you have any doubt about the price, use the calculator; that's what it's for!

What are the differences between L4-1-24G, L4-2-24G, L4-4-24G and L4-8-24G?

These are four formats of the same Instance embedding NVIDIA L4 Tensor Core GPUs:

  • L4-1-24G embeds 1 NVIDIA L4 Tensor Core GPU, offering a GPU memory of 24GB.
  • L4-2-24G embeds 2 NVIDIA L4 Tensor Core GPUs, offering a GPU memory of 2 times 24GB.
  • L4-4-24G embeds 4 NVIDIA L4 Tensor Core GPUs, offering a GPU memory of 4 times 24GB.
  • L4-8-24G embeds 8 NVIDIA L4 Tensor Core GPUs, offering a GPU memory of 8 times 24GB.

Can I use MIG to get the most out of my GPU?

NVIDIA Multi-Instance GPU (MIG) is a technology introduced by NVIDIA to enhance the utilization and flexibility of its data center GPUs, specifically designed for virtualization and multi-tenant environments. This feature is available on the H100 PCIe GPU Instance but not on the L4 GPU Instance. However, users can benefit from Kubernetes Kapsule compatibility to optimize their infrastructure.
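
As a quick way to see this on a given Instance, you can query the current MIG mode through nvidia-smi. A minimal sketch, assuming the NVIDIA driver is installed (on an L4 the field reports [N/A], since MIG is unsupported):

```python
import subprocess

# Query each GPU's name and current MIG mode via the NVIDIA driver tools.
out = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,mig.mode.current", "--format=csv"],
    capture_output=True, text=True, check=True,
)
print(out.stdout)
```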


How to choose the right GPU for my workload?

There are many criteria to take into account when choosing the right GPU Instance:

  • Workload requirements
  • Performance requirements
  • GPU type
  • GPU memory
  • CPU and RAM
  • GPU driver and software compatibility
  • Scaling

For more guidance, read the dedicated documentation on that topic.