GPU dedicated servers

GPU-Driven Infrastructure.
Unleash your AI and ML potential with scalable GPU servers optimized for performance and cost-efficiency.

Configure my server

HP enterprise servers

Your GPU configuration is installed on Hewlett Packard Enterprise servers and stress-tested for 100% compatibility and stability.

Choose your data center

Get a GPU dedicated server, deployed in one of our New York, Miami, San Francisco, Amsterdam or Bucharest data centers.

Low-latency network

Your server is connected to a custom-built, low-latency global network.

Support

Get access to instant support, from real humans, available around the clock via phone or live chat.

Get instant access to affordable GPU dedicated servers

Unbeatable prices

Found it somewhere else cheaper? Take 10% off the lowest advertised price. Contact us for details.
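As a rough illustration of the price-match arithmetic, the sketch below takes 10% off a competitor's lowest advertised price. The function name and the sample figure are hypothetical, and the actual terms are whatever the sales team confirms.

```python
def price_match(lowest_advertised: float, discount: float = 0.10) -> float:
    """Return the lowest advertised price minus the discount (10% by default)."""
    if lowest_advertised < 0:
        raise ValueError("price must be non-negative")
    # Round to cents for display purposes.
    return round(lowest_advertised * (1 - discount), 2)


# Hypothetical example: a competitor advertises the same server at 500.00 per month.
print(price_match(500.00))  # 450.0
```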

5-minute deployment

Access your bare-metal GPU server within 5 minutes, once your payment is verified.

24/7 Support

Instant, round-the-clock support provided by a team of GPU server experts.

See Configurations

Hourly price per A100 GPU

*Pricing based on a single A100 GPU with 40GB VRAM.

Up to 4 matching GPUs per server
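Once a server is delivered, one way to confirm that it holds a matching set of GPUs is to read the output of `nvidia-smi`. The sketch below is a minimal example, assuming the NVIDIA driver is installed; the helper names and the sample output are illustrative only.

```python
import subprocess

# Real nvidia-smi query flags; memory.total is reported in MiB with "nounits".
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"]


def parse_gpus(csv_text):
    """Parse 'name, memory' CSV lines into a list of (name, memory_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, mem = [field.strip() for field in line.rsplit(",", 1)]
        gpus.append((name, int(mem)))
    return gpus


def is_matching_set(gpus, max_gpus=4):
    """True if there are 1..max_gpus GPUs and all share the same model and memory."""
    return 0 < len(gpus) <= max_gpus and len(set(gpus)) == 1


def query_local_gpus():
    """Run nvidia-smi on this machine (requires the NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpus(result.stdout)


# Illustrative output for a two-GPU server (not taken from a real machine).
sample = "NVIDIA A100-PCIE-40GB, 40960\nNVIDIA A100-PCIE-40GB, 40960\n"
print(is_matching_set(parse_gpus(sample)))  # True
```

On a live server, `is_matching_set(query_local_gpus())` performs the same check against the installed hardware.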

NVIDIA A40 / A100

NVIDIA's Ampere Architecture is the fundamental solution for AI acceleration, from the edge to the cloud. The NVIDIA A40 enables multi-workload capabilities with ultra-modern features for ray-traced rendering, VR, and more. The NVIDIA A100 Tensor Core GPU delivers unrivaled acceleration at every scale with Multi-Instance GPU (MIG) technology.

NVIDIA A40 Specifications

48 GB GDDR6 with ECC
10752 CUDA Cores
336 Tensor Cores
696 GB/s Max Bandwidth
NVIDIA GPU Boost

NVIDIA A100 Specifications

40 GB HBM2
6912 CUDA Cores
432 Tensor Cores
1555 GB/s Max Bandwidth
NVIDIA GPU Boost

Order configuration

NVIDIA H100

Unlock next-generation AI performance with the NVIDIA H100 Tensor Core GPU. Engineered on the breakthrough Hopper architecture, the H100 is purpose-built for large language models, generative AI, and complex deep learning workloads. Experience up to 9x faster AI training compared to previous generations, powered by the innovative Transformer Engine and fourth-generation Tensor Cores. With massive 80GB HBM3 memory and 3TB/s bandwidth, the H100 handles the most data-intensive AI applications with ease, making it the ideal choice for researchers, data scientists, and AI developers pushing the boundaries of machine learning.

NVIDIA H100 Specifications

80 GB HBM3 with ECC
16896 CUDA Cores
528 Tensor Cores (4th Gen)
3 TB/s Memory Bandwidth
Transformer Engine
NVIDIA GPU Boost
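To get a feel for what 80 GB of GPU memory means for large models, the sketch below estimates the memory taken by model weights alone. It is a simplification: it ignores activations, optimizer state and KV cache, it treats 1 GB as 10^9 bytes, and the multi-GPU case assumes the weights can be sharded evenly. The function names and model sizes are illustrative.

```python
def weights_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def fits(num_params: float, bytes_per_param: int, vram_gb: float, gpus: int = 1) -> bool:
    """True if the weights fit in the combined memory of `gpus` identical GPUs."""
    return weights_gb(num_params, bytes_per_param) <= vram_gb * gpus


# A 7-billion-parameter model in FP16 (2 bytes per parameter).
print(weights_gb(7e9, 2))          # 14.0
print(fits(7e9, 2, vram_gb=80))    # True
# A 70-billion-parameter model in FP16 needs 140 GB for weights alone.
print(fits(70e9, 2, vram_gb=80))           # False
print(fits(70e9, 2, vram_gb=80, gpus=4))   # True
```

Real workloads need headroom beyond the weights, so treat a "fits" result as a lower bound rather than a guarantee.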

Order configuration

NVIDIA RTX 6000 Pro

Transform your creative workflows with the NVIDIA RTX 6000 Pro, engineered for professionals who demand uncompromising performance. This powerhouse workstation GPU delivers stunning real-time ray tracing, AI-accelerated content creation, and seamless 8K video editing capabilities. With a massive 96GB of error-correcting memory and support for multiple high-resolution displays, the RTX 6000 Pro excels in 3D animation, visual effects, architectural visualization, and product design. Whether you're rendering cinematic scenes, developing AI-enhanced content, or streaming professional broadcasts, this GPU provides the exceptional performance and rock-solid reliability that studios, creators, and production teams depend on for their most ambitious projects.

NVIDIA RTX 6000 Pro Specifications

96 GB GDDR7 with ECC
24 064 CUDA Cores
752 Tensor Cores (5th Gen)
188 RT Cores (4th Gen)
1 792 GB/s Memory Bandwidth
NVIDIA GPU Boost

Order configuration

NVIDIA L4 / L40S

Accelerate your video streaming, AI applications, and creative workloads with the versatile NVIDIA L4 and L40S GPUs. The L4 is engineered for high-density streaming and AI inference, delivering exceptional video transcoding performance for live broadcasts, VOD platforms, and real-time content delivery with minimal latency. The L40S steps up the power for content creators and developers, providing robust support for 3D rendering, virtual production, AI-enhanced video processing, and generative AI workflows. Both GPUs excel at handling multiple simultaneous streams, making them perfect for broadcasting studios, streaming platforms, game servers, and AI development environments where efficiency and performance are critical.

NVIDIA L4 Specifications

24 GB GDDR6 with ECC
7424 CUDA Cores
232 Tensor Cores (4th Gen)
58 RT Cores (3rd Gen)
300 GB/s Memory Bandwidth

NVIDIA L40S Specifications

48 GB GDDR6 with ECC
18176 CUDA Cores
568 Tensor Cores (4th Gen)
142 RT Cores (3rd Gen)
864 GB/s Memory Bandwidth

Order configuration

NVIDIA GeForce RTX 5080 / RTX 5090

NVIDIA's latest generation of GPUs pushes the boundaries of gaming and creative performance with cutting-edge advancements in graphics rendering and AI acceleration. Built on the Blackwell architecture, these GPUs offer improved power efficiency, significantly faster ray tracing, and remarkable computational capabilities.

GeForce RTX 5080 Specifications

16 GB GDDR7
10 752 CUDA Cores
960 GB/s Memory Bandwidth

GeForce RTX 5090 Specifications

32 GB GDDR7
21 760 CUDA Cores
1 792 GB/s Memory Bandwidth

Order configuration

NVIDIA RTX 4090D

Unlock the pinnacle of gaming and creative performance with the NVIDIA GeForce RTX 4090D. Powered by the groundbreaking Ada Lovelace architecture, this GPU delivers remarkable power and efficiency for ultra-realistic graphics and immersive experiences.

RTX 4090D Specifications

24 GB GDDR6X
14 592 CUDA Cores
1 008 GB/s Max Bandwidth

Compatible with Linux, CUDA/OpenCL, DirectX, Windows.

Order configuration

NVIDIA Quadro RTX A4000 / A5000 / A6000

NVIDIA's new generation of Ampere-based GPUs brings significant improvements over the Turing Quadro RTX series. With double the processing speed for single-precision floating point (FP32) operations and greater power efficiency, the RTX A generation delivers more visually accurate renders and 2x faster ray tracing.

QUADRO RTX A4000 Specifications

16 GB GDDR6
6144 CUDA Cores
448 GB/s Max Bandwidth

QUADRO RTX A5000 Specifications

24 GB GDDR6
8192 CUDA Cores
768 GB/s Max Bandwidth

QUADRO RTX A6000 Specifications

48 GB GDDR6
10752 CUDA Cores
768 GB/s Max Bandwidth

Order configuration

NVIDIA RTX 3070 / 3080 / 3090

NVIDIA's GeForce RTX 30 Series graphics cards run on the Ampere architecture, the 2nd generation of RTX, featuring several new technologies, from faster Ray Tracing and Tensor Cores to advanced streaming multiprocessors. GDDR6X graphics memory, found on the RTX 3080 and RTX 3090, delivers remarkable performance for AI, visualization, and gaming.

RTX 3070 Specifications

8 GB GDDR6
5888 CUDA Cores
448 GB/s Max Bandwidth
NVIDIA GPU Boost

RTX 3080 Specifications

10 GB GDDR6X
8704 CUDA Cores
760 GB/s Max Bandwidth
NVIDIA GPU Boost

RTX 3090 Specifications

24 GB GDDR6X
10496 CUDA Cores
936 GB/s Max Bandwidth
NVIDIA GPU Boost

Compatible with Linux, CUDA/OpenCL, KVM, Windows.

Order configuration

NVIDIA Quadro RTX 5000 / 6000 / 8000

The NVIDIA Quadro RTX series gives you access to the well-known Turing™ chip architecture that transformed the work of millions of designers and creators. Hardware-accelerated ray tracing, state-of-the-art shading, and new AI-based capabilities enable artists to increase their rendering throughput.

QUADRO RTX 5000 Specifications

16 GB GDDR6
3072 CUDA Cores
448 GB/s Max Bandwidth
NVIDIA GPU Boost

QUADRO RTX 6000 Specifications

24 GB GDDR6
4608 CUDA Cores
672 GB/s Max Bandwidth
NVIDIA GPU Boost

QUADRO RTX 8000 Specifications

48 GB GDDR6
4608 CUDA Cores
672 GB/s Max Bandwidth
NVIDIA GPU Boost

Compatible with Linux, CUDA/OpenCL, KVM, Windows.

Order configuration

NVIDIA Quadro RTX 4000

Get access to the best performance and features from a single PCIe slot with NVIDIA's Quadro RTX 4000. State-of-the-art display and memory technologies, combined with the Turing™ chip architecture, deliver photorealistic ray-traced rendering in a fraction of a second.

QUADRO RTX 4000 Specifications

8 GB GDDR6
2304 CUDA Cores
416 GB/s Max Bandwidth
NVIDIA GPU Boost

Compatible with Linux, CUDA/OpenCL, KVM, Windows.

Order configuration

Video

Transcode up to two video streams simultaneously, through the new Turing chip architecture.

3D Rendering

Use the power of the RTX 2080 to render 3D graphics faster than ever.

Mining

Mine cryptocurrency through the new Turing chip architecture, found on the RTX 2080 and RTX 2080 Ti.

NVIDIA Tesla T4

The T4 introduces Tensor Core technology with multi-precision computing, making it up to 40 times faster than a CPU and up to 3.5 times faster than its Pascal predecessor, the Tesla P4. Get access to 8.1 TFLOPS of single precision performance from a single T4 GPU.

Specifications

TURING TU104
320 TURING TENSOR CORES
2560 CUDA CORES
16 GB GDDR6
8.1 TFLOPS SINGLE PRECISION
65 FP16 TFLOPS
130 INT8 TOPS
260 INT4 TOPS
320 GB/s Max Bandwidth
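The multi-precision figures above can be put side by side by dividing each peak rate by the FP32 rate. The sketch below does only that arithmetic; it compares peak numbers from the list above, not measured application performance, and the names are illustrative.

```python
# Peak throughput from the T4 specification list (TFLOPS for floating point, TOPS for integer).
T4_PEAK = {"FP32": 8.1, "FP16": 65.0, "INT8": 130.0, "INT4": 260.0}


def speedup_over_fp32(peaks):
    """Ratio of each precision's peak rate to the FP32 peak, rounded to one decimal."""
    base = peaks["FP32"]
    return {mode: round(value / base, 1) for mode, value in peaks.items()}


for mode, ratio in speedup_over_fp32(T4_PEAK).items():
    print(f"{mode}: {ratio}x")
# FP32: 1.0x
# FP16: 8.0x
# INT8: 16.0x
# INT4: 32.1x
```

Each step down in precision doubles the peak rate, which is why inference workloads that tolerate INT8 or INT4 quantization benefit most from this card.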

Compatible: VMware ESXi, Citrix XenServer, KVM, Linux, Windows.

Order configuration

The Coral USB Accelerator

You can now add an Edge TPU coprocessor to any Linux-based system with the Coral USB Accelerator designed by Google. The small ASIC chip provides high-performance ML inferencing at a low power cost. For example, it can run MobileNet v2 models at 100 fps while using very little power.

Specifications

Arm 32-bit Cortex-M0+ microprocessor at 32 MHz
Edge TPU ASIC (for TensorFlow Lite models)
USB 3.1 (5 Gb/s transfer speed)

Compatible with Linux machines running Debian 6.0 or higher, or any derivative (such as Ubuntu 10.0+), as well as with Raspberry Pi (2/3 Model B/B+).

NVIDIA GeForce RTX 2080 / RTX 2080 Ti

NVIDIA's Turing chip architecture delivers up to six times the performance of previous-generation GPUs, with breakthrough technologies and next-generation, ultra-fast GDDR6 memory.

RTX 2080 Specifications

8 GB GDDR6
2944 CUDA Cores
448 GB/s Max Bandwidth
NVIDIA GPU Boost 4.0

RTX 2080 Ti Specifications

11 GB GDDR6
4352 CUDA Cores
616 GB/s Max Bandwidth
NVIDIA GPU Boost 4.0

Compatible with Linux, CUDA/OpenCL, KVM.

Order configuration

NVIDIA GeForce GTX 1080 / 1070 Ti

NVIDIA's previous-generation Pascal architecture is great for mining, graphics rendering and computing, delivering excellent performance at a budget-friendly price.

Specifications

8 GB GDDR5X (GTX 1080) or GDDR5 (GTX 1070 Ti)
Up to 2560 CUDA Cores
Up to 320 GB/s Max Bandwidth
NVIDIA GPU Boost 3.0

Compatible with Linux, CUDA/OpenCL, KVM.

Order configuration

NVIDIA TESLA P4 / P40 / P100

An optimal chip for machine learning and video transcoding. NVIDIA's Pascal chip architecture has been proven to be faster and more power efficient than its Maxwell predecessor. Transcode up to 20 simultaneous video streams with a single Tesla P4.

Specifications

Pascal GP100 or GP104 chip
Up to 3584 CUDA cores
Up to 16 GB CoWoS HBM2
Enterprise grade hardware

Compatible: VMware ESXi, Citrix XenServer, KVM, Linux, Windows.

Order configuration

NVIDIA TITAN V

The first GPU to break the 100 teraflop barrier of deep learning performance. NVIDIA's Volta chip is up to 3x faster than its Pascal predecessor. Your deep learning project can now become a reality with little investment.

Specifications

NVIDIA Volta Chip
5120 CUDA cores
640 Tensor Cores
12 GB CoWoS Stacked HBM2
653 GB/s max bandwidth

Compatible: VMware ESXi, Citrix XenServer, KVM, Linux, Windows.

Order configuration

Why Primcast?

Add a GPU to HP enterprise hardware designed specifically for use with GPU add-ons, eliminating incompatibility issues and hardware underperformance. Your services are deployed on our global low-latency network, backed by a 99.9% uptime SLA and supported by GPU server experts around the clock.
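For context, a 99.9% uptime figure translates into a small downtime budget. The sketch below works out that budget, assuming the SLA is measured over a 30-day month; the measurement window and the function name are assumptions, since the SLA terms are not spelled out here.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Maximum downtime, in minutes, that still meets the SLA over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


# 99.9% over an assumed 30-day month.
print(round(allowed_downtime_minutes(99.9), 1))                 # 43.2
# The same SLA over a 365-day year, expressed in hours.
print(round(allowed_downtime_minutes(99.9, days=365) / 60, 2))  # 8.76
```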