50% off 1st month on Instant Servers - code 50OFF Sales: +1‑917‑284‑6090

GPU dedicated servers

GPU-Driven Infrastructure.
Unleash your AI and ML potential with scalable GPU servers optimized for performance and cost-efficiency.

HP enterprise servers

Your GPU configuration is installed on Hewlett Packard Enterprise servers, stress tested for 100% compatibility and stability.

Choose your data center

Get a GPU dedicated server, deployed in one of our New York, Miami, San Francisco, Amsterdam or Bucharest data centers.

Low latency network

Your server is connected to a custom-built, low latency global network.

Support

Get access to instant support, from real humans, available around the clock via phone or live chat.

Get instant access to affordable GPU dedicated servers

Unbeatable prices

Found it somewhere else cheaper? Take 10% off the lowest advertised price. Contact us for details.

5-minute deployment

Access your bare-metal GPU server within 5 minutes, once your payment is verified.

24/7 Support

Instant, round-the-clock support provided by a team of GPU server experts.

See Configurations

Hourly price per A100 GPU

[Chart: GPU cost comparison]

*Pricing based on a single A100 GPU with 40GB VRAM.

Up to 4 matching GPUs per server

NVIDIA A40 / A100

NVIDIA's Ampere Architecture is the fundamental solution for AI acceleration, from the edge to the cloud. The NVIDIA A40 enables multi-workload capabilities with ultra-modern features for ray-traced rendering, VR, and more. The NVIDIA A100 Tensor Core GPU delivers unrivaled acceleration at every scale with Multi-Instance GPU (MIG) technology.

NVIDIA A40 Specifications

  • 48 GB GDDR6 with ECC
  • 10752 CUDA Cores
  • 336 Tensor Cores
  • 696 GB/s Max Bandwidth
  • NVIDIA GPU Boost

NVIDIA A100 Specifications

  • 40 GB HBM2 with ECC
  • 6912 CUDA Cores
  • 432 Tensor Cores
  • 1555 GB/s Max Bandwidth
  • NVIDIA GPU Boost
Order configuration
NVIDIA H100

Unlock next-generation AI performance with the NVIDIA H100 Tensor Core GPU. Engineered on the breakthrough Hopper architecture, the H100 is purpose-built for large language models, generative AI, and complex deep learning workloads. Experience up to 9x faster AI training compared to previous generations, powered by the innovative Transformer Engine and fourth-generation Tensor Cores. With massive 80GB HBM3 memory and 3TB/s bandwidth, the H100 handles the most data-intensive AI applications with ease, making it the ideal choice for researchers, data scientists, and AI developers pushing the boundaries of machine learning.

NVIDIA H100 Specifications

  • 80 GB HBM3 with ECC
  • 16896 CUDA Cores
  • 528 Tensor Cores (4th Gen)
  • 3 TB/s Memory Bandwidth
  • Transformer Engine
  • NVIDIA GPU Boost
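
As a rough illustration of what 80 GB of GPU memory means for large language models, the sketch below estimates the size of a model's weights from its parameter count and checks it against a card's memory. The function names, the 2 bytes per parameter (FP16) default and the 80% headroom factor are assumptions made for this example, not figures from NVIDIA or Primcast. Real workloads also need memory for activations, optimizer state and caches, so treat the result as a lower bound.

```python
def weights_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    # (params_billion * 1e9 parameters) * bytes_per_param / 1e9 bytes per GB
    return params_billion * bytes_per_param


def fits_in_vram(params_billion: float, vram_gb: float,
                 bytes_per_param: int = 2, headroom: float = 0.8) -> bool:
    """True if the weights alone fit within a fraction (headroom) of the VRAM.

    The headroom factor is an assumed safety margin for everything that is
    not weights; it is not a measured value.
    """
    return weights_gb(params_billion, bytes_per_param) <= vram_gb * headroom


if __name__ == "__main__":
    # Hypothetical model sizes, checked against a single 80 GB card.
    for size in (7, 13, 70):
        print(f"{size}B parameters: ~{weights_gb(size):.0f} GB of FP16 weights, "
              f"fits in 80 GB: {fits_in_vram(size, 80)}")
```

Passing bytes_per_param=1 models 8-bit quantized weights; the same check can be run against any memory size listed on this page.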
Order configuration
NVIDIA RTX 6000 Pro

Transform your creative workflows with the NVIDIA RTX 6000 Pro, engineered for professionals who demand uncompromising performance. This powerhouse workstation GPU delivers stunning real-time ray tracing, AI-accelerated content creation, and seamless 8K video editing capabilities. With a massive 96GB of error-correcting memory and support for multiple high-resolution displays, the RTX 6000 Pro excels in 3D animation, visual effects, architectural visualization, and product design. Whether you're rendering cinematic scenes, developing AI-enhanced content, or streaming professional broadcasts, this GPU provides the exceptional performance and rock-solid reliability that studios, creators, and production teams depend on for their most ambitious projects.

NVIDIA RTX 6000 Pro Specifications

  • 96 GB GDDR7 with ECC
  • 24 064 CUDA Cores
  • 752 Tensor Cores (5th Gen)
  • 188 RT Cores (4th Gen)
  • 1 792 GB/s Memory Bandwidth
  • NVIDIA GPU Boost
Order configuration
NVIDIA L4 / L40S

Accelerate your video streaming, AI applications, and creative workloads with the versatile NVIDIA L4 and L40S GPUs. The L4 is engineered for high-density streaming and AI inference, delivering exceptional video transcoding performance for live broadcasts, VOD platforms, and real-time content delivery with minimal latency. The L40S steps up the power for content creators and developers, providing robust support for 3D rendering, virtual production, AI-enhanced video processing, and generative AI workflows. Both GPUs excel at handling multiple simultaneous streams, making them perfect for broadcasting studios, streaming platforms, game servers, and AI development environments where efficiency and performance are critical.

NVIDIA L4 Specifications

  • 24 GB GDDR6 with ECC
  • 7424 CUDA Cores
  • 232 Tensor Cores (4th Gen)
  • 58 RT Cores (3rd Gen)
  • 300 GB/s Memory Bandwidth

NVIDIA L40S Specifications

  • 48 GB GDDR6 with ECC
  • 18176 CUDA Cores
  • 568 Tensor Cores (4th Gen)
  • 142 RT Cores (3rd Gen)
  • 864 GB/s Memory Bandwidth
Order configuration
NVIDIA GeForce RTX 5080 / RTX 5090

NVIDIA's latest generation of GPUs pushes the boundaries of gaming and creative performance with cutting-edge advancements in graphics rendering and AI acceleration. Built on the Blackwell architecture, these GPUs offer improved power efficiency, significantly faster ray tracing, and remarkable computational capabilities.

GeForce RTX 5080 Specifications

  • 16 GB GDDR7
  • 10 752 CUDA Cores
  • 960 GB/s Max Bandwidth

GeForce RTX 5090 Specifications

  • 32 GB GDDR7
  • 21 760 CUDA Cores
  • 1 792 GB/s Max Bandwidth
Order configuration
NVIDIA GeForce RTX 4090D

Unlock the pinnacle of gaming and creative performance with the NVIDIA GeForce RTX 4090D. Powered by the groundbreaking Ada Lovelace architecture, this GPU delivers remarkable power and efficiency for ultra-realistic graphics and immersive experiences.

RTX 4090D Specifications

  • 24 GB GDDR6X
  • 14 592 CUDA Cores
  • 1 008 GB/s Max Bandwidth

Compatible with Linux, CUDA/OpenCL, DirectX, Windows.
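
Once a server is deployed on Linux, one way to confirm that the installed card and its memory match the configuration you ordered is to query the NVIDIA driver with nvidia-smi. The following is a minimal sketch, assuming the NVIDIA driver and the nvidia-smi tool are already installed; the helper names are hypothetical and not part of any Primcast tooling.

```python
import subprocess

# nvidia-smi query: one CSV line per GPU, "name, total memory in MiB".
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_list(text: str) -> list[tuple[str, int]]:
    """Turn nvidia-smi CSV output into (name, memory_mib) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        # Split on the last comma so a comma inside the name cannot break parsing.
        name, mem_mib = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mem_mib.strip())))
    return gpus


def installed_gpus() -> list[tuple[str, int]]:
    """Run nvidia-smi and return the GPUs it reports."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    try:
        for name, mem_mib in installed_gpus():
            print(f"{name}: {mem_mib} MiB")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"Could not query GPUs: {exc}")
```

The driver reports memory in MiB, so the value shown for a 24 GB card will be a little below 24 × 1024 once reserved memory is accounted for.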

Order configuration
NVIDIA Quadro RTX A4000 / A5000 / A6000

NVIDIA's new generation of Ampere-based GPUs brings significant improvements over the Turing Quadro RTX series. With double the processing speed for single-precision floating point (FP32) operations and greater power efficiency, the RTX A generation delivers more visually accurate renders and 2x faster ray tracing.

QUADRO RTX A4000 Specifications

  • 16 GB GDDR6
  • 6144 CUDA Cores
  • 448 GB/s Max Bandwidth

QUADRO RTX A5000 Specifications

  • 24 GB GDDR6
  • 8192 CUDA Cores
  • 768 GB/s Max Bandwidth

QUADRO RTX A6000 Specifications

  • 48 GB GDDR6
  • 10752 CUDA Cores
  • 768 GB/s Max Bandwidth
Order configuration
NVIDIA GeForce RTX 3070 / 3080 / 3090

NVIDIA's GeForce RTX 30 Series graphics cards run on the Ampere architecture, the 2nd generation of RTX, featuring several new technologies, from faster Ray Tracing and Tensor Cores to advanced streaming multiprocessors. The world's fastest graphics memory, GDDR6X, delivers remarkable performance perfect for AI, visualization, and gaming.

RTX 3070 Specifications

  • 8 GB GDDR6
  • 5888 CUDA Cores
  • 448 GB/s Max Bandwidth
  • NVIDIA GPU Boost

RTX 3080 Specifications

  • 10 GB GDDR6X
  • 8704 CUDA Cores
  • 760 GB/s Max Bandwidth
  • NVIDIA GPU Boost

RTX 3090 Specifications

  • 24 GB GDDR6X
  • 10496 CUDA Cores
  • 936 GB/s Max Bandwidth
  • NVIDIA GPU Boost

Compatible with Linux, CUDA/OpenCL, KVM, Windows.
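
Maximum bandwidth figures like the ones listed for the RTX 3080 and RTX 3090 follow from simple arithmetic: peak memory bandwidth is the memory bus width multiplied by the per-pin data rate, divided by 8 to convert bits to bytes. The sketch below assumes the commonly published bus widths and data rates for these two cards (320-bit at 19 Gbps and 384-bit at 19.5 Gbps); those inputs are not listed on this page and the function name is made up for the example.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits: width of the memory interface in bits
    data_rate_gbps: effective data rate per pin in Gbit/s
    """
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


# Assumed inputs (not taken from this page): (bus width in bits, Gbps per pin).
CARDS = {
    "RTX 3080": (320, 19.0),
    "RTX 3090": (384, 19.5),
}

if __name__ == "__main__":
    for card, (bus, rate) in CARDS.items():
        print(f"{card}: {peak_bandwidth_gb_s(bus, rate):.0f} GB/s")
```

This is a theoretical peak; the bandwidth an application actually sees depends on its access pattern.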

Order configuration
NVIDIA Quadro RTX 5000 / 6000 / 8000

The NVIDIA Quadro RTX series gives you access to the well-known Turing™ chip architecture that transformed the work of millions of designers and creators. Hardware-accelerated ray tracing, state-of-the-art shading, and new AI-based capabilities enable artists to render more, faster.

QUADRO RTX 5000 Specifications

  • 16 GB GDDR6
  • 3072 CUDA Cores
  • 448 GB/s Max Bandwidth
  • NVIDIA GPU Boost

QUADRO RTX 6000 Specifications

  • 24 GB GDDR6
  • 4608 CUDA Cores
  • 672 GB/s Max Bandwidth
  • NVIDIA GPU Boost

QUADRO RTX 8000 Specifications

  • 48 GB GDDR6
  • 4608 CUDA Cores
  • 672 GB/s Max Bandwidth
  • NVIDIA GPU Boost

Compatible with Linux, CUDA/OpenCL, KVM, Windows.

Order configuration
NVIDIA Quadro RTX 4000

Get access to the best performance and features from a single PCIe slot with NVIDIA's Quadro RTX 4000. State-of-the-art display and memory technologies, combined with the Turing™ chip architecture, deliver photorealistic ray-traced rendering in a fraction of a second.

QUADRO RTX 4000 Specifications

  • 8 GB GDDR6
  • 2304 CUDA Cores
  • 416 GB/s Max Bandwidth
  • NVIDIA GPU Boost

Compatible with Linux, CUDA/OpenCL, KVM, Windows.

Order configuration

Video

Transcode up to two video streams simultaneously, through the new Turing chip architecture.

3D Rendering

Use the power of the RTX 2080 to render 3D graphics faster than ever.

Mining

Mine cryptocurrency through the new Turing chip architecture, found on the RTX 2080 and RTX 2080 Ti.

NVIDIA Tesla T4

The T4 introduces Tensor Core technology with multi-precision computing, making it up to 40 times faster than a CPU and up to 3.5 times faster than its Pascal predecessor, the Tesla P4. Get access to 8.1 TFLOPS of single precision performance from a single T4 GPU.

Specifications

  • TURING TU104
  • 320 TURING TENSOR CORES
  • 2560 CUDA CORES
  • 16 GB GDDR6
  • 8.1 TFLOPS SINGLE PRECISION
  • 65 FP16 TFLOPS
  • 130 INT8 TOPS
  • 260 INT4 TOPS
  • 320 GB/s Max Bandwidth
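
The 8.1 TFLOPS single-precision figure can be reproduced from the core count: each CUDA core can complete one fused multiply-add, counted as 2 floating-point operations, per clock cycle. A minimal sketch, assuming a boost clock of about 1.59 GHz, which is not stated on this page; the function name is made up for the example.

```python
def peak_fp32_tflops(cuda_cores: int, boost_clock_ghz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    Each core is assumed to issue one fused multiply-add (2 FLOPs) per cycle.
    cores * 2 * GHz gives GFLOPS; dividing by 1000 gives TFLOPS.
    """
    return cuda_cores * 2 * boost_clock_ghz / 1000


if __name__ == "__main__":
    # 2560 CUDA cores is from the list above; 1.59 GHz is an assumed boost clock.
    print(f"T4 peak FP32: {peak_fp32_tflops(2560, 1.59):.1f} TFLOPS")
```

The listed FP16, INT8 and INT4 numbers come from the Tensor Cores rather than the CUDA cores, which is why they are not simple multiples of this formula's inputs.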

Compatible: VMware ESXi, Citrix XenServer, KVM, Linux, Windows.

Order configuration
The Coral USB Accelerator

You can now add an Edge TPU coprocessor to any Linux-based system with the Coral USB Accelerator designed by Google. The small ASIC chip provides high-performance ML inferencing at a low power cost. For example, it can run MobileNet v2 models at 100 fps while using very little power.

Specifications

  • Arm 32-bit Cortex-M0+ microcontroller, up to 32 MHz
  • Edge TPU ASIC (for TensorFlow Lite models)
  • USB 3.1 (Gen 1), 5 Gb/s transfer speed

Compatible with Linux machines running Debian 6.0 or higher or any derivative (such as Ubuntu 10.0+), as well as with Raspberry Pi (2/3 Model B/B+).

NVIDIA GeForce RTX 2080 / RTX 2080 Ti

NVIDIA's Turing chip architecture delivers up to six times the performance of previous-generation GPUs, with breakthrough technologies and next-generation, ultra-fast GDDR6 memory.

RTX 2080 Specifications

  • 8 GB GDDR6
  • 2944 CUDA Cores
  • 448 GB/s Max Bandwidth
  • NVIDIA GPU Boost 4.0

RTX 2080 Ti Specifications

  • 11 GB GDDR6
  • 4352 CUDA Cores
  • 616 GB/s Max Bandwidth
  • NVIDIA GPU Boost 4.0

Compatible with Linux, CUDA/OpenCL, KVM.

Order configuration
NVIDIA GeForce GTX 1080 / 1070 Ti

NVIDIA's previous-generation Pascal architecture is great for mining, graphics rendering and computing, delivering excellent performance at a budget-friendly price.

Specifications

  • 8 GB GDDR5X
  • 2560 CUDA Cores
  • 320 GB/s Max Bandwidth
  • NVIDIA GPU Boost 3.0

Compatible with Linux, CUDA/OpenCL, KVM.

Order configuration
NVIDIA Tesla P4 / P40 / P100

An optimal chip for machine learning and video transcoding. NVIDIA's Pascal chip architecture has been proven to be faster and more power efficient than its Maxwell predecessor. Transcode up to 20 simultaneous video streams with a single Tesla P4.

Specifications

  • Pascal GP100 or GP104 chip
  • Up to 3584 CUDA cores
  • Up to 16 GB CoWoS HBM2
  • Enterprise grade hardware

Compatible: VMware ESXi, Citrix XenServer, KVM, Linux, Windows.

Order configuration
NVIDIA TITAN V

The first GPU to break the 100 teraflop barrier of deep learning performance. NVIDIA's Volta chip is up to 3x faster than its Pascal predecessor. Your deep learning project can now become a reality with little investment.

Specifications

  • NVIDIA Volta Chip
  • 5120 CUDA cores
  • 640 Tensor Cores
  • 12 GB CoWoS Stacked HBM2
  • 653 GB/s max bandwidth

Compatible: VMware ESXi, Citrix XenServer, KVM, Linux, Windows.

Order configuration

Why Primcast?

Add a GPU to HP enterprise hardware designed specifically for use with GPU add-ons, eliminating incompatibility issues and hardware underperformance. Your services are deployed on our global low-latency network, backed by a 99.9% uptime SLA and supported by GPU server experts around the clock.