Stockholm, Sweden

Dedicated Apple Silicon

Bare-metal Mac Studio and Mac mini hosting. Full root access, no shared resources. Built for AI inference, CI/CD, and heavy workloads.

Up to 512 GB unified memory
Up to 80-core GPU
Stockholm datacenter
Models up to 235B parameters

Available machines

Every machine is dedicated to a single customer. No virtualisation, no noisy neighbours.

Mac Studio M3 Ultra

Limited availability: Apple has discontinued this configuration.

Chip: Apple M3 Ultra
Memory: 512 GB unified
GPU: 80-core
Neural Engine: 32-core
Storage: 2 TB SSD
Connectivity: Thunderbolt 5, 10 GbE
$990/month
Request access

Mac Studio M4 Max

Chip: Apple M4 Max
Memory: 128 GB unified
GPU: 40-core
Neural Engine: 16-core
Storage: 1 TB SSD
Connectivity: Thunderbolt 5, 10 GbE
$400/month
Request access

Weekly rental available. Custom setup on request. All machines include full root access via SSH and remote desktop.

Inference API

OpenAI-compatible endpoint running on our hardware. Drop-in replacement for any OpenAI client library.

Example request
curl https://api.gpu-io.net/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3:30b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
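
For readers who prefer Python to curl, the same request can be sketched with the standard library alone. The URL, headers, and payload mirror the curl example above. The function names are illustrative, and the response parsing assumes the endpoint returns the usual OpenAI chat-completions shape (choices[0].message.content), which this page does not spell out.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.gpu-io.net/v1/chat/completions"


def build_chat_request(api_key: str, prompt: str, model: str = "qwen3:30b") -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(api_key: str, prompt: str, model: str = "qwen3:30b") -> str:
    """Send the request and return the first reply.

    Assumption: the response follows the OpenAI chat-completions shape
    (choices[0].message.content). Adjust if the actual payload differs.
    """
    request = build_chat_request(api_key, prompt, model)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling chat("YOUR_API_KEY", "Hello") would send the request. build_chat_request is kept separate so the payload can be inspected without touching the network.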

Models

Model           Parameters   Throughput
qwen3:235b      235B         ~45 tok/s
qwen3:30b       30B          ~91 tok/s
qwen3.5:27b     27B          ~85 tok/s
gemma4:26b      26B          ~88 tok/s
mistral-small   23B          ~95 tok/s
medgemma:27b    27B          ~82 tok/s
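
To put the throughput figures in practical terms, here is a rough estimate of how long a response takes to generate. It assumes the listed tok/s values describe generation speed for a single request, and it ignores prompt processing and network latency, so treat the results as lower bounds rather than measurements. The function name is illustrative.

```python
# Approximate throughput figures copied from the model list (tokens per second).
THROUGHPUT_TOK_S = {
    "qwen3:235b": 45,
    "qwen3:30b": 91,
    "qwen3.5:27b": 85,
    "gemma4:26b": 88,
    "mistral-small": 95,
    "medgemma:27b": 82,
}


def estimate_generation_seconds(model: str, output_tokens: int) -> float:
    """Estimate generation time as output tokens divided by tokens per second."""
    return output_tokens / THROUGHPUT_TOK_S[model]


# Compare how long each model would need for a 1,000-token answer.
for name in THROUGHPUT_TOK_S:
    print(f"{name}: ~{estimate_generation_seconds(name, 1000):.1f} s per 1,000 tokens")
```

On these numbers a 1,000-token answer takes roughly twice as long on qwen3:235b as on the ~30B models.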

Custom model deployments available on dedicated machines. We can load any GGUF or MLX model.

Pricing

              Dedicated Machine                Inference API
Starting at   $400/month                       $0.50 / 1M tokens
Access        Full root (SSH + VNC)            REST API (OpenAI-compatible)
Resources     Entire machine, no sharing       Shared infrastructure
Models        Any (you control the machine)    Pre-loaded selection
Best for      Heavy workloads, custom setups   Occasional inference, prototyping
Commitment    Weekly or monthly                Pay as you go
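
A quick way to choose between the two options is to work out the monthly token volume at which API spend equals a machine's rental price. The sketch below uses only the prices listed on this page and assumes a flat $0.50 per 1M tokens for every model and token type. It does not check whether one machine can actually sustain that volume. The function names are illustrative.

```python
# Prices taken from this page; the flat per-token rate is an assumption.
API_USD_PER_MILLION_TOKENS = 0.50
DEDICATED_USD_PER_MONTH = {
    "Mac Studio M4 Max": 400,
    "Mac Studio M3 Ultra": 990,
}


def api_cost_usd(tokens_per_month: int) -> float:
    """Monthly API cost for a given token volume."""
    return tokens_per_month / 1_000_000 * API_USD_PER_MILLION_TOKENS


def break_even_tokens(machine_usd_per_month: float) -> int:
    """Monthly token volume at which API spend equals the machine price."""
    return round(machine_usd_per_month / API_USD_PER_MILLION_TOKENS * 1_000_000)


for machine, price in DEDICATED_USD_PER_MONTH.items():
    print(f"{machine}: break-even at {break_even_tokens(price):,} tokens/month")
```

On these prices the API is cheaper below about 800 million tokens per month for the M4 Max and below about 1.98 billion for the M3 Ultra. Above those volumes, or when you need custom models or root access, a dedicated machine becomes the better fit.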

Get started

Tell us what you need. We typically provision machines within 24 hours.

Contact us