Stockholm, Sweden
Dedicated Apple Silicon
Bare-metal Mac Studio and Mac mini hosting. Full root access, no shared resources. Built for AI inference, CI/CD, and heavy workloads.
Available machines
Every machine is dedicated to a single customer. No virtualisation, no noisy neighbours.
Mac Studio M3 Ultra
Limited availability — Apple discontinued this configuration
| Chip | Apple M3 Ultra |
| Memory | 512 GB unified |
| GPU | 80-core |
| Neural Engine | 32-core |
| Storage | 2 TB SSD |
| Connectivity | Thunderbolt 5, 10 GbE |
$990 /month
Request access

Mac Studio M4 Max
| Chip | Apple M4 Max |
| Memory | 128 GB unified |
| GPU | 40-core |
| Neural Engine | 16-core |
| Storage | 1 TB SSD |
| Connectivity | Thunderbolt 5, 10 GbE |
$400 /month
Request access

Weekly rental available. Custom setup on request. All machines include full root access via SSH and remote desktop.
Inference API
OpenAI-compatible endpoint running on our hardware. Drop-in replacement for any OpenAI client library.
Example request
```shell
curl https://api.gpu-io.net/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3:30b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Models
| Model | Parameters | Throughput |
|---|---|---|
| qwen3:235b | 235B | ~45 tok/s |
| qwen3:30b | 30B | ~91 tok/s |
| qwen3.5:27b | 27B | ~85 tok/s |
| gemma4:26b | 26B | ~88 tok/s |
| mistral-small | 23B | ~95 tok/s |
| medgemma:27b | 27B | ~82 tok/s |
Custom model deployments available on dedicated machines. We can load any GGUF or MLX model.
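Because the endpoint follows the OpenAI chat-completions request shape, a call can be assembled with nothing beyond the Python standard library. A minimal sketch, reusing the placeholder key and model name from the curl example above:

```python
import json
import urllib.request

API_BASE = "https://api.gpu-io.net/v1"  # endpoint from the curl example
API_KEY = "YOUR_API_KEY"                # placeholder; substitute a real key

def build_chat_request(model: str, messages: list[dict]) -> urllib.request.Request:
    """Assemble an OpenAI-style chat-completions request (without sending it)."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("qwen3:30b", [{"role": "user", "content": "Hello"}])
# Sending requires a valid key: urllib.request.urlopen(req)
```

The same payload works unchanged with any OpenAI client library by overriding its base URL to point at `api.gpu-io.net`.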
Pricing
| | Dedicated Machine | Inference API |
|---|---|---|
| Starting at | $400/month | $0.50 / 1M tokens |
| Access | Full root (SSH + VNC) | REST API (OpenAI-compatible) |
| Resources | Entire machine, no sharing | Shared infrastructure |
| Models | Any — you control the machine | Pre-loaded selection |
| Best for | Heavy workloads, custom setups | Occasional inference, prototyping |
| Commitment | Weekly or monthly | Pay as you go |
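One way to read the table: at the listed rates, the entry-level dedicated machine and the API cost the same at 800 million tokens per month, since $400 ÷ ($0.50 per 1M tokens) = 800M. Below that volume, pay-as-you-go is cheaper; above it, the machine wins. A quick check of the arithmetic:

```python
# Break-even volume between the $400/month machine and the $0.50/1M-token API,
# using the prices from the table above.
machine_per_month = 400.00       # USD, entry dedicated machine
api_per_million_tokens = 0.50    # USD per 1M tokens via the API

break_even_tokens = machine_per_month / api_per_million_tokens * 1_000_000
print(f"{break_even_tokens:,.0f} tokens/month")  # 800,000,000 tokens/month
```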