# Dream Server Hardware Guide

*Last updated: 2026-02-09 (merged M6 value research)*

What to buy for local AI at different budgets.

---

## TL;DR Recommendations

| Tier | GPU | RAM | What You Get |
|------|-----|-----|--------------|
| Lightweight (any) | Any GPU or CPU-only | 4GB+ | 2B model, personal chat |
| Entry ($800-1,200) | RTX 3060 12GB | 32GB | 7B-14B models, basic chat |
| Prosumer ($2,000-3,000) | RTX 4070 Ti Super 16GB | 64GB | 32B models, voice, 5-8 users |
| Pro ($4,000-6,000) | RTX 4090 24GB | 128GB | 70B models, 10-20 users |
| Enterprise ($12,000-18,000) | 2x RTX 4090 | 256GB | 40+ concurrent users |

---

## Tier 1: Entry ($800-1,200)

**Goal:** Get started with local AI, personal use

### Recommended Build

- **GPU:** RTX 3060 12GB (used: $200-250)
- **CPU:** Any modern 6+ core (i5-12400, Ryzen 5 5600)
- **RAM:** 32GB DDR4
- **Storage:** 500GB NVMe SSD
- **PSU:** 550W 80+ Bronze

### What Runs

- ✅ 7B-14B models (Qwen2.5-7B, Llama-3-8B)
- ✅ Basic voice (Whisper small/medium)
- ✅ Single user, personal projects
- ⚠️ Slow with complex prompts (~30 tok/s)

### Buy Used

Look for:

- Dell Precision/HP Z workstations with RTX 3060
- Avoid: GTX cards (no tensor cores)

---

## Tier 2: Prosumer ($2,000-3,000)

**Goal:** Serious local AI, small team use

### Recommended Build

- **GPU:** RTX 4070 Ti Super 16GB ($800) or RTX 4080 16GB ($1000)
- **CPU:** i7-13700 or Ryzen 7 7700X
- **RAM:** 64GB DDR5
- **Storage:** 1TB NVMe Gen4
- **PSU:** 750W 80+ Gold

### What Runs

- ✅ 32B AWQ quantized models (Qwen2.5-32B-AWQ)
- ✅ Full voice pipeline (Whisper medium + Kokoro)
- ✅ 5-8 concurrent users
- ✅ ~50-60 tok/s generation

### Best Value

RTX 4070 Ti Super at $800 is the sweet spot for:

- 16GB VRAM (critical for 32B models)
- Good efficiency (285W TDP)
- DLSS 3 for future-proofing

---

## Tier 3: Pro ($4,000-6,000)

**Goal:** Production workloads, growing business

### Recommended Build

- **GPU:** RTX 4090 24GB ($1800-2000)
- **CPU:** i9-14900K or Ryzen 9 7950X
- **RAM:** 128GB DDR5
- **Storage:** 2TB NVMe Gen4
- **PSU:** 1000W 80+ Platinum
- **Cooling:** AIO or custom loop (4090 runs hot)

### What Runs

- ✅ 70B AWQ models (Llama-3-70B-AWQ, Qwen2.5-72B-AWQ)
- ✅ Multiple models simultaneously
- ✅ 10-15 concurrent users
- ✅ Full RAG + embeddings + voice

### Alternative: Dual 4070 Ti Super

Two RTX 4070 Ti Super cards (32GB total) can be better than one 4090 for:

- Running separate specialized models
- Redundancy
- But: More complex setup, higher power

---

## Tier 4: Enterprise ($12,000-18,000)

**Goal:** Full production, organization-wide

### Option A: Dual RTX 4090

- 2x RTX 4090 (48GB VRAM total)
- Requires: PCIe bifurcation, 1500W+ PSU
- Good for: Separate model instances

### Option B: RTX 6000 Ada (48GB)

- Single GPU, 48GB VRAM
- Runs: 70B at FP16 (no quantization)
- Pro: Simpler than dual-GPU
- Con: $6000+

### Option C: Dual RTX PRO 6000 Blackwell (What We Run)

- 2x 96GB VRAM (192GB total)
- Runs: Multiple 70B models, 40+ users
- Cost: ~$15-20k total build

### Capacity (Our Real Numbers)

From M8 benchmarks on dual PRO 6000:

| Use Case | Per GPU | Both GPUs |
|----------|---------|-----------|
| Voice agents (<2s) | 10-20 | 20-40 |
| Interactive chat (<5s) | ~50 | ~100 |
| Batch processing | 100+ | 200+ |

---

## Best Value Picks (M6 Research)

Based on price/performance analysis (see `research/M6-CONSUMER-GPU-BENCHMARKS-2026-02-09.md`):

### Hidden Gem: Used RTX 3090 ($700-900)

At used prices, the RTX 3090 offers:

- 24GB VRAM (same as 4090!)
- 936 GB/s bandwidth (better than a new 4080 Super)
- Runs 32B+ models that 16GB cards can't
- ~75% of 4090 performance at ~50% cost

**Trade-off:** Higher power (350W), older architecture

### Memory Bandwidth Insight

Token generation is **memory-bound**, not compute-bound. This is why:

- RTX 3080 Ti (912 GB/s) matches newer cards in inference
- Used high-bandwidth cards punch above their weight

### Quick Value Table

| Budget | Best Pick | Why |
|--------|-----------|-----|
| $250 | Used RTX 3060 12GB | Entry, can run 7B-14B |
| $500 | Used RTX 3080 Ti 12GB | Great bandwidth for price |
| $700-900 | **Used RTX 3090** | **Best overall value** |
| $800 | New RTX 4070 Ti Super | Best new 16GB card |
| $1,600 | RTX 4090 | Maximum single-GPU |

---

## Key Specs Explained

### VRAM (Most Important)

VRAM determines which models fit. Rough guide:

| VRAM | Max Model (AWQ 4-bit) |
|------|----------------------|
| 8GB | 7B |
| 12GB | 14B |
| 16GB | 32B |
| 24GB | 70B |
| 48GB | 70B FP16 or 2x 32B |

### Memory Bandwidth

Higher bandwidth means faster token generation. Relative speed is each card's bandwidth divided by the RTX 3060's.

| GPU | Bandwidth | Relative Speed |
|-----|-----------|----------------|
| RTX 3060 | 360 GB/s | 1.0x |
| RTX 4070 Ti | 504 GB/s | 1.4x |
| RTX 4090 | 1008 GB/s | 2.8x |
| PRO 6000 | 1792 GB/s | 5.0x |

### System RAM

Rule of thumb: at least 2x your model size.

| Model | Min RAM | Recommended |
|-------|---------|-------------|
| 7B | 16GB | 32GB |
| 32B | 32GB | 64GB |
| 70B | 64GB | 128GB |

---

## What NOT to Buy

- ❌ **GTX 16xx/10xx**: No tensor cores
- ❌ **AMD discrete GPUs (RX 7900 etc.)**: ROCm support is limited; AMD Strix Halo APUs are fully supported (see README)
- ❌ **Intel Arc**: Driver problems, limited support
- ❌ **Cloud GPUs (H100/A100)**: Can't buy, rental only
- ⚠️ **8GB cards**: Limited to smaller models (2B-7B) but functional at the Lightweight tier (Tier 0)

---

## Where to Buy

### New

- Newegg, Amazon, Micro Center
- EVGA B-Stock (refurbished)
- Manufacturer direct (MSI, ASUS)

### Used

- eBay (check seller ratings)
- r/hardwareswap
- Facebook Marketplace (local pickup)
- Mining cards: Usually fine, verify fans work

---

## Power Considerations

| GPU | TDP | PSU Needed |
|-----|-----|------------|
| RTX 3060 | 170W | 550W |
| RTX 4070 Ti | 285W | 700W |
| RTX 4090 | 450W | 1000W |
| Dual 4090 | 900W | 1500W |

Add 150-200W to the GPU TDP for CPU and system overhead.

---

## Cooling

- **Single GPU:** Good case airflow is enough
- **RTX 4090:** AIO or very good air cooling (450W TDP)
- **Dual GPU:** Custom loop or enterprise chassis

---

## Summary

1. **Entry:** RTX 3060 12GB: personal use, getting started
2. **Prosumer:** RTX 4070 Ti Super 16GB: serious work, small teams
3. **Pro:** RTX 4090 24GB: production workloads, 10-20 users
4. **Enterprise:** Dual 4090: organization-wide, 40+ users

**VRAM is king.** Buy the most VRAM you can afford.

---

*Built by The Collective based on real-world testing*
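
---

## Worked Example: Bandwidth Ratios and PSU Headroom

The Memory Bandwidth and Power Considerations tables can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of Dream Server itself: the function names (`relative_speed`, `system_draw_watts`, `psu_headroom_watts`) are hypothetical, and the figures are copied from the tables in this guide. It assumes that relative speed is simply the bandwidth ratio against the RTX 3060, and that total draw is GPU TDP plus the 150-200W overhead quoted above.

```python
# Figures copied from the tables in this guide; nothing here is measured.
BANDWIDTH_GBPS = {
    "RTX 3060": 360,
    "RTX 4070 Ti": 504,
    "RTX 4090": 1008,
    "PRO 6000": 1792,
}
GPU_TDP_WATTS = {
    "RTX 3060": 170,
    "RTX 4070 Ti": 285,
    "RTX 4090": 450,
}
# CPU + system overhead range quoted in the Power Considerations section.
OVERHEAD_WATTS = (150, 200)


def relative_speed(gpu, baseline="RTX 3060"):
    """Bandwidth ratio against the baseline card, rounded to one decimal.

    Assumption: generation speed scales with memory bandwidth, which is
    the memory-bound argument made in this guide.
    """
    return round(BANDWIDTH_GBPS[gpu] / BANDWIDTH_GBPS[baseline], 1)


def system_draw_watts(gpu, gpu_count=1, overhead=OVERHEAD_WATTS):
    """Return (low, high) estimated system draw: GPU TDP plus overhead."""
    gpu_total = GPU_TDP_WATTS[gpu] * gpu_count
    return gpu_total + overhead[0], gpu_total + overhead[1]


def psu_headroom_watts(gpu, psu_watts, gpu_count=1):
    """Watts left on the PSU at the high end of the draw estimate."""
    _, high = system_draw_watts(gpu, gpu_count)
    return psu_watts - high


if __name__ == "__main__":
    for name in BANDWIDTH_GBPS:
        print(f"{name}: {relative_speed(name)}x")
    for name, psu in [("RTX 3060", 550), ("RTX 4070 Ti", 700), ("RTX 4090", 1000)]:
        low, high = system_draw_watts(name)
        print(f"{name}: {low}-{high}W draw, {psu_headroom_watts(name, psu)}W headroom on {psu}W")
```

For example, `relative_speed("RTX 4090")` returns `2.8`, and `system_draw_watts("RTX 4090", gpu_count=2)` returns `(1050, 1100)`, which is why the table pairs a dual 4090 build with a 1500W PSU. The remaining headroom is a buffer, not a precise requirement.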