I Put a Datacenter GPU in My Gaming PC for £200

37 points by puhsu

robalex

Really cool approach. I'm curious about the GPU falling off PCIe, but there are so many things it could be.

The loud GPU fan reminds me of my time on the CUDA team at NVIDIA. My co-worker was adding the fan control feature to NVML and nvidia-smi. Over the cube wall I heard a fan spinning up and down, then he popped up with a giant grin on his face. He said it was his favorite feature to work on because, the moment he had the code working, he could hear the results.

lor_louis

If anyone is interested in self-hosted LLMs, Dell OEM RTX 3090s are generally cheaper than the big-name brand variants, and I was able to get my hands on one for ~$800 CAD.

Now I need to read up more on how vLLM works, because the model sometimes starts spewing long lists of related names and adjectives. I've probably messed something up.

msfjarvis

What kind of models are you running on a 3090? I was under the impression that most useful models need at least 48 to 64 gigs of VRAM to run properly, hence the popularity of Apple M-series chips in the space due to their integrated memory design.
- ocramz
  
  Qwen3.6-27B-MTP quantized at Q5_K_M, which comes in at about 19GB VRAM
  
  and they observe 32 tokens/s inference rate on the V100. So by fitting a model with more bits per weight, the 3090 might even produce better quality (at ~10x the price of the aftermarket datacenter-grade stuff)
  - msfjarvis
    
    Qwen3.6-27B-MTP quantized at Q5_K_M [...] they observe 32 tokens/s inference rate
    
    Wow, that's pretty good. My experience with older Qwen models was much worse but I think I didn't use the right variant since there were so many on Hugging Face. Could I trouble you for a link to the version you're running? Thanks!
    
    ocramz
    
    could be this one? https://huggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF
  - lor_louis
    
    I followed Unsloth's tutorial to get Qwen 3.6 working pretty well. I already had a 3090 for gaming, so the second OEM one I got for cheap(ish) lets me run K_XL versions of Q5, and I wanted to investigate Q6 and Q8 this weekend.
- ocramz
  
  they also come pre-packaged, for that matter, but with "3 months manufacturer warranty" and off you go:
  
  https://www.ebay.com/itm/297819576914?_skw=Tesla+V100+SXM2+16GB%5C&itmmeta=01KSZ34MHVY4GWY9AJ3JW9V594&hash=item45576e1a52:g:kuUAAeSw0lVp946F&itmprp=enc%3AAQALAAAA8GfYFPkwiKCW4ZNSs2u11xAvcd881EUrq8Wfyf%2FskKSleHGaA6tHR1zTm7pRZJ1zt3OO%2F5UH8lMPUXjqHIzW4mICUYuPLqiDyxont6TWeF%2FvOtrcT30y3XD6YiQsJUpKU9Ph%2F9wQ6h0eglYNE5cjJSWlYz3BHoMdcT7QfPQM%2FAC0oPN1V62RYfC1lK7W7mNMwluDbU2zZJVNa%2BHG%2BlMFKz09yyMAViypQ8MIq1vpYt6x3dMPp9kasxvv7FfZbOrv0ByUFGP44yIdW%2FUlNZPoBOh0K9BcjVfzba7uITbmy0uQHYeCT9IUK19C%2B9Asz52pIg%3D%3D%7Ctkp%3ABk9SR4jJkuPPZw
  - nelson
    
    oh that is really tempting. I assume this doesn't have the fan hack the post here talks about.
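
For anyone wondering where the "about 19GB" figure for Qwen3.6-27B-MTP at Q5_K_M comes from, here is a rough back-of-the-envelope sketch. It only multiplies parameter count by bits per weight. The bits-per-weight values are approximate assumptions for llama.cpp-style quant types, not exact figures for this model, and `overhead_gb` is a hypothetical placeholder for KV cache and runtime buffers, which in practice depend on context length and backend.

```python
# Rough VRAM estimate for a quantized model. All constants are assumptions.

# Approximate average bits per weight for some GGUF quant types (assumed).
APPROX_BITS_PER_WEIGHT = {
    "Q5_K_M": 5.7,
    "Q6_K": 6.56,
    "Q8_0": 8.5,
}


def estimate_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Size of the quantized weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits(n_params: float, bits_per_weight: float, vram_gb: float,
         overhead_gb: float = 2.0) -> bool:
    """True if weights plus a flat, assumed overhead fit in the given VRAM."""
    return estimate_weight_gb(n_params, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    n_params = 27e9  # 27B parameters, taken from the model name
    for quant, bpw in APPROX_BITS_PER_WEIGHT.items():
        size = estimate_weight_gb(n_params, bpw)
        print(f"{quant}: ~{size:.1f} GB weights, "
              f"one 24 GB card: {fits(n_params, bpw, 24)}, "
              f"two 24 GB cards: {fits(n_params, bpw, 48)}")
```

Under these assumptions Q5_K_M lands near the quoted 19GB and fits on a single 24 GB RTX 3090, while Q8 would need the second card. Treat the numbers as a sanity check only; real usage grows with context length.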