Deploying a large-scale LLM was a real fight. DeepSeek AI was becoming the star of the moment, and its 67B model caught my eye: I was curious whether I could make it work outside of an enterprise-grade setup.
Unlike researchers at OpenAI, Google, or Meta, I don't have access to a corporate high-performance computing (HPC) cluster. No racks of NVIDIA H100 GPUs, no enterprise-scale TPU pods, no institutional grants funding my experiments. But with all the buzz around DeepSeek's low cost and efficiency, I was fascinated.
I immediately saw the potential of running a powerful LLM on my own infrastructure, where I could research, experiment, and build custom applications. It was something I wanted to explore anyway, because many of my enterprise clients were asking about on-premise deployment, and those questions had raised a lot of what-ifs in my mind.
**The key questions were…**
- Is it even feasible for an individual to deploy such a large model on a cloud platform like AWS or GCP? (A rough memory estimate is sketched after this list.)
- Can I afford the compute costs without burning through thousands of dollars?
- Will a CPU-only approach work, or is it non-viable?
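Before looking at any pricing page, a back-of-the-envelope calculation already frames the problem. The sketch below is only an approximation under simple assumptions: it counts raw weight storage for roughly 67 billion parameters and ignores KV cache, activations, and framework overhead.

```python
# Rough memory estimate for loading the weights of a ~67B-parameter model.
# Assumption: only raw weights are counted (no KV cache, activations,
# or runtime overhead), so real requirements are higher.

PARAMS = 67e9  # approximate parameter count of DeepSeek 67B

BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,   # half precision
    "int8":      1.0,   # 8-bit quantization
    "4-bit":     0.5,   # 4-bit quantization
}

for precision, bytes_per_param in BYTES_PER_PARAM.items():
    gib = PARAMS * bytes_per_param / 1024**3
    print(f"{precision:>10}: ~{gib:.0f} GiB just for the weights")
```

Even in half precision, that is roughly 125 GiB for the weights alone, far beyond any single consumer GPU, which is exactly why the cloud-versus-CPU-versus-quantization question mattered so much.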
At first, I thought cloud computing would make this relatively easy. After all, Amazon Web Services (AWS) and Google Cloud Platform (GCP) offer on-demand access to high-performance hardware. But…