DeepSeek launched a free, open-source large language model in late December, claiming it was developed in just two months at a cost of under $6 million.
Thank the fucking sky fairies, actually, because even if AI continues to mostly suck, it'd be nice if it didn't swallow up every potable lake in the process. When this shit is efficient, it's only mildly annoying instead of a complete shitstorm of failure.
While this is great, training is where the compute is spent. The news is also about R1 being trainable, still on an Nvidia cluster, but for $6M USD instead of $500M.
Here’s someone doing 200 tokens/s (for context, OpenAI doesn’t usually get above 100) on… a Raspberry Pi.
Yes, the “$75-$120 micro computer the size of a credit card” Raspberry Pi.
If all these AI models can be run directly on users’ devices, or on extremely low-end hardware, who needs large quantities of top-of-the-line GPUs?
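For a sense of what "running on a user's device" actually looks like: what fits on a Pi is one of the small distilled variants of R1 (1.5B-14B parameters, quantized), not the full 671B model. Here's a minimal sketch using the llama-cpp-python bindings; the GGUF filename and the parameter values are illustrative assumptions, not something from the posts above.

```python
# Minimal sketch: running a small quantized R1 distill on low-end hardware
# via llama-cpp-python. Assumes you've already downloaded a quantized GGUF
# file locally (the filename below is hypothetical).
from llama_cpp import Llama

llm = Llama(
    model_path="deepseek-r1-distill-qwen-1.5b-q4_k_m.gguf",  # hypothetical path
    n_ctx=2048,   # keep the context window small to fit in a Pi's RAM
    n_threads=4,  # e.g. the four cores on a Raspberry Pi 5
)

result = llm(
    "Why can a quantized 1.5B model run on a credit-card-sized computer?",
    max_tokens=128,
)
print(result["choices"][0]["text"])
```

The point of the sketch is the footprint: a 1.5B model at 4-bit quantization is on the order of a gigabyte of weights, which is why it fits on hardware this cheap at all.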