DeepSeek R1: Testing multiple sizes
0 – Introduction
In this article, I test distilled DeepSeek reasoning models ranging from 1.5B to 32B parameters, evaluating them on math and logic challenges to explore how model size influences performance. The smaller models show promise when fine-tuned with chain-of-thought techniques, demonstrating how distillation can make high-quality reasoning models both accessible and efficient. […]
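To make the setup concrete, here is a minimal sketch of how one of the distilled checkpoints can be prompted on a math question with Hugging Face transformers. The model ID, prompt, and generation settings are illustrative assumptions, not the article's actual evaluation harness.

```python
# Minimal sketch (assumed setup): prompting a distilled DeepSeek R1 checkpoint
# on a simple math question and reading back its chain-of-thought answer.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # swap in the 7B/14B/32B variants to compare sizes
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "If a train travels 120 km in 1.5 hours, what is its average speed in km/h?"
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The distilled models emit their reasoning trace before the final answer,
# so a generous token budget is needed to capture the full response.
outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```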