Tech stocks, and more specifically AI stocks, took a hit on Jan 27 as the markets processed the implications of DeepSeek-R1. For those who do not know, DeepSeek-R1 is an open-source LRM (Large Reasoning Model) released by a Chinese lab, developed with 1/50th the budget previously thought necessary. The lab did this despite the US ban on exporting high-performing GPUs to China.

Curiously, this is the question Rajan Anandan asked Sam Altman during his 2023 visit to India: could a small lab create frontier models? Neither Anandan nor Altman would have imagined it happening within 18 months. Here is a primer (as of 28 Jan 2025) for those who are interested:

Reasoning and Foundation Models – First, a distinction between reasoning models and language models. LLMs are the type of model popularized by ChatGPT: capable of large-scale understanding of language, human-like responses to questions, and generation of natural-language output. These capabilities were combined with other technical advancements and extended to other outputs, including code and images. Reasoning models, on the other hand, give considered, reasoned responses, pausing to reflect on the multiple paths by which a problem can be approached and choosing one that looks promising. o1, demonstrated by OpenAI in Sep 2024, was the first reasoning model. The buzz this month is about the speed with which a lab with limited resources has caught up with OpenAI's frontier capabilities .. within 4 months.

What goes into model capability? – Three things, essentially: compute, data, and algorithms. The current wave was produced by a breakthrough architecture called the Transformer, published by Google in 2017. OpenAI was the first lab to put it to work in ways that created what was then magical output from this new class of models. No algorithmic improvement since the 2017 Transformer paper has been considered as impactful. The data used was essentially text corpora from many publicly available sources .. the quality of the data affects the performance of the model. On the data front, there is potential to use synthetic data (data generated by computers to train models), video data, and other privately guarded data sources. The big labs have been trying to win the data race by tying up with original content copyright holders. Finally, compute: it was assumed that winning this race was a matter of getting sufficient compute (read GPUs) to throw at large amounts of data during training. NVIDIA stock rose because of this, and the US export-control bans can be seen from this perspective.

What has DeepSeek achieved? – Over the last month, DeepSeek has achieved two things. They released an LLM called DeepSeek-V3 and a reasoning model called DeepSeek-R1. They open-sourced both, meaning anyone can build on them and host/run them on their own infrastructure. They also published papers outlining in detail the research methods that went into creating these models. It is stunning to see a small lab under a lot of constraints catching up to OpenAI's o1 within 4 months or so. You may remember from the news that something code-named Strawberry or Q* created the rift between Ilya Sutskever and Sam Altman at OpenAI, prompting rumours about AI safety and bringing big changes to the company. Maybe OpenAI slowed down as it handled these controversies .. but whatever the situation, the open-source world has caught up with the frontier within 4 months. And they did it with far less compute, through a lot of smart algorithmic and methodological improvements to training .. and given that they have published these, the doors to reproduction worldwide are wide open.

Impact on AI stocks – NVIDIA fell 17%, Microsoft fell 2.1%, and Google (Alphabet) fell 4.2% yesterday. SoftBank fell around 8%. The big impact seems to be on the hardware and model providers who benefited from the recent boom. It is not clear which way the long-term impact will go. The logic behind NVIDIA (and other semiconductor stocks) falling is that these new methods structurally reduce the compute needed to achieve better performance. However, more advanced use cases could open up given the new models, and that might spur more demand for compute. We need to wait and watch how this evolves. There is a structural question of how much value gets created or eroded in the market because of open source. Much of the value of the leading tech giants lies in their execution, product capabilities, and hold over customer data. I do not see that changing much.

Impact on India – DeepSeek has done what CP Gurnani of Tech Mahindra claimed he would do. The possibility of frontier models (or near-frontier models) coming from India has just shot up. Sustained DeepTech capability out of India will take more than the algorithmic unlock represented by R1: we need the right incentive structures and culture. We have played catch-up successfully in Digital Public Infrastructure; we can take inspiration from the Chinese in DeepTech.

Impact on the world – The geopolitical competition between the US and China is on. One hears wild stories about the innovations pioneered by Chinese labs, and with R1 we got to see some of them. That DeepSeek open-sourced their models will definitely put China in a good light with the rest of the world (including Europe). The US continues to lead AI innovation by a huge margin. Overall, this will spur further innovation and faster realization of the benefits of intelligence promised by GenAI.
