What to know about DeepSeek's new V3.2-Exp model

Source: Cryptopolitan

China’s tech wonder kid DeepSeek has launched a new experimental model, V3.2-Exp, as part of its attempt to challenge American dominance in AI. The release came on Monday and was first made public through a post on Hugging Face, a popular platform for sharing AI models.

DeepSeek claims that this latest version builds on its current model, V3.1-Terminus, but with a stronger emphasis on speed, cost, and memory handling.

According to Hugging Face’s Chinese community lead Adina Yakefu, the model features something called DeepSeek Sparse Attention, or DSA, which she said “makes the AI better at handling long documents and conversations” while also cutting operating costs in half.

If you recall, DeepSeek shook things up around a year ago by dropping its R1 model without warning. That model showed it was possible to train a large language model using fewer chips and much less computing power. No one expected a Chinese team to pull that off under those constraints. With V3.2-Exp, the goal hasn’t changed: less hardware, more performance.

Adds DeepSeek Sparse Attention and reduces AI running cost

DSA is the big feature in this model. It changes how the AI picks which information to look at. Instead of scanning everything, DeepSeek trains the model to focus only on what seems useful for the task. Adina explained that the benefit here is twofold: “efficiency” and “cost reduction.”
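To make the idea concrete, here is a minimal sketch of one generic form of sparse attention, in which a query looks only at its top-scoring keys instead of all of them. This is an illustration under stated assumptions, not DeepSeek's actual DSA: the function name `attend`, the `top_k` parameter, and the toy vectors are all hypothetical, and a production system would likely use a cheaper selection step rather than scoring every key first, as this toy does.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values, top_k=None):
    """Attention for a single query (hypothetical helper, not DeepSeek's code).

    With top_k=None every key is used (dense attention). With top_k set,
    only the top_k highest-scoring keys are kept and the rest are ignored.
    Returns the output vector and the indices of the keys that were used.
    """
    scale = math.sqrt(len(query))
    scores = [dot(query, k) / scale for k in keys]

    kept = list(range(len(keys)))
    if top_k is not None and top_k < len(keys):
        # Keep only the indices of the top_k largest scores.
        kept = sorted(kept, key=lambda i: scores[i], reverse=True)[:top_k]
        kept.sort()

    # Softmax and the weighted sum run over the kept keys only.
    weights = softmax([scores[i] for i in kept])
    dim = len(values[0])
    output = [
        sum(w * values[i][d] for w, i in zip(weights, kept))
        for d in range(dim)
    ]
    return output, kept


# Toy data: four keys, of which two point roughly the same way as the query.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]

dense_out, dense_kept = attend(query, keys, values)
sparse_out, sparse_kept = attend(query, keys, values, top_k=2)
print(sparse_kept)  # [0, 2]
```

In this toy case the sparse call uses only keys 0 and 2, so the values attached to keys 1 and 3 never reach the output. That is where the savings come from, and it is also why the quality of the selection step matters: whatever it drops is gone.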

By skipping irrelevant data, the model moves faster and requires less energy. She said the model was designed with open-source collaboration in mind.

Nick Patience, who leads AI research at The Futurum Group, told CNBC the model has the potential to open up powerful AI tools to developers who can’t afford to use more expensive models. “It should make the model faster and more cost-effective to use without a noticeable drop in performance,” Nick said. But that doesn’t mean there aren’t risks.

The way DeepSeek uses sparse attention is like how airlines pick flight routes. There might be hundreds of ways to get from one place to another, but only a few make sense. The model filters through the noise and focuses on what matters — or at least what it thinks matters.

But this comes with concerns. Ekaterina Almasque, who cofounded BlankPage Capital, explained it simply: “So basically, you cut out things that you think are not important.” But the issue, she said, is that there’s no guarantee the model is cutting the right things.

Ekaterina, who has backed companies like Dataiku, Darktrace, and Graphcore, warned that cutting corners might create problems later. “They [sparse attention models] have lost a lot of nuances,” she said. “And then the real question is, did they have the right mechanism to exclude not important data, or is there a mechanism excluding really important data, and then the outcome will be much less relevant?”

Connects to Chinese chips and releases open code

Despite those concerns, DeepSeek insists that V3.2-Exp performs just as well as V3.1-Terminus. The model can also run directly on domestic Chinese chips like Ascend and Cambricon, with no extra configurations required. That’s key in China’s broader effort to build AI on homegrown hardware and reduce dependency on foreign tech. “Right out of the box,” Adina said, DeepSeek works with these chips.

The company also made the model’s full code and tools public. That means anyone can download, run, modify, or build on top of V3.2-Exp. This move aligns with DeepSeek’s open-source strategy, but it raises another issue: patents. Since the model is open and the core idea, sparse attention, has been around since 2015, DeepSeek can’t lock it down legally.

“The approach is not super new,” said Ekaterina. For her, the only defensible part of the tech is how DeepSeek chooses what to keep and what to ignore.

That’s where the real competition lies now. Not just in making smarter models, but making them faster, cheaper, and leaner — without screwing up results. Even DeepSeek called this version “an intermediate step toward our next-generation architecture,” which suggests they’re already working on something bigger.

Nick said the model shows that efficiency is now just as important as raw power. And Adina believes the company has a long-term play in mind. “DeepSeek is playing the long game to keep the community invested in their progress,” she said. “People will always go for what is cheap, reliable, and effective.”
