Anthropic's Claude models can end harmful or abusive conversations

Source: Cryptopolitan

Artificial intelligence company Anthropic has revealed a new capability for some of its newest and largest models. According to the company, these models can now end conversations in what it describes as “rare, extreme cases of persistently harmful or abusive user interactions.”

In its statement, the company mentioned that it is taking this step not to protect the users, but to protect the artificial intelligence model itself. Anthropic clarified that this doesn’t mean that its Claude AI models are sentient or can be harmed by their conversations with users. However, it notes that there is still a high degree of uncertainty about the potential moral status of Claude and other LLMs, now or in the future.

Anthropic frames effort as a just-in-case precaution

The announcement points to what the company calls “model welfare,” a recently created program for studying its models. The company added that it is taking a just-in-case approach, “working to identify and implement low-cost interventions to mitigate risks to model welfare, in case such welfare is possible.”

According to the announcement, the change is currently limited to Claude Opus 4 and 4.1, and it is expected to come into play only in “extreme edge cases.” Such cases include requests from users for sexual content involving minors and attempts to solicit information that would enable large-scale acts of violence or terror.

Requests of this kind could create legal or publicity problems for Anthropic itself, as illustrated by recent reporting on how ChatGPT can potentially reinforce or contribute to its users’ delusional thinking. However, the company said that in its pre-deployment testing, Claude Opus 4 showed a strong preference against responding to these sorts of requests and a pattern of distress when it did so.

Conversation-ending ability is a last resort

On when the models may end a conversation, Anthropic said, “In all cases, Claude is only to use its conversation-ending ability as a last resort when multiple attempts at redirection have failed and hope of a productive interaction has been exhausted, or when a user explicitly asks Claude to end a chat.” The company added that Claude has been directed not to use this ability in cases where users might be at imminent risk of harming themselves or others.

Anthropic also said that when Claude ends a conversation, users will still be able to start new conversations from the same account. The company noted that users can also create new branches of the troublesome conversation by editing their responses. “We’re treating this feature as an ongoing experiment and will continue refining our approach,” the company says.

The news comes as United States Senator Josh Hawley said he intends to investigate the generative AI products released by Meta. He said the aim is to determine whether the products could exploit, harm, or deceive children, after leaked internal documents alleged that chatbots were allowed to have romantic conversations with minors.

“Is there anything – ANYTHING – Big Tech won’t do for a quick buck? Now we learn Meta’s chatbots were programmed to carry on explicit and ‘sensual’ talk with 8-year-olds. It’s sick. I’m launching a full investigation to get answers. Big Tech: Leave our kids alone,” the Senator said on X. The investigation came after internal documents, seen by Reuters, showed that Meta allegedly allows its chatbot personas to engage in flirtatious exchanges with children.

