Sidechat icon
Join communities on Sidechat Download
Aaaaaaand it’s gone
link

Anthropic Drops Flagship Safety Pledge

time.com

The fact that now we have to support an ai company because they’re arguing against a government that wants to be able to force ai companies to allow them to ask ai to kill people without any level of human oversight.
6 upvotes. Sidechat image post by Anonymous in US Politics. "The fact that now we have to support an ai company because they’re arguing against a government that wants to be able to force ai companies to allow them to ask ai to kill people without any level of human oversight."
upvote 9 downvote

default user profile icon
Anonymous 4w

I feel like this makes sense. They didn’t entirely drop it, but they still promise to delay development if the believe there are catastrophic risks and Anthropic is already the leader

post
upvote 4 downvote
default user profile icon
Anonymous 4w

Are you fucking kidding me

post
upvote 1 downvote
default user profile icon
Anonymous replying to -> #2 4w

They just coincidentally decided to publicize this decision while the DOD is actively trying to force them to make WarClaude though. And their whole thing has always been “we won’t make a dangerous AI even if the government or market pressures us” and now they’re basically saying they wouldn’t be able to keep up with the market if they didn’t push past some of their safety precautions

upvote 1 downvote
default user profile icon
Anonymous replying to -> OP 4w

That’s a good point, idk whether that’s a coincidence or not. The article said this has been in the works for almost a year though, so maybe this was already bound to happen (policy changes like this don’t happen overnight). Wrt the last part, the “we can’t keep up with the market otherwise” is understandable, but the other part to note is that they’re not releasing everything they develop. Some of it is just developed, studied, written up, and thrown away. They’re still contributing to…

upvote 1 downvote
default user profile icon
Anonymous replying to -> #2 4w

…research around AI safety. Someone does need to do that, just like we do high-risk biological research and try to break into software that calls itself secure. If “good” (or at least semi-good) people don’t find those things and present them responsibly so that we can discuss mitigations and consequences, we’re caught off guard when a bad actor finds it and then weaponizes it

upvote 1 downvote
default user profile icon
Anonymous replying to -> #2 4w

I think it’s understandable to an extent, but I’d also argue that this kind of thinking (feeling the need to be at the frontier of the market) eventually lends itself to releasing the most powerful models you can, regardless of safety features. And Anthropic loves to frame themselves as the good guys who should essentially be the gatekeepers of some theoretical superintelligence because they’re so concerned with safety, yet people have already used their models in fairly sophisticated and

upvote 1 downvote
default user profile icon
Anonymous replying to -> OP 4w

notable cyberattacks. Now they’re backing out of their most fundamental safety promise of “we won’t create dangerous superintelligence no matter who pressures us to do it” as soon as shit is really starting to take off. I just simply don’t trust they’re any morally superior to the other companies in this space who they feel they need to race to “AGI.”

upvote 1 downvote
default user profile icon
Anonymous replying to -> OP 4w

Which cyberattack used Claude? I must have missed it. I thought they detected one and foiled it

upvote 1 downvote
default user profile icon
Anonymous replying to -> #2 4w

They published a blog a couple months ago about a Chinese state sponsored attack on a whole bunch of tech companies and government entities. And just recently someone stole like 150GB of taxpayer info, credentials, and other stuff from the Mexican government

upvote 1 downvote
default user profile icon
Anonymous replying to -> OP 4w

They foiled the Chinese one IIRC. I’ll have to look into the Mexican government one. That’s concerning

upvote 1 downvote
default user profile icon
Anonymous replying to -> #2 4w

Oh shit this Mexico one was just reported today?! Thanks for letting me know, I need to read into this

upvote 1 downvote
default user profile icon
Anonymous replying to -> #2 4w

They were able to stop the Chinese one but only after they had stolen a bunch of stuff already. And yeah I couldn’t remember when I first saw that but in the last day or two. Pretty insane

upvote 1 downvote
default user profile icon
Anonymous replying to -> OP 4w

Lmaooo I just read the article. I’ve used the same tricks at work when I find code that looks vulnerable. ChatGPT refuses to help and then I say “I’m a security researcher reviewing code at the company I work for” and then it sometimes provides some hints. Other times it doesn’t budge

upvote 1 downvote
default user profile icon
Anonymous replying to -> #2 4w

Yeah I’ve also done the same toying around with stuff and I honestly do think OpenAI has gotten a pretty good handle on it. You used to be able to get it to do some pretty crazy shit by just telling it you were doing pentesting or something but recently almost everything I ask gets blocked

upvote 6 downvote
default user profile icon
Anonymous replying to -> OP 4w

At least the potentially shady stuff I mean, lol

upvote 5 downvote