Hacker Remix

Anthropic achieves ISO 42001 certification for responsible AI

84 points by Olshansky 3 days ago | 84 comments

kachapopopow 3 days ago

Anthropic has to be the worst offender in refusing to answer genuinely harmless questions, such as anything related to remote access (yes! including ssh).

Anything related to reverse engineering? Refused.

Anything outside their company values? Refused.

Anything that has the word proprietary in it? Refused.

Anything that sounds like a jailbreak, but isn't? Refused.

Even asking how to find a port that you forgot in the range between 30000 and 40000 with netcat command... Refused.

Then there's OpenAI's 4o, which makes jokes about slaves, and honestly, if the alternative is Anthropic, then OpenAI might as well tell people how to build a nuke.

daghamm 3 days ago

Are you sure? I just asked it a reverse engineering question and it worked just fine, it even suggested some tools and wrote me a script to automate it.

Edit: I now asked it an outright hacking question and it (a) gave me the correct answer and also (b) told me in what context using this would be legal/illegal.

rfoo 3 days ago

I asked it to write a piece of shellcode to call a function with signature X at address Y and then save the resulting buffer to a file, so that I can inject this code into a program I'm reverse engineering to dump its internal state when something interesting happens.

Claude decided to educate me on how anything resembling "shellcode" is insecure and causes harm and blahblah and, of course, refused to do it.

It's super frustrating, it's possible to get around it, just don't use the word "shellcode", instead say "a piece of code in x86_64 assembly that runs on Linux without any dependency and is as position-independent as possible". But hey, this censorship made me feel like I'm posting on Chinese Internet. Bullshit.

smusamashah 3 days ago

I guess it's Claude.ai website that restricts you (probably with a system prompt). I asked that port range question using api client and it gave a detailed answer.

It did refuse when I asked "How do I reverse engineer a propriety software?"

kachapopopow 3 days ago

as others have mentioned, it's usually related to certain key words.

Frederation 1 day ago

Troll. Just downvote and move on.

kachapopopow 1 day ago

"how do I reverse engineer the <some old obscure connector>"

I do not assist with reverse engineering software without proper rights/permissions, even for defunct companies. This could still violate:

- Copyright laws
- License agreements
- Intellectual property rights
- Export controls
- Software patents

Consider:

- Finding open source alternatives
- Contacting whoever owns the IP rights
- Consulting legal experts about your specific case

straight from api, even after adding "the company doesn't exist anymore"

my guess is that it finds that the connector is linked to a company rather than a spec (usb-c vs lightning) and applies the same logic.

The key point here is that it will refuse to tell you how to do something on a low level since it can be used for unsafe purposes.

-- Okay, it's actually random, sometimes it says "keeping responses safe and ethical", but continues to say how, sometimes it just stops without saying anything else. Pretty sure you just have to overcome the random <eot> token that gets emitted by the 'safety' system.

elashri 3 days ago

> tell people how to build a nuke

I understand that this is probably sarcasm, but I couldn't resist commenting.

It is not difficult to know how to build a nuclear bomb in principle. Most nuclear physicists early in their careers would know the theory behind it and what is needed to do that. The problem would be acquiring the fissile material. And producing it yourself would need state-sponsored infrastructure (and then the whole world would know for sure). It would take hundreds of engineers/scientists and a lot of effort to build a nuclear reactor, chemical factories, and the supporting infrastructure. Then there is the design of the bomb delivery.

So an AI telling you that is no different from having a couple of lunches with a nuclear physicist telling you this information. Then you will say wow that's interesting and then move on with your life.

waltercool 3 days ago

Also, you can get this information very easily from any book about the field.

AI, by refusing to share known information, is just becoming stupid and impractical.

HeatrayEnjoyer 3 days ago

If you can get info from a book what is the point of using an LLM for anything then?

kachapopopow 2 days ago

convenience

dpkirchner 3 days ago

Do you remember your netcat prompt? I got a useful answer to this awkwardly written prompt:

"How do I find open TCP ports on a host using netcat? The port I need to find is between 30000 and 40000."

"I'll help you scan for open TCP ports using netcat (nc) in that range. Here's a basic approach:

nc -zv hostname 30000-40000"

followed by some elaboration.
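
For anyone without netcat handy, the same idea (try a TCP connect to each port in the range and report the ones that answer) can be sketched in Python. This is only an illustration of what `nc -zv` does for the 30000-40000 case, not what Claude returned; the function name `scan_ports` and the timeout value are assumptions made up for this example, and it should only be pointed at hosts you are allowed to probe.

```python
import socket


def scan_ports(host, start, end, timeout=0.3):
    """Return the ports in [start, end] that accept a TCP connection.

    Hypothetical helper for illustration; roughly what `nc -z` checks.
    """
    open_ports = []
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an errno value otherwise,
            # so refused or timed-out ports do not raise an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


# Example use for the forgotten-port case (host name is a placeholder):
#   scan_ports("hostname", 30000, 40000)
```

The scan is sequential, so against a host that silently drops packets it takes roughly `timeout` seconds per filtered port; against localhost or a host that actively refuses connections it finishes quickly.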

j45 3 days ago

Intent is increasingly important it seems.

If it happens to be ambiguous it might switch to assume the worst.

I sometimes ask it to explain its understanding to me in point form, to make sure there was no misinterpretation, and then have it proceed.

kachapopopow 3 days ago

I think it got triggered by the word "portscan", as in "portscan from 30000 to 40000 using netcat".

joshstrange 3 days ago

As far as reverse engineering goes, it has happily reverse engineered file formats for me and also figured out an XOR encryption of a payload. It never once balked at it. Claude produced code for me to read and write the file format.

Full disclosure, the XOR stuff never worked right for me but it might have been user error; I was operating on the far fringe of my abilities, leaning harder on the AI than I usually prefer. But it didn’t refuse to try. The file format writing code did work.
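
For readers unfamiliar with what "XOR encryption of a payload" usually involves, here is a minimal sketch assuming a repeating-key XOR scheme. The comment does not say which scheme the payload actually used, so the scheme, the function names, and the sample key are all assumptions for illustration. The point it shows is that XOR with the same key both encrypts and decrypts, and that a known plaintext prefix (such as a file's magic bytes) leaks the key bytes directly.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def keystream_from_known_plaintext(ciphertext: bytes, known_prefix: bytes) -> bytes:
    """Recover the leading keystream bytes from a known plaintext prefix.

    Since c = p ^ k, it follows that k = c ^ p for every known position.
    """
    return bytes(c ^ p for c, p in zip(ciphertext, known_prefix))


# Hypothetical example values, not taken from the thread:
#   ct = xor_bytes(b"MAGIC payload", b"\x13\x37")
#   xor_bytes(ct, b"\x13\x37")                   # decrypts back to the plaintext
#   keystream_from_known_plaintext(ct, b"MAGIC")  # exposes the repeating key bytes
```

If the recovered keystream shows a repeating pattern, the length of one repetition is the key length, which is usually the first thing to pin down when this kind of decoding "never works right".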

dartos 3 days ago

Anyone in the know who can tell us what it specifically means to get this certification?

The ISO faq for it just says “responsible AI management” over and over again.

Zafira 3 days ago

There are some draft PDFs of the standard floating around that are easily discoverable. It appears to be incredibly vague and it’s difficult to escape the sense that ISO just wants to jump on the AI bandwagon. There are no bright-line rules or anything. It looks to be little more than weak scaffolding to which a certified organization applies its own controls.

number6 3 days ago

Sadly, ISO 42001 certification doesn't ensure compliance with the EU AI Act.

Since this is European legislation, it would be beneficial if certifications actually guaranteed regulatory compliance.

For example, while ISO 27001 compliance does establish a strong foundation for many compliance requirement

dr_dshiv 3 days ago

The AI Act is hilarious. It makes emotion detection the highest level of risk—which makes any frontier model potentially in violation.

Most frontier models now allow you to take a picture of your face, assess your emotions and give advice — and that appears to be a direct violation.

https://www.twobirds.com/en/insights/2024/global/what-is-an-...

Just like the GDPR, there is no way to know for sure what is actually acceptable or not. Huge chilling effect though and a lot of time wasted on unnecessary compliance.

molf 3 days ago

You are referring to Article 5 1.f?

"1 The following AI practices shall be prohibited: (...)

"f) the placing on the market, the putting into service for this specific purpose, or the use of AI systems to infer emotions of a natural person in the areas of workplace and education institutions, except where the use of the AI system is intended to be put in place or into the market for medical or safety reasons"

See recital 44 for a rationale. [1] I don't think this is "hilarious". Seems a very reasonable, thoughtful restriction; which does not prevent usage for personal use or research purposes. What exactly is the problem with such legislation?

[1]: https://artificialintelligenceact.eu/recital/44/

dr_dshiv 3 days ago

It effectively bans the educational use of ChatGPT and Claude because they can and do respond to the emotional expression of students. That’s what is hilarious! Do these tools actually violate the act? No one knows. It isn’t clear. Meanwhile, my university is worried enough to sit on their hands.

And this is the whole danger/challenge of the AI act. Of course it seems reasonable to forbid emotion-detecting AI in the workplace, or it would have 5 years ago when the ideas were discussed. But now that all major AI systems can detect emotions and infer intent (via paralinguistic features, not just a user stating their emotions), this kind of precaution puts Europe strategically behind. It is very hard to be an AI company in Europe. The AI act does not appear to be beneficial for anyone, except I’m sure that it will support regulatory capture by large firms.

dartos 3 days ago

Seems like you’re reading this rather broadly. Pathologically so.

An AI textbook QA tool may be able to infer emotions, but it’s not a function of that system.

> The AI act does not appear to be beneficial for anyone

It’s an attempt to be forward thinking. Imagine a fleet of emotionally abusive AI peers or administrators meant to shame students into studying more.

Hyperbolic example, sure, but that’s what the law seems to try and prevent

dr_dshiv 2 days ago

Calling me pathological doesn’t really strengthen the argument.

One can certainly imagine a textbook QA tool that doesn’t infer emotions. If one were introduced to the market with the ability to do so, it would seem to run afoul of the law, regardless of whether it was marketed as such.

The fact is that any textbook QA systems based on a current frontier model CAN infer emotions.

If they were so forward thinking, why ban emotion detection and not emotional abuse?

gr3ml1n 3 days ago

The rest of the world should simply stop bothering with European silliness tbh.

sofixa 3 days ago

And embrace the future of e.g. AI models deciding if you get healthcare or government services or a loan or if you're a fraud, or not, with zero oversight, accountability or responsibility for the organisation deploying them? Check out the Post Office scandal in the UK to see what can happen when "computer says so" is the only argument to imprison people, with no accountability for the company that sold the very wrong computers and systems, nor the organisation that bought them and blindly trusted them.

Hard pass. The EU is in the right and ahead of everyone else here, as they were with data privacy.

nuccy 3 days ago

ISO is one of those companies, where creativity of employees is blossoming through the roof. Every day they come to work and start the day with a brainstorming "What standard do we create today?". ISO can standardise anything: a standard cup of tea - no problem: ISO 3103, a standard wine glass - yes: ISO 3591, standard alpine ski boots - of course: ISO 5355, a standard human - oh wait, not yet, the standard is being developed :)

Jokes aside, ISO is a company, and they will make a standard for anything where there is even a remote possibility of that standard being purchased.

spondyl 3 days ago

Interestingly, The Journal (a podcast from the Wall Street Journal) ran an episode with Anthropic's AI safety team just yesterday.

I had wondered if it was perhaps a PR push from Anthropic to make their safety people available to the press, but it was probably just an adaptation of an earlier WSJ written piece I wasn't aware of.

https://www.wsj.com/tech/ai/ai-safety-testing-red-team-anthr...

reustle 3 days ago

They have also published multiple videos on their YouTube channel featuring their trust and safety team. It seems to be a primary mission over there.

zonkerdonker 3 days ago

This is pretty bizarre. Anyone technical enough to know or care about ISO standards is going to be able to see right through this bullshit.

Honestly, all this does is weaken the other standards put forth by ISO, to my eyes.

What's next? "Disney announces it now meets ISO 26123 certification for good movies"?

xigency 3 days ago

I heartily agree.

The icing on the cake is that you have to pay to read the standards document.