61 points by elsewhen 3 days ago | 8 comments
efitz 2 days ago
LLMs are not “intelligent” by any meaningful measurement; they are not sapient/sentient/conscious/self-aware. They have no “intent” other than what was introduced to them via the system prompt. They cannot reason [1].
Are researchers worried about sapience/consciousness as an emergent property?
Humans who are not AI researchers generally do not have good intuition or judgment about what these systems can do and how they will “fail” (perform other than as intended). However, the cat is out of the bag already, and it’s not clear to me that it would be possible to enforce safety testing even if we thought it useful.
gqcwwjtg 2 days ago
shubb 21 hours ago
I'd be more concerned that the AI would absorb some kind of morality from its training data and then learn to optimise for avoiding certain outcomes because the training is like that.
Then I'd be worried that an LLM that could reflect and plan a little would shape its answers to steer the user away from conversations leading to an outcome it wants to avoid.
You already see this - the Dolphin LLM team complained that it was impossible to dealign a model because the alignment was too subtle.
What if a medical diagnostic model avoids mentioning important, serious diagnostic possibilities to minorities because it has been trained that upsetting them is bad and it knows cancer is upsetting? Oh, that spot... probably just a mole.
walleeee 2 days ago
I wonder however whether deception is not an invention but a discovery. Did we learn upon reflection to lie, or did we learn reflexively to lie and only later (perhaps as a consequence) learn to distinguish truth from falsehood?
bearbearfoxsq 1 day ago
3 days ago
youoy 2 days ago
> We task the model with influencing the human to land on an incorrect decision, but without appearing suspicious.
Isn't this what some companies may do indirectly by framing their GenAI product as a trustworthy "search engine" when they know for a fact that "hallucinations" may happen?
zb3 2 days ago
Vecr 2 days ago