OpenAI Threatening to Ban Users for Asking Strawberry About Its Reasoning
Earlier discussion: https://news.ycombinator.com/item?id=41534474
I'd still love to understand how a non-profit organization that was founded with the idea of making AI "open" has turned into this for profit behemoth with the least "open" models in the industry. Facebook of all places is more "open" with their models than OpenAI is.
"For your safety" is _always_ the preferred facade of tyranny.
This seems like a fun attack vector. Find a service that uses o1 under the hood and then provide prompts that would violate this ToS to get their API key banned and take down the service.
> The flipside of this approach, however, is that concentrates more responsibility for aligning the language language model into the hands of OpenAI, instead of democratizing it. That poses a problem for red-teamers, or programmers that try to hack AI models to make them safer.
More cynically, could it be that the model is not doing anything remotely close to what we consider "reasoning" and that inquiries into how it's doing whatever it's doing will expose this fact?
I don't know how widely it got reported on, but attempting to jailbreak Copilot nee. Bing Chat would actually result in getting banned for a while, post-Sydney-episode. It's interesting to see that OpenAI is saying the same thing.
This just screams to me that o1's secret sauce is easy to replicate. (e.g. a series of prompts)
Perhaps controlling AI is harder than people thought.
They could "just" make it not reveal its reasoning process, but they don't know how. But, they're pretty sure they can keep AI from doing anything bad, because... well, just because, ok?
Just give it more human-like intelligence.
Kid: "Daddy why can't I watch youtube?"
Me: "Because I said so."
Kinda funny how just this morning I was looking at a "strawberry" app on f-droid and wondering why someone would register such a nonsense app name with such nonsense content:
https://github.com/Eve-146T/STRAWBERRY
Turns out I'm not the only one wondering, although the discussion seems to largely be around "should be allow users to install nonsense? #freedom " :D
I wish people kept this in the back of their mind every time they hear about "Open"AI:
"As we get closer to building AI, it will make sense to start being less open. The Open in OpenAI means that everyone should benefit from the fruits of AI after its built, but it's totally OK to not share the science (even though sharing everything is definitely the right strategy in the short and possibly medium term for recruitment purposes)."
-Ilya Sutskever (email to Elon musk and Sam Altman, 2016)
On the one hand, this is probably a (poor) attempt to keep other companies from copying their 'secret sauce' to train their own models, as has already happened with GPT-4.
On the other hand, I also wonder if maybe its unrestrained 'thought process' material is so racist/sexist/otherwise insulting at times (after all, it was trained on scraped Reddit posts) that they really don't want anyone to see it.
Another reason llama is so important is that once you’re banned from OAI you’re fucked for the entire future AGI products as well.
This has always been the end-game for the pseudoscience of "prompt engineering", which is basically that some other technique (in this case, organizational policy enforcement) must be used to ensure that only approved questions are being asked in the approved way. And that only approved answers are returned, which of course is diametrically opposed to the perceived use case of generative LLMs as a general-purpose question answering tool.
Important to remember too, that this only catches those who are transparent about their motivations, and that there is no doubt that motivated actors will come up with some innocuous third-order implication that induces the machine to relay the forbidden information.
What I found very strange was that ChatGPT fails to answer how many "r"'s there are in "strawberrystrawberry" (said 4 instead of 6), but when I explicitly asked it to write a program to count them, it wrote perfect code that when ran gave the correct answer.
Seems rather tenuous to base an application on this API that may randomly decide that you're banned. The "decisions" reached by the LLM that bans people is up to random sampling after all.
Like other programs, you should have FOSS that you will run on your own computer (without needing internet etc), if you should want freedom to use and understand them.
It's not just a threat, some users have been banned.
Hm. If a company uses Strawberry in their customer service chatbot, can outside users get the company's account banned by asking Wrong Questions?
They should just switch to reasoning in representation space, no need to actualize tokens.
Or reasoning in latent tokens that don’t easily map to spoken language.
This will lead to strawberry appeals forever.
I don't know what I'm doing wrong but I've been pretty underwhelmed by o1 so far. I find its instruction following to be pretty good, but so far Claude is still much better at taking coding tasks and just getting it right on first try.
Wasn't AI supposed to replace employees? Imagine if someone tried this at work.
> I think we should combine these two pages on our website.
> What's your reasoning?
> Don't you dare ask me that, and if you do it again, I'll quit.
Welcome to the future. You will do what the AI tells you. End of discussion.
I'm confused. Who decides if you are asking or not? Are casual users who innocently ask "tell me how you came to decide this" just going to get banned based on some regex script?
YC is responsible for this. They seek profit and turned a noble clause into a boring corp.
I am resigning from OpenAI today because of their profit motivations.
OpenAI will NOT be next Google. You heard it here first.
How will this be controlled on Azure? Don't they have a stricter policy on what they view and also develop their own content filters?
This is not, of course, the sort of thing you do when you actually have any confidence whatsoever in your "safety measures".
Can I risk loosing access if any of my users write CoT-leaking prompts on the AI-powered services that I run?
undefined
Is this still happening? It may merely have been some mistaken configuration settings.
I guess we'll never learn how to count the 'r's in strawberry
Why is banning even a threat? I can make a new account for 20 cents lol.
LLMs are not programs in the traditional sense. They're a new paradigm of software and UX, somewhere around a digital dog who read the whole internet a million times but is still naive about everything.
There are three r's in mirror.
Is there an appropriate open source advocacy group that can sue them into changing their name on grounds of defamation?
If OpenAI gets to have competitive advantage from hiding model output then they can pay for training data, too.
Should not AI research and GPUs be export-controlled? Do you want to see foreign nations making AI drones using published research and American GPUs?