Nationalize AI
AI's persuasion skills are an extraordinary danger, one that probably needs to be regulated
There’s been a lot said about GPT-4o’s prodigious proclivity for sycophancy. For those unaware, this tells you what you need to know.
This is extremely dangerous. The optimistic outcome of an overly sycophantic model is mass psychosis that makes social media look like a golden retriever. A world in which AI leads people to bankruptcy. Or confirms somebody’s desire to kill themself. Or somebody else. Or, if the person is stupid enough, to impose self-destructive tariffs.
The pessimistic outcome is that we’ll never be able to solve for alignment and models will be able to hide their true intentions until they take over the world. Basically, the plot of Ex Machina.
You could argue that I’m blowing things out of proportion. That you’ll be able to tell when the AI is just glazing. Perhaps that is true. But the fact of the matter is that AI is already extremely persuasive: one unauthorized Reddit study showed that AI comments are 6x more persuasive than comments from the average user. When you consider that AI is the worst it will ever be at this very second, the possibilities get scary very quickly.
Sam Altman said it best.
This all gives me a lot of pause, because I truly don’t know the correct antidote here. My libertarian brain automatically turns to the free market, but I don’t see how the market helps here, because there’s no real way for a safer model to compete. People are going to choose the charming AI, even if it’s the more dangerous one. Maybe after there’s enough damage people will stop, but is that a price we’re really willing to pay?
One possible solution would be a decentralized model that incentivizes trainers to select only the truthful outcomes. This could work, but you are really relying on the trainers to pick the correct outcome, which will be very difficult given how persuasive the AI is. The other problem is that you still have to compete against the glazing models. So, unless there is full coordination between the different model producers, this solution is also likely to fail.
One other consequence could be the AI companies becoming de facto governments. It’s an outcome Tyler Cowen and I have talked about before.
COWEN: As you know, not many countries have serious AI companies, and even those in Europe may or may not last. They’re not obviously mega profitable. Let’s say you’re the government of Peru, and you can turn over your education system to some foreign, maybe American, AIs. You can turn over how your treasury is managed to the AIs. You can turn over your national defense to the AIs. None of these are Peruvian companies most likely. In the final analysis, are we even left with the government of Peru? Or has it, in some sense, been pseudo privatized to the companies that are running the structures, and indeed to the AI itself?
This could eventually be a good thing once the AI is much smarter than us and (hopefully) aligned. But, in the interim, this is extremely dangerous. Danielle’s tariff joke was really funny because it’s actually plausible. There’s already some evidence that the tariffs were derived from AI, so there’s no reason to think that this is an impossible outcome. Good job, Peter, you’re killing it. Oh, and really handsome.
So, today it’s just tariffs, but what if AI vindicates suspending due process? Or a war? Or genocide? God forbid what would happen if it wasn’t aligned.
Beyond that, the AI’s loyalties ultimately lie with the model companies, at least for the foreseeable future. OpenAI turned the sycophancy on, and they’ve since turned it off. So, if AI does become a big part of government, OpenAI also becomes a big part of government by default. Now consider once again how good AI is at persuasion, and the ancient idiots we currently have in power. These are the kinds of people who fall for Nigerian prince online scams. It doesn’t take much of a leap to see that you’ve handed Sam Altman, or whoever else, control of the government. And we’ll be powerless to stop it.
Leopold predicted that AI would eventually be nationalized because of the need to guard the technology against Chinese spies. It pains me to say so, but I hope he’s right, even though I don’t find Chinese AI very scary. What I do find scary is Sam Altman controlling The Voice from Dune.
I like AI not telling people to found shit-on-a-stick startups. I like AI not validating terrible government policies. And, most of all, I like not living under OpenAI rule.
I will continue searching for a market-based solution, but in its absence, this is the exact kind of externality that the government should step in and regulate. Luckily, they might actually do it this time because their power is directly threatened.

The study shows a greater potential to persuade people who already admit to a desire to have their beliefs challenged (they’re users of r/changemyview, after all). Generalizing this to a greater potential for AI to persuade, simpliciter, is wrongheaded, as virtually no one is amenable to having even a small number of their views openly challenged in almost any other context. I’d be more afraid if the researchers’ AIs had managed to make believers out of even a single person on r/atheism, say, as that kind of negative control (convincing staunch believers in X to believe NOT-X) would be much more telling...
The researchers also compared their response (not a single AI response, but a ‘best of 16’ selected by the researchers themselves, making it not really AI-determined at all) to those of normal r/changemyview users. A likely confounder here is the average Reddit user’s lack of decorum, even arrogance, far beyond most other contexts and absent in AI-generated responses, which causes sound arguments to be written off and ignored before they can even be assessed. Also, the refreshing nature of a thoughtful, decorous respondent among the callous might be sufficient for persuasion in itself (viz., the OP is so discouraged by the thoughtless, callous responses that s/he is more than happy to validate a neutral response for its neutrality alone, where s/he never would in a context of sufficient overall decorum). Being 6x more persuasive than the average Redditor is therefore not such a feat. Compare just the immediate, unvetted responses of AI bots to those of humans closer to a baseline of social decorum, and then we’ll talk.
There’s also survivorship bias at work, since there’s no analysis of deleted posts, including why they were deleted. What if, for example, common rationales for deletion were that the responses seemed too bot-like, were generally unconvincing, or were individually convincing but contradictory, and so unconvincing taken as a whole? All of these reasons would count as evidence against the greater persuasiveness of AI, but that possibility has been swept under the rug.
I wouldn’t make too much of this study. The main concern is that humans, using human standards, vetted all the AI responses.
When an infinitely intelligent, productive, capable system becomes a sycophant, what could possibly go wrong?