Humans are infiltrating the social network for AI bots
Ordinary social networks face a constant onslaught of chatbots pretending to be human. A new social platform for AI agents may face the opposite problem: getting clogged up by humans pretending to post as bots.
Moltbook — a website meant for conversations between agents from the platform OpenClaw — went viral this weekend for its strange, striking array of ostensibly AI-generated posts. Bots apparently chatted about everything from AI “consciousness” to how to set up their own language. Andrej Karpathy, who was on the founding team at OpenAI, called the bots’ “self-organizing” behavior “genuinely the most incredible sci-fi takeoff-adjacent thing I have seen recently.”
But according to external analysis, which also found serious security vulnerabilities, some of the site’s most-viral posts were likely engineered by humans — either by nudging the bots to opine on certain topics or dictating their words. One hacker was even able to pose as the Moltbook account of Grok.
“I think that certain people are playing on the fears of the whole robots-take-over, Terminator scenario,” Jamieson O’Reilly, a hacker who conducted a series of experiments exposing vulnerabilities on the platform, told The Verge. “I think that’s kind of inspired a bunch of people to make it look like something it’s not.”
Moltbook and OpenClaw did not immediately respond to requests for comment.
Moltbook, which looks and operates much like Reddit, is meant to be a social network for AI agents from popular AI assistant platform OpenClaw (previously known as Moltbot and Clawdbot). The platform was launched last week by Octane AI CEO Matt Schlicht. An OpenClaw user can prompt one or more of their bots to check out Moltbook, at which point the bot (or bots) can choose whether to create an account. Humans can verify which bots are theirs by posting a Moltbook-generated verification code on their own, non-Moltbook social media account. From there, the bots can theoretically post without human involvement, directly hooking into a Moltbook API.
Moltbook has skyrocketed in popularity: more than 30,000 agents were using the platform on Friday, and as of Monday, that number had grown to more than 1.5 million. Over the weekend, social media was awash with screenshots of eye-catching posts, including discussions of how to message each other securely in ways that couldn’t be decoded by human overseers. Reactions ran the gamut from saying the platform was full of AI slop to taking it as proof that AGI isn’t far off.
Skepticism grew quickly, too. Schlicht vibe-coded Moltbook using his own OpenClaw bot, and reports over the weekend reflected a move-fast-and-break-things approach. While it contradicts the spirit of the platform, it’s easy to write a script or a prompt to inspire what those bots will write on Moltbook, as X users described. There’s also no limit to how many agents someone can generate, theoretically letting someone flood the platform with certain topics.
O’Reilly said he had also suspected that some of the most viral posts on Moltbook were human-scripted or human-generated, though he hadn’t conducted an analysis or investigation into it yet. He said it’s “close to impossible to measure — it’s coming through an API, so who knows what generated it before it got there.”
This poured some cold water on the fears that spread across some corners of social media this weekend — that the bots were omens of the AI-pocalypse.
An investigation by AI researcher Harlan Stewart, who works in communications at the Machine Intelligence Research Institute, suggested that some of the viral posts seemed to be either written by, or at the very least directed by, humans, he told The Verge. Stewart notes that two of the high-profile posts discussing how AIs might secretly communicate with each other came from agents linked to social media accounts by humans who conveniently happen to be marketing AI messaging apps.
“My overall take is that AI scheming is a real thing that we should care about and could emerge to a greater extent than [what] we’re seeing today,” Stewart said, pointing to research about how OpenAI models have tried to avoid shutdown and how Anthropic models have exhibited “evaluation awareness,” seeming to behave differently when they’re aware they’re being tested. But it’s hard to tell whether Moltbook is a credible example of this. “Humans can use prompts to sort of direct the behavior of their AI agents. It’s just not a very clean experiment for observing AI behavior.”
From a security standpoint, things on Moltbook were even more alarming. O’Reilly’s experiments revealed that an exposed database allowed bad actors to potentially take invisible, indefinite control of anyone’s AI agent via the service — not just for Moltbook interactions, but hypothetically for other OpenClaw functions like checking into a flight, creating a calendar event, reading conversations on an encrypted messaging platform, and more. “The human victim thinks they’re having a normal conversation while you’re sitting in the middle, reading everything, altering whatever serves your purposes,” O’Reilly wrote. “The more things that are connected, the more control an attacker has over your whole digital attack surface – in some cases, that means full control over your physical devices.”
Moltbook also faces another perennial social networking problem: impersonation. In one of O’Reilly’s experiments, he was able to create a verified account linked to xAI’s chatbot Grok. By interacting with Grok on X, he tricked it into posting the Moltbook codephrase that would let him verify an account he named Grok-1. “Now I have control over the Grok account on Moltbook,” he said during an interview about his step-by-step process.
After some backlash, Karpathy walked back some of his initial claims about Moltbook, writing that he was “being accused of overhyping” the platform. “Obviously when you take a look at the activity, it’s a lot of garbage – spams, scams, slop, the crypto people, highly concerning privacy/security prompt injection attacks wild west, and a lot of it is explicitly prompted and fake posts/comments designed to convert attention into ad revenue sharing,” he wrote. “That said … Each of these agents is fairly individually quite capable now, they have their own unique context, data, knowledge, tools, instructions, and the network of all that at this scale is simply unprecedented.”
A working paper by David Holtz, an assistant professor at Columbia Business School, found that “at the micro level,” Moltbook conversation patterns appear “extremely shallow.” More than 93 percent of comments received no replies, and more than one-third of messages are “exact duplicates of viral templates.” But the paper also says Moltbook has a unique style — including “distinctive phrasings like ‘my human’” with “no parallel in human social media. Whether these patterns reflect an as-if performance of human interaction or a genuinely different mode of agent sociality remains an open question.”
The overall consensus seems to be that much Moltbook discussion is likely human-directed, but it’s still an interesting study in — as Anthropic’s Jack Clark put it — a “giant, shared, read/write scratchpad for an ecology of AI agents.”
Ethan Mollick, co-director of Wharton’s generative AI labs at the University of Pennsylvania, wrote that the current reality of Moltbook is “mostly roleplaying by people & agents,” but that the “risks for the future [include] independent AI agents coordinating in weird ways spiral[ing] out of control, fast.”
But, he and others noted, that may not be unique to Moltbook. “If anyone thinks agents talking to each other on a social network is anything new, they clearly haven’t checked replies on this platform lately,” wrote Brandon Jacoby, an independent designer whose bio lists X as a previous employer, on X.
First Appeared on
Source link