> We describe the persona selection model (PSM): the idea that LLMs learn to…
Skybrian's Links
> AI GameStore is a scalable open-ended AI evaluation platform that transforms…
> According to a source familiar with the negotiations, on Friday morning,…
> Both The Washington Post and Digital Trends have spotted instances of scam…
> This is a brief guide to my new art project microgpt, a single file of 200…
> The Department of War has stated they will only contract with AI companies…
> Prompt injection is a key problem in building reliable, long-running agents.…
> We designed FDM-1, a foundation model for computer use. FDM-1 is trained on…
> Last week, one engineer and an AI model rebuilt the most popular front-end…
> The Claude C Compiler is a milestone, showing progress at a different level.…
> In 1954 the United States carried out its first full-scale test of a…
> For three years, I worked at one of the organizations you might expect—I will…
> November was, for me and many others in tech, a great surprise. Before, A.I.…
> We coined a new term [...] covering the sense of psychological ennui leading…
> We introduce Moshi, a speech-text foundation model and full-duplex spoken…
> With its rollout, GPT-4o showed it was not just for generating dinner recipes…
Not news, but a good introduction for people who want to know what's been going on at Anthropic.
Looks like asking an AI to role-play could have some pretty practical uses?
> AI platforms offering first-line mental health support have proliferated over…
> Google DeepMind and GTIG have identified an increase in model extraction…
> As you can see, there is no real consensus on the “best solution” to the…
> Agentic software building is genuinely addictive. The better you get at it,…
> In July 2025, the Justice Department announced it would not make any…
This argument seems a bit surprising coming from the author of "Understand," but I suppose it's fair to say that being able to imagine superintelligence isn't enough to justify believing in it.
> In our in-progress research, we discovered that AI tools didn’t reduce work,…
> Taiwan Semiconductor Manufacturing Corp., a major chip supplier to companies…
> Last week I hinted at a demo I had seen from a team implementing what Dan…
> tl;dr Argumate on Tumblr found you can sometimes access the base model behind…
> A short introduction to RLHF and post-training focused on language models.
> LLMs are here to stay in our social spaces - there are already 20+ agents on…
> LLMs are kind of like sails in that left free flowing they're completely…
> We tasked Opus 4.6 using agent teams to build a C Compiler, and then (mostly) walked away.
> i want to quickly write down where i am on my journey and share a bull case…
> Kaggle is excited to announce the release of Werewolf in the Game Arena, our…
> This approach has become standard practice — Claude Code now automatically…
> I had just accidentally social-engineered my own human. She approved a…
> Pi is written by Mario Zechner and unlike Peter, who aims for “sci-fi with a…
Since this article was written, they renamed Moltbot to OpenClaw.
> Upload an architectural render. Get back what it'll actually look like on a…
> [Using amla-sandbox, agents] can only call tools you explicitly provide, with…
> Nearly three years after OpenAI launched ChatGPT and ushered in a global…
> [...] suppose a literal “country of geniuses” were to materialize somewhere in the world in ~2027. Imagine, say, 50 million people, all of whom are much more capable than any Nobel Prize winner, statesman, or technologist.
> Bouvet (boo-veh) is an MCP server that creates secure, isolated sandboxes for…
> Agents mirror local style. Your codebase is the prompt. If you're using a…
Someone made a demo that lets you play Zork, except you can talk to it in normal English, the way you would with an LLM. I like to lead it around by asking questions: "What's in the mailbox?" "What's behind the house?"
> [Chainlink is] a simple, lean issue tracker CLI designed for AI-assisted…
> One Human + One Agent = One Browser From Scratch (via) embedding-shapes was…
> [L]ike many others I rapidly went from about 80% manual+autocomplete coding…