JustHTML is a fascinating example of vibe engineering in action
~ai.llms, ~dev, author.simon willison, html, python, web
simonwillison.net, Dec 14, 2025 (Tildes)

Summary

I recently came across JustHTML, a new Python library for parsing HTML released by Emil Stenström. It’s a very interesting piece of software, both as a useful library and as a case study in sophisticated AI-assisted programming.

[...]

[…] A few highlights:

He hooked in the 9,200-test html5lib-tests conformance suite almost from the start. There’s no better way to construct a new HTML5 parser than using the test suite that the browsers themselves use.

He picked the core API design himself (a TagHandler base class with handle_start() and similar methods) and told the model to implement that.

He added a comparative benchmark to track performance against existing libraries like html5lib, then experimented with a Rust optimization based on those initial numbers. He then threw the original code away and started from scratch with a rough port of Servo’s excellent html5ever Rust library.

He built a custom profiler and a new benchmark and let Gemini 3 Pro loose on it, eventually finding micro-optimizations that beat the existing pure Python libraries. He used coverage to identify and remove unnecessary code.

He had his agent build a custom fuzzer to generate vast numbers of invalid HTML documents and harden the parser against them.
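
To make the fuzzing highlight concrete, here is a minimal sketch of what such a fuzzer could look like. It is not JustHTML's actual fuzzer: the fragment list, the function names, and the choice of the standard library's `html.parser.HTMLParser` as a stand-in target are all assumptions made for illustration. The idea is simply to generate many malformed documents and record any input that makes the parser raise.

```python
import random
from html.parser import HTMLParser

# Hypothetical building blocks for malformed markup (not from JustHTML).
FRAGMENTS = [
    "<div>", "</div>", "<p", "</p>", "<table><tr>", "</span>", "<!--", "-->",
    "<script>", "</script>", "<a href='x", "&amp", "&#x110000;", "<![CDATA[",
    "<!DOCTYPE", "\x00", "<", ">", "text", "<b><i></b></i>",
]


def random_document(rng: random.Random, max_fragments: int = 30) -> str:
    """Concatenate random fragments into a (usually invalid) HTML string."""
    count = rng.randint(1, max_fragments)
    return "".join(rng.choice(FRAGMENTS) for _ in range(count))


def parse(document: str) -> None:
    """Stand-in target: the standard library parser, not JustHTML."""
    parser = HTMLParser()
    parser.feed(document)
    parser.close()


def fuzz(iterations: int, seed: int = 0) -> list[tuple[str, Exception]]:
    """Return every generated document that made the parser raise."""
    rng = random.Random(seed)  # seeded so any failure can be reproduced
    failures = []
    for _ in range(iterations):
        document = random_document(rng)
        try:
            parse(document)
        except Exception as exc:  # any crash on bad input counts as a finding
            failures.append((document, exc))
    return failures
```

Swapping `parse` for a call into the parser under test, and saving each failing document as a regression test, would turn this into a basic hardening loop of the kind described above.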