Gwtar: a static efficient single-file HTML format
~dev · author.gwern · html · file formats · web
gwern.net · Feb 15, 2026 · Tildes

Summary

Gwtar is a new polyglot HTML archival format which provides a single, self-contained HTML file that a web browser can still lazy-load efficiently. This is done by JavaScript in the file’s header making HTTP range requests. It is used on Gwern.net to serve large HTML archives.

[...]

We introduce a new format, Gwtar (pronounced “guitar”, .gwtar.html extension), which achieves all 3 properties simultaneously. A Gwtar starts as a classic fully-inlined HTML file, which is then processed into a self-extracting concatenated file: an HTML + JavaScript header followed by a tarball of the original HTML and assets. The HTML header’s JS stops web browsers from loading the rest of the file, loads just the original HTML, and then hooks asset requests and turns them into range requests against the tarball part of the file.
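To make the layout concrete, here is a minimal Python sketch of the "header followed by a tarball" idea. It is not the actual Gwtar tooling: the excerpt does not describe the real header contents or index format, so `build_archive`, the placeholder header, and the in-memory index are all assumptions made for illustration. It only shows that each asset in an uncompressed tarball occupies one contiguous byte range of the concatenated file, which is what lets a range request pull out a single asset.

```python
import io
import tarfile


def build_archive(header: bytes, files: dict[str, bytes]) -> tuple[bytes, dict[str, tuple[int, int]]]:
    """Concatenate a header with an uncompressed tarball (hypothetical helper).

    Returns the combined bytes plus an index mapping each member name to its
    inclusive (start, end) byte range within the combined file.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:  # "w" = no compression
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    tar_bytes = buf.getvalue()

    # Re-read the tarball to learn where each member's data begins.
    index: dict[str, tuple[int, int]] = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        for member in tar.getmembers():
            start = len(header) + member.offset_data  # shift by header length
            index[member.name] = (start, start + member.size - 1)
    return header + tar_bytes, index


# Placeholder content; a real header would carry the loader JavaScript.
header = b"<html><!-- loader script would go here --></html>"
files = {"index.html": b"<p>original page</p>", "video.mp4": bytes(5000)}
blob, index = build_archive(header, files)

# Slicing by the indexed range recovers exactly one asset.
start, end = index["video.mp4"]
video = blob[start:end + 1]
```

The slice `blob[start:end + 1]` stands in for what a `Range: bytes=start-end` request would return from a server holding the single file.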

Thus, a regular web browser loads what seems to be a normal HTML file, and all assets are downloaded only when they are needed. In this way, a static HTML page can inline anything (such as gigabyte-size media files), but those will not be downloaded until necessary, even while the server sees just a single large HTML file that it serves as normal. And because it is self-contained in this way, it is forwards-compatible: no future user or host of a Gwtar file needs to treat it specially, as all functionality required is old standardized web browser/server functionality.
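The server-side half of this relies only on ordinary HTTP range requests. As a rough illustration, the following Python sketch simulates how a static file server might answer a single-range `Range` header; `serve_range` is a hypothetical stand-in rather than any real server's code, and the byte layout of `body` is invented for the demo. In an actual Gwtar the requests would come from the header's JavaScript in the browser.

```python
def serve_range(body: bytes, range_header: str) -> tuple[int, dict[str, str], bytes]:
    """Simulate a static server answering one 'bytes=start-end' range request."""
    total = len(body)
    unit, _, spec = range_header.partition("=")
    if unit != "bytes" or "," in spec:
        return 200, {}, body  # unsupported form: send the whole file
    first, _, last = spec.partition("-")
    if first:
        start = int(first)
        end = min(int(last), total - 1) if last else total - 1
    else:
        # Suffix form "bytes=-N": the final N bytes of the file.
        start, end = max(total - int(last), 0), total - 1
    if start > end or start >= total:
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    headers = {"Content-Range": f"bytes {start}-{end}/{total}"}
    return 206, headers, body[start:end + 1]


# Invented layout: loader header, then the original page, then a large asset.
loader = b"<html><!-- loader --></html>"
page = b"<p>original page</p>"
media = bytes(1_000_000)
body = loader + page + media

# Fetch only the original page; the megabyte of media is never transferred.
page_start = len(loader)
page_end = page_start + len(page) - 1
status, headers, chunk = serve_range(body, f"bytes={page_start}-{page_end}")
```

Here the response carries only the bytes of `page`, even though the file on disk is over a megabyte. The media bytes would be transferred only if a later request asked for their range.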