On forking the Web
46 points by spc476
Adding scripting capabilities was a mistake, so we can avoid it now. This doesn't prevent users from having interactive programs. An example is an interactive map that is currently loaded in the browser using JavaScript to show the location of a place of interest. Instead, you can provide a Geo link to open the location in any client that supports the protocol.
So I need an app for everything again? I don't agree that scripting capabilities are a mistake per se. I do like the web as the universal platform crossing OS borders.
Yes, came here to say the same.
The advantage of using a native program to load a standardized file or URL is that it can be optimized to the device in use and prevent the "one size fits all" approach of many interactive Web pages.
I don't want to go back to the time when Linux users were second-class citizens because there was only Windows support (and sometimes Mac).
I don't either, but the Web has become a VM running applications that by sheer coincidence sometimes resemble documents, and only platforms capable of running the VM get to meaningfully access it. Given the size of the VM now and the table stakes needed to support it, that becomes a substantial barrier to entry. I'm not even talking about running Chromium on NumbnutBSD on WeirdAsArchitecture, I mean even things like Ladybird on Linux/x86_64 that should be mainstream.
Ladybird on Linux/x86_64 that should be mainstream
Is this problematic? I haven't used Linux as a daily driver in quite a few years, but I thought this sort of thing wasn't a problem any more, did it get worse since I left?
Yeah. We can rant a lot about web apps, but they are the only way to dodge the Apple tax (and a potential future Google tax) when distributing apps to mobile platforms. And the situation with native desktop development is also not trivial, so I completely sympathize with people who reach for webapps or Electron on the desktop
We can rant a lot about web apps, but they are the only way to dodge the Apple tax (and a potential future Google tax) when distributing apps to mobile platforms.
The Apple tax is a political problem though, not a technical one. Web apps are already inferior to native apps on iOS. Were they to reach technical parity, Apple would institute a new tax. The entire business model of running a software distribution monopoly needs to be made illegal.
So I need an app for everything again?
Why would you? Nothing about this spec precludes you from running a normal web browser, the web as it exists now isn’t going anywhere.
I think the real sweet spot would be to figure out how much can be done with standardized markups. The problem with the modern web is it endlessly reinvents concepts, a lot of which should be declarative markups. The display path for a website really shouldn't involve Javascript. Scripting should be for specific client side programmability, like munging data sets returned from a server and things that would otherwise be done server side.
I think there's a lot of room to achieve this sort of thing by peeling apart intractably complex "web standards" like HTML/CSS into smaller, disjoint formats and protocols.
Instead of tables or complex datagrid webcomponents, make browsers understand CSV or JSONL and render them with a nice tabular UI; no need to scrape HTML to get at the underlying data, and even very rudimentary support for these formats would get you something easy to copy and paste into a local spreadsheet. Need to traverse and fetch directory trees of documents? Open an (S)FTP or Gopher client in a new tab, ideally with some affordances like rendering README.(txt|md) at the bottom of the selected directory's tree view. So many of the protocols we need already exist!
It would be great if, as a user, I could easily and seamlessly extend my web browser to support UXN rom images or display a nice hex editor for unrecognized binary formats, while the authors of the browser and the page can just think in terms of links to documents with MIME types and nested rectangles on a page. Something like the dream of user-stylesheets, but for document interpretation rather than just presentation, and segmented neatly at divisions between document types.
Years ago, we solved these problems of extending browser support for new media types with plugins- Java Applets, Flash, Shockwave, Silverlight. The plugins were platform- and browser-specific and often hosted undocumented, opaque filetypes. Today, we extend browser capabilities with the general-purpose building-blocks of JS, CSS, and HTML, and for the most part those "plugins" are inherently cross-platform, but they're delivered by the site-builders, rather than chosen by users, and the code and data of these extensions are mingled into the soup of the surrounding page.
To me, this illustrates a problem I have with forking the Web, or with Gemini---people want the subset of features they like, but it's a different subset from person to person.
So I need an app for everything again? I don't agree that scripting capabilities are a mistake per se. I do like the web as the universal platform crossing OS borders.
I mean, we do have an app for everything. It's just called a URL or a domain name instead of an app.
I get the impression that the IT world rallied behind the Web Browser as the VM of choice when it became clear that the available sandboxing alternatives, like Java (along with its Swing interface libraries), or even Flash (still within a browser), were painfully inferior. Now, we have a single application --- Google Chrome --- acting as that VM for the vast majority of all general purpose computing that gets done by users (that is owned and developed by a surveillance capitalist monopolist). Whether this is genuinely more secure, or whether the active zero-days are now simply too valuable to ever disclose, I don't know.
I do think adding scripting was a mistake. Or at least, it certainly was a bolted-on afterthought, and I agree with Dillo's scoping for a hypertext document reader that is not concerned with also enabling writing or editing of those documents within the reader itself:
The objective is not to create a feature-by-feature clone of the Web, but to create a specification that allows humans to exchange knowledge, notes, and other forms of information without the imposed requirement of having to run a full-blown VM to read it.
I'd love to see a trimmed down "universal application" that handles the majority of the "interactive" concerns within a better sandbox. Think of something like Reddit, or any other Social Media feed. Do we really need an entire VM for pushing and pulling hypertext? Or "order an item from a store." Is an entire VM necessary for handling a shopping cart and payment information?
But alas, the "universal" tends to take over the "application" and you'd probably just re-invent The Web at that point. Maybe that would still be preferable, in that you'd have a chance for an entity besides Google --- and a language besides C++ --- to be the foundations of it (ETA: looks like Dillo is C and C++; one out of two is better than nothing I suppose :P).
My suggestions for additional things to improve:
Came here to say the same. This alone will prevent several very serious issues.
Thanks!
Ideally, I plan to make it transport agnostic. Probably I'll try local first so it can work over any remote FS mount point.
Let's see if we can improve the diet and avoid cookies this time.
This is a machine representation of the document. It is designed to be readable by humans, but not really to be written by them. Instead you'll want to use a frontend language like Markdown and then compile into a portable strict document.
I wrote some notes on a related idea, but focusing on the HTTP side rather than the HTML side: HTTP 1.0bis: a modest proposal
My suggestions?
Cookies for example.net won't be sent for sub.example.net, and cookies for sub.example.net won't be set for example.net.
Require support for rendering text/markdown
Which version? There's about half a dozen variations to support.
The idea that "adding scripting capabilities was a mistake" has been a meme among dour, no-fun-allowed programmer types for a long time, but I think it's a large failure of imagination. Applied thoughtfully, interactive "multimedia" can vastly enhance communication and explanation. Consider, for example, the interactive figures throughout the Red Blob Games Hex-Grid tutorial or Bartosz Ciechanowski's fantastic explanation of mechanical watch movement. Interactive media in the web makes it possible to try out an obscure but historically significant computer like the Canon Cat in seconds by clicking a link, instead of wrangling with the often nightmarish steps to compile and run a native emulator. Form submission and image maps offer the palest imaginable shadow of multimedia, and they shift the burden of supporting interactivity to an inherently server-heavy (and potentially bandwidth-heavy) model.
The problem is not scripted behavior, it's what scripting is presently allowed to do in a browser. In much the same way that HTTP and HTML could be curtailed into a leaner, simpler system that better respects user autonomy, most of the positive aspects of JavaScript on the web could be retained while vastly reducing the API surface area and malicious potential. Imagine, just for example, a web with Flash-like rectangles of interactivity within pages, but where that interactivity was furnished by user-accessible and inspectable Lua scripts with a Love2D-like set of facilities for drawing graphics and reading input, and where anything privacy-infringing- like phoning home to a remote server or accessing a webcam- was gated behind a strong sandbox and informed, affirmative consent from users. It's possible to write web applications that are respectful in this fashion today, but the substrate is lumpy, inconsistent, and peppered with both obvious omissions of useful functionality and glaring, unnecessary threats to the safety and privacy of users.
For a more accessibility-oriented vision, how about "client-side forms" that strictly process input from declarative UIs- buttons, fields, checkboxes, sliders- and render images and other markup the same as a static page in an <iframe>, but do their work without round-tripping to a remote server? A wide variety of useful calculators, tools, and interactive visualizations could fit in such a model, with better latency and user security than any backend-driven model.
"strict grammar" isn't going to work. It is why XHTML failed, and why HTML5 added rules for how to deal with lots of common "breakage". It may be possible to respecify HTML5 in a more formal grammar like the author wants, but to hard-reject pages isn't a great use of a fork imo.
the other alternative is this becomes yet another replacement for gopher/gemini, and i know those have hardcore fanbases, but they aren't popular for a reason. backcompat is just too strong.
"strict grammar" isn't going to work. It is why XHTML failed
Disagree. XHTML failed because IE didn’t support it until far too late (2011), so people weren’t actually using XHTML, and thus couldn’t reap its benefits. Strictness can be really good, but only works with support.
Certainly “IE!” is a simplification, perhaps gross. It wasn’t the only thing that failed to support it—as I’ve done various archaeology I’ve been surprised to discover just how bad the state of affairs was around XML, SGML, HTML… people played fast and loose, and had a dreadful habit of defining complex machines and then only getting round to implementing a quarter of it (and not always the same quarter), buggily and incompatibly.
But IE categorically blocked any chance of a clean switch-over which is what people had hoped for. Once IE didn’t support it, nothing else mattered.
In a clean room, XHTML is fine. Genuinely. But XHTML was trying to replace something that worked, when despite efforts, it didn’t.
XHTML failed because it was overly complex and disconnected from the actual web, because even experts who were motivated to get it right routinely could not, and because it was full of truly nasty edge cases which could cause your seemingly well-formed, passes-the-validator XHTML document to nonetheless error when handed to client software.
Some prior comments of mine on the topic to illustrate:
Text first
And down with CSS. It exists largely to serve companies, not users. Let the browser, not the page, control style.
If a user chooses to read a raw page payload, they should see that the bulk of it was the same information the browser presented for them to read. Today, the readable content is just the tip of the iceberg.
No scripting
I speculate, if we take away styling and bulky pages, that may greatly reduce the need for scripting that affects presentation. And scripting that doesn't affect presentation has generally been used against the interests of users.
And down with CSS. It exists largely to serve companies, not users. Let the browser, not the page, control style.
This was the whole point of the CSS cascade. There's a reasonable subset of CSS that lets you do formatting of books and papers, and it was supposed to be merged with user styles. But then CSS and formatting got overcomplex, and user styles have to start with a full CSS reset and/or be highly site-specific.
And down with CSS. It exists largely to serve companies, not users. Let the browser, not the page, control style.
I think I'll choose being happy in my world full of CSS filled with colours, fancy backgrounds, nice fonts, layouts of flexboxes, and grids, rather than look only at bland documents which water down the author's voice to black text on a white background.
CSS is a form of author expression on the Web, and I really would not like to get rid of it. It is complex, yeah, but I argue that's a good thing, since it accommodates more individuals doing fun things with their websites.
I speculate, if we take away styling and bulky pages, that may greatly reduce the need for scripting that affects presentation.
If I want to display a timestamp in the reader's timezone, there is currently no way for me to do that without client side scripting. I ran into this while trying to build a thing for myself using no JS on the client and realized I'd have to either have a "set my timezone" setting in the server or just add a little shim for this.
Let the browser, not the page, control style.
This seems like it could easily lead into an even worse case of "my page looks readable in browsers X and Y, but not Z" than we have now.
And down with CSS. It exists largely to serve companies, not users. Let the browser, not the page, control style.
Bear in mind that the alternative to CSS isn't user-driven stylesheets, it's inaccessible, unmodifiable images instead. You saw this on MySpace, on BB forums, even websites running into the limitations of early CSS, and you still see it today, for example on Amazon pages or GitHub README files.
Some of this is a desire for consistent corporate design, but everything you take away from companies, you also take away from users, and users love to express themselves via colour, font, shape, etc. How many web developers got their start playing around with MySpace's CSS options, either for themselves or as technical support for their mates in a band? And forums used to be full of this sort of thing - I remember whenever someone figured out a new technique for getting BBCode to do something cool, it would suddenly be everywhere as everyone wanted to try it out.
So either you need to limit images as well, and possibly even remove things like tables, or you need to accept that most people creating things online want to express themselves visually as well as textually, and give them the tools to do that in an accessible way.
Let the browser, not the page, control style.
Yes, I agree. I need to do more research before banning optional styles from the author.
I speculate, if we take away styling and bulky pages, that may greatly reduce the need for scripting that affects presentation.
I also have a similar feeling. Having a simple grammar could mean that you can just embed a "document" in any interactive native program, rather than the other way around.
Like others are mentioning, I think Gemini is a good example to look at. I'll say it again: I think Gemini is performative art, but there's a LOT that we can learn from it.
One notable thing about Gemini that does not get enough exposure is subscribable pages. Links in a page whose text starts with YYYY-MM-DD form an implicit feed. This sounds extremely limited and dumb, but I find it one of the most striking Gemini features. Spec here.
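For example, a page like this (hypothetical capsule and URLs) doubles as a feed by convention: as I recall, the first level-one heading becomes the feed title and each dated link line becomes an entry:

    # Example Gemlog
    => gemini://example.org/posts/02.gmi 2024-05-08 On subscribable pages
    => gemini://example.org/posts/01.gmi 2024-05-01 Hello, Geminispace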
With traditional HTML, it's reasonable for people to write a blog by hand. It's surely tedious, but quite doable. However, maintaining an RSS/ATOM feed nearly requires something that generates the feed.
A next-gen "content-oriented" HTML would do well to add a similar feature. Maybe h-feed in microformats is the proper way, but I really like the simplicity of Gemini subscribable pages. And pervasive feeds are GOOD.
Gemini being line oriented and easy to parse is a great feature, but I feel it's too limited and might have bad accessibility implications. But I wouldn't mind if there was an HTML-lite that looked like Gemini.
Something else that a web fork could benefit from is fixing some of the HTML bolted-on stuff. <meta name="viewport" content="width=device-width, initial-scale=1.0"/> bothers me a lot. A new version based on what we know today would likely be very nice.
About the other stuff, I'm not so sure.
In principle, I'm completely on board with no JS. However, I think one of the best use cases of the web is universal access to essentials like government, banking, etc. Can you really do everything without JS with good usability? I am not entirely convinced, but could be.
I will also highlight this from another comment:
Nothing about this spec precludes you from running a normal web browser, the web as it exists now isn’t going anywhere.
I would really like if I could run a "content web browser" and "app browsers". Really, there aren't so many good alternatives to the web as an app platform for many purposes- it has evolved a lot and developers seem to prefer it heavily to anything else.
In this world, yeah, Google Maps would not work in my content web browser and would open in my app browser. And if I ran GMail in my app browser, links in emails would open in my content web browser.
Content web browsers ideally would be much easier to implement, and this would foster competition and then innovation. However, I don't really see any path to making this happen, which is unfortunate. For sure I'd be much happier if I could do all my content browsing in such a content browser- by having a much smaller surface area, I would feel much more comfortable about security. But I don't think anyone cares anymore :(
Can you really do everything without JS with good usability?
basically every legitimate use case for JS in a web page (not web app) results from browsers missing important features. we've had decades to learn, and scripts have allowed browsers to just not bother to add these features. so just... add them!
I think this is a very naive view, that you can provide every possible “legitimate use case” without code, just by… adding code.
It really begs the question, who gets to decide what’s “legitimate”? And even if we accept that some things are not legitimate, is it even possible to enumerate the entire set of legitimate use cases? Let alone provide a sufficiently generic solution for all possible combinations?
It’s hard to imagine the various incredible interactive documents for things as diverse as watches and GPS being possible without scripting. Are these not web pages? Are they not legitimate?
I think this would work better with a clear motivation. "Make things simpler" is too abstract, because everyone has a different idea of what's simpler, so I think this needs a more explicit goal: why does the web need to be simpler, what specific need does this serve?
For example, the Gemini project is about creating a community that values certain forms of communication. They have made the web simpler by restricting it to those forms of communication (with even images technically not being supported, iirc), because that fits the goals for their community.
On the other hand, you've got tools like Sciter or Blitz that have the goal of making it easier to embed a browser-like renderer in other applications. They simplify things by removing unnecessary quirks, or making things like HTML parsing or JS execution optional, so that there's less for them to implement and less for their users to embed.
Both are aiming for simplicity, but because their underlying goals are very different, the results look very different. So what's the underlying goal here?
Sounds a bit like Gemini, though I suppose this fork would allow a bit more.
I think the website could be written in some variation of markdown (or something else like it). Have it be a document that can be easily read even in its raw form. Gemtext, but with a few more features, such as inline media.
Then allow some styling capabilities: the web was/is a great place for creativity. Keeping to a simple, consistent set of styling options would still allow the more creative types to craft more whimsical sites.
Instead, you can provide a Geo link to open the location in any client that supports the protocol.
I think it would be quite cool to implement some kind of transport plugin architecture for URIs so you can also open links such as urn:lex:eu:council:directive:2010-03-09;2010-19-UE or urn:ietf:rfc:2648 directly in your browser. The transport plugin would be responsible for fetching the resource and optionally converting it into a compatible format.
That idea leads to the DDDS and NAPTR records in the DNS. (Which miraculously avoid being Turing-equivalent despite being based on repeated rewriting with regexes.)
This is something I care a lot about. I've been working on this for ~7 years (though don't have much to show yet. There's still a lot to learn.)
What I would like to build is a universal application platform that is simple enough for anyone to hold in their head and is structured in a way that makes client-server architectures very hard but local-first software very easy. Some of my past work:
I also have some blog posts:
I don't know if any of the above will become what I am imagining. What I basically want is a system where:
For example, an application could provide a Markdown interface by returning a markdown string. As long as a browser knows about a Markdown interface that only depends on browser system interfaces, it can draw the content. Since other applications can provide interfaces for downstream use, users could choose between multiple markdown reader providers. Same goes for "legacy" HTML and CSS pages in theory.

My current plan is to take Wasm (runtime), WebGPU (graphics), and WIT (IDL), strip out everything that is unnecessary, and wire them together into a single static Rust binary that can be run as a browser or as a server. My only worry is that Wasm, WebGPU, and WIT are still very complex. This idea is sort of what my project Isocore is, but I haven't had a lot of time to work on it recently. (If this were more of an art project, I'd implement some bespoke VM and graphics library. That's probably more fun.)
Forking is not realistic, but defining a subset is. Write a subset of the current standards and tools to check conformance for websites. Then market it for the benefits that it has and allow showing a label or a score. The subset for HTML could force it to be valid XHTML as well.
I have seen some discussion of Gemini recently. However, most of the discussion is around the technical achievements or potential design mistakes.
A couple of posters here have mentioned Gemini but without much additional detail.
What is the subjective experience of using Gemini? What is the ecosystem of sites like? Is it easy to discover high-quality interesting things?
For a quick intro to Gemini, download the Lagrange client (available on most platforms) and check out this aggregator: gemini://warmedal.se/~antenna/
Thank you—I'll try that.
But I actually also want to hear people's subjective feelings about the protocol and ecosystem, which is largely missing from the discussion. Is it fun and exciting and cool and interesting?
(Because, for example, for all the technical discussion around decentralised, federated social networking platforms like Mastodon… it just doesn't feel fun in the way that social media seemed in the early days?)
Obligatory disclaimer: I was the second person to be involved with the Gemini protocol.
The protocol itself: meh. I mean, I had fun writing a server, and it was fun discussing the development of the protocol with Solderpunk, but now? I still read Gemini, and there are a few people I follow there. I still run my own server as it's easy enough. Development has slowed down though---I've only made four patches to my server in the past two years.
The ecosystem? Less than meh. I found the community getting rather toxic as it grew, to the point where I largely stopped bothering. Even Solderpunk stepped away for a bit. The toxicity came from the majority who did not program and weren't very technical, and who hated all the discussion of technical issues while Gemini was still under development. It didn't help that most who were willing to talk tech weren't willing to actually implement their ideas---they just expected other people to implement their obviously genius ideas (not that I'm still bitter, mind you 8-P
Now, it's just a smaller web, more like the web of the early 90s. Are there cool things there? Yes. Interesting? Yes. But not a lot of such sites.
I'll be happy to follow up if you want to talk a bit or have some advice to prevent known pitfalls. I also think that public discourse is not very helpful, so I would rather use private IRC or email.
But I actually also want to hear people's subjective feelings about the protocol and ecosystem, which is largely missing from the discussion. Is it fun and exciting and cool and interesting?
You know the Simpsons joke about ham radio? Basically that
Slightly off-topic for the discussion on this article, but subjectively, using Gemini is a lot like using a more modern Gopher, mixed with Web 1.0. A lot of the easiest content to find is blogs, as there are several well-established blog aggregators. There are a lot of personal 'web garden' type sites, but they are not necessarily as easy to discover; the least bad way is probably to climb up from blogs that you like to explore sites and their linkages. There are not quite the kind of obsessive "everything about specific topic" sites that you would find on the early web before Web 2.0 and platformization, at least not in the same quantity.
So kind of imagine a cross between Geocities-era and peak-blogosphere WWW, but designed for a terminal browser or a Markdown previewer depending on your choice of client.
By the way, I'm surprised no one has discussed the other side of this.
If the web were to give way to a text-only, document-focused replacement… what do we do with all the apps?
Well, it turns out that VNC is actually surprisingly usable for its age. I have, for example, used in-flight wifi—on an international flight (!!)—to connect to a QEMU virtual machine running x11vnc for an Xvfb session that was running in my desktop connected via residential (and very asymmetric) coax… and the latency was rough, but it wasn't wholly unusable!
Considering that (naïve) VNC clients exist on every single platform, and considering that we could adapt the approach taken by x0vncserver and have a GUI (written in whatever language with whatever toolkit) render to a framebuffer, could we also replace the application-focused use of the web?
Obviously, using a naïve VNC client for this will be rough and unsuitable for universal use. You're rendering an opaque framebuffer, so there's no accessibility accommodations at all. But I'm not sure what stops you from forking the VNC protocol to include this information in a way that a non-naïve client could use.
Given the state of most internal-only corporate apps (and that said apps usually assume fast, low-latency server⇋desktop connectivity,) I'm quite surprised this hasn't already been tried…
You could actually get something working in fairly short order… with (usable but imperfect) client support on all platforms!
For an internal app, you wouldn't (necessarily) need to learn a new language or new protocols: you'd just need a library that could render to and serve a VNC framebuffer. It would look like a variation on normal GUI programming.
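Roughly, I'd imagine the server-side skeleton looking something like the sketch below (untested, against libvncserver's stock rfb API; the actual widget hit-testing and pixel-pushing are left as comments):

    #include <stdlib.h>
    #include <rfb/rfb.h>

    #define W 320
    #define H 240

    /* libvncserver calls this for pointer events from any connected client */
    static void on_pointer(int buttonMask, int x, int y, rfbClientPtr cl)
    {
        rfbScreenInfoPtr s = cl->screen;
        if (buttonMask & 1) {
            /* ...hit-test widgets and repaint s->frameBuffer here... */
            rfbMarkRectAsModified(s, 0, 0, W, H);  /* clients refetch this rect */
        }
        rfbDefaultPtrAddEvent(buttonMask, x, y, cl);
    }

    int main(int argc, char **argv)
    {
        /* 8 bits per sample, 3 samples per pixel, 4 bytes per pixel (RGBX) */
        rfbScreenInfoPtr server = rfbGetScreen(&argc, argv, W, H, 8, 3, 4);
        server->frameBuffer = calloc(W * H, 4);   /* starts out black */
        server->ptrAddEvent = on_pointer;
        rfbInitServer(server);
        rfbRunEventLoop(server, -1, FALSE);       /* blocks; serves VNC on :5900 */
        return 0;
    }

Link against libvncserver, point any stock VNC client at port 5900, and the rest is ordinary immediate-mode GUI programming against that framebuffer.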
Just to casually explore this idea a bit, I took the fifteen minutes since I posted the above to ask a code generator to create a (working!) proof of concept for this approach: a simple ±counter app, written in C using only libvncserver (i.e., writing the GUI from scratch, using an immediate mode style.)
I won't share the 500 lines of code it generated, because why should you bother to read something that I couldn't be bothered to write? Also, the code is quite bad (as language-model-generated code tends to be,) and makes some very odd choices, but as solely a measure of feasibility… the app works. It works on my Linux desktop and my iPad (using stock VNC apps.) Mouse input is a bit laggy on the tablet (which I presume is a consequence of using libvncserver…?) Keyboard input is very smooth.
So… if you're wondering if this approach is even possible… you're only fifteen minutes away from trying it (and seeing the limitations) yourself.
(And I just can't stress this detail enough: native clients already exist on every platform everywhere.)