A year on from writing "Truth in the Age of Mechanical Reproduction", much of what I had feared has come to pass. In fact, in many cases things are worse than I expected.
I would never have imagined a year ago that Google would kill web searching as we know it. I did not expect absolutely every product in the tech sector to attempt to increase valuation by tossing LLMs into their product, regardless of functionality or utility. Perhaps I should have.
But the rapid takeover of the web by generative text and images? That I did see coming, and here we are. I don't know about you, but interacting with the wider web these days feels like I'm picking up a device with an exposed wire that zaps me about 50% of the time. It used to be 30%. Next week, it may be 70%. Point is, the web I grew up with, fell in love with, and have—in many ways—built a life around, is being choked out of existence.
If the internet of websites is a garden tended by many villagers, generative models are an army of steamrollers that have come to pave over the green, wrecking balls to demolish the cottages, and scores of builders ready to erect strip malls as far as the eye can see.
This is not just a threat to the state of human knowledge; it's a threat to anyone who makes a living on the internet. As Casey Newton aptly puts it.
...To everyone who depended even a little bit on web search to have their business discovered, or their blog post read, or their journalism funded, the arrival of AI search bodes ill for the future. Google will now do the Googling for you, and everyone who benefited from humans doing the Googling will very soon need to come up with a Plan B.
And as a final injury and insult, this mania is actively harming the planet. The energy-intensive nature of training these models is not a carbon expenditure the planet can afford.
In truth, I don't know anyone who wants the web to go this way. I mean: maybe the closed loop of venture capital, big tech, and financial analysts do, but I've yet to find a user of the web who thinks all this makes our digital lives better. It's just...what the corpos want, for reasons that will almost certainly end up hurting most people and access to human knowledge.
So here in 2024, I'd have to classify the state of human knowledge as "not great." If we used the Bulletin of the Atomic Scientists model, I'd say we're about 90 seconds to midnight.
So time's of the essence. We need to get to fixing this, if there's any hope of saving what's worth saving of the old web.
I have some ideas, humbly submitted.
The Human Web
For the rest of this piece, I'll be referring to this idea of "The Human Web." This is the network of sites and works created by people, without generative assistance. It is art, culture, journalism, history, technical information, and more. Is it commerce? Personally I think it has to be, but we'll get to that.
What isn't The Human Web? Aggregated, distilled, spoonfed by central platforms vying to be the mediator of your experience.
These places and works absolutely still exist on the internet we know. But good luck trying to find them amidst the roiling sea of ad-driven content—and now, the flood of LLM-generated slurry. The internet always had a scale problem; that's why search engines were so critical. But now, the very tools we rely on to make the web discoverable are working against us.
The question then is: can the human web survive and flourish on an internet dominated by synthetic content, aggregated by tools that favor the large players to the near-exclusion of everyone else?
I'm not so sure. And if not, what we're talking about is a hard fork of the web.
The Hard Fork
Alternative webs already exist. Of course there's Tor, but also things like IPFS which create alternative, decentralized webs. Whether any of the extant tools are truly adequate for creating a Human Web is unclear to me. I think perhaps not, because none of these tools have built-in mechanisms for verifying the humanity of creators. That's what we want, right? Some guarantee that the material we access was made by people, for people.
Let's imagine what a greenfield Human Web would look like, and what it would require. I've arrived at five tenets. There could be more. But for now, here they are:
- Accessibility
- Identity
- Accountability
- Security
- Democracy
The Human Web Must Be Accessible
If we're talking about a whole new protocol, or even some additional layer on top of HTTP, access to THW must be dead-simple. Some have claimed that the only way to guarantee that the "right" people arrive in THW is to maintain some technical barrier to entry. I unequivocally reject this notion. No one should be barred from entry simply because they couldn't figure out how to work the door.
From a technical perspective, this would constitute a new market of browsers, or perhaps extensions to existing web browsers that enable interfacing with THW. You could imagine something like an extension that, when activated, enabled access to THW pages, and perhaps optionally blocked access (or warned on access) to "old web" sites.
I am so not the right person to design a protocol. Keep that in mind as we continue.
Accessibility also includes considerations for the differently-abled. Some kind of enforcement for things like alt text on images and accessible page design would be nice. That's a tricky one to pull off, but more on that later.
Lastly on accessibility, I believe the tenet obliges creators to be mindful of compute resource usage, both on server and client. On THW, nothing should require the planet-scale compute we've become accustomed to. Nor should it require megabytes of client-side code to render. That may just be wishful thinking, but I think it's worth considering.
Put another way: The Human Web must run on everything.
The Human Web is Made of Identities
I've been thinking a lot about how our traditional model of networks no longer applies. Sure, we still have switches, routers, and servers. But what we carry across those physical barriers, what we use to access resources, is some concept of identity. Be that an account, a key, or something else, identity has become a nearly atomic element of modern networked systems. In enterprise systems, tools like Okta take advantage of this to unify access to multiple resources behind verification of a single user identity. Of course, major platforms like Google and Facebook have leveraged their ubiquity to make themselves Identity providers in the Open ID Connect authentication scheme.
Newer technologies have taken this concept further. I'm thinking in particular of Bluesky's ATProto and its use of W3C-standard Distributed Identifiers for portable account identification.
In the Human Web, I believe that identities should be the top-level object that users access, not "sites." Identities can host pages. Those pages can have whatever (human-created) material you want on them. But the top-level "domain," as it were, is an identity with a human behind it.
Now, an identity need not be tied to an IRL identity. This is not a "real name" policy. It is simply an identifier that we have confidence a human being is operating. It should also be possible for an individual to own multiple identities. And organizations could maintain an identity as well, provided what they do with it is human-created.
Remember that THW must be built on a currency of trust, however that is defined at a technical level. To what do we ascribe trust? Domains? Websites? No; trust is only between two entities, or identities. This a corollary of "A computer can never be held accountable."
Also therefore, a computer can not be trusted. That's like, the whole point of The Human Web.
Speaking of accountability...
The Human Web Must be Accountable
This is by far the most difficult part of what I'm proposing. I am not sure it's possible to design this system in a way that prevents cheating it. Still, I think it's worth taking a stab at it. Ideally, The Human Web would comprise identities that are vetted as human, and otherwise welcomed in the space. I'm not exactly talking about Trust and Safety or moderation here, because those issues exist at levels above a protocol. Nevertheless, the protocol could include a mechanism to empower those functions.
Once again going back to how THW would be distinct from the oldweb, we would seek to severely limit the power of content farms and generative content. Should such material appear, the network should be able to respond like an immune system to diminish its discoverability.
Once again, keep in mind this is not wholly gamed out, merely a rough sketch.
You might imagine a reputation score attached to all Identities in The Human Web. Perhaps on a range between 0-1000. All Identities start at 500. Visits to one's ID and associated pages confer some fractional increase. The browser has the ability to report IDs for violations of the network's Terms of Use (no AI, no hate speech, etc.). Enough reports in a given time period would confer a reputation penalty.
If a reputation falls below a certain threshold, browsers could warn before access. And in-network search engines could rank based on reputation. At the very least, the reputation could be displayed with search results.
You might be thinking that this is a soft sell for a blockchain. Maybe? You can do distributed ledgers without blockchain. I do think that the system has to be distributed and verifiable to have any meaning. So a cryptographically secure means of producing and distributing that list of transactions would seem prudent. This is where it's important to remember that distributed ledgers are not the same as blockchain or cryptocurrency. Hell, even Veilid uses distributed hash tables.
This is not a complete plan. I know there are holes in it. But the bigger point is: The Human Web requires an inbuilt method of holding bad faith actors accountable, and protecting users from them.
And on that note:
The Human Web Must Be Secure
I'm not certain that complete end-to-end encryption is possible for an entire network that works the way we expect the web to work. Still, to the degree possible, The Human Web should seek to maximize secrecy and privacy. This may seem in direct contradiction with the concept of Identities. I would say privacy in useful tension with Identities. The Human Web is built on trust, so perfect anonymity can't be an objective. But communications in the network should have protocol-level assurances that nobody is snooping on them.
Think about it this way: say someone creates a new social media platform on THW: one that includes private messages. It could be the case that the platform acts simply as a way of linking your Identity's chat resources to someone else's, never observing either end of the communication. This is a different model than how we imagine servers and clients today, but something that an Identity-first network enables.
Identities themselves must be secure. That's going to mean at least some degree of multi-factor authentication, and I'm actually open to higher standards than that. If our goal is to guarantee a human brain behind each Identity, there may be other tests necessary to pass before an Identity is granted. I don't think the Google Maps for Business method of mailing a confirmation code makes total sense, but then again, maybe there's something to that. Perhaps even the slowness of onboarding could be a defense against the artificial. Humans can wait. Bots and botnets can't.
Okay, one last one, and it may be even less possible than Accountability.
The Human Web Must Be Democratic
Who decides what goes and what doesn't on The Human Web? The Human Web does.
You know, I grew up on the internet. I made friends, fell in love, and built a life online. It is, in many senses, no less real to me than "meatspace." And to me, the internet absolutely is a place. It may not be physical, but it is something in which I exist in some form. I don't have another word for it other than "place."
But the internet has always been a weird demilitarized zone when it comes to the laws that govern conduct there. That's of course in part due to corporate interests, international law (or lack thereof), and the hodgepodge of local regulations that impact how things get done on the web. But the internet never thought of itself as a place, and so it was unimaginable to build a government around it as a whole.
The Human Web must do exactly that. Since it is a network based on upholding certain shared values, there is an inherent need for a body to guarantee that the network is doing its job. This is the purpose of a government: to meet the needs of its citizens.
I'll spare you the Churchill quote, but democracy really is the least worst option we have for governance at scale. Whether direct or representative, making sure users have a meaningful voice in the direction of THW and the hard choices it will face over time.
This is not a business, and there should be no CEO. Reject benevolent dictators. Embrace term limits. The elected officers of THW would serve to administer the network and ask the questions that need asking. Perhaps there is room for a broader legislative body as well. Smarter people than I can figure that out while drafting the Human Web Constitution.
I don't know if one Identity equals one vote. That seems easy to game. Perhaps some reputational threshold would be necessary for voting rights. That feels gross as well though, since it would almost necessarily exclude newcomers from the conversation. Like I said, this one may be harder than Accountability. If states can't figure out digital voting, how would we?
Disclaimer: Imperfect
I told you up front that I was no protocol designer. I also said this is incomplete. It is submitted as thoughts in a moment in time—a moment in which I believe the users of the web must band together to protect human knowledge from the coming flood. It's already raining; the question remains what will survive the storm.
Despite these imperfections, I hope there was some value here for you, even as a list of points to disagree with. And I hope someday to see you on The Human Web.