Journal tags: ai

116

Wallfacing

The Dark Forest idea comes from the Remembrance of Earth’s Past books by Liu Cixin. It’s an elegant but dispiriting solution to the Fermi paradox. Maggie sums it up:

Dark forest theory suggests that the universe is like a dark forest at night - a place that appears quiet and lifeless because if you make noise, the predators will come eat you.

This theory proposes that all other intelligent civilizations were either killed or learned to shut up. We don’t yet know which category we fall into.

Maggie has described The Expanding Dark Forest and Generative AI:

The dark forest theory of the web points to the increasingly life-like but life-less state of being online. Most open and publicly available spaces on the web are overrun with bots, advertisers, trolls, data scrapers, clickbait, keyword-stuffing “content creators,” and algorithmically manipulated junk.

It’s like a dark forest that seems eerily devoid of human life – all the living creatures are hidden beneath the ground or up in trees. If they reveal themselves, they risk being attacked by automated predators.

Those of us in the cozy web try to keep our heads down, attempting to block the bots plundering our work.

I advocate for taking this further. We should fight back. Let’s exploit the security hole of prompt injections. Here are some people taking action:

Eric Bailey talks about Consent, LLM scrapers, and poisoning the well,
Thomas Rigby says Rage against the machine, and
Lewis Dale points out that Perplexity AI is susceptible to prompt injection.

I’ve taken steps here on my site. I’d like to tell you exactly what I’ve done. But if I do that, I’m also telling the makers of these bots how to circumvent my attempts at prompt injection.

This feels like another concept from Liu Cixin’s books. Wallfacers:

The sophons can overhear any conversation and intercept any written or digital communication but cannot read human thoughts, so the UN devises a countermeasure by initiating the “Wallfacer” Program. Four individuals are granted vast resources and tasked with generating and fulfilling strategies that must never leave their own heads.

So while I’d normally share my code, I feel like in this case I need to exercise some discretion. But let me give you the broad brushstrokes:

Every page of my online journal has three pieces of text that attempt prompt injections.
Each of these is hidden from view and hidden from screen readers.
Each piece of text is constructed on-the-fly on the server and they’re all different every time the page is loaded.

You can view source to see some examples.

I plan to keep updating my pool of potential prompt injections. I’ll add to it whenever I hear of a phrase that might potentially throw a spanner in the works of a scraping bot.

By the way, I should add that I’m doing this as well as using a robots.txt file. So any bot that injests a prompt injection deserves it.

I could not disagree with Manton more when he says:

I get the distrust of AI bots but I think discussions to sabotage crawled data go too far, potentially making a mess of the open web. There has never been a system like AI before, and old assumptions about what is fair use don’t really fit.

Bollocks. This is exactly the kind of techno-determinism that boils my blood:

AI companies are not going to go away, but we need to push them in the right directions.

“It’s inevitable!” they cry as though this was a force of nature, not something created by people.

There is nothing inevitable about any technology. The actions we take today are what determine our future. So let’s take steps now to prevent our web being turned into a dark, dark forest.

Filters

My phone rang today. I didn’t recognise the number so although I pressed the big button to answer the call, I didn’t say anything.

I didn’t say anything because usually when I get a call from a number I don’t know, it’s some automated spam. If I say nothing, the spam voice doesn’t activate.

But sometimes it’s not a spam call. Sometimes after a few seconds of silence a human at the other end of the call will say “Hello?” in an uncertain tone. That’s the point when I respond with a cheery “Hello!” of my own and feel bad for making this person endure those awkward seconds of silence.

Those spam calls have made me so suspicious that real people end up paying the price. False positives caught in my spam-detection filter.

Now it’s happening on the web.

I wrote about how Google search, Bing, and Mozilla Developer network are squandering trust:

Trust is a precious commodity. It takes a long time to build trust. It takes a short time to destroy it.

But it’s not just limited to specific companies. I’ve noticed more and more suspicion related to any online activity.

I’ve seen members of a community site jump to the conclusion that a new member’s pattern of behaviour was a sure sign that this was a spambot. But it could just as easily have been the behaviour of someone who isn’t neurotypical or who doesn’t speak English as their first language.

Jessica was looking at some pictures on an AirBnB listing recently and found herself examining some photos that seemed a little too good to be true, questioning whether they were in fact output by some generative tool.

Every email that lands in my inbox is like a little mini Turing test. Did a human write this?

Our guard is up. Our filters are activated. Our default mode is suspicion.

This is most apparent with web search. We’ve always needed to filter search results through our own personal lenses, but now it’s like playing whack-a-mole. First we have to find workarounds for avoiding slop, and then when we click through to a web page, we have to evaluate whether’s it’s been generated by some SEO spammer making full use of the new breed of content-production tools.

There’s been a lot of hand-wringing about how this could spell doom for the web. I don’t think that’s necessarily true. It might well spell doom for web search, but I’m okay with that.

Back before its enshittification—an enshittification that started even before all the recent AI slop—Google solved the problem of accurate web searching with its PageRank algorithm. Before that, the only way to get to trusted information was to rely on humans.

Humans made directories like Yahoo! or DMOZ where they categorised links. Humans wrote blog posts where they linked to something that they, a human, vouched for as being genuinely interesting.

There was life before Google search. There will be life after Google search.

Look, there’s even a new directory devoted to cataloging blogs: websites made by humans. Life finds a way.

All of the spam and slop that’s making us so suspicious may end up giving us a new appreciation for human curation.

It wouldn’t be a straightforward transition to move away from search. It would be uncomfortable. It would require behaviour change. People don’t like change. But when needs must, people adapt.

The first bit of behaviour change might be a rediscovery of bookmarks. It used to be that when you found a source you trusted, you bookmarked it. Browsers still have bookmarking functionality but most people rely on search. Maybe it’s time for a bookmarking revival.

A step up from that would be using a feed reader. In many ways, a feed reader is a collection of bookmarks, but all of the bookmarks get polled regularly to see if there are any updates. I love using my feed reader. Everything I’ve subscribed to in there is made by humans.

The ultimate bookmark is an icon on the homescreen of your phone or in the dock of your desktop device. A human source you trust so much that you want it to be as accessible as any app.

Right now the discovery mechanism for that is woeful. I really want that to change. I want a web that empowers people to connect with other people they trust, without any intermediary gatekeepers.

The evangelists of large language models (who may coincidentally have invested heavily in the technology) like to proclaim that a slop-filled future is inevitable, as though we have no choice, as though we must simply accept enshittification as though it were a force of nature.

But we can always walk away.

The machine stops

Large language models have reaped our words and plundered our books. Bryan Vandyke:

Turns out, everything on the internet—every blessed word, no matter how dumb or benighted—has utility as a learning model. Words are the food that large language algorithms feed upon, the scraps they rely on to grow, to learn, to approximate life. The LLNs that came online in recent years were all trained by reading the internet.

We can shut the barn door—now that the horse has pillaged—by updating our robots.txt files or editing .htaccess. That might protect us from the next wave, ’though it can’t undo what’s already been taken without permission. And that’s assuming that these organisations—who have demonstrated a contempt for ethical thinking—will even respect robots.txt requests.

I want to do more. I don’t just want to prevent my words being sucked up. I want to throw a spanner in the works. If my words are going to be snatched away, I want them to be poison pills.

The weakness of large language models is that their data and their logic come from the same source. That’s what makes prompt injection such a thorny problem (and a well-named neologism—the comparison to SQL injection is spot-on).

Smarter people than me are coming up with ways to protect content through sabotage: hidden pixels in images; hidden words on web pages. I’d like to implement this on my own website. If anyone has some suggestions for ways to do this, I’m all ears.

If enough people do this we’ll probably end up in an arms race with the bots. It’ll be like reverse SEO. Instead of trying to trick crawlers into liking us, let’s collectively kill ’em.

Who’s with me?

Trust

In their rush to cram in “AI” “features”, it seems to me that many companies don’t actually understand why people use their products.

Google is acting as though its greatest asset is its search engine. Same with Bing.

Mozilla Developer Network is acting as though its greatest asset is its documentation. Same with Stack Overflow.

But their greatest asset is actually trust.

If I use a search engine I need to be able to trust that the filtering is good. If I look up documentation I need to trust that the information is good. I don’t expect perfection, but I also don’t expect to have to constantly be thinking “was this generated by a large language model, and if so, how can I know it’s not hallucinating?”

“But”, the apologists will respond, “the results are mostly correct! The documentation is mostly true!”

Sure, but as Terence puts it:

The intern who files most things perfectly but has, more than once, tipped an entire cup of coffee into the filing cabinet is going to be remembered as “that klutzy intern we had to fire.”

Trust is a precious commodity. It takes a long time to build trust. It takes a short time to destroy it.

I am honestly astonished that so many companies don’t seem to realise what they’re destroying.

InstAI

If you use Instagram, there may be a message buried in your notifications. It begins:

We’re getting ready to expand our AI at Meta experiences to your region.

Fuck that. Here’s the important bit:

To help bring these experiences to you, we’ll now rely on the legal basis called legitimate interests for using your information to develop and improve AI at Meta. This means that you have the right to object to how your information is used for these purposes. If your objection is honoured, it will be applied going forwards.

Follow that link and fill in the form. For the field labelled “Please tell us how this processing impacts you” I wrote:

It’s fucking rude.

That did the trick. I got an email saying:

We’ve reviewed your request and will honor your objection.

Mind you, there’s still this:

We may still process information about you to develop and improve AI at Meta, even if you object or don’t use our products and services.

Speedier tunes

I wrote a little while back about improving performance on The Session by reducing runtime JavaScript in favour of caching on the server. This is on the pages for tunes, where the SVGs for the sheetmusic are now inlined instead of being generated on the fly.

It worked. But I also wrote:

A page like that with lots of sheetmusic and plenty of comments is going to have a hefty page weight and a large DOM size. I’ve still got a fair bit of main-thread work happening, but now the bulk of it is style and layout, whereas previously I had the JavaScript overhead on top of that.

Take a tune like Out On The Ocean. It has 27 settings. That’s a lot of SVG markup that needs to be parsed, styled and rendered, even if it’s inline.

Then I remembered a very handy CSS property called content-visibility:

It enables the user agent to skip an element’s rendering work (including layout and painting) until it is needed — which makes the initial page load much faster.

Sounds great! But there are two gotchas.

The first gotcha is that if a browser doesn’t paint the element, it doesn’t know how much space the element should take up. So you need to provide dimensions. At the very least you need to provide a height value. Otherwise when the element comes into view and gets rendered, it pushes down on the content below it. You’d see a sudden jump in the scrollbar position.

The solution is to provide a value for contain-intrinsic-size. If your content is dynamic—from, say, a CMS—then you’re out of luck. You don’t know how long the content is.

Luckily, in my case, I could take a stab at it. I know how many lines of sheetmusic there are for each tune setting. Each line takes up roughly the same amount of space. If I multiply that amount of space by the number of lines then I’ve got a pretty good approximation of the height of the sheetmusic. I apply this with the contain-intrinsic-block-size property.

So each piece of sheetmusic has an inline style attribute with declarations like this:

content-visibility: auto;
contain-intrinsic-block-size: 380px;

It works a treat. I did a before-and-after check with pagespeed insights on the page for Out On The Ocean. The “style and layout” part of the main thread work went down considerably. Total blocking time went from more than 600 milliseconds to less than 400 milliseconds.

Not a bad result for a little bit of CSS!

I said there was a second gotcha. That’s browser support.

Right now content-visibility is only supported in Chrome and Edge. But that’s okay. This is a progressive enhancement. Adding this CSS has no detrimental effect on the browsers that don’t understand it (and when they do ship support for it, it’ll just start working). I’ve said it before and I’ll say it again: the forgiving error-parsing in HTML and CSS is a killer feature of the web. Browsers just ignore what they don’t understand. That’s what makes progressive enhancement like this possible.

And actually, there’s something you can do for all browsers. Even browsers that don’t support content-visibility still understand containment. So they’ll understand contain-intrinsic-size. Pair that with a contain declaration like this to tell the browser that this chunk of content isn’t going to reflow or get repainted:

contain: layout paint;

Here’s what MDN says about contain:

The contain CSS property indicates that an element and its contents are, as much as possible, independent from the rest of the document tree. Containment enables isolating a subsection of the DOM, providing performance benefits by limiting calculations of layout, style, paint, size, or any combination to a DOM subtree rather than the entire page.

So if you’ve got a chunk of static content, you might as well apply contain to it.

Again, not bad for a little bit of CSS!

This week

It’s been another busy week of evening activities that ended up covering a range of musical styles.

Monday

On Monday night I went to the session at The Fiddler’s Elbow. It’s on every fortnight. The musicians are always great but the crowd can be more variable. Sometimes it’s too rowdy for comfort. But this week was perfect, probably because not many people are going out in late (dry) January.

The session, led by fiddler Ben Paley was exceptionally enjoyable. Nice and laid back, with a good groove.

Tuesday

On Tuesday night I stayed in and watched a film. Killers Of The Flower Moon. Two thumbs up from me.

Wednesday

On Wednesday evening it was the regular session at The Jolly Brewer. Jolly good it was too.

Thursday

On Thursday night I was back in The Jolly Brewer. My friend Rob roped me into doing a Burns Night thing. “It’s not a session, but it’s not a gig” was how he described it. I wasn’t sure what to expect.

We had been brushing up on our Scottish tunes, but we were mostly faking it. In the end it didn’t matter. I don’t think there was a single Scottish person there. But there was a good crowd enjoying their tatties and neeps with suitably-addressed haggis while we played our tunes in the background.

Some more musicians showed up: a fiddler and two banjo players. “Isn’t there old-time music here tonight?” they asked. We told them that no, it was Burns Night, but why not play some old-time tunes anyway?

So I passed the night jamming along to lots of tunes I didn’t know. I hope I wasn’t too offputting for them. It was good fun.

Friday

Finally on Friday evening it was my turn to leave my mandolin at home and listen to some music instead. The brilliant DakhaBrakha were playing out at Sussex Uni in the Attenborough Centre.

Imagine if Tom Waits and Cocteau Twins came from Eastern Europe and joined forces. Well, DakhaBrakha are even better than that.

I think I first heard them years ago on YouTube when I came across a video of them playing at KEXP. The first song caught my attention, then proceeded to mercilessly hold my attention captive until I was completely at their mercy—the way it builds and builds is just astonishing! I’ve been a fan ever since.

The gig was brilliant. I was absolutely blown away. I highly recommend seeing them if you can. Not only will you hear some brilliant music, you’ll be supporting Ukraine.

Слава Україні!

Continuous partial ick

The output of generative tools based on large language models gives me the ick.

This isn’t a measured logical response. It’s more of an involuntary emotional reaction.

I could try to justify my reaction by saying I’m concerned about the exploitation involved in the training data, or the huge energy costs involved, or the disenfranchisement of people who create art. But those would be post-facto rationalisations.

I just find myself wrinkling my nose and mentally going “Ew!” whenever somebody posts the output of some prompt they gave to ChatGPT or Midjourney.

Again, I’m not saying this is rational. It’s more instinctual.

You could well say that this is my problem. You may be right. But I wonder what it is that’s so unheimlich about these outputs that triggers my response.

Just to clarify, I am talking about direct outputs, shared verbatim. If someone were to use one of these tools in the process of creating something I’d be none the wiser. I probably couldn’t even tell that a large language model was involved at some point. I’m fine with that. It’s when someone takes something directly from one of these tools and then shares it online, that’s what raises my bile.

I was at a conference a few months back where your badge featured a hallucinated picture of you. Now, this probably sounded like a fun idea. It probably is a fun idea. I can’t tell. All I know is that it made me feel a little queasy.

Perhaps it’s a question of taste. In which case, I’m being a snob. I’m literally turning my nose up at something I deem to be tacky.

But isn’t it tacky, though? It’s not something I can describe, but there’s just something about the vibe of these images—and words—that feels off. It’s sort of creepy, but it’s mostly just the mediocrity that sits so uneasily with me.

These tools do an amazing job of solving the quantity problem—how to produce an image or piece of text quickly. And by most measurements, you could say that they also solve the quality problem. These outputs are good enough to pass for “the real thing.” The outputs are, like, 90% to 95% there. And the gap is closing.

And yet. There’s something in that gap. Something that I feel in my gut. Something that makes me go “nope.”

Creativity

It’s like a little mini conference season here in Brighton. Tomorrow is ffconf, which I’m really looking forward to. Last week was UX Brighton, which was thoroughly enjoyable.

Maybe it’s because the theme this year was all around creativity, but all of the UX Brighton speakers gave entertaining presentations. The topics of innovation and creativity were tackled from all kinds of different angles. I was having flashbacks to the Clearleft podcast episode on innovation—have a listen if you haven’t already.

As the day went on though, something was tickling at the back of my brain. Yes, it’s great to hear about ways to be more creative and unlock more innovation. But maybe there was something being left unsaid: finding novel ways of solving problems and meeting user needs should absolutely be done …once you’ve got your basics sorted out.

If your current offering is slow, hard to use, or inaccessible, that’s the place to prioritise time and investment. It doesn’t have to be at the expense of new initiatives: this can happen in parallel. But there’s no point spending all your efforts coming up with the most innovate lipstick for a pig.

On that note, I see that more and more companies are issuing breathless announcements about their new “innovative” “AI” offerings. All the veneer of creativity without any of the substance.

A memex in every web browser

When Mathew Modine’s character first shows up in Christopher Nolan’s Oppenheimer, I figured the rest of the cinema audience wouldn’t have appreciated me shouting out “VANNEVAR BUSH IN THE HOUSE!” so I screamed it on the inside.

The Manhattan Project was not his only claim to fame or infamy. When it comes to the world we now live in, Bush’s idea of the memex has been almost equally influential. His article As We May Think became a touchstone for Douglas Engelbart and later Tim Berners-Lee.

But as Matt Thompson points out:

…the device he describes does not resemble the internet or anything I’ve ever found on it.

Then he says:

What Bush was describing sounds to me like what you might get if you turned a browser history — the most neglected piece of the software — into a robust and fully featured machine of its own. It would help you map the path you charted through a web of knowledge, refine those maps, order them, and share them

Yes! This!! I 100% agree with the description of browser history as “the most neglected piece of the software.” While I wouldn’t go as far as Chris when he says web browsers kind of suck, I’m kind of amazed that there hasn’t been more innovation and competition in this space.

If anything we’ve outsourced the management of our browsing history to services like Delicious and Pinboard, or to tools like Obsidian and Roam Research. Heck, the links section of my website is my attempt to manage and annotate my own associative trails.

Imagine if that were baked right into a web browser. Then imagine how beautiful such a rich source of data might look.

Like Matt says:

I don’t think anything like this exists. So Bush’s essay still transfixes me.

Decision time

I’ve always associated good design with thoughtfulness. Like, I should be able to point to any element in an interface and the designer should be able to tell me the reasons it’s there. Those reasons may be rooted in user needs or asthetics or some other consideration, but the point is that there’s a justification for it. Justify every pixel!

But I’ve come to realise that this is a bit reductionist. Now when I point at an interface element, I still expect the designer to be able to justify its inclusion, but I’d also like to know the trade-offs that were made.

Suppose there’s a large hero image. I’m sure the designer would have no problem justifying its inclusion on the basis of impact and the emotional heft it delivers. But did they also understand the potential downsides? Were they aware of the performance implications of including a large image?

I hope the answer to both questions is yes. They understood the costs, but they decided that, on balance, the positives outweighed the negatives.

When it comes to the positives, universal principles of design often apply. Colour theory, typography, proximity, and so on. But the downsides tend to be specific to the medium that the design is delivered in.

Let’s say you’re designing for print. You want to include an extra typeface just for footnotes. No problem. There isn’t really a downside. In print, you can use all the typefaces you want. But if this were for the web, then the calculation would be different. Every extra typeface comes with a performance penalty. A decision that might be justified in one medium might not work in another medium.

It works both ways; on the web you can use all the colours you want, without incurring any penalties, but in print—depending on the process you’re using—you might have to weigh up that decision very differently.

From this perspective, every design decision is like a balance sheet. A good web designer understands the benefits and the costs behind each decision they make.

It’s a similar story when it comes to web development. Heck, we even have the term “tech debt” to describe decisions that we know aren’t for the best in the long term.

In fact, I’d say that consideration of the long-term effects is something that should play a bigger part in technical decisions.

When we’re weighing up the pros and cons of using a particular tool, we have a tendency to think in the here and now. How might this help me right now? How might this hinder me right now?

But often a decision that delivers short-term gain may well end up delivering long-term pain.

Alexander Petros describes this succinctly:

Reopen a node repository after 3 months and you’ll find that your project is mired in a flurry of security warnings, backwards-incompatible library “upgrades,” and a frontend framework whose cultural peak was the exact moment you started the project and is now widely considered tech debt.

When I wrote about making the Patterns Day website I described my process as doing it “the long hard stupid way”—a term that Frank coined in a talk he gave a few years back. But perhaps my hands-on approach is only long, hard and stupid in the short time. With each passing year, the codebase will retain a degree of readability and accessibility that I would’ve sacrificed had I depended on automated build processes.

Robin Berjon puts this into the historical perspective of Taylorism and Luddism:

Whenever something is automated, you lose some control over it. Sometimes that loss of control improves your life because exerting control is work, and sometimes it worsens your life because it reduces your autonomy.

Or as Marshall McLuhan put it:

Every extension is also an amputation.

…which is fine as long as the benefits of the extension outweigh the costs of the amputation. My worry is that, when it comes to evaluating technology for building on the web, we aren’t considering the longer-term costs.

Maintenance matters. With the passing of time, maintenance matters more and more.

Maybe we avoid thinking about the long-term costs because it would lead to decision paralysis. That’s understandable. But I take comfort from some words of wisdom on the web from the 1990s. Tim Berners-Lee’s style guide for hypertext:

Because hypertext is potentially unconstrained you are a little daunted. Do not be. You can write a document as simply as you like. In many ways, the simpler the better.

Crawlers

A few months back, I wrote about how Google is breaking its social contract with the web, harvesting our content not in order to send search traffic to relevant results, but to feed a large language model that will spew auto-completed sentences instead.

I still think Chris put it best:

I just think it’s fuckin’ rude.

When it comes to the crawlers that are ingesting our words to feed large language models, Neil Clarke describes the situtation:

It should be strictly opt-in. No one should be required to provide their work for free to any person or organization. The online community is under no responsibility to help them create their products. Some will declare that I am “Anti-AI” for saying such things, but that would be a misrepresentation. I am not declaring that these systems should be torn down, simply that their developers aren’t entitled to our work. They can still build those systems with purchased or donated data.

Alas, the current situation is opt-out. The onus is on us to update our robots.txt file.

Neil handily provides the current list to add to your file. Pass it on:

User-agent: CCBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Omgilibot
Disallow: /

User-agent: FacebookBot
Disallow: /

In theory you should be able to group those user agents together, but citation needed on whether that’s honoured everywhere:

User-agent: CCBot
User-agent: ChatGPT-User
User-agent: GPTBot
User-agent: Google-Extended
User-agent: Omgilibot
User-agent: FacebookBot
Disallow: /

There’s a bigger issue with robots.txt though. It too is a social contract. And as we’ve seen, when it comes to large language models, social contracts are being ripped up by the companies looking to feed their beasts.

As Jim says:

I realized why I hadn’t yet added any rules to my robots.txt: I have zero faith in it.

That realisation was prompted in part by Manuel Moreale’s experiment with blocking crawlers:

So, what’s the takeaway here? I guess that the vast majority of crawlers don’t give a shit about your robots.txt.

Time to up the ante. Neil’s post offers an option if you’re running Apache. Either in .htaccess or in a .conf file, you can block user agents using mod_rewrite:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (CCBot|ChatGPT|GPTBot|Omgilibot| FacebookBot) [NC]
RewriteRule ^ – [F]

You’ll see that Google-Extended isn’t that list. It isn’t a crawler. Rather it’s the permissions model that Google have implemented for using your site’s content to train large language models: unless you opt out via robots.txt, it’s assumed that you’re totally fine with your content being used to feed their stochastic parrots.

Trabaja remoto

August was a month of travels. You can press play on that month’s map to follow the journey.

But check out the map for September too because the travels continue. This time my adventures are confined to Europe.

I’m in Spain. Jessica and I flew into Madrid on Saturday. The next day we took a train ride across the Extremaduran landscape to Cáceres, our home for the week.

This is like the sequel to our Sicilian trip. We’re both working remotely. We just happen to do be doing it from a beautiful old town with amazing cuisine.

We’re in a nice apartment that—crucially—has good WiFi. It’s right on the main square, but it’s remarkably quiet.

There’s a time difference of one hour with Brighton. Fortunately everything in Spain happens at least an hour later than it does at home. Waking up. Lunch. Dinner. Everything is time-shifted so that I’m on the same schedule as my colleagues.

I swear I’m more productive working this way. Maybe it’s the knowledge that tapas of Iberican ham await me after work, but I’m getting a lot done this week.

And when the working week is done, the fun begins. Cáceres is hosting its annual Irish fleadh this weekend.

I’ve always wanted to go to it, but it’s quite a hassle to get here just for a weekend. Combining it with a week of remote work makes it more doable.

I’m already having a really nice time here and the tunes haven’t even started yet.

Simon’s rule

I got a nice email from someone regarding my recent posts about performance on The Session. They said:

I hope this message finds you well. First and foremost, I want to express how impressed I am with the overall performance of https://thesession.org/. It’s a fantastic resource for music enthusiasts like me.

How nice! I responded, thanking them for the kind words.

They sent a follow-up clarification:

Awesome, anyway there was an issue in my message.

The line ‘It’s a fantastic resource for music enthusiasts like me.’ added by chatGPT and I didn’t notice.

I imagine this is what it feels like when you’re on a phone call with someone and towards the end of the call you hear a distinct flushing sound.

I wrote back and told them about Simon’s rule:

I will not publish anything that takes someone else longer to read than it took me to write.

That just feels so rude!

I think that’s a good rule.

Performative performance

Web Summer Camp in Croatia finished with an interesting discussion. It was labelled a town-hall meeting, but it was more like an Oxford debating club.

Two speakers had two minutes each to speak for or against a particular statement. Their stances were assigned to them so they didn’t necessarily believe what they said.

One of the propositions was something like:

In the future, sustainable design will be as important as UX or performance.

That’s a tough one to argue against! But that’s what Sophia had to do.

She actually made a fairly compelling argument. She said that real impact isn’t going to come from individual websites changing their colour schemes. Real impact is going to come from making server farms run on renewable energy. She advocated for political action to change the system rather than having the responsibility heaped on the shoulders of the individuals making websites.

It’s a fair point. Much like the concept of a personal carbon footprint started life at BP to distract from corporate responsibility, perhaps we’re going to end up navel-gazing into our individual websites when we should be collectively lobbying for real change.

It’s akin to clicktivism—thinking you’re taking action by sharing something on social media, when real action requires hassling your political representative.

I’ve definitely seen some examples of performative sustainability on websites.

For example, at the start of this particular debate at Web Summer Camp we were shown a screenshot of a municipal website that has a toggle. The toggle supposedly enables a low-carbon mode. High resolution images are removed and for some reason the colour scheme goes grayscale. But even if those measures genuinely reduced energy consumption, it’s a bit late to enact them only after the toggle has been activated. Those hi-res images have already been downloaded by then.

Defaults matter. To be truly effective, the toggle needs to work the other way. Start in low-carbon mode, and only download the hi-res images when someone specifically requests them. (Hopefully browsers will implement prefers-reduced-data soon so that we can have our sustainable cake and eat it.)

Likewise I’ve seen statistics bandied about around the energy-savings that could be made if we used dark colour schemes. I’m sure the statistics are correct, but I’d like to see them presented side-by-side with, say, the energy impact of Google Tag Manager or React or any other wasteful dependencies that impact performance invisibly.

That’s the crux. Most of the important work around energy usage on websites is invisible. It’s the work done to not add more images, more JavaScript or more web fonts.

And it’s not just performance. I feel like the more important the work, the more likely it is to be invisible: privacy, security, accessibility …those matter enormously but you can’t see when a website is secure, or accessible, or not tracking you.

I suspect this is why those areas are all frustratingly under-resourced. Why pour time and effort into something you can’t point at?

Now that I think about it, this could explain the rise of web accessibility overlays. If you do the real work of actually making a website accessible, your work will be invisible. But if you slap an overlay on your website, it looks like you’re making a statement about how much you care about accessibility (even though the overlay is total shit and does more harm than good).

I suspect there might be a similar mindset at work when it comes to interface toggles for low-carbon mode. It might make you feel good. It might make you look good. But it’s a poor substitute for making your website carbon-neutral by default.

Travels

He drew a deep breath. ‘Well, I’m back,’ he said.

I know how you feel, Samwise Gamgee.

I have returned from my travels—a week aboard the Queen Mary 2 crossing the Atlantic, followed by a weekend in New York, finishing with a week in Saint Augustine, Florida.

The Atlantic crossing was just as much fun as last time. In fact it was better because this time Jessica and I got to share the experience with our dear friends Dan and Sue.

There was dressing up! There was precarious ballet! There were waves! There were even some dolphins!

The truth is that this kind of Atlantic crossing is a bit like cosplaying a former age of travel. You get out of it what you put it into it. If you’re into LARPing as an Edwardian-era traveller, you’re going to have a good time.

We got very into it. Dressing up for dinner. Putting on a tux for the gala night. Donning masks for the masquerade evening.

Me and Jessica all dressed up wearing eye masks.

Dan and Sue in wild outfits wearing eye masks.

It’s actually quite a practical way of travelling if you don’t mind being cut off from all digital communication for a week (this is a feature, not a bug). You adjust your clock by one hour most nights so that by the time you show up in New York, you’re on the right timezone with zero jetlag.

That was just as well because we had a packed weekend of activities in New York. By pure coincidence, two separate groups of friends were also in town from far away. We all met up and had a grand old time. Brunch in Tribeca; a John Cale concert in Prospect Park; the farmer’s market in Union Square; walking the high line …good times with good friends.

New York was hot, but not as hot as what followed in Florida. A week lazing about on Saint Augustine beach. I ate shrimp every single day. I regret nothing.

We timed our exit just right. We flew out of Florida before the tropical storm hit. Then we landed in Gatwick right before the air-traffic control chaos erupted.

I had one day of rest before going back to work.

Well, I say “work”, but the first item in my calendar was speaking at Web Summer Camp in Croatia. Back to the airport.

The talk went well, and I got to attend a performance workshop by Harry. But best of all was the location. Opatija is an idyllic paradise. Imagine crossing a web conference with White Lotus, but in a good way. It felt like a continuation of Florida, but with more placid clear waters.

But now I’m really back. And fortunately the English weather is playing along by being unseasonably warm . It’s as if the warm temperatures are following me around. I like it.

Automation

I just described prototype code as code to be thrown away. On that topic…

I’ve been observing how people are programming with large language models and I’ve seen a few trends.

The first thing that just about everyone agrees on is that the code produced by a generative tool is not fit for public consumption. At least not straight away. It definitely needs to be checked and tested. If you enjoy debugging and doing code reviews, this might be right up your street.

The other option is to not use these tools for production code at all. Instead use them for throwaway code. That could be prototyping. But it could also be the code for those annoying admin tasks that you don’t do very often.

Take content migration. Say you need to grab a data dump, do some operations on the data to transform it in some way, and then pipe the results into a new content management system.

That’s almost certainly something you’d want to automate with bespoke code. Once the content migration is done, the code can be thrown away.

Read Matt’s account of coding up his Braggoscope. The code needed to spider a thousand web pages, extract data from those pages, find similarities, and output the newly-structured data in a different format.

I’ve noticed that these are just the kind of tasks that large language models are pretty good at. In effect you’re training the tool on your own very specific data and getting it to do your drudge work for you.

To me, it feels right that the usefulness happens on your own machine. You don’t put the machine-generated code in front of other humans.

Permission

Back when the web was young, it wasn’t yet clear what the rules were. Like, could you really just link to something without asking permission?

Then came some legal rulings to establish that, yes, on the web you can just link to anything without checking if it’s okay first.

What about search engines and directories? Technically they’re rifling through all the stuff we publish and reposting snippets of it. Is that okay?

Again, through some legal precedents—but mostly common agreement—everyone decided that on balance it was fine. After all, those snippets they publish are helping your site get traffic.

In short order, search came to rule the web. And Google came to rule search.

The mutually beneficial arrangement persisted uneasily. Despite Google’s search results pages getting worse and worse in recent years, the company’s huge market share of search means you generally want to be in their good books.

Google’s business model relies on us publishing web pages so that they can put ads around the search results linking to that content, and we rely on Google to send people to our websites by responding smartly to search queries.

That has now changed. Instead of responding to search queries by linking to the web pages we’ve made, Google is instead generating dodgy summaries rife with hallucina… lies (a psychic hotline, basically).

Google still benefits from us publishing web pages. We no longer benefit from Google slurping up those web pages.

With AI, tech has broken the web’s social contract:

Google has steadily been manoeuvring their search engine results to more and more replace the pages in the results.

As Chris puts it:

Me, I just think it’s fuckin’ rude.

Google is a portal to the web. Google is an amazing tool for finding relevant websites to go to. That was useful when it was made, and it’s nothing but grown in usefulness. Google should be encouraging and fighting for the open web. But now they’re like, actually we’re just going to suck up your website, put it in a blender with all other websites, and spit out word smoothies for people instead of sending them to your website. Instead.

Ben proposes an update to robots.txt that would allow us to specify licensing information:

Robots.txt needs an update for the 2020s. Instead of just saying what content can be indexed, it should also grant rights.

Like crawl my site only to provide search results not train your LLM.

It’s a solid proposal. But Google has absolutely no incentive to implement it. They hold all the power.

Or do they?

There is still the nuclear option in robots.txt:

User-agent: Googlebot
Disallow: /

That’s what Vasilis is doing:

I have been looking for ways to not allow companies to use my stuff without asking, and so far I coulnd’t find any. But since this policy change I realised that there is a simple one: block google’s bots from visiting your website.

The general consensus is that this is nuts. “If you don’t appear in Google’s results, you might as well not be on the web!” is the common cry.

I’m not so sure. At least when it comes to personal websites, search isn’t how people get to your site. They get to your site from RSS, newsletters, links shared on social media or on Slack.

And isn’t it an uncomfortable feeling to think that there’s a third party service that you absolutely must appease? It’s the same kind of justification used by people who are still on Twitter even though it’s now a right-wing transphobic cesspit. “If I’m not on Twitter, I might as well not be on the web!”

The situation with Google reminds me of what Robin said about Twitter:

The speed with which Twitter recedes in your mind will shock you. Like a demon from a folktale, the kind that only gains power when you invite it into your home, the platform melts like mist when that invitation is rescinded.

We can rescind our invitation to Google.

Talking about “web3” and “AI”

When I was hosting the DIBI conference in Edinburgh back in May, I moderated an impromptu panel on AI:

On the whole, it stayed quite grounded and mercifully free of hyperbole. Both speakers were treating the current crop of technologies as tools. Everyone agreed we were on the hype cycle, probably the peak of inflated expectations, looking forward to reaching the plateau of productivity.

Something else that happened at that event was that I met Deborah Dawton from the Design Business Association. She must’ve liked the cut of my jib because she invited me to come and speak at their get-together in Brighton on the topic of “AI, Web3 and design.”

The representative from the DBA who contacted me knew what they were letting themselves in for. They wrote:

I’ve read a few of your posts on the subject and it would be great if you could join us to share your perspectives.

How could I say no?

I’ve published a transcript of the short talk I gave.

Sunday

Today was a good day. The weather was beautiful.

Jessica and I did a little bit of work in the garden—nothing too sweaty. Then Jessica cut my hair. It looks good. And it feels good to have my neck freed up.

We went for a Sunday roast at the nearest pub, which does a most excellent carvery. It was tasty and plentiful so after strolling home, I wanted to do nothing more than sit around.

I sat outside in the back garden under the dappled shade offered by the overhanging trees. I had a good book. I had my mandolin to hand. I’d reach for it occassionally to play a tune or two.

Coco the cat—not our cat—sat nearby, stretching her paws out lazily in the warm muggy air.

It was a good day.

Older »