Journal tags: privacy

18

Ad tech

Back when South by Southwest wasn’t terrible, there used to be an annual panel called Browser Wars populated with representatives from the main browser vendors (except for Apple, obviously, who would never venture onto a stage outside of their own events).

I remember getting into a heated debate with the panelists during the 2010 edition. I was mad about web fonts.

Just to set the scene, web fonts didn’t exist back in 2010. That’s what I was mad about.

There was no technical reason why we couldn’t have web fonts. The reason why we didn’t get web fonts for years and years was because browser makers were concerned about piracy and type foundries.

That’s nice and all, but as I said during that panel, I don’t recall any such concerns being raised for photographers when the img element was shipped. Neither was the original text-only web held back by the legimate fear by writers of plagiarism.

My point was not that these concerns weren’t important, but that it wasn’t the job of web browsers to shore up existing business models. To use standards-speak, these concerns are orthogonal.

I’m reminded of this when I see browser makers shoring up the business of behavioural advertising.

I subscribe to the RSS feed of updates to Chrome. Not all of it is necessarily interesting to me, but all of it is supposedly aimed at developers. And yet, in amongst the posts about APIs and features, there’ll be something about the Orwellianly-titled “privacy sandbox”.

This is only of interest to one specific industry: behavioural online advertising driven by surveillance and tracking. I don’t see any similar efforts being made for teachers, cooks, architects, doctors or lawyers.

It’s a ludicrous situation that I put down to the fact that Google, the company that makes Chrome, is also the company that makes its money from targeted advertising.

But then Mozilla started with the same shit.

Now, it’s one thing to roll out a new so-called “feature” to benefit behavioural advertising. It’s quite another to make it enabled by default. That’s a piece of deceptive design that has no place in Firefox. Defaults matter. Browser makers know this. It’s no accident that this “feature” was enabled by default.

This disgusts me.

It disgusts me all the more that it’s all for nothing. Notice that I’ve repeatly referred to behavioural advertising. That’s the kind that relies on tracking and surveillance to work.

There is another kind of advertising. Contextual advertising is when you show an advertisement related to the content of the page the user is currently on. The advertiser doesn’t need to know anything about the user, just the topic of the page.

Conventional wisdom has it that behavioural advertising is much more effective than contextual advertising. After all, why would there be such a huge industry built on tracking and surveillance if it didn’t work? See, for example, this footnote by John Gruber:

So if contextual ads generate, say, one-tenth the revenue of targeted ads, Meta could show 10 times as many ads to users who opt out of targeting. I don’t think 10× is an outlandish multiplier there — given how remarkably profitable Meta’s advertising business is, it might even need to be higher than that.

Seems obvious, right?

But the idea that behavioural advertising works better than contextual advertising has no basis in reality.

If you think you know otherwise, Jon Bradshaw would like to hear from you:

Bradshaw challenges industry to provide proof that data-driven targeting actually makes advertising more effective – or in fact makes it worse. He’s spoiling for a debate – and has three deep, recent studies that show: broad reach beats targeting for incremental growth; that the cost of targeting outweighs the return; and that second and third party data does not outperform a random sample. First party data does beat the random sample – but contextual ads massively outperform even first party data. And they are much, much cheaper. Now, says Bradshaw, let’s see some counter-evidence from those making a killing.

If targeted advertising is going to get preferential treatment from browser makers, I too would like to see some evidence that it actually works.

Further reading:

Responsibility

My colleague Chris has written a terrific post over on the Clearleft blog: Is the planet the missing member of your project team?

Rather than hand-wringing and finger-wagging, it gets down to some practical steps that you—we—can take on every project.

Chris finishes by asking:

Let me know how you design with the environment in mind. What practical advice would you suggest?

Well, here’s something that I keep coming up against…

Chris shows that the environment can be part of project management, specifically the RACI methodology:

We list who is responsible, accountable, consulted, and informed within the project. It’s a simple exercise but the clarity is useful for identifying what expertise and input we should seek from the named individuals.

Having the planet be a proactive partner in your project ensures its needs are considered.

Whenever responsibilities are being assigned there are some things that inevitably fall through the cracks. One I’ve seen over and over again is responsibility for third-party scripts.

On the face of it this seems like another responsibility for developers. We’re talking about code here, right?

But in my experience it is never the developers adding “beacons” and other third-party embedded scripts.

Chris rightly points out:

Development decisions, visual design choices, content approach, and product strategy all contribute to the environmental impact of your website.

But what about sales and marketing? Often they’re the ones who’ll drop in a third-party script to track user journeys. That’s understandable. That’s kind of their job.

Dropping in one line of JavaScript seems like a victimless crime. It’s just one small script, right? But JavaScript can import more JavaScript. Tools like Request Map Generator can show just how much destruction third-party JavaScript can wreak:

You pop in a URL, it fetches the page and maps out all the subsequent requests in a nifty interactive diagram of circles, showing how many requests third-party scripts are themselves generating. I’ve found it to be a very effective way of showing the impact of third-party scripts to people who aren’t interested in looking at waterfall diagrams.

Just to be clear, the people adding third-party scripts to websites usually aren’t doing so maliciously. They often don’t realise the negative effect the scripts will have on performance and the environment.

As is so often the case, this isn’t a technical problem. At root it’s about understanding people’s needs (like “I need a way to see what pages are converting!”) and finding a way to meet those needs without negatively impacting the planet. A good open-minded discussion can go a long way.

So I echo Chris’s call to think about environmental impacts from the very start of a project. Establish early on who will have the ability to add third-party scripts to the site. Do all of those people understand the responsibility that gives them?

I saw this lack of foresight in action on a project recently. The front-end development was going really well and the site was going to be exceptionally performant: green Lighthouse scores across the board. But when the site went live it had tracking scripts. That meant that users needed to consent to being tracked. That meant adding another third-party script to generate a consent banner. It completely tanked the Lighthouse scores.

I’m sure the people who added the tracking scripts and consent banners thought they had no choice. But there are alternatives. There are ways to get the data you need without the intrusive surveillance and performance-wrecking JavaScript.

The problem is that it’s not the norm. “Everyone else is doing it” was the justification for Flash intros two decades ago and it’s the justification for enshittification via third-party scripts now.

It doesn’t have to be this way.

Opportunity

You can split the web in many ways. Lionel Dricot wrote about one of those ways in a blog post called Splitting the Web. In it, he outlines an ever-increasing divide he sees on the web.

On the one hand you’ve got people experiencing the advertising-driven, tracking-addicted big players who provide a bloated and buggy user experience.

On the hand, you’ve got the more tech-savvy users with tracking blockers (misleadingly called ad blockers) using browsers and search engines that value privacy and performance.

It feels like everyone is now choosing its side. You can’t stay in the middle anymore. You are either dedicating all your CPU cycles to run JavaScript tracking you or walking away from the big monopolies. You are either being paid to build huge advertising billboards on top of yet another framework or you are handcrafting HTML.

Maybe the web is not dying. Maybe the web is only splitting itself in two.

This reminded me of a post by Chris. No, not The Great Divide, although that’s obviously relevant here. Chris wrote a post just yesterday called Other People’s Busted Software is an Opportunity:

One way to look at it is opportunity. If you make software that does work reliably, you’ve got a leg up. Even if your customers don’t tell you “I like your software because it always works”, they’ll feel it and make choices around knowing it.

I like that optimistic take. If the majority seems to be doubling down on more tracking, more JavaScript, and more enshittification, then there’s a potential opportunity there (acknowledging that you’ve still got to battle against inertia and sunk cost).

This reminds of a fantastic talk that Stuart gave a few years ago called Privacy could be the next big thing:

How do you end up shaping the world? By inventing a thing that the current incumbents can’t compete against. By making privacy your core goal. Because companies who have built their whole business model on monetising your personal information cannot compete against that. They’d have to give up on everything that they are, which they can’t do. Facebook altering itself to ensure privacy for its users… wouldn’t exist. Can’t exist. That’s how you win.

Tracking

I’ve been reading the excellent Design For Safety by Eva PenzeyMoog. There was a line that really stood out to me:

The idea that it’s alright to do whatever unethical thing is currently the industry norm is widespread in tech, and dangerous.

It stood out to me because I had been thinking about certain practices that are widespread, accepted, and yet strike me as deeply problematic. These practices involve tracking users.

The first problem is that even the terminology I’m using would be rejected. When you track users on your website, it’s called analytics. Or maybe it’s stats. If you track users on a large enough scale, I guess you get to just call it data.

Those words—“analytics”, “stats”, and “data”—are often used when the more accurate word would be “tracking.”

Or to put it another way; analytics, stats, data, numbers …these are all outputs. But what produced these outputs? Tracking.

Here’s a concrete example: email newsletters.

Do you have numbers on how many people opened a particular newsletter? Do you have numbers on how many people clicked a particular link?

You can call it data, or stats, or analytics, but make no mistake, that’s tracking.

Follow-on question: do you honestly think that everyone who opens a newsletter or clicks on a link in a newsletter has given their informed constent to be tracked by you?

You may well answer that this is a widespread—nay, universal—practice. Well yes, but a) that’s not what I asked, and b) see the above quote from Design For Safety.

You could quite correctly point out that this tracking is out of your hands. Your newsletter provider—probably Mailchimp—does this by default. So if the tracking is happening anyway, why not take a look at those numbers?

But that’s like saying it’s okay to eat battery-farmed chicken as long as you’re not breeding the chickens yourself.

When I try to argue against this kind of tracking from an ethical standpoint, I get a frosty reception. I might have better luck battling numbers with numbers. Increasing numbers of users are taking steps to prevent tracking. I had a plug-in installed in my mail client—Apple Mail—to prevent tracking. Now I don’t even need the plug-in. Apple have built it into the app. That should tell you something. It reminds me of when browsers had to introduce pop-up blocking.

If the outputs generated by tracking turn out to be inaccurate, then shouldn’t they lose their status?

But that line of reasoning shouldn’t even by necessary. We shouldn’t stop tracking users because it’s inaccurate. We should stop stop tracking users because it’s wrong.

Writing on web.dev

Chrome Dev Summit kicked off yesterday. The opening keynote had its usual share of announcements.

There was quite a bit of talk about privacy, which sounds good in theory, but then we were told that Google would be partnering with “industry stakeholders.” That’s probably code for the kind of ad-tech sharks that have been making a concerted effort to infest W3C groups. Beware.

But once Una was on-screen, the topics shifted to the kind of design and development updates that don’t have sinister overtones.

My favourite moment was when Una said:

We’re also partnering with Jeremy Keith of Clearleft to launch Learn Responsive Design on web.dev. This is a free online course with everything you need to know about designing for the new responsive web of today.

This is what’s been keeping me busy for the past few months (and for the next month or so too). I’ve been writing fifteen pieces—or “modules”—on modern responsive web design. One third of them are available now at web.dev/learn/design:

  1. Introduction
  2. Media queries
  3. Internationalization
  4. Macro layouts
  5. Micro layouts

The rest are on their way: typography, responsive images, theming, UI patterns, and more.

I’ve been enjoying this process. It’s hard work that requires me to dive deep into the nitty-gritty details of lots of different techniques and technologies, but that can be quite rewarding. As is often said, if you truly want to understand something, teach it.

Oh, and I made one more appearance at the Chrome Dev Summit. During the “Ask Me Anything” section, quizmaster Una asked the panelists a question from me:

Given the court proceedings against AMP, why should anyone trust FLOC or any other Google initiatives ostensibly focused on privacy?

(Thanks to Jake for helping craft the question into a form that could make it past the legal department but still retain its spiciness.)

The question got a response. I wouldn’t say it got an answer. My verdict remains:

I’m not sure that Google Chrome can be considered a user agent.

The fundamental issue is that you’ve got a single company that’s the market leader in web search, the market leader in web advertising, and the market leader in web browsers. I honestly believe all three would function better—and more honestly—if they were separate entities.

Monopolies aren’t just damaging for customers. They’re damaging for the monopoly too. I’d love to see Google Chrome compete on being a great web browser without having to also balance the needs of surveillance-based advertising.

The cage

I subscribe to Peter Gasston’s newsletter, The Tech Landscape. It’s good. Peter’s a smart guy with his finger on the pulse of many technologies that are beyond my ken. I recommend subscribing.

But I was very taken aback by what he wrote in issue 202. It was to do with algorithmic recommendation engines.

This week I want to take a little dump on a tweet I read. I’m not going to link to it (I’m not that person), but it basically said something like: “I’m afraid to Google something because I don’t want the algorithm to think I like it, and I’m afraid to click a link because I don’t want the algorithm to show me more like it… what a cage.”

I saw the same tweet. It resonated with me. I had responded with a link to a post I wrote a while back called Get safe. That post made two points:

  1. GET requests shouldn’t have side effects. Adding to a dossier on someone’s browsing habits definitely counts as a side effect.
  2. It is literally a fundamental principle of the web platform that it should be safe to visit a web page.

But Peter describes ubiquitous surveillance as a feature, not a bug:

It’s observing what someone likes or does, then trying to make recommendations for more things like it—whether that’s books, TV shows, clothes, advertising, or whatever. It works on probability, so it’s going to make better guesses the more it knows you; if you like ten things of type A, then liking one thing of type B shouldn’t be enough to completely change its recommendations. The problem is, we don’t like “the algorithm” if it doesn’t work, and we don’t like it if works too well (“creepy!”). But it’s not sinister, and it’s not a cage.

He would be correct if the balance of power were tipped towards the person actively looking for recommendations. As I said in my earlier post:

Don’t get me wrong: building a profile of someone based on their actions isn’t inherently wrong. If a user taps on “like” or “favourite” or “bookmark”, they are actively telling the server to perform an update (and so those actions should be POST requests). But do you see the difference in where the power lies?

When Peter says “it’s not sinister, and it’s not a cage” that may be true for him, but that is not a shared feeling, as the original tweet demonstrates. I don’t think it’s fair to dismiss someone else’s psychological pain because you don’t think they “get it”. I’m pretty sure everyone “gets” how recommendation engines are supposed to work. That’s not the issue. Trying to provide relevant content isn’t the problem. It’s the unbelievably heavy-handed methods that make it feel like a cage.

Peter uses the metaphor of a record shop:

“The algorithm” is the best way to navigate a world of infinite choice; imagine you went to a record shop (remember them?) which had every recording ever released; how would you find new music? You’d either buy music by bands you know you already liked, or you’d take a pure gamble on something—which most of the time would be a miss. So you’d ask a store worker, and they’d recommend the music they liked—but that’s no guarantee you’d like it. A good worker would ask what type of music you like, and recommend music based on that—you might not like all the recommendations, but there’s more of a chance you’d like some. That’s just what “the algorithm” does.

But that’s not true. You don’t ask “the algorithm” for a recommendation—it foists them on you whether you want them or not. A more apt metaphor would be that you walked by a record shop once and the store worker came out and followed you down the street, into your home, and watched your every move for the rest of your life.

What Peter describes sounds great—a helpful knowledgable software agent that you ask for recommendations. But that’s not what “the algorithm” is. And that’s why it feels like a cage. That’s why it is a cage.

The original tweet was an open, honest, and vulnerable insight into what online recommendation engines feel like. That’s a valuable insight that should be taken on board, not dismissed.

And what a lack of imagination to look at an existing broken system—that doesn’t even provide good recommendations while making people afraid to click on links—and shrug and say that this is the best we can do. If this really is “is the best way to navigate a world of infinite choice” then it’s no wonder that people feel like they need to go on a digital detox and get away from their devices in order to feel normal. It’s like saying that decapitation is the best way of solving headaches.

Imagine living in a surveillance state like East Germany, and saying “Well, how else is the government supposed to make informed decisions without constantly monitoring its citizens?” I think it’s more likely that you’d feel like you’re in a cage.

Apples to oranges? Kind of. But whether it’s surveillance communism or surveillance capitalism, there’s a shared methodology at work. They’re both systems that disempower people for the supposedly greater good of amassing data. Both are built on the false premise that problems can be solved by getting more and more data. If that results in collateral damage to people’s privacy and mental health, well …it’s all for the greater good, right?

It’s fucking bullshit. I don’t want to live in that cage and I don’t want anyone else to have to live in it either. I’m going to do everything I can to tear it down.

Get the FLoC out

I’ve always liked the way that web browsers are called “user agents” in the world of web standards. It’s such a succinct summation of what browsers are for, or more accurately who browsers are for. Users.

The term makes sense when you consider that the internet is for end users. That’s not to be taken for granted. This assertion is now enshrined in the Internet Engineering Task Force’s RFC 8890—like Magna Carta for the network age. It’s also a great example of prioritisation in a design principle:

When there is a conflict between the interests of end users of the Internet and other parties, IETF decisions should favor end users.

So when a web browser—ostensibly an agent for the user—prioritises user-hostile third parties, we get upset.

Google Chrome—ostensibly an agent for the user—is running an origin trial for Federated Learning of Cohorts (FLoC). This is not a technology that serves the end user. It is a technology that serves third parties who want to target end users. The most common use case is behavioural advertising, but targetting could be applied for more nefarious purposes.

The Electronic Frontier Foundation wrote an explainer last month: Google Is Testing Its Controversial New Ad Targeting Tech in Millions of Browsers. Here’s What We Know.

Let’s back up a minute and look at why this is happening. End users are routinely targeted today (for behavioural advertising and other use cases) through third-party cookies. Some user agents like Apple’s Safari and Mozilla’s Firefox are stamping down on this, disabling third party cookies by default.

Seeing which way the wind is blowing, Google’s Chrome browser will also disable third-party cookies at some time in the future (they’re waiting to shut that barn door until the fire is good’n’raging). But Google isn’t just in the browser business. Google is also in the ad tech business. So they still want to advertisers to be able to target end users.

Yes, this is quite the cognitive dissonance: one part of the business is building a user agent while a different part of the company is working on ways of tracking end users. It’s almost as if one company shouldn’t simultaneously be the market leader in three separate industries: search, advertising, and web browsing. (Seriously though, I honestly think Google’s search engine would get better if it were split off from the parent company, and I think that Google’s web browser would also get better if it were a separate enterprise.)

Anyway, one possible way of tracking users without technically tracking individual users is to assign them to buckets, or cohorts of interest based on their browsing habits. Does that make you feel safer? Me neither.

That’s what Google is testing with the origin trial of FLoC.

If you, as an end user, don’t wish to be experimented on like this, there are a few things you can do:

  • Don’t use Chrome. No other web browser is participating in this experiment. I recommend Firefox.
  • If you want to continue to use Chrome, install the Duck Duck Go Chrome extension.
  • Alternatively, if you manually disable third-party cookies, your Chrome browser won’t be included in the experiment.
  • Or you could move to Europe. The origin trial won’t be enabled for users in the European Union, which is coincidentally where GDPR applies.

That last decision is interesting. On the one hand, the origin trial is supposed to be on a small scale, hence the lack of European countries. On the other hand, the origin trial is “opt out” instead of “opt in” so that they can gather a big enough data set. Weird.

The plan is that if and when FLoC launches, websites would have to opt in to it. And when I say “plan”, I meanbest guess.”

I, for one, am filled with confidence that Google would never pull a bait-and-switch with their technologies.

In the meantime, if you’re a website owner, you have to opt your website out of the origin trial. You can do this by sending a server header. A meta element won’t do the trick, I’m afraid.

I’ve done it for my sites, which are served using Apache. I’ve got this in my .conf file:

<IfModule mod_headers.c>
Header always set Permissions-Policy "interest-cohort=()"
</IfModule>

If you don’t have access to your server, tough luck. But if your site runs on Wordpress, there’s a proposal to opt out of FLoC by default.

Interestingly, none of the Chrome devs that I follow are saying anything about FLoC. They’re usually quite chatty about proposals for potential standards, but I suspect that this one might be embarrassing for them. It was a similar situation with AMP. In that case, Google abused its monopoly position in search to blackmail publishers into using Google’s format. Now Google’s monopoly in advertising is compromising the integrity of its browser. In both cases, it makes it hard for Chrome devs claiming to have the web’s best interests at heart.

But one of the advantages of having a huge share of the browser market is that Chrome can just plough ahead and unilaterily implement whatever it wants even if there’s no consensus from other browser makers. So that’s what Google is doing with FLoC. But their justification for doing this doesn’t really work unless other browsers play along.

Here’s Google’s logic:

  1. Third-party cookies are on their way out so advertisers will no longer be able to use that technology to target users.
  2. If we don’t provide an alternative, advertisers and other third parties will use fingerprinting, which we all agree is very bad.
  3. So let’s implement Federated Learning of Cohorts so that advertisers won’t use fingerprinting.

The problem is with step three. The theory is that if FLoC gives third parties what they need, then they won’t reach for fingerprinting. Even if there were any validity to that hypothesis, the only chance it has of working is if every browser joins in with FLoC. Otherwise ad tech companies are leaving money on the table. Can you seriously imagine third parties deciding that they just won’t target iPhone or iPad users any more? Remember that Safari is the only real browser on iOS so unless FLoC is implemented by Apple, third parties can’t reach those people …unless those third parties use fingerprinting instead.

Google have set up a situation where it looks like FLoC is going head-to-head with fingerprinting. But if FLoC becomes a reality, it won’t be instead of fingerprinting, it will be in addition to fingerprinting.

Google is quite right to point out that fingerprinting is A Very Bad Thing. But their concerns about fingerprinting sound very hollow when you see that Chrome is pushing ahead and implementing a raft of browser APIs that other browser makers quite rightly point out enable more fingerprinting: Battery Status, Proximity Sensor, Ambient Light Sensor and so on.

When it comes to those APIs, the message from Google is that fingerprinting is a solveable problem.

But when it comes to third party tracking, the message from Google is that fingerprinting is inevitable and so we must provide an alternative.

Which one is it?

Google’s flimsy logic for why FLoC is supposedly good for end users just doesn’t hold up. If they were honest and said that it’s to maintain the status quo of the ad tech industry, it would make much more sense.

The flaw in Google’s reasoning is the fundamental idea that tracking is necessary for advertising. That’s simply not true. Sacrificing user privacy is fundamental to behavioural advertising …but behavioural advertising is not the only kind of advertising. It isn’t even a very good kind of advertising.

Marko Saric sums it up:

FLoC seems to be Google’s way of saving a dying business. They are trying to keep targeted ads going by making them more “privacy-friendly” and “anonymous”. But behavioral profiling and targeted advertisement is not compatible with a privacy-respecting web.

What’s striking is that the very monopolies that make Google and Facebook the leaders in behavioural advertising would also make them the leaders in contextual advertising. Almost everyone uses Google’s search engine. Almost everyone uses Facebook’s social network. An advertising model based on what you’re currently looking at would keep Google and Facebook in their dominant positions.

Google made their first many billions exclusively on contextual advertising. Google now prefers to push the message that behavioral advertising based on personal data collection is superior but there is simply no trustworthy evidence to that.

I sincerely hope that Chrome will align with Safari, Firefox, Vivaldi, Brave, Edge and every other web browser. Everyone already agrees that fingerprinting is the real enemy. Imagine the combined brainpower that could be brought to bear on that problem if all browsers made user privacy a priority.

Until that day, I’m not sure that Google Chrome can be considered a user agent.

Prediction

Arthur C. Clarke once said:

Trying to predict the future is a discouraging and hazardous occupation becaue the profit invariably falls into two stools. If his predictions sounded at all reasonable, you can be quite sure that in 20 or most 50 years, the progress of science and technology has made him seem ridiculously conservative. On the other hand, if by some miracle a prophet could describe the future exactly as it was going to take place, his predictions would sound so absurd, so far-fetched, that everybody would laugh him to scorn.

But I couldn’t resist responding to a recent request for augery. Eric asked An Event Apart speakers for their predictions for the coming year. The responses have been gathered together and published, although it’s in the form of a PDF for some reason.

Here’s what I wrote:

This is probably more of a hope than a prediction, but 2021 could be the year that the ponzi scheme of online tracking and surveillance begins to crumble. People are beginning to realize that it’s far too intrusive, that it just doesn’t work most of the time, and that good ol’-fashioned contextual advertising would be better. Right now, it feels similar to the moment before the sub-prime mortgage bubble collapsed (a comparison made in Tim Hwang’s recent book, Subprime Attention Crisis). Back then people thought “Well, these big banks must know what they’re doing,” just as people have thought, “Well, Facebook and Google must know what they’re doing”…but that confidence is crumbling, exposing the shaky stack of cards that props up behavioral advertising. This doesn’t mean that online advertising is coming to an end—far from it. I think we might see a golden age of relevant, content-driven advertising. Laws like Europe’s GDPR will play a part. Apple’s recent changes to highlight privacy-violating apps will play a part. Most of all, I think that people will play a part. They will be increasingly aware that there’s nothing inevitable about tracking and surveillance and that the web works better when it respects people’s right to privacy. The sea change might not happen in 2021 but it feels like the water is beginning to swell.

Still, predicting the future is a mug’s game with as much scientific rigour as astrology, reading tea leaves, or haruspicy.

Much like behavioural advertising.

Get safe

The verbs of the web are GET and POST. In theory there’s also PUT, DELETE, and PATCH but in practice POST often does those jobs.

I’m always surprised when front-end developers don’t think about these verbs (or request methods, to use the technical term). Knowing when to use GET and when to use POST is crucial to having a solid foundation for whatever you’re building on the web.

Luckily it’s not hard to know when to use each one. If the user is requesting something, use GET. If the user is changing something, use POST.

That’s why links are GET requests by default. A link “gets” a resource and delivers it to the user.

<a href="/items/id">

Most forms use the POST method becuase they’re changing something—creating, editing, deleting, updating.

<form method="post" action="/items/id/edit">

But not all forms should use POST. A search form should use GET.

<form method="get" action="/search">
<input type="search" name="term">

When a user performs a search, they’re still requesting a resource (a page of search results). It’s just that they need to provide some specific details for the GET request. Those details get translated into a query string appended to the URL specified in the action attribute.

/search?term=value

I sometimes see the GET method used incorrectly:

  • “Log out” links that should be forms with a “log out” button—you can always style it to look like a link if you want.
  • “Unsubscribe” links in emails that immediately trigger the action of unsubscribing instead of going to a form where the POST method does the unsubscribing. I realise that this turns unsubscribing into a two-step process, which is a bit annoying from a usability point of view, but a destructive action should never be baked into a GET request.

When the it was first created, the World Wide Web was stateless by design. If you requested one web page, and then subsequently requested another web page, the server had no way of knowing that the same user was making both requests. After serving up a page in response to a GET request, the server promptly forgot all about it.

That’s how web browsing should still work. In fact, it’s one of the Web Platform Design Principles: It should be safe to visit a web page:

The Web is named for its hyperlinked structure. In order for the web to remain vibrant, users need to be able to expect that merely visiting any given link won’t have implications for the security of their computer, or for any essential aspects of their privacy.

The expectation of safe stateless browsing has been eroded over time. Every time you click on a search result in Google, or you tap on a recommended video in YouTube, or—heaven help us—you actually click on an advertisement, you just know that you’re adding to a dossier of your online profile. That’s not how the web is supposed to work.

Don’t get me wrong: building a profile of someone based on their actions isn’t inherently wrong. If a user taps on “like” or “favourite” or “bookmark”, they are actively telling the server to perform an update (and so those actions should be POST requests). But do you see the difference in where the power lies? With POST actions—fave, rate, save—the user is in charge. With GET requests, no one is supposed to be in charge—it’s meant to be a neutral transaction. Alas, the reality of today’s web is that many GET requests give more power to the dossier-building servers at the expense of the user’s agency.

The very first of the Web Platform Design Principles is Put user needs first :

If a trade-off needs to be made, always put user needs above all.

The current abuse of GET requests is damage that the web needs to route around.

Browsers are helping to a certain extent. Most browsers have the concept of private browsing, allowing you some level of statelessness, or at least time-limited statefulness. But it’s kind of messed up that private browsing is the exception, while surveillance is the default. It should be the other way around.

Firefox and Safari are taking steps to reduce tracking and fingerprinting. Rejecting third-party coookies by default is a good move. I’d love it if third-party JavaScript were also rejected by default:

In retrospect, it seems unbelievable that third-party JavaScript is even possible. I mean, putting arbitrary code—that can then inject even more arbitrary code—onto your website? That seems like a security nightmare!

I imagine if JavaScript were being specced today, it would almost certainly be restricted to the same origin by default.

Chrome has different priorities, which is understandable given that it comes from a company with a business model that is currently tied to tracking and surveillance (though it needn’t remain that way). With anti-trust proceedings rumbling in the background, there’s talk of breaking up Google to avoid monopolistic abuses of power. I honestly think it would be the best thing that could happen to Chrome if it were an independent browser that could fully focus on user needs without having to consider the surveillance needs of an advertising broker.

But we needn’t wait for the browsers to make the web a safer place for users.

Developers write the code that updates those dossiers. Developers add those oh-so-harmless-looking third-party scripts to page templates.

What if we refused?

Front-end developers in particular should be the last line of defence for users. The entire field of front-end devlopment is supposed to be predicated on the prioritisation of user needs.

And if the moral argument isn’t enough, perhaps the technical argument can get through. Tracking users based on their GET requests violates the very bedrock of the web’s architecture. Stop doing that.

Clean advertising

Imagine if you were told that fossil fuels were the only way of extracting energy. It would be an absurd claim. Not only are other energy sources available—solar, wind, geothermal, nuclear—fossil fuels aren’t even the most effecient source of energy. To say that you can’t have energy without burning fossil fuels would be pitifully incorrect.

And yet when it comes to online advertising, we seem to have meekly accepted that you can’t have effective advertising without invasive tracking. But nothing could be further from the truth. Invasive tracking is to online advertising as fossil fuels are to energy production—an outmoded inefficient means of getting substandard results.

Before the onslaught of third party cookies and scripts, online advertising was contextual. If I searched for property insurance, I was likely to see an advertisement for property insurance. If I was reading an article about pet food, I was likely to be served an advertisement for pet food.

Simply put, contextual advertising ensured that the advertising that accompanied content could be relevant and timely. There was no big mystery about it: advertisers just needed to know what the content was about and they could serve up the appropriate advertisement. Nice and straightforward.

Too straightforward.

What if, instead of matching the advertisement to the content, we could match the advertisement to the person? Regardless of what they were searching for or reading, they’d be served advertisements that were relevant to them not just in that moment, but relevant to their lifestyles, thoughts and beliefs? Of course that would require building up dossiers of information about each person so that their profiles could be targeted and constantly updated. That’s where cross-site tracking comes in, with third-party cookies and scripts.

This is behavioural advertising. It has all but elimated contextual advertising. It has become so pervasive that online advertising and behavioural advertising have become synonymous. Contextual advertising is seen as laughably primitive compared with the clairvoyant powers of behavioural advertising.

But there’s a problem with behavioural advertising. A big problem.

It doesn’t work.

First of all, it relies on mind-reading powers by the advertising brokers—Facebook, Google, and the other middlemen of ad tech. For all the apocryphal folk tales of spooky second-guessing in online advertising, it mostly remains rubbish.

Forget privacy: you’re terrible at targeting anyway:

None of this works. They are still trying to sell me car insurance for my subway ride.

Have you actually paid attention to what advertisements you’re served? Maciej did:

I saw a lot of ads for GEICO, a brand of car insurance that I already own.

I saw multiple ads for Red Lobster, a seafood restaurant chain in America. Red Lobster doesn’t have any branches in San Francisco, where I live.

Finally, I saw a ton of ads for Zipcar, which is a car sharing service. These really pissed me off, not because I have a problem with Zipcar, but because they showed me the algorithm wasn’t even trying. It’s one thing to get the targeting wrong, but the ad engine can’t even decide if I have a car or not! You just showed me five ads for car insurance.

And yet in the twisted logic of ad tech, all of this would be seen as evidence that they need to gather even more data with even more invasive tracking and surveillance.

It turns out that bizarre logic is at the very heart of behavioural advertising. I highly recommend reading the in-depth report from The Correspondent called The new dot com bubble is here: it’s called online advertising:

It’s about a market of a quarter of a trillion dollars governed by irrationality.

The benchmarks that advertising companies use – intended to measure the number of clicks, sales and downloads that occur after an ad is viewed – are fundamentally misleading. None of these benchmarks distinguish between the selection effect (clicks, purchases and downloads that are happening anyway) and the advertising effect (clicks, purchases and downloads that would not have happened without ads).

Suppose someone told you that they keep tigers out of their garden by turning on their kitchen light every evening. You might think their logic is flawed, but they’ve been turning on the kitchen light every evening for years and there hasn’t been a single tiger in the garden the whole time. That’s the logic used by ad tech companies to justify trackers.

Tracker-driven behavioural advertising is bad for users. The advertisements are irrelevant most of the time, and on the few occasions where the advertising hits the mark, it just feels creepy.

Tracker-driven behavioural advertising is bad for advertisers. They spend their hard-earned money on invasive ad tech that results in no more sales or brand recognition than if they had relied on good ol’ contextual advertising.

Tracker-driven behavioural advertising is very bad for the web. Megabytes of third-party JavaScript are injected at exactly the wrong moment to make for the worst possible performance. And if that doesn’t ruin the user experience enough, there are still invasive overlays and consent forms to click through (which, ironically, gets people mad at the legislation—like GDPR—instead of the underlying reason for these annoying overlays: unnecessary surveillance and tracking by the site you’re visiting).

Tracker-driven behavioural advertising is good for the middlemen doing the tracking. Facebook and Google are two of the biggest players here. But that doesn’t mean that their business models need to be permanently anchored to surveillance. The very monopolies that make them kings of behavioural advertising—the biggest social network and the biggest search engine—would also make them titans of contextual advertising. They could pivot from an invasive behavioural model of advertising to a privacy-respecting contextual advertising model.

The incumbents will almost certainly resist changing something so fundamental. It would be like expecting an energy company to change their focus from fossil fuels to renewables. It won’t happen quickly. But I think that it may eventually happen …if we demand it.

In the meantime, we can all play our part. Just as we can do our bit for the environment at an individual level by sorting our recycling and making green choices in our day to day lives, we can all do our bit for the web too.

The least we can do is block third-party cookies. Some browsers are now doing this by default. That’s good.

Blocking third-party JavaScript is a bit trickier. That requires a browser extension. Most of these extensions to block third-party tracking are called ad blockers. That’s a shame. The issue is not with advertising. The issue is with tracking.

Alas, because this software is labelled under ad blocking, it has led to the ludicrous situation of an ethical argument being made to allow surveillance and tracking! It goes like this: websites need advertising to survive; if you block the ads, then you are denying these sites revenue. That argument would make sense if we were talking about contextual advertising. But it makes no sense when it comes to behavioural advertising …unless you genuinely believe that online advertising has to be behavioural, which means that online advertising has to track you to be effective. Such a belief would be completely wrong. But that doesn’t stop it being widely held.

To argue that there is a moral argument against blocking trackers is ridiculous. If anything, there’s a moral argument to be made for installing anti-tracking software for yourself, your friends, and your family. Otherwise we are collectively giving up our privacy for a business model that doesn’t even work.

It’s a shame that advertisers will lose out if tracking-blocking software prevents their ads from loading. But that’s only going to happen in the case of behavioural advertising. Contextual advertising won’t be blocked. Contextual advertising is also more lightweight than behavioural advertising. Contextual advertising is far less creepy than behavioural advertising. And crucially, contextual advertising works.

That shouldn’t be a controversial claim: the idea that people would be interested in adverts that are related to the content they’re currently looking at. The greatest trick the ad tech industry has pulled is convincing the world that contextual relevance is somehow less effective than some secret algorithm fed with all our data that’s supposed to be able to practically read our minds and know us better than we know ourselves.

Y’know, if this mind-control ray really could give me timely relevant adverts, I might possibly consider paying the price with my privacy. But as it is, YouTube still hasn’t figured out that I’m not interested in Top Gear or football.

The next time someone is talking about the necessity of advertising on the web as a business model, ask for details. Do they mean contextual or behavioural advertising? They’ll probably laugh at you and say that behavioural advertising is the only thing that works. They’ll be wrong.

I know it’s hard to imagine a future without tracker-driven behavioural advertising. But there are no good business reasons for it to continue. It was once hard to imagine a future without oil or coal. But through collective action, legislation, and smart business decisions, we can make a cleaner future.

Implementors

The latest newsletter from The History Of The Web is a good one: The Browser Engine That Could. It’s all about the history of browsers and more specifically, rendering engines.

Jay quotes from a 1992 email by Tim Berners-Lee when there was real concern about having too many different browsers. But as history played out, the concern shifted to having too few different browsers.

I wrote about this—back when Edge switched to using Chromium—in a post called Unity where I compared it to political parties:

If you have hundreds of different political parties, that’s not ideal. But if you only have one political party, that’s very bad indeed!

I talked about this some more with Brian and Stuart on the Igalia Chats podcast: Web Ecosystem Health (here’s the mp3 file).

In the discussion we dive deeper into the naunces of browser engine diversity; how it’s not the numbers that matter, but representation. The danger with one dominant rendering engine is that it would reflect one dominant set of priorities.

I think we’re starting to see this kind of battle between different sets of priorities playing out in the browser rendering engine landscape.

Webkit published a list of APIs they won’t be implementing in their current form because of security concerns around fingerprinting. Mozilla is taking the same stand. Google is much more gung-ho about implementing those APIs.

I think it’s safe to say that every implementor wants to ship powerful APIs and ensure security and privacy. The issue is with which gets priority. Using the language of principles and priorities, you could crudely encapsulate Apple and Mozilla’s position as:

Privacy, even over capability.

That design principle would pass the reversibility test. In fact, Google’s position might be represented as:

Capability, even over privacy.

I’m not saying Apple and Mozilla don’t value powerful APIs. I’m not saying Google doesn’t value privacy. I’m saying that Google’s priorities are different to Apple’s and Mozilla’s.

Alas, Alex is saying that Apple and Mozilla don’t value capability:

There is a contingent of browser vendors today who do not wish to expand the web platform to cover adjacent use-cases or meaningfully close the relevance gap that the shift to mobile has opened.

That’s very disappointing. It’s a cheap shot. As cheap as saying that, given Google’s business model, Chrome wouldn’t want to expand the web platform to provide better privacy and security.

Opening up the AMP cache

I have a proposal that I think might alleviate some of the animosity around Google AMP. You can jump straight to the proposal or get some of the back story first…

The AMP format

Google AMP is exactly the kind of framework I’d like to get behind. Unlike most front-end frameworks, its components take a declarative approach—no knowledge of JavaScript required. I think Lea’s excellent Mavo is the only other major framework that takes this inclusive approach. All the configuration happens in markup, and all the styling happens in CSS. Excellent!

But I cannot get behind AMP.

Instead of competing on its own merits, AMP is unfairly propped up by the search engine of its parent company, Google. That makes it very hard to evaluate whether AMP is being used on its own merits. Instead, the evidence suggests that most publishers of AMP pages are doing so because they feel they have to, rather than because they want to. That’s a real shame, because as a library of web components, AMP seems pretty good. But there’s just no way to evaluate AMP-the-format without taking into account AMP-the-ecosystem.

The AMP ecosystem

Google AMP ostensibly exists to make the web faster. Initially the focus was specifically on mobile performance, but that distinction has since fallen by the wayside. The idea is that by using AMP’s web components, your pages will be speedy. Though, as Andy Davies points out, this isn’t always the case:

This is where I get confused… https://independent.co.uk only have an AMP site yet it’s performance is awful from a user perspective - isn’t AMP supposed to prevent this?

See also: Google AMP lowered our page speed, and there’s no choice but to use it:

According to Google’s own Page Speed Insights audit (which Google recommends to check your performance), the AMP version of articles got an average performance score of 87. The non-AMP versions? 95.

Publishers who already have fast web pages—like The Guardian—are still compelled to make AMP versions of their stories because of the search benefits reserved for AMP. As Terence Eden reported from a meeting of the AMP advisory committee:

We heard, several times, that publishers don’t like AMP. They feel forced to use it because otherwise they don’t get into Google’s news carousel — right at the top of the search results.

Some people felt aggrieved that all the hard work they’d done to speed up their sites was for nothing.

The Google AMP team are at pains to point out that AMP is not a ranking factor in search. That’s true. But it is unfairly privileged in other ways. Only AMP pages can appear in the Top Stories carousel …which appears above any other search results. As I’ve said before:

Now, if you were to ask any right-thinking person whether they think having their page appear right at the top of a list of search results would be considered preferential treatment, I think they would say hell, yes! This is the only reason why The Guardian, for instance, even have AMP versions of their content—it’s not for the performance benefits (their non-AMP pages are faster); it’s for that prime real estate in the carousel.

From A letter about Google AMP:

Content that “opts in” to AMP and the associated hosting within Google’s domain is granted preferential search promotion, including (for news articles) a position above all other results.

That’s not the only way that AMP pages get preferential treatment. It turns out that the secret to the speed of AMP pages isn’t the web components. It’s the prerendering.

The AMP cache

If you’ve ever seen an AMP page in a list of search results, you’ll have noticed the little lightning icon. If you’ve ever tapped on that search result, you’ll have noticed that the page loads blazingly fast!

That’s not down to AMP-the-format, alas. That’s down to the fact that the page has been prerendered by Google before you even went to it. If any page were prerendered that way, it would load blazingly fast. But currently, this privilege is reserved for AMP pages only.

If, after tapping through to that AMP page, you looked at the address bar of your browser, you might have noticed something odd. Even though you might have thought you were visiting The Washington Post, or The New York Times, the URL of the (blazingly fast) page you’re looking at is still under Google’s domain. That’s because Google hosts any AMP pages that it prerenders.

Google calls this “the AMP cache”, but it would be better described as “AMP hosting”. The web page sent down the wire is hosted on Google’s domain.

Here’s that AMP letter again:

When a user navigates from Google to a piece of content Google has recommended, they are, unwittingly, remaining within Google’s ecosystem.

Through gritted teeth, I will refer to this as “the AMP cache”, because that’s what everyone else calls it. But make no mistake, Google is hosting—not caching—these pages.

But why host the pages on a Google domain? Why not prerender the original URLs?

Prerendering and privacy

Scott summed up the situation with AMP nicely:

The pitch I think site owners are hearing is: let us host your pages on our domain and we’ll promote them in search results AND preload them so they feel “instant.” To opt-in, build pages using this component syntax.

But perhaps we could de-couple the AMP format from the AMP cache.

That’s what Terence suggests:

My recommendation is that Google stop requiring that organisations use Google’s proprietary mark-up in order to benefit from Google’s promotion.

The AMP letter, too:

Instead of granting premium placement in search results only to AMP, provide the same perks to all pages that meet an objective, neutral performance criterion such as Speed Index.

Scott reiterates:

It’s been said before but it would be so good for the web if pages with a Lighthouse score over say, 90 could get into that top search result area, even if they’re not built using Google’s AMP framework. Feels wrong to have to rebuild/reproduce an already-fast site just for SEO.

This was also what I was calling for. But then Malte pointed out something that stumped me. Privacy.

Here’s the problem…

Let’s say Google do indeed prerender already-fast pages when they’re listed in search results. You, a search user, type something into Google. A list of results come back. Google begins pre-rendering some of them. But you don’t end up clicking through to those pages. Nonetheless, the servers those pages are hosted on have received a GET request coming from a Google search. Those publishers now know that a particular (cookied?) user could have clicked through to their site. That’s very different from knowing when someone has actually arrived at a particular site.

And that’s why Google host all the AMP pages that they prerender. Given the privacy implications of prerendering non-Google URLs, I must admit that I see their point.

Still, it’s a real shame to miss out on the speed benefit of prerendering:

Prerendering AMP documents leads to substantial improvements in page load times. Page load time can be measured in different ways, but they consistently show that prerendering lets users see the content they want faster. For now, only AMP can provide the privacy preserving prerendering needed for this speed benefit.

A modest proposal

Why is Google’s AMP cache just for AMP pages? (Y’know, apart from the obvious answer that it’s in the name.)

What if Google were allowed to host non-AMP pages? Google search could then prerender those pages just like it currently does for AMP pages. There would be no privacy leaks; everything would happen on the same domain—google.com or ampproject.org or whatever—just as currently happens with AMP pages.

Don’t get me wrong: I’m not suggesting that Google should make a 1:1 model of the web just to prerender search results. I think that the implementation would need to have two important requirements:

  1. Hosting needs to be opt-in.
  2. Only fast pages should be prerendered.

Opting in

Currently, by publishing a page using the AMP format, publishers give implicit approval to Google to host that page on Google’s servers and serve up this Google-hosted version from search results. This has always struck me as being legally iffy. I’ve looked in the AMP documentation to try to find any explicit granting of hosting permission (e.g. “By linking to this JavaScript file, you hereby give Google the right to serve up our copies of your content.”), but no luck. So even with the current situation, I think a clear opt-in for hosting would be beneficial.

This could be a meta element. Maybe something like:

<meta name="caches-allowed" content="google">

This would have the nice benefit of allowing comma-separated values:

<meta name="caches-allowed" content="google, yandex">

(The name is just a strawman, by the way—I’m not suggesting that this is what the final implementation would actually look like.)

If not a meta element, then perhaps this could be part of robots.txt? Although my feeling is that this needs to happen on a document-by-document basis rather than site-wide.

Many people will, quite rightly, never want Google—or anyone else—to host and serve up their content. That’s why it’s so important that this behaviour needs to be opt-in. It’s kind of appalling that the current hosting of AMP pages is opt-in-by-proxy-sort-of.

Criteria for prerendering

Which pages should be blessed with hosting and prerendering? The fast ones. That’s sorta the whole point of AMP. But right now, there’s a lot of resentment by people with already-fast websites who quite rightly feel they shouldn’t have to use the AMP format to benefit from the AMP ecosystem.

Page speed is already a ranking factor. It doesn’t seem like too much of a stretch to extend its benefits to hosting and prerendering. As mentioned above, there are already a few possible metrics to use:

  • Page Speed Index
  • Lighthouse
  • Web Page Test

Ah, but what if a page has good score when it’s indexed, but then gets worse afterwards? Not a problem! The version of the page that’s measured is the same version of the page that gets hosted and prerendered. Google can confidently say “This page is fast!” After all, they’re the ones serving up the page.

That does raise the question of how often Google should check back with the original URL to see if it has changed/worsened/improved. The answer to that question is however long it currently takes to check back in on AMP pages:

Each time a user accesses AMP content from the cache, the content is automatically updated, and the updated version is served to the next user once the content has been cached.

Issues

This proposal does not solve the problem with the address bar. You’d still find yourself looking at a page from The Washington Post or The New York Times (or adactio.com) but seeing a completely different URL in your browser. That’s not good, for all the reasons outlined in the AMP letter.

In fact, this proposal could potentially make the situation worse. It would allow even more sites to be impersonated by Google’s URLs. Where currently only AMP pages are bad actors in terms of URL confusion, opening up the AMP cache would allow equal opportunity URL confusion.

What I’m suggesting is definitely not a long-term solution. The long-term solutions currently being investigated are technically tricky and will take quite a while to come to fruition—web packages and signed exchanges. In the meantime, what I’m proposing is a stopgap solution that’s technically a lot simpler. But it won’t solve all the problems with AMP.

This proposal solves one problem—AMP pages being unfairly privileged in search results—but does nothing to solve the other, perhaps more serious problem: the erosion of site identity.

Measuring

Currently, Google can assess whether a page should be hosted and prerendered by checking to see if it’s a valid AMP page. That test would need to be widened to include a different measurement of performance, but those measurements already exist.

I can see how this assessment might not be as quick as checking for AMP validity. That might affect whether non-AMP pages could be measured quickly enough to end up in the Top Stories carousel, which is, by its nature, time-sensitive. But search results are not necessarily as time-sensitive. Let’s start there.

Assets

Currently, AMP pages can be prerendered without fetching anything other than the markup of the AMP page itself. All the CSS is inline. There are no initial requests for other kinds of content like images. That’s because there are no img elements on the page: authors must use amp-img instead. The image itself isn’t loaded until the user is on the page.

If the AMP cache were to be opened up to non-AMP pages, then any content required for prerendering would also need to be hosted on that same domain. Otherwise, there’s privacy leakage.

This definitely introduces an extra level of complexity. Paths to assets within the markup might need to be re-written to point to the Google-hosted equivalents. There would almost certainly need to be a limit on the number of assets allowed. Though, for performance, that’s no bad thing.

Make no mistake, figuring out what to do about assets—style sheets, scripts, and images—is very challenging indeed. Luckily, there are very smart people on the Google AMP team. If that brainpower were to focus on this problem, I am confident they could solve it.

Summary

  1. Prerendering of non-Google URLs is problematic for privacy reasons, so Google needs to be able to host pages in order to prerender them.
  2. Currently, that’s only done for pages using the AMP format.
  3. The AMP cache—and with it, prerendering—should be decoupled from the AMP format, and opened up to other fast web pages.

There will be technical challenges, but hopefully nothing insurmountable.

I honestly can’t see what Google have to lose here. If their goal is genuinely to reward fast pages, then opening up their AMP cache to fast non-AMP pages will actively encourage people to make fast web pages (without having to switch over to the AMP format).

I’ve deliberately kept the details vague—what the opt-in should look like; what the speed measurement should be; how to handle assets—I’m sure smarter folks than me can figure that stuff out.

I would really like to know what other people think about this proposal. Obviously, I’d love to hear from members of the Google AMP team. But I’d also love to hear from publishers. And I’d very much like to know what people in the web performance community think about this. (Write a blog post and send me a webmention.)

What am I missing here? What haven’t I thought of? What are the potential pitfalls (and are they any worse than the current acrimonious situation with Google AMP)?

I would really love it if someone with a fast website were in a position to say, “Hey Google, I’m giving you permission to host this page so that it can be prerendered.”

I would really love it if someone with a slow website could say, “Oh, shit! We’d better make our existing website faster or Google won’t host our pages for prerendering.”

And I would dearly love to finally be able to embrace AMP-the-format with a clear conscience. But as long as prerendering is joined at the hip to the AMP format, the injustice of the situation only harms the AMP project.

Google, open up the AMP cache.

Designing for Personalities by Sarah Parmenter

Following on from Jeffrey and Margot, the third talk in the morning’s curated content at An Event Apart Seattle is from Sarah Parmenter. Her talk is Designing for Personalities. Here’s the description:

Just as our designs today must accommodate differences of gender, cultural background, and other factors, it’s time to create apps, websites, and internal processes that account for still another strand of human diversity: our very different personality types.

In this new presentation, Sarah shares real-life case studies demonstrating how businesses and organizations large and small are learning to adjust the thinking behind their websites and processes to account for the wishes, needs, and comfort levels of all kinds of people.

We know that the world is full of different conventions—currency, measuring systems, and more—and our web forms address these differences. Let’s do the same for the emotional and psychological assumptions behind our customer profiles. Let’s learn to design for a palette of different personalities.

I’m going to do my best to write down some of what she says…

Sarah works with Adobe, and at a gathering last year, she ended up chatting with some of her co-workers about ancestry, for some reason. She mentioned that she had French and Norwegian roots. The French part is evident in her surname: parmentier means potato farmer. So Sarah did a DNA test. It turned out that Sarah had no French or Norwegian roots—everything in her ancestry came from within an eighty mile radius of her home. It was scary how much she strongly believed for years in something that just wasn’t true.

It’s like that on the web. There are things we do because lots of people do them, but that doesn’t mean they work. Many websites and digital processes are broken and it’s down to us to fix it.

With traditional personas, we make an awful lot of assumptions about people. Have a look at facebook.com/ads/preferences. See just how easy it is for computers to make startling amounts of assumptions.

The other problem with personas is that they are amalgamations. But there’s no such thing as an average costumer. The Microsoft design team add much more context so that they can design for real people in real situations.

Designing for personas only takes care of a fraction of the work we need to do. When we add in another layer of life getting in the way, and a layer of how someone is feeling, you’ve a medley of UX issues that need solving.

The problem is that personality traits aren’t static. They evolve with context. Personas are contextual but static. What we should really be doing is creating the most desirable experience for the user, and we can only do that by empowering them, as Margot also said. We need to give our users control.

If there were such controls, Sarah would use them to reduce motion on websites. She suffers from motion sickness and some websites literally make her sick. There is a prefers-reduced-motion media query but so far only Safari and Firefox support it. It’s hard to believe that we haven’t been doing this already. This stuff seems so obvious in hindsight.

Sarah asks who in the room are introverts. People raise their hands (which seems like quite an extroverted thing to do).

Now Sarah brings up the Meyers-Briggs test, a piece of pseudoscientic bollocks. Sarah is INFJ—introversion, intuition, feeling, judging. Weird flex, but okay.

Introverts will patiently seek out complex UX patterns if it aligns with their levels of comfort. These are people who would rather do anything rather than speak to someone on the phone. An introvert figured out that if you sat on the Virgin Atlantic homepage long enough, a live chat will pop up after twenty minutes.

Apple is great for introverts. They don’t bury their chat options (unlike Amazon). Remember, introverts are a third of the population.

Users will begin to value those applications and services that bother them the least, respect their privacy, and allow them a certain level of control.

Let’s talk about designing compassionate products.

What we’re asking of people in time-critical or exceptionally personal situations is for them to have the foresight to turn on incognito mode. Everyone has an urban legend horror story about cookies following them around the web. Cookies can seem like a smart marketing solution until context lets them down.

Sarah’s best friend got pregnant. She started excitedly clicking around the web looking for pregnancy-related products. She sadly lost the baby. Sarah explained to her how to use a cookie eraser. Her friend that she was joking. Sarah showed her how to clean her search history. But if you’ve liked and subscribed a bunch of things while you’re excited, it’s not that easy—when the worst happens—to think back on everything you did.

There’s an app that’s not in the US. It’s a menstrual cycle and fertility tracking app. It captures a lot of data. At the point when Sarah’s friend lost her baby, this change was caught by the app. The message she got was lacking in empathy. It was more like market research than a compassionate message. At a time when they should’ve been thinking of the mindset of their user, they were focused on getting data. No one caught this when the app was being designed.

The entire user experience of our websites and apps is going to rely on how empathetic we are.

We don’t always save things to reminisce; we save to give us the option to remember. We can currently favourite a photograph or flag as inapropriate. It would be nice to simply save something to a memory vault.

Bloom and Wild is a company in the UK. They send nice mailbox flowers. On March 5th last year, Sarah sent an email to the CEO of Bloom and Wild. She had just received a mailout about mother’s day after her mother passed away. Was their no way of opting out of receiving mother’s day emails without unsubscribing completely?

Well, yesterday they finally implemented it! Bloom and Wild have been overwhelmed by the positive response.

For those of us trying to make the web a better place, sometimes it can be as simple as reaching out to point out what companies could be doing better. And sometimes, just sometimes, they listen.

Also, read Design For Real Life by Eric Meyer and Sara Wachter-Boettcher.

As standard, we should be giving users end-to-end control over how they interact with us.

Sarah wants to talk about designing a personal UX journey. For one of her clients, Sarah dip-sampled hundreds of existing customers. There were gaps in the customer journey. They think that what was happening was the company was getting very aggressive after initial interaction—they were phoning customers. Sarah and her team started researching this. That made them unpopular with other parts of the company. Sarah gave her team Groucho Marx glasses whenever they had to go and ask people uncomfortable questions.

Sarah’s team went on a remarketing effort. They sent an email to people who were in the gap between booking an appointment and making a purchase. They asked the users what their preferences were for contacting them. The company didn’t think they were doing anything wrong but this research showed that 76% of people prefered to avoid phone calls.

They asked a few more questions. If you ask questions, there has to be value in it for the users. Sarah got the budget for some gift cards. They got feedback that many people don’t like taking calls, especially when they’re at work. The best: “I’m an intorvert. I hate calls. Sorry.”

The customer feedback was very, very clear. Even though this would take a lot of money to fix, it was crucial to fix it. Being agile was crucial.

Then they looked at a different (shorter) gap in the customer journey. It was clear that an online booking service was desirable. They made a product quickly that booked more appointments in ten days than had previously been booked in a month by sales agents.

They also made a live chat system. You see a very slow roll-out. At the beginning, it has all new customers. After a while, people return with more questions.

The mistake they made was having a tech-savvy team with multiple browser windows open. That’s not how the customer service people operate. They usually deal with people one on one. So they were happy to leave people waiting on live chat for twenty or twenty five minutes, and of course that was far too long. So when you’re adding in a new system like this, think about key performance indicators that you want to go along with it e.g. live chat must have a response within five minutes.

There’s also a long tail of conversion. Sometimes the sales cycle is very lengthy. They decided to give users the ability to select which product they wanted and switch options on and off. It was all about giving the power back to the user. This was a phenomenal change for the company. They were able to completely change the customer journey and reduce those big gaps. They went from a cycle of fourteen weeks to seven days. They did that by handing the power back to the user.

Sarah’s question for the audience is: What is stopping your user completing your cycle? This can be very difficult. You might have to do horrible things to validate a concept. It’s okay. We’re all perfectionists, but sometimes you have to use quick’n’dirty code to achieve your goal. If the end goal is we’re able to say “hey, this thing worked!” then we can go back and do it properly.

To recap:

  • Respect privacy and build in a personal level of UX adjustment into every product.
  • Outlier data can create superfans of your product.
  • Build the most empathetic experience that you can.

Webmentions at Indie Web Camp Berlin

I was in Berlin for most of last week, and every day was packed with activity:

By the time I got back to Brighton, my brain was full …just in time for FF Conf.

All of the events were very different, but equally enjoyable. It was also quite nice to just attend events without speaking at them.

Indie Web Camp Berlin was terrific. There was an excellent turnout, and once again, I found that the format was just right: a day of discussions (BarCamp style) followed by a day of doing (coding, designing, hacking). I got very inspired on the first day, so I was raring to go on the second.

What I like to do on the second day is try to complete two tasks; one that’s fairly straightforward, and one that’s a bit tougher. That way, when it comes time to demo at the end of the day, even if I haven’t managed to complete the tougher one, I’ll still be able to demo the simpler one.

In this case, the tougher one was also tricky to demo. It involved a lot of invisible behind-the-scenes plumbing. I was tweaking my webmention endpoint (stop sniggering—tweaking your endpoint is no laughing matter).

Up until now, I could handle straightforward webmentions, and I could handle updates (if I receive more than one webmention from the same link, I check it each time). But I needed to also handle deletions.

The spec is quite clear on this. A 404 isn’t enough to trigger a deletion—that might be a temporary state. But a status of 410 Gone indicates that a resource was once here but has since been deliberately removed. In that situation, any stored webmentions for that link should also be removed.

Anyway, I think I got it working, but it’s tricky to test and even trickier to demo. “Not to worry”, I thought, “I’ve always got my simpler task.”

For that, I chose to add a little map to my homepage showing the last location I published something from. I’ve been geotagging all my content for years (journal entries, notes, links, articles), but not really doing anything with that data. This is a first step to doing something interesting with many years of location data.

I’ve got it working now, but the demo gods really weren’t with me at Indie Web Camp. Both of my demos failed. The webmention demo failed quite embarrassingly.

As well as handling deletions, I also wanted to handle updates where a URL that once linked to a post of mine no longer does. Just to be clear, the URL still exists—it’s not 404 or 410—but it has been updated to remove the original link back to one of my posts. I know this sounds like another very theoretical situation, but I’ve actually got an example of it on my very first webmention test post from five years ago. Believe it or not, there’s an escort agency in Nottingham that’s using webmention as a vector for spam. They post something that does link to my test post, send a webmention, and then remove the link to my test post. I almost admire their dedication.

Still, I wanted to foil this particular situation so I thought I had updated my code to handle it. Alas, when it came time to demo this, I was using someone else’s computer, and in my attempt to right-click and copy the URL of the spam link …I accidentally triggered it. In front of a room full of people. It was midly NSFW, but more worryingly, a potential Code Of Conduct violation. I’m very sorry about that.

Apart from the humiliating demo, I thoroughly enjoyed Indie Web Camp, and I’m going to keep adjusting my webmention endpoint. There was a terrific discussion around the ethical implications of storing webmentions, led by Sebastian, based on his epic post from earlier this year.

We established early in the discussion that we weren’t going to try to solve legal questions—like GDPR “compliance”, which varies depending on which lawyer you talk to—but rather try to figure out what the right thing to do is.

Earlier that day, during the introductions, I quite happily showed webmentions in action on my site. I pointed out that my last blog post had received a response from another site, and because that response was marked up as an h-entry, I displayed it in full on my site. I thought this was all hunky-dory, but now this discussion around privacy made me question some inferences I was making:

  1. By receiving a webention in the first place, I was inferring a willingness for the link to be made public. That’s not necessarily true, as someone pointed out: a CMS could be automatically sending webmentions, which the author might be unaware of.
  2. If the linking post is marked up in h-entry, I was inferring a willingness for the content to be republished. Again, not necessarily true.

That second inferrence of mine—that publishing in a particular format somehow grants permissions—actually has an interesting precedent: Google AMP. Simply by including the Google AMP script on a web page, you are implicitly giving Google permission to store a complete copy of that page and serve it from their servers instead of sending people to your site. No terms and conditions. No checkbox ticked. No “I agree” button pressed.

Just sayin’.

Anyway, when it comes to my own processing of webmentions, I’m going to take some of the suggestions from the discussion on board. There are certain signals I could be looking for in the linking post:

  • Does it include a link to a licence?
  • Is there a restrictive robots.txt file?
  • Are there meta declarations that say noindex?

Each one of these could help to infer whether or not I should be publishing a webmention or not. I quickly realised that what we’re talking about here is an algorithm.

Despite its current usage to mean “magic”, an algorithm is a recipe. It’s a series of steps that contribute to a decision point. The problem is that, in the case of silos like Facebook or Instagram, the algorithms are secret (which probably contributes to their aura of magical thinking). If I’m going to write an algorithm that handles other people’s information, I don’t want to make that mistake. Whatever steps I end up codifying in my webmention endpoint, I’ll be sure to document them publicly.

Name That Script! by Trent Walton

Trent is about to pop his AEA cherry and give a talk at An Event Apart in Boston. I’m going to attempt to liveblog this:

How many third-party scripts are loading on our web pages these days? How can we objectively measure the value of these (advertising, a/b testing, analytics, etc.) scripts—considering their impact on web performance, user experience, and business goals? We’ve learned to scrutinize content hierarchy, browser support, and page speed as part of the design and development process. Similarly, Trent will share recent experiences and explore ways to evaluate and discuss the inclusion of 3rd-party scripts.

Trent is going to speak about third-party scripts, which is funny, because a year ago, he never would’ve thought he’d be talking about this. But he realised he needed to pay more attention to:

any request made to an external URL.

Or how about this:

A resource included with a web page that the site owner doesn’t explicitly control.

When you include a third-party script, the third party can change the contents of that script.

Here are some uses:

  • advertising,
  • A/B testing,
  • analytics,
  • social media,
  • content delivery networks,
  • customer interaction,
  • comments,
  • tag managers,
  • fonts.

You get data from things like analytics and A/B testing. You get income from ads. You get content from CDNs.

But Trent has concerns. First and foremost, the user experience effects of poor performance. Also, there are the privacy implications.

Why does Trent—a designer—care about third party scripts? Well, over the years, the areas that Trent pays attention to has expanded. He’s progressed from image comps to frontend to performance to accessibility to design systems to the command line and now to third parties. But Trent has no impact on those third-party scripts. That’s very different to all those other areas.

Trent mostly builds prototypes. Those then get handed over for integration. Sometimes that means hooking it up to a CMS. Sometimes it means adding in analytics and ads. It gets really complex when you throw in third-party comments, payment systems, and A/B testing tools. Oftentimes, those third-party scripts can outweigh all the gains made beforehand. It happens with no discussion. And yet we spent half a meeting discussing a border radius value.

Delivering a performant, accessible, responsive, scalable website isn’t enough: I also need to consider the impact of third-party scripts.

Trent has spent the last few months learning about third parties so he can be better equiped to discuss them.

UX, performance and privacy impact

We feel the UX impact every day we browse the web (if we turn off our content blockers). The Food Network site has an intersitial asking you to disable your ad blocker. They promise they won’t spawn any pop-up windows. Trent turned his ad blocker off—the page was now 15 megabytes in size. And to top it off …he got a pop up.

Privacy can harder to perceive. We brush aside cookie notifications. What if the wording was “accept trackers” instead of “accept cookies”?

Remarketing is that experience when you’re browsing for a spatula and then every website you visit serves you ads for spatula. That might seem harmless but allowing access to our browsing history has serious privacy implications.

Web builders are on the front lines. It’s up to us to advocate for data protection and privacy like we do for web standards. Don’t wait to be told.

Categories of third parties

Ghostery categories third-party providers: advertising, comments, customer interaction, essential, site analytics, social media. You can dive into each layer and see the specific third-party services on the page you’re viewing.

Analyse and itemise third-party scripts

We have “view source” for learning web development. For third parties, you need some tool to export the data. HAR files (HTTP ARchive) are JSON files that you can create from most browsers’ network request panel in dev tools. But what do you do with a .har file? The site har.tech has plenty of resources for you. That’s where Trent found the Mac app, Charles. It can open .har files. Best of all, you can export to CSV so you can share spreadsheets of the data.

You can visualise third-party requests with Simon Hearne’s excellent Request Map. It’s quite impactful for delivering a visceral reaction in a meeting—so much more effective than just saying “hey, we have a lot of third parties.” Request Map can also export to CSV.

Know industry averages

Trent wanted to know what was “normal.” He decided to analyse HAR files for Alexa’s top 50 US websites. The result was a massive spreadsheet of third-party providers. There were 213 third-party domains (which is not even the same as the number of requests). There was an average of 22 unique third-party domains per site. The usual suspects were everywhere—Google, Amazon, Facebook, Adobe—but there were many others. You can find an alphabetical index on better.fyi/trackers. Often the lesser-known domains turn out to be owned by the bigger domains.

News sites and shopping sites have the most third-party scripts, unsurprisingly.

Understand benefits

Trent realised he needed to listen and understand why third-party scripts are being included. He found out what tag managers do. They’re funnels that allow you to cram even more third-party scripts onto your website. Trent worried that this was a Pandora’s box. The tag manager interface is easy to access and use. But he was told that it’s more like a way of organising your third-party scripts under one dashboard. But still, if you get too focused on the dashboard, you could lose focus of the impact on load times. So don’t blame the tool: it’s all about how it’s used.

Take action

Establish a centre of excellence. Put standards in place—in a cross-discipline way—to define how third-party scripts are evaluated. For example:

  1. Determine the value to the business.
  2. Avoid redundant scripts and services.
  3. Fit within the established performance budget.
  4. Comply with the organistional privacy policy.

Document those decisions, maybe even in your design system.

Also, include third-party scripts within your prototypes to get a more accurate feel for the performance implications.

On a live site, you can regularly audit third-party scripts on a regular basis. Check to see if any are redundant or if they’re exceeding the performance budget. You can monitor performance with tools like Calibre and Speed Curve to cover the time in between audits.

Make your case

Do competitive analysis. Look at other sites in your sector. It’s a compelling way to make a case for change. WPO Stats is very handy for anecdata.

You can gather comparative data with Web Page Test: you can run a full test, and you can run a test with certain third parties blocked. Use the results to kick off a discussion about the impact of those third parties.

Talk it out

Work to maintain an ongoing discussion with the entire team. As Tim Kadlec says:

Everything should have a value, because everything has a cost.

100 words 010

  • Nobody writes on their own website any more. People write on Twitter, Facebook, and Medium instead. Personal publishing is dead.
  • You can’t build a complex interactive app on the web without requiring JavaScript. It’s fine if it doesn’t work without JavaScript. JavaScript is ubiquitous.
  • Privacy is dead. The technologies exists to monitor your every movement and track all your communications so it is inevitable that our society will bend to this reality.

These statements aren’t true. But they are repeated so often, as if they were truisms, that we run the risk of believing them and thus, fulfilling their promise.

Analytical

When I was talking about Huffduffer, I mentioned that I don’t have any analytics set up for the site:

To be honest, I’m okay with that—one of the perks of having a personal project is that only metric that really matters is your own satisfaction.

For a while, I did have Google Analytics set up on The Session. But I started to feel a bit uncomfortable about willingly opening up a wormhole between my site and the Google mothership. It bothered me that Adblock Plus would show that one ad had been blocked on the site. There are no ads on the site, but the presence of the Google Analytics code was providing valuable information to Google—and its advertiser customer base—so I can understand why it gets flagged up like any other unwanted tracking.

Theoretically, users have a way of opting out of this kind of tracking by switching on the Do Not Track header (if it isn’t switched on by default). Looking at the default JavaScript code that Google provides for setting up Google Analytics, I don’t see any mention of navigator.doNotTrack.

Now, it may well be that Google sniffs for that header (and abandons any tracking) when its server is pinged via the analytics code, but there’s no way to tell from this side of the Googleplex. I certainly don’t see any mention of it in the JavaScript that gets inserted into our web pages.

I was wondering whether it makes sense to explictly check for the doNotTrack header before opening up that connection to google-analytics.com via a generated script element.

So if the current code looks like:

<script>
// generate a script element
// point it to google-analytics.com
</script>

Would it make sense to wrap it with some kind of test for navigator.doNotTrack:

<script>
if (!navigator.doNotTrack || navigator.doNotTrack != 'yes') {
// generate a script element
// point it to google-analytics.com
}
</script>

For the love of mercy, don’t actually use that code—it’s completely untested and probably causes more harm than good. But you can see the idea that I’m trying to get at, right? Google Analytics most definitely counts as tracking so it seems like the ideal use-case for Do Not Track.

It raises a few questions:

  1. Is anyone doing this already? It might well be that the answer is “no”, not because of any reluctance to respect user preferences but because the doNotTrack spec is very much in flux.
  2. Would you consider doing this?
  3. If you were to do this, could you foresee getting pushback from within your own company?

Tracking

Ajax was a really big deal six, seven, eight years ago. My second book was all about Ajax. I spoke about Ajax at conferences and gave workshops all about using Ajax and progressive enhancement.

During those workshops, I would often point out that Ajax had the potential to be abused terribly. Until the advent of Ajax, it was very clear to a user when data was being submitted to a server: you’d have to click a link or submit a form. As soon as you introduce asynchronous communication, it’s possible for the server to get information from the client even without a full-page refresh.

Imagine, for example, that you’re typing a message into a textarea. You might begin by typing, “Why, you stuck up, half-witted, scruffy-looking nerf…” before calming down and thinking better of it. Before Ajax, there was no way that what you had typed could ever reach the server. But now, it’s entirely possible to send data via Ajax with every key press.

It was just a thought experiment. I wasn’t actually that worried that anyone would ever do something quite so creepy.

Then I came across this article by Jennifer Golbeck in Slate all about Facebook tracking what’s entered—but then erased—within its status update form:

Unfortunately, the code that powers Facebook still knows what you typed—even if you decide not to publish it. It turns out that the things you explicitly choose not to share aren’t entirely private.

Initially I thought there must have been some mistake. I erronously called out Jen Golbeck when I found the PDF of a paper called The Post that Wasn’t: Exploring Self-Censorship on Facebook. The methodology behind the sample group used for that paper was much more old-fashioned than using Ajax:

First, participants took part in a weeklong diary study during which they used SMS messaging to report all instances of unshared content on Facebook (i.e., content intentionally self-censored). Participants also filled out nightly surveys to further describe unshared content and any shared content they decided to post on Facebook. Next, qualified participants took part in in-lab interviews.

But the Slate article was referencing a different paper that does indeed use Ajax to track instances of deleted text:

This research was conducted at Facebook by Facebook researchers. We collected self-censorship data from a random sample of approximately 5 million English-speaking Facebook users who lived in the U.S. or U.K. over the course of 17 days (July 6-22, 2012).

So what I initially thought was a case of alarmism—conflating something as simple as simple as a client-side character count with actual server-side monitoring—turned out to be a pretty accurate reading of the situation. I originally intended to write a scoffing post about Slate’s linkbaiting alarmism (and call it “The shocking truth behind the latest Facebook revelation”), but it turns out that my scoffing was misplaced.

That said, the article has been updated to reflect that the Ajax requests are only sending information about deleted characters—not the actual content. Still, as we learned very clearly from the NSA revelations, there’s not much practical difference between logging data and logging metadata.

The nerds among us may start firing up our developer tools to keep track of unexpected Ajax requests to the server. But what about everyone else?

This isn’t the first time that the power of JavaScript has been abused. Every browser now ships with an option to block pop-up windows. That’s because the ability to spawn new windows was so horribly misused. Maybe we’re going to see similar preference options to avoid firing Ajax requests on keypress.

It would be depressingly reductionist to conclude that any technology that can be abused will be abused. But as long as there are web developers out there who are willing to spawn pop-up windows or force persistent cookies or use Ajax to track deleted content, the depressingly reductionist conclusion looks like self-fulfilling prophecy.