Journal tags: files

Directory enquiries

I was talking to someone recently about a forgotten battle in the history of the early web. It was a battle between search engines and directories.

These days, when the history of the web is told, a whole bunch of services get lumped into the category of “competitors who lost to Google search”: AltaVista, Lycos, Ask Jeeves, Yahoo.

But Yahoo wasn’t a search engine, at least not in the same way that Google was. Yahoo was a directory with a search interface on top. You could find what you were looking for by typing or you could zero in on what you were looking for by drilling down through a directory structure.

Yahoo wasn’t the only directory. DMOZ was an open-source competitor. You can still experience it at DMOZlive.com:

The official DMOZ.com site was closed by AOL on February 17th 2017. DMOZ Live is committed to continuing to make the DMOZ Internet Directory available on the Internet.

Search engines put their money on computation, or to use today’s parlance, algorithms (or if you’re really shameless, AI). Directories put their money on humans. Good ol’ information architecture.

It turned out that computation scaled faster than humans. Search won out over directories.

Now an entire generation has been raised in the aftermath of this battle. Monica Chin wrote about how this generation views the world of information:

Catherine Garland, an astrophysicist, started seeing the problem in 2017. She was teaching an engineering course, and her students were using simulation software to model turbines for jet engines. She’d laid out the assignment clearly, but student after student was calling her over for help. They were all getting the same error message: The program couldn’t find their files.

Garland thought it would be an easy fix. She asked each student where they’d saved their project. Could they be on the desktop? Perhaps in the shared drive? But over and over, she was met with confusion. “What are you talking about?” multiple students inquired. Not only did they not know where their files were saved — they didn’t understand the question.

Gradually, Garland came to the same realization that many of her fellow educators have reached in the past four years: the concept of file folders and directories, essential to previous generations’ understanding of computers, is gibberish to many modern students.

Dr. Saavik Ford confirms:

We are finding a persistent issue with getting (undergrad, new to research) students to understand that a file/directory structure exists, and how it works. After a debrief meeting today we realized it’s at least partly generational.

We live in a world ordered only by search:

While some are quite adept at using labels, tags, and folders to manage their emails, others will claim that there’s no need to do because you can easily search for whatever you happen to need. Save it all and search for what you want to find. This is, roughly speaking, the hot mess approach to information management. And it appears to arise both because search makes it a good-enough approach to take and because the scale of information we’re trying to manage makes it feel impossible to do otherwise. Who’s got the time or patience?

There are still hold-outs. You can prise files from Scott Jenson’s cold dead hands.

More recently, Linus Lee points out what we’ve lost by giving up on directory structures:

Humans are much better at choosing between a few options than conjuring an answer from scratch. We’re also much better at incrementally approaching the right answer by pointing towards the right direction than nailing the right search term from the beginning. When it’s possible to take a “type in a query” kind of interface and make it more incrementally explorable, I think it’s almost always going to produce a more intuitive and powerful interface.

Directory structures still make sense to me (because I’m old) but I don’t have a problem with search. I do have a problem with systems that try to force me to search when I want to drill down into folders.

I have no idea what Google Drive and Dropbox are doing but I don’t like it. They make me feel like the opposite of a power user. Trying to find a file using their interfaces makes me feel like I’m trying to get a printer to work. Randomly press things until something happens.

Anyway. Enough fist-shaking from me. I’m going to ponder Linus’s closing words. Maybe defaulting to a search interface is a cop-out:

Text search boxes are easy to design and easy to add to apps. But I think their ease on developers may be leading us to ignore potential interface ideas that could let us discover better ideas, faster.

Downloading from Google Fonts

If you’re using web fonts, there are good performance (and privacy) reasons for hosting your own font files. And fortunately, Google Fonts gives you that option. There’s a “Download family” button on every specimen page.

But if you go ahead and download a font family from Google Fonts, you’ll notice something a bit odd. The .zip file only contains .ttf files. You can serve those on the web, but it’s far from the best choice. Woff2 is far leaner in file size.

This means you need to manually convert the downloaded .ttf files into .woff or .woff2 files using something like Font Squirrel’s generator. That’s fine, but I’m curious as to why this step is necessary. Why doesn’t Google Fonts provide .woff or .woff2 files in the downloaded folder? After all, if you choose to use Google Fonts as a third-party hosting service for your fonts, it most definitely serves up the appropriate file formats.

I thought maybe it was something to do with the licensing. Maybe some licenses only allow for unmodified truetype files to be distributed? But I’ve looked at fonts with different licenses—some have Apache 2 licensing, some have Open Font licensing—and they’re all quite permissive and definitely allow for modification.

Maybe the thinking is that, if you’re hosting your own font files, then you know what you’re doing and you should be able to do your own file conversion and subsetting. But I’ve come across more than one website in the wild serving up .ttf files. And who can blame them? They want to host their own font files. They downloaded those files from Google Fonts. Why shouldn’t they assume that they’re good to go?

It’s all a bit strange. If anyone knows why Google Fonts only provides .ttf files for download, please let me know. In a pinch, I will also accept rampant speculation.

Trys also pointed out some weird default behaviour if you do let Google Fonts do the hosting for you. Specifically if it’s a variable font. Let’s say it’s a font with weight as a variable axis. You specify in advance which weights you’ll be using, and then it generates separate font files to serve for each different weight.

Doesn’t that defeat the whole point of using a variable font? I mean, I can see how it could result in smaller file sizes if you’re just using one or two weights, but isn’t half the fun of having a weight axis that you can go crazy with as many weights as you want and it’s all still one font file?

Like I said, it’s all very strange.

Optimise without a face

I’ve been playing around with the newly-released Squoosh, the spiritual successor to Jake’s SVGOMG. You can drag images into the browser window, and eyeball the changes that any optimisations might make.

On a project that Cassie is working on, it worked really well for optimising some JPEGs. But there were a few images that would require a bit more fine-grained control of the optimisations. Specifically, pictures with human faces in them.

I’ve written about this before. If there’s a human face in an image, I open that image in a graphics editing tool like Photoshop, select everything but the face, and add a bit of blur. Because humans are hard-wired to focus on faces, we’ll notice any jaggy artifacts on a face, but we’re far less likely to notice jagginess in background imagery: walls, materials, clothing, etc.

On the face of it (hah!), a browser-based tool like Squoosh wouldn’t be able to optimise for faces, but then Cassie pointed out something really interesting…

When we were both at FFConf on Friday, there was a great talk by Eleanor Haproff on machine learning with JavaScript. It turns out there are plenty of smart toolkits out there, and one of them is facial recognition. So I wonder if it’s possible to build an in-browser tool with this workflow:

  • Drag or upload an image into the browser window,
  • A facial recognition algorithm finds any faces in the image,
  • Those portions of the image remain crisp,
  • The rest of the image gets a slight blur,
  • Download the optimised image.

Maybe the selecting/blurring part would need canvas? I don’t know.
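For what it’s worth, here’s a rough sketch of how the blurring step might look with canvas. It’s only a guess at an approach, not a tested tool. The function name blurExceptFaces is made up, and so is the shape of the faces array: I’m assuming the face detection step (not shown) hands back plain { x, y, width, height } boxes in image pixels. I’m also assuming image.width and image.height are the image’s real pixel dimensions, and that the browser supports the canvas filter property.

```javascript
// Hypothetical helper: blur a whole image, then restore the detected faces.
// `faces` is an assumed array of { x, y, width, height } boxes from whichever
// face detection toolkit you choose; the detection step is not shown here.
function blurExceptFaces(image, canvas, faces, blurPx = 2) {
  // Match the canvas to the image. Resizing also resets the canvas state.
  canvas.width = image.width;
  canvas.height = image.height;
  const ctx = canvas.getContext('2d');

  // First pass: draw the entire image with a slight blur.
  ctx.filter = `blur(${blurPx}px)`;
  ctx.drawImage(image, 0, 0);

  // Second pass: copy each face region straight from the source, unblurred.
  ctx.filter = 'none';
  for (const { x, y, width, height } of faces) {
    ctx.drawImage(image, x, y, width, height, x, y, width, height);
  }
  return canvas;
}
```

The idea is to draw twice: once blurred, then once more for just the face rectangles so they stay crisp. The resulting canvas could then be exported (with canvas.toBlob, say) and run through the usual compression.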

Anyway, I thought this was a brilliant bit of synthesis from Cassie, and now I’ve got two questions:

  1. Does this exist yet? And, if not,
  2. Does anyone want to try building it?

Detecting image requests in service workers

In Going Offline, I dive into the many different ways you can use a service worker to handle requests. You can filter by the URL, for example, treating requests for pages under /blog or /articles differently from other requests. Or you can filter by file type. That way, you can treat requests for, say, images very differently to requests for HTML pages.

One of the ways to check what kind of request you’re dealing with is to see what’s in the accept header. Here’s how I show the test for HTML pages:

if (request.headers.get('Accept').includes('text/html')) {
    // Handle your page requests here.
}

So, logically enough, I show the same technique for detecting image requests:

if (request.headers.get('Accept').includes('image')) {
    // Handle your image requests here.
}

That should catch any files that have image in the request’s accept header, like image/png or image/jpeg or image/svg+xml and so on.

But there’s a problem. Both Safari and Firefox now use a much broader accept header: */*

My if statement evaluates to false in those browsers. Sebastian Eberlein wrote about his workaround for this issue, which involves looking at file extensions instead:

if (request.url.match(/\.(jpe?g|png|gif|svg)$/)) {
    // Handle your image requests here.
}

So consider this post a patch for chapter five of Going Offline (page 68 specifically). Wherever you see:

if (request.headers.get('Accept').includes('image'))

Swap it out for:

if (request.url.match(/\.(jpe?g|png|gif|svg)$/))

And feel free to add any other image file extensions (like webp) in there too.
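If you’d rather not choose between the two checks, here’s a sketch that tries the accept header first and falls back to the file extension. This isn’t from the book. The helper name isImageRequest is made up, and I’ve made a couple of assumptions: the header only counts as a match when it starts with image/ (some browsers also list image types in the accept header for page requests), and the extension is tested against the URL’s pathname so a query string doesn’t throw it off.

```javascript
// Hypothetical helper; the name and the exact rules are assumptions.
const IMAGE_EXTENSIONS = /\.(jpe?g|png|gif|svg|webp)$/i;

function isImageRequest(request) {
  // Some browsers send image types first in the Accept header.
  const accept = request.headers.get('Accept') || '';
  if (accept.startsWith('image/')) {
    return true;
  }
  // Others send */*, so fall back to the file extension.
  // Testing only the pathname ignores any query string or fragment.
  const { pathname } = new URL(request.url);
  return IMAGE_EXTENSIONS.test(pathname);
}
```

Inside a fetch event handler, you’d call it with the event’s request: if (isImageRequest(request)) and then handle your image requests there.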

Re-finding five numbers

So, remember when I posted all those episodes of Simon Singh’s Five Numbers radio series on Pownce so that they’d have permanent URLs? Yeah, well, so much for that.

Fortunately Brian had saved all the MP3s. I’ve posted them on S3 and huffduffed them all. I can be fairly confident that Huffduffer won’t be going the way of Pownce, Magnolia, Geocities, and so many more.

Anyway, if you want to listen to the fifteen episodes of the three radio series on mathematics, you can subscribe to the podcast at https://huffduffer.com/adactio/tags/five+numbers/rss.

Or you can listen to each episode at these permanent URLs:

  1. Five Numbers

    1. A Countdown to Zero
    2. Simple as Pi
    3. The Golden Ratio
    4. The Imaginary Number
    5. Infinity
  2. Another Five Numbers

    1. The Number Four
    2. The Number Seven
    3. The Largest Prime Number
    4. Kepler’s Conjecture
    5. Game Theory
  3. A Further Five Numbers

    1. 1 — The Most Popular Number
    2. 2 — At The Double
    3. 6 Degrees of Separation
    4. 6.67 x 10^-11 – The Number That Defines the Universe
    5. 1729 — The First Taxicab Number

Finding five numbers

I like Tumblr. I like Pownce. They both make it very quick and easy to post discrete quanta of information. I use Pownce for posting audio files and links to videos. I use Tumblr to post quotations. But both services suffer from the same problem: refindability.

Magnolia and Delicious encourage tagging. Those tags can then surface some pretty interesting aggregate behaviour but first and foremost, they’re useful for the individual doing the tagging. It’s pretty easy for me to track down something I bookmarked on Magnolia even if it was quite a while back. I don’t need to keep a list of all the tags I’ve ever used: I just need to search for a word that I think I might have used when I was tagging a bookmark. While it would be very difficult for me to try to second-guess how someone else might describe something, it’s usually pretty easy to put myself in the shoes of my past self.

As my store of data on Pownce and Tumblr increases, I’m starting to miss tagging (or any kind of search) more and more. Then again, I can understand why both services would resist that kind of scope creep. Both services rely on their simplicity. Adding another field to fill in could potentially be a road block between the user and the task they want to accomplish (although it doesn’t feel that way with Delicious or Magnolia). Update: it turns out that you can tag in Tumblr but it’s hidden behind the “advanced” link. Thanks to Keith Bell for pointing that out.

Here’s a case in point. Over time I’ve been posting MP3 files to Pownce of a series of radio programmes by Simon Singh, author of The Code Book — a superb piece of work. The audio from the radio programmes is available from the BBC website but only in Real Audio which, let’s face it, is complete pants. I originally got the MP3 files from Brian but after a catastrophic hard drive crash, I realised that it would be better to store them at an addressable URL. Besides, I wanted to geek out with my mathematically-minded friends. Pownce’s raison d’être is sharing stuff with friends so it seemed like the perfect home for the Singh files.

But without any kind of tagging or search, there’s no easy way for me or anyone else to revisit just those files at a later date. As a temporary patch, I’m listing the URLs for the Pownce posts that correspond to each episode. If you want to download the files, you’ll need to log in to Pownce.

  1. Five Numbers

    1. A Countdown to Zero
    2. Simple as Pi
    3. The Golden Ratio
    4. The Imaginary Number
    5. Infinity
  2. Another Five Numbers

    1. The Number Four
    2. The Number Seven
    3. The Largest Prime Number
    4. Kepler’s Conjecture
    5. Game Theory
  3. A Further Five Numbers

    1. 1 — The Most Popular Number!
    2. 2 — At the Double
    3. 6 Degrees of Separation
    4. 6.67 x 10^-11 – The Number that Defines the Universe
    5. 1729 — The First Taxicab Number

Pownce

The latest social networking app du jour is called Pownce. Like most people, I signed up a few days ago and started playing around.

If you read the 140 character reviews of Pownce on Twitter, you’d be forgiven for thinking that Pownce is some kind of Twitter clone. Here, for example, is the collected wisdom of Paul Boag:

Just dont get pownce. Just feels like Twitter but i need to invite all my friends again

Not sure I can be bothered to update both twitter and pownce. Might have to make a decision soon.

It’s understandable, I suppose. Pownce lets you send little updates… just like Twitter. You can share links… just like Del.icio.us. You can share events… just like Upcoming. So comparing Pownce to any of these services makes a certain kind of sense. But I am reminded of the story of the blind men and the elephant. It seems that many of my own friends are displaying a disappointing lack of imagination by only comparing Pownce to what they already know.

The key feature of Pownce is the ability to share files. If you read the about page, the service is defined in a nutshell:

Pownce is a way to send stuff to your friends.

Stuff + friends. And like all the best apps, it was built to scratch an itch:

Pownce is brought to you by a bunch of geeks who were frustrated trying to send stuff from one cube to another.

If you want to compare it to anything, Dropsend feels like the closest competitor. Pownce is a pain-free way of sharing music, video and images amongst a discrete group of people.

And that’s the other key point: groups of people. It’s no coincidence that this app has support for groups built in from the start. The combination of file sharing with groups could potentially make it a killer app. It could be a social app like Twitter or whatever, but I think it could just as easily be a productivity app, more akin to something from 37 Signals.

Here’s an example: I’ve got everyone in the Clearleft office signed up. Each of us can have as many friends as we want but as long as we each have a Clearleft group, we can share files, links, events and notes with one another.

I’ve also created a Britpack group. If enough of my fellow Illuminati sign up, I can share stuff privately with them—something I can’t do on the mailing list because it quite rightly strips out attachments.

Another potential use would be for my band, Salter Cane. Emailing songs around is a royal pain. Being able to share MP3 files with an addressable but private URL could be really handy.

Far from being another Twitter or Jaiku, Pownce is a completely different part of the ecosystem of the social web.

I still plan to put public events on Upcoming and videos on YouTube, Viddler, Vimeo or wherever. But for that space between private and public, when I want to share something with a certain number of people, Pownce sure beats CCing a bunch of email addresses.

There’s another unspoken advantage that Pownce has over other social uploading sites like YouTube. If you’re sharing a file that might be slightly bending the law around license agreements or copyright, the ability to restrict the circulation could save everyone a lot of hassle. What the RIAA and MPAA don’t know won’t hurt ‘em.

The utility of Pownce isn’t the only reason I like it. It’s also really nicely designed. I don’t just mean the visual design—which is lovely, thanks to Daniel. The interaction design is well thought-out.

This is a surprisingly full-featured app considering that just four people put it together. There was just one full-time programmer for the website: Leah Culver. In spite of that, the site has launched (still in Alpha) with a whole bunch of features. The notifications and privacy settings, for example, are really nicely done. There’s also a nice “friends of friends” feature to help you track down people you might know.

Oh, and it’s got one of the best 404 pages ever.

Under the hood, everything has been put together with Django with storage handled by Amazon’s Simple Storage Service. If you peek into the markup, you’ll also find a bunch of nice microformats.

There’s also a desktop app for the service. It’s built using AIR née Apollo. It’s pretty slick and frankly, seeing an independent product like this is going to be far more likely to convince me of the benefits of the platform than any product demo from Adobe.

There are a whole bunch of other little things that I like about Pownce that add to its personality, like the gender options in the profile form or the ability to choose themes, but I’ll stop going on about it. The key thing is that I can see this service filling a need through the combination of groups + file sharing.

If you’ve tried Pownce and come away feeling that it’s just like Twitter, you’re doing it wrong.