Link tags: chatgpt

12

A Coder Considers the Waning Days of the Craft | The New Yorker

GPT-4 is impressive, but a layperson can’t wield it the way a programmer can. I still feel secure in my profession. In fact, I feel somewhat more secure than before. As software gets easier to make, it’ll proliferate; programmers will be tasked with its design, its configuration, and its maintenance. And though I’ve always found the fiddly parts of programming the most calming, and the most essential, I’m not especially good at them. I’ve failed many classic coding interview tests of the kind you find at Big Tech companies. The thing I’m relatively good at is knowing what’s worth building, what users like, how to communicate both technically and humanely. A friend of mine has called this A.I. moment “the revenge of the so-so programmer.” As coding per se begins to matter less, maybe softer skills will shine.

Robots.txt - Jim Nielsen’s Blog

I realized why I hadn’t yet added any rules to my robots.txt: I have zero faith in it.

Documentation for GPTBot - OpenAI API

Now that the horse has bolted—and ransacked the web—you can shut the barn door:

To disallow GPTBot to access your site you can add the GPTBot to your site’s robots.txt:

User-agent: GPTBot
Disallow: /

In new AI hype frenzy, tech is applying the label to everything now

Today’s AI promoters are trying to have it both ways: They insist that AI is crossing a profound boundary into untrodden territory with unfathomable risks. But they also define AI so broadly as to include almost any large-scale, statistically-driven computer program.

Under this definition, everything from the Google search engine to the iPhone’s face-recognition unlocking tool to the Facebook newsfeed algorithm is already “AI-driven” — and has been for years.

Will GPT models choke on their own exhaust? | Light Blue Touchpaper

There’s a general consensus that large language models are going to get better and better. But what if this as good as it gets …before the snake eats its own tail?

The tails of the original content distribution disappear. Within a few generations, text becomes garbage, as Gaussian distributions converge and may even become delta functions. We call this effect model collapse.

Just as we’ve strewn the oceans with plastic trash and filled the atmosphere with carbon dioxide, so we’re about to fill the Internet with blah. This will make it harder to train newer models by scraping the web, giving an advantage to firms which already did that, or which control access to human interfaces at scale.

Today’s AI is unreasonable - Anil Dash

Today’s highly-hyped generative AI systems (most famously OpenAI) are designed to generate bullshit by design. To be clear, bullshit can sometimes be useful, and even accidentally correct, but that doesn’t keep it from being bullshit. Worse, these systems are not meant to generate consistent bullshit — you can get different bullshit answers from the same prompts. You can put garbage in and get… bullshit out, but the same quality bullshit that you get from non-garbage inputs! And enthusiasts are current mistaking the fact that the bullshit is consistently wrapped in the same envelope as meaning that the bullshit inside is consistent, laundering the unreasonable-ness into appearing reasonable.

“Artificial Intelligence & Humanity,” an article by Dan Mall

AI is great anything quantity-related and bad and anything quality-related.

Sensible thinking from Dan here, that mirrors what we’re thinking at Clearleft.

In other words, it leans heavily on averages; the closer the training data matches an average, the higher degree of confidence that the result is more “correct,” or at least desirable.

The problem is that this is the polar opposite of what we consider creativity to be. Creativity isn’t about averages. It’s about the outliers, sometimes the one thing that’s different than all the rest.

ChatGPT is not ‘artificial intelligence.’ It’s theft. | America Magazine

But in calling these programs “artificial intelligence” we grant them a claim to authorship that is simply untrue. Each of those tokens used by programs like ChatGPT—the “language” in their “large language model”—represents a tiny, tiny piece of material that someone else created. And those authors are not credited for it, paid for it or asked permission for its use. In a sense, these machine-learning bots are actually the most advanced form of a chop shop: They steal material from creators (that is, they use it without permission), cut that material into parts so small that no one can trace them and then repurpose them to form new products.

We need to tell people ChatGPT will lie to them, not debate linguistics

There’s a time for linguistics, and there’s a time for grabbing the general public by the shoulders and shouting “It lies! The computer lies to you! Don’t trust anything it says!”

Why ChatGPT Won’t Replace Coders Just Yet

I’ve been using Copilot for over a year now, and this is more or less how I use it: To help me quickly blast through boilerplate code so I can more quickly get to the tricky bits.

There’s a more subtle problem with ChatGPT’s code generation, which is that it suffers from ChatGPT’s general “bullshit” problem.

ongoing by Tim Bray · The LLM Problem

It doesn’t bother me much that bleeding-edge ML technology sometimes gets things wrong. It bothers me a lot when it gives no warnings, cites no sources, and provides no confidence interval.

Yes! Like I said:

Expose the wires. Show the workings-out.

ChatGPT Is a Blurry JPEG of the Web | The New Yorker

A very astute framing by Ted Chiang—large language models as a form of lossy compression for text.

When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression.

A lot of uses have been proposed for large language models. Thinking about them as blurry JPEGs offers a way to evaluate what they might or might not be well suited for.