Blind prompting is not prompt engineering, StableLM, Bark, the anatomy of autonomy, HealthGPT, WebGPT & more

Changelog News

Developer news worth your attention

What up, nerds! 👋

Jerod here with your weekly dose of news-y goodness. By the way, one of my favorite things about our new email format is we no longer proxy links through email.changelog.com! That’s awesome for two reasons:

  1. 📈 Privacy: we have no idea which links you’re clicking on
  2. 📈 UX: you can hover on a link to see where it’s gonna take you

If you appreciate direct links as much as I do, please tell your friends! Oh, and don’t forget you can listen to this issue’s companion audio right here. 🎧

Ok, let’s get into the news.


The dataset wars are heating up

The NY Times reports that Reddit will begin charging for access to its API. They appear to be following Twitter’s playbook here, only with much better tactics. They won’t be charging small-time researchers or indie bot/app developers. It’s companies like Google and OpenAI who want the data to power their machine learning projects who will have to pony up.

(Stack Overflow is also getting in on that action.)

Bark is a transformer-based text-to-audio model

The team at Suno.ai is helping change the game in text-to-speech realism. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. It can also laugh, sigh, cry & make other non-word sounds that people make.

The examples are very impressive.

Kent Beck needs to recalibrate

In a tweet that blew up, extreme programming creator Kent Beck proclaimed:

I’ve been reluctant to try ChatGPT. Today I got over that reluctance. Now I understand why I was reluctant.

The value of 90% of my skills just dropped to $0. The leverage for the remaining 10% went up 1000x. I need to recalibrate.

He expands on that statement in a full-on blog post by telling the story of his “a-ha” moment. If you’re hoping for a scientific explanation of the 90/10 split and which remaining skills get the 1000x boost… don’t:

…why did I conclude that 90% of my skills had become (economically) worthless? I’m extrapolating wildly from a couple of experiences, which is what I do.

StableLM (from the team behind Stable Diffusion)

On April 19th, Stability AI released a new open source language model they’re calling StableLM. It’s currently available in 3 billion and 7 billion parameter sizes, with 15 billion to 65 billion parameter models to follow.

This model is usable (and adaptable) for both commercial and research purposes. But that’s not all:

We are also releasing a set of research models that are instruction fine-tuned. Initially, these fine-tuned models will use a combination of five recent open-source datasets for conversational agents: Alpaca, GPT4All, Dolly, ShareGPT, and HH. These fine-tuned models are intended for research use only and are released under a noncommercial CC BY-NC-SA 4.0 license, in-line with Stanford’s Alpaca license.

I love how much the open(ish) AI advancements build and feed off one another. More like this, please!

Code coverage insights in your stack traces

Big thanks to Sentry for sponsoring this week’s Changelog News! 💰

For those of you who are new to the idea of code coverage, it’s a testing technique that tells you which code IS or IS NOT tested. It’s often represented as a percentage: the number of lines of code that are tested versus the number of lines in the entire codebase.
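If the percentage idea feels abstract, here is a minimal sketch of the arithmetic in Python. The function name, the line numbers and the "empty file counts as 100%" convention are all made up for illustration; real coverage tools collect this data for you and may count things differently (branches, statements, and so on).

```python
def line_coverage(executed_lines: set[int], executable_lines: set[int]) -> float:
    """Return the percentage of executable lines that the tests actually ran."""
    if not executable_lines:
        # Nothing to cover. Treating that as 100% is a convention chosen for this sketch.
        return 100.0
    # Ignore any executed line numbers that are not executable lines of this file.
    covered = executed_lines & executable_lines
    return 100.0 * len(covered) / len(executable_lines)


# Hypothetical file with 8 executable lines, of which the test suite ran 6.
executable = {1, 2, 3, 5, 6, 8, 9, 10}
executed = {1, 2, 3, 5, 6, 8}

print(f"{line_coverage(executed, executable):.1f}% covered")  # 75.0% covered
print("untested lines:", sorted(executable - executed))       # untested lines: [9, 10]
```

The second print is the interesting part: the percentage says how much is tested, but the list of untested lines says where to look.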

Instead of writing tests with little to no visibility into whether they give you meaningful coverage for a given change, you can use Sentry’s integration with Codecov to see the untested code causing errors directly in the Sentry Issue stack trace. That means no more time wasted analyzing your codebase to find out where you need test coverage.

Here’s a quote from Alex Nathanail, Director of Technology at Vectare:

Improving our test coverage has meant dedicating a “coverage week” every few months to improve tests in places I think are important. With the Sentry and Codecov integration, I no longer have to analyze our codebase and spend cycles thinking about where we need test coverage, instead, Sentry just tells me exactly where I need to focus - saving me several weeks out of my year and reducing my time spent on building test coverage by nearly 50%.

Learn more about this integration with Sentry and Codecov and how to set it up.

Blind prompting is not prompt engineering

Mitchell Hashimoto weighs in on prompt engineering:

A lot of people who claim to be doing prompt engineering today are actually just blind prompting. “Blind Prompting” is a term I am using to describe the method of creating prompts with a crude trial-and-error approach paired with minimal or no testing and a very surface level knowledge of prompting. Blind prompting is not prompt engineering.

In this blog post, Mitchell makes the argument that prompt engineering is a real skill that can be developed based on real experimental methodologies. He uses a realistic example to walk through the process of prompt engineering a solution to a problem that provides practical value to an application.


💪 Cool projects that have nothing to do with AI

Articles that almost made the audio edition

1️⃣ Anima Omnium thinks we should keep stuff linkable in our web writings, so much so that they wrote a Python script (aptly named: Linkoln) to help find links for posts written in Markdown.

To limit the scope of this prototype, Linkoln just uses a full-web search engine. In the future, I plan to run it against my browser history, a database of articles I have saved, blogs I trust, and so on, only resorting to the public web as a last resort.

2️⃣ swyx is at it again with a great post on the anatomy of autonomy.

The Anatomy of Autonomy

Are AI agents the next killer app after ChatGPT? It seems likely at this point. There’s a lot of money to be made by whoever brings these abilities to the masses. It’s like giving people IFTTT, but not making them think about the IF, the THIS, or the THEN. Just give your AI agent the THAT and let it worry about the rest.

3️⃣ Jim Nielsen doing what he does so well: consuming another developer’s content and writing up a summary with his own thoughts mingled in. This time, Jim watched Peter Van Hardenberg’s Local-first Software and named his summary post, Offline Is Just Online With Extreme Latency.

4️⃣ Justin Searls with some timely (ok maybe slightly tardy) hiring advice:

Missing in the coverage about the economic slowdown and ensuing tech layoffs is that this was all predictable years ago. It didn’t have to be this way.

Many of these layoffs never had to happen, because a huge number of the roles being eliminated never made sense as long-term, full-time positions to begin with… if you staff full-timers to meet the peak of your company’s demand for engineering capacity, be prepared for when that demand dips. Because it always dips.


🦾 Cool projects that have everything to do with AI

  • HealthGPT lets iOS users interact with their health data using natural language
  • Izzy Miller replaced his best friends with an LLM trained on 500k group chat messages
  • WebGPT is an implementation of GPT inference in less than 2k lines of vanilla JS using the new WebGPU API
  • Speaking of WebGPU… Ben Schmidt posted this excellent ELI5 thread on why it’s more important than you might think

🎧 ICYMI: Good pods that are ready for your ear holes

  • Yo! Zach Latta joined The Changelog to tell us all about Hack Club, the program he wished he had in high school
  • Hugging Face is back on Practical AI. This time Rajiv Shah schools Daniel & Chris on LLM capabilities / options, in-context learning, reasoning, related tooling & the rapidly growing data science community on TikTok
  • We’re gearing up for another Frontend Feud! In the meantime, listen to the last battle when the CSS Podcast defended their title against the guys from @keyframers
  • Did you know I was on Robby Russell’s Maintainable podcast a while back? We discussed clarity over cleverness, the value of an automated test suite, what to consider when pulling in third-party code & more. Still relevant stuff!

🤩 Devs just wanna have fun (sometimes)

  • The 13 ugliest phones in the Mobile Phone Museum’s collection of over 2600 models
  • Scroll up to ride this Space Elevator from Earth’s surface to the moon
  • A synthwave-styled space shooter in JS, inspired by ’80s arcades (Play | Source)

🎥 Videos worth watching


SHOUT OUT to our newest Changelog++ supporters: Jordan D, Brian H, Wolodja W, Liam J, Jack R, Richard H, Aaron R, Enmanual R, David T, John S, Richard B, Matthew OD, Joe R, Max E, Carl J & Anthony RJ! 💚

Oh, and heads up to all our ++ members: you’ll be receiving a special email from me later this week 📥


That’s the news for now!

On this week’s Changelog interview episode, Adam sits down with Andrew Klein from Backblaze to chat hard drive reliability and how they manage more than 250,000 hard drives.

Have a great week, forward this to a friend if you dig it, and I’ll talk to you again next time. 🫡

–Jerod