Search results for PALADINMINING.COM💯tesla token

blog.npmjs.org

npm token scanning extending to GitHub

The npm team is collaborating with GitHub on a new service that will automatically check for tokens that might have been accidentally pushed up to a repository and then automatically revoke them if they are valid. This will help to quickly mitigate attack vectors that might arise from the accidental oversharing of credentials for projects. From the post:

Whenever you commit or push a change to GitHub in a public repository and an npm token is found in the change, it is sent to npm for validation. If it’s valid, we will revoke it and notify the maintainer of this action via email.
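
To make the scanning idea concrete, here is a minimal sketch of the detection step only. It assumes tokens look like `npm_` followed by 36 alphanumeric characters; the post does not describe the actual detection rules, and `find_candidate_tokens` is a hypothetical name. Validation and revocation happen on npm's side and are not shown.

```python
import re

# Assumption: tokens look like "npm_" followed by 36 alphanumeric characters.
# The post does not describe the real detection rules.
NPM_TOKEN_PATTERN = re.compile(r"\bnpm_[A-Za-z0-9]{36}\b")


def find_candidate_tokens(diff_text: str) -> list[str]:
    """Return token-like strings found on added lines of a unified diff."""
    found = []
    for line in diff_text.splitlines():
        # Only lines added by the change matter; skip the "+++" file header.
        if line.startswith("+") and not line.startswith("+++"):
            found.extend(NPM_TOKEN_PATTERN.findall(line))
    return found


if __name__ == "__main__":
    fake_token = "npm_" + "a1B2" * 9  # made-up value, not a real credential
    diff = "\n".join([
        "+++ b/.npmrc",
        "+//registry.npmjs.org/:_authToken=" + fake_token,
        "-old line",
    ])
    print(find_candidate_tokens(diff))
```

A match here is only a candidate: per the post, the token is sent to npm for validation and revoked only if it turns out to be valid.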

appleinsider.com

How Apple could kill CAPTCHAs

AppleInsider explains Apple’s new Private Access Tokens (PAT) tech announced at WWDC:

Using a new HTTP authentication method called PrivateToken, a server uses cryptography to verify a client passed an iCloud attestation check.

When the client needs a token it contacts an attester — in this case, Apple — which performs the process using certificates stored in the device’s Secure Enclave.

I’ve been waiting for someone to kill CAPTCHAs for us, but this will be an Apple-only solution for now:

The company is working to help make Private Access Tokens a web standard, but there is no mention of tokens working on Android or Windows. People on those platforms may have to put up with CAPTCHAs, for now — or wait for Microsoft’s and Google’s work on the matter.

I believe this is the draft of the standard that they’re referring to. Cloudflare also has a nice article on their work in this space.

github.com

Chain-bench – a tool for auditing your software supply chain

Chain-bench is an open source tool for auditing your software supply chain stack for security compliance based on a new CIS Software Supply Chain benchmark.

You can run the tool from a CLI, assuming your code is hosted on GitHub (more SCM hosts coming soon):

chain-bench scan --repository-url <REPOSITORY_URL> --access-token <TOKEN> -o <OUTPUT_PATH>

I couldn’t find a comprehensive list of what checks are in the benchmark, but it appears they are referring to this guide. You can see what an example run’s results look like in the README.

Medium (via Scribe)

Something new is brewing

Max Howell, creator of Homebrew, has gone back to his notes on brew2 to apply web3 concepts to help “distribute value to open source.” He’s calling this new brew tea.

Tools like Homebrew lie beneath all development tools, assisting developers to actually get development done. We know the graph of all open source, which means we’re uniquely placed to innovate in interesting and exciting ways. This is exactly what tea will do. We’re taking our knowledge of how to make development more efficient and throwing innovations nobody has ever really considered before.

With plans to move the package registry on-chain, Max lays out the numerous benefits due to “inherent benefits of blockchain technology”:

Packages will be immutable (no more left-pad incidents)
Packages will always be available (we’ll use decentralized storage)
Releases will be signed by the maintainers themselves (rather than a middleman you are told you can trust)
Tools can be built to fundamentally verify the integrity of your app’s open source constitution
Token can flow through the graph

Max says “token flowing is where things get really interesting,” and goes on to say “with our system people who care about the health of the open source ecosystem buy some token and stake it.”

(Thanks to Omri Gabay for sharing this first in our community Slack)

Ship It! #20

Kaizen! Five incidents later

This is our second Kaizen episode, where Adam, Jerod & Gerhard talk about changelog.com improvements since episode 10. OK, so Gerhard deleted the DNS API token. Not only did he take the time to understand how that happened, so that he could actually learn from his mistake, but now we have a system in place so that we can share learnings from incidents. By the way, these are publicly available in our #incidents Slack channel.

A great & unexpected thing that happened since we recorded this episode is Jerod fixing 99% of all the errors that were happening in prod. The top error was the broken Twitter auth - sorry Matt - which was a result of us upgrading to OTP 24 a few months back. Episode 3 show notes include a YouTube stream which captures it all.

We wrap up this episode by each of us sharing the improvements that we would like to do until our next Kaizen. You heard it from Adam first: Ship It Driven Development

Changelog Interviews #323

The road to Brave 1.0 and BAT

This week Adam and Jerod talk with Brian Bondy, Co-founder and CTO of Brave. They talked through the beginnings of Brave and how BAT (Basic Attention Token) could be driving the future of how we offer funding and tips to our favorite websites and content creators. Of course, they go deep into the historical and the technical details of the Brave browser and their march to Brave 1.0. The last segment of the show covers how BAT works, how it’s being used, and also their interesting spin on an ad model that respects the user’s privacy.

Bloomberg

A big crypto sell-off is happening for Bitcoin and Ether

Bloomberg is citing a sell-off of Bitcoin, Ether, and dozens of smaller digital tokens. The “crypto exodus” is happening due to a “sense of panic” hitting crypto investors. It’s been a brutal August for Bitcoin and Ether, with Bitcoin touching below $6,000.

“The big story in the market today is the huge weakness in Ethereum,” Timothy Tam, chief executive officer of CoinFi said in a phone interview — “Bitcoin has held up relatively well versus Ethereum. It’s still quite weak versus the U.S. dollar.”

While cryptocurrencies rallied in July on hopes that a Bitcoin-backed exchange-traded fund would attract new investors, U.S. regulators have yet to sign off on multiple proposals for such a product. The letdown has coincided with growing concern that entrepreneurs who raised crypto-denominated funds via initial coin offerings (ICO) are now cashing out of holdings such as Ether, the token for the Ethereum blockchain that is a popular platform for crypto projects.

What do you think? Are you selling, buying, or holding?

trivelop.de

Advanced techniques for architecting flow in Elixir

René Föhring shows how Plug.Conn and Ecto.Changeset are great examples of an advanced style of “flow” management where your program hands down a “token” (in practice, an Elixir struct) during execution.
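
The token style is not specific to Elixir. Below is a rough Python analogue, not code from the article: a frozen dataclass plays the role of the struct, and every name (`Token`, `validate_name`, `greet`, `run`) is hypothetical. Each step takes the token and returns a new one, and a `halted` flag stops the flow in the spirit of how Plug.Conn can be halted.

```python
from dataclasses import dataclass, field, replace
from typing import Callable


@dataclass(frozen=True)
class Token:
    """Hypothetical token: carries input, accumulated results, and errors."""
    params: dict
    assigns: dict = field(default_factory=dict)
    errors: tuple = ()
    halted: bool = False


Step = Callable[[Token], Token]


def validate_name(token: Token) -> Token:
    """Halt the flow with an error if the required input is missing."""
    if not token.params.get("name"):
        return replace(token, errors=token.errors + ("name is required",), halted=True)
    return token


def greet(token: Token) -> Token:
    """Add a result to the token without mutating the one passed in."""
    greeting = f"Hello, {token.params['name']}!"
    return replace(token, assigns={**token.assigns, "greeting": greeting})


def run(token: Token, steps: list[Step]) -> Token:
    """Hand the token down through each step, stopping once it is halted."""
    for step in steps:
        if token.halted:
            break
        token = step(token)
    return token
```

Calling `run(Token({"name": "Ada"}), [validate_name, greet])` returns a token whose `assigns` holds the greeting, while an empty `params` dict yields a halted token with the error recorded and no greeting.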

changelog.com/posts

travis – distributed CI for the Ruby community using Rails, WebSockets, and Redis

Berlin based Rubyist Sven Fuchs asks if Java-based Jenkins is the best CI tool for open source Ruby projects.

Sven writes:

Instead, imagine a simple and slim build server tool that is maintained by the Ruby community itself (just like Gemcutter is, or many other infrastructure/tool-level projects are) in order to support all the open-source Ruby projects/gems we’re using every day.

Instead of just imagining, Sven and others have been working toward that vision with Travis, an extremely alpha Rails project. Travis is a single-page application built in Rails and uses Backbone.js as a client-side MVC frontend.

How it works

By configuring a post-receive URL in your GitHub project settings, GitHub will ping Travis when new git commits are received. Travis will then schedule a build in Resque, a Redis queue. Travis then uses Websockets courtesy of PusherApp to update registered browsers on build status as it runs in the background.
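
The flow above (hook, queue, worker, browser updates) can be sketched in a few lines. This is an illustration only, not Travis's code: an in-memory deque stands in for the Resque/Redis queue, a list of callbacks stands in for browsers connected over WebSockets, and the payload keys `repository` and `after` are assumptions.

```python
from collections import deque

build_queue = deque()   # stand-in for the Resque/Redis queue
subscribers = []        # stand-in for browsers connected over WebSockets


def notify(event: dict) -> None:
    """Push a status event to every registered listener."""
    for callback in subscribers:
        callback(event)


def handle_post_receive(payload: dict) -> None:
    """Called when the code host pings the post-receive URL."""
    job = {"repo": payload["repository"], "commit": payload["after"]}
    build_queue.append(job)
    notify({"status": "queued", **job})


def work_one(run_build) -> None:
    """Pop one job, run it, and publish status updates along the way."""
    job = build_queue.popleft()
    notify({"status": "started", **job})
    passed = run_build(job)
    notify({"status": "passed" if passed else "failed", **job})
```

The point of the queue is that the web request returns as soon as the job is scheduled; the build itself runs later in a separate worker.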

Take a look at some of the projects getting built over at travis-ci.org, the project’s new home page, or check out Sven’s quick tour of Travis in this screencast:

Usage

Currently, the hosted edition of Travis is open to anyone with a GitHub account. Just sign in with GitHub. Once you’re in, grab your Travis build token and configure a post-receive URL in your GitHub project’s Service Hooks page:

http://[YOUR_GITHUB_USERNAME]:[YOUR_TRAVIS_TOKEN]@travis-ci.org/builds

Host Travis yourself

If you want to run your own instance, you’ll need to set up configuration settings:

$ cp config/travis.example.yml config/travis.yml

If you want to run on Heroku, you’ll need to set some ENV variables

$ rake heroku:config

If you’re running locally, you can start a worker with:

$ RAILS_ENV=production VERBOSE=true QUEUE=builds rake resque:work

… or if you’re using God:

$ cp config/resque.god.example config/resque.god
$ god -c config/resque.god

How you can help

Travis is in EARLY ALPHA. Sven and gang are looking for folks to help test, log issues, and submit patches. If you want to join the community, join the Google Group or hang out in #travis on IRC.

Special thanks

Sven and team would like to offer a special thanks to Pusher App for donating a Big Boy account for the project. If you’d like to pitch in with the compute side of the project, (we’re looking at you Heroku or Linode), please ping Sven.

[Source on GitHub] [Blog post] [Discuss on HN]

changelog.com/posts

GithubNotifier: Growl notifications for GitHub updates

In episode #35, Max wished for an app that would give him Growl alerts any time someone added a Homebrew formula to any Homebrew fork.

Well, Clint Shryock has created just such an app. Github Notifier is a simple menu bar application for OS X that listens for GitHub updates in any of your repos and then alerts you via Growl:

To install, download the latest release, drag it to your /Applications folder, and launch it from there. Add your GitHub login and API Token, and set your refresh interval.

[Source on GitHub]

Changelog Interviews #644

We're all Builders now

We’re on location at Microsoft Build 2025 with Amanda Silver, Corporate Vice President of Microsoft’s Developer Division. Amanda leads product, design, user research, and engineering systems for some of the tools you use every day. We discuss the latest AI announcements from Microsoft at Build 2025, how AI is reshaping development tools, what’s next for VS Code, TypeScript, GitHub’s evolution, and even emerging editors like Windsurf that are forking the VS Code ecosystem.

Matched from the episode's transcript 👇

Amanda Silver: Nowadays we have benchmarks for these SWE-agent kinds of models that are coming out. And so part of what we – there’s different kinds of techniques that you go through in terms of kind of getting the performance of the benchmark to be better, because you can optimize for different things. You can optimize for token consumption. So like “What’s the cheapest way to accomplish it?” You could optimize for performance. “How can I complete the job more quickly?” You could optimize for accuracy. So in a lot of senses, what you’re saying is actually not wrong. And I do think that over time, when we think about different competing agents that could actually go and fulfill your job, you could have ones that are experts in different types of tasks.

Changelog Interviews #643

The Web Development Engine

We’re joined by Andreas Møller, Co-founder of Nordcraft — the team behind Nordcraft Engine, a powerful new platform designed to give web developers what gaming developers have had for years. Andreas shares what inspired them to build Nordcraft Engine and why they believe the web is overdue for a shift in how we approach designing and building for it. We explore how the platform works, how you can get started, and what’s next for Nordcraft.

Matched from the episode's transcript 👇

Andreas Møller: So because most of authentication is a backend issue, we don’t handle most of it, because we don’t have a backend, we just work with whatever your backend is… Whether you coded your own, or whether you use like Supabase directly as a backend, or whatever you’re going with. There’s also some – like, Xano is one of that more low code/no code style backend; it works really well with that as well. So from our point of view, again, we’re frontend, so we’re just doing HTTP requests to a server. So what we’ve done around authentication is say “What matters for us is how we store the token, in essence.” Whatever strategy you have for authentication is possible.

What we sort of nudge people towards is storing your token in an HTTP-only cookie. Because that’s generally the most secure way of handling it. And what we’ve done is we have an API proxy. So whenever you have an API from the frontend, if you run it through our proxy, we can actually authenticate that request. So let’s say you need to send an authorization header with your user token to an API - that’s the most common approach - well, you can actually set up “Well, that’s what I want to send”, and you don’t actually have the token on the client, because that’s stored in an HTTP-only cookie from when you authenticate it, but we actually put that token in before that request goes to the server. And that way, even if you have some script injection attacks, etc. they can’t actually steal your token. That’s not going to fit for everyone. Sometimes if you want to do – especially if you’re doing like real time WebSockets and stuff, that’s not always an option, and you do need to have the token on the client. So whatever your authentication strategy is, is possible. And it sort of speaks to the general approach…

I sometimes say “We didn’t really invent that much.” And by that, I mean almost everything works the way you would expect it if you stopped looking at it as a visual tool, and started looking at it as a JavaScript framework. So when people ask “How do you do authentication in React?”, it’s sort of almost a half abstract question, because it’s like “Well, however you like.” And similar, “How do you store things on the client?” Well, there’s local storage, there’s session storage, there’s IndexedDB, there’s all these options… Because it’s a web browser, right? And we are a framework, and we just build visual tools on top of that. But we didn’t invent whole new abstracts. The web’s pretty good. The HTML, CSS - it’s the best we’ve come up with yet, I think… So we’re not reinventing it.

Obviously, it’s not running JavaScript, but it’s very much stealing everything it can in terms of function names and making sure that everything is very familiar to someone coming from that world. And again, all the browser APIs, they’re the same.
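
The proxy idea Andreas describes, where the token lives in an HTTP-only cookie and is attached to the outgoing API request on the server side, might look roughly like this. It is a sketch under assumptions, not Nordcraft's implementation: the cookie name `access_token`, the `Bearer` scheme, and the function name are all hypothetical.

```python
from http.cookies import SimpleCookie

# Hypothetical cookie name; the transcript does not name one.
TOKEN_COOKIE = "access_token"


def build_upstream_headers(incoming_headers: dict) -> dict:
    """Move the token from the Cookie header into an Authorization header."""
    cookies = SimpleCookie()
    cookies.load(incoming_headers.get("Cookie", ""))

    # Forward everything except the cookie itself.
    upstream = {k: v for k, v in incoming_headers.items() if k.lower() != "cookie"}
    if TOKEN_COOKIE in cookies:
        # Assumption: the upstream API expects a bearer token.
        upstream["Authorization"] = f"Bearer {cookies[TOKEN_COOKIE].value}"
    return upstream
```

Because the browser never exposes an HTTP-only cookie to page scripts, injected JavaScript has nothing to read; only the proxy ever sees the token value.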

Changelog Interviews #640

Building Zed's agentic editing

Nathan Sobo is back talking about the next big thing for Zed—agentic editing! You now have a full-blown AI-native editor to play with. Collaborate with agents at 120fps in a natively multiplayer IDE.

Matched from the episode's transcript 👇

Nathan Sobo: Yeah, that does seem true. What I will say about the networking thing is - yes, but it’s even harder… Because we have simulated random network latencies, right? As I was describing, you can actually in Rust build your own custom scheduler that you drive with a random number generator, and every single async part of your entire app can be scheduled by that, and you can randomize the order that things happen in. That does not help you with what is one token distinction difference going to do to the behavior of this crazy LLM system. Like, it’s fundamentally different. I don’t know… We can always sample. I could use a seeded random number generator to like sample off the logits on the back of the LLM… But is that even meaningful? You know what I mean? Like, the whole point is to be able to change the prompt, and that you can change the prompt and get such a diversity of different outputs is kind of like the point of it, I guess. Anyway…
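
The seeded-scheduler technique Nathan mentions can be illustrated in miniature. This is a Python toy rather than Zed's Rust scheduler, and the names `worker` and `run_randomized` are made up: generators stand in for async tasks, and a seeded random number generator decides which task runs next, so any interleaving can be replayed from its seed.

```python
import random


def worker(name: str, steps: int, log: list):
    """A cooperative task: each yield is a point where it can be preempted."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield


def run_randomized(seed: int) -> list:
    """Run the tasks to completion, letting a seeded RNG choose who goes next."""
    rng = random.Random(seed)
    log = []
    tasks = [worker("a", 3, log), worker("b", 3, log)]
    while tasks:
        task = rng.choice(tasks)
        try:
            next(task)
        except StopIteration:
            tasks.remove(task)
    return log
```

Running with many different seeds explores different orderings, and a failing seed reproduces its ordering exactly. As Nathan points out, that kind of replay says little about how an LLM reacts to a one-token change in its prompt.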

Changelog & Friends #86

Of agents & agency

Long-time JS Party panelist Amal Hussein joins Jerod to catch up on her career path, to opine on the viability of agentic coding, to feel all the feelings that AI brings out of us as developers, and to share something new in her life that changes everything.

Matched from the episode's transcript 👇

Amal Hussein: You got it. And what’s really cool about this is – we are a trust platform. We don’t look at your data, we actually – we use this tokenization workflow, which is we kind of bypass… We use our clients to connect directly to storage buckets. And so we generate a token, and we store a reference to that token in [unintelligible 00:34:45.29] But none of your data passes through our layer. And so hence the control plane/data plane that we set up.

Changelog Interviews #631

Antirez returns to Redis!

Antirez has returned to Redis! Yes, Salvatore Sanfilippo (aka Antirez), the creator of Redis has returned to Redis and he joined us to share the backstory on Redis, what’s going on with the tech and the company, the possible (likely) move back to open source via the AGPL license, the new possibilities of AI and vector embeddings in Redis, and some good ’ol LLM inference discussions.

Matched from the episode's transcript 👇

Jerod Santo: It’s only one token in English, you know? I don’t know about in Italian, but… It’s only one token in English. Okay, that’s cool. That’s very cool. I’m sure there’ll be countless other people with ideas on how they can leverage this to do cool stuff, whether it’s commercially, or healthily in their own time, to track their calories. I think it’s pretty sweet.

Break: [01:09:24.24]

Changelog Interviews #629

Programming with LLMs

For the past year, David Crawshaw has intentionally sought ways to use LLMs while programming, in order to learn about them. He now regularly uses LLMs while working and considers their benefits a net-positive on his productivity. David wrote down his experience, which we found both practical and insightful. Hopefully you will too!

Matched from the episode's transcript 👇

David Crawshaw: If you step back to like running llama.cpp yourself or something like this, you can sort of oversimplify one of these models as every time you want to generate a token, you hand the entire history of the conversation you’ve had, or whatever the text is before it, to the GPU, to build the state of the model. And then it generates the next token. It actually generates a probability value for every token in its token set.

And then the CPU picks the next token, attaches it to the full set of tokens, and then does that whole process again of sending over the entire conversation, and then generating the next token. So if you think about that very long, big, giant for loop around the outside of every time there’s a new token, the token is chosen from the set of probabilities that comes back, it’s added to the set, and then a new set of probabilities is generated for the next token.

You can imagine in the middle of that for loop having some very traditional code in there that inserts a stack of tokens that wasn’t actually decided by the LLM, but then become part of the history that the LLM is generating the next token from. And so this is – that’s how those embeds work. You can effectively have the LLM communicate with the outside world in the middle there by it driving that. Or you don’t even have to have it drive it. You could have software outside the LLM that looks at the token set as it appeared and then insert more tokens for it. So this is all the fun stuff you can do by running these models yourself.
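
The "big for loop" David describes can be sketched with a toy stand-in for the model. Nothing here comes from llama.cpp: `toy_model`, the five-entry vocabulary, and the `<tool>` convention are invented for illustration. The loop hands over the full history, gets one probability per token, picks a token, and at one point lets ordinary code insert tokens the model never chose.

```python
import random

VOCAB = ["hello", "world", "!", "<tool>", "<end>"]


def toy_model(history: list[str]) -> list[float]:
    """Stand-in for the GPU step: one probability per vocabulary entry."""
    if history and history[-1] == "hello":
        return [0.0, 0.0, 0.0, 1.0, 0.0]   # after "hello", ask for the tool
    if len(history) >= 4:
        return [0.0, 0.0, 0.0, 0.0, 1.0]   # then stop
    return [1.0, 0.0, 0.0, 0.0, 0.0]


def generate(prompt: list[str], rng: random.Random, max_tokens: int = 10) -> list[str]:
    """Repeatedly score the whole history, pick a token, and append it."""
    history = list(prompt)
    for _ in range(max_tokens):
        probs = toy_model(history)                         # full history goes in
        token = rng.choices(VOCAB, weights=probs, k=1)[0]  # the "CPU" picks one
        history.append(token)
        if token == "<end>":
            break
        if token == "<tool>":
            # Traditional code inserts tokens the model never chose.
            history.extend(["world", "!"])
    return history
```

With this toy model, `generate([], random.Random(0))` produces `["hello", "<tool>", "world", "!", "<end>"]`: the model emits two tokens, outside code contributes the next two, and the model then continues from a history it only partly wrote.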

Changelog & Friends #73

Kaizen! Three wise men?

Gerhard is back for Kaizen 17! We discuss our CPU.fm changes in-depth, detail new Zulip / Neon integrations & put our Pipedream to the test. Oh, and a Gerhard surprise (of course)!

Matched from the episode's transcript 👇

Jerod Santo: Easy, I guess… I do remember once you have basically the Zulip API – like, their stuff is so simple. That’s one of the things I like about them. There’s no OAuth, there’s no craziness… It’s just like “Look, go ahead and generate a token, and then throw that token in a header. And all the requests that you have that token in the header, we’re going to let you do what you want to do.” Now, there are some finegrain controls beyond that, but they just start from the basic place. And so that made me getting Zulip abilities inside our app, like, 30 lines of code, for the module that changelog.zulip module, which invites people, and posts stuff… And once you can post stuff, then you’re just basically – you’re halfway there. Now, how does this work? Honestly, I don’t recall. Is this going through GitHub Actions?
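
For a sense of how little code "throw that token in a header" takes, here is a sketch that builds, but does not send, a request to post a message. It assumes Zulip's REST API accepts HTTP Basic auth with a bot's email and API key on `POST /api/v1/messages`; the site URL, stream, and function name below are placeholders, and this is not the changelog.com module.

```python
import base64
import urllib.parse
import urllib.request


def build_message_request(site: str, email: str, api_key: str,
                          stream: str, topic: str, content: str) -> urllib.request.Request:
    """Build (but do not send) a request that posts a message to a stream."""
    # Assumption: the API key is sent as the password half of HTTP Basic auth.
    credentials = base64.b64encode(f"{email}:{api_key}".encode()).decode()
    body = urllib.parse.urlencode(
        {"type": "stream", "to": stream, "topic": topic, "content": content}
    ).encode()
    return urllib.request.Request(
        f"{site}/api/v1/messages",
        data=body,
        headers={"Authorization": f"Basic {credentials}"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.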

Practical AI #298

Full-duplex, real-time dialogue with Kyutai

Kyutai, an open science research lab, made headlines over the summer when they released their real-time speech-to-speech AI assistant (beating OpenAI to market with their teased GPT-driven speech-to-speech functionality). Alex from Kyutai joins us in this episode to discuss the research lab, their recent Moshi models, and what might be coming next from the lab. Along the way we discuss small models and the AI ecosystem in France.

Matched from the episode's transcript 👇

Alexandre Défossez: I mean, there’s a few things that we’ve done that were really, really funny. For instance, just training on this old dataset from the ‘90s and like early 2000s of phone calls… And then it was not really like an assistant anymore. So it’s just like you end up on the phone with someone random, and they will tell you their name, they will tell you what they think about US politics at the time… And it’s really – it’s kind of a different thing that we tried to keep with the final Moshi, but obviously, with the phase of instruct tuning, we lost a bit of this… I mean, it still quickly falls back to the helpful AI assistant personality that’s maybe not as nice… But that was a funny thing. Basically, we can train it on anything, and then this is going to act like a kind of actor that would pretend to be a certain person in a very realistic way.

There’s a number of things that we’re exploring with this kind of approach. Anything that would be like speech to speech, or text to speech, or vice versa… Some of them we kind of mentioned in the paper, or with just this framework… Because we also have a text stream that basically we use only for the model to be able to output its own words. We don’t actually represent the word from the user, but the model outputs its own words. And this kind of aspect, by making the text late or early on the audio, we can turn the model from being like a text-to-speech engine, because if the text is early, then the audio is just going to follow it, but if the text is late and you kind of force the audio to some value, and you only sample the text token, that now becomes automatic speech recognition… So I think that kind of shows how versatile this multi-stream approach is. And all of those applications are really streaming. So we could – actually, something we did for the synthetic data was using this kind of approach to generate long scripts, and you could imagine generating maybe 15 minutes, or whatever. That’s our things that we’re working now more independently.

And yes, in terms of more as a general community, I’m not aware of anything in particular. I think one thing we want to do though is to release code to allow fine-tuning, maybe with LoRA, and also make it really easy. Obviously, the pipeline is a bit more complex, because you need audio, ideally you need transcripts, you need separation between the agent you want to train and the users… So we want to help with that regard, and try to make it easier to adapt it to a new use case.

Break: [00:34:06.14]

Go Time #337

Crawl, walk & run your way to usable CLIs in Go

With the number of libraries available to Go developers these days, you’d think building a CLI app was now a trivial matter. But like many things in software development, it depends. In this episode, we explore the challenges that arose during one team’s journey towards a production-ready CLI.

Matched from the episode's transcript 👇

Wesley Beary: All of the above, kind of. So there’s a couple of different things. One of the other things that I initially picked up at my time at Heroku and now I’ve really dived into is doing more like spec-first API development. So when we’re working on API endpoints, which - for us, a lot of what we’re doing is basically like we want to add a new capability to the CLI, which relates to CRUD operations on a resource, or something… Well, how are those going to be driven? Well, there’s going to be a REST API on the other side, right? So we have our Open API spec that defines what all of the API ought to be doing.

So usually, when I’m going to develop a new endpoint… Like, I was working on a new thing earlier this week. So I start, I go into the spec, I add in - in this case it was a new operation to create organizations. Previously, if you wanted to do that, you had to just go into the web interface. Now I’m trying to add it in the CLI. So I went in, I defined it in the spec, and then right now we’re using a tool called - hopefully, I get all of these right. There’s a bunch of tools. So I believe Prism is the right one for that… So Prism is a Node-based command line tool; it relates to Open API stuff. You spin that up and you can actually say “Here is a spec that has example values in it. When I make a POST request to the organizations thing to create an organization, just use the example data from this file and return something that doesn’t necessarily quite match up with what I said, but it is a valid representation of an organization.” Because for my initial stuff that’s fine.
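
The mock-server behavior Wesley describes, answering from the spec's example data regardless of what was sent, reduces to a lookup. The sketch below is not Prism and does not parse real OpenAPI; `SPEC` is a heavily trimmed, hypothetical stand-in and `mock_response` is a made-up name.

```python
from typing import Optional

# Hypothetical, heavily trimmed spec: real OpenAPI documents nest examples
# under paths -> operation -> responses -> content -> media type.
SPEC = {
    ("POST", "/organizations"): {
        "status": 201,
        "example": {"id": "org_123", "name": "Example Org"},
    },
}


def mock_response(method: str, path: str, body: Optional[dict] = None) -> tuple:
    """Answer from the spec's example data, ignoring the request body."""
    operation = SPEC.get((method.upper(), path))
    if operation is None:
        return 404, {"error": "operation not in spec"}
    return operation["status"], dict(operation["example"])
```

The response is valid for the schema but unrelated to the request, which is exactly the trade-off described: good enough for iterating on a CLI command, not enough for tests that depend on real records.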

[00:15:49.20] So usually, while I’m developing, I’ll start with that. So I’ll be developing the spec and developing the CLI in parallel, and then I can actually build out the CLI endpoint that works just against all of these mocked data. And then that I’ve found in terms of iterating and stuff is super-valuable, because - I don’t know about other people… Even though I’ve been doing API stuff for a while, I almost never get it right the first time. And it’s pretty costly if “getting it right” the first time means writing out all of the end points, and all of the backing stuff to that, and all of the tests to that, and all of the database interactions to that, and the tables… All of that stuff - there’s a lot of stuff to that. And so if you make a mistake, it can be a real pain to fix it.

So being able to just iterate quickly against the spec directly - it’s way easier, because I can be like “Alright, great. I’ll do this. I’m banging out. I’m working on the CLI endpoint. Oh, wait… There’s two or three more fields that need to be serialized onto this record that I just forgot about. Okay, let me go add those to the spec. Great. They’re there. Okay.”

Now the CLI thing does everything it needs to. Now I can go do the implementation, and I have a clear contract that I’m basically implementing against.

So we start with that. And then in the same way when we’re doing tests, we have a lot of tests that are marked to run just in that mocked mode. And so in the test context, we spin up a Prism mock server, and we run tests against that. Not everything works that way, because - the example I gave a second ago of like create an organization, but don’t really pay attention to the parameters I pass into you, or whatever, just give me back something that’s technically valid… Like, for some tests, that’s great. That’s all you really need. But for other tests, it matters that the database really gets touched, and the valid records are there. And especially in our context of X509 stuff - I wish it weren’t this way because it can be a real pain, but sometimes to create a valid record, you’re actually creating a constellation of records. It’s actually like it’s not just this one record that is valid by itself. You need to create these six different records that all have relationships to one another, or something. And so mocking that becomes a nightmare very quickly.

At the same way we have – Prism can also be run in a proxy mode, where instead of being a mock server, it instead passes requests through, but as the requests pass through in both directions, it checks to make sure that everything that’s passing through matches against the schema that you provided. So that helps to guard us then that now we’re running tests against real, live stuff, creating real records or whatever, but if there’s discrepancies from that contract that we’ve written, we’ll find out about it. So that helps, again, to keep us honest.

And then on the server side, we have some similar tooling. There’s a – so our server in this case is all in Ruby, that runs the API, and there’s a library that’s called Committee, that also has the schema loaded into it. It’s what’s called a rack middleware. So as the requests come in, it checks the requests against the schema, and if they don’t match, it will reject it and say “Hey, you’re including a field that’s not in the schema. I don’t know what to do with that. Please, don’t include this field.” Or “This field is the wrong type” or all these kinds of things that it can find out just from the schema.

And on the same token, as the request comes back out, it checks it again against it and says “Hey, wait a second. Actually, you included three fields that aren’t in the schema. What’s up with that?”
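
The two checks described here, rejecting unknown fields and wrong types on the way in and again on the way out, can be sketched with a single function applied in both directions. This is a simplified Python illustration, not Committee's API; the `SCHEMA` mapping and function name are hypothetical, and a real OpenAPI schema is far richer.

```python
SCHEMA = {"name": str, "seats": int}  # hypothetical field -> type mapping


def check_against_schema(payload: dict, schema: dict = SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the payload conforms."""
    problems = []
    for field_name, value in payload.items():
        if field_name not in schema:
            problems.append(f"{field_name}: not in the schema")
        elif not isinstance(value, schema[field_name]):
            problems.append(f"{field_name}: expected {schema[field_name].__name__}")
    return problems
```

A middleware would call this once on the incoming request body and once on the outgoing response body, rejecting or flagging anything that returns a non-empty list.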

All of that like provides nice guardrails and helps us iterate faster. And then we don’t always do this, because we’re such a small team, but it can also be really nice in terms of being able to parallelize some of that. Like, once you have a schema that you’ve agreed upon, I can continue working on the CLI and I can potentially hand off the implementation of the API side to one of my colleagues, and we don’t even have to talk to each other or whatever. We’re not stepping on each other’s feet. We’re both implementing against the same contract. As long as the contract stays the same, we can do that without even really having to talk. That forms the talking that needs to happen.

So all of those aspects have been really nice, and definitely have helped us iterate faster. I actually – this is a whole other aside, but in some ways I wish that I had something kind of like that schema set up for CLI stuff, where I could kind of define, I don’t know, somehow what the CLI ought to eventually do or look… Because then we could work backwards from that. We could have the contract… I haven’t seen anything like that, so if anybody knows of something like that, please let me know.
It seems harder, because the ultimate output of the CLI is much more freeform. In the case of an API you’re talking about JSON blobs in and out, so it seems a lot easier to define something that says like what the shape of those blobs should be, what the types should be, stuff like that. CLI is like a bunch of characters on a screen, so what do you even do?

[00:20:12.02] But yeah, not having that can make it harder, right? There’s a lot of just like guess and check. I don’t know. I mean, the closest we got is – this is iterated over time, but we do a lot of sketching basically, where a sketch is… In the early days, my sketches were basically like I would - again, former Rubyist; still Rubyist, whatever. I would write a Ruby program that was just like a bunch of print lines and stuff basically, that more or less did something in the shape of what we wanted the CLI to do, so you could just like see it happening dynamically… Because in a lot of cases, for me at least, that would give me a much better sense of “Does this feel right? Does this feel close to right? Does something seem off here? Does it seem too noisy? Does it seem like it’s not giving enough feedback? How does this feel?” I don’t know, for me at least it’s hard to do that without a little bit of poking and prodding, some just try, guess, check etc. So yeah, that’s been super-helpful too in terms of iterating quickly.

Break: [00:21:09.01]

Changelog Interviews #616

ANTHOLOGY — Packages, pledges & protocols

The hallway track at All Things Open 2024 — features Carl George, Principal Software Engineer at Red Hat for a discussion on the state of open source enterprise linux and RHEL (Red Hat Enterprise Linux), Max Howell, creator of Homebrew and tea.xyz which offers rewards and recognition to open source maintainers, and Chad Whitacre, Head of Open Source at Sentry about the launch of Open Source Pledge and their plans to help businesses and orgs to do the right thing and support open source.

Matched from the episode's transcript 👇

Jerod Santo: So certainly it’s going to be on Coinbase, but he hasn’t said where you can buy this token.

Practical AI #291

Practical workflow orchestration

Workflow orchestration has always been a pain for data scientists, but this is exacerbated in these AI hype days by agentic workflows executing arbitrary (not pre-defined) workflows with a variety of failure modes. Adam from Prefect joins us to talk through their open source Python library for orchestration and visibility into python-based pipelines. Along the way, he introduces us to things like Marvin, their AI engineering framework, and ControlFlow, their agent workflow system.

Matched from the episode's transcript 👇

Adam Azzam: Yeah, it’s a great question, and it’s something I obviously try to think a lot about… I would say that right now there is way too much emphasis on single-machine, local LLM or agent workflows. Here’s what I mean by that. You pull up any framework in the world to run LLM things, or to build an agentic workflow, and it’s like happening in a local process, on your machine. It spins up an input field in your terminal, and then you have to type your answer to what your favorite color is, and then it goes and writes you a poem, or something like this. But businesses who are building LLM workflows or building on agents, at the model level we see that frontier models are accounting for – sorry, frontier model providers are accounting for a lot of the core sources of failure. So structured outputs from OpenAI managed to wipe out just a whole host of your classic resiliency issues. And the resiliency issues I think are going to be around planning, and so the ability to add idempotency and transactions to LLM workflows is really what we would invest more into. You had a plan that executed ten things in a row, you found out at the 11th step the plan was bad. But you’ve already done ten things. How do you walk that all back? You would do that with transactions.
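The "walk it all back" idea can be illustrated with a small compensation-style sketch. This is not Prefect's API; `Step`, `run_plan`, and `make_step` are hypothetical names, and the sketch assumes every step can supply its own undo action.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One unit of work plus the action that reverses it (hypothetical type)."""
    name: str
    do: Callable[[], None]
    undo: Callable[[], None]


def run_plan(steps: List[Step]) -> bool:
    """Run steps in order; if one fails, undo the completed ones in reverse."""
    done: List[Step] = []
    for step in steps:
        try:
            step.do()
        except Exception:
            for finished in reversed(done):
                finished.undo()
            return False
        done.append(step)
    return True


# Shared log so the order of actions can be inspected afterwards.
log: List[str] = []


def make_step(i: int, fail: bool = False) -> Step:
    """Build a toy step that records what it did, or raises if fail=True."""
    def do() -> None:
        if fail:
            raise RuntimeError(f"step {i} failed")
        log.append(f"do {i}")

    def undo() -> None:
        log.append(f"undo {i}")

    return Step(f"step-{i}", do, undo)
```

Running ten good steps followed by a failing eleventh returns `False` and leaves ten `do` entries in the log, followed by ten `undo` entries in reverse order. A real system would also need undo actions that are safe to retry, which is where the idempotency mentioned above comes in.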

Where I am really interested in long term is - when Daniel asked earlier “Talk to me about how I as a human go from a locally running function to something that’s running elsewhere”, I think that the “me as a human” part of that is going to go away. And it’s often going to be an LLM in the course of solving a problem decides that it needs to create and massively parallelize a function to call out to.

And so now you need to be able to give an LLM, on demand, the ability to create and provision infrastructure, and submit a whole bunch of jobs to it. And so while we’ve built Prefect to be something so easy a human can understand, that’s going to play into a lot of strengths of “How do we expose an API that can really be taken advantage of by an LLM?”, if that’s the intended audience of how to provision infrastructure.

The third piece is - at companies now you’ll have so many teams that are basically building LLM workflows in parallel to each other. So you’ll have like the conversation team, that’s trying to build in LLMs into how it talks to customers. And then you’ll have like the platform team, which is trying to use it to give feedback on internal pull requests. You’ll have teams that are trying to use a bit more human in the loop for commenting on like a design doc, something like this. But fundamentally, you have tens of thousands of parallelized executions or calls against LLM APIs, so how do you solve that coordination problem across the programs that are trying to invoke LLMs?

And so really trying to figure out, if I have 10,000 agents all trying to access the same resource at the same time, how do you govern that in a way that doesn’t cause them all to crash? If I have 10,000 engineers that are all trying to access the same LLM API, how do I make sure that they don’t all consume the same token budget all at the same time?

So that’s a bit more on the practical side of like – you know, future of agents, future of LLMs aside, there is still a very, very interesting, fundamental engineering question, which is “How do you get tens of thousands of concurrent API calls to OpenAI or Anthropic to all behave sanely, and not lean into a commons problem of everybody cannibalizing the same data resource?” And so that’s like the very boring thing that I like to think about. But orchestration is one of these very fun, but ultimately worst-case scenario disaster planning style things.
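One way to picture the coordination problem is a shared budget that every caller must reserve from, combined with a cap on how many requests are in flight at once. The sketch below is an assumption-laden illustration, not how Prefect or any provider implements it: `TokenBudget` and `simulate` are hypothetical names, and `asyncio.sleep(0)` stands in for the real API request.

```python
import asyncio
from typing import Tuple


class TokenBudget:
    """Shared budget: callers reserve tokens before making a request."""

    def __init__(self, total_tokens: int, max_concurrent: int) -> None:
        self.remaining = total_tokens
        self.peak_in_flight = 0
        self._in_flight = 0
        self._slots = asyncio.Semaphore(max_concurrent)

    async def call(self, cost: int) -> bool:
        """Return True if the call ran, False if the budget was exhausted."""
        async with self._slots:  # cap the number of concurrent requests
            if cost > self.remaining:
                return False
            self.remaining -= cost  # reserve before sending the request
            self._in_flight += 1
            self.peak_in_flight = max(self.peak_in_flight, self._in_flight)
            try:
                await asyncio.sleep(0)  # stand-in for the real LLM API call
            finally:
                self._in_flight -= 1
            return True


def simulate(n_agents: int, cost: int, total_tokens: int,
             max_concurrent: int) -> Tuple[int, int, int]:
    """Run n_agents callers; return (successes, tokens left, peak concurrency)."""
    async def main() -> Tuple[int, int, int]:
        budget = TokenBudget(total_tokens, max_concurrent)
        results = await asyncio.gather(
            *(budget.call(cost) for _ in range(n_agents))
        )
        return sum(results), budget.remaining, budget.peak_in_flight

    return asyncio.run(main())
```

With 50 agents each asking for 100 tokens against a budget of 1,000, only ten calls go through and the rest are refused rather than overdrawing the budget. This works inside a single process; coordinating across many teams and machines, which is the case described above, would need a shared external store for the budget.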

Practical AI #289

Understanding what's possible, doable & scalable

We are constantly hearing about disillusionment as it relates to AI. Some of that is probably valid, but Mike Lewis, an AI architect from Cincinnati, has proven that he can consistently get LLM and GenAI apps to the point of real enterprise value (even with the Big Cos of the world). In this episode, Mike joins us to share some stories from the AI trenches & highlight what it takes (practically) to show what is possible, doable & scalable with AI.

Matched from the episode's transcript 👇

Mike Lewis: Yeah. And so I think it’s that Elvis, like, peanut butter and banana sandwiches, or something. And so now we can just unpack the concept of context. Because by inserting some new ideas, we sort of shifted the probability of that index of potential tokens and the order that they would be. And we’ve shifted it enough so that a banana could potentially show up as worth picking. And someone did. And that just sort of demonstrates, “Well, most of you still said jelly, but someone said banana.” I’m just telling you, it has never not worked. Like, it always works.

And so then we start talking about, okay, so these are tokens. They’re parts of words, they’re numbers, they go together, it can predict the next one to mimic human communication or thought… And then there’s only a certain amount of these that can fit in its brain, or its memory, its context window. And we’ll show them on a board. We just put little hashes for like “This is a token, and this is a token…” Well, and what happens when we get to the end? Because Daniel, when I started working with these tools, we only had 2,000 to work with. And now it feels infinite, but not really. But because you learn early, you learn to work within the limitations of the tool and you learn elegant workflow then… Because we appreciate – even though it feels big, it’s still scarce, because they’ve got to pay for what we submit.

And so we show them, these are how the tokens stack up, and then we show them, like “When you start getting toward the end, if we fill up this 2,000, what’s it going to do?” It’s chopping off the ones in the beginning. And then we ask them, “Have you ever had an experience with ChatGPT where it feels like ‘I’ve already told you that’?” Oh yeah, you see some – oh…! Oh, because it didn’t forget it. It’s selectively deleted it, you know?

And so we just run them through… That’s like the first 10 minutes, and we just build on that, and build on that, so that they build good AI usage habits, and they can interact with the tool in a way that leverages the strength of the tool, and with an awareness of what can go wrong.
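The "chopping off the ones in the beginning" behavior Mike describes can be sketched as a fixed-size sliding window. This is a simplified illustration: `fill_context` is a hypothetical helper, the strings stand in for real tokens, and actual chat products may truncate or summarize differently.

```python
from collections import deque
from typing import Deque, Iterable, List


def fill_context(tokens: Iterable[str], window: int) -> List[str]:
    """Keep only the most recent `window` tokens, dropping the oldest first."""
    context: Deque[str] = deque(maxlen=window)
    for token in tokens:
        # Once the deque is full, appending discards the leftmost (oldest) item.
        context.append(token)
    return list(context)
```

Feeding five tokens into a window of three keeps only the last three, so whatever was said first is no longer visible to the model.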

Changelog & Friends #63

The wrong place to slap a person

Nick Nisi joins Adam and Jerod to talk about Karaoke, ARC and the business model of web browsers, this WordPress drama, and an epic bonus for Changelog ++ subscribers.

Matched from the episode's transcript 👇

Adam Stacoviak: Liberally, permissively… They’re probably interchangeable, to some degree. I’m sticking to my guns. Great job, Enchanted, and whoever’s behind this. AugustDev. Fantastic work. This is free. You just download it. What do you do to get access to the – you have to have like a token, is that right?

Practical AI #287

Pausing to think about scikit-learn & OpenAI o1

Recently the company stewarding the open source library scikit-learn announced their seed funding. Also, OpenAI released “o1” with new behavior in which it pauses to “think” about complex tasks. Chris and Daniel take some time to do their own thinking about o1 and the contrast to the scikit-learn ecosystem, which has the goal to promote “data science that you own.”

Matched from the episode's transcript 👇

Daniel Whitenack: Really it’s just, I’m assuming it’s generating text. And then at a certain point, similar to people asking “These models just generate texts. How do they know when to stop?” Well, they don’t know. They generate a special token, which is an end of statement token. And then the program, the computer program stops it from generating more text after that token is generated.

In the same way here, I’m assuming after a certain level of text is generated in the chain of thought, it generates a special token - that’s what I was referring to before - of like “Now generate the answer token.” And then that’s how they control the UI.

[00:33:30.25] So all of that is inference on my part. Again, I could totally be wrong. But that really is a similar process to what we’ve been doing now for a couple of years with these models. There’s not anything fundamentally different about that process… And so I’ve found it interesting that the big reveal here was sort of a different model created with RLHF, which is basically what everybody else is doing. Maybe they’re using this interesting methodology and the UI… So I don’t know what that reveals. It could mean this is a cool thing to hold us over until GPT-5, which will be this fundamentally world-changing different process, methodology, architecture model that’s going to be crazy. Or it could just mean there’s very much a diminishing returns here in terms of the methodologies that are available to improve this wave of models.
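The stop-token mechanism Daniel describes can be sketched as a loop in the surrounding program, not in the model itself. Everything here is a toy assumption: `END` is a made-up marker, `scripted_model` replays a fixed list instead of running a real model, and `max_tokens` is a safety cap in case the end token never appears.

```python
from typing import Callable, List

END = "<end>"  # hypothetical special end-of-statement token


def generate(next_token: Callable[[List[str]], str],
             prompt: List[str], max_tokens: int = 100) -> List[str]:
    """Ask the model for tokens until it emits END or the cap is reached."""
    output: List[str] = []
    for _ in range(max_tokens):
        token = next_token(prompt + output)
        if token == END:
            break  # the program, not the model, decides to stop here
        output.append(token)
    return output


def scripted_model(script: List[str]) -> Callable[[List[str]], str]:
    """Toy stand-in for a model: replays a fixed script, one token per call."""
    remaining = iter(script)
    return lambda context: next(remaining, END)
```

Anything the scripted model would have produced after `END` is never requested, which mirrors the point that the model does not "know" when to stop; the caller simply stops asking.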

Changelog News #111

Is Linux collapsing under its own weight?

A Rust for Linux developer resigns amidst rising tension in the Linux community, Bret Victor shows off what he’s been working on for years, Rachel (by the bay) laments how useless “SRE” has become as a role, Doug Turnbull makes the case for hiring junior devs & Baldur Bjarnason says the LLM honeymoon phase is about to end.

Matched from the episode's transcript 👇

Jerod Santo: The LLM honeymoon phase is about to end

Baldur Bjarnason has been consistently bearish on the current crop of AI tools/products since I’ve been following him. I don’t agree with him in all aspects, but he does a good job of arguing his position, so I appreciate his writing on the subject.

In this latest post, Baldur explains how weaknesses in how LLMs work are making them great targets for manipulation.

We’ve also known for a while that prompts are effectively impossible to secure.

It should not come as a surprise that some researchers decided to see if prompt “security” could be bypassed with a malicious token stream that completely bypasses the whole “comprehensible language” part.

The process for discovering these malicious token streams – sorry, “Strategic Text Sequence” – is quite similar to what Profound, the company mentioned earlier, seems to be doing. You automate a process of shoving customised prompts into one end of the LLM black box and you map the output to discover token streams that have an unusually big impact on the output.

Given the opportunity for businesses to gain an unfair advantage… we all know what they’ll do with it. Baldur thinks this is going to go from bad to much, much worse as these techniques are uncovered:

This is going to get automated, weaponised, and industrialised. Tech companies have placed chatbots at the centre of our information ecosystems and butchered their products to push them front and centre. The incentives for bad actors to try to game them are enormous and they are capable of making incredibly sophisticated tools for their purposes.

Practical AI #283

Threat modeling LLM apps

If you have questions at the intersection of Cybersecurity and AI, you need to know Donato at WithSecure! Donato has been threat modeling AI applications and seriously applying those models in his day-to-day work. He joins us in this episode to discuss his LLM application security canvas, prompt injections, alignment, and more.

Matched from the episode's transcript 👇

Donato Capitella: Can I have like a second, different question? No, okay, so there are two things. I think you need to be comfortable having an opinion that could be proven wrong. Because there are two ways of answering that question, right? A way is “Well, this is something we don’t know, and we can’t know.” So whatever happens, I’m always going to be right. But what I think - the current LLM technology, because of the problem space, you are not going to be able to solve that alignment problem. The space of operation of an LLM - let’s take GPT-4, okay? So maybe you’ve got 50,000 tokens, so let’s say words; a dictionary like that. You’ve got a context of over 100,000 of these tokens. That gives me 50,000 to the power of 100,000 possible things that the LLM could possibly say. What’s that number? I don’t know, but like a Rubik’s cube is three by three, and it’s 53, 43 quintillion combinations. And that’s a drop in the sea.
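To put a rough size on "50,000 to the power of 100,000", the number of decimal digits can be computed with logarithms instead of building the number itself. The vocabulary and context sizes are the speaker's round figures, not exact GPT-4 values; `digits_of_power` is a hypothetical helper, and the Rubik's cube count is the standard figure for the 3x3x3 cube.

```python
import math

VOCAB = 50_000      # speaker's rough vocabulary size
CONTEXT = 100_000   # speaker's rough context length, in tokens
RUBIK = 43_252_003_274_489_856_000  # positions of a 3x3x3 Rubik's cube


def digits_of_power(base: int, exponent: int) -> int:
    """Number of decimal digits in base**exponent, computed via log10."""
    return math.floor(exponent * math.log10(base)) + 1
```

Under these assumptions the count of possible sequences has 469,898 decimal digits, while the Rubik's cube figure of about 43 quintillion has 20.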

So I don’t think we have a tool yet. I think the only way you would get reasonable alignment with the current LLM technology, the way I understand it - and again, I want somebody in the next episode to come and prove me wrong, because I would love this. I’m saying something that I don’t like. But the only way you’re going to be able to realistically align that is to find an alignment method that allows you to cover that huge 50,000 to the 100,000 token space almost completely. And I think the reinforcement learning, from human feedback at least, it covers a very small part of it. It’s actually really tough. We’ve done instruction fine-tuning on an LLM, but we didn’t do the reinforcement learning part. I mean, I don’t know if you have done that, but that’s not something where you take your LLM kit, you press a few buttons and you’re done. That seems very resource-intensive to me.

So again, the way I see it is that we need something else with the LLM technology to try and cover that space. It looks intractable to me, but maybe there is something else that we put on top of the LLM, or a completely different technology that can solve the alignment. But I’m not aware of one. Are you?

Practical AI #282

Only as good as the data

You might have heard that “AI is only as good as the data.” What does that mean and what data are we talking about? Chris and Daniel dig into that topic in the episode exploring the categories of data that you might encounter working in AI (for training, testing, fine-tuning, benchmarks, etc.). They also discuss the latest developments in AI regulation with the EU’s AI Act coming into force.

Matched from the episode's transcript 👇

Daniel Whitenack: [00:08:26.20] Yeah. So you mentioned a few things there I’d love to pick apart, which is this idea of - there’s some kind of provenance to a model that has to do with the data. So it may be good to remind people that a model, when we’re talking about an AI model, is really composed of two things. It’s composed of code that executes functions, and adds things together, and kind of essentially does a data transformation. So maybe it’s an image in and a label out, that’s a label, whether it has a cat in the image or not; or maybe it’s text in and a generated next token out. And these are data transformations, and that code that executes those data transformations is written in code, just like normal code, but it includes a bunch of parameters that need to be set. And by a bunch, maybe people are familiar, from seeing models now, that that might be 7 billion parameters, 70 billion parameters, 400 billion parameters. So in order to set those parameters to do that data transformation, there needs to be data that is used to fit those parameters, often called that training process.

Now, one element of this, Chris, is if you imagine like LLaMA 3.1 - which is a recent addition to our world - has whatever 400 billion parameters. You could imagine that maybe you’re not going to fit that many parameters with a small amount of data. And so there’s some relation between the complexity of the model and how much data is needed to fit it. And that may in itself be something that people aren’t quite grasping often, is that the bigger the model you want to use, the more data you need to have to train it, which is why these datasets have got larger and larger.

Changelog Interviews #602

Open is the way

Joseph Jacks (JJ) is back! We discuss the latest in COSS funding, his thesis for investing in commercial open source companies, the various rug pulls happening out there in open source licensing, and Zuck/Meta’s generosity releasing Llama 3.1 as “open source.”

Matched from the episode's transcript 👇

Joseph Jacks: Anthropic’s been beating OpenAI now for two months on the benchmarks and performance. So Anthropic is like actually way ahead… But even Anthropic doesn’t have tens of billions in equivalent compute to scale. Like, Meta is literally doing this. They have hundreds of thousands of H100-equivalents from the last report that Soumith put out, who’s the PyTorch creator, and kind of leading a lot of this stuff. And they’re probably going to have half a million or a million more… And they’re literally just getting started with like scaling all this infrastructure.

So I take two sides to it. One is I think it’s justified and it makes a lot of sense, in the same way that releasing OpenCompute and React and PyTorch made a lot of sense, to deliver massive derivative benefits to their business, that were measurable. Really huge benefits. ROI - very easy justification.

On the side of LLaMA though, you have tens of billions of dollars now you’re spending. It’s not a few hundred million dollars or maybe a billion dollars. It’s like tens of billions. And so I still see some returns there, and you could see maybe a multiple of that if you look at the enterprise value of the business.

[01:22:09.15] But I don’t see the transformer approach they’re taking as the most efficient and effective way of actually continuing to innovate and scale the capabilities of the models. I think we have much better techniques. I mean, you have to really understand - what is the transformer doing? The transformer is taking in tens of trillions to pretty soon hundreds of trillions of what are called tokens; each token is a couple words, and split apart, and so a token is like a few bits of information, basically… Some tens of bits of information. And then you have hundreds of trillions of these things that you’re compressing and training into a system.

The human brain uses 10 to 30 watts of energy to compress what you could approximate as probably tens of exabytes of data by the time we’re wobbling around and able to like string together the first few coherent words. The amount of energy getting consumed by these transformer-based neural nets is gargantuan. We’re talking about tens of megawatts. It’s crazy.

The human brain is many, many, many, many orders of magnitude more energy-efficient than the state of the art neural nets that are getting produced today. And I do think the human brain resembles something probably roughly approximating a biological neural network of some kind, with electrical signals between our synapses, and all that kind of stuff. I think that’s kind of like the right architecture. But to do that in silicon, with our current approaches, is as if we were doing vacuum tubes in the ’20s and ’30s. That is our current state of the art sophistication in building neural nets. We are in vacuum tube days.

JS Party #331

Building LLM agents in JS

KBall and returning guest Tejas Kumar dive into the topic of building LLM agents using JavaScript. What they are, how they can be useful (including how Tejas used home-built agents to double his podcasting productivity) & how to get started building and running your own agents, even all on your own device with local models.

Matched from the episode's transcript 👇

Tejas Kumar: Yeah. Although, although, the thing that makes us human, that sets us apart from lesser animals is the prefrontal cortex. It’s the center of the brain that literally, literally just does predictions. And based on those predictions we’ll either quiet down other circuits, or raise their activity. It will inhibit or excite. But predictions are so crucial to the human experience. And so I think it’s important to not undervalue that, but also not overvalue it. And so next token prediction is still prediction, on some level.

Go Time #321

Dependencies are dangerous

Dependencies! We need them, but how do we use them effectively and safely? In this week’s episode Kris is joined by Ian and Johnny to discuss the polyfill.io supply chain attack, the history of dependency management and usage in Go, and the Go Proverb that “a little copying is better than a little dependency”. Of course, we wrap up the episode with some Unpopular Opinions!

Matched from the episode's transcript 👇

Kris Brandow: I think part of it also comes from the fact that your backend code is within your security domain, and your frontend code is not, and you kind of have to treat your frontend code as if it’s hostile, in all cases. So I think that kind of lowers the necessity in some cases of needing to be as security-conscious about what’s going on in your frontend. And most of the time, a lot of the frontend vulnerabilities, like cross-site scripting, can be blocked or prevented on the backend side of things as well. But I do think the kind of lack of security awareness for frontend people, especially lack of cryptographic awareness, does hinder in some ways. I’ve noticed that there’s not a huge amount of these, of the cryptography APIs in the browser, where that could be a much better way to do authentication and authorization. Because the browsers had APIs for years now where you can just create a public-private key pair that is locked into the browser, where another script, even on your subdomain, can’t extract that key. And that would allow you to kind of uniquely identify that browser in a cryptographically secure way, and you’re not passing around tokens that can be lifted, kind of no matter what, and someone can run some JavaScript on your website.

[00:46:07.26] So it’s like, yeah, there’s a little bit you can do through cross-site scripting. You can still make API requests, but you can’t lift that key and send it to a server and start making a ton of requests, or anything like that. Whereas you can do that with a [unintelligible 00:46:18.05] token. And I think some of the reason that’s not done is because there isn’t as much of a familiarity with these tools and these ideas in the frontend community, so there’s not a lot of support for this type of stuff. I mean, we’re starting to get it with things like passkeys, but that is super-late compared to what we could have had if we tried to start doing this much earlier.