Large Language Models (LLMs) continue to amaze us with their capabilities. However, using LLMs in production AI applications requires integrating private data. Join us for a captivating conversation with Jerry Liu from LlamaIndex, where he shares valuable insights into data ingestion, indexing, and querying tailored specifically for LLM applications. Along the way, we uncover different query patterns and venture beyond the realm of vector databases.
Jerry Liu: Yeah, that’s a good question. And maybe just to kind of frame this with a bit of context - I think it’s useful to think about certain use cases for each index. So the thing about a vector index, or being able to use a vector store, is that it’s typically well-suited for applications where you want to ask fact-based questions. And so if you want to ask a question about specific facts in your knowledge corpus, using a vector store tends to be pretty effective.
[26:13] For instance, let’s say your knowledge corpus is about American history, or something, and your question is, “Hey, what happened in the year 1780?” That type of question tends to lend itself well to using a vector store, because the way the overall system works is you would take this query, generate an embedding for it, first do retrieval from the vector store to fetch back the chunks most relevant to the query, and then put those into the input prompt of the language model.
So the set of retrieved items you get back would be those that are most semantically similar to your query by embedding distance. So again, going back to embeddings - the closer your query embedding and a context embedding are, the more relevant that context is, and the farther apart they are, the less relevant. So you get back the most relevant context for your query, feed it to a language model, and get back an answer.
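To make that flow concrete, here is a minimal sketch of top-k embedding retrieval in plain Python. This is not LlamaIndex's actual internals; `embed_fn` stands in for whatever embedding model you use, and the document is assumed to already be split into chunks.

```python
import numpy as np

def top_k_retrieve(query, chunks, embed_fn, k=3):
    """Return the k chunks most semantically similar to the query.

    `embed_fn` is a placeholder for any embedding model that maps a
    string to a fixed-length vector.
    """
    query_vec = np.array(embed_fn(query))
    chunk_vecs = np.array([embed_fn(c) for c in chunks])

    # Cosine similarity between the query and every chunk embedding.
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    top_ids = np.argsort(sims)[::-1][:k]
    return [chunks[i] for i in top_ids]

def build_prompt(query, retrieved_chunks):
    # Stuff the retrieved context into the language model's input prompt.
    context = "\n\n".join(retrieved_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In practice a vector database does the nearest-neighbor search for you, but the shape of the computation is the same: embed, retrieve top-k, stuff the prompt.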
There are other settings where standard Top-K embedding-based lookup - and I can dive into this in as much technical depth as you guys would want - doesn’t work well. And one example where it doesn’t typically work well - and this is a very basic example - is if you just want to get a summary of an entire document or an entire set of documents. Let’s say instead of asking a question about a specific fact, like “What happened in 1776?”, maybe you just want to ask the language model “Can you just give me an entire summary of American history in the 1800s?” That type of question tends not to lend itself well to embedding-based lookup, because you typically fix a Top-K value when you do embedding-based lookup, and you would get back very specific context. But sometimes you really want the language model to go through all the different contexts within your data.
So with a vector index, storing things with embeddings creates a query interface where you can only fetch the k most relevant nodes. If you store it instead with, for instance, a list index, you could store the items as just a flat list. So when you query this list index, you actually get back all the relevant items within the list, and then you’d feed them to our synthesis module to synthesize the final answer. So the way you do retrieval over different indices actually depends on the nature of those indices.
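As a rough illustration, here is how those two query patterns might look with LlamaIndex. The class names and import paths are from a recent version of the library and may differ in older releases (the list index was later renamed `SummaryIndex`); the `./american_history` folder is just a placeholder corpus.

```python
# Assumes a recent `llama-index` install and a configured LLM/embedding
# model (OpenAI by default); exact import paths vary across versions.
from llama_index.core import (
    SimpleDirectoryReader,
    SummaryIndex,        # formerly ListIndex: nodes stored as a flat list
    VectorStoreIndex,
)

documents = SimpleDirectoryReader("./american_history").load_data()

# Vector index: Top-K embedding retrieval, good for specific fact lookups.
vector_index = VectorStoreIndex.from_documents(documents)
fact_engine = vector_index.as_query_engine(similarity_top_k=2)
print(fact_engine.query("What happened in 1780?"))

# List/summary index: the query touches every node in the list, then a
# synthesis step combines them into one answer.
summary_index = SummaryIndex.from_documents(documents)
summary_engine = summary_index.as_query_engine(response_mode="tree_summarize")
print(summary_engine.query("Give me a summary of American history in the 1800s."))
```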
Another very basic example is that we also have a keyword table index, where you can look up specific items by keywords instead of through embedding-based search. Keywords, for instance, are typically good for stuff that requires high precision and a little bit lower recall. So you really want to fetch specific items that match the keywords exactly. This has the advantage of letting you retrieve more precise context than vector-based embedding lookup would.
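A hedged sketch of that keyword-table pattern, again assuming a recent `llama-index` version and the same placeholder document folder:

```python
from llama_index.core import SimpleDirectoryReader, SimpleKeywordTableIndex

documents = SimpleDirectoryReader("./american_history").load_data()

# Keyword table index: nodes are retrieved by keyword match rather than
# embedding distance -- higher precision, somewhat lower recall.
keyword_index = SimpleKeywordTableIndex.from_documents(documents)
keyword_engine = keyword_index.as_query_engine()
print(keyword_engine.query("Treaty of Paris 1783"))
```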
The way I think about this is that a lot of what LlamaIndex wants to provide is this overall query interface over your data. Given any class of queries you might want to ask - whether it’s a fact-based question, a summary question, or some more interesting question - we want to provide the toolset so that you can answer those questions. And indices - defining the right structure over your data - are just one step of this overall process, helping us achieve this vision of a very generalizable query interface over your data.
Some examples of different types of queries that we support - there’s fact-based question lookup, which is semantic search using vector embeddings, and you can ask summarization questions using our list index. You could actually run a structured query, so if you have a SQL database, you could run structured analytics over your database and do text-to-SQL (sketched below). You can do compare-and-contrast type queries, where you can actually look at different documents within your collection, and then look at the differences between them. You could even look at temporal queries, where you can reason about time, go forwards and backwards, and basically say, “Hey, this event actually happened after this event. Here’s the right answer to this question that you’re asking.”
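For the structured-query case, the text-to-SQL path might look roughly like this. The database file, table name, and exact import paths here are illustrative assumptions rather than anything from the conversation, and the API surface differs somewhat between llama-index versions.

```python
# Structured queries: text-to-SQL over a SQL database.
from sqlalchemy import create_engine
from llama_index.core import SQLDatabase
from llama_index.core.query_engine import NLSQLTableQueryEngine

# Hypothetical SQLite database with an "events" table.
engine = create_engine("sqlite:///history.db")
sql_database = SQLDatabase(engine, include_tables=["events"])

# The query engine translates the natural-language question into SQL,
# runs it, and synthesizes an answer from the result rows.
query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["events"],
)
print(query_engine.query("How many events happened between 1775 and 1783?"))
```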
And so a lot of what LlamaIndex does provide is a set of tools - the indices, the data ingesters, the query interface - to answer any of these queries that you might want to ask.