Philipp Burckhardt, Athan Reines & the team behind stdlib.io believe in a future in which the web is a preferred environment for numerical computation. They’ve been working toward building that future for over a decade. Thanks to listener, Brian Zelip, Jerod sits down with Philipp to learn all about this excellent effort: where it’s been & where it’s headed.
Sponsors
Speakeasy – Production-ready, Enterprise-resilient, best-in-class SDKs crafted in minutes. Speakeasy takes care of the entire SDK workflow to save you significant time, delivering SDKs to your customers in minutes with just a few clicks! Create your first SDK for free!
Changelog News – A podcast+newsletter combo that’s brief, entertaining & always on-point. Subscribe today.
Chapters
Chapter Number | Chapter Start Time | Chapter Title | Chapter Duration |
1 | 00:00 | It's party time, y'all | 00:39 |
2 | 00:39 | Sponsor: Speakeasy | 02:59 |
3 | 03:38 | Welcoming Philipp 👀 | 01:01 |
4 | 04:39 | Data Science vs AI | 05:19 |
5 | 09:58 | Intro to stdlib.io | 08:31 |
6 | 18:29 | How did you build it? | 03:50 |
7 | 22:19 | Sponsor: Changelog News | 01:28 |
8 | 23:48 | Breadth of the library | 08:58 |
9 | 32:45 | How decomposability works | 05:23 |
10 | 38:08 | What people want from a stdlib | 00:44 |
11 | 38:52 | Prioritization | 02:36 |
12 | 41:28 | ML frameworks above? | 02:04 |
13 | 43:32 | Getting sponsors | 01:18 |
14 | 44:50 | Waypoints for app devs | 02:34 |
15 | 47:25 | Take the survey! | 01:25 |
16 | 48:49 | Closing time | 01:13 |
17 | 50:03 | Next up on the pod | 01:10 |
Transcript
Hello, JS party animals. I am your internet friend, Jerod, and I am joined today by Philipp Burckhardt. Philipp, welcome to the show.
Yeah, thanks for having me. I’m very excited.
Happy to have you, man. Happy to have you. So shout-out to listener Brian Zelip for requesting to have Philipp on the show. Philipp is the author of stdlib.io, the standard library for JavaScript, and a real, live data scientist. Aren’t you a data scientist, Philipp?
Yeah, I mean, that’s my job title. I’m a bit of a jack of all trades. I’ve done a bunch of things. But I do have a PhD in statistics and data science, so that’s my background, kind of working with data and drawing insights from data.
Excellent. So perhaps a first here on JS Party. We just normally talk to pure JavaScript nerds… But now we have a hybrid, someone who likes JavaScript and likes statistics, analytics, data scientry… Data science as a thing - it seems like data scientists were super-hot like five years ago, and now they all had to rename themselves to either data engineer, or AI engineer… Have you felt the pressure to retitle, even though you’re doing the same thing?
To kind of rebrand and do something different… I mean, I think it kind of points to an issue that a lot of data science projects really didn’t deliver as much, kind of in industry, and the value that kind of was promised… And in that maybe there’s not as much demand as was anticipated, just for like pure kind of data sciency world. And that’s not what I’m doing, either. I think a lot of the actual work that happens in companies and stuff is often more like what people call data engineering. ETL, extract, transform, load, that kind of is the acronym… And really, how do you get the data into the right format to make it amenable and to draw insights from it, to work with it? And that’s what’s kind of much more the daily work that people do, that work with data predominantly. And what people would say is data science, maybe like just coming up with a statistical model to do predictions, or something like that - that’s really just a tiny fraction of the work that people in such positions do. Because often the main challenge is to get the right data, make sure the data is good, of good quality, doesn’t have any hidden issues and stuff like that… And that can easily bite you. It’s happened many times.
Yes. A lot of the work, it sounds, of data science. I know this by proxy, because I produce the Practical AI podcast, and Daniel Whitenack on that show, one of our hosts, is a data scientist. Now, of course, he is now a founder of an AI-related startup, as most data scientists are at this point…
Yeah. I mean, that’s another thing. [unintelligible 00:06:42.14] And there’s a lot of value there, but also probably at some point there will be an adjustment of expectations. It’s at least not the magic box yet that people expected it to be.
Yes, the magic box has some magic tricks, but they’re not quite as magical as we hope they are, at least at this particular phase in the cycle. One of the promises I’ve heard of LLMs is their ability to maybe clean up the data for us, or help us clean up the data. I don’t know about labeling, but even just like paste in a dirty CSV and tell it “Hey, can you make this better?” Is that a real trick it can do, or is that just people talking?
I mean, you can certainly ask LLMs, prompt them to do a clean-up… Whether it just introduces other errors, it really depends. For some things, it is really useful. I think if you have unstructured data, like text, where it’s not very easy to write a deterministic program to maybe put it into a structured format like JSON… You can easily do that. Just give an LLM a text and then ask it to put it into a structured format. Or give it a website, and to just scrape it easily like that.
[00:08:09.06] And I think for those, where it’s like a one-off thing, where you don’t need to have a complex – I don’t know, a complex AST parser thing that tries to kind of extract all the relevant information deterministically… That’s probably more robust. Well, it is more robust. But often, that might be just too much work, and you can just use an LLM as a crutch to get to something. And if it works, it works. I think there’s some footguns there, too.
Yeah, I mean, there are – and I’ve done some scrubbing and normalizing and all that kind of stuff… I’ve been around long enough to have datasets that come to me, and I’m like “Oh, this is a mess.” And I just write the code that does it. But it takes a while, and it’s toil… And the thing about it is it’s almost always throwaway code… Unless you’re gonna continually get bad data from the same source, and you’re gonna rerun that over and over again. A lot of times, the work is like “Well, I’ve got to clean up this dataset, and then I’m just gonna throw away all my tools that I made, because they were ad hoc.” And so if you can get 80%, 90% of the way there by spending 30 seconds versus four hours, maybe you can spend 30 minutes on the resulting dataset. And since it’s all probably throw-away at the end of the day anyways… Like you said, if it gets the job done, why not use it, right?
Yeah, usually I try to be pragmatic and think not to be too ideological about things like this… There are uses for things, and some tools are good at certain jobs, whereas others… And for me, LLMs right now at least are like a super-exciting technological advancement, that does a lot of things. But in the end, it’s not – I don’t know, we will see what happens… Whether the AI apocalypse is –
Right. It is not the silver bullet that some people claim it to be. Okay, so moving on from there… Of course, this is the world of science and math and data that you live in, and so it makes sense to talk about it… But we’re here today specifically to talk about your standard library project, which lives, like I said, at stdlib.io. The mission of this, it says, is a future in which the web is a preferred environment for numerical computation. So that’s the reason why you built it. Can you tell us a little bit of the backstory? How it started, why it started, when it started, all that?
Yeah, for sure. I’m gonna do that. And just to correct the record a little, it’s not my project, necessarily. I started it together with Athan Reines, a data scientist from San Francisco. We started this – I mean, we have been working together now for 10-11 years, so a long time, really, and we started originally with a prior project, and then it officially came into being in 2016.
Okay.
But I can give you a bit of a backstory, maybe.
Please do, yeah.
Like how I really got into JavaScript, and got very excited about the web as a platform… And then I had basically this challenge for myself, okay, how can I combine these two different things that I really like? So I really wanted to do my academic work at the time, and at the same time I really liked – I kind of got really into web development, and just building web applications.
But I think the genesis – I mean, it is tied in some way to Node.js. I wasn’t there at the super-early stages, when Ryan Dahl did Node… But then there was this community, that evolved once [unintelligible 00:11:33.23] and the package manager, and the people started writing all kinds of packages for the JavaScript ecosystem. And I think that wasn’t there before. And it also unlocked thinking about JavaScript not just as this toy language of the web browser. It’s not this language that just is used for interactivity on websites.
Right.
[00:11:56.12] You could think suddenly it can power a web server, but it can also be in a terminal, if you need to do something; you can just use it. And so that’s kind of in the background there… And after my bachelor’s I didn’t really know what to do. I did a bachelor’s in economics, I didn’t really like it, and so I pivoted, so to speak, to statistics, and I did a lot of [unintelligible 00:12:16.28] programming. That’s kind of the preferred language for like statisticians. And it has some nice benefits, but also some downsides, as all things do. But I ended my master’s, and while doing that, we were working on a project to kind of model the political – the sentiment around different political topics on the platform, I guess now X, formerly Twitter. We did a scraper to scrape newspaper articles from all kinds of sources. And then we extracted that information, and I used MongoDB as a document store. And that has a JavaScript REPL environment, which you can interact with it…
So that was what I was doing, was trying to kind of do topic modeling, which is kind of probabilistically modeling these topics, and we added a time dimension such that you can see how do they emerge over time, and then maybe also collapse again if they – if the news cycle is very short-lived, where there’s a few topics that blow up, and then they suddenly disappear again.
And what I was doing though – so I needed to do some calculations. And I was already using Mongo, and I had to use JavaScript kind of to interact with the Mongo app, and I also had to do the scraping. And for the scraping – at the time; now it has like [unintelligible 00:13:38.21] that whole ecosystem wasn’t really there at the time. It was very in the early stages as well. So now it’s very grown, and there’s a lot there, but at the time it was just like working a DOM tree, and extracting content from a website, you couldn’t do in R. So again, I used I think jsdom [unintelligible 00:14:01.12] And in my job now at Socket [unintelligible 00:14:06.23] is my coworker, so it’s kind of interesting… The world is small and you always end up crossing paths again.
That’s pretty cool, right? Meeting people… I love it. I love it when you just use something for a long time, and you just kind of like remotely admire its creator, but maybe never interact with them. And then all of a sudden you’re like “Oh, wow.”
Yeah, and you never even meet, maybe.
Yeah.
That’s kind of how open source works. People just put their things out there, and then they take a life of their own at some point… Hopefully.
Yeah, hopefully. Or they die on the vine. But you know, not everything lives. It’s part of the circle of life. So did you say Athan? Was that your partner on this?
Yeah, Athan Reines. He’s my collaborator on there. And now the project has grown. I think we’re almost at 100 contributors now in total, and this summer we got accepted into Google Summer of Code for the first time…
Oh, cool.
…with four super-talented students working on different projects… I can touch base on that later, maybe. But yeah, so it’s just really like a collaborative effort, I just wanted to highlight that. And that’s what we want. I mean, we really hope to build a community of people, and people who are really engaged and want to push this project forward… Because it cannot really be done by a few people. It’s too ambitious in some ways.
[00:15:31.10] But yeah, that’s kind of maybe some learning. This was 2012 when we started this. So when I started getting into Node… And even that – I met Athan through GitHub. So I think he came across my repos, [unintelligible 00:15:43.22] I did some numerical things… And then he wrote me an email, and we took – there’s actually one other funny story there… Because he hired me for the summer. I wanted to do Google Summer of Code back in the day, because as a [unintelligible 00:16:02.01] doing a PhD, usually you get a nine-month salary, and during that time you do your research, and maybe you have to teach, or like do some other work for it… But then maybe there are grants to pay you for the summer, or you can also teach through the summer, you can do your own – but at the time I wanted to spend the summers in Germany, so I was trying to figure something out… And I was actually – now, again, another co-worker of mine now, Mikola Lysenko, who did a lot of very popular packages in kind of numerical computing, in JavaScript, [unintelligible 00:16:34.10] kind of multi-dimensional array one, which is used in tons of places, [unintelligible 00:16:38.18] and other packages… So he would have been my mentor for this Google Summer of Code project, which was about compiling – using Emscripten back in the day, to compile C libraries for linear algebra to JavaScript. And that didn’t happen. So it didn’t happen, and then this fell through, because we were trying to do it through the jQuery Foundation. Do you remember jQuery…?
Oh, yeah.
[laughs]
I’ve got some gray hairs over here. I know jQuery Foundation.
Yeah, yeah. So that [unintelligible 00:17:08.13] I mean, probably still; I don’t know how many websites still use it.
More than we think.
More than we think, right?
Yeah…
But yeah, so then Athan ended up just hiring me for the summer to work together with him. And that kind of sparked really like a long-time collaboration. And we have – yeah, it is a long road, and you see there’s always more to do, and it’s… You know, Rome wasn’t built in a day, right?
Right. Well, JavaScript was in 10 days, but… One of the things that happens when you have a language primarily in a browser is you have a hard time expanding the breadth of the language. And so one of JavaScript’s longtime downfalls has been a lack of any sort of standard library. I mean, most languages have one; some are better than others, some get old and crufty, other people maintain theirs… And JavaScript, prior to Node, there just wasn’t much there. And like you said, Node busted it out of browsers, gave people a reason to kind of take it more seriously as a language… And one of the things that serious languages have is standard libraries. A set of tools that aren’t part of the core language, but that are attached to it, and allow you to do all kinds of stuff that are common.
Now, your project obviously is a third-party thing. You’re not attached to the JavaScript language or the ECMAScript language. You are a standard library that is a package that’s installable. But how did you build it out? What did you decide to build? I mean, there’s so much stuff in there at this point. It seems like it was focused on maybe math at first, or maybe now it’s focused on math and numerical computing… But there’s so many utility functions and stuff that it’s beyond that. But that seems to be the focus.
Yeah, that’s like the – I get that question quite a bit. I mean, as you said, JavaScript historically didn’t have like a standard library, compared to other language ecosystems. And that kind of limited significantly what was possible. And you also mentioned the challenge that basically the language being in the web browser and having that unique position, that kind of means that it also evolves at a different pace than other languages. So it has to go like through TC39 committees, which [unintelligible 00:19:21.19] changes… And so it’s very much like – like, maybe things are changing now, because there are all of these JavaScript runtimes now… There’s Bun, which is just forging ahead and just doing stuff, and forcing the hand of many other players… So there’s this proliferation of JavaScript runtimes. And Deno as well, and like TypeScript getting full support…
[00:19:48.08] So I think maybe things have changed a little, but back in the day when we started, we were just thinking “Alright, so there’s definitely this lack of a standard library in JavaScript.” And we really wanted to provide a really rigorous and high-quality set of tools and functions… And also, have everything in a way that’s fully documented, where we can have – you know, we strive to have 100% code coverage, everything’s fully documented, and everything’s coherent… So we really wanted to do something where we do not have to depend on external kind of tooling or dependencies.
Now I’m working at Socket, so we do this supply chain security for companies… So I kind of – there’s always when you depend on third party code, that’s always just the worst, because things can break, things can change. And of course, it’s also like the biggest unlock for like productivity, because you can just draw on all the – you can stand on the shoulders of giants, and the entire work that people have done.
But first, it was really like – we needed certain functionality, and it was just not there. And so we were like “Okay, we will kind of build this out on our own”, and to really ensure “Okay, this is high-quality code, that we can depend on for the future.”
And I would say, this focus certainly I think, in my mind at least, has shifted to “Okay, we really want to provide that numerical and statistical computing infrastructure.” Because there is – at least I think the sentiment has shifted in some sense, because now with all the language features starting with ES6, all the additions to JavaScript, people are just less complaining about the lack of tooling that is baked into the standard library, and there has been more standardization about kind of what the web APIs and what’s available I think as well. So I don’t think that’s our main selling point… But we do offer more than just numerical and statistical computing. That is certainly true. And if you want to do text processing, or like you have some other tasks with natural language processing, or just general utilities, working with arrays, working with multi-dimensional arrays specifically, which is like a fundamental building block you need for any kind of numerical or statistical work, that’s all kind of part of the library. And part of these things are never gonna come to JavaScript in all likelihood either. Given the kind of environment the language lives in, there’ll likely not be support for those things in the future.
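To make the "multi-dimensional arrays as a building block" point concrete, here is a minimal sketch of the general strided-array idea: a flat typed array plus a shape and strides that say how to step through it. The `createView` helper is hypothetical and written only for illustration; it is not stdlib's ndarray API.

```javascript
// Hypothetical minimal 2-D strided view over a flat buffer (not stdlib's ndarray API).
function createView( buffer, shape, strides, offset ) {
	return {
		'shape': shape,
		'get': function get( i, j ) {
			// Map the (i, j) subscripts to a position in the flat buffer:
			return buffer[ offset + ( i * strides[ 0 ] ) + ( j * strides[ 1 ] ) ];
		}
	};
}

var data = new Float64Array( [ 1, 2, 3, 4, 5, 6 ] );

// Interpret the six values as a 2x3 matrix stored in row-major order:
var matrix = createView( data, [ 2, 3 ], [ 3, 1 ], 0 );

// Swap the shape and strides to view the same memory as the 3x2 transpose, without copying:
var transposed = createView( data, [ 3, 2 ], [ 1, 3 ], 0 );

console.log( matrix.get( 1, 0 ) );     // row 1, column 0 of the matrix
console.log( transposed.get( 0, 1 ) ); // the same underlying element, seen through the transpose
```

The design choice worth noticing is that the transpose costs nothing: only the metadata changes, while the buffer is shared.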
Break: [00:22:20.00]
So as of the GitHub readme, 150+ special math functions, a bunch of probability distributions, there’s a lot in there for data viz, and then just 200+ general utilities, if you’re doing data transformation, functional programming… That’s kind of the stuff where as the platform has gotten better, some of that stuff has just gotten less useful, because it’s built right in… And that’s great. There’s a lot of stuff in there that’s probably not still built in, but there’s a lot of functional abilities inside JavaScript, the language…
But maybe even sometimes you want like a functional API, for example; a lot of things in JavaScript are hanging off of the prototypes. So there’s still value. Even if the utilities I would say that are now part of the standard library, there’s still good reasons to have those utilities. Plus for documentation purposes, because for example we do have like a REPL environment, and you can look at Help pages, you can look at examples… And we really want to have a coherent kind of experience there.
Let’s say you do some data transformation and you pluck elements from an array, or something like that. You can write – we just wanted to make sure that you can just call the help function on that and you will see the documentation for that specific function. And there’s no discrepancy; all functions have that, basically. So it’s a very good form, and design, and it puts an onus on us to make sure that that’s all there. But we hope that in the end it is for the end user experience itself.
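As a rough illustration of the difference between a method hanging off a prototype and a standalone functional API, the sketch below extracts one property from each record both ways. The `pluck` function here is a hand-written, hypothetical helper for this example; stdlib's own utility may have a different signature and options.

```javascript
// Hypothetical standalone helper written for this example only.
function pluck( arr, prop ) {
	var out = [];
	var i;
	for ( i = 0; i < arr.length; i++ ) {
		out.push( arr[ i ][ prop ] );
	}
	return out;
}

var records = [
	{ 'name': 'a', 'value': 1 },
	{ 'name': 'b', 'value': 2 }
];

// Prototype style: the operation is a method on Array.prototype.
var viaMethod = records.map( function getValue( r ) {
	return r.value;
});

// Functional style: a plain function that takes the array as its first argument.
var viaFunction = pluck( records, 'value' );

console.log( viaMethod, viaFunction );
```

A plain function like this can be required, documented, and tested on its own, which is presumably part of why it pairs well with a per-function help system.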
Do you know how many functions are in the standard library, roughly?
So one of the goals of the project is to be fully decomposable. So one thing we decided early on is what we want to do differently than other projects, like say NumPy, SciPy in Python… It’s really like, okay, every package can be independently consumed; it has defined all the resources that are located in the package, or specifically depend on other packages… And we have right now 3,000-something… I mean, you can look – we actually push a generated GitHub repository for each individual package in stdlib, and right now if you go to the stdlib-js organization on GitHub, it has roughly 3,800 repositories, or something like that. So it has grown quite a bit.
Okay. Do you do that so that they can be installed individually? Or why are you doing that?
Yeah, so you can actually go to npm and find any of the packages individually. [unintelligible 00:26:17.09] install the whole library, or like a different namespace, organize the namespaces… So if you want all the statistics functions, or you want all math special functions that are part of the library, those can be installed together. And if you use a bunch of them, that’s preferable, but if you just want to have a few, you can just install those individual packages. And we generate the GitHub to both surface the code, so it’s easy to parse the code, there’s a [unintelligible 00:26:43.16] like what has changed for that specific package… It has information about – we also generate bundles now for Deno, or like ESM, so like ES Modules… The library is still ES5, so very old school. But people can generate bundles for ESM [unintelligible 00:27:02.19] So it’s just to allow for different ways of consuming it.
Yeah, that’s great. That’s great. That legacy, that old stuff just makes me think that you’ve gone through so many transitions probably, from 2012 till now, roughly, as you’ve built this thing out. Surely, there’s been some pain in that process, right?
I mean, as I mentioned, we have been very conservative in kind of not adopting kind of new language features. And again, that is a double-edged sword. It is actually more performant. We try to be most idiomatic to C code. If you write code like if it was C, with loops instead of like the functional map, or filter, and stuff like that, you usually write better-performing code, because the compilers can optimize it better, and you avoid those multiple iterations etc.
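The contrast between chained functional methods and a C-style loop can be sketched as follows. Both functions compute the sum of squares of the positive values; the chained version walks the data three times and allocates two intermediate arrays, while the loop makes one pass. Whether the loop is measurably faster depends on the engine and workload, so treat this as an illustration of the style rather than a benchmark.

```javascript
// Functional style: three passes, two intermediate arrays.
function sumSquaresFunctional( x ) {
	return x
		.filter( function isPositive( v ) {
			return v > 0;
		})
		.map( function square( v ) {
			return v * v;
		})
		.reduce( function add( acc, v ) {
			return acc + v;
		}, 0 );
}

// C-like style: a single pass with no intermediate allocations.
function sumSquaresLoop( x ) {
	var sum = 0;
	var v;
	var i;
	for ( i = 0; i < x.length; i++ ) {
		v = x[ i ];
		if ( v > 0 ) {
			sum += v * v;
		}
	}
	return sum;
}

var values = [ -1, 2, 3 ];
console.log( sumSquaresFunctional( values ), sumSquaresLoop( values ) );
```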
[00:27:57.27] So we really try to write performant code, and also not have varied interfaces and functions, because that prevents compiler optimization. So if you always know the arguments passed to this function have the same types, then that will be good for the JavaScript runtime to kind of compile the code efficiently. And that’s kind of – so that was a conscious decision, and we didn’t really like – I mean, we’re not opposed… I mean, I think it’s great how JavaScript has evolved, and all the new capabilities… But in the end, it doesn’t matter right now, for our end user. If you use us, it doesn’t really matter if you write a React application, if you write modern ESM, if you write TypeScript… We have TypeScript declarations that we ship for all packages as well, so they all are typed… So you will not really see it.
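One way to picture "varied interfaces" versus stable argument types is the sketch below. The function names are hypothetical and chosen for this example. The flexible version accepts several input types and returns different types accordingly; the narrow versions each take one kind of input, so every call looks the same to the runtime. How much a given engine benefits from this is an assumption here, not something measured.

```javascript
// Varied interface: accepts a number, a numeric string, or an array of numbers.
function absFlexible( x ) {
	var out;
	var i;
	if ( typeof x === 'string' ) {
		return Math.abs( parseFloat( x ) );
	}
	if ( Array.isArray( x ) ) {
		out = [];
		for ( i = 0; i < x.length; i++ ) {
			out.push( Math.abs( x[ i ] ) );
		}
		return out;
	}
	return Math.abs( x );
}

// Narrow interface: always a number in, always a number out.
function abs( x ) {
	return Math.abs( x );
}

// Narrow interface for arrays: always a Float64Array in, always a Float64Array out.
function absArray( x ) {
	var out = new Float64Array( x.length );
	var i;
	for ( i = 0; i < x.length; i++ ) {
		out[ i ] = Math.abs( x[ i ] );
	}
	return out;
}

console.log( absFlexible( '-2' ), abs( -2 ), absArray( new Float64Array( [ -1, 2 ] ) ) );
```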
And also, the one main advantage you’d probably have with ESM, if you use import syntax, is that you get tree shaking. But as I mentioned, deployment is already fully decomposable, and modular, so you can just install and require what you need, and you basically don’t need to rely on anything.
But yeah, we will at some point [unintelligible 00:29:10.17] but since there’s so much – it is, as you said, a big pain. So we have still like the var instead of const and let declarations in the readmes, and we don’t want it necessarily, because people have moved on in some ways… And so it would be nice to change that. But yeah, it’s a bit of a challenge. It’s a big project.
Right. Yeah, totally.
Maybe we can do some transformations, write some scripts to – I don’t think asking an LLM… That would probably introduce subtle bugs, and we don’t want to do that.
No. There are certain things that are more important. And then you’d also have to decide, should we use const or let everywhere? And you and your team can probably argue about which ones you should use given specific nuanced contexts of 3,800 functions.
Yeah. There’s a lot, so it would need to be decided. I mean, we actually need to look at the code, where does it – but you would probably do something like [unintelligible 00:30:06.11] Or something like that, I don’t know.
Well, that’s way too reasonable, Philipp. You have to argue about it more than that. [laughter]
Yeah, I don’t know why that is, but it seems like at least specifically in the JavaScript ecosystem there’s always a lot of heated debates. I think other ecosystems seem calmer.
Yes. But I think the proper term is bike-shedding, isn’t it? I mean, heated debates about small things that don’t matter all that much.
Yeah, you don’t want to get bogged down and lose the big picture.
Yeah. Well, you’ve got a lot of stuff to do. Are you guys still building it out, or does it feel pretty much like “We’ve covered the breadth that we want to cover, and we’re just maintaining”?
Yeah, so as I said, I’m super-excited that we got into Google Summer of Code [unintelligible 00:30:55.26] we have some great projects currently underway… We have a contributor, Sunil Shah, who’s working on basically supercharging our REPL. There’s a lot of new features coming along there. And we also have another project building out fully our linear algebra kind of functionality, with [unintelligible 00:31:15.17] So that’s coming along. And also, we didn’t have all of it now, but after this period we will probably have C implementations for all the math functions in the library also, so you can use that with Node add-ons and get really good performance. I mean, JavaScript itself is decently performant, but you can get maybe a 2x of C, and stuff… But if you really need performance, why not use the capabilities of just running it in a native add-on.
[00:31:49.05] So that will all come along. But there’s still – I mean, I would say it was a bigger undertaking than initially envisioned, and just because of the constraints we’ve put on ourselves to really ensure things are fully documented, fully tested, that we can stand by our claim “Okay, this is high-quality code you can depend on”, that put a lot of the onus on us… And of course, we could have moved faster probably otherwise, but it’s always a trade-off; you need to decide what you want to do. But we are coming along well, and we’re very excited to have [unintelligible 00:32:18.07] decent contributor base of people who work on the project. We just hope that continues, and we’re definitely in it for the long term.
Yeah. Well, you’ve proven that so far. You’ve been in a long time already, and why stop now? I guess maybe burnout would be the reason, but it sounds like you have a decent team. So it’s not just yourself, it’s not just your partner in crime; there’s other people working on it. I would like to take a chance at this potentially being too nerdy and too technical of a question… But can you explain how the decomposability works? Like, how it’s architected so that everything can be decomposed, and reflected… I’m sure it’s very aware of itself in order to do all this stuff that you’re doing, like breaking out separate packages, and all this stuff that you’re doing… How does that work? How is that built?
So first of all there’s the organization of how the library is authored, right? Basically, we use the Node.js module resolution algorithm. Actually, in the Node modules folder – every package has its own kind of Node Package, basically, internally. And then you need to make sure, okay, if I’m writing my files, I only require – I don’t do relative requires of files that aren’t collocated in a package. So that’s what we do.
And we have an explicit dependency that gets – where we import a require function, or whatever it is we need, from a different package. But then that’s explicitly noted in the package. And then we have a whole tooling pipeline. There’s a lot of GitHub Actions workflows that analyze the code, and make sure that “Okay, which packages are used in there?” So then those get autopopulated; the package.json files to have those in there. And as I mentioned, we push to different repositories on GitHub, so we generate basically all these individual repositories, and publish the packages from there.
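The "analyze the code, find which packages are used, and autopopulate package.json" step described above could look roughly like this. It is a hypothetical sketch, not stdlib's actual tooling: the `@example/...` package names, the version range, and both helper functions are made up for illustration, and a regular expression is a much cruder tool than a real parser.

```javascript
// Matches require calls whose specifier is a scoped package, e.g. require( '@scope/name' ).
// Relative specifiers such as './helper.js' do not start with '@' and are skipped.
var REQUIRE_RE = /require\(\s*['"](@[^'"\/]+\/[^'"]+)['"]\s*\)/g;

// Hypothetical helper: returns the sorted, de-duplicated scoped specifiers found in a source string.
function findPackageDependencies( source ) {
	var found = {};
	var match;
	REQUIRE_RE.lastIndex = 0;
	while ( ( match = REQUIRE_RE.exec( source ) ) !== null ) {
		found[ match[ 1 ] ] = true;
	}
	return Object.keys( found ).sort();
}

// Hypothetical helper: returns a copy of a package.json object with its dependencies filled in.
function populateDependencies( pkg, deps, version ) {
	var out = JSON.parse( JSON.stringify( pkg ) );
	var i;
	out.dependencies = {};
	for ( i = 0; i < deps.length; i++ ) {
		out.dependencies[ deps[ i ] ] = version;
	}
	return out;
}

var source = [
	'var isNumber = require( \'@example/assert-is-number\' );',
	'var abs = require( \'@example/math-abs\' );',
	'var helper = require( \'./helper.js\' );',
	'var absAgain = require( \'@example/math-abs\' );'
].join( '\n' );

var deps = findPackageDependencies( source );
var pkg = populateDependencies( { 'name': '@example/demo', 'version': '0.0.0' }, deps, '^0.0.0' );

console.log( JSON.stringify( pkg, null, 2 ) );
```

The relative require of `./helper.js` is deliberately ignored: files that live inside the package need no dependency entry, while everything reached through a package name does.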
So it’s basically like we have to publish like 3,000 packages, but we intentionally – there’s been history there… When we started this, we started just writing different packages for like doing stuff… Back in the day there was a company called Compute.io, so we had some different packages there… And we manually had to do all that. Like okay, we want to publish a patch to that package. Okay, we run the npm publish command. But then the problem becomes, again, if you need to make an edit across all packages or something like that, suddenly you have a bit of a problem, because we have 300 repositories where there’s a lot of manual labor involved just to make one change. So that caused us to move to this monorepo, and just say “Okay, we’re going to have everything in the stdlib kind of development repo, do all things there, and then have just a lot of tooling and workflows that handle the independent publishing.”
Yeah, I don’t know how many of the prolific maintainers – I don’t know, like Sindre Sorhus or someone who has to maintain so many hundreds of packages, how do they do it? I bet they also come up with their own automation and solutions to this… Because it’s just a big challenge actually if you have many packages to maintain, and make sure that you can push a fix out if needed etc.
[00:35:52.06] Yeah. I’m sure it’s a huge challenge. So if somebody wanted to dig into that architecture, and maybe use it for their own stuff, is the place to look probably the GitHub Actions of the monorepo? Is that where most of that –
Yeah, that’s where most of our tooling is. We also have make, certain make recipes, that are in a tools folder that contains a lot of tooling and libraries. But you can check out the stdlib-js GitHub organization and the main repo at stdlib, and that has – yeah, and the lib/node_modules folder where all the packages are organized. And that’s [unintelligible 00:36:28.15] we need to do the code transformations to actually transform that, too. But right now at least we still only publish CommonJS to npm. There’s the whole debate of dual exports, and ship multiple bundles, and all that kind of stuff… And it’s actually quite complicated. I think the whole transition from CommonJS to ESM in Node proved a bit more complicated than people had hoped. It is where we are.
Yeah. Is your stance just to wait it out? Is what you’re doing?
That’s kind of where we are right now. Wait it out until it’s like fully there. Because right now, as I said, you can use this just fine in a project that uses ESM, and you wouldn’t gain much if it was ES Modules. But yeah. And we do offer ESM bundles now, which we generate as well. And we will try to keep up with things, and as things progress, we might ship more, or like it might change how we publish our individual composed modules… But we will see. It’s certainly not something where we feel like we have to follow the latest trend immediately.
I think already JavaScript suffers a lot from – like, there are so many things. Every year there’s a new framework… So many things that you have to deal with, and learn. So it’s really about trying to be backward compatible, mostly, and trying to make sure that if you wanna depend on us, you can.
Yeah. I mean, that’s what people want from a standard library project. They want dependable, maybe even a little bit boring, well thought out, well documented, high-quality code… But bleeding edge out of your standard library - those two things are kind of at odds with each other, so I think you’re taking the right stance there.
I’m on the GitHub page, 360 open issues, 88 open pull requests… So much has gone into this. Almost 50,000 commits. I don’t even know if you know that. You’re getting close to 50,000. Do you guys track these things?
Yeah, that will be a new milestone, once we’ve reached that. And most of those – I mean, we have a few things that are automated, but the majority is actually work done by hand.
So huge project, tons of effort has gone into it… And then clearly, a lot of effort is going to continue to go into it, especially you’re getting things like Google Summer of Code, which should be an excellent boost to what you all are working on. How do you prioritize, how do you pick what to work on? Who triages issues? I mean, this seems like a full-time job for some folks, and I know you have a full-time job, so…
Yeah, so that’s certainly a challenge. Luckily, Athan has support from his employer, Quansight, to kind of work on stdlib, so there’s buy-in there… And I’m using a lot of my free time to invest in this… But I’m also luckily working at a company that is a big believer in open source, a lot of people there… Our CEO, Feross Aboukhadijeh, is a big name in open source [unintelligible 00:39:37.18] and all kinds of other things, and has now founded Socket… So I’m really grateful to be working in an environment where open source is a thing people support.
Right.
[00:39:50.20] But of course, any open source project can only really work with a community, and through maintainers, and we’re always looking for new folks who want to join and help us out. There’s tons of things to deal with on writing implementations, doing documentation stuff, doing more – we need to get better at kind of putting the word out there… Because, yeah, we’ve come a long way, but we still haven’t really been [unintelligible 00:40:16.11] and have that conversation with you… But those are all things that should be done… And so yeah, always exciting if new people kind of come on board.
We do hope, of course, that we’ll also find more corporate backing, and that this becomes a thing that is depended on… I mean, it’s heavily used, but we obviously want to be depended on by major corporations etc. in the libraries, and then kind of get buy-in through that.
But in the grand scheme of things, JavaScript is still a nascent kind of environment for like numerical and statistical computing things, so most people just by default would opt for, let’s say, Python. And crossing that chasm, and just making sure that everything is there that people would expect out of the box, right? I think that that’s a very huge unlock, because then there’s no reason why you cannot use JavaScript to do most of the things that people might now use Python for, for example, in the kind of data science, statistics, ML.
Yeah. It seems like – are there frameworks and toolkits that would be slightly above you and using you? I’m thinking like – and I have never even looked at TensorFlow.js, but I know that might be a thing, where it’s like, okay, it’s TensorFlow in JavaScript. I assume they have some numerical computing going on in there… Are there projects that people are using, or could be using to do this stuff, that are similar to like PyTorch, and that kind of thing?
Yeah, I mean, that’s certainly something we will also explore. So I don’t think, at least in terms of libraries that would use us under the hood for that thing… But we have – for example, we have been in talks with [unintelligible 00:42:08.04] and we’re working with him to potentially use our multi-dimensional array implementations to power kind of [unintelligible 00:42:17.27] going forward. So that’s something we’re actively exploring. And there’s also – because JavaScript is very fragmented, so there is opportunity to think about, okay, we have some… There are some libraries which are not really maintained anymore either; for example, jStat is one of the earlier statistical libraries for JavaScript that got pretty popular… And [unintelligible 00:42:39.09] is maintaining this, and I think would be very happy to hand off some of that work, so we have been talking with him, “Can we make sure that it gets further improvements and updates?” and we kind of can provide some of the underlying foundational implementations for that.
So I think that there’s opportunity there to kind of standardize the ecosystem hopefully a little, and make sure that there’s not – and of course, we would need to prove that we provide the right building blocks, and that you gain something from depending on us, versus just going out on your own… But I think we can make that case, because we really made sure – like, everything is numerically accurate, and you can really depend on the implementations. You don’t get wrong results… So it’s hopefully a good foundation for numerical computing in JavaScript. But that’s also something we need to do more in the future.
Well, there’s always more to do. Yeah, shout-out to Quansight Labs for sponsoring and supporting… Of course, shout-out to Feross and Socket… Listeners of JS Party know Socket and Feross very well, because they’re longtime friends. And of course, Socket also sponsors some of our work here at Changelog, so we thank them for that as well. In fact, Brian Zelip, when he wrote in asking us to have you on the pod, he even said that you’re extended JS Party family just by being a Socket employee…
[00:44:03.04] So lots of love, lots of love going around. We’d love to get more love onto this project. Do you have a GitHub Sponsors? Is it easy? Or is there an Open Collective? How do people actually –
Yeah, there’s an Open Collective, so if people want to sponsor us – and we certainly welcome financial sponsorship as well. But also, if anyone is interested just to contribute to the project, or just using it… We’re very excited for, and always looking forward to that, and talking to people.
Also, in our Gitter channel, which is linked from the repository, we do right now have office hours once a week…
Oh, that’s cool.
So if they’d like to stop by, and if they just have questions for how to start contributing, for example, they can do that. And yeah, we’d just be very excited to see.
Very cool. What are some waypoints for folks who maybe aren’t doing numerical computing, but they’re app developers, maybe they’re using React, maybe they’re building Express server-side apps… What are some places where like “I need this in my life, and stdlib has it”? Are there specific tools that are more popular than others? Just for Jane and Joe developer.
Yeah, now things seem to be coalescing around full-stack JavaScript development. Maybe you have an Express server, or you have a MongoDB data store or something, and use React for the frontend, and maybe use TypeScript or pure JavaScript… But I think we see more and more of the one-language stack, which was one of the big draws for me for JavaScript. And once you have that, you do want to share code across the client and the server, and maybe now you need to do some data manipulation, so then you could use some of the [unintelligible 00:45:43.05] we have in the library. If you need to do text transformations, we have a lot of things to work with strings… And in a way that’s more advanced than if you just use what comes out of the box.
And maybe you do want to display some statistics on your website, or something like that. Maybe you don’t want that work to happen on the server. There’s always a trade-off… If you need to have server-side rendering for SEO purposes, then okay, do all the work on the server to ship the final HTML to the client. But in many cases, you don’t want to do that. Maybe you have a web app that has like a dashboard with some graphs in it, and it displays some statistics, or has some numerical features in there. You can use some of the libraries or packages that are part of stdlib to run statistical tests, to calculate the mean, median, or other numerical measures in a very efficient way. And you can do these calculations on the client side. Then you either don’t need a beefy server or, if you have a serverless setup, you don’t risk getting charged a bunch because you make all these function calls in the cloud.
[00:46:58.07] So that’s where I think the main advantage really is: just being able to do all these computations on the client. So if people have any of those needs, and would like to do some of the things that you would find in Python, or in R, we hopefully have you covered. And if not, we try to prioritize things that our users want.
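As a rough illustration of the client-side statistics described above, here is a minimal sketch in plain JavaScript. The `mean` and `median` helpers and the `responseTimes` data are hypothetical, written only for this example; they are not stdlib’s API, and stdlib’s own packages for these measures are what you would likely reach for in a real app.

```javascript
// Hypothetical helpers for illustration only; this is NOT stdlib's API.

/** Arithmetic mean of a non-empty array of numbers. */
function mean(values) {
  if (values.length === 0) {
    throw new RangeError('mean requires at least one value');
  }
  let sum = 0;
  for (const v of values) {
    sum += v;
  }
  return sum / values.length;
}

/** Median of a non-empty array of numbers (does not mutate the input). */
function median(values) {
  if (values.length === 0) {
    throw new RangeError('median requires at least one value');
  }
  // Copy before sorting, and sort numerically (the default sort is lexicographic).
  const sorted = Array.from(values).sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Assumed example data: response times in milliseconds for a dashboard widget.
const responseTimes = [120, 95, 310, 101, 99];
const summary = { mean: mean(responseTimes), median: median(responseTimes) };
console.log(summary); // { mean: 145, median: 101 }
```

Because this runs entirely in the browser, no server round trip or cloud function call is needed to produce the summary.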
Sure. Very cool. Well, stdlib.io. Of course, that links out to the GitHub. We’ll have the links to all those things in our show notes for everybody. What else is going on coming down the pipeline? Of course, Google Summer of Code will be huge. But do you have anything else burgeoning? You mentioned pre-show there’s a survey of some kind…
Yeah, so if you go to stdlib.io/survey - and that’s not live right now, but it will be shortly - we will have a little survey for people who are interested in statistical numerical computing in JavaScript… And kind of just to learn a bit about what kind of applications people are looking at, kind of what would they expect, just to get a general sense… And yeah, if anyone is interested and would fill that out, we’d be very grateful.
And what else is coming down the pipe - Google Summer of Code is in full swing, and recently we’ve focused more on fully building out that linear algebra functionality, and finishing our automation story there as well, so that you can do it… Because you need to have all these functions then available to directly operate on multi-dimensional arrays, so that you really have all the building blocks you need. If you want to do more advanced machine learning, or other algorithms… Once we have all those things in place, it kind of unlocks a lot of other use cases down the road.
Awesome stuff. Well, Philipp, thanks for taking the time out of your day to hang out with me and tell me all about this really cool, long-standing, and very feature-rich project, standard library for JavaScript and Node.js. And thank you again to Brian Zelip for requesting this episode. By the way, you - yes, you - can also request episodes to JS Party. Go to jsparty.fm/request. There you’ll find a form where you can give us a tip, you can tell us a guest you want to have on, a topic… You can even select your JS Party panelists and hosts. Do that. We read them all. We make a lot of episodes based on your requests, because we want to serve you, our audience. We don’t always turn every suggestion into an episode, but we read them all, and so you’ll have our eyes and our ears… And perhaps you’ll have an episode - thank you to you - happening soon as well.
So that’s all from me. On behalf of Philipp Burckhardt and the standard library project, I’m Jerod and this is JS Party, and we’ll talk to you all on the next one.