We've seen a rise in interest recently, and a number of major announcements, related to local LLMs and AI PCs. NVIDIA, Apple, and Intel are getting into this space, along with models like the Phi family from Microsoft. In this episode, we dig into local AI tooling, frameworks, and optimizations to help you navigate this AI niche, and we talk about how this might impact AI adoption in the longer term.
Daniel Whitenack: Yeah. I was at a conference last week, and I was asked which direction things would be going - local AI models, or hosted in the cloud… And I think the answer is definitely both, in the same way that - if you just think about databases, for example, as a technology - there's a place for embedded local databases that operate where an application operates, there's a place for databases that run at the edge, but on a heavier compute node that serves some environment, and there's a use case for databases in the cloud. And sometimes those even coexist, for various reasons.
In this case, we're talking about AI models. So I have a bunch of files on my laptop; I may not want those files to leave my laptop, so it might be for privacy reasons that I want to search those files or ask questions of those files with an AI model. So it's a privacy and security type of thing; or in a healthcare environment things may have to be air-gapped, or offline; or a public utilities sort of scenario, where you can't be connected to the public internet… But then it might also just be because of latency or performance, inconsistent networks, or flaky networks, where you have to operate sort of online/offline… There's a whole variety of reasons to do this. But yeah, there's also a lot of ways that, as you said, this is rapidly developing, and people are finding all of these various ways of running models at the edge. And we can highlight - if you're just getting into this now, and getting into AI models, maybe you've used OpenAI's endpoint, or you've used an LLM API… If you wanted to run a large language model, or an AI model, on your laptop, there are a variety of easy ways to do that. I know a lot of people are using something like LM Studio - this is just an application that you can run and test out different models…
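To make that switch from a hosted API to a local one concrete: LM Studio (like several of the tools discussed next) can expose an OpenAI-compatible server on your machine, so existing client code often only needs its base URL changed. A minimal sketch, assuming LM Studio's local server is running on its default port (1234) with some model already loaded; the model identifier below is a placeholder:

```python
# Sketch: pointing the OpenAI Python client at a local, OpenAI-compatible
# server (e.g. LM Studio's local server mode) instead of api.openai.com.
# Assumes the server is running on localhost:1234 with a model loaded;
# the model name is a placeholder for whatever your server reports.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # local endpoint instead of the hosted API
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder model identifier
    messages=[{"role": "user", "content": "Summarize the files I point you at."}],
)
print(response.choices[0].message.content)
```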
There's a project called Ollama, which I think is really nice and really easy to use. You kind of just spin it up; you can either use it as a Python library, or as a server that's running on your local machine, and interact with Ollama as you would an LLM API. And then there's things like llama.cpp, and a bunch of other things. These I would categorize as local model applications or systems, where there's either a UI, or a server, or a Python client that's geared specifically towards running these models locally.
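As a rough illustration of that Ollama workflow - a local server you talk to like an LLM API, or a thin Python client on top of it - here is a hedged sketch. It assumes Ollama is installed, its server is running locally, and a model has already been pulled (the llama3 name is just an example):

```python
# Sketch: chatting with a locally running Ollama server via its Python client.
# Assumes the Ollama server is running and the model has been pulled,
# e.g. `ollama pull llama3` (model name is only an example).
# Equivalently, you could POST the same payload to the local REST API.
import ollama

response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Why would I run an LLM locally?"}],
)
print(response["message"]["content"])
```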
And then there's a whole set of technologies - Python libraries, or optimization or compilation libraries - that might take a model that's maybe bigger, or not suited to run in a local or lower-power environment, and make it run locally.
[00:10:03.11] So if you're using the Transformers library from Hugging Face, you might use something like bitsandbytes as a library to quantize models and shrink them down… There's optimization libraries like Optimum, and MLC, OpenVINO… These have all been around for some period of time. Actually, I think in the past we've had the Apache TVM project on the show, and we talked about OctoML… So this is not a new concept, because we've been optimizing models for various hardware for some time. But these optimization or compilation libraries are also usually hardware-specific, so you optimize for a specific hardware target. Whereas some of these local model systems are more general-purpose, less optimized for specific hardware. I don't know if you've had a chance to try out any of these systems, Chris, running some models on your laptop…
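For a sense of what that quantize-and-shrink route looks like in code: with Transformers, bitsandbytes can load a model's weights in 4-bit form to cut its memory footprint. A minimal sketch, assuming a CUDA-capable GPU (bitsandbytes targets CUDA) and using a small Phi checkpoint purely as an example; any similar causal LM ID would do:

```python
# Sketch: loading a model in 4-bit with Transformers + bitsandbytes.
# Assumes a CUDA-capable GPU and a reasonably recent Transformers version
# (older releases may need trust_remote_code=True for Phi-3 checkpoints).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/Phi-3-mini-4k-instruct"  # example checkpoint; swap in your own

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store weights in 4-bit form
    bnb_4bit_compute_dtype=torch.float16,  # run the matmuls in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU(s)
)

inputs = tokenizer("Local models are useful because", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```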