A lot of effort goes into training AI models, but for those of us who actually want to run AI models in production, performance and scaling quickly become blockers. Nikita from MemSQL joins us to talk about how people are integrating ML/AI inference at scale into existing SQL-based workflows. He also touches on how model features and raw files can be managed and integrated with distributed databases.
Databases are tricky, especially at scale. In this episode Mat, Jaana, and Jon discuss different types of databases, the pros and cons of each, along with the many ways developers can have issues with databases. They also explore questions like, “Why are serial IDs problematic?” and “What alternatives are there if we aren’t using serial IDs?” while at it.
Mat, Johnny, and Jaana are joined by Francesc Campoy to talk about graph databases. We ask all the important questions: What are graph databases (and why do we need them)? What advantages do they have over relational databases? Are graph databases better at answering questions you didn’t anticipate? How is data structured? How do queries work? What problems are they good at solving? What problems are they not suitable for? And…since we had Francesc on the hot seat, we asked him about Just for Func and when it’s coming back.
Computer Scientist Yaw Anokwa joins the show to tell us how Open Data Kit is enabling data collection efforts around the world. From monitoring rainforests to observing elections to tracking outbreaks, ODK has done it all. We hear its origin story, ruminate on why it’s been so successful, learn how the software works, and even answer the question, “are people really using it in space?!” All that and more…
This week we talk with Manish Jain about Dgraph, graph databases, and licensing and re-licensing woes. Manish is the creator and founder of Dgraph, and we talked through all the details. We covered what a graph database is, the uses of a graph database, and how and when to choose a graph database over a relational database. We also talked through the hard subject of licensing and re-licensing. In this case, Dgraph has had to change its license a few times to maintain its focus on adoption while respecting the core ideas around what open source really means to developers.
Adam is on location at ZEIT Day talking with Jessica Rose about burnout, Henry Zhu about his passions and pursuit of open source, and Simon Willison about data and his passion for interesting datasets in the world.
Matt Jaffey joined the show and talked with us about Pilosa, building a distributed index with Go, and other interesting projects and news.
Philipp Krenn joined the show to talk with us about Elasticsearch, the problem it solves, where it came from, and where it’s at today. We discussed the query language, what it can be compared to, whether it’s a database replacement or a database complement, and Elasticsearch vs. Elastic the company.
We also talked about the details behind Elastic’s plan of “doubling down on open” to open up X-Pack, a set of open-code, paid add-on features for Elasticsearch. We discussed the implications of this for their business model, and what changes will take place at the code and license level on GitHub.
We went back into the archives to conversations we had around blockchains and databases at OSCON 2017. We talked with Monty Widenius (creator of MariaDB), Brian Behlendorf (Executive Director of Hyperledger), and Tague Griffith (Head of Developer Advocacy at Redis Labs).
Business Source License (BSL) is there to be a bridge between closed source and open source. - Monty Widenius
Mike Glukhovsky joined the show to talk about the future of RethinkDB. Mike co-founded RethinkDB alongside Slava Akhmechet. RethinkDB officially shut down a year ago, on October 5, 2016, and today we’re talking through all the details with Mike: the shutdown, getting purchased by the CNCF, relicensing, buying back their IP and source code, community and governance, and some specific features that Mike and the rest of the community are excited about.
Mark Nadal joined the show to talk about his hacker story and his venture-backed open source datastore project called GunDB, a realtime, decentralized, offline-first graph database engine. We talked about the details behind this database, how Mark secured funding, why the world needs yet another datastore, who’s using the database, how Mark plans to sustain this project through products and services, his thoughts on the RethinkDB postmortem, and more.
This episode is part of our remastered greatest hits collection and features Richard Hipp, the creator of SQLite, talking with us about its history, where it came from, why it has succeeded as a database, how its development is sustainably funded, and the how and why of it being the most widely deployed database engine in the world.
MacLane Wilkison and Michael Egorov, the creators of ZeroDB, joined the show to talk about ZeroDB, an end-to-end encrypted database (protocol): why it’s open source, how it’s different from other encryption techniques, the performance of running encrypted queries, and an interesting topic called proxy re-encryption.
Sameer Al-Sakran and Tom Robinson from Metabase joined the show to discuss Metabase - their open source tool that’s laying the foundation of their goals for open source business intelligence.
Slava Akhmechet joined the show again to catch us up on RethinkDB and the awesome progress they’ve made to power the realtime web. We talked about innovation in databases, compared and contrasted RethinkDB with pub/sub, Pusher, and NoSQL, and even discussed The Next Big Thing™ in databases.
Ilya Grigorik joined the show to talk about GitHub Archive, logging and archiving GitHub’s public event data, and how he uses Google BigQuery to make querying that data accessible to everyone.
Wynn sat down with Andy Gross and Mark Phillips of Basho and John Nunemaker of Ordered List to talk about Riak, Riak Search, and moving an open source community to GitHub.
John Nunemaker joined the show to talk about open source, improving your craft, building a business, and how MongoDB has changed his life.