This week we're joined by Jason Bosco, co-founder and CEO of Typesense, the open source Algolia alternative and the easier-to-use ElasticSearch alternative. For years we've used Algolia as our search engine, so we come to this conversation with skin in the game and the scars to prove it. Jason shared how he and his co-founder got started on Typesense, why and how they are "all in" on open source, the options and paths developers can take to add search to their project, how Typesense compares to ElasticSearch and Algolia, how to get started, the story of Typesense Cloud, and why they have resisted Venture Capital.
Jason Bosco: Yeah, so we came up with it mainly to mirror the cost of running the service with how much we charge users. That's one core principle we've held on to, because - sure, from a business perspective that's probably not the best idea, since you're very closely tied to your costs… But that's what we chose, in service of trying to make sure we offer something that's as affordable as possible.
So if you were to run, for example, Typesense on your own cloud account, we wanted the cost to be somewhat similar. And where we get savings is from economies of scale, essentially - running thousands of clusters ourselves - so both the management effort involved and the discounts you get with high spend. That's what we capitalize on, and then we pass those savings back to users, instead of trying to do value-based pricing, which is what I've seen some other SaaS companies do. Now, that does make the pricing a little bit more complicated, because people have to know how to calculate how much RAM and how much CPU they need… And that's why we added a little calculator which says "Just plug in the number of records you have and the size of each record", and then we'll calculate and give you a rough estimate of how much RAM you might need. That works out well for most use cases.
[01:00:08.08] If people have a dataset of size x, Typesense typically takes 2x to 3x of that in RAM, and that's the recommendation the calculator gives. And then for CPU, we just tell people to pick the lowest CPU available for that RAM capacity, and then as you start adding traffic, you'll see how much CPU is being used and we can scale you up from there. Or we say run benchmarks - if you already have high traffic in production, run benchmarks with similar kinds of traffic in a staging environment, see how much CPU you use, and then pick a good CPU. So that does make it a little bit more complicated to calculate CPU.
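For a rough sense of what that calculator does, here's a minimal sketch of the arithmetic described above - 2x to 3x of the raw dataset size. The function name and inputs are made up for illustration; this is not Typesense Cloud's actual calculator.

```typescript
// Sketch of the "2x to 3x of dataset size" RAM rule of thumb mentioned above.
// Function name and inputs are illustrative only.
function estimateRamGB(numRecords: number, avgRecordSizeBytes: number) {
  const datasetGB = (numRecords * avgRecordSizeBytes) / 1024 ** 3;
  return {
    lowGB: datasetGB * 2,  // ~2x the raw dataset size
    highGB: datasetGB * 3, // ~3x the raw dataset size
  };
}

// e.g. 1 million records at ~2 KB each -> roughly 3.7 to 5.6 GB of RAM
console.log(estimateRamGB(1_000_000, 2_000));
```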
And then there are other configuration parameters - like you can turn on high availability, which means we'll spin up three nodes in three different data centers, automatically replicate the data between them, and then load-balance the search and write traffic that comes in across all three nodes. So with the flick of a button, you have an HA service.
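On the client side, a multi-node setup like that is typically addressed by listing all the nodes, so the official typesense JS client can retry against another node if one is unreachable. A rough sketch, with made-up hostnames and API key:

```typescript
import Typesense from 'typesense';

// Hostnames and API key are placeholders; in Typesense Cloud these would be
// the per-node hostnames shown for an HA cluster.
const client = new Typesense.Client({
  nodes: [
    { host: 'node-1.example.net', port: 443, protocol: 'https' },
    { host: 'node-2.example.net', port: 443, protocol: 'https' },
    { host: 'node-3.example.net', port: 443, protocol: 'https' },
  ],
  apiKey: 'SEARCH_ONLY_API_KEY',
  connectionTimeoutSeconds: 2,
});

// Any node can serve this; if one is down, the client fails over to another.
const results = await client
  .collections('products')
  .documents()
  .search({ q: 'running shoes', query_by: 'name,description' });
```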
And then we have this thing called a Search Delivery Network, which we built into Typesense Cloud, where we essentially replicate the dataset to different geographic regions. So you could have one node running in Oregon, one node running in Virginia, one running in Frankfurt, another one running in Sydney, etc. And anytime a request comes in, we will automatically route it to the node that's closest to the user.
It's similar to a CDN, except that a CDN only caches the most frequently used data, whereas here we replicate the entire search index to each of those nodes sitting in different locations. So it's as good as it's going to get in terms of reducing latency for users. In fact, this Search Delivery Network is what prompted some users to use Typesense as a distributed caching JSON store. Instead of having to replicate your primary database - which is probably sitting in one location - out to different regions, which is a hard thing to do, they send a copy of the data into Typesense, have Typesense replicate it to the different regions, and then hit Typesense directly as a distributed cache. So that's an interesting use case that people have used Typesense for.
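In the JS client, that nearest-node routing is what the nearestNode option is for, and the by-ID lookup at the end sketches the "distributed JSON cache" pattern Jason describes. The hostnames, API key, and users collection below are assumptions for illustration:

```typescript
import Typesense from 'typesense';

// Placeholder hostnames: with a Search Delivery Network cluster you'd point
// nearestNode at the geo-routed endpoint and list the regional nodes as fallbacks.
const client = new Typesense.Client({
  nearestNode: { host: 'xxx.a1.typesense.net', port: 443, protocol: 'https' },
  nodes: [
    { host: 'xxx-1.a1.typesense.net', port: 443, protocol: 'https' }, // e.g. Oregon
    { host: 'xxx-2.a1.typesense.net', port: 443, protocol: 'https' }, // e.g. Frankfurt
    { host: 'xxx-3.a1.typesense.net', port: 443, protocol: 'https' }, // e.g. Sydney
  ],
  apiKey: 'API_KEY',
});

// The "distributed JSON cache" pattern: push a copy of a record in once,
// then read it back by ID from whichever region is closest to the user.
// (Assumes a 'users' collection with a matching schema already exists.)
await client.collections('users').documents().upsert({ id: 'user-42', plan: 'pro' });
const user = await client.collections('users').documents('user-42').retrieve();
```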
So yeah, those are the different pricing angles. And I think when people realize "Oh, if I were to host this on AWS or GCP myself, this is the incremental amount I'd have to spend with Typesense Cloud" - when they see that delta is tiny, hopefully that's a convincing case to let us deal with the infrastructure, rather than spending your own engineering time and bandwidth on it. However tiny that might be, we take care of it on an ongoing basis.