Daniel Stenberg joined the show to talk about 20 years of curl, what's new with HTTP/2, and the backstory of QUIC - a new transport designed by Jim Roskind at Google which offers reduced latency compared to TCP+TLS+HTTP/2.
DigitalOcean – DigitalOcean is simplicity at scale. Whether your business is running one virtual machine or ten thousand, DigitalOcean gets out of your way so your team can build, deploy, and scale faster and more efficiently. New accounts get $100 in credit to use in your first 60 days.
OSCON – O'Reilly's Open Source Convention combines the experience of the open source community with ideas and strategies for using open source tools and technologies. There's no event quite like OSCON! When registration opens, save 20% on most passes by using the code CHANGELOG when you register.
Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com.
- Daniel Stenberg on Twitter: "Twenty years of maintaining open source, and all I ever got... "
- Everything curl - Book
- Copy as curl
- Web transport, today and tomorrow
- QUIC Homepage
- What is the difference between TCP and UDP?
- IETF - Internet Engineering Task Force
- The annual curl user survey is up. Please donate a few minutes and answer some questions!
- curl survey 2017 – analysis | daniel.haxx.se
Daniel, we last had you on the Changelog when curl was 17 years old. Now curl has turned 20, and a lot has changed in those three years... But I think we should start with this quote from a tweet that you put out recently, which I loved and we retweeted, which said "20 years of maintaining open source, and all I ever got is an awesome career, friends all over the world, and a gold medal from the Swedish king." You've gotta start with the gold medal, right? Get to the important stuff first. Tell us this story.
So I was awarded an engineering prize in Sweden called the Polhem Prize, named after a Swedish engineer. It's an old, distinguished prize that they have been handing out for I think 120 years or so. A really prestigious prize, given out to engineers and inventors of different things over the years.
In 2017 I was awarded this prize, and it comes in the form of a gold medal and a cash award.
At the award ceremony - in October I believe it was, in 2017 - I was awarded this gold medal from the Swedish king, who was there and gave it to me, so I got to shake his hand and say thanks.
That's awesome. And in the tweet, which is linked in the notes, there is a picture of you shaking - I assume that's him - the Swedish king's hand there...
Now, you just tweeted at us a few days back - May 18th; we're recording this on May 22nd, 2018, so on a time delay... Did something bring it to your mind, or did you finally get a copy of the picture that you could share? Why the delay on the tweet if this happened late last year?
[00:03:57.08] So I brought it up for a completely different reason, actually. Previous to that tweet, I had tweeted another image - one of those funny fake O'Reilly covers, a book that says "Thanklessly maintaining open source", with a sad llama on it... You know, the constant mantra that maintaining open source is a bit of a thankless job many times, and we do a lot of things... And then someone replied to me and said "Well, you got a gold medal." [laughter] So I had to sort of show the other side of the coin, really, because I think I have gotten a lot of good things from open source, and I enjoy it a lot. It's not an ordeal or a struggle for me, it's a pleasure and I do it for fun... So I definitely wanted to bring out some of the goodies and goodness that I've experienced from working with open source.
Well, this is only your second time on this show, but it's probably the umpteenth time that your name has been mentioned since we had you on three years back, because you impressed us so much with the 17 years of dedication to curl, and just this relentless pursuit of what is such a popular, widely-used tool, and so relied upon. This is definitely the web's infrastructure type of a thing. And so many people burn out, fizzle out, projects change...
So many things go what we might consider wrong - wrong in terms of sustainability, but with you it's like, you're 20 and you're still rollin'. Do you have a retirement date in mind, or what are you thinking for this?
Sometimes I think about what I would do if I wouldn't do this, but no... I'm still enjoying this so much, and I don't see anything else that I wanna do as much as this. This is really my baby still so very much, so I keep on doing it for the fun of it.
I think what is kind of interesting about the 20 years aspect is not so much the length of time - well, the amount of time is somewhat the same, but a slightly different side of the coin - it's how involved it's been in your life. It's been a part of your life since you were 27; I'm assuming, since it's 20 years, that you're now 47 - doing some basic math here... That's a lot of time - that's your 20's, your 30's, and your 40's.
That's a lot of time.
It is totally a part of my life, and I've been doing it -- the first code I wrote was even before curl. [unintelligible 00:06:39.01] it's like 23 years. Yeah, it's older than my kids, it's older than my house, I've switched jobs like 3-4 times since then... So it's one of the most constant factors in my life, really. It's been with me since forever. So yes, it's really something that I don't really consider giving up ever, because it's me, really.
Do you own the full copyright to curl, or is it a community? What's the structure, maybe the legal implications of the ownership of it?
I own most copyrights, but not everything. I haven't really been very strict about it either, so if people contribute a chunk that they want to have their copyright on, that's fine; so we have a bunch of different other copyright holders on various parts, but I would say that maybe 70%-80% of everything has my copyrights on it.
I ask that mainly because of the question Jerod asked, which is what would you do otherwise, essentially? At some point you'll have to pass it on.
Of course, yes.
You know, by force or by desire.
I'm not being morbid here, or anything.
But it is open source, and it's licensed extremely liberally, so anyone is free to continue from wherever they feel like at that point, or at any point, really.
[00:08:08.01] It kind of reminds me of this conversation we had off-air at Build with the Python theme there, Adam, about really the passing down of the torch from Guido van Rossum to whoever is next with regards to the Python project, and when you have a BDFL, if that BDFL is really good at doing BDFL things, everything goes well... But eventually, there needs to be a passing of the torch. Have you put serious thoughts into that, or are you far enough away -- of course, with that we always bring up the somewhat morbid conversation of the bus factor, like "What if something bad happens to Guido or to yourself, Daniel?", but more likely, an eventual retirement from software or from open source... Is that something that is actively in your mind, or does it just feel like it's really far away at this point?
Both yes and no. I would say that it is active in my mind, in the regard that I've been thinking about it and I've given some thought to how to do it at some point in time, but it's not something that I consider doing any time soon - handing it over to someone.
My ideal case or my ideal situation would be that within the project there would be one or two or three persons that would be sort of the natural other people that would take over if I would just get bored one day, and they would just more or less transparently just shoulder the tasks that I've been doing, and just continue in whatever means they think they should do it.
But at the same time, the way I run the project, I also know that I have a pretty strong presence myself, and I think that I sometimes don't let others reach that level, because I sometimes do a little bit too much myself... "You know, why wait for someone else to do it when I can do it myself?", sort of... And I think that sometimes isn't constructive in that regard, and it doesn't really encourage others to step forward and show their abilities.
But it's also in one sense very much so your life's work... So, talk about difficult to pass on or to let go, even if you know it's constructive in the long-term to let more people into the fold, or the ones who you trust, to give them more responsibilities, or allow them to come into that, when it's like... You know - curl. Daniel is curl. It's your project. It's hard to let go of that, right?
Right, yeah. But of course, I would like the project to be more distributed to more people than we are right now, and I'm trying to make that happen, but it's not -- I think I've sort of laid the groundwork for one way to work, and it has sort of developed into this, so it's not that easy to just say that "No, no, I just wanna do a little part in my corner here. You go ahead and do everything else", because there aren't that many others who are prepared to jump in and do the other stuff.
I can recall several years ago, when we talked to you before, you mentioned how some of the income you've been able to make has come from contract jobs that you've done for various companies to add features, or specific things... You know, I'm just imagining that it's very difficult to piecemeal and break off some of that when you're so focused in the minutiae of it... And it's not exactly - I don't wanna say not the [unintelligible 00:11:44.03] I've never done it, obviously, but it doesn't have the lure that some other popular projects may have, like "Hey, come and be a contributor, and you'll have this glorious open source lifestyle." [laughter] I'm not sure there's much draw; how do you draw people into this project with you?
I mean, it's the pipelines of the internet, right? It's internet plumbing.
Yeah, and I think that might be what attracts people then, because it's sort of a fundamental thing that is just everywhere.
Yeah, so if you contribute to curl, you can get your real piece of code into a couple billion devices over time... That is, of course, an interesting feeling or challenge.
Wow... I didn't consider that. Okay, I take it back.
It's one of the things I've been thinking about... I've been putting off a blog post about developers and leverage; I feel like software developers live at what is perhaps the height of a human's ability to leverage things... The fact that you can write one line of code, Daniel, and then do a release - and it eventually has to trickle down and go through the release process - but that's going to affect billions of devices, millions of people... That is an incredible amount of leverage, and I do think that's attractive from a software developer's stance, because how can you live the most meaningful life? It's to have the most positive impact on the most people, and software really lets us do that.
Oh, absolutely. Sort of, just do something little in my corner, and it can seriously influence the entire world, in some ways, at least.
Yeah. So let's zoom out and talk about your community a little bit, because as I've been watching curl and your blog more closely since you were on the show, one thing I did notice is you do keep it fun, you do celebrate victories... Like, your 20-year celebration post was awesome with like the Titanic reference... I can tell that you're still light-hearted and having fun with it, even though you've been doing this for 20 years. You have a curl conference now, you've got stickers... Tell us about some of the stuff that you're doing in the community and who all is part of it with you?
Yeah, I think all these other things around the project that aren't code also make it fun... I mean, some of the oldest contributors or maintainers in the project have been around for... I think the oldest guy has been around a little over 15 years now. So some of those are really my old friends by now, so setting up a little conference over a weekend and just talking curl for a weekend - I can't think of many things that are more fun to do in a weekend... So that's just awesome.
And of course - one thing about becoming more known, and things like getting awards and prizes, is that it makes people open their eyes and see us in a slightly different light or angle... Suddenly, people approach us with money, or ideas, and they can print stickers for us and hand them over, or they can lend us their conference rooms for a weekend, or stuff like that. So things also get easier when you become known, or when people realize the impact, and people get friendly... We get friends all over, so that is fun.
So of course, I like curl and I like working with it, and of course I then try to sort of bring up those fun moments, like celebrating 20 years of curl, or now we have 32,000 questions on Stack Overflow, or now we have 1,700 contributors in the Thanks file, and stuff like that. I wanna help out the other contributors and everyone, to make sure that they feel appreciated, and that we all appreciate what they do... I think it's fun.
It also goes back to this constant question... I say, "Yeah, I've been working with curl for 20 years", and then they're like "Well, I used it 10 years ago, and it worked exactly the same. What have you been doing?" [laughter]
I think we asked you that on the last podcast... [laughs]
[00:16:02.24] That was actually one of our questions, was "What's new with curl? What have you been doing these last few years?"
What have I been doing - that's a completely natural question, and it's not a bad question, it's just that, you know, when you're working with something and the facade or the front is the same, and the whole point with the tool and the library is that it should work the same way... We work really hard to make sure that it keeps working the same way, but of course, we added a little stuff, and we fixed bugs, and stuff. But the point is that you shouldn't realize that a lot of stuff underneath actually changed and we sort of replaced half of the engine, and added a lot of other things, or documented everything again in another way; you don't have to think about that.
I sometimes wanna help people in the project and people around me realize that we are actually doing a lot of things - that even if you may not think of all these changes, and you used curl the same way ten years ago, we have actually added a whole busload of things just in the last few years, and here are some of those things that you can now do that you couldn't do before, and blah-blah-blah; why that is good, and how this helps your application or your usage of this in the future, and so on. And also - we're having [unintelligible 00:17:26.06] but we're adding a lot of features. We have 215 command line options...
Sometimes I feel a need to highlight parts of that, to help people actually find out about things that curl can do.
Almost discovering hidden features or hidden gems, so to speak, because you're not paying attention to the changelog, or whatever...
Exactly. So even if the things might not be new, I can sometimes just write about it - "Well, imagine if you wanna do this, you can actually do it like this with curl." You've been able to do it for a long time, but maybe you didn't think about it.
Has anybody ever written a curl cookbook or some sort of thing where it's not necessarily -- like it's just a pamphlet, or maybe it's even only digital... But it's like "Here are 25 things that you can do with curl", and then specific examples of those commands... Because that would be so useful.
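A pamphlet like that might contain one-liners along these lines. All of the options shown are long-standing curl flags, but the commands themselves are illustrative, with example.com standing in for any real URL (they're printed here rather than run):

```shell
# A handful of "did you know" curl recipes, printed for reference;
# example.com is a placeholder for a real host.
cat <<'EOF'
curl -L https://example.com/            # follow redirects to the final page
curl -I https://example.com/            # fetch only the response headers
curl -o page.html https://example.com/  # save the body to a file
curl -d 'name=value' https://example.com/form  # POST form data
curl -C - -O https://example.com/big.iso       # resume an interrupted download
curl -u alice:secret https://example.com/api   # HTTP basic authentication
EOF
```

Each line pairs a single task with the one flag that does it, which is the "25 things" format being described.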
Yes, there are pages like that. I try to do that sometimes, but I'm not the right person to do it. I'm so entrenched in the details, so I just get lost in... [laughs] I've actually written a book about curl that I'm posting online. It's called Everything Curl, and it's really everything curl.
So is it gonna be like a big curl bible kind of thing, or is it going to be -- how long is it? I guess that's what I'm trying to get at.
It's long... [laughs]
That's right, that's what I thought you might say. You said "It's EVERYTHING curl." I'm like, "Well... It might be too much curl."
Yeah, yeah. But it is. You can just google it, and if you print it, it's like 250 pages or so.
Is this an ongoing thing?
It is an ongoing thing...
It started in 2015(ish), something like that; late 2015.
Yeah, exactly. I think it was after our last podcast... But yes, it's been going on for several years, and it's never gonna end either, because it's just so much -- I know curl changes all the time too, so if I wanna keep up, I need to keep up with the book, too.
But it's an effort to describe curl and how to use curl in a way that isn't really just man pages, and reference documentation, but actually sort of help people to read up about it in a different way.
It's kind of like a "Did you know...?" kind of thing. I think that would be so useful. I was thinking - you know, kind of a callback to a recent show - maybe there was a Devhints out there for curl, and of course, there is... So devhints.io/curl. This is kind of what I'm thinking, but it's light on examples. There's three examples, and I think you could probably come up with some complex use cases where "This would be super handy for this particular case, and then here's your curl command."
[00:20:06.21] One thing I have seen a lot of, which is really neat, is different HTTP tooling - specifically some desktop apps for Mac - will actually have like an "Export to curl" button once you've crafted a specific request, right? And then you can just get the curl export and put that in your terminal, and that's really cool.
Yeah, the "Copy as curl" has really become a popular feature, and I like that, too.
Firefox, Chrome and Safari now all have this "Copy as curl." If you're using their dev tools, you can copy from their specific -- you know, if you watch the network traffic from your browser, you can select a particular request and do "Copy as curl" from that.
Spectacular if you're trying to replay a very specific thing in the terminal and capture the output, or whatever you wanna do from there.
Oh yeah, it's really handy, and it's a great way to learn how to use it - if you wanna do something with curl, you can get "It's roughly this that my browser just did", and just copy and edit that command line. I mean, the command line is usually quite long, and...
215... That's a lot of features.
Right. They're often really repetitive, because the browsers set a lot of headers, so you wanna have the exact headers like the browsers do; they set a lot of them... Very long command lines. But still, you can look at that command line and see "This is how you could do it."
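For reference, a "Copy as curl" export from a browser's network panel looks roughly like this - the URL, cookie, and header values below are made up for illustration, so the command is printed rather than executed:

```shell
# Roughly the shape of a browser's "Copy as curl" output (printed, not run;
# all values here are invented examples):
cat <<'EOF'
curl 'https://example.com/api/items' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) Firefox/60.0' \
  -H 'Accept: application/json' \
  -H 'Accept-Language: en-US,en;q=0.5' \
  -H 'Referer: https://example.com/' \
  -H 'Cookie: session=abc123' \
  --compressed
EOF
```

The repetition described above is visible here: most of the length is `-H` header flags mirroring exactly what the browser sent.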
So speaking of headers and speaking of features, I actually found on your blog recently a feature that I'm very much looking forward to, which is a small change... You said the core feature set has stayed the same, and people say curl works exactly like it used to, but you're doing some UI brush-ups, specifically with the -I command - which, if you go through my history with curl, in my command prompt you're gonna find curl -I (capital I, of course) almost every time, because I use it to look at headers... And you're adding bold to the headers; so the header names are bold, and then the value (the text) is not bold. That's a very small thing, but you're not ignoring the facade or the paint; you're still making small improvements to the output, as well.
Yes. And sometimes I have a hard time deciding what to focus on... But I think it's fun to do that, too. I try to sort of move around a little bit. I can work a little bit on how things appear on the command line... I changed one of the progress bar outputs a while ago too, just because -- it is actually somewhat important to some people, and why not...? And it's fun to work on that sometimes, and then go back to debugging HTTP/2 streams another day.
So I mix it up, and that's what makes me, of course, enjoy this, since I can do various things. I can play with a UI one day, and then go back and work with protocol stuff another day, and then work on documentation a third day, and then write a blog post another day.
I've actually just landed it in Git, so it'll be in the next curl release - the code that outputs the headers in bold. The name part is bold, and the value part is not. It's actually a feature request that's been a very long time coming.
Well, I'm sure a long time coming, but you also mentioned that this was not an insignificant amount of code change. Maybe you weren't set up to do this kind of output, or -- why was it a bigger feature than maybe people would think it is?
[00:23:50.17] I think it's mostly a lot of internal decisions on how to do HTTP, and show headers... You know, we have this concept of headers, and curl supports a lot of different protocols, and some of them have the internal concept of headers, but I only wanted to do the bold for HTTP headers. So it was mostly because of how I had done this with curl until now, or not done it.
And also, I had to change -- I don't know how to explain it, but headers come from [unintelligible 00:24:31.26] at the end of the line, so you wanna make sure that you actually do this on a complete header, and not on a partial header. So if it were an extremely long header, it would still need code to handle it - code that would only bold the left part and not the right part - so it was a lot of finicky internal things.
Good old-fashioned yak shave.
Yes, and I'd made a lot of decisions a long time ago that were convenient because I didn't do this, and now that I had to go back and make sure that I could split up the headers like this, I had to remodel a couple of things and shape it up. But I think it was all good. I think I improved some other tiny things in the process, and I know that a lot of people will appreciate getting the headers in bold; however small it may sound, it's one of those details that makes it look better.
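The splitting being described - taking a complete header line and bolding only the name to the left of the colon - can be sketched in a few lines of shell, assuming a terminal that understands standard ANSI escape sequences (an illustration of the idea, not curl's actual implementation):

```shell
# Print an HTTP header with the name in bold and the value in normal weight.
header='Content-Type: text/html; charset=utf-8'
name="${header%%:*}"    # everything left of the first colon
value="${header#*:}"    # everything right of it
printf '\033[1m%s:\033[0m%s\n' "$name" "$value"
```

The point made above is that this only works once you hold the whole header line; applied to a partial line, the split could land in the middle of the value, or miss the colon entirely.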
So I guess we came to this conversation through an embarrassing moment for me... It was early in the morning on a Sunday, and somebody in our Slack - Daniel - had said "Hey, what's the state of HTTP/2 and where is it going?", and I'm like "Great question, we should ask Ilya. We've had him on the show a while back, and it'd be great to catch up." I sent him an e-mail with the subject line "Current state of HTTPS?" - not /2 - and I had to quickly fix that, because that was obviously not right... But I was reaching out to essentially get an update on TLS 1.3, QUIC, and some other stuff... So maybe help us understand -- he said that you're working on this; you've got a lot of stuff going on. What's going on?
There's a lot of stuff going on. Well, HTTP/2 - that shipped three years ago, right? Around our last episode... And the RFC was published in May 2015, so yeah. And now, three years later, the standards work on HTTP/2 is of course mostly done. There are still things happening in HTTP/2, but the fundamentals are there, and it's good, and it's working, and it's being used.
[00:28:00.16] I could just add perhaps that if we look at traffic done by Firefox, we can see that Firefox is using HTTP/2 in about 75% of all HTTPS traffic, so I would say that is pretty good; a significant amount of the traffic is HTTP/2 now... Counted by volume, of course. If you look at it the other way - what percentage of all the web servers in the world are providing HTTP/2 - the numbers aren't as nice. I think we're approaching 40% of the top 1,000, and in the top 10 million it's like 25% or so.
But it's still moving, and I think the numbers are still rising pretty quickly. I think they doubled roughly the last 12 months or so. They've been doing that for a while. So it's growing, and it's being used, and it's being understood, and I think there are areas that have been more successful, and some that have been less successful in the protocol.
I think already when HTTP/2 shipped, there was this notion that the next protocol revision wouldn't at all take 16 years to happen, it would happen much sooner...
[laughs] That would be nice, wouldn't it?
Yes, and a lot of the HTTP/2 work was also laying the foundation to make sure that we could iterate protocol versions much faster and easier and more effortless in the future. HTTP/2 brought a lot of that infrastructure.
At the same time when HTTP/2 shipped, Google had already been running their QUIC experiments in their Chrome browser and in their server side since (I believe) they went public - 2013, or so - with their QUIC efforts.
Anyway, Google took their efforts to the IETF and said "We should make a standard version of the QUIC protocol." They did that in late 2016. QUIC being an experimental protocol that Google invented, which is HTTP/2-like, but done over UDP. UDP is not reliable - it doesn't do retransmissions or anything, and there's no security in there - so you basically implement a transport stack yourself: a TCP-like stack that also features security, because you wanna have something that's not HTTPS, but HTTPS-like...
So you have UDP and TCP - don't those operate kind of at the same level of the stack? Why would you take UDP and then make it TCP-like?
Well, I can take one step back first - why wouldn't you invent a new protocol? If you wanna make TCP better, why not make a TCP 2, in parallel to TCP? That has basically been ruled out because of all the middleboxes and NATs and firewalls and everything in the world; that makes it really hard to introduce any new transport protocols nowadays.
Hm... So you're stuck.
So we are pretty much stuck. TCP or UDP - those are the ones we have to choose between.
[00:31:56.13] So now the answer is "Well, we can't change TCP enough to make it faster or better or more secure, but we can take UDP, which is very lightweight and doesn't have any of these things a TCP has, and make it TCP-like", but not with some of the trappings, I guess?
Exactly. By choosing UDP and basically doing it all yourself, you can decide how to do it... You just do whatever you want. And in Google's case, they have a fairly large client-side implementation and a fairly large server-side implementation, so they were in an excellent position to experiment with doing their own protocol over UDP, implement all this and check it out, see how it works... And it worked really well, and they figured out that "This is a protocol we should make a standard for the web and the internet."
Can either of you give a 10-second/60-second version of the difference between TCP and UDP?
TCP is like setting up a string between two computers - a physical string - and you pass in data at one end, and it will arrive at the other end. Or it might not get connected, and then the data won't arrive at all. But if it arrives, it will arrive unaltered, and it will arrive in the same order that it was sent by the sender.
So it's basically a way to transport data and make sure that it's a reliable transport in both directions... But UDP on the other hand is basically sending notes in the air, writing pieces of paper and throwing them [unintelligible 00:33:36.13]
Message in a bottle.
Yeah, it might arrive, it might not... And it might arrive in another order, too. So it's much more lightweight, and it's traditionally been used for DNS, NTP, and also for RTP, for video.
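That difference shows up even at a shell prompt. Assuming bash, whose built-in /dev/udp and /dev/tcp paths open sockets, and assuming port 9 on localhost is closed (as it normally is): sending a UDP datagram to a port nobody listens on "succeeds" locally, while a TCP connect to the same closed port fails at the handshake:

```shell
# UDP: fire-and-forget. The write succeeds even though nothing is listening,
# because there is no handshake or delivery guarantee.
echo 'hello' > /dev/udp/127.0.0.1/9 && echo 'UDP send: no error reported'

# TCP: connection-oriented. Connecting to a closed port fails immediately,
# because the three-way handshake is refused.
if ! (exec 3<>/dev/tcp/127.0.0.1/9) 2>/dev/null; then
  echo 'TCP connect: refused'
fi
```

The UDP sender only finds out about a lost "note in the air" later, if ever; the TCP caller knows up front whether the string between the two computers was ever tied.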
But it was never used on a really wide, high-speed internet scale like this... So that's always been one of the biggest concerns: "Will UDP break stuff now? Because we haven't designed things for UDP at this level." But over time it has been proven that most things actually work pretty well anyway, and over time people have also adjusted things and improved infrastructure, and routers, and things... So things are going better.
And looking at Google's numbers, they claim that -- my number is old now, but they said already a year or two ago that 7% of the internet is QUIC already, and that's quite a big share of data running [unintelligible 00:34:49.13]
So QUIC is the new version of what HTTP/2 has been, right? The evolution of HTTP/1 to HTTP/2 is now coming to QUIC, and...
And QUIC is a lot of things... Because first, it was the QUIC that Google made.
Right, and now it's evolved to something else.
Because it was a long time ago when they started this - like 2012 or 2013, something like that, when QUIC was begun...?
Yes, exactly. So I think they went public with it in 2013, but they had already been working on it in private before that... What they produced is what I call Google QUIC, and that is basically sending HTTP/2 over UDP, with custom encryption code. So you could almost use your HTTP/2 implementation and just provide that QUIC stack, and it would work. They kept documenting how the protocol worked, they had a website for everything, and they made it all in the public. And when they took their latest update of the drafts to the IETF and said "We should document this protocol. This is QUIC from Google", and the IETF started to look at it and decide how to move forward on this, they came to the conclusion that this bundled solution - one transport protocol that can only send HTTP/2 - wasn't ideal for a transport like this. So they concluded that QUIC should be split into a transport part and an application part.
[00:36:39.06] So it should be able to transport other things than HTTP as well - DNS was one of the first things discussed, and it has been in the discussion all along as one of the next protocols. So then QUIC became "QUIC the transport", and "HTTP over QUIC" is the new HTTP...
Is that the final version of QUIC, or is that a transitionary version as well?
Well, it's not final, because it's not done yet. So they took it to the IETF, and they created a QUIC working group in the IETF, and within that group there have been a lot of activities since then. They're now doing draft 12 of the specs, and they have four different specs I think. The plan is to be done by November this year, 2018...
With the spec.
With the specs, although there are several... I think there are four or five specs. But yeah, I don't think they will stick to this plan, because there are still too many loose parts and moving parts.
I guess to zoom out, the question might be -- this is all in an effort to obviously make progress, but to make it easier to iterate on something that has been traditionally harder to iterate on.
Yeah, but also sort of -- when HTTP/2 shipped, we were all aware of a lot of shortcomings and things that we could improve further in the transport protocols. So when we went to HTTP/2, we improved a lot of things from HTTP/1.1, but there are still a lot of other things that HTTP/2 can't really do, where it has bottlenecks or problems that we could solve... And we couldn't really solve them with TCP in the HTTP/2 context, but going to QUIC we can solve some of those problems that are still present in HTTP/2. I mean, apart from just fixing things in TCP... Fixing things in TCP is really, really difficult in general. There are many reasons why TCP is difficult to change, but one of them is that, again, we have a lot of middleboxes all over the internet - you're talking through NATs and routers and everything, and they "know" how TCP works. So if you change how TCP works slightly - you add a little thing here or there - you break X percent of those boxes, and they will refuse to send it, because they know that's not TCP anymore.
Even if you're just tuning parameters, or if you fundamentally change the protocol? Because tuning parameters - they shouldn't break... That would just be really bad programming on those boxes.
Well, yes, but that's the reality... Just as sort of a little story into this - one of the features they added in TCP... I think about seven years ago they added TCP Fast Open, which is a way to send data already in the first SYN packet in TCP. You know, when you do a TCP handshake, you do a SYN, a SYN-ACK, and an ACK - it's a three-way handshake... So in order to save a roundtrip, they invented this method where you could add data already in the first SYN packet; you would save a roundtrip, you would get data earlier. And you know, a lot of this struggle is to get data earlier; reduce roundtrips, get data earlier. So sending data already in the first packet of TCP - that's potentially saving tens, sometimes hundreds of milliseconds if you're far away; it's a huge benefit.
[00:40:27.29] But implementing and using this TFO over the internet today - it turns out to be a struggle and a pain to make sure that it works, because there are so many machines out there that block that little new bit that comes saying "Here's the TFO", saying "Nuh-uh... That's not TCP the way we want it. Deny!"
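To make the roundtrip math concrete, here is a tiny back-of-the-envelope sketch in Python. This is not from the conversation; the roundtrip counts are the usual idealized numbers, and the 100 ms RTT is just an example value:

```python
# Rough model of time-to-first-byte for one HTTP request, counting only
# network roundtrips (RTT = round-trip time). These are idealized
# textbook numbers, not measurements.

def time_to_first_byte(rtt_ms: float, handshake_rtts: float) -> float:
    """Handshake roundtrips plus one roundtrip for the request/response."""
    return rtt_ms * (handshake_rtts + 1)

RTT = 100.0  # e.g. a far-away server: 100 ms round trip

# Classic TCP: SYN, SYN-ACK, ACK - the request can ride on the final
# ACK, so the handshake costs one full roundtrip before data flows.
plain_tcp = time_to_first_byte(RTT, handshake_rtts=1)

# TCP Fast Open: on repeat connections the request rides in the first
# SYN packet, so the handshake roundtrip disappears.
tfo = time_to_first_byte(RTT, handshake_rtts=0)

print(plain_tcp, tfo, plain_tcp - tfo)  # 200.0 100.0 100.0
```

At a 100 ms RTT the saved roundtrip is worth a full 100 ms, which is exactly the "sometimes hundreds of milliseconds if you're far away" point above.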
Yeah, that's a tough problem. You just have so much infrastructure out there... It's not feasible to change the boxes in the middle, because there's just too many owners, too many places, too many situations that you're never gonna be able to replace those.
Exactly. That's what they call ossification nowadays. And the grand solution to that is encrypt everything, so that none of these middleboxes can actually peek into those little bits and bytes.
Exactly, they can't figure out you wanted this, because they don't know it, they just have to pass it on... So then you can add things, over time. That is one reason why QUIC is now really encrypted, as much as possible, really.
But that shows how it's hard to change even TCP over the wire, and then there's also just changing the implementations of TCP. They're kernel-based stacks; it takes forever. This TFO - the spec came seven years ago, and it was only a year or so ago that Windows finally implemented it widely... So it takes forever for this to be implemented widely. So if you wanna iterate fast, you can't do it like that.
And then there is another technical problem, for example, that TCP has with HTTP/2 - the problem with packet loss. When HTTP/2 was introduced, the new method of doing transfers was a lot of streams over a single physical connection... So you would typically do 100 streams over the same TCP connection, just a lot of logical streams over it, which is a good way to do a lot of parallel transfers while only using one connection.
This is really good as long as your network is decent, but it turns out that if your network is very lossy and you start losing packets, then having just a single TCP connection is really not ideal... Because then losing one packet [unintelligible 00:43:07.12] means that you're waiting for one packet to get resent before those 100 streams can continue, while previously you would typically do perhaps six connections per host, and with sharding you would maybe have 20 or 30 connections with HTTP/1.1 to sites.
So it's almost like fast networks get faster, but slow networks get slower. Slow as in -- unreliable maybe is the better word. Not slow, but unreliable.
Exactly. Slow as in bad radio...
Exactly, because HTTP/2 is really good if you're far away. So for people really far away from their servers, it's excellent. It's possibly those who actually gain the most by HTTP/2.
Because they need to make less TCP connections?
Yeah, and much less roundtrips. You can fire off 100 requests at once, basically, and get the responses, instead of this ping-pong - request/response, request/response, sending/waiting, sending/waiting.
[00:44:10.14] This TCP limitation is not there in QUIC. In QUIC you create connections, but they're not connections in the same way as TCP has them. When you're sending a stream, the streams themselves are reliable within the stream. So we can send things, and you know that the picture or image or whatever you send will arrive at the other end unmodified and exactly as it was sent from the source. But the streams are independent from each other. So if you drop a packet somewhere in the middle that belongs to stream one, stream two can still continue, because it still has all its little packets. It's only the one that actually lost packets that has to wait. So this makes the lossy network situation completely different, because if you lose a few packets somewhere - yeah, sure, the streams that actually lost packets will have to wait and resend packets and everything, but the others can continue.
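The difference Daniel describes can be sketched with a toy model (purely illustrative, not a real transport implementation): over one TCP connection, any lost packet stalls every multiplexed stream, while QUIC only stalls the streams that actually lost a packet.

```python
# Toy head-of-line-blocking model: 100 logical streams, each packet
# tagged with the id of the stream it belongs to. `lost` is the set of
# packet indices the network dropped.

def stalled_streams_tcp(num_streams, packets, lost):
    # TCP delivers bytes strictly in order, so one lost packet blocks
    # delivery of everything behind it: every stream stalls.
    return num_streams if lost else 0

def stalled_streams_quic(num_streams, packets, lost):
    # QUIC streams are independent: only streams that actually lost a
    # packet have to wait for the retransmit.
    return len({packets[i] for i in lost})

streams = 100
packets = [i % streams for i in range(1000)]  # round-robin stream ids
lost = {7}  # a single dropped packet, belonging to stream 7

print(stalled_streams_tcp(streams, packets, lost))   # 100
print(stalled_streams_quic(streams, packets, lost))  # 1
```

One dropped packet freezes all 100 streams in the single-TCP-connection model, but only one stream in the QUIC model, which is the whole point of moving the streams below the loss-recovery layer.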
It sounds like "Thank goodness for UDP", because it's provided us a loophole around the ossification, right? We would have been stuck if this UDP hack wasn't available to us.
Exactly. It is exactly like that. That's why it has to be UDP, and that's why we're doing all this work, implementing TCP-like stacks in user space in both ends. QUIC, as a protocol, is (I would say) far more advanced than HTTP/2, because now you also have to implement the transport part, and then the HTTP part on top of that.
Daniel, if you were to describe QUIC's mission, what would it be?
It would be to reduce roundtrips and work pretty much transparently the same way as HTTP/2, but better, and secure by default, and always. There's no clear text QUIC. And of course, that is the HTTP over QUIC, how that will appear. There will be more QUIC after this QUIC.
[laughs] "There's more QUIC coming after this QUIC." Wasn't HTTP/2 supposed to be all encrypted too, and maybe they backed off on that at the last minute?
Yeah, exactly. Well, HTTP/2 in reality, I would say, on the internet, over the web, is encrypted always, but the standard allows for both...
Exactly, the spec is sort of you can do it either way. But for QUIC there's no unencrypted version. You need TLS 1.3. You can't avoid it.
[00:48:03.15] You said there's more QUIC coming... This QUIC hasn't even arrived yet; how can we look that far down the pipeline?
I think that is maybe -- we don't have to care about it right now, but...
But when Google took this into the IETF, they decided we should split it into transport and application, and the application is HTTP. And we should prepare for another application, maybe DNS. Then they also said "Well, we also want QUIC to be able to handle Multipath...", which is -- I don't know if you know about Multipath TCP, but that's setting up multiple paths between two endpoints over the internet.
But then they decided that "Maybe we don't have time to get Multipath into QUIC v.1, so we'll postpone the Multipath part." So there's already this talk about "Oh, but then QUIC v.2 will be making sure that we can actually do DNS, and do Multipath, and stuff like that..." Basically, postponed because there hasn't been enough time to cram it into version one.
So you mentioned earlier that 7% roughly - and that might be an older number - of the internet is using QUIC; specifically if you're using Google Chrome and you're speaking to Google services, you're most definitely using QUIC and you just don't know it. What about the rest of us? What's the roadmap look like in terms of adoption, or production use, and when we should start thinking about it? Many of us are still trying to get on HTTP/2, so maybe this is a little overwhelming... But maybe we can skip HTTP/2 and go straight to QUIC, I don't know.
I get that there's some notion of that. Maybe if you haven't gone to HTTP/2 by the end of this year, maybe you should consider just going to QUIC at once. But I don't know. Well, the Google QUIC was not implemented by many besides Google. The Caddy server has an implementation, and there are a few other standalone implementations, but they have never been widely deployed or adopted... So the Google QUIC version is primarily used by Chrome and the Google servers; that is basically what that 7% of internet traffic is.
But the IETF version of QUIC is quite different over the wire... There's sort of this divide - they changed the crypto layer, and they changed pretty much everything in the protocol. So the IETF version of QUIC is being implemented by a lot of different players... All the ones that you can expect - the browsers, the big server vendors, the big service vendors (Facebook is on it), and the CDNs, too. Going into the future, we will see this getting deployed and used by all the big players that were involved with HTTP/2 deployment.
So are you working on this on Mozilla's behalf, on curl's behalf? Both, perhaps? How does it fit into your life now?
Well, I'm actually not that involved in QUIC. I'm reading the traffic, I'm getting the news, and sort of following that steady stream of GitHub issues, and stuff like that. So yeah, I'm participating a bit for both interests - from a Mozilla perspective and a curl perspective - because of course I wanna make sure I learn and know how it works and understand everything, and then as soon as it becomes possible and I get the time and energy, implement it and support it in curl.
What kind of timeline would you expect for that? Would you wait for the -- the draft needs to be formalized, right? So that November 2018 that they're shooting for - you wouldn't start any sooner than that, would you?
Yeah, I would, depending on things...
Tell us more. What things?
[00:52:01.25] Well, it's like building a tower, or building a house - when can you move in? When I implemented HTTP/2 for curl, I went in pretty early, and I started implementing support already in one of the drafts I think a year before it finalized. That turned out to be really useful, both as feedback back into the standard process, but also a lot of just trying out things and getting everything working, and interoping with all the other implementers. I think it's really useful to get in as early as possible... But not too early from my point of view, because in the QUIC world there's so much transport here, and I wanna have the transport part fairly done by the time I start adding the HTTP parts on top of that transport stuff.
Then I need to cooperate with others to do a library; there are already many libraries that implement this, but I am having a particular one in mind, and when I work with those guys, to make sure that we get an HTTP over QUIC library that works fine with curl, and that I can make sure that curl uses.
I'm expecting us - or me - to start doing that soon. I expected to have started by now, but I think the spec hasn't really moved on as fast as I anticipated, and the libraries are also not really there, and I haven't really had the time... So maybe in a month or so, I would say, hopefully during the summer, I could get started on it.
Well, we talked about the ossification of our infrastructure, at least in curl's case, and on the software side. And on the client side, we appreciate that you are so eager to jump in and to help beta test the implementation of these things, and maybe even write one of the early client-side implementations of supporting these things, so that we can continue moving forward... Because when curl adopts something, a lot of devices around the world now can speak that language, right? So that's pretty cool.
You mentioned DNS as the other potential application of QUIC - QUIC underneath DNS. I'm assuming there you wouldn't gain any speed, because UDP alone has gotta be faster than QUIC, right? Because QUIC has additional things... But there you're gaining that encrypted connection. Am I on point there?
Yeah, and I think there's an even bigger goal here, too. This term that has been used within the IETF several times, that I can drop here - they talk about the post-TCP world... If you wanna go completely TCP-less, then you need to do the other protocols over QUIC, basically.
I'm not sure why they picked DNS as the other protocol to use here, because... I mean, DNS has its own road forward in other ways, so I'm not sure exactly how this is going to turn out; I can't really speak much about why they picked DNS or what they want from that over QUIC, because nowadays we see a lot of DNS going over TLS, and DNS over HTTPS coming... So we're already sort of fixing up the security parts and the privacy parts for DNS like this... So I don't know.
So a "post-TCP world" - I've never had this consideration. This is my first time thinking about what are the implications. Dan, you probably thought about it a little bit more - what does that imply? What does that change? It seems like a simplification, but maybe not, because you've gotta put so much stuff in QUIC...
One of the interesting things without TCP is what is an HTTPS URL, really? Or an HTTP URL, for that matter... But HTTPS URLs - they are basically implying TCP, right? Or HTTPS is... Since they're not saying "Connect to me on UDP port 443", because you probably don't have that. So that's one of the greater challenges - how to move away from that.
[00:56:26.02] I didn't mention that, but the way you bootstrap into a QUIC world from HTTP (or HTTP/2) is that the server replies with an Alt-Svc header saying "You can connect to this origin over on this server, using this protocol", blah-blah-blah, and then you continue from there and you cache that information.
I was actually gonna ask about that - is that then a UDP request? Like, the client sends one of those first? It can't be a TCP request...
Well, the initial one will be an HTTP... Or you'll rather upgrade to HTTP/2 probably first, and that response will say "The next one you can continue on over here, using QUIC, this version", blah-blah-blah.
So you still have to require hand-shaking, and you still have the setup time on that very first request, because you don't know if it's gonna be a QUIC server basically, until you do, and then from then on you can assume that and you can also cache that in the client.
Yeah, and it has a lifetime. So if you know you're gonna provide that for a year, you can set a really long lifetime, so everyone will cache that for a long time. But going back to the ossification, UDP is also not as successful to use over the internet as TCP is. There's still this single-digit percentage of connections that will fail over UDP. That [unintelligible 00:57:49.22] handshake QUIC at all. So you still have to have that fallback mechanism to go back to HTTP/2 if the QUIC connection doesn't work... At least that is what we're doing now, and for the foreseeable future.
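For reference, the Alt-Svc mechanism described above is just an HTTP response header, specified in RFC 7838. Here is a minimal parser sketch; the `h3` token and port are made-up example values, and real headers can list multiple comma-separated alternatives, which this sketch ignores:

```python
# Minimal parser for a simple Alt-Svc header value, e.g.
#   Alt-Svc: h3=":443"; ma=31536000
# "ma" is the lifetime (max-age, in seconds) that the client caches.

def parse_alt_svc(value: str) -> dict:
    parts = [p.strip() for p in value.split(";")]
    proto, _, authority = parts[0].partition("=")
    result = {"protocol": proto,
              "authority": authority.strip('"'),
              "max_age": 24 * 3600}  # RFC 7838 default freshness: 24 hours
    for p in parts[1:]:
        key, _, val = p.partition("=")
        if key.strip() == "ma":
            result["max_age"] = int(val)
    return result

print(parse_alt_svc('h3=":443"; ma=31536000'))
# {'protocol': 'h3', 'authority': ':443', 'max_age': 31536000}
```

A one-year `ma` of 31536000 seconds is exactly the "provide that for a year, set a really long lifetime" case Daniel mentions.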
Now, you're telling me that there's no such thing as a post-TCP world then, because [unintelligible 00:58:08.28] forever. [laughter]
Possibly... You know how everything gets done. There's always something left of the old technology somewhere. We'll never get rid of everything.
Yeah... And I'm really just wondering what the implications are. Why is IETF using this term now internally in their conversations? And it's like, I don't understand why you would want a post-TCP world, unless maybe because it's just old, and QUIC's better in every single way, eventually.
Yeah, I guess... I guess it's because it then solves the ossification problem. It allows you to keep on developing the protocols much more freely. So if you wanna implement Multipath next year or in 2020, you can do that, because you have encrypted everything from the beginning, so there won't be any middleboxes that prevent you from implementing new, cool features that you come up with in the future. So I think there's a lot of that.
Except for that first request. [laughs] You've still gotta get it through there.
Yeah, that's the current approach, but I guess there will be those who will do the happy eyeballs approach, where you try both at the same time and you go with the one that responds, and stuff like that. That is also a solvable problem. You can probably invent something in the future that will do it differently.
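A happy-eyeballs-style race can be sketched like this. It's a toy illustration: real happy eyeballs (RFC 8305) staggers the attempts rather than firing them truly simultaneously, and the sleeps here just stand in for network connection attempts:

```python
import asyncio

# Toy "happy eyeballs": start both connection attempts concurrently and
# go with whichever succeeds first. The delays simulate a QUIC attempt
# that stalls (e.g. UDP blocked) versus a TCP attempt that answers.

async def attempt(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for the real handshake
    return name

async def happy_eyeballs() -> str:
    tasks = [asyncio.create_task(attempt("quic", 0.05)),
             asyncio.create_task(attempt("tcp+h2", 0.01))]
    done, pending = await asyncio.wait(tasks,
                                       return_when=asyncio.FIRST_COMPLETED)
    for t in pending:
        t.cancel()  # abandon the slower attempt
    return done.pop().result()

print(asyncio.run(happy_eyeballs()))  # tcp+h2
```

The losing attempt is simply cancelled, which is also how a client would fall back to HTTP/2 when the QUIC attempt never completes.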
So where should developers out there, in open source land, where should they be putting QUIC on their radar, and thinking about it more or less important in terms of maybe somebody's running a website, like Changelog.com, or maybe they're running a network service, like Twitch or something...? Is this something that we should all just be patiently waiting for, should we be getting involved? Maybe that depends on who you are and what you're up to, but what would be your advice with regards to QUIC?
[01:00:04.01] It's of course a technology that -- if you're into serving low-latency things from either end over the internet, this is a technology that is coming, so of course, getting familiar with it and how it works and what it means for you - that's a good start. But it will take a while until there are reliable and solid implementations of this. So if you wanna work on code now, you're pretty early on, and you're going to hit a lot of funny things and rough edges if you try it out... But of course, it's a chance to work on this bleeding-edge protocol stuff.
So Daniel, you have the ear of the open source community, you're an elder statesman now, if you will, being awarded a medal by the Swedish king... I mean, that's something that doesn't happen every day, so you've got that going for you... If you could give some closing advice on this time around to our listeners and to us with regards to open source, software development, life - whatever it is, as parting words, what would you share with the audience?
My general advice when it comes to open source and software development like this in general is first to make sure that you try to find what is fun for you and work on that, because if you don't do that, you end up not doing it at all. So finding your project or your ideas or whatever scratches your itch - that makes you actually do something, and that's fun, and then you can possibly become productive.
Then I think you also need patience. Whatever you do in this area of work, you need to be sure that it's -- some things just take a lot of time. Not only time to get things done, but also time to make sure that others find your project and that you find your users or whatever, that you get your stuff completed. Things take a lot of time.
Speaking of patience, going back to the beginning of the conversation, the 20 years post... You mentioned Titanic, you mentioned that Google wasn't even formed yet, and here we just talked about Google leading QUIC, or at least beginning QUIC, and where it's at now... It's pretty interesting to see the patience it must have taken on your part to deliver curl and then evolve it over years and be patient with all the change.
Yeah, and just looking back over time and seeing what a different world and a different society we had back then... It's only 20 years, but most of everything we know today - it wasn't like that 20 years ago.
Cool. Daniel, thank you so much for spending time with us, thanks for coming back again, thanks for your super awesome service to the community in ways I'm sure that the future generations or the entire world will truly appreciate... Maybe lesser than we need them to, but having something that's so widely adopted and so widely used, I'm sure it will be around forever, for as long as the internet needs it, right?
Yeah, exactly. As long as it's needed, it's going to be there.
That's 20 years so far, so... 20 years more at least, for sure.
Our transcripts are open source on GitHub. Improvements are welcome. 💚