Amal sits down for a one-on-one with Alex Russell, Microsoft Partner on the Edge team, and former Web Standards Tech Lead for Chrome, whose recent post, The Market for Lemons, stirred up a BIG conversation in the web development community.
Have we really lost a decade in potential progress? What happened? Where do we go from here?
Alex Russell: I think the way you've framed that in terms of trying to understand which architecture is going to make the most sense is really a pressing problem for our community. So I've been talking with colleagues and friends about this sort of little tiny model - I need to write it up in a blog post, but sort of like, every system, every interaction that you do… I wrote something about it last year, a kind of generalized theory of web performance, unified theory of web performance… And it kind of includes something like this. Basically, the idea is you start with a system being responsive, that is to say it's not doing anything, right? It can take input at low latency; so you provide it input, it starts to do the work, and it acknowledges to you that it's doing the work. That can be like, I don't know, starting a loading spinner, or something. You do updates about that work, you do part of the work, you learn more about how it's going, and then you continue to update the user about its progression. Eventually, you retire the work, and then you present the results of the operation, and then you're back to square one. You are interactive again for the very next thing. And that loop, that core interaction loop, where you start quiescent and you end quiescent… Like, you start being able to take input, and then you return to being able to take the next input - that describes clicking on a link, or typing in a URL, just as much as it describes some update through a local data store or through a templating system to generate some diffed DOM. And so you can model the same interaction as either a full HTTP round trip, or as a fully local interaction, depending on where the data model lives.
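A rough sketch of that interaction loop in TypeScript may make it concrete. The state names and the runInteraction helper below are illustrative inventions, not from Alex's post or any particular framework; the point is only that the same loop wraps either a remote round trip or a purely local update.

```typescript
// A minimal sketch of the core interaction loop described above.
// State names and runInteraction() are invented for illustration.

type LoopState =
  | "responsive"     // quiescent: able to accept input at low latency
  | "acknowledging"  // input received; show a spinner or other feedback
  | "working"        // doing the work, updating the user on its progression
  | "presenting";    // retiring the work and presenting the result

async function runInteraction<T>(
  doWork: (onProgress: (pct: number) => void) => Promise<T>,
  render: (result: T) => void,
  setState: (s: LoopState) => void
): Promise<void> {
  setState("acknowledging");                  // e.g. start a loading spinner
  setState("working");
  const result = await doWork(pct => {
    console.log(`progress: ${pct}%`);         // keep the user informed
  });
  setState("presenting");
  render(result);                             // present the results
  setState("responsive");                     // back to square one: ready for input
}

// The same loop fits a full HTTP round trip or a local diff-and-patch update;
// only where the data model lives changes. A hypothetical remote example:
runInteraction(
  async () => (await fetch("/search?q=shoes")).text(),
  html => { document.body.innerHTML = html; },
  s => console.log(`state: ${s}`)
);
```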
[38:00] You remember the old Mapquest, where you had to like click north, south, east and west, and maybe you could get northeast, northwest? …this very painful thing. And then Google Maps, of course, the slippy maps made the world a lot better for folks who are doing that kind of interaction, because each incremental interaction involved much less variance and lower latency as you took each individual turn through that loop. Because grabbing it and moving it was a whole turn through that loop. And you could - again, at an intellectual level, you could model all those interactions as being full round trips. And we all came to agree that that was less good. But we didn't say under what conditions that was less good. And so I've been trying to enunciate this with folks at work, and the closest thing that I can come up with is session depth weighting. So that is to say - think about a distribution of sessions, right? Because we're never talking about just one user doing just one thing. We're talking about users doing a series of things, in a session. And they have a distribution of actions they take. And there's a distribution of different kinds of users through a site. But if you're thinking in terms of, say, an eCommerce site, you have a couple of sort of prototypical session types that you think most about. So you'll think most about a user who comes into the site through a product page, from a search result; they add to their cart and they check out immediately. Okay, that's one flow.
There's another flow, which is they go to your homepage, they go to the search box, they go to a search results page, they sort and filter a bunch, and then they go to a product detail page, and then they do some configuring and some futzing around there, and then they add something into the basket, and they do a little bit more shopping, and then they come back to the basket, and eventually they do a checkout. Those two sessions are extremely different in maybe their composition, and the question that I think we can try to tease out is, "If we know about the kinds of sessions that a site or product has, we can learn a lot more about what trade-offs are going to be better."
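One back-of-the-envelope way to read "session depth weighting" is to weight each prototypical session type by how common it is and ask how many of them are deep enough for a big upfront investment to pay off. The session mix, costs, and break-even numbers below are invented purely for illustration, not measurements from the conversation.

```typescript
// A rough sketch of session depth weighting over a distribution of
// prototypical sessions. All names and numbers are invented.

interface SessionType {
  name: string;
  share: number; // fraction of all sessions
  depth: number; // interactions per session
}

const sessionMix: SessionType[] = [
  { name: "search result -> product page -> checkout", share: 0.7, depth: 3 },
  { name: "homepage -> search -> filter -> configure -> checkout", share: 0.3, depth: 15 },
];

// Break-even depth: the upfront cost divided by the per-interaction savings
// that the heavier, front-loaded architecture buys you (both invented here).
const upfrontMs = 3000;
const savingsPerInteractionMs = 400;
const breakEvenDepth = upfrontMs / savingsPerInteractionMs; // 7.5 interactions

const shareThatBenefits = sessionMix
  .filter(s => s.depth > breakEvenDepth)
  .reduce((sum, s) => sum + s.share, 0);

console.log(`break-even depth: ${breakEvenDepth} interactions`);
console.log(`share of sessions that come out ahead: ${(shareThatBenefits * 100).toFixed(0)}%`); // 30%
```

With this (made-up) mix, only the deep browse-and-configure sessions clear the break-even depth, so front-loading would be paid for by the 70% of sessions that never benefit from it.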
I like to bring up the example of Gmail. I can click on a link to an email from Gmail and see just one email. Or I can load up Gmail as like my daily driver email client, or Outlook, or whatever; I can load up the thing to do 1,000 interactions through an extremely long session. Now, I think you're right to be cheesed every time you see a loading bar to get to a single email, right? That 10 kilobytes of text shouldn't require 10 megabytes of JavaScript, or something; or 3 megabytes, as the case may be. We can do better. So that's related to session depth. And so if we think about SPAs being appropriate for long and deep sessions, especially where there's like high frequency and low latency interaction, editors - editors are a perfect example. The Figmas, and the Photoshops of the world - you need a local data model, because otherwise network variance and network latency in the critical flow are not great. Whereas for lots of other kinds of applications, which are either not editors, or have much shorter sessions, even if you could imagine the most engaged user, fetishize the user who will spend all day on your thing, the reality probably isn't that that's who most of your users are… And we can know something about most applications, because honestly, most software projects today are not greenfield. Like, we've built these on the web before. We were building email on PIM clients, again, with tools that I used to help build back in 2006. Full business intelligence suites with Ajax in 2006-2007. The stuff isn't new under the sun. We can say that we know something about these classes of applications now. And so as a result, we can start to characterize them by those session lengths, and then we can start to decide whether or not we need a local data model, and all the tools that are premised on operating over a local data model.
[42:01] When we talk about this SPA technology, what I think we're really talking about is, "Do I have a local copy of some subset of the data that I'm going to be manipulating frequently, applying optimistic commits to, and then updating my UI about as quickly as I possibly can in order to make it feel better?" And that's extremely subject-dependent as to whether or not the app itself is going to feature long sessions on average.
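As a minimal sketch of that pattern - assuming a hypothetical CartStore and a hypothetical /api/cart endpoint, neither of which comes from the conversation - the local copy is updated and the UI notified immediately, with the server commit happening afterwards and rolled back on failure:

```typescript
// Optimistic commits against a local data model, sketched in TypeScript.
// CartStore and /api/cart are hypothetical; a real app would use its
// framework's state and data-fetching layers.

interface CartItem { sku: string; qty: number; }

class CartStore {
  private items: CartItem[] = [];
  private listeners: Array<(items: CartItem[]) => void> = [];

  subscribe(fn: (items: CartItem[]) => void) { this.listeners.push(fn); }
  private notify() { this.listeners.forEach(fn => fn(this.items)); }

  async add(item: CartItem): Promise<void> {
    const previous = this.items;
    this.items = [...this.items, item];
    this.notify();                          // optimistic: update the UI immediately

    try {
      await fetch("/api/cart", {            // commit to the server after the fact
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(item),
      });
    } catch {
      this.items = previous;                // roll back the local copy on failure
      this.notify();
    }
  }
}
```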
I love WordPress as an example here… So WordPress is two applications, right? WordPress is an editor; so you sit down and you want to write a blog post, you go into the editing UI, and you have a very long session, probably. It's a drilled-in editor, that editor does lots of stuff, loads of stuff… You can sit there and preview, and refresh, and all that kind of stuff. Whereas if you're a reader, that experience is mostly going to be thin consumption; like, one to two clicks.
And so in those sessions, the user who has a very long session can divide the total session costs in terms of latency, or payload, or whatever you want to call it - because those things kind of blend into each other - over the number of interactions that you take. So if you can drive down the resulting fractional number, if you can get that to some low number because you front-loaded stuff, and therefore each subsequent cost was much lower, versus the sort of full-page refresh model, then you're winning. But if you're a reader of a blog, you can't afford any of that, because the denominator is one, maybe two. Scrolling doesn't count, because the browser does scrolling for you, and it's magic, but…
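As a toy illustration of that division, with numbers invented only to show the shape of the math: a heavy-upfront architecture wins on per-interaction cost once the denominator gets large, and loses badly when the denominator is one.

```typescript
// Amortized cost per interaction = upfront cost / interactions + per-interaction cost.
// All figures below are made up; only the shape of the comparison matters.

function costPerInteraction(upfrontMs: number, perInteractionMs: number, interactions: number): number {
  return upfrontMs / interactions + perInteractionMs;
}

// Front-loaded SPA-style: heavy upfront payload, cheap local interactions after.
// Full-page-refresh style: little upfront cost, but every interaction is a round trip.
const spa = (n: number) => costPerInteraction(4000, 30, n);
const mpa = (n: number) => costPerInteraction(200, 600, n);

console.log(spa(1), mpa(1));     // 4030 vs 800 ms: a blog reader with one click
console.log(spa(200), mpa(200)); // 50 vs 601 ms: a long editing session
```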