If you find yourself chasing shiny objects and squirrels all the time, you should 💯 listen to this episode featuring Ozan Onay (President of Bradfield School of Computer Science) where we discuss his recent blog post entitled "You Are Not Google", which was the #1 link in Changelog Weekly - Issue #159. This show is full of wisdom and advice for every developer out there.
Sponsors
GoCD - GoCD is an on-premise open source continuous delivery server created by ThoughtWorks that lets you automate and streamline your build-test-release cycle for reliable, continuous delivery of your product.
Datadog - Cloud-Scale Monitoring - Monitoring that tracks your dynamic infrastructure and applications. Plus next-generation APM. Monitor, troubleshoot, and optimize end-to-end application performance. Start your free trial, install the agent, and get a free t-shirt!
Bugsnag - Mission control for software quality! Monitor website or mobile app errors that impact your customers. Our listeners can try all the features free for 60 days ($118 value).
Notes & Links
Before we get to the show note links, here are some notable quotes from this episode and Oz's post.
Software engineers go crazy for the most ridiculous things. We like to think that we're hyper-rational, but when we have to choose a technology, we end up in a kind of frenzy - bouncing from one person's Hacker News comment to another's blog post until, in a stupor, we float helplessly toward the brightest light and lay prone in front of it, oblivious to what we were looking for in the first place. This is not how rational people make decisions…
Writing code isn't really about writing. Thinking is the thing that we do. Eventually that gets translated into running code.
This is a counterforce against the marketing machine of hype in new technologies.
As of 2016, Stack Exchange served 200 million requests per day, backed by just four SQL servers: a primary for Stack Overflow, a primary for everything else, and two replicas.
- You Are Not Google
- Changelog Weekly - Issue #159
- Bradfield School of Computer Science
- Cutting through to what matters
- How to teach yourself computer science
- Teach Yourself Computer Science
- Learn how computers work
- Learn every language
- Google Search for "postgres paper"
- The Implementation of Postgres (paper)
- The Design of Postgres (paper)
- @loveapaper on Twitter
- All Things Open
Transcript
Oz, you wrote an article recently called "You Are Not Google", and it resonated and reverberated across the software developer linksphere, or whatever that is. We all read it, and we all talked about it. Why did that resonate so much with so many people?
Look, I was actually a little surprised. I thought there'd be an undercurrent of old people, let's say, who are nodding along and they're like "Yes, finally somebody's speaking truth to the young ones", but actually… Yeah, it did surprisingly well across the board. A lot of people reached out afterwards being like "I've had this conversation with my manager" or "I inherited this codebase, or whatever… We're really struggling with exactly the thing that you're talking about, and I'm glad you put it out there."
I think we're just getting to the point now where there's been so much excitement, so much promise with some of these technologies… People have actually had time to implement them and see them in practice, and they're starting to get that voice in their head that says "Maybe this is too much. Maybe this is not the right thing."
A lot of people have actually spoken and written about this before, but just the timing I think is such that people are ready to hear this message.
Yeah. Well, we have short memories as software engineers, so it's nice to hear things, even if you've had that thought or read that before, bring them back up either to a new generation of people who haven't thought about these things, or to those of us who have and forgotten that principle and moved on.
I feel it's a key thing to keep being reminded of things like this… To even go back to reread books that were pivotal to you, or go reread or be reminded of things like this in your career, to sort of jolt you back into reality, like "Oh, stop chasing shiny objects!"
I think it's gotta be a counterforce against the marketing machine of hype in new technologies, right? There's a lot of money behind things like Mongo, so you've got this constant force online… You don't even realize; they're inventing acronyms, they're sponsoring conferences, and it's just kind of subliminal that there's this active paid force to get you to buy into the new technologies. But there's nobody who's like putting money behind Postgres and sponsoring conferences and got a marketing team inventing flashy acronyms… So we need that reminder just as a counterforce to capitalism, in a way.
[04:05] I was at OSCON London and there was actually a Postgres company in the vendors area. So there are people out there doing different things, but yeah, there's voices that are louder than others.
Or more well-funded.
Sure. Before we get too far into the weeds here, why don't you just give us the gist of this article? I think a lot of it is right there in the title, "You Are Not Google", but give us the high-level breakdown and then we'll go into the details of your ways of fighting against this, and then we'll go from there. Give us the gist.
Yeah, so the gist is that there are some amazing technologies out there that are great for only a tiny, tiny fraction of companies. We see that they're great, but we forget to consider "Are they great for us?" This is a theme across computer science and software engineering; it's true for newer data stores, distributed data stores like Dynamo and its legacy, Cassandra, Riak and so on. It's true for large scale dataflow engines like MapReduce and its legacy, Hadoop and Spark and so on… But also true of other things, like software engineering practices, service-oriented architecture, where it's been amazingly successful in a very specific context. That very specific context is not your context, and so we just ended up as a community pretending like we do index every website in the world, or we do have tens of thousands of engineers whose teams need to be split up and interfaced.
The post just pulls together some of these trends that I've been seeing a lot of, this over-excitement about these technologies and these practices, and just threads them together like that.
This is a question that I posed - I think it was with James Pearce of Facebook… Coming from their side, especially as we cover the open source ecosystem and seeing Facebook open source its tools to lots of coverage, to lots of interest, and to lots of people starting to use them, and the question that I posed to him at the time was "Do you guys feel a responsibility for putting out there tools that may not solve other people's problems, but because of the massive interest and because it's Facebook or because it's Google (or whichever company), they received an outsized portion of developer mindshare?" and he said that's something that they think about for sure, and they try to address it by way of documentation and giving talks and explaining where they're coming at this problem from.
But the article that you wrote, this principle of "You are not Facebook" or "You are not large company X" really puts the shift back onto the individual developer, not on the company that's open sourcing Cassandra, or putting out there Mongo, or whatever, but to us, to actually from the other side not get swept away by the hype.
Yeah, totally. You've gotta expect that companies like that act in their own self-interest, right? They're not gonna open source something unless they think they're getting value from that, whether that's being a magnet for engineers, getting just good PR generally, inviting contributors and offloading some of the work of maintenance, flushing out bugs and that kind of thing. They don't do it out of the goodness of their hearts. Sometimes it's just for retention. The engineers want to have their work seen publicly, and they wanna get the recognition for that. Facebook or a company like that will support that, to keep their engineers happy.
[07:57] But the objective is not to solve your problem. You've gotta solve your problem. It's great that they provide these tools for you, but the onus is on you to read the paper, read the docs, read whatever it is that's available. Obviously, it's open source, so read the source code if you can, and really stop and think about your problem, consider the other technologies that might do as good a job, really be honest with yourself about what you need and why you're making this decision, and then pull the trigger. You can't just trust that because it's Facebook it's gonna be good for you. Probably if it's good for Facebook it's not good for you.
We've just recently shipped a show with Gerhard Lazu where we talked about the real world situation of deploying Changelog.com, and that episode has received a lot of praise, Adam, because it was focused around a real-world problem. It wasn't just tools, it was about the Why and the What and the How, and Gerhard really brings a level and a logical method to the way he does things.
One thing that I've been cognizant of, Adam, and I'm curious if you are, with the Changelog itself, because as we cover open source software and the people that make it, do we sometimes contribute to the hype? How do we not just be - because we like to cheerlead, we like to root people on, we love to see success, and sometimes I wonder if that's something that we have done, which is to just make more noise and less signal.
Yeah, that's a great thing to think about, because there's times whenever I feel we cover things because it's our duty and our position and our responsibility to share the news, right? And the news is what's happening. It's similar to Twitter and a retweet - did my retweet of that mean that I agree, disagree? Does it represent me or my feelings or did I retweet it because I wanted you to also see it? And I kind of feel like it's on us to sort of share, and it's on the audience to, as Oz is saying here in this article, it's like "Hey, you've got to read the docs, you've got to read the fine print." It's us sifting through and curating what's happening out there, and to a degree disseminating it, but not to every finite point, and I feel like that's sort of our game, it's that focus.
Yeah, I think we need the news. We need to know what's out there and what's just arrived that we don't know about, but if the newsbringers can give us a bit of the historical context, I think that goes a long way. Some of the people I've spoken to, they don't realize that Cassandra is based off of Dynamo, so just for the sake of your listeners, this guy Avinash Lakshman, who worked on Dynamo at Amazon, moved to Facebook and really reimplemented Dynamo at Facebook, and that's what became Cassandra. Some changes - he got to work on the system twice, so you might say there are some improvements, but it's very similar to Dynamo, which also means that it's very well-documented and there's a great paper.
When somebody delivers news about Cassandra - which I guess is not news anymore, but there's gonna be Cassandra's legacy as well - giving the context of "Hey, this derived from Dynamo. Dynamo was created specifically so that Amazon could have a high write availability shopping cart, because they lose money if you can't write to the shopping cart, but it's not such a big deal if the shopping cart is inconsistent, if you see your item twice in the cart…"
[12:03] As soon as you hear that, that that was the reason for Dynamo's creation, that they published a paper on it and you can read it and learn a lot about the rationale and the context, and that became Cassandra, well then the news about Cassandra is less newsy and more like "Okay, this is now an open source version of this thing that worked really well for Amazon, so if I have a problem like Amazon's problem, I can use Cassandra."
Obviously, that takes a lot more work and maybe news is gonna stop at like a one or two-sentence historical background, but just being like "Hey, shiny new technology! Cassandra is great because it scales!" That's less helpful than we as providers, reverberators of news could be.
Yeah.
Well, there's two sides to us, really - we've got our quick, poignant Twitter/weekly email that we ship out, and then we have deeper, more personal, more human, I guess…
Contextual, yeah.
…contextual, with the podcast, and I think there's times when we hit - let's say we news-jack something or we hype train something just because it's on everybody else's minds, and it makes sense for us to cover it. We're not exactly advocating "Hey, because this shiny new tool is shiny and new doesn't mean it's the thing you should choose." We're doing our best to come as interested, curious technologists/developers and hopefully providing context to other people's choices and what fits them for their problems.
Yeah, the real danger is the cargo cult mentality, as you point out, Oz, in your article. We're gonna go through - you have an acronym of your own, talk about marketing hype, Oz… [laughter]
Come on!
UNPHAT! We'll go through the finer points in there, because I think there's some great advice in there, even though I've got some qualms with your acronym itself, because I can't help but bikeshed…
I think your overall call here - besides just to point out that this is a phenomenon inside our community which is problematic, as people end up with the wrong tools for the job, and don't find out until much later, and it's much more expensive to reverse that problem… But your overall call is for critical thinking. Or really, you just say - I guess that's the "t" in your UNPHAT, there's "think"… So I guess my question to you is what is it about us as a software engineering community that makes us so susceptible to the new shiny, and we don't put critical reasoning - and of course, I'm speaking generally here, so if you're a critical thinker and you always make the right decisions, I'm not talking about you… But maybe it's just me - I easily just take the shiny, new choice without putting it through that rigorous thought process. What is it about developers that makes us this way?
You know, I don't think we're actually that bad. I think we do strive to make good choices and to be thoughtful, but we miss the mark a lot of the time, particularly the more junior engineers, the people who are - well, I say that, but I still have this problem and I still need to be vigilant about it, so don't take that as me saying "Hey, you youngsters, you are the ones who are struggling to do this." I do this as well, but there's one thing I've observed about the software engineering community, which is that we love fast feedback loops; we love hacking and getting the feedback, and that's in some contexts excellent, that's something that we can use as a process to do really good work - just a little bit of input, get some output, have a little REPL going, some quick feedback, hot reloading, whatever.
[15:56] This kind of thing is great in some contexts, but in other contexts you just need to sit back, turn off the computer, get a piece of paper, be thoughtful and really reason about the problem. That switch from fast feedback, fast input, see your results quickly, get that adrenaline rush of building something directly, to "Hey, let's slow down, let's take notes, let's question what we're actually doing… We have something that's working, but is it the best way that we could do this?" It's kind of counter to that fast feedback thing that works well for us in other contexts.
I wonder if it's similar to - and this may be very provocative… If it's similar to an addiction? Because I wonder if you can connect something like this to maybe something like where people are addicted to Instagram, or addicted to the feed, the next thing coming, where the reason why you do it is less about wise choices as a developer and more about our actual minds as human beings. We get this rush - you used the words "adrenaline rush"… I wonder if it's connected to something that's far above simply being a developer, if it's just a human flaw.
That's above my pay grade right there.
I think we're really lucky to have a craft where we can be totally engrossed in it, where we can get that feedback, where we have joy from it. It kind of stops feeling like work at those times, which is amazing. I'm sure you've had this experience where you've got this big problem to solve, you sit down, you start writing, and then you look up, your tea or your coffee is cold, it's night time, you're hungry… You totally don't know what happened for the last ten hours. That's an amazing, amazing feeling, but sometimes that's not what we're supposed to be doing. Sometimes it's not just sitting down and hacking on something and getting that physiological experience of doing this, like rock climbing, or tennis, or something… Sometimes we actually need to be more aware and less in the flow; we need to slow down, we need to counter our first instincts, we need to question ourselves.
The best engineers can do that, and they can switch between those. The rest of us are working up to that standard.
While we're speculating, I'll throw another thing into the ring… This is something that I've thought of with regards to this particular problem - perhaps it's part of the revolt against waterfall methodology, the agile movement, where it's like "Head West, young man. Let's just get going, and we'll figure it out as we go", and we found out that that's a better way of building software than thinking through every possible thing way upfront for six months, and then being done with the design phase and moving on to the build phase, because six months ago we didn't know what we needed and we realized that as you build software, things change, it's in motion as you're building it. So that perhaps leads to "Well, let's just get going. I'm just gonna pick a tool and I'm gonna build it, and then we'll figure that out when we get there."
As with most things, the true best choices are in the grey areas, where you wanna move fast but you have to slow down to think as well. You can save yourself a lot of effort by putting in some thought and some preparation, and still doing agile software development. So you don't have to just fly in everything.
I do feel like there is this push to always be moving. "Motion creates emotion, motion shows progress, fix it along the way… You've got a race to do, so why not just get in the car, even if it has no wheels? We'll put them on the race" kind of feeling, and it's like "Well, we needed four wheels, not two" and you're halfway through the race and everybody else has already finished because they slowed down enough to think "How many wheels do we need?" You're right, motion is sort of the anti, where you're always forced to go forward, and forward is progress.
Yes.
I have not personally watched it, even though I know it's one of his greatest hits…
That's true.
It's documented by Rich Hickey's greatest hits on Changelog.com.
It's totally worth it, it should definitely be… You know, maybe number two or three. It's got some great ones, but Hammock Driven Development is up there, even just for the name. You've got Hammock Driven Development and you're like "Hm, what else is there driven development? What is this an alternative to?" You have this amazing visual as well, of this senior software engineer, someone really respected at the company, who's got his sprint planning points or whatever, he's about to start work for the week and try and get his points, and the first thing he does is string up a hammock, just like sits there…
Nice.
Maybe a couple hours later comes down, makes a coffee, goes back… This is how a lot of good software gets written, through first thinking about it. Agile doesn't leave that much room for that, or at least it doesn't encourage that. You need to fight to make room for yourself to think before you psyche yourself up for the week and get your sprint points.
Waterfall by default encourages that, it encourages the pre-planning, and it obviously has a lot of other downsides, and that's why as a community we've swung the pendulum away from that. But now the flipside is that it's up to us to really stop and think.
Coming up after the break we talk about UNPHAT. This is, in Oz's words, I promise, "It's a dorky acronym for you to follow the next time you find yourself googling some new technology to build or rebuild your architecture around." We break down each letter of the acronym, we talk about his clear intent for humor, but more importantly, Oz shares some serious wisdom to consider when evaluating your technologies. Stick around.
So the acronym you came up with was UNPHAT. Your article says "The next time you find yourself googling some cool new technology to build or rebuild your architecture around, I urge you to stop and follow UNPHAT instead." I'll just lay out the words here, the brief synopsis; we'll go into the details. I'll tell you, eNumerate is the one that I struggle with here… Anyways, Understand (understand the problem), the N is "eNumerate multiple candidate solutions" (that's a stretch, but I get it). The P is "Read the Paper" if you find the candidate solution. The H is "Determine the Historical context", which you've referenced in this conversation as well, and then the A is "Weigh Advantages against disadvantages", and then finally, the T, as we've mentioned previously, is "Think!"
How did you come up with this list and acronym you've got?
I'll be honest, this is pretty tongue-in-cheek, and the way that I came up with the acronym was to first think of the dorkiest acronym that I could… [laughter] And then fit everything else to it. So I was really pushing for UNCOOL. UNCOOL was tough, because of the two O's.
That would have been good, though.
UNPHAT - I just needed to massage it a little bit, and "eNumerate" was one of those things that I massaged. [laughter]
Well, it's funny because PHAT - is that still in the Zeitgeist? I don't know… I mean, I know that [unintelligible 00:25:37.18]
I actually had to ask a millennial about that. I was like "If I say 'PHAT', do you know what that means?" and he was like "Yeah, I've heard that word. I think you can use it", so I went ahead with it, and here we are.
That's so funny, "I asked a millennial." [laughter]
Well, you know, I had a couple people help me edit this, and one was like really doing copy editing, and the other one was helping me empathize a little bit with my younger readers, so that was his input, the "That's okay."
So you started out with the dorkiest way to do it, which means you're (as you said, tongue-in-cheek) not trying to be overly serious. You're trying to make a point, but…
Memorable.
Yeah, in a memorable way, that's like "Hey, come on…" - I don't really know, what do you mean by that?
Like "It's okay to be UNPHAT, you don't have to be PHAT…", P-H-A-T.
P-H-A-T.
I feel like I have to say that every time, since we're audio-only…
I kind of have to go back in my own mind, because I'm 38, I have to think like "Is PHAT that PHAT, that was like saying 'That's cool'?" So saying UNPHAT is saying like "uncool." Just to give everyone context who may not be - they're probably scouring Urban Dictionary to get a context of that. It's like saying "This is uncool."
Right. I think Chris Tucker said it best on either Money Talks, or what was the movie he did with Jackie Chan? It wasn't Friday… Help me out here, guys.
Yeah, I know the movie. It was a couple…
Yeah, there was a sequel. Oh, it's gonna be one of those shows where people are emailing us…
[unintelligible 00:27:24.04]
Oz, you can't help us out on this one?
No, it was with Jackie Chan…
Rush Hour.
Rush Hour, thank you. Oh, that would have killed me. And he said PHAT - Pretty Hot And Tempting. That was the way that Chris Tucker described PHAT. Remember that, Adam?
So we've reacronymed it. I didn't realize it was originally an acronym.
I think it was just a statement of like what's cool. It was just a word, kind of an inappropriation of the word "fat", and I think he then backronymed that during the movie. Now, I don't have the full history in front of me, but I do have that much. So he said it meant "Pretty Hot And Tempting", but now you have UNPHAT.
"Un-Pretty Hot And Tempting." [laughter]
Yeah…
[28:16] That's good context, though. It goes back to what Oz said before, which was "Give something some history, give something some context", and I think that's just an interesting way to unravel this and bring some humor to it as well.
Yeah. What we are calling for here first of all is to understand the problem, and one thing that you state there is that people tend to think in solution domains and not in problem domains. Can you unpack that for us?
Yeah, so when I say "problem domain", I really mean "your problem", so your business, or your project, or your customers, really thinking about what it is that makes this your problem and not somebody else's problem. That's really the problem domain, the facts and context around that.
The solution domain is the set of tools that you could use, or the architectures that you could use… Really what it might look like, all the candidates of how you might solve this. Now, people think that spending time in the solution domain and considering "Hey, should I use language A or language B", that's what the main decision is about, and you're gonna end up there, sure, but most of your time should really be spent asking questions, understanding your own problem, probing that as much as you can, and then out of that you'll be surprised how frequently the solution will just fall out, where you're like "Oh, so we expect to have this happen? We expect to have 10,000 customers use it this much every day, over this period of time, and every write needs to persist? Well, you know, then there's one solution for that, and it's this."
I very rarely see people spending too much time thinking about the problem, and I very frequently see people spending too much time thinking about the end solution, what the technology may be at the end. If you start to go down that path as well, it's just a trap. You start googling, you start reading articles, there's a debate on Reddit or something, and you get drawn into that whole thing. And they don't know about your problem, only you do.
So really, if there's one thing that you take from this conversation or this article, it's "Understand the problem." Spend the time, ask the questions, dig into it. Everything else should flow much more naturally after that.
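As a rough illustration of how facts about the problem can point toward a solution, here is a minimal back-of-envelope sketch in Python. Only the 10,000-customer figure echoes the example above; the writes per customer, the record size, and the names `estimate_load` and `LoadEstimate` are hypothetical assumptions made up for illustration, not anything from the episode or the article.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass
class LoadEstimate:
    writes_per_second: float    # average sustained write rate
    storage_gb_per_year: float  # raw data written over one year


def estimate_load(customers: int, writes_per_customer_per_day: int,
                  bytes_per_write: int) -> LoadEstimate:
    """Turn facts about the problem domain into rough capacity numbers."""
    writes_per_day = customers * writes_per_customer_per_day
    writes_per_second = writes_per_day / SECONDS_PER_DAY
    storage_gb_per_year = writes_per_day * bytes_per_write * 365 / 1e9
    return LoadEstimate(writes_per_second, storage_gb_per_year)


# Hypothetical inputs: 10,000 customers, 50 writes each per day, 1 KB per write.
estimate = estimate_load(customers=10_000,
                         writes_per_customer_per_day=50,
                         bytes_per_write=1_000)
print(f"{estimate.writes_per_second:.1f} writes/s on average")
print(f"{estimate.storage_gb_per_year:.1f} GB/year")
```

Under these made-up inputs the average works out to fewer than six writes per second and under 200 GB a year, which is the kind of result that would not obviously call for a distributed data store. Real workloads have peaks well above the average, so a sketch like this is a starting point for questions, not a sizing tool.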
Yeah. So in the next step you say "eNumerate multiple candidate solutions", so not just your favorite tool of choice… Now, I start to get conviction on this one, because when it comes to data stores, I just tend to reach for Postgres, and because it's general purpose, I'm generally okay… But perhaps I'm being lazy in that regard. How many candidate solutions is sufficient, and why is it such an important aspect of being UNPHAT? [whispering] Oh, gosh…
You're struggling, huh? [laughter]
I'm struggling… But I like it though.
So how many are sufficient? I don't know. I would challenge you to at least think of one, and at least honestly give it a bit of a whirl because the temptation is always "I know this first thing, this default thing, and therefore it's always the best." I'm not saying you need to think of five necessarily, but at least yank yourself out of that confirmation bias, that prejudice that you have for the thing that you first thought of, and just look at it from one other perspective. Maybe after that you're gonna think about another one or another one. But if all you do is temporarily yank yourself out of this deer in the headlights kind of fixation with your language or your operating system or your data store, that's great.
[32:00] So maybe with Postgres you're not thinking "Hey, do I use MySQL instead?", it wouldn't make sense, but maybe you're thinking, "Hey, do I really need to actually persist this data? Is that really a part of the problem? Could I store it in a file? Could I store it in memory?" So maybe it's that kind of thinking instead that really gives you the different perspective. "Do I really need to solve this problem? Can I just call somebody up and talk to them about it instead?" But maybe at the end you use Postgres, I don't know.
Right.
It's just something to pull you out of the default path.
Do you have any examples where you enumerated the multiple candidate solutions as you mentioned here and you were very thankful for doing that task, that discipline?
Yes, I have a story in there… I will not name the company, but this actually happened with them. They were using Kafka. The first design of their system was not very good and they responded to that by really over-engineering their second system. It was Kafka and Samza and all these really excellent technologies that operate at way, way larger scales than them, and really through a conversation with one of the engineers at the company we ended up with a design that would have more of a traditional relational data store, but which could have honestly been somebody writing into a book. That's actually the design that I push for. I actually push for the data store being somebody receives an email and physically writes it down, maybe in a couple of books.
Redundancy…
Yeah, redundancy. [laughter] Or maybe, you know, you have it in a spreadsheet AND a physical book… And I don't think they went for this design ultimately, so maybe I'm not exactly answering your question of when I was personally thankful for it, but…
Well, to advocate for something means that you must have had some sort of reward from doing the discipline, so I'm just wondering what was something you personally experienced from it.
Yeah, I mean… You know, just the satisfaction of ultimately being right. Maybe that's petty, but I live for that stuff. [laughter]
Nothing wrong with wanting to be right.
Yeah, must be right!
I think a lot of the time it comes down to just talking yourself down from the ledge; you get excited about something - "Oh, I'm really excited about functional programming! Clojure is a really well-designed language, I'm gonna use Clojure for this project." One path is to go down that and really feed off your enthusiasm and write your system in Clojure, and the other one is to say, "Hey, but I've actually been writing Python for five years, and it'll be fine if I write in Python. I will, as a company, do better; people will be able to understand me, we'll know how to deploy it if I write in Python." That's the kind of situation where you look back and you think "Oh god, I'm glad that I didn't go with my first instincts on that."
Given the next point, I have a question for you, Jerod.
Yes.
And since you're using Postgres as your example here.
Yes.
If we follow what Oz says, he says to at least give one additional candidate solution to look at and enumerate over, and then once you've chosen that - let's assume you chose Postgres, so point three is "Consider that candidate solution and then read the paper if there is one." And I just googled a Postgres paper and found a 36-page document from the University of California, Berkeley…
I know it very well.
You know it very well, I'm sure… [laughter] So this is my question - have you read this paper, The Implementation of Postgres?
[36:00] That's funny you ask that, because I was just thinking as we were coming to this third point, you know, I was doing so well up until this one… [laughs] And maybe I'm coming out here as not a great developer, because when it comes to reading the paper - I'm trying to think of any papers that I've read with this particular goal in mind of like vetting a tool, where I've gone and said "I'm gonna read the paper", and no, I've never read that Postgres document, to answer your question.
I'm just wondering how many people - maybe I'm the only one here, but how many people will actually do this one? I don't know, Oz... Is this aspirational, number three, read the paper? Should we all be doing this every time?
Look, I think if it's a brand new technology or if it's like 5-10 years old, yeah, you should read the paper. If there's one key paper... You know, if you can go and read the Dynamo paper, or something, read the paper. Because you're gonna get so much useful context out of that.
I teach databases, I have a lot of students who use these technologies, and then they read the paper for the first time with me because I assign it as reading, so I see that. I see their eyes light up and they say "Oh, really? Amazon uses it for that? They want high write availability? Well, we're using it for high read availability. This explains why we're getting this inconsistency." They read the paper, they get the context, and immediately they switch on and understand it at a deeper level than other people at their company who have been using that technology for longer but who just don't have the context.
So I think for a newer technology, particularly if there's one key paper, the MapReduce paper, or something, that's fairly straightforward to read.
For something like Postgres, because it's older and it's got a really long legacy, that's harder and you need a little bit of a guide, like a trail guide. Googling and finding a paper, maybe it's good, maybe it's not... But if you've got someone to point you in the right direction and say "Okay, Postgres has a long legacy. Really it's based on System R. In the '70s there were some key ideas..." - well, I shouldn't say "based on." A lot of people will be angry with me if they heard that. But a lot of the key ideas in these traditional relational database management systems are from System R in the '70s. Those are well-documented and there are some famous papers there that really - if you're trying to understand the optimizer, there's one paper there that's still required reading in most databases courses. It's about the Selinger optimization model. If you wanna understand how Postgres really optimizes your query, you wanna start by reading that paper from the '70s.
There are a couple of key papers like that, but they are hard to find. You can google "Selinger optimization model" or you google "Postgres paper." So having a little bit of guidance there helps a lot.
Hopefully you've got senior folk at your company who can point you in the right direction. If you say "Hey, I really wanna understand this at a foundational level", hopefully they'll point you to those kinds of resources and not Stack Overflow questions. But yeah, it is tough for sure. You need discipline, you need some spare time at work, ideally. You need to be at the kind of workplace where you can say "Hey, I'm gonna read this paper for a couple of hours to get a better sense of what's going on here" and for people to be cool with that.
I would argue too that you shouldn't have to say "Hey, I'm gonna..." Maybe that is too helpful but you shouldn't be treated like a child, where you have to ask for permission, to say "Let me read a paper to get more knowledge on what we're gonna do." I mean, I don't know... [laughter]
Well, you don't get any sprint points for reading a paper.
That's true. Well, I think we're back to the whole waterfall/agile; we're forced to produce, and producing is code... Or commits, at least.
[40:04] This might be a good time to mention the Papers We Love repo on GitHub and that community, because what I was thinking there, Oz, as you were talking, is it'd be great to have a centralized curated place where you could just come and say "Okay, when it comes to Bigtable or when it comes to MapReduce, this is the paper, and it's right here." I was thinking "Yeah, I've heard of Papers We Love", so I was looking it up, and actually that's close to what they're doing. They have a data store section, and this is Bigtable databases, Dynamo... So you can find all those papers in one centralized place, curated by a community. That takes out some of the legwork that may otherwise prohibit you from finding the best or the canonical paper for this particular subject or tool.
I'm glad you brought that up, because they deserve a shoutout. Great community overall, particularly in New York and San Francisco. Incredible organizers. I really would encourage folks to go to those meetups if there's one in their city. Great community, great turnout. There's always a thoughtful speaker focusing on one paper... But then the people in that audience are gonna be the folk where you can ask a question like that, "Hey, I'm trying to understand this thing. What would you read if you were me, or what would you be thinking about if you were me?" It's a totally different community to the standard meetups, so big ups to them.
For the listeners, if there's an easy thing you wanna do right here and right now while you're listening, you can actually tweet @loveapaper. That has their repo link in it, so if you just wanna say "Hey, I heard this on the Changelog", tweet that, it'll at least point any friends you have in the right direction. We'll obviously include a link in the show notes, too.
Absolutely. Oz, moving on in your checklist here... The fourth letter, the H is "Determine the Historical context", and I feel like that keys into number three, the paper, because really when you read the paper you're probably gonna tease out the historical context. But the idea there is what you've said previously, where if you found out why Cassandra was such a key tool that was abstracted out because of their necessity for high availability writes. Well, now you know why they built it the way they did, and now you know whether or not it fits your needs or not.
Yeah, absolutely. If you can't find it from the paper, find it somewhere else, or just dig and ask those questions for yourself... Like "Why did Google build MapReduce? What problem was it solving? What hardware were they using? How much were they paying for it and how much was that an issue, and how much of that fits?" That totally changes the equation.
Whereas if you just google "Is MapReduce a good tool for this job?", you're probably gonna find the answer "Yes" somewhere; somebody said that, and you're ready to read it.
Well, maybe it's a good moment to add a little self-plug - another place that's great for historical context is podcasts, so find the podcast about the topic. That's one of the things we do. We've just recently had a show about Kubernetes, and we ask those kinds of questions: why was this born inside of Google? Why was it open sourced? What is the reasoning behind it? That's a great way to get historical context if you can't find it elsewhere. Heck, if we ever ship that transcripts feature, Adam, you could even just go read the historical context. [laughter]
Right here on the show you're gonna...? Ugh! I'm just kidding. We do, we need to get that out there.
I just got all depressed here on the show...
Oh man, don't do it... Don't do it. Listeners, we have transcripts since episode 200 of The Changelog... So that's a lot. That's like 54+ shows at this point; maybe more, since this show is probably more like episode 260, or something like that. So long story short, we have full transcripts, we just haven't shipped the feature yet. It's because we have other problems we're trying to solve, and that's okay.
We'll just keep telling ourselves that until we ship it.
That's the band-aid, that's what makes it okay.
Next two points here, just to finish up your acronym before we take another quick break is "Weigh the Advantages against the disadvantages." Do you wanna expand upon that, or do you think that's self-explanatory, Oz?
[44:11] Look, it was mostly to get the "A" in there, but... [laughter] But you know, there are gonna be tradeoffs. The main thing that we do as engineers is decide which tradeoffs to accept, and sometimes we're honest about that and sometimes we're not. To keep going with the Dynamo/Cassandra example, you trade off consistency. Maybe you want consistency. Probably you want consistency, so just being aware of that quid pro quo is what I'm saying there. You get the advantages, but that comes with disadvantages.
I liked the second half of that though, which was "Determine what was de-prioritized to achieve what was prioritized." So it makes sense advantages/disadvantages, but in that context it's about determining what was more important versus what wasn't as important, and does that align with your goals or your problem?
Well, that probably goes back to - who was it...? Was it the Fred Brooks book, "There's no silver bullet"? We're always looking for the panacea, the perfect solution, the silver bullet to solve all our problems, and what we find out is there aren't any. Everything is a tradeoff, and that's what engineering is - picking the correct set of tradeoffs for your problem domain as you laid out, Oz... So absolutely, if you're going to prioritize something, you have to de-prioritize something else. Knowing those things before you go into the tool is just - think, man... That's number six, Think.
I almost think this is like a par for the course, though. Obviously, you wanna think, right? Did you feel like you had to drive that one home, Oz?
...with an exclamation mark.
Yes.
You know, really, I want the whole article to just be Think with an exclamation mark and that's it. Just that one word. But you know, I ran that by somebody and he was like, "Yeah, that's not gonna do well, just an article with one word", so I fleshed it out a bit. But that's the core thing here, really.
Do you think the title "You Are Not Google" is much more sticky than just "Think"?
I'm glad I went with this then, but... You know, this is the main thing, and many of us have said this. We do need to say it, because we see thoughtlessness a lot... Even if we're reminding ourselves to be thoughtful as well when we say this.
Writing code is not really about writing, or writing a book is not really about writing. Thinking is the thing that we do mostly. Mostly, we're paid to think, and eventually that gets translated to running code. But the fact is, when you look, most people are jumping to the implementation, most people are jumping to a technology choice, most people are jumping to their way of writing code. We're not thinking.
[47:06] The thing that we know works in software engineering is thinking; everything else is contextual, everything else there are tradeoffs. Thinking is the one thing where you can reliably get better results by doing.
That's awesome. Oz, thanks so much for taking the time to write this post, to come on this show and share so much of what you know about software development and all the wisdom you've shared. I really appreciate your time, man.
Yeah, thank you.
Our transcripts are open source on GitHub. Improvements are welcome.