Nishant Roy, Engineering Manager at Pinterest Ads, joins Johnny & Jon to detail how they've managed to continue shipping quality software from startup through hypergrowth all the way to IPO. Prepare to learn a lot about Pinterest's integration and deployment pipeline, observability stack, Go-based services and more.
Nishant Roy: Yeah, I think you're right. I mean, it's an impossible decision to make, right? Because on one hand, if you're spending too much time on optimizing early on, you're not going to make it as a startup, especially given this environment we're heading into now… Speed of everything is of most importance. On the other hand, like you said, you end up with these legacy systems that don't then support hypergrowth, or supporting new products.
I think one big thing that we've seen over the last probably five years to a decade or so is widespread adoption of video. So folks or companies who didn't really optimize early on for, let's say, content delivery for instance, suddenly you're delivering video, which is obviously a lot more expensive, and people are having to reconsider how they built those systems, or in some cases even rebuilding those systems to operate at this new scale that customers now want.
For us, for my team in particular, the biggest scale challenge we were going through was the team growing. So like I said, my team specifically was ten people, now the entire ads infrastructure org is probably getting close to about 100. The ads team is probably three to four times that number. So the question we were asking ourselves is now that we're going through this phase of growth, building a lot more products for a lot more users, for a lot more partners, how can we continue to keep that same level of velocity that our pinners are used to, our partners are used to, and us as engineers are used to, with this growing org, without causing more incidents or without having people stepping on each other's toes? And I think that is, to some extent, the eternal quest; we're never going to really hit a perfect happy - I don't know what the right word is, but middle ground between all three of those factors. But that is something we're constantly evaluating and tuning for, is how do we build our systems in a way - either can we build better modularity, can we have more config-driven systems where folks can make changes without needing to understand how the rest of the system works?
For instance, for rolling out new experimental models, do I need to actually go and read exactly how those models are going to be chosen? Do I need to understand what features have been fed into my model? Do I need to understand how those scores have been used? Or can I just go in and make a simple JSON change or something, and say "For this subset of traffic, which may be country, or surface, or device, for iPhone users coming from Canada on the search page, I want you to use this model, with this percentage of traffic. And I don't care how the rest of the system works."
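To make the idea concrete, here is a minimal sketch in Go of what such a config-driven model selection could look like. It is purely illustrative and not Pinterest's actual system: the JSON field names, the `Rule`, `Config` and `Request` types, the model names, and the hash-based bucketing are all assumptions introduced for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// Rule is a hypothetical config entry. Empty match fields act as wildcards.
type Rule struct {
	Country string `json:"country"`
	Surface string `json:"surface"`
	Device  string `json:"device"`
	Model   string `json:"model"`
	Percent uint32 `json:"percent"` // share of matching traffic, 0-100
}

// Config is the hypothetical top-level document an engineer would edit.
type Config struct {
	DefaultModel string `json:"default_model"`
	Rules        []Rule `json:"rules"`
}

// Request carries only the attributes the rules match on.
type Request struct {
	UserID, Country, Surface, Device string
}

// exampleJSON: 10% of iPhone search traffic from Canada gets the new model.
const exampleJSON = `{
  "default_model": "production_v1",
  "rules": [
    {"country": "CA", "surface": "search", "device": "iphone",
     "model": "experimental_v2", "percent": 10}
  ]
}`

// LoadConfig parses the JSON config.
func LoadConfig(data []byte) (Config, error) {
	var cfg Config
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func matches(want, got string) bool { return want == "" || want == got }

// bucket maps a user ID to a stable value in [0, 100), so the same user
// consistently lands in or out of an experiment.
func bucket(userID string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(userID))
	return h.Sum32() % 100
}

// ChooseModel returns the model of the first rule that matches the request
// and whose traffic percentage covers the user's bucket, else the default.
func ChooseModel(cfg Config, req Request) string {
	for _, r := range cfg.Rules {
		if matches(r.Country, req.Country) &&
			matches(r.Surface, req.Surface) &&
			matches(r.Device, req.Device) &&
			bucket(req.UserID) < r.Percent {
			return r.Model
		}
	}
	return cfg.DefaultModel
}

func main() {
	cfg, err := LoadConfig([]byte(exampleJSON))
	if err != nil {
		panic(err)
	}
	req := Request{UserID: "user-123", Country: "CA", Surface: "search", Device: "iphone"}
	fmt.Println(ChooseModel(cfg, req))
}
```

In a setup like this, the person rolling out a model only edits the JSON; the matching and bucketing code stays untouched, which is the property being described here.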
So those are the things that have been really on our minds constantly, and something we're looking to continuously improve. In fact, the pre-submit test framework was one way that we did that: anyone can go and make a config change to add a new metric and add a new slice, without affecting how any of the rest of the system works.
Our actual ads delivery system isn't fully there yet. We're continuing to improve this for developers, making it easier for them to make changes, a) without bringing down the system, b) without blocking other folks or needing to understand how your change interacts with someone else's. But like I said, it's the eternal quest; as the product offerings from the sales side and the product teams get more and more diverse, we realize that some parts of our system just weren't built with those sorts of products in mind.
Something as simple as if we need to serve - these are some of the problems that we've solved perhaps, but if we need to serve both video and image ads for the same request, do we have a good way of doing that? Not a particularly hard thing if you're building a system from scratch, obviously, but once you've built it with a certain assumption, just these basic things sort of start to fall apart, and then you need to go and look into "Should I just be redesigning this whole thing, or is there a quick and easy way for me to get this off the ground, and then go in and redo it to unblock the next big thing?"
[31:48] And again, what ends up typically happening - I think this is probably true for most big companies - is you obviously need to get that MVP out, so you do something a little hacky to start out with, and then you retroactively go in and ensure that a) it's not gonna break anything, and b) how do we make this a more pleasant experience for developers and product managers alike? So that's the wheel that's always spinning, and we're trying to stay ahead of it, but we're usually playing catch-up.