Practical AI – Episode #102

Hidden Door and so much more

featuring Hilary Mason


Hilary Mason is building a new way for kids and families to create stories with AI. It’s called Hidden Door, and in her first interview since founding it, Hilary reveals to Chris and Daniel what the experience will be like for kids. It’s the first Practical AI episode in which some of the questions came from Chris’s 8yo daughter Athena.

Hilary also shares her insights into various topics, like how to build data science communities during the COVID-19 Pandemic, reasons why data science goes wrong, and how to build great data-based products. Don’t miss this episode packed with hard-won wisdom!



DigitalOcean – DigitalOcean’s developer cloud makes it simple to launch in the cloud and scale up as you grow. They have an intuitive control panel, predictable pricing, team accounts, worldwide availability with a 99.99% uptime SLA, and 24/7/365 world-class support to back that up. Get your $100 credit at

Changelog++ – You love our content and you want to take it to the next level by showing your support. We’ll take you closer to the metal with no ads, extended episodes, outtakes, bonus content, a deep discount in our merch store (soon), and more to come. Let’s do this!

Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at

Rollbar – We move fast and fix things because of Rollbar. Resolve errors in minutes. Deploy with confidence. Learn more at

Notes & Links

Play the audio to listen along while you enjoy the transcript. 🎧

Welcome to another episode of the Practical AI podcast. We’re gonna explore today a lot of interesting things in AI in a little bit of a different way. My name is Chris Benson, and with me as always is Daniel Whitenack, my co-host. How’s it going today, Daniel?

Oh, it’s going great, other than I woke up this morning and my computer – we had a power outage last night, so my training run abruptly ended at some point in the night, I’m not sure when… So getting that restarted; I guess I’m learning those sort of things about having an AI workstation locally, whereas most of the time before I just ran stuff in the cloud, so… You know, upsides and downsides, I guess… But yeah.

Sounds good. I guess for listeners who have been tuning in for a while, Daniel built his own workstation for doing deep learning, and he’s been going through the trials and tribulations… So I’m waiting till he has it all figured out before I try it myself…

On some Fully Connected episode we’ll have the chat about this. Fun has been had through that process, so…

There you go. Everyone can learn from your pain on that one…

[laughs] Yes, please do.

Well, today we have a bit of a different episode from the usual, in that we have someone who’s joined us who has done lots of interesting things in the past, which we’ll hear about, as well as some things that she is currently engaged in. With us today is Hilary Mason, who is currently the co-founder at Hidden Door. Welcome to the show, Hilary.

Thank you. I’m pretty excited to be joining you today.

Well, thank you very much. Daniel and I have been familiar with your work for many years, because you’ve done a lot of stuff that within the context of the data science world has been very much in the public eye… And so we are really interested in finding out some of what you’ve done in the past, some of which our listeners may possibly – those that are familiar with you may already be familiar with, but also some of the cool stuff that you’re doing now.

[04:05] I guess, to start off with, would you just kind of give us a little bit of background about yourself, how you got into the field, highlight some of the stuff that you’ve done up until now?

Sure. First, that’s really kind of you to say, and hopefully we’ll talk about some things today that you aren’t familiar with…

Okay, looking forward to it.

So I’ve been doing this for quite a long time, as you say, and I’ve tried to be very prolific, because if you do enough things, at least some of them are likely to be interesting. I started in computer science and machine learning over 20 years ago. It’s been a long-term interest of mine. As a kid, I loved science fiction, I still do love writing stories, I love thinking about machines that could really be partners to us… And obviously that wasn’t entirely possible in the ‘90s, and I’d say that it’s maybe just now becoming possible… But I started in Academia, and realized that I have several personality traits that could make me a little bit of a mediocre academic, but I think a pretty good entrepreneur.

That’s pretty interesting… Could you just give a bit of detail about that? I’m kind of interested… Because there are a lot of people that are maybe sitting in Academia a bit disillusioned, or maybe they’re like in industry, they don’t know if they should go do research, or all of those things.

Yeah, and it’s a hugely emotional process, whatever side of it you’re on, to think about where you might go. And there’s definitely – I have a lot of conversations with folks, even now, who are considering whether there are opportunities to use their academic skills in industry, if they’ll find something that’s nearly as intellectually fulfilling… So yes, this is something I have a ton of empathy for, especially because I went through it.

I was in a faculty role, and the things that I was interested in - first, in computer science we still tend to give a little bit more status to theory work than we do to engineering work… But I really like to build things. And I can force myself to go through and do some proofs, and math is something that I can get along with and I do enjoy, but if you give me the choice between spending two weeks at a chalkboard, thinking about math, and spending two weeks at a whiteboard and at my keyboard, actually trying to build it, I will always choose the latter. So that’s one personality flaw of mine.

Hear, hear.

And then another one is that I have a relatively short attention span, in the sense that as an academic researcher, I find that some of the best work requires persistence over not just months, but over 4, 5, 10 years… And I tend to – after one year, something has to change for me to continue to keep up that persistence and interest in it.

So I have a relatively short attention span, I like to pay attention to a lot of different things at the same time and try to figure out where they connect; that’s also something that is a real asset in data science, where you’re often facing some sort of problem, you have certain technical tools that you understand, certain data assets, maybe a product platform to build on top of, and you need to figure out how to pull all those pieces together in something that’ll work… And if you’re doing it in a startup context, you need to do that quickly. You don’t really have the luxury of a year to take the best approach… So that’s another personality trait that I have. It’s also something I look for when I hire folks out of Academia.

[07:48] One of my favorite questions to ask is to lay out one of the technical challenges that we’re working on, and to say “How would you approach this?” and usually they give a wonderful answer and I say “Great. Now, what if I told you you had two weeks to build something, what would you do?” and then the sweat starts to pour down… And then I say “Okay, now you’ve got two days. What are you gonna do? What is the stupid simple thing you can pull together?” and really looking for that kind of agility of thinking and being able to make decisions about where you’re gonna prioritize that simple thing, versus the right thing. And I think part of that is that I am more of a hacker than I am a perfectionist… And again, these are just things that I’ve realized that I need to work with people who are perfectionists, and who have that other perspective on things, because that way we end up building really great things… But those are my personality flaws. [laughs]

Yeah, I appreciate you going into detail there, because I think those are a lot of things that people think internally, but they don’t voice them a lot when they’re going through that, or ask people “Hey, I feel this way. Where should I go? What should I do?” A lot of times people just kind of struggle with that inwardly, and don’t really voice it.

Yeah. And I’ve spent so many years feeling like there was something wrong with me, because I preferred writing code to writing math… Even though it is ultimately the same thing, and it’s easy enough to go from code to math, I felt there was something wrong with me because I had these particular traits and preferences about the environments I’m working in… But I’ve only realized that now, and I’ve been doing this for 20 years; it was a long time.

So as you were saying all that – you and I share a bunch of the same characteristics. Over my career I’ve also had the same thing, where I’ve been like – I’m feeling like I was doing it maybe not the best way in every case; but hearing you doing it, and with the successes that you’ve had over the years, it kind of validated like “Oh, maybe things aren’t so bad after all.” So I think it’s a really good message to get out there, because I suspect that if you are and I am - and probably quite a few other people out there are experiencing the same.

I’m sure.

That’s great. I’m kind of curious – I know one of the, of course, aspects that we’ve been talking about is kind of jumping into industry, and building products, and those sorts of things… But how did you get to the point where you really started getting this interest in data science, and data science products? How did that develop?

Yeah, it’s a good question. Like I said, I started in Academic machine learning; I have always been interested in using that as a tool to build useful things… But I actually learned the product lesson the hard way. There’s a bit of a longer story here, but the short version is that I left my academic position and came to work for a startup… And the startup at the time - we were building statistical models of career progressions off of millions of resumes we crawled off the web. This was in 2007-2008, so it was something that was novel. You couldn’t go on LinkedIn and see that kind of analysis done for you. It was obviously something I had a personal need for, because I was going through a career crisis myself at the time… And that company ended up failing in nine months, and it failed because of two things. One is that we built some beautiful data science models, put a UI on them, and nobody actually wanted to use them.

The second thing was that everybody thought a website about data about careers was meant for college students. But the whole idea here was that you could say “Okay, I’m a lawyer, and I don’t wanna be a lawyer anymore. What do people like me go on to do?” Or you could say “I’m a software engineer and I wanna be a CEO. How do I get there? What are the career paths that other people who have made this transition have taken to get there? What are the stages they go through? What fields did they explore?”

[11:58] And college students, of course, have no data, so it actually didn’t work for them at all. So there were some flawed business assumptions, some flawed product assumptions, and that was a wonderful lesson, and I learned it thankfully on somebody else’s company and somebody else’s money… You really have to build a product that is useful to people, which of course sounds so obvious when you say it out loud, but at least it was not obvious to me at that time.

So I went from being primarily interested in the modeling, to being primarily interested in building useful things, and have spent quite a bit of time studying and practicing and trying to figure out how to do that in the intervening years.

I’m curious, to that point - how do you look at that now? Because that’s always a hard thing for companies to do. Regardless of the industry, people trying to figure out what their customers want, and their needs, and what do people actually wanna use and are willing to commit to; how have you addressed that as you’ve learned that over the years?

That could be a whole course…

It sure could.

There’s a lot to talk about there. But I think the principles of it are really to clearly identify the problem you’re trying to solve – like, try to get to the question; don’t rush to the answer. And then figure out the people who you’re answering this question for, and figure out what’s actually useful to them. And this, honestly, is a mix of both quantitative analysis and qualitative analysis, and really trying to have empathy for the people who you’re trying to build for, and understanding where this fits into their lives, which are always interesting and chaotic… And then it comes back to what we were actually talking about before, where you start to build the simple things that can start to potentially address the problem to understand if it’s even worth investing in the best things.

So it really is a process that sort of merges the practice of product management and product design with the practice - if you’re building what I would call a data product, which is something that really depends on some sort of data science or machine learning capability to build the core feature of the product, so something like a weather prediction… Google Maps is always my favorite data product; the navigation stuff. These products could not exist without the underlying representation of the data… So you’re really trying to figure out what decisions are people making, in what context, how do I get that information to them in that context, so they make a better decision; how accurate, how good does it have to be for them to first get benefit from it, but then second, actually use it? If your weather predictor is 1% more accurate than another one, that’s not a compelling reason for me to use it, if the other one is more convenient… So really thinking about all of that.

One of the gaps I see broadly in our practice right now is that we often have product designers and product managers who are in the position to make the best decisions about the use of data science machine learning in their products, but they don’t have necessarily the background, the knowledge, the access to that talent or tools. They’re certainly not gonna build their own deep learning rig in their house… And then you have on the other hand your data scientists who often are not connected to the customers, or to the ultimate people who will benefit from the work they’re doing. So I think in our field of practice there’s actually a lot for us to figure out, not just from a “I’m a startup and I wanna build something new” point of view, but really from your day-to-day practice as somebody in this field… Like, how do you think about doing this work in the organizational structure you’re in? I think it’s kind of a mess right now, and that’s a big opportunity for us.

[16:05] I know you and I have met previously and have talked about business, and you’ve really become a powerhouse in developing entrepreneurial opportunities… But before we get too far past that, I am curious - I know I alluded to in the beginning, we talked about the fact that our first awareness was when you were producing content, and learning from your students… I’m kind of curious, how did you integrate teaching others your expertise into your entrepreneurship? Does it still have a role? Has that role evolved or changed? Was that just a step along the way? I’m just kind of curious about the development there.

Yeah, it’s a good question, because I think people take a variety of approaches to this, and the honest truth is that I love teaching, and I love talking to people. It’s why I’m here this morning. And I also really like sharing opinions that are useful. The company I founded about six years ago was called Fast Forward Labs, and we met at the Fast Forward Labs office a while back… That company was built on the idea of doing independent applied research and sharing as much of it as possible… But the sharing piece is not exclusive to my work at Fast Forward Labs. It’s been a thread through all of my different data science jobs or management jobs I’ve had, or things I’ve gotten involved in.

I think it’s really important in this field to talk about what we do and what works, and more importantly, what doesn’t work… Because the field is so young. People have only been able to get a degree in data science for maybe 4-5 years, and that still astounds me… The fact that you could have a job with that title has been a thing for about a decade. There’s a lot for us still to figure out, and that’s why I love it, by the way. It is not a solved problem, and our technology is also not a solved problem. It is really weird.

If you look at the change in capabilities of machine learning technology and how you have to manage using them and investing in them - it is completely different than most technologies that folks are familiar with… And yet, we tend to shoe-horn it into existing structures and processes that come from software engineering.

When I talk about the stuff I like to work on, for one thing, I’d say it’s a two-directional exchange. One is that I and the teams I have been fortunate enough to work with - we try to have our unique point of view, and we try to be… And this is really something I can’t help, honestly, but deeply pragmatic about what something is, how it works, where we think it’s useful… This is really important in AI and machine learning, because there is so much hype, there is so much salesmanship, there is so much marketing that it’s designed to get people who don’t themselves have a deep expertise to believe something that’s not quite true… And because the tech is weird and changing so quickly, it’s very easy to believe that stuff… So it’s really important, if I can say this as a technologist, to have those pragmatic points of view and then share them where you can, because we need to build a consensus in the community around what is possible and what isn’t, the best ways to approach certain kinds of problems… And we only do that by sharing.

The other thing I’ll say is that I love working in the data science community because the more that you succeed, the more I succeed. We’re not directly competing with each other, and for the most part if we’re data scientists or we’re machine learning engineers at different companies, I can help you out. I can hear about what you’re working on, I can share what I’m working on, we can give each other feedback… And within a certain ecosystem, the more one company succeeds, the more another is likely to as well.

[20:11] This is really different than if you ever have had the – well, I don’t wanna be unkind, but if you’ve ever hung out with a bunch of hedge fund quants… I live in New York City, so occasionally I end up – well, not anymore, but I used to end up at those events… These folks talk about the weather and about sports because they do work in a community where there is a significant competitive dynamic.

So I think that one of the things I really appreciate is the ability to share, and having that actually be supportive for all of us. And then I’ll also say it’s worth sharing, because you will get feedback, you will meet people who are interested in the same things you’re interested in… And I’m an introvert; it is very hard for me to talk to a lot of people, which may seem a little bit counter-intuitive, because here I am, talking to you…

That surprises me, yeah…

But by talking to you once, and now this discussion will be out in the world, folks who are interested in the things that we’re talking about today will reach out to me. I don’t have to go out and talk to a thousand people to find the two or three who are gonna share these interests… So it’s a good hack also to find the people who you really can brainstorm with and share with.

So you were talking quite a bit – you used the word “community” a bunch of times as we were talking through that last section, and I’m really interested in a couple of things. First of all, your insight into how to do data science well… Because as we know, there are so many ways to go off the rails in a variety of ways. But I also wanna throw in the fact that as we are talking today, we are still in this world that is dominated by the Covid-19 pandemic, and that has completely changed what we have been doing for years as professionals… And we’ve had to adjust workflow, we’ve had to adjust the way we communicate with people… What the word community means to us in terms of implementation has adjusted… Could you give us some insight into how you’re adjusting and how you help people think about doing data science well in this new environment?

[23:48] Yeah, I actually think this is an unsolved problem, and I’m glad we can talk about it, because hopefully it’ll get someone who is creative and excited about it thinking… We’re recording this podcast on Zoom – this is not great, this is not the end game. This is in no way as rewarding or as helpful as having a personal or in-person connection… And our data science community of practice has – there are a bunch of events people go to, and I always find that if you wanna know what somebody did, you read their white papers or their publications.

If you wanna know what they’re doing now, you take them to coffee; if you wanna know what ideas they’re thinking about, that they haven’t quite decided if they’re good or bad yet, you have to really talk to them in a way where they’re comfortable, usually at some sort of event… And now we’re missing all of that. So we’ve lost that layer of connection, and I feel like we’re also burning down a lot of the social capital we’ve built up before the Covid crisis… So I think there is a wide open space for people who can figure out how we do data science together, how we continue to have this open space to share and learn when we’re dealing with the fact that we can’t travel, and mostly we have to stay apart from each other… And I wouldn’t claim to have an answer, but I think it’s an area where I would love to see more attention.

I agree. Last year, as we were recording and talking about things, I know Daniel and I were always off to conferences… I live in Atlanta, but I was in New York often, occasionally meeting with you, and moving around and having great conversations… So it has definitely been a challenge trying to bring the same level of quality into the conversation, and the same level of sharing insights. Because when you’re around the conference table, it’s so much easier to just hop up and hit the whiteboard and share those ideas. If you take that then to the next level and you’re actually talking about producing, creating data science products, and informing other products with your data science, and contributing to that whole development effort, it’s definitely gotten a little bit harder to get to those points quickly since then… And I was wondering, as someone who is certainly actively doing that now, how have you adjusted to that? How have you tried to ensure that you’re able to get there successfully, in the same way you have in the years before we got to this point?

Yeah, I wish I could say I’d solved it, but the things I’ve been thinking about are trying to observe what’s missing. One thing that has come up for me is that now that I have all of my meetings in this form through a screen, I’ve started to forget who I talk to about what… And it’s because we’re missing - at least for me - the physical cues that were tied to the storage of memory. So I have a couple of folks who we’ve had a monthly meeting in a diner, in [unintelligible 00:27:09.00] and now we’re doing it on Zoom, and I couldn’t remember what we talked about last, because we didn’t have the physical cue of “We were at this table, and I had this diner coffee cup, and the eggs were overdone…” And I love diners, so I say that with the biggest amount of affection.

So it’s trying to say - okay, I’ve noticed this is missing. How do I create that context? Can I use different Zoom backgrounds for different groups of people I’m meeting with? Can I physically alter the space I’m in, which is always hard in a New York City apartment, but at least turn around, or try and find another corner to sit in…

[27:47] When it comes to the work of data science, one thing I’ve noticed is missing is the casual brainstorming and relationship-building. I think these things are tied together… But it’s really easy to talk about the work that’s clear, but it’s almost too easy to get caught up in the details of what’s obvious, and not to spend the time on what isn’t obvious, or those ideas you have that are just a little weird, and you’re not sure if they’re the best idea or the worst idea (I have a lot of those), and you need to share with somebody to get their impression to really know if it’s worth thinking about or exploring further…

So really trying to create the space for that kind of discussion, which is generally less structured, may involve talking about “What do you see out your window in Atlanta?”

Rain today.

“I can show you what I see – it’s a beautiful, sunny day in Brooklyn.” Yeah, so I’m trying to be very thoughtful about that, and then thinking about, as we move into playtesting some of our new stuff, how we do that in a way that gives us the same kind of information that in prior years you could just get by sitting down next to somebody and watching them try to use the product. I don’t have an answer, but I think the first step is to be really thoughtful about what’s missing, and then try to create the space for it.

Yeah, I really resonate with what you’re saying, especially as a person in an organization that doesn’t have a big data science team, or something like that. I oftentimes feel like I have all of these what to me seem like really great ideas, but I really have no gauge on “Are they good or are they not?” And now I’m not really going to meetups or anything like that in like a physical format. There’s a guy that works down in Indianapolis that I’ve known for a while, and we just have like a standing meeting on our calendars every couple of weeks… It’s labeled “ML Water cooler”, and we just chat about random things, like you were talking about… So I really resonate with that. I think that’s even… Like, pre-Covid, for people that were in that sort of organization where they’re hired in as maybe the first data scientist, or establishing some sort of data science initiative - that can be really tough. So yeah, I resonate with that a lot.

You’re right, in that same area. And it is definitely something that was an issue before Covid, too. I think this has just exacerbated the pain of it, because it was a lot easier if you were the only data scientist at a company to go out and have data scientist friends at other companies, and now you have to be very deliberate about it.

And that raises some questions – you know, you still have customers, right now that you’re developing in your new stuff that we’ll talk about, and you have done that repeatedly over the years… How do you adjust to that right now? Your ability to – once upon a time, you could go places where your customers would be, and you could engage them in various ways that were familiar, that are all gone at this point. As you’re trying to build things now, and you’ve already talked about how critical that customer and user feedback is, how do you achieve that today?

You get into survey science? I don’t know… [laughs]

Yeah, we’re really trying to hack together the closest approximation. I don’t have an answer, but figuring out “Can I give you something to run in your own environment, and then be on video and trying to observe you that way? Can I ask you a bunch of questions that seem adjacent to the thing I’m actually interested in, to try and understand the context around how you think about what you’re doing?” For example, for Hidden Door we’ve been asking a lot of families “How do you tell stories together? How do you feel about it as a parent?” For kids, it’s like “Why is this fun for you? How many characters do you have? Do you always make up new characters?” So it’s really just trying to develop that empathy through the means we have access to right now… But you know, honestly, it is a challenge.

I guess at this point you started to allude to characters, so it might be a good moment here to talk about your current work and the current project you’re doing. I know that you announced on LinkedIn when I reached back out to you about Hidden Door… Could you tell us a little bit about Hidden Door at this point?

[32:21] Yeah, I’m happy to. Hidden Door is a product for using machine learning and AI as a tool for creative assistance, primarily focused on kids and the people who like to tell stories and write with them… And what I mean by that is that I think the tools of NLP and natural language generation are tremendously powerful, but they’re primarily useful to provide the things that are repeated in narrative and to accelerate other human creative efforts… And by that, I mean - kids are deeply creative, but they are not necessarily familiar yet with things like the standard narrative arc, where you introduce tension, how you manage conflict in narrative…

In genre writing they may not understand the things you get for free in a genre, versus the things you have to explain… They may have the spark of an idea, but find that it’s actually a lot of work to create the story around it… And we believe that the technical tools around NLP and language generation are just now starting to become powerful enough to be supportive tools for that sort of storytelling. And by the way, they’re really fun, and I think anyone who spends time with kids or has kids of their own knows that kids are endlessly creative, and they often demand endless creativity of their parents, too; so many parents I’ve talked to - and it’s my own experience as well - it’s like “Can we tell yet another story about the wombat who really wants to play basketball, or the grapes that rolled down the stairs onto the subway and went on an adventure?”

So we’re building essentially tools that can be part of this creative process, but there are a couple of things I wanna be very clear about, one of which is that I don’t expect the technology itself to be creative. The creativity comes from people. And the second is that this does not do all of the work for you; it’s more of a partner, in that as you explore and create and play, it starts to fill in the structures and the gaps and the descriptions, and does it in a way that is really fun and is actually more like a game than like a homework assignment.

And then the last thing I’ll say, because I know this audience is probably deeply familiar with things like GPT-3 and a lot of the issues with natural language generation tech, is that there is a real need to build structure, safety, coherence and memory into these systems before they can be deployed for any human-facing application, much less one that you’re willing to put in front of young people. So there is a huge amount of engineering that we’re thinking through right now around how you build those systems, such that we’re confident letting people use them in a way that is unsupervised… And I mean that in both senses of the word. And the last thing I’ll say is that I grew up playing Dungeons and Dragons and a bunch of other role-playing games…

So did I.

…so if you need a metaphor for this, it really is the dungeon master… Like, finally, can we have our computer system play the role of a dungeon master in structuring and guiding a story, without being deterministic about where it starts or where it ends.

You mentioned a couple of things that I would really love to dig into. One of those is you mentioned how you thought that we were kind of at a point where NLP technology could augment some of these things, so I wanna dig into that a little bit… And also, you mentioned the safety aspect, which is of course a big topic… But maybe we can dig into that first one first.

From your perspective, as you’ve seen a lot of different subfields within data science and AI growing, what really catches your attention about the growth in NLP right now? And maybe why it’s crossing into some areas that can augment more sophisticated workflows like this.

Yeah, so my last company was Fast Forward Labs, which I’ve mentioned earlier. We did a lot of independent applied research. Our very first research at Fast Forward Labs in 2014 was around natural language generation. It’s an area I’ve been interested in following since then. And at the time, the state of practice was essentially that you pre-generated templated sentences; it was more like a pregenerated [unintelligible 00:37:35.23] style thing, and then you had a process that would dynamically assemble those sentences… And it meant that at the time the tech was really good for taking quantitative data, so something like a weather report, for example, or a company’s quarterly earnings report, or even scores in a sports game, and writing an article off of that.

If you showed it something that it had not seen before, it just did not have the language and the sentences just did not exist to be generated… And at the time we built at Fast Forward Labs a system that would take structured data about real estate, so apartments in New York City - we crawled 60,000 apartment listings, and then we built a little system where you could say “Okay, this is a two-bedroom/two-bathroom right by Central Park. It has a doorman and a washer-dryer”, and it would write the ad for you. It worked well enough for things that were common, so things like that one I just described.
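The template-assembly approach described here can be sketched roughly like this; the templates and listing fields are made up for illustration, not from the actual Fast Forward Labs system:

```python
# Toy sketch of template-based NLG: sentences are pre-written with slots,
# and a generator fills and assembles them from structured listing data.
TEMPLATES = {
    "intro": "Charming {beds}-bedroom/{baths}-bathroom right by {landmark}.",
    "doorman": "A full-time doorman greets you at the entrance.",
    "laundry": "Enjoy the convenience of an in-unit washer-dryer.",
}

def write_ad(listing):
    # Always start from the intro template, filled with the listing's fields.
    sentences = [TEMPLATES["intro"].format(**listing)]
    for feature in listing.get("features", []):
        # A feature can only appear in the ad if a sentence was pre-written
        # for it - the failure mode described above for unusual listings.
        if feature in TEMPLATES:
            sentences.append(TEMPLATES[feature])
    return " ".join(sentences)

ad = write_ad({"beds": 2, "baths": 2, "landmark": "Central Park",
               "features": ["doorman", "laundry"]})
print(ad)
```

The key limitation is visible in the `if feature in TEMPLATES` check: anything the system hasn't seen a template for simply cannot be expressed.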

You could also put in things like an [unintelligible 00:38:45.15] one-bedroom apartment, and it would try… It would put a few sentences together, but it wouldn’t sound very good. What has changed since then is the use of transformers and the ability to build these incredibly large-scale pretrained models that excel at token-prediction tasks. That means that you essentially take - at an intuitive level - all of the internet that’s mostly English, a bunch of books and whatever other complementary datasets we can throw in there, train a multi-billion-parameter model against that, and then you use that thing, given a prompt or a series of tokens, to predict what the next token is going to be.
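At an intuitive level, next-token prediction can be illustrated with a toy counting model; the corpus here is invented and stands in for the billions of real training tokens a transformer sees:

```python
from collections import Counter, defaultdict

# Toy illustration of next-token prediction: count which token follows
# each token in a training corpus, then predict the most likely
# continuation of a prompt token.
corpus = "the wizard boarded the rocket and the rocket flew to the moon".split()

follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(token):
    # Return the most frequent continuation seen in training.
    return follows[token].most_common(1)[0][0]

print(predict_next("the"))  # "rocket" - it follows "the" most often
```

A real model replaces the counts with a learned neural distribution over tokens, but the interface is the same: given context, emit the next token.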

Now, there are a couple of things that are really interesting about this. One is that these models are huge, and this is both a good thing and a bad thing. We’ll come back to this in your second question. The second part of it is that what I actually think is really transformative here is not that it’s solving a problem that couldn’t be solved before, but rather that before you would have to train – let’s say you wanted to do a translation from English to French, and a classification of something as, you know… What are we classifying these days…?

[40:06] Everything.

Everything, right. Happy or sad. Maybe we have a hope of actually solving sentiment analysis for real now… And you wanna generate some language – you would have to build a system that was custom-built for each one of these applications, and now you have a general model that can be used for all of them. That’s pretty mind-blowing.

The second thing is that the ability to describe a task with [unintelligible 00:40:29.26] so to give a couple of examples of what you want, and then have the predictor be able to actually follow that - that’s really amazing. It actually says to me that we will likely change our expectations of how we interact with NLP systems in the future, where rather than building these custom-purpose pipelines for one task, we’ll be able to create these general systems that we can tune for a task locally, at the backend; there are a bunch of things implied by that about the infrastructure you need… I don’t think everyone’s gonna have a deep learning box in their house; I do have one, and it’s more of a pain probably than it’s worth…
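A rough sketch of the few-shot idea described here: rather than training a custom model per task, you describe the task with a couple of examples in the prompt and let one general model complete the pattern. The review texts and labels are made up for illustration:

```python
# Sketch of few-shot prompting: the task (sentiment classification) is
# specified entirely by example pairs inside the prompt itself.
def build_prompt(examples, query):
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    # The final entry leaves the label blank for the model to fill in.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_prompt(
    [("I loved every minute of it.", "happy"),
     ("What a waste of an afternoon.", "sad")],
    "This story made my whole week.",
)
print(prompt)  # this string would be sent as-is to a large pretrained model
```

The same model, with a different handful of examples, becomes a translator or a generator instead - no retraining per task.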

You’re speaking to my current pain… [laughter]

Yeah, it looks good on the spreadsheet cost-wise, and then you think about the hours you spend trying to get some driver installed for something, and it’s just not fun anymore. That’s my opinion. I’m glad you’re having fun with it.

There’s a step function improvement in the capabilities of these NLP systems broadly, which means that actually, again, this isn’t something that’s completely new, but the speed of development and the ability to play and use them flexibly in different parts of a product has changed. The cost function has changed. And that’s really exciting to me.

And then when you think about what these systems are good for, they aspire to create the most mediocre drivel that humanity would create and put on the internet… And that is not what we need for writing brilliant stories, and it’s not what kids need for learning to write brilliant stories, so I’m also really excited about the opportunity that building on this foundation opens up to actually create something that is able to encourage that brilliance.

Yeah. And I guess to tie in that last part of where we were going with the safety side of things - you mentioned the large size of these models, and that we’re now scraping much of the internet… Of course, there’s a lot of things on the internet that at least I wouldn’t want to show to a child, I think… So there’s that aspect, I guess, which is around safety and context and all of those things… There’s also – you know, you were mentioning much of the internet, which is mostly English, and there’s of course a lot of kids out there that don’t speak English… So yeah, I definitely see there’s some potential issues, and of course, you can’t tackle everything at once, but what are some of the main challenges that you’re thinking about as you’re trying to leverage these models for the particular audience that you have in mind?

Largely, I’m thinking about it as allowing the maximum flexibility and creativity within a constrained problem space… And just to be very clear about what it means to be safe - it took me two tries to get something deeply misogynist out of GPT-3, and I was not trying… And that’s not good.

It’s disturbing, yeah.

[43:47] As of today, I don’t think you can put the raw output in front of people at all, unless you’re constraining the domain in which it’s able to produce text. So I do think you could use it today for things like translating from language to code; I built a thing that writes really shitty SQL queries, and that seems pretty safe, maybe… But for things where you’re deliberately inciting the model to hallucinate fictional worlds - it’s not safe now.

So we’re constraining the problem space such that we’re able to manage, say, descriptions of characters and descriptions of items in a way that we can then run another layer of machine learning classification, and even in many cases human review and human feedback, to ensure that what is coming out of the system meets some notion of content standards.
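A minimal sketch of that layered approach - a generator whose every candidate must pass a safety check before a user sees it. The keyword blocklist here is only a stand-in for a real learned content classifier:

```python
# Second-layer safety check: nothing the generator produces reaches a
# user unless a classifier approves it; otherwise retry, then escalate.
BLOCKLIST = {"scary_word"}  # placeholder for a trained classifier's policy

def is_safe(text):
    # Stand-in classifier: reject any candidate containing a blocked token.
    return not any(word in BLOCKLIST for word in text.lower().split())

def moderated_generate(generate, prompt, attempts=3):
    for _ in range(attempts):
        candidate = generate(prompt)
        if is_safe(candidate):
            return candidate
    # Fall back to human review rather than ever showing unsafe text.
    return None

safe = moderated_generate(lambda p: "a shard of rainbow inside", "open the box")
print(safe)
```

The structural point is the flow, not the filter: raw model output is never the final word; a separate judgment layer (learned or human) sits between the model and the audience.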

We also have other issues around getting what you expect. GPT-3 and other models let you dial the randomness of the output down or up, but when you dial it down, it actually ends up being quite boring, and when you dial it up, it’s completely random and you can’t direct it where you want it to go, because it doesn’t have taste. We have taste, but as far as the math knows, you’re just sort of randomly exploring a space, and any particular set of tokens is just as likely as the next one… So we need to build systems that learn and reflect that taste. I believe, at least from our approach right now, those are systems built on top of fine-tuned GPT-2 systems, in this case.
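The randomness dial corresponds to the temperature used when sampling from the model’s next-token distribution; a small sketch with hypothetical token scores shows how low temperature sharpens the distribution (boring) and high temperature flattens it (incoherent):

```python
import math

# Temperature scaling of a softmax over next-token scores (logits).
def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

cold = softmax_with_temperature(logits, 0.2)   # nearly all mass on token 0
hot = softmax_with_temperature(logits, 10.0)   # close to uniform
print(cold[0], hot[0])
```

Neither extreme encodes taste - the cold end always repeats the top choice and the hot end wanders uniformly - which is why an extra layer has to learn which samples are actually good.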

I also wanna be very clear, because we’re recording this here in August… We’re at the beginning of this work, so I can’t speak to it as if I have solved this problem. I don’t want anyone to point to this discussion and say “Oh, they did it, so it should be easy.” It’s not easy. And I think there’s almost as much work involved in what we have ahead as there is in getting to this point in the first place.

I actually wanna pull us back a little bit from the technical conversation, because if we finish up the conversation in a few minutes without me asking what I need to on behalf of my daughter, I am in deep, deep trouble. So I’m curious – I’m totally recognizing that you’re still fairly early in the process of building what you’re building with Hidden Door… The first thing I know is that when I go downstairs in a few minutes, she’s gonna be asking “What is it? What’s it about?”, and she has these other games and virtual worlds, and she’s gonna be trying to compare it with those… So from that kids’ perspective - we’ve talked about the ability to use NLP in this, and you’ve mentioned it being the AI-driven Dungeon Master… But could you tell us for a moment what your vision is for - when you get to a point where you’re ready - what that’s like for the kid? What you think they’re going to experience; maybe give us a little quick example.

I’m so glad you asked; I’m hoping she’ll be one of our play-testers too, if she’s up for that.

Oh, she’s very up for that.

So from the kid’s point of view there are really two different things going on. One is to give them essentially a buddy that is co-writing with them, but where they’re in control. So the kid can say “Today I wanna be a wizard who’s getting on a rocket, going to the Moon” and the system will say “Yes… And on the Moon we’re gonna find a pizza shack…”, and obviously, I need this help too, because I’m just making this up off the top of my head.

It’s fine. It’s good. Keep going.

Then the kid can say “Oh no, it’s not pizza. I don’t like pizza. It’s spaghetti” or “It’s porridge”, whatever fits in their model, and the system will start to generate things that adapt. So it’s really this partner buddy where they’re able to hit a button [unintelligible 00:47:57.27] or have something that can support their creativity, and it reduces the work of creative play by providing that support, and they can say “Oh yeah, this is great. I opened the box and I did find a shard of rainbow inside.” Or they can say “No, I hate rainbows. It was a horseshoe.” And then the system will adapt to that, and then the next thing will be that as well.

[48:26] At the same time, this game is encouraging and rewarding creativity and bravery and certain behaviors, so you also are playing the role of a writing coach, helping them think through “What do you think should happen next? Where should this character go?” Sort of encouraging them to branch out a bit, if they get stuck, giving them that guidance on where they need to go, but always leaving them in the driver’s seat.

Also, as I said, we’re still prototyping and play-testing, but I think that from the kids’ point of view, this is something that helps them explore these worlds that they have in their own minds, and they already want to explore through text… And it’s something where their characters can represent their individual experience. So every kid has things in their lives that they don’t see reflected in media… And this is, of course, particularly true for some people over others. I have a friend who’s an immigrant, who said “My kid is not gonna see any stories about the clashes of cultures that you see in our family, because our family is so unique”, but she wants to create these stories for herself… So I think one of the really exciting things is letting kids show their own experiences through these stories in a way that maybe they’re not seeing in traditional media.

And then also, because this is a dynamic system, these characters can grow with them. I know that, especially young kids, like an eight-year-old - they’re changing all the time and they’re learning so much. They’re having all these new and amazing experiences, and the character that they wanted to play with six months ago can grow up with them and can have experiences along with them, which I think is something pretty – you see it in video games, but it’s not something we typically see in books. So I’m hopeful they’ll be able to create those sorts of experiences.

It’s interesting - it sounds like she has single-player games that she plays currently, and she has experiences in games that can be educational… And then there are multiplayer games with other kids… To some degree, it feels like it’s a bit of a hybrid, where you have this AI-enabled buddy that creates a multiplayer experience in what would otherwise be a single-player engagement. Is that a fair way of representing it?

It is, though I’ll also put a little asterisk on this, since we’re thinking deeply about what this looks like to play with other kids and with your parents as well… So I think yes, it is having – as we’ve conceived it right now, the vision is having that AI as a second player, but also not an equal one to you. I don’t know if you have an Amazon Echo or Google Home…

We do, both.

Yes… So the way that kids are able to use those devices to play music, to ask questions - they don’t think those are people, they don’t think those are peers. They’re very much the one who’s driving where you go with that interaction. I see a very similar interaction here, where they have the ownership of the story, they’re the one creating… But this is a tool that they’re having a dialogue with, that is helping them think through and explore those stories.

Okay. I guess the final question is just any thought toward the future about where – I’m really interested, especially in this application of NLP… We hit it a lot, but just kind of where this may take us going forward, what are some of the future visions that you have… Maybe not strictly things that you’re planning to do, but things that you could see happening in the industry that might affect kids in this positive way, to kind of finish out.

[52:26] Yeah, I love thinking about these questions broadly… I think that machines that can understand language well and can respond to us in language are incredibly powerful, because language is the interface we use to talk to each other… And the ability to take information and represent it in language, whether it’s for children or for anybody, is something that is really powerful. We haven’t yet seen the end of what this set of products and interactions will look like. I don’t think chatbots are the height of innovation around interacting with systems through natural language. So that’s something where I don’t know what exactly it looks like, but I do think that a decade from now it’s quite feasible to think that it’ll be a commodity to interact with a lot of systems… And not just in an “Alexa, play me this song” sort of way, but in a “Can you take a look at this dataset and actually tell me something meaningful out of it, so that I can make a better decision?” sort of way. So that’s something I’m pretty excited about.

I also think we’re seeing this proliferation of ways for people to create with this technology… And again, this is actually starting at the professional level, so with AutoML tools, but we’re seeing much greater democratization of access to the tech… And honestly, as much as I love data scientists and I consider myself to be one, I think the most creative applications come when you take that capability and you give it to people who have some other expertise or some other world they’re living in… And so I’m pretty excited to see what people are able to do with it when it doesn’t take a huge investment or a huge amount of technical skill to actually start to play with this stuff.

Awesome. I think that’s a great place to wrap up; a really good perspective. I know that there’s so much to talk about in these areas, and there’s only so much time to cover… But we’re gonna for sure put the link to Hidden Door and some of these other things we’ve discussed in our show notes… So if you’re curious and you’re listening in on these later maybe, and curious about Hidden Door, check out those links. We really appreciate you joining us, Hilary. It’s been a great conversation.

Thank you both. This has been a lot of fun.


Our transcripts are open source on GitHub. Improvements are welcome. 💚
