David Yakobovitch joins the show to talk about the evolution of data science tools and techniques, the work he’s doing to teach these things at Galvanize, what his HumAIn Podcast is all about, and more.
Welcome to Practical AI. This is Daniel Whitenack, and I’m joined by my co-host, Chris Benson, who is an AI strategist at Lockheed Martin. How are you doing, Chris?
Doing great. How’s it going today, Daniel?
It’s going pretty good. I’m packing up and getting ready to do a very non-AI/data science thing, which is going backpacking for a bit.
Oh, that sounds great.
So I’m gonna be out of touch next week, and hopefully away from any sort of cell phone signal, and that sort of thing. Actually, I’m pretty excited.
Fantastic. What area are you gonna be in?
I’m going up to Minnesota, to the Superior Hiking Trail, which goes along Lake Superior. It should be a good time. It’ll be a new one for me, and I have no doubt that being in isolation a little bit will give my mind some time to think about all of those AI problems that I am trying to solve as well, so… Looking forward to that.
Today we’re very privileged to be joined by David Yakobovitch, who is a Principal Data Scientist at Galvanize, and also a fellow podcast host. He’s the host of the HumAIn Podcast. We’re really happy to have you here, David.
Thanks so much, guys. It’s a pleasure.
First off, why don’t you kick things off by giving us a little bit of your background and how you got interested in AI/data science things, and ended up where you’re at now?
Sure. Ever since I was young, I loved math competitions. I competed at both the state and national level in the U.S. I went to college actually for applied mathematics, and physics, and doing theoretical proofs, and quickly realized the industry was changing from research to applied, so I started moving in the direction of code and applied research, which led me down the path of actuarial science back in 2010.
Gotcha. This is maybe just at the start of a lot of the data science hype maybe…?
That’s right. Big data wasn’t even a word until 2012. The AI revival was only kicking off around then, so… I think in 2012 I first learned C#. I was also playing with Fortran and COBOL, because that’s what the company I worked for had. So I was picking up some of those old languages, but… COBOL is never going away, I don’t think.
Fortran neither, as far as I can tell.
Yeah, especially with financial services… I’ve seen that reemerge over the years, and… Although that’s a tangent, but I think that’s interesting, because as everyone’s moving to cloud, it’s still “How do we maintain these systems with these languages?” But I love it, because – I’ll go back into the background, but when I teach a lot today, I tell the students “Hey, if you wanna work in Jupyter, if you wanna work in an IDE, guess what - it supports Fortran and COBOL. So you can always pick up those old languages.”
Yeah, it would definitely be a fun exercise… And I’ve kind of done this a little bit, with not those languages, but kind of trying to implement things side by side, in different notebooks, and see how they look, and experimenting that way; it’s a fun thing to do.
Didn’t you do bindings for Go, Daniel? …if I recall.
Yeah. I worked originally on one of the first Go kernels for Jupyter, which is now maintained by other people who are doing great things with it… But yeah, there’s a lot of fun times to be had with Jupyter and languages other than Python, I would say.
That’s super-cool. And it’s amazing how the bindings have evolved the technology. When I was getting involved in actuarial science, not much of that existed; even APIs were just emerging in this certain aspect.
Back in 2010 - it was around the watercooler, literally at the office, in person, before remote work was even happening. Teams were saying “Hey, we’re thinking about getting on the cloud. Hey, we’re thinking about getting these servers”, and people were talking about Python, this language. And yeah, Python has been around since the 1990s, but it was just getting into the financial services back then. So I said “I’m gonna pick it up”, and started learning it… And before you know it, the last eight years I have been involved with different financial services companies, implementing data solutions with Python, and helping them build everything from analytics and dashboards, to predictive models, and setting up data strategy, as well as building out centers of excellence.
That led me to not only learn how to teach and how to code, but then how to help others take over processes. I think having worked with a lot of companies over the years, one of the biggest flaws we always see is not enough things are documented, and it’s really challenging for those not coming from tech to pick up tech skills. So I’ve always been that go-to person around the watercooler to teach you how to use Excel and SQL and Python, and it just became a natural fit in the past few years, where I got into learning and development pedagogy and training.
That is a perfect segue into the first question I have for you - tell us about Galvanize. What do you do, and how did that come into your life?
Sure. Galvanize was founded in 2012. We’re one of the bootcamp providers for software engineering and data science in the United States. We have three segments of the business - a consumer, a remote, and an enterprise. I’m on the enterprise corporate arm, and that plays a lot to my previous skillset of helping other individuals learn tech in corporate.
Prior to being at Galvanize, I was at General Assembly, doing the same thing on the enterprise side - working with financial clients, scaling hundreds of individuals in organizations to reskill and upskill in the Python programming language, in Jupyter, in working with return on investment projects for their divisions. At Galvanize we have all those divisions as well. We are both consumer and enterprise-facing, and we’re all over the U.S.
I think what’s most exciting is there’s been so much growth happening in 2019, and we’re seeing that even into the next three years, predominantly because everyone is wanting to reskill and upskill, and code is now the first thing that people are picking up.
As you got into that training side of things – it sounds like you got into data science training pretty early in terms of when these programs were coming out… What really motivated you to see that need for better data - data science training, or was it kind of a personal thing on your side, where you really developed some passion for teaching, or found out you were good at it? What led you down that path?
So for me it’s very mission-driven, even since middle school and the math competitions, because we would have math competitions where you not only competed individually, but you had team assessments. That’s where you’d have to solve four questions between 30 and 60 seconds, and come up with a group answer. It’s incredible how fast-paced it was, both state-wide, nationally and internationally. So if you had the weakest link on your team, you had to get them up to speed, so that you can successfully perform.
I’ve always been interested in helping everyone rise to the occasion. But beyond that, I’ve noticed how technology has transformed so quickly… My father actually was an entrepreneur, and owned a business that worked at the schematic level to repair TVs, VCRs, DVDs, and all sorts of electronic gadgets, in South Florida, all throughout the ’80s and ’90s. At one point, this company grew to over 20 people, they had three locations, they were doing millions of dollars of business a year, and then before you know it, the whole industry changed. All of these new smart TVs appeared, products disappeared, and it was so challenging to keep up with the times in technology. And before you know it, the whole servicing industry and warranty industry started to evaporate.
Fortunately for our family, my dad was already in his sixties when that started, so he went into an early retirement… But then I started thinking, “Hey, how could someone like my dad learn to code?” And he really wanted to, and he had that capacity because he had that technical mind, working with fixing electronics, with capacitors and all these gadgets. It was interesting, because I in essence mentored my dad as he was picking up Python through some of these platforms, and coaching him.
At the end of the day, what I realized is he didn’t wanna learn Python for data analysis. He knew at 63 years old that he wasn’t gonna become a data analyst at a Fortune 500 company… But he knew if he could take the work that he did in RPA and robotics, and apply Python there, it would make a lot more sense. So what did my dad naturally do? He bought Arduino boards and Raspberry Pi boards and connected sensors to refract light off the walls in the living room, and soundwaves when the dogs moved, and sensed actually position locations for where movement was occurring… Because he was always sensing movement with audio and for visual on those TVs and those soundboards.
So it was so interesting for me as a takeaway to realize - to make code and to make languages stick, you have to make it relatable for the learners, and you have to provide capstones that they can take back for their portfolio, and they’re having a fun time learning, as well.
I’ve gotta say, your dad sounds super-cool there, getting into that… I certainly think there’s a lesson to be learned there about always looking for the new thing, no matter what age you’re at, and staying engaged and diving in.
To broaden it a little bit, as we look at data science training, both in industry and in Academia - it’s evolved so quickly over the last few years. Where is it lacking? What is industry doing well and not so well? Where can they improve? And the same for Academia. And are we really doing a good job preparing data scientists for getting out there in the world at this point?
It’s super-interesting, because at our organization I do work a lot with the New York City government on their different programs with the small business administration and training programs. So I sit down with politicians and local leaders and talk about how we are serving constituents who are making $18,000 a year and getting them up to $85,000 a year. And the truth is most programs are really rushing into industry without full preparation, so we haven’t seen the best results all throughout.
Many programs say “Hey, our average graduate makes $78,000, and they get a job within six months of graduation”, but that’s not always true. For us at Galvanize, we are on both Course Report and SwitchUp, and we have everything that’s peer-reviewed and checked through the industry to make sure that we’re giving you the real facts on how our students do and perform. But even then, for us, we’re constantly having to innovate on the curriculum.
Now all the universities are launching data science programs, and a lot of them are getting into AI programs as well. Whether you’re looking at the first ones, like Berkeley and Columbia, or other ones popping up all around the country… I wouldn’t say any of them have won the game per se, because the technology is changing so fast. I think when someone’s thinking about going into learning through a data science training program, whether it’s a university or a bootcamp, it depends on the goal you’re looking to achieve. If you’re going directly into an undergrad or a masters program, it makes sense to tack it on, so you have that extra skillset that’s gonna help future-proof yourself in whatever role that you move into in your career.
But if you’re going straight into a bootcamp without any other prior experience, it’s often a struggle… Because those bootcamps, if you’re doing the full-time - which is 60 to 100 hours a week - for three months and then you’re expecting to get a job afterwards, there’s a reality I have to share with most students… I tell them that you need to have a basis there. The students who have the greatest success going to bootcamps are those who are already software engineers or have a PhD. And that’s a very limited pool.
If you’re coming from a liberal arts background, you can be successful in a bootcamp. However, you’re gonna have to put in a lot of time and work to see those results. And the classic example I share with students is if you’re someone who already is a software engineer and you only study two hours a week and look to get that job, while someone else comes from the liberal arts but spends ten hours a week, that second person is gonna ramp up a lot quicker than the software engineer, just not in the beginning.
So it is all about time output and thinking smarter. Is there a program that’s better or worse? There’s so many out there, and I like to say that we have some of the best programs in the industry, but they’re constantly evolving. I think when you choose a program you wanna be involved in, you wanna make sure that that institution or that bootcamp has full-time curriculum people who are constantly innovating and improving… And to be willing to ask them “What’s the tech stack? What are we gonna learn? Are we just learning Python? What packages? What databases? What projects?” Feel free to ask those big, tough questions, because that’s gonna serve you best down the road.
I hear a little bit of what you’re saying in terms of the helping people understand where they really wanna get to, where they’re coming from… Do you feel like as an industry we’ve crystallized it all in terms of what a data scientist is? It seems like for so long - and maybe still in many ways - defining what data science is is just so varied that it almost loses meaning, in some sense, because it could be like you’re doing TensorFlow and deep learning, all the way to analytics things, to big data, distributed processing things… Do you think that we’ve kind of crystallized around that term at all?
I’ve noticed recently a lot more effort in terms of specialized job role titles, like machine learning engineer, or even things like data science engineer… Of course, data engineer has been around for a while now. AI engineer… It seems like a lot of people are kind of shifting to the side of like “Oh, we need to add ‘engineer’ in the name, because these data science people coming through don’t really know how to build anything.”
So what is your sense of that, as you kind of survey people coming through these sorts of programs, the types of positions that they’re looking for, the types of things the industry is looking for? What is your perspective on that?
Right. I look at an ML engineer as someone who has software experience in building applications, and a data engineer as someone who could already work with cloud systems or distributed systems… And often the bootcamps and the masters programs just don’t give enough there. That’s why you wanna look at capstones for that.
But I think the challenge is there’s so much information to cover and pack into it that data science has just become this term that encompasses the industry. And how I look at it is, simply put, what used to be big data became predictive analytics, became data science, which is now the AI industry. It’s constantly evolving… But the truth is when you look at data science roles, 60% to 80% of the work is still in the data. It’s cleaning data, it’s labeling data, it’s getting it all set up…
I featured actually just at the beginning of August on the HumAIn Podcast Mark Sears, who runs CloudFactory, which is one of the big data labeling companies between and Africa. They have 10,000 just labeling data. I think the reality that a lot of data scientists don’t know until they join the company is you’re not playing with algorithms all day. Maybe you might, but even an ML engineer - you’re gonna be working with APIs, and setting up pipelines and systems before you get to start training and testing and working with other teams to see those results.
So I definitely see a specialization occurring in the field. In fact, I’m calling now a new subfield emerging in data science, which we’re starting to see in some trend reports, called Data Science as a Service. Similar to how we saw Infrastructure as a Service, with things from like HashiCorp, Ansible and Terraform, and a lot of deployment options for the cloud, we’re gonna start seeing that (and we already are) in Data Science as a Service. We’re seeing companies like Neptune and Spell, and [unintelligible 00:19:00.20] and other ones, which have all just recently raised their series A’s, that they’re helping deploy systems.
You even see the founders of Anaconda - a couple of them branched off and launched Saturn Cloud, which is launch these systems in Docker containers, and now do your data science. Paperspace in Brooklyn got notorious for that, and has been doing a phenomenal job partnering with companies like FastAI, and Insight Data Science Fellowship as well.
As a follow-up to that, it kind of feels like our industry is starting to grow up. I’m older than the two of you guys, and I was around when the internet first exploded into the mainstream, in the early ’90s. You went from very few job descriptions, that were then kind of fragmented as it exploded out, and you had many dozens of job descriptions very soon. It feels to me very much like the data science world and the AI world are kind of starting to do that now, where instead of every role being tied to a data scientist, you’re seeing lots of specialization, and therefore even data scientist is becoming a little bit specialized in terms of what activities it does in that ecosystem. Do you have any thoughts on that?
That makes complete sense. A lot of companies now in consulting are actually hiring staff data scientists. These are data scientists who are supporting many teams. But then some teams hire a data scientist to help with just that division. So I think you’re gonna see that, where there are those who are both cross-functional, and those focused only on a product.
For those who are looking for data science and AI jobs now, next time you’re checking out a company and you’re looking at their job boards, whether that’s on a website, or Indeed or LinkedIn, I encourage you to see in the job description what are they talking about? Is it the product, is it the team, is it the whole company?
In fact, when you’re looking at that for a role, it’s important to look at the requirements and the recommendations. Often people choose to apply or not apply for jobs based on if they check every single box on there. But having interviewed and worked on hiring teams to bring in a lot of data scientists at Galvanize and at other companies, it’s not every box that has to be checked.
Yeah, I totally agree.
The requirements - generally, most of those boxes should be checked, so you do wanna make sure you’re covering 75% or more, but think of the requirements as must-haves. You generally should have that, but not necessarily everything. Sometimes that tech stack is specific to a company. If you’re looking at ten data scientist positions, between startups and Fortune 500s, every requirement could be different. Someone uses Python, someone uses R, someone uses TensorFlow, someone uses MXNet. My goodness - should you learn them all, just until you get a job? I don’t think so. Go with what you’ve got, and then start applying that.
In the interview, companies are very flexible with that. If you’re like “Oh, I’m amazing at R, but I’m just picking up Python” - sure, I’m gonna do the code interview in R. Let’s see what you can do. If you do it great, you know what - phenomenal! There’s so many integrations happening there.
I also think – this is such a tangent, but I think R is playing catch-up. R fell in the rankings to only the ninth most desirable language this past year. It used to be top five for a few years. Now they’re playing catch-up. There’s some really cool new packages coming out, so I wouldn’t be surprised if R climbs back up the leaderboard the next couple of years.
David, we already mentioned that you have your own AI podcast called The HumAIn Podcast. It’s great to have another podcaster on the show with us; it’s the first time we’ve done this, and we’re really excited to help bridge the gap between some of these different people creating content… It’s really great to have that opportunity.
I was wondering if you could just share the premise behind the HumAIn Podcast and why you decided to start creating it.
Sure. Thanks, guys. I think technology is moving at such a blistering pace in what we’re now coining the fourth industrial revolution… And as you mentioned, the gap is continuing to grow, especially between humans and machines. All these new products are coming out, all these new companies are coming out, which are supposed to improve our lives… But a lot of our jobs are at risk.
I created the HumAIn Podcast to bridge the gap between humans and machines in the fourth industrial revolution. I feature in an interview format conversations with chief data scientists, AI advisors and leaders who advance AI for all, to help everyone learn more about humans and their processes in AI, which is called human-centered AI; empathetic design, which is how we can build better processes for humans, and other topics like AI for social good, AI governance and AI research.
I think for me coming up with the HumAIn Podcast was so natural with all the training and deliveries I’ve been doing, as well as seeing my dad and his own journey from going in robotics to code, and then saying “Alright, David, you’re the one who’s gonna be the next generation.”
So I wanted to make sure these conversations could be heard, and for a broader audience… Because so many people, whether they’re working today, or they’ll be working tomorrow, are concerned about these trends and how their jobs will be impacted. So that’s a little bit about HumAIn.
Another thing about HumAIn, just a fun fact not many people know - HumAIn in its spelling in French actually means to be human. So it’s a little play on words, I throw the AI in there… But it’s a great podcast I’ve had a lot of fun with. It’s been going on for about ten months. Thanks so much for letting me talk a little bit about that.
Sure. I love the focus on how you’re addressing some of the hard questions. I know that Daniel and I are always out there doing talks, and meeting people in different events, and stuff, and those same questions come up all the time in conversations… So I think it’s really great to just address them head on and sort through the problems.
I’m really curious, could you share with us over the last ten months maybe some of the highlights of the various episodes or interviews that you’ve done? I’m really curious, what are some of the peaks of content that you’ve had over that time?
Yeah, it’s super-interesting because I always take a different theme to the podcast. I reach out to people who I’d love to invite to my dinner table, and talk about the industry. One who I had on was about synthetic data, I had Jeremy Kaufmann, from Scale Venture Partners [unintelligible 00:26:28.29] how startups like KeepTruckin and Convoy have scaled into billion-dollar ventures by using synthetic data.
I had the opportunity to talk with Kristen Kehrer, who is a female founder who works in data science training as well, and is based in Boston. We talked about how the industry has changed from research to applied.
I’ve also spoken with one of my good friends, Noelle LaCharite, who used to be an early employee at the Alexa team and VOICE, and now just got named the number one VOICE advocate of 2019. We’ve recently sat on a panel at the VOICE conference in Newark, talking about how Microsoft and Amazon are working together to create a universal audiobot.
That’s just three examples of conversations that we’ve had on the podcast, but I love to talk about different themes, talk about trends, and speak about how it’s relatable to each and every one… Whether someone is a data scientist today, they’re a business executive, or they’re someone who just wants to get into the industry. It’s a little bit of entertainment and education for all.
Thanks for sharing those. I actually can’t wait to listen to some of those that you mentioned. I wanna try to play devil’s advocate a little bit here and kind of give you a chance to say – like, if I am out there and I’m thinking “Oh, the gap between humans and machines is widening, basically why should I care? If Gmail is able to complete my sentences and it’s convenient for me, why should I care that I don’t really understand that, or I don’t understand my data that’s being used for that, or how it’s being used, or… It’s just a convenience for me.” Why is the gap between humans and machines such a concern? Why should we care?
I like to place it as something that’s personal for each and every person. One of our presidential candidates right now, Andrew Yang, from the state of New York, is talking about being the “humanity-first” presidential candidate. And the reason he’s taking that stance is because there are things that we see every day in our life that are being automated. For example, the self-checkout lines at the grocery store used to be run by staff, and people who had jobs, whether they were improving their life or giving us the paycheck.
We’ve now seen several presidential candidates just in the past few weeks actually talk about “Do they support self-checkout lines or not?” It seems like this should not be something controversial, but the truth is there’s at least a couple million people in the United States who work in grocery stores. And whether that’s assistant managers, or helping with transactions, or better customer experience, the truth is jobs like that can go away.
We look at places like gas stations. If we’re in the Northeast, in New Jersey, that’s one of the few states that still has human-serviced gas stations. But everywhere else it’s self-service. That’s been around for many years, there’s no AI there, but that shows how the jobs were eliminated. I think we’re gonna continue to see that happening. We’re looking today at customer service experiences, even when you call a company to check in on your current flight or to pay a bill - it’s being completely automated away. The services got even better in the last three years. And we don’t see it often, because we don’t see these people day-to-day; they’re not eating at our dinner table, we’re not talking to them about their life, but the truth is those jobs are going away… Which means it’s not just impacting them, but it’s also impacting you, and it soon could impact your career as well.
I don’t like to play pessimist - I know we’re playing devil’s advocate here - but there is the opportunity where AI’s intention… Like, what is the big goal in AI, why should we even use AI and all this automation? It’s for efficiency. Efficient markets. The big cities are notorious for being efficient systems. And if you can make something more efficient, you will make it more efficient, because you can drive down costs or increase revenue. And the challenge is every industry is gonna be impacted there.
Earlier this year in April Deloitte actually had their human capital management report and said “We’re moving into a world of superjobs.” Superjobs are jobs 3.0, which means the analyst at Goldman Sachs who used to work on their Excel reports, and created PowerPoints, and presented to the investment banking manager and determined that this is the next investment that’s going to help them make a lot of money - it no longer is that same job. That was what they did in 2003. But today in 2019 they have NLP systems that help auto-generate reports, automatically create dashboards, and then the analysts are offering some oversight, as well as some customization with the higher cognitive tasks… And then they’ll also maybe help with that presentation. But that itself has eliminated the need for a lot of investment banking analysts. Where they used to have 200 traders or analysts on the floor at Goldman Sachs, today in 2019 you can do that with 5-10 people.
So there has been a constant evolution, and it’s gonna keep happening. I tell people, it’s not that you need to care about it, it’s just that if you don’t care about it - well, then you’re putting yourself at risk. And ultimately, when you are optimizing for anything in life, you should be minimizing your risk. Maybe not for investments, so that could be a little play on words there, but in general - you should be creating a better process if you come from that mental model for your moral hazard.
Yeah, I just saw an article - I’m looking at the date, it was August 7th - where J.P. Morgan now apparently is experimenting with an AI copywriter, that apparently (at least in some cases) can write better ads than humans can. I think this is a similar trend to what you’re talking about. Do you think that the main piece of this that really comes into play is automation, that’s the main player there? Or is it also kind of a logical gap, where people are less sure about what their technology is doing or how it’s operating for them to produce convenience, and that sort of thing?
Well, I think the three big industries that we’re gonna see rapid onset of automation in the next ten years are data and AI, connected devices and internet of things, and robotics. All three of those industries are rapidly advancing in automation… Particularly there what’s happening is the products we’re using today are no longer being developed by humans.
The example that you’ve just mentioned, with J.P. Morgan with their copywriting AI - sure, a little bit of it is public relations, and talking about what’s out here, so they get first to market, but in fact they’re not first to market. Bloomberg and other financial companies have been creating articles with automated systems for years. If you’re someone who invests in the stock market, next time you go onto any website, like SeekingAlpha, or Bloomberg, or Reuters, or any of these, and you look at the general news of the day, and the article says “This stock has gone up 10%, and the EPS is XYZ, and the dollars are Y” - that could all be written by a machine, and in fact it probably is.
That’s why earlier in 2019 there were over 100,000 media and copywriting jobs eliminated in New York City from companies like BuzzFeed, Vice and others… Because all that is starting to be automated, and teams are realizing “Well, do we need 100 copywriters, if instead we can have so many generated stories from a system, and then we have a copywriter supervisor who checks through them to see which is more plausible and does some refinements there?”
The challenging thought is this - in a capitalist society like the United States, everything runs by money; and if money is not being made, automation is the first thought to come to mind. When we look at media companies like The New York Times and Washington Post, who now run their businesses with digital subscriptions, they have more in 2019 than they had print subscriptions in 2001. But when you look at other publications, like The Los Angeles Times in California, The Sun-Sentinel in South Florida, The Houston Chronicle - all of these ones are struggling. They don’t have as many subscriptions, so they don’t have as much revenue.
And the truth is revenue is driven by how much business you can bring in. If the business is declining, the first thought that comes to mind is how automation can help solve it. I really think that’s why companies like J.P. Morgan are looking at AI copywriters, and Bloomberg and Reuters and Vice and BuzzFeed have already started getting in on those trends.
I don’t think all humans will ever be replaced; I think there’s something to be said for the sentiment of how we each think uniquely, with our mental models and our perspective, that a lot of people like to read and learn about. I think that’s one of the reasons why Substack recently raised – I think it was a 100 million dollar round for human-based newsletters.
I’d like to tie a few of the threads that we’ve discussed together… We’ve been talking about the relationship between human and machine, and you pointed out that you have these three points that are coming together between AI and data on one, connected devices on another, and robotics on the third. And you’ve talked about some use cases here. It feels like we’re almost dancing around a particular term that we’re all talking about these days anyway, which is human-centered AI, where it is augmenting humans, it’s allowing for the collaboration between humans and machine, and in many cases where the human is sort of orchestrating a symphony of AI collaborators that might be working together to get something done for a company.
I know you like to talk a lot about human-centered AI… Could you tell us a little bit more about what it is and why it’s growing so fast, and what you think the implications are going forward?
Sure. So human-centered AI as a term just became big in the market in the last year. It became big because Stanford said “We’re gonna launch the HAI Institute”, their own human-centered AI institute with Fei-Fei Li, who’s been a professor at Stanford for many years, and ran Google Labs for a few years as well… And the intention is really thinking about the future, because Stanford, and even other major institutions like MIT with CSAIL are thinking about “What is gonna happen when technology is everywhere?” The elimination is already happening with jobs, and it’s not just in the U.S.
I recently had a colleague traveling in Shenzhen, China, and they stayed at the J.W. Marriott, a premium, ultra-luxury hotel, part of the Marriott Bonvoy brand. And when at the hotel, that individual wanted to get room service. What did the J.W. Marriott do there? Now they have robotic butlers that drop off the latest Diet Coke that you like, or your meal. They no longer have humans going from room to room. Those robotic butlers have computer vision, they can press the elevator button, go in, ring your doorbell, drop the food off, and they don’t have to wait… So it provides greater access for on-demand service, 24/7.
I mention that because that’s why I think Stanford and MIT and other institutions have moved in on this human-centered AI movement, where – look, we’re moving to AI; we all get it, that is the future. And sure, there’s some hype; it’s gonna be slower and faster in certain segments than we think or expect. But if we don’t start placing diverse opinions into these processes early on, thinking about bias and how we can make sure the systems work for all people, then we’re gonna slip behind.
So by thinking about that, you can say “When I design the process, does it work for someone who’s 75 years old AND someone who’s 7 years old? Am I designing a process that can move in different terrains? Is the product going to be one that works across multiple languages?” Anything that is non-accessible needs to be accessible with AI. The reason is because otherwise you’re excluding different cultures.
Today we serve all cultures, primarily by hiring people who speak different languages. If you travel to Disney World or Disneyland, you get that. They have hundreds of fantastic park service individuals who speak different languages and support you as tour guides throughout your journey. But in the future, that could just be a soundpiece with different languages.
The challenge is we have to make sure we’re being accessible for all, and starting to design technology that’s enabling humans at the onset. One example that’s been a failure, that has been quite prominent in the news, is how Apple with Siri - their audio-enabled assistant - never had an Icelandic language for Iceland. So when you were a kid - now as alpha generation, or generation Z, growing up, you were using Siri, the human language of English, to communicate with Siri. So you’re speaking in English, but not Icelandic, because Icelandic didn’t exist for Siri. And what that meant is they’ve shown now that the Icelandic language is becoming extinct in Iceland… Because kids do not wanna learn it, and therefore parents are not gonna teach it, and before you know it, we’re having the diaspora of culture appearing again as a result of technology.
I like to think back to one of my favorite authors, Jared Diamond. He’s written “Guns, Germs, and Steel”, “Collapse”…
Yeah, I’ve read it.
…now “Upheaval”… All his really interesting books on how culture and society change. I think we’re now entering this new wave, and whether we call it the third industrial revolution, the fourth industrial revolution… Whatever name you wanna give it, I think the next 30-40 years is going to be a generation defined by connecting systems, with internet everywhere, data everywhere, listening everywhere. Once that’s complete, if we’re not thinking about the human at the onset, a lot of jobs are no longer gonna be here. That might mean we need something like conditional basic income, or we need to have stronger governance to protect our societies. Of course, that depends on your mental model, but we can just look at facial recognition and see how cities all across the U.S. are banning facial recognition, both in schools and in cities, because of the concern of jobs going away, and the concerns of privacy.
David, I really appreciate how you brought up this idea of how technology makes an impact on culture. I think that’s actually really important. I think of things like – you know, everybody thinks of Google Translate as being so great, which it is, in many ways, and is an amazing accomplishment, but it supports (I forget how many languages now) around 100 languages… And some of them better than others. But there’s over 7,000 languages being spoken in the world right now, so it’s kind of a drop in the bucket. Over 6,000 of those languages are spoken by 25% of the world’s population, which means that those are basically marginalized communities.
So when technology has been made available, like you were talking about, in certain languages, all it tends to do is kind of further marginalize communities… And what I’m wondering is what do you think is a way that we could incentivize creating technology for these marginalized - or using financial terms - emerging markets, or whatever you wanna call them… Do you think that there’s a role there to be played by regulation? Is that needed? Or is there a way that we as AI practitioners can adjust our practices and our workflows to better orient ourselves in terms of the technology that we’re building.
I think it could be a combination of both. I hope it’s more that we as developers reorient ourselves and our technology. Think about if you are a developer building a product - you don’t want your product to break; so whatever the user input is, you’re designing it so that if that’s a number, the number gets fed into the system. But if it’s text, the text gets converted to a number, so it still gets fed into the system.
I think just like data scientists and engineers today who are developing their APIs and their systems not to break based on inputs, the same thing should be that “Are we thinking about humans first in these systems?”
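The input-handling idea described here (a number passes straight through, text gets converted to a number) could be sketched roughly as follows. This is only an illustration: the function name `to_number` and the choice to strip commas and whitespace are assumptions made for the example, not something described in the conversation.

```python
from typing import Union


def to_number(raw: Union[int, float, str]) -> float:
    """Coerce user input to a float so unexpected input does not break the system.

    Hypothetical helper for illustration; the name and rules are assumptions.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool):
        raise ValueError("expected a number or numeric text, got a boolean")
    # Numbers are fed through unchanged (as floats).
    if isinstance(raw, (int, float)):
        return float(raw)
    # Text is cleaned up and converted to a number.
    if isinstance(raw, str):
        cleaned = raw.strip().replace(",", "")
        try:
            return float(cleaned)
        except ValueError:
            raise ValueError(f"could not read {raw!r} as a number") from None
    raise ValueError(f"unsupported input type: {type(raw).__name__}")
```

With this sketch, `to_number(42)` and `to_number(" 1,250.5 ")` both return floats, while text that is not numeric raises a clear `ValueError` instead of failing somewhere deeper in the system.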
We’re starting to see AI guidelines. I know the European Union recently launched their own AI ethics standards; that came out a few months ago. There’s similar initiatives going on in the U.S. as well on ethics, and about integrating these systems for all. But it’s really about having the conversation and then making sure you take action.
Just like you mentioned about the languages, I had the opportunity to sit down on the panel, an advisory session with one of the leading candidates for the New York City mayoral election in 2021, and that exact topic you brought up, we brought up as well.
You mentioned 6,000-7,000+ languages, and in fact in New York City over 20% of the people do not speak English. 10% of that is about 800,000 of the eight million people in New York. So if you take some of those languages, just the top ten other languages that are not English or Spanish, it is about 25% of the population. That’s enough to have enough votes to win an election. Not that the goal should be just to win an election, but it should be serving all constituents. And when you’re in an accessible city like New York City, it should not just be translating services for eight languages, but how about at least 100 of them or more?
I think when we start thinking about processes, we need to do a lot more competitive intelligence, a lot of research on who our constituents are, and then to best serve them. And if that means rolling out a feature in tranches, that’s totally fine. But as long as you have the goals there and you’re thinking about accessibility on the onset, it’s paving you in the right direction.
One of the things that I’ve heard talked about a lot lately - and kind of almost as an extension maybe of human-centered AI - is I’ve heard about empathetic technology and empathetic AI, and where we need to go with that. How do you connect those two, from a relationship standpoint? Where does empathetic meet human-centered, and are there other components there that I’m missing?
I think bridging the gap on human-centered AI and empathetic design there’s a great story that came out in California… I think it was with Kaiser, just a few months ago. There was this grandparent, 70 years old, on his deathbed at the hospital, and they send a robot in to say “There’s no other treatment we can do for you. You’re going to die. Go home.” Something like that. I’m paraphrasing, of course. But that robot - not empathetic design.
Now, what could be empathetic design? A robot that’s serving as a nurse to actually clean a wound when you can’t always have a nurse on call, because they’re working 16-hour shifts. Or a robot that could have infrared computer vision to detect where your vein is, so to better help inject your medication, or the insulin, or whichever treatments you need. That could be very empathetic. There are ways to do that, and it starts with design thinking.
In fact, on the HumAIn Podcast I talked with Chris Butler from IPsoft just about that, design thinking, and all these questions that are critical, because you need to think about what is the customer experience or the consumer experience. That’s this whole new field that’s being coined in 2019, CX - so now it’s the CX industry - about making sure it is about humans, it’s about customers, and it’s about empathy. I think we as a society have the moral obligation to provide the best possible customer service. By doing that, you build loyalty and you keep customers and create more revenue. But if you don’t do that, you create the risk of not only alienating your customers, but also losing those who are most valuable.
One classic case of this is a lot of the cell phone companies now have been using data to understand what is your threshold of getting angry to switch from a service provider, and they’ve been using that data to see how many times they could push back on you before giving you a discount, or giving you a change in your service.
I knew it!
Right? It’s real! Come on, we all like to think it’s not happening, but… I call it Hacking AI, which people have done for a while… When you’re talking to one of these companies - Verizon, Sprint, T-Mobile, AT&T on the phone, you better believe that in real-time they’re taking your audio, it’s converting to text, they’re getting the sentiment, and based on that, they might be making some different decisions, if you sound more angry, or you sound more calm.
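The pipeline described here (audio converted to text, text scored for sentiment, a decision made from the score) might look something like the toy sketch below. It assumes the speech-to-text step has already happened and starts from a transcript string. The word lists, the function names `sentiment_score` and `route_call`, and the threshold are all hypothetical; a real system would presumably use a trained model rather than word counting.

```python
import re

# Hypothetical word lists for illustration only.
NEGATIVE = {"angry", "cancel", "terrible", "frustrated", "ridiculous", "unacceptable"}
POSITIVE = {"thanks", "great", "happy", "appreciate", "helpful"}


def sentiment_score(transcript: str) -> int:
    """Return positive-word count minus negative-word count for a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def route_call(transcript: str, angry_threshold: int = -2) -> str:
    """Pick a hypothetical next step for the call based on the sentiment score."""
    if sentiment_score(transcript) <= angry_threshold:
        return "escalate"
    return "continue"
```

For example, `route_call("This is ridiculous, I am angry and I want to cancel")` returns `"escalate"` because three negative words push the score to the threshold and below, while a calmer transcript returns `"continue"`.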
Here on the Practical AI Podcast of course we like to keep things practical, and as you’re talking through all these things, of course, it’s probably easier for me to see practically some of those hacks, like you’re talking about now, like hacking to optimize or to make things more efficient, or to automate, or whatever those things are, just because of my technical mind… But as you’ve of course advised a lot of different companies, you’ve worked with a lot of different students, and teams as they’re gearing up… What are some practical ways that me as an AI practitioner – are there some practical steps I could take to start modifying my workflow such that I am becoming maybe a little bit more empathetic or human-centered in the way that I design the systems that I’m building? Have you found anything that consistently helps or is practical in that sense?
If you’re building systems today and you’re thinking “I wanna integrate AI. I wanna bring in automation, but I wanna make sure I best serve my customers”, it’s important to first decide “Are you gonna go with a system you’re building from the ground up, with code, and engineers, where you can control each and every step in the process? Or are you gonna take a pre-baked solution that one of these data science as a service companies, or one of the cloud providers like Google, Amazon, Microsoft or IBM or others have available for you?”
The reason I say that is because if you take a pre-baked solution, that bias and that inherent (potentially) inaccuracy that exists in these systems will be present in your products. It could be easier to implement it, but you may not be able to customize it as much as you want. So you definitely wanna consider first whether your organization or your team has the technical chops and availability and bandwidth to do some coding or use a pre-baked solution.
I think secondly then it’s always thinking about your customer. I think design guidelines are only starting to emerge right now. There’s more human guidelines; there’s been ones recently on cybersecurity that have come out in the last few weeks, but right now I think it’s still a very nascent industry on determining the exact guidelines. I think just starting to ask questions such as, you know, thinking of like Business Model Canvas, or Lean Model Canvas - who are my customers, how am I serving them? I think that’s a great starting point. Because in fact, most of the time these questions that we ask as engineers - where do all those answers lie? In our minds. We never write them down. Once you start writing it down, it looks completely different. And that’s why I think it’s important that if you’re building a product or changing a product that’s putting humans first, think about partnering with someone who is a strategist in business, or someone who might come from a liberal arts background, because they’re gonna add that unique vantage that could help you think about each and every person.
I love how you came full circle right there, and kind of got back… We talked a little bit about those different backgrounds early in the call. So if somebody – they’ve gotten a little taste of what human-centered AI and design thinking and empathetic technologies are in this conversation, but if they wanna dig into that, do you have any resources that you think are particularly good, or you can point them to, that listeners can go and see more about it, understand more and learn more in this effort?
Yeah, one of my favorite trend reports that’s been talking a lot about these sub-industries is from one of my mentors in New York, who I went through her teaching/training fellowship, Amy Webb. Amy Webb has this new book, The Big Nine. She talks a lot about technology, and trends, and teaches at NYU… But she has a tech trend report that is almost 400 pages this year, so it’s definitely a long read…
…but they’re all broken out by all these sub-trends. Some are AI, some is human-based, some is cyber… And you can start to see what companies are working there, and what those products look like.
I think also Matt Turck, from FirstMark Capital in New York - I really like him. Every year he creates a big dashboard of all the AI/data science companies and what they’re doing in the industry. That’s really where the data science as a service has been emerging. He’s also looking at companies that are starting to think about ethics. I think some of those were shown this year, but I think starting next year we’re gonna see a lot of those companies emerging into the trends.
Awesome. Well, thank you so much for sharing those things. I know I’m gonna take a look right afterwards. It’s really been great to have you on the podcast, David. It’s been awesome to hear about your perspective on human-centered AI, to hear a little bit about Galvanize, and data science learning, and all sorts of things. I really appreciate you being on this show, and I really recommend to our listeners to go check out The HumAIn Podcast and take a listen, see some of that great content that was mentioned.
Thank you so much for joining us, David.
Thanks so much, Daniel, thanks so much, Chris.