In the second of the “AI in Africa” spotlight episodes, we welcome guests from Radiant Earth to talk about machine learning for earth observation. They give us a glimpse into their amazing data and tooling for working with satellite imagery, and they talk about use cases including crop identification and tropical storm wind speed estimation.
Welcome to another episode of Practical AI. This is Daniel Whitenack. I’m a data scientist with SIL International, and I’m joined as always by my co-host, Chris Benson, who is a tech strategist at Lockheed Martin. How are you doing, Chris?
Very well, Daniel. How are you today?
I am doing wonderful. I’m blessed, and pretty excited, because if you remember, Chris, we had a wonderful conversation not too long ago in a new podcast series that we were launching. This podcast series is sort of a collaboration with the Open For Good Alliance, which is a sort of multi-stakeholder group that is working to create localized training data, which is one of the major obstacles for local AI innovation in Africa and Asia… And we’re kind of having this podcast series with that Open For Good Alliance to spotlight some of the things that are going on with AI in Africa.
Last time we talked with Joyce Nabende from the Makerere Lab, and Joyce is back with us today. Welcome, Joyce.
Thank you, Daniel. Nice to be here. I am very excited for another podcast talking about AI in Africa and featuring another organization that’s doing a lot of work in making sure that we have AI data available for ML communities in Africa.
[03:59] I’m so excited that you’ve joined us again. It was a wonderful conversation before, and now we get to welcome you back as sort of a co-host with us in this podcast, which is wonderful. So yeah, if you could maybe introduce our guests and the topic for the episode, and then we’ll jump into some of the great things that they’re doing.
Yeah, thank you, Daniel. So today we are very excited to host two people from the Radiant Earth Foundation. We have Hamed Alemohammad and Abba Barde, who are going to talk to us about the work that they’re doing around AI data collection challenges, especially looking at machine learning for Earth observation. They’re going to give us the perspective that they have, what they do, how they work with the community, and especially the capacity-building initiatives that they are involved in. And just to say that I have worked with Hamed before, so it is very refreshing to have another conversation with him today on this podcast. So Hamed and Abba, you’re very welcome, and we are very excited to hear from you and to learn about the work that Radiant Earth is doing in the AI space, around the Open For Good initiative. Hamed, you’re very welcome.
Hello, everyone. I’m excited to be here. Thank you, Joyce, Daniel and Chris for having us in this episode.
Yeah, we’re excited to hear more about what you’re doing. Could you give us maybe a little bit of the wider context of why Earth observation, why machine learning as related to the sustainable development goals?
Yeah, so let me start with sustainable development goals, or as we always abbreviate it, SDGs. These are a set of goals that were set by the United Nations back in 2015, with a set of targets for 2030. These goals talk about hunger, poverty, access to clean water and sanitation, climate action, [unintelligible 00:05:53.10] sustainable cities, and basically there are 17 of them that tackle different aspects of a sustainable society. Each country is basically progressing toward those goals, depending on where they start with, and they have to regularly monitor and report where they are with their target.
Where Earth Observation comes in is we have this massive number of satellites orbiting the Earth and capturing measurements on a regular basis. This data can be translated into those targets and variables that the countries need to report, and also helping those countries to better monitor their progress and see where they’ve put their efforts.
Naturally, because you have a lot of data, that’s where the AI and ML aspect comes in, because you’re dealing with basically a continuous flow of data from these satellites. It is a real big data problem; it is not just a [unintelligible 00:06:42.20] data problem, because we are dealing with petabytes of data on a daily scale, when you think about global scale… And that’s where AI can help you derive insights, get those target numbers out, and provide support for decision-makers to be more effective, and affect the society down the road.
So maybe Chris knows this, because of his work in aerospace and other fields… But if I’m thinking, okay, I understand the concept that there are many satellites taking pictures of the Earth… Like, if I was a data scientist - like, I’m not running the satellites. I don’t really know – like, I know how to access Google Maps or Google Earth or something like that, but could you explain a little bit who runs these satellites, where are the images, how are they aggregated and accessed, and that sort of thing?
Good question. So there are two big players on the operations side. There are governments that operate many, many large satellites. On the U.S. side we have NASA, on the European side we have the European Space Agency, we have the Japanese Space Agency (JAXA), we have the Indian space agency… Many countries have their own space agency that operates satellites.
[07:56] Historically, the U.S. and Europe have the bigger ones, in terms of the government sector. Then there’s a growing commercial sector that is operating their own fleet of satellites, and they have been kind of booming in the last 10-15 years particularly… And they provide a different suite of satellites and their datasets.
Almost everybody is moving to the cloud, let’s put it that way, where the data sits now. Commercials are all providing the data through the cloud. Governments have their own kind of data stores and portals that we can access the data from, but they are also now working very closely with cloud providers, all of the major ones… And you as a data user can go to those cloud buckets, start exploring the data; there are APIs that you can start to query and search for imagery.
The government data is usually open, in the sense that you don’t pay for the imagery to access it, but at the end you should have your own capacity and resources and machine to receive the data. Commercials have paid data on a regular basis, but they also have - at least I would say major commercials have that… They have a disaster kind of open data program; when there’s a problem, an immediate situation, they release open data for disaster response. They also have research programs that academics and researchers can get kind of a quota to access some level of free imagery. So yeah, data is now moving to the cloud. That’s really the paradigm shift.
Given that you have all these different providers, and when they put the satellite up, obviously, they have their own objectives on what they care about, what the satellite is designed to collect, and that it’s not necessarily standardized across all of those different programs… How do you go about thinking about getting a usable dataset from your standpoint, from all those different sources, and knowing where to go, and how to put datasets together? It seems like it would be a bit of a logistics challenge.
It is a logistics challenge, and actually, this is one of the areas where Radiant started to act and provide support to the community. Say you as a user are looking for what we call optical imagery. Optical imagery is like an image that you get with your cell phone, but in this case it’s captured by a satellite orbiting a few hundred kilometers above the Earth. This is typically what you see in a Google base map, an Apple base map, or any other kind of mapping ecosystem that you work with.
If you are looking for a specific image over an area, you will go to an API and query for that image. The problem a couple of years ago was that each provider had their own API definition and their own way of recording what we call the metadata of that image - what is the cloud cover in there, what is the spatial bound, what is the time tag. We as a neutral agency in the community said “Hey, this is a problem. Let’s get together and come up with a standard way of cataloging our data.” And this is what we have now as the SpatioTemporal Asset Catalog, or STAC, which is a standard specification, open source and fully community-driven, that defines how you’re gonna build your data catalog and expose it to the end user. So now everybody who contributed to that, which is around 25 organizations, is adopting that standard, because they tested it out, they provided feedback, and it’s now becoming the universal way of searching geospatial datasets. All the major data providers are now adopting it, and we at Radiant use it for our own data store.
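To make the idea of a shared catalog format concrete, here is a minimal sketch of the kind of search described above: filtering catalog records by spatial bound, time tag, and cloud cover. It does not call any real API. The records are hypothetical and only loosely shaped like STAC Items (`bbox`, `properties.datetime`, and the `eo:cloud_cover` field from STAC's EO extension); the scene IDs, coordinates, and the `search` helper are assumptions made up for illustration.

```python
from datetime import datetime, timezone

# Hypothetical records, trimmed to the fields used here. Real STAC Items
# carry more (geometry, assets, links, stac_version, and so on).
ITEMS = [
    {"type": "Feature", "id": "scene-a",
     "bbox": [36.6, -1.5, 37.1, -1.1],  # [west, south, east, north]
     "properties": {"datetime": "2021-03-04T07:45:00Z", "eo:cloud_cover": 12.0}},
    {"type": "Feature", "id": "scene-b",
     "bbox": [36.6, -1.5, 37.1, -1.1],
     "properties": {"datetime": "2021-03-09T07:45:00Z", "eo:cloud_cover": 80.0}},
    {"type": "Feature", "id": "scene-c",
     "bbox": [18.3, -34.4, 18.9, -33.8],
     "properties": {"datetime": "2021-03-05T08:30:00Z", "eo:cloud_cover": 5.0}},
]


def parse_time(value):
    """Parse a UTC timestamp of the form 2021-03-04T07:45:00Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def bbox_intersects(a, b):
    """True when two [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(items, bbox, start, end, max_cloud):
    """Return IDs of items inside the area, time window, and cloud limit."""
    hits = []
    for item in items:
        props = item["properties"]
        when = parse_time(props["datetime"])
        if (bbox_intersects(item["bbox"], bbox)
                and start <= when <= end
                and props["eo:cloud_cover"] <= max_cloud):
            hits.append(item["id"])
    return hits


start = datetime(2021, 3, 1, tzinfo=timezone.utc)
end = datetime(2021, 3, 31, tzinfo=timezone.utc)
# Area of interest around Nairobi, at most 20% cloud cover.
nairobi_hits = search(ITEMS, [36.0, -2.0, 38.0, 0.0], start, end, 20.0)
```

Because every record uses the same field names, the same `search` function works no matter which provider produced the imagery, which is the practical benefit of agreeing on one catalog format.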
Yeah. Hamed, just a quick question… For example, if I wanted to have access to the data and you’re saying you came in as a middle player to try and provide a standardization for access of this data - so where is this data stored? How can someone have access to it? Now we don’t have to contact all the different asset providers, but go through Radiant Earth to have access to that kind of data?
Yeah. So on our end, what we provide is specifically AI and ML-ready data, and not necessarily any satellite imagery. So Radiant basically is working in this sector, but providing those benchmark training datasets that a user needs to build a machine learning model. So we don’t necessarily provide the raw satellite imagery for anywhere on the Earth, but we have a data repository with a STAC API and a catalog that anybody can access - so it’s open access for everyone - where you can come and search for “Oh, I’m looking for a labeled dataset of, for example, land cover classes in Kenya”, and you can query our database and find the corresponding labels, and the source imagery, which is usually from a satellite. Sometimes we also have drone imagery, but the majority of it is from satellites.
[12:17] Then you can use that dataset to feed it into a machine learning training pipeline and build a model for, for example, land cover classification. But yeah, we have those types of datasets. Our repository is called “Radiant MLHub”, and as the name says, it’s a hub basically for these kind of resources. So far we have been pretty much focused on training datasets, but we have recently launched our model repository as well, which I can talk later about.
So I’m curious as a practitioner - maybe we can go to Abba - going from this sort of STAC that’s very specific and specialized to the sort of Earth observation, satellite imagery world, and kind of taking that and then mapping it into the formats that AI and ML people like, what are some of the challenges with that in going from just sort of that raw satellite imagery down into actual training datasets that can be used with models?
Okay, so some of the challenges which are faced while doing that is - so you have vast amounts of data, and making it ML-ready might be a bit tricky, because you need to work with data loaders, and you need to find a way which… You might have time-series data now, time-series observations, and it gets a little bit tricky working with those as well. On MLHub we actually have tutorials as to how you could use these ML-ready data. You just pass them onto your model side of things, and you train from there.
So I’m curious, after thinking about this sort of data, what it can mean for training models, I was wondering, maybe Joyce and the others - could you help us connect this specifically to problems that are being solved in Africa, and the types of datasets with this imagery that are relevant to some of those things?
Yeah, Daniel, I think that’s a very important question… Because if I just look at one typical example - for example, I want to understand a major problem that we have around Africa, which is the problem of deforestation, right? So you know that forest cover is a very difficult thing; and if you’re looking at maybe the authorities in the different countries, they want to understand what are the major drivers of deforestation, how is deforestation occurring in the different countries…
[16:09] I think one of the things that I would refer to is trying to look at this – for example, the Earth observation data, looking at forest cover… Maybe, Hamed, is that a specific use case that you can talk about, and how Radiant Earth can provide us with the kind of dataset that governments can be able to use to understand what’s going on with the forests in the different countries in Africa?
Yeah, that’s a very good example, actually, Joyce. We know in the climate change world, when we talk about mitigation strategies, one of them is being able to reduce the concentration of CO2 that we have emitted into the atmosphere, and that’s what forests are about. Forests are absorbing much of that, and sequestering the CO2 that we have emitted into the atmosphere. So it’s essential for all the governments, at the national level and international level, to make sure we can monitor forested areas, and stop any illegal deforestation.
So satellite imagery is providing that regular observation over a region that is forested, and how AI can help with that, and how the things that we do can help, is this: a government wants to build a monitoring system that would basically run an algorithm every time there’s a new observation available from the satellite, detect the spatial boundary of the forested areas, and provide an alert, or a kind of anomaly detection when you think about it in the ML world, that “Hey, there was a change here with respect to the previous observation.”
This can be at any spatio-temporal scale that you think about, because we have observations regularly available, and they are available globally; that’s the nice thing with the satellite imagery. When you have a satellite orbiting, particularly those in what we call a Sun-synchronous orbit, which is synced with the Sun, you get regular measurements over a region; like every five days, at 10 AM, you get a regular observation. And you get the same thing anywhere on the Earth. So building those models is easier with that kind of constant type of observation.
But at the end, the governments can use that and have kind of a strategy for how they wanna basically stop that illegal deforestation, or have a monitoring system for protected areas; areas that have a specific boundary, nothing should happen there, no construction, no built up kind of things happening - they can have a monitoring system to do that. This is a very impactful and tangible use case for many people who think about climate change and climate impact.
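The monitoring loop described in this answer (compare each new observation with the previous one and raise an alert when forest disappears) can be sketched in a few lines. This is an illustration under stated assumptions, not Radiant Earth's system: the forest masks are assumed to already exist as grids of 1 (forest) and 0 (not forest), for example as the output of some classifier, and the 10% alert threshold and function names are hypothetical.

```python
def forest_loss_fraction(previous_mask, current_mask):
    """Fraction of previously forested pixels that are no longer forested."""
    forested_before = 0
    lost = 0
    for prev_row, cur_row in zip(previous_mask, current_mask):
        for prev, cur in zip(prev_row, cur_row):
            if prev == 1:
                forested_before += 1
                if cur == 0:
                    lost += 1
    return lost / forested_before if forested_before else 0.0


def check_for_alert(previous_mask, current_mask, threshold=0.1):
    """Flag a change when the lost fraction exceeds the (assumed) threshold."""
    loss = forest_loss_fraction(previous_mask, current_mask)
    return {"loss_fraction": loss, "alert": loss > threshold}


# Toy 3x4 masks for two consecutive satellite passes over one protected area.
previous = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
]
current = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],  # two forested pixels cleared since the last pass
    [1, 1, 0, 0],
]

result = check_for_alert(previous, current)  # 2 of 8 forested pixels lost
```

In a real pipeline the hard part is producing reliable masks from imagery (clouds, seasons, sensor differences); the comparison step itself stays this simple.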
Yeah, so I guess then we can segue into the work that you do with the communities. I think that’s a very important community problem, that you want to understand what’s going on on the ground, but you can also use Earth observation data to be able to understand for example forest cover change, or even just trying to understand the yields, if you’re looking at for example farmlands in Uganda, or farmlands in Kenya; how does Radiant Earth work with the community to be able to solve, especially the [unintelligible 00:18:49.16] that you listed at the beginning… So how are these all connected around now working with the communities.
Particularly on the community side, I think the role that Radiant plays is we don’t wanna be the problem-solver, because we are just one organization, and problems are so diverse on the ground… But as we say in our mission statement, we wanna empower those organizations and individuals in their local communities to be able to use these resources, particularly the benchmark datasets and guidelines and tutorials that we put out, to solve those problems.
So we work on use cases, we work hand-in-hand with some stakeholders and governments on the ground, but at the end of the day, the goal is really empowering them to be able to do that themselves. That’s how we kind of model our partnership and our collaboration with local agencies.
But the crop example that you mentioned is another impactful one… About two years ago, when we had the locust swarm hitting the East Africa region, after all the cyclones and the wet season that there was there, there were a couple of governments in the East Africa region that were asking “Okay, what are we growing in terms of crops this season, and where are they?” Because they needed to have an immediate response in terms of what the impact of those swarms on food security would be; how much they need to import, what is the impact on farmers, should they provide any subsidy there?
[20:05] And governments, some of them didn’t have any basically updated map of cropland areas in their region, because that is a very intensive process if you wanna go on the ground and do a census every year, and provide basically a baseline map of that. But satellite imagery can do that for you. If you have good reference data on the ground, not necessarily a full census, you can build a machine learning model that looks at the time-series of Sentinel 2, and then gives you basically a crop type of a region. Then you can have a map at a national scale that the government can use for decision-making, and basically having a better insight into what are farmers growing in this region.
This is a growing field in the AI and [unintelligible 00:20:44.21] I wanna pass it to Abba, because he’s working on a problem around crop type classification, and how we deal with actually regions that we don’t have good reference data… But it would be good if you can talk about the synthetic data problem.
Because we have limited data in a lot of regions, limited labeled data, what we were able to work on is using generative adversarial networks (GANs) to generate an image for each of the bands of the satellite images. Assuming we have just 2000 labeled images of different regions, we can now generate many more than 2000 based on the data which we have. That has proved to improve the crop type classifiers which we build. It’s still ongoing work, but it has provided good results for now.
Yeah, so if I’m understanding right, Abba, you’re sort of using the GANs for data augmentation in the case of data scarcity… So do you take the actual observed imagery and use that with your discriminator in this framework to create these augmented images? And could you describe a little bit – was that the initial solution that made sense to you, or is that something you stumbled on later? Could you describe maybe a little bit more of that process and how you came to that solution?
It was research that came out of an existing paper in which an MSG-GAN was used. We decided to make a few modifications to that, to be able to take in all the possible bands which we have, and generate them.
The initial paper just generated images without the labels, so we were able to modify it to also generate the labels as well. So that’s basically the setup which we used.
I’m curious, as you’re talking about GANs and the use of that, and trying to augment data and stuff, have you – as kind of a little bit of a random question thrown in… Any thoughts on the use of simulation? We’re seeing a lot of simulation starting to be used with that going forward in this space… Have you all gotten into that area, have you put any thought into what you might do in terms of data augmentation with simulation and GANs and such, or not?
Something on the GAN, just to say - we have a website, if you’re interested to see some of that synthetic imagery; you can go to isthisplacereal.com. We have a game there, where you can see how well you can tell real from synthetic or fake satellite imagery. We got interested in this, similar to the website that [unintelligible 00:23:28.25] thispersondoesntexist.com.
But anyway, back to your question about simulations… So that is true, that is a growing kind of application in the general earth science/geospatial sector. We haven’t at Radiant touched on that space yet; we are not doing any simulation ourselves. But particularly for those who are working to embed AI/ML modeling into the general climate modeling, in terms of projecting what will happen with a warming world within like 10 years, 20 years, 100-year windows, they are basically relying on many of these simulations to feed into the machine learning model part and train those models.
[24:06] That is a fast-growing field; there’s a lot of research, a lot of interdisciplinary – actually, departments being established at universities to just work on that type of problem; how we can learn from the physical and the simulation world to teach the machine learning models to simulate [unintelligible 00:24:20.12] and be more scalable, and hopefully more interpretable in the world. So it’s a growing field, but we at Radiant haven’t done anything in there yet.
I’m interested – as I was sort of exploring the crop spotting or classification use case that you’re highlighting, I noticed that there’s a leader board on the Zindi site related to a Spot the Crop challenge. Could you talk a little bit about that, and how you’ve decided to utilize some of this competition leader board type of approach to look at some of these problems?
Yeah, so this is actually one of those use cases where we work with a local partner. In this case it was the Western Cape Department of Agriculture in South Africa. So they do this agricultural census every decade or so, because as I mentioned, it is a very extensive and expensive process… And they had done the recent one in 2017-2018. So they had a high-quality map of their state in terms of what crops are grown, specific field boundaries of each farmland, the crop type, and other metadata, like irrigation type, and so on. And they were interested to see “Can this process be automated, or at least semi-automated, using satellite images?”, so they can get the updated map every year. And what is the art of the possible with that.
So what we did with them was receive that data as a partner, and then curate a high-quality and diverse training dataset out of that. So we matched those kind of labels that they have collected on the ground; those are practically what we call labels - you are on the ground, you collect the crop type, and because we matched this with satellite imagery, that becomes the label. And then we basically match it with the corresponding time-series, as Abba mentioned; in this type of problem we usually use a time-series of imagery, it’s not just one image and you classify a label for it… Because crops have a seasonality, they have a phenology, and you wanna find that signature, and then the model will be better able to decide if this is wheat, or maize, or sorghum, or so on.
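The point about seasonality can be illustrated with a toy example: two crops can look alike in a single image yet differ clearly in how their greenness evolves over a season. The sketch below is an assumption-laden illustration, not the models built for this project. It uses NDVI, a common vegetation index computed as (NIR - red) / (NIR + red) that the conversation does not itself mention, and a simple nearest-centroid classifier; all reflectance values, profiles, and crop labels are made up.

```python
import math


def ndvi(red, nir):
    """Normalized difference vegetation index for one observation."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def ndvi_series(observations):
    """observations: list of (red, nir) pairs, one per acquisition date."""
    return [ndvi(red, nir) for red, nir in observations]


def fit_centroids(series_by_field, labels):
    """Average the time series of all training fields sharing a label."""
    sums, counts = {}, {}
    for series, label in zip(series_by_field, labels):
        if label not in sums:
            sums[label] = [0.0] * len(series)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], series)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids, series):
    """Pick the label whose average seasonal profile is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], series))


# Made-up training profiles over five dates: "wheat" greens up early,
# "maize" greens up late.
train_series = [
    [0.30, 0.70, 0.80, 0.40, 0.20],
    [0.35, 0.65, 0.75, 0.45, 0.25],
    [0.20, 0.30, 0.50, 0.80, 0.70],
    [0.25, 0.35, 0.55, 0.75, 0.65],
]
train_labels = ["wheat", "wheat", "maize", "maize"]
centroids = fit_centroids(train_series, train_labels)

# An unlabeled field given as (red, nir) reflectances on the same five dates.
field = [(0.10, 0.20), (0.05, 0.30), (0.04, 0.36), (0.09, 0.21), (0.12, 0.18)]
predicted_crop = predict(centroids, ndvi_series(field))
```

A production model would use more bands, more dates, and a learned classifier, but the input has the same shape: one time series per field, matched to one ground label.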
So we basically curated that dataset, and then to kind of crowdsource models, we ran a competition. Getting some support from the GIZ FAIR Forward program, we ran this competition on Zindi and exposed the problem to the pool of talent that Zindi has, to see who can build the best model for this crop type classification problem.
Similar to many other competitions, we had a training set and a test set, and the labels for the test set were hidden; we used those for scoring the predictions and defining that leader board that you saw… And people basically build their models – the incentive for them is getting exposed to a new problem, and it’s also a capacity development effort for us, because many of the people in the AI community across Africa are eager for new problems… And I think geospatial and satellite imagery is one of those domains. So it is capacity development, while it is also real-world problem-solving. It is a problem that a government agency is interested in, and there is a good potential for it…
So the winners basically are those three on the leader board that you see; they have built the best models in terms of accuracy score of detecting crop types in the Western Cape, South Africa. The models are all open source; we haven’t put them on GitHub yet, but soon they will be. Yeah, so that is the scope of that competition…
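The scoring setup described here (participants submit predictions for a test set whose labels stay hidden, and submissions are ranked) can be sketched as follows. This assumes plain accuracy as the metric, following the wording above; the actual competition metric may have differed, and the team names, field IDs, and labels are hypothetical.

```python
def accuracy(predictions, truth):
    """Share of hidden test fields whose predicted crop matches the truth."""
    correct = sum(1 for field_id, label in truth.items()
                  if predictions.get(field_id) == label)
    return correct / len(truth)


def leaderboard(submissions, hidden_truth):
    """Rank teams from best to worst score on the hidden labels."""
    scores = [(team, accuracy(preds, hidden_truth))
              for team, preds in submissions.items()]
    return sorted(scores, key=lambda row: row[1], reverse=True)


# Labels held back by the organizers; participants only see the field IDs.
hidden_truth = {"f1": "wheat", "f2": "maize", "f3": "wheat", "f4": "sorghum"}

submissions = {
    "team_a": {"f1": "wheat", "f2": "maize", "f3": "maize", "f4": "sorghum"},
    "team_b": {"f1": "wheat", "f2": "maize", "f3": "wheat", "f4": "sorghum"},
    "team_c": {"f1": "maize", "f2": "maize", "f3": "wheat", "f4": "wheat"},
}

ranking = leaderboard(submissions, hidden_truth)
```

Keeping the test labels with the organizers is what makes the ranking meaningful: no team can tune a model against the answers it is scored on.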
I’m wondering, maybe Joyce as a member of the wider research community in Africa, you could give your perspective on how a research group like yours might think about using some of these tools that Radiant Earth is creating, and what it might enable for you… And maybe then if you have follow-up questions for Radiant Earth in those regards, feel free to take us wherever.
[28:09] Yeah. So I think this is interesting… I think the examples that I gave [unintelligible 00:28:13.20] deforestation is one of the practical problems that we are working on in the lab… And it was interesting to hear Hamed’s thoughts on that. But also, what is important is - from the lab, we’ve been doing a lot of work around collecting data about the crops, some sort of crop mapping, and that there is the problem, trying to understand the ground truth and collect as much ground truth data as possible… Because if you want to use [unintelligible 00:28:36.25] then the ground truth can act as a reference.
So I feel like the problem also that we can solve is now that we have the ground truth, can we be able to map that to the satellite imagery data and be able to build models that can easily be useful for crop type mapping around different farms in the country… So there are several potential areas where we can really benefit from what Radiant Earth is providing and the kind of data that they are providing… But especially, since Makerere AI Lab is located within a university, it really provides an opportunity for capacity building, because that’s how the lab grows. We have a lot of students who come in, do internships, and then get introduced to several concepts in machine learning and AI, and ML for Earth observation is one of those concepts that we are starting to work on.
I know that earlier this year we had the ML bootcamp that we organized together with Radiant Earth and FAIR Forward as well… So I think that’s what I wanna hear from Hamed. I just saw recently that [unintelligible 00:29:34.13] so I think it’s important for us to know, if I were a student now there, and I wanted to learn more [unintelligible 00:29:43.00] Radiant Earth, yes, there are tutorials, but I want to get a whole idea of [unintelligible 00:29:46.26] how do I get the data… But not just the data, how do I actually start to build my own model; for example for a prediction of deforestation, how do I get started? I know that Radiant Earth has been able to provide that opportunity. So Hamed, if you could speak more about this capacity development, that would be really great.
Sure thing. When you think about the whole ecosystem of sustainable development goals, all the data being available, and as I mentioned, working with local partners, one of the missing pieces that we’ve found is we need to train more individuals to be able to solve these problems… Because it is a new field, it is a growing field, it is an impactful field, but we need people to get trained how to use the data, and generally starting with what is satellite imagery, as Joyce mentioned. For that reason, one of the pillars of our work is really training and capacity development.
Earlier this year, back in May of 2021 actually - so we joined efforts with Joyce’s team at Makerere University and got support from the GIZ FAIR Forward program to run a training bootcamp focused on machine learning for Earth observation. And the target for us was there’s a growing community of AI practitioners across Africa; we at Radiant have worked with Data Science Africa, with [unintelligible 00:31:01.02] across different countries, and they are all AI people, but they’re not exposed to the remote sensing work. So the training was outlined in the sense that let’s start exposing them to what is satellite imagery, how you deal with this type of data, how do you access them? I mentioned the APIs and the repositories, but practically, how do you write the Python code to get that type of imagery? And then how do you curate a training dataset when you have the reference data on the ground?
So the lectures were kind of designed in that sense, and we had around 40 participants in that course… And then in the second week of that we were working on practical use cases - how do you build a crop type classification model? How do you build a model for, for example, wind estimation from a cyclone? We’re using, again, satellite imagery that we get regularly. And particularly, we were asking participants to also work on some of those exercises between lectures. So it was hands-on and practical, it wasn’t just lectures.
[31:56] And I think it was a very successful training based on the feedback and the survey we collected from the participants afterwards… And then we packaged that training program into an online course which is now available on the [unintelligible 00:32:07.17] So anybody can go through that course now, at their own pace, and basically start learning about Earth observations, the topics that I mentioned now to you, dealing with machine learning in that sector, and then building your own model. If you finish the course [unintelligible 00:32:23.28] you can also get a certificate, and there’s also a user community on our end… So we have an open Slack workspace; the link is in that course, and it’s basically [unintelligible 00:32:33.26] that anybody can join, ask questions, connect with other peers, share their experience or problem, or look for collaborators. So those who participate in that course can also connect to others and basically get feedback from others.
So as you all have been talking about this for the last few minutes, it’s a cool process, and I find my mind wandering back to earlier in the conversation, and you’ve hit a couple of different use cases… But I’m starting to wonder how this could be applied to so many areas, now that I understand how you’re approaching it, and the tools that you’re developing… So if you go back to that notion of sustainable development goals (SDGs) and those kind of big-problem areas, and you’re addressing some of those, I’m rather curious, now that I understand, which of those types of things you’re hitting, or planning to hit in the future, and which might be aspirational? So even if it’s something that you have an interest in, your brain has been chugging on it for a while, and you know that might not be something that you’re going to be able to address right away… I’d love to understand what you’re thinking about what’s possible here; what types of things, what types of problems are achievable maybe in the short, medium and long-term, long-term maybe just being aspirational?
Yeah. I will share my feedback; Abba, feel free to jump in afterwards. So think of geospatial as a horizontal sector. So geospatial can feed into many, many vertical domains: agriculture, food security, land cover, surface water monitoring, drought monitoring, deforestation, ocean monitoring, sea life… There are many aspects that geospatial as a horizontal sector can feed into. That’s how we have established MLHub; so it is agnostic to the application. But as a Radiant team, we have limited capacity, so we ourselves work on specific problems, which are particularly in the agriculture and food security sector. That has been really the prime application area for us, because of its impact across the development sector. It employs a significant portion of the labor across these developing regions, and it’s a significant portion of the GDP, and then food security and human life is definitely another angle of that. So for those reasons, we have been kind of doubling down our efforts on how we can better support stakeholders on the ground to solve the problems related to that.
The ambition and the end goal, if I wanna think “Okay, what is the ideal world, the utopia that we are thinking about?”, is for the stakeholders, particularly in this case governments, and to some extent commercial service providers, to be able to say “Oh, I’m in this region. I wanna know what the farmers are doing, I wanna provide the right recommendation to them in terms of fertilizer application, the best crop type to grow, how to basically plant that, how much water you need, and maximize their production toward the end of the season.” That addresses their economic well-being, that addresses their food security, and the human well-being, because they have more nutrition and food to feed the society.
And then we are utilizing our natural resources the best way, because that’s another angle of this thing. If we’re over-planting in regions that are not suitable for that specific crop, we are killing the nutrition of the soil, and then down the road we’re not gonna have a sustainable agriculture.
So really, the end goal is supporting the governments and the stakeholders to be able to do that. It needs a holistic approach; it is not just us doing that. But what we are trying to do is show the art of the possible, providing those benchmarks and access points, and providing the know-how and the skillsets to as many people as we can train, and then asking them to train others, so that the training ecosystem is sustainable and able to solve those types of problems.
[36:14] I would be curious just to sort of follow up on that. That was a very interesting take on where geospatial sits in the stack, horizontally, and what it can impact. I was wondering, just for our listeners out there who – you know, with it being Practical AI, many of them are hopefully thinking about practical things… I was wondering if you could describe in a little more detail, if people are excited by this podcast and they want to dig a little bit into the Radiant MLHub, could you just describe a little bit what the API looks like for the Radiant MLHub, and the sorts of functions and things that you can do with the hub? I see some sort of model extensions, and that sort of thing…
The Radiant MLHub, just like Hamed mentioned - so now there’s a model registry on it, and it contains just brief tutorials on how to use the models, and also it has the datasets. So you could browse it for different applications - so crops, wildfire, land cover… Just different applications which you might want to use, and you get to see all the datasets.
Now, the API calls are quite direct as well, because it was made to be simple. In Python, you just import the dataset class from Radiant’s MLHub client and then you specify the dataset which you want; it works the same way for each of the datasets. You just proceed from there to build whatever model you want to build, or analyze data… So it’s quite straightforward. It was made to be very simple for anyone to use. Everyone can view that as well on MLHub.earth. So you could register and download whatever datasets are there, and work with them as well.
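Editor's note: the workflow described here (import the dataset class, pick a dataset, download it) might look roughly like the sketch below. It assumes the `radiant_mlhub` Python client is installed and an API key is configured; the `Dataset.list()` and `download(output_dir=...)` calls reflect that client as we understand it and may differ between versions. The helper names `find_datasets` and `download_first_match`, the `"crop"` keyword, and the `./data` directory are hypothetical, chosen only for illustration.

```python
from typing import Iterable, List


def find_datasets(datasets: Iterable, keyword: str) -> List:
    """Return the datasets whose id or title mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        d for d in datasets
        if needle in str(getattr(d, "id", "")).lower()
        or needle in str(getattr(d, "title", "") or "").lower()
    ]


def download_first_match(keyword: str, output_dir: str = "./data"):
    """Download the first MLHub dataset matching the keyword (hypothetical helper)."""
    # Imported lazily so find_datasets() works without the client installed.
    # Assumption: `pip install radiant_mlhub` and an MLHub API key are set up.
    from radiant_mlhub import Dataset

    matches = find_datasets(Dataset.list(), keyword)
    if not matches:
        raise LookupError(f"no dataset matches {keyword!r}")
    dataset = matches[0]
    dataset.download(output_dir=output_dir)
    return dataset


if __name__ == "__main__":
    # Requires network access and a registered account.
    download_first_match("crop")
```

The keyword filter is kept separate from the download step so the browsing logic can be tried on any list of objects with `id` and `title` attributes, without network access.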
I’m wondering, as we come near to a close here - we’ve heard some, of course, very exciting things about Radiant Earth and MLHub… I wonder, Joyce, if you had a final closing question for the team as you’re thinking about Radiant Earth, how it’s impacting AI in Africa, and going into this new year of work… Do you wanna maybe close us out with that?
So I think also looking back at the training that we had, I think that’s a very important thing for people to reflect on, that the resources are available… And also note that it’s not only for people in Academia, but also people in industry, in policy… I remember during that training we had people who had private startups, that were working around Earth observation, who also benefitted from the training that’s available.
So when I think about where AI is going in Africa, the things that we think about - one is capacity, especially in a growing field like ML for Earth observation, so how do we train people, how do we build capacity… But then another thing that I also think about is the data, and through the Radiant MLHub the data is there – because Hamed has explained that they provide ML or AI-ready datasets that people can use for the various problems they encounter when working with Earth observation data.
So I feel like that helps to close this gap that we’ve been struggling with, of “Okay, there’s no capacity” or “I’m not knowledgeable in that area. Okay, if I get that knowledge, then how do I gain access to the data?” So the data is also provided there. There are tutorials, and it’s a bit easier for people to follow through and practically build their ML models.
So I kind of feel like as a wrap-up this is something that’s important for us to remember… But maybe, Hamed, there’s something that I have missed that you think could be important for us, for the AI community in Africa, as we move into the new year. It would be exciting to hear your thoughts on this.
[40:01] Yeah, thank you, Joyce. I mean, you hit it pretty well. This is about the community and the capacity. We are acting as what we call a collaborative agency, providing those resources and supporting the community to be helpful to their end users, which are decision-makers, farmers on the ground, people who are working in sustainable urban development…
For us, our success is basically the success of those end users who are building those applications. So if they are more efficient, more productive in deploying solutions into their community, we are successful, because we have been able to empower them. That’s really how we look at this ecosystem. And we look forward to engaging with more partners, with more users… As Abba mentioned, everything on our end is open access, so feel free to start using the data.
If you have data you wanna contribute, you can do that too, because MLHub is an open repository: you can access the data openly, but you can also contribute data. It is not just us publishing data; many of the datasets are contributed by other providers and users.
So if you have benchmark data and you wanna expose it to a broader community, please get in touch; you can publish your data on MLHub, and we definitely are interested in expanding the coverage of the data, in terms of both geospatial coverage and application areas.
And we will have more training and capacity development in 2022 and the years beyond. If you have specific needs and you wanna get support from us, get in touch; the Slack workspace that I mentioned is definitely a good way to communicate with us. We also have a support channel on our website, as you can see… And we look forward to engaging more of those users.
Well, thank you all so much. Thank you, Joyce, for joining us again, and to the Radiant Earth team for taking time out of their amazing work to have this conversation. It really is wonderful, and we’ll include links in our show notes for those that want to jump off to the MLHub, to the open datasets, and the competitions, and the course… So please, take a look, and start to get involved in the new year.
Thank you to you all. Have a wonderful rest of your day.