Nadia Eghbal and Mikeal Rogers kick off Season 1 of Request For Commits with a two part conversation with Karl Fogel — a software developer who has been active in open source since its inception.
Karl served on the board of the Open Source Initiative, which coined the term “open source”, and helped write Subversion, a popular version control system that predates Git. Karl also wrote a popular book on managing open source projects called Producing Open Source Software. He’s currently a partner at Open Tech Strategies, a firm that helps major organizations use open source to achieve their goals.
I'm Nadia Eghbal.
And I'm Mikeal Rogers.
On today's show, Mikeal and I talked with Karl Fogel, author of Producing Open Source Software. Karl served on the board of the Open Source Initiative, which coined the term 'open source' and helped write Subversion. He's currently a partner at Open Tech Strategies, helping major organizations use open source to achieve their goals.
Our focus on today's episode with Karl was about what has changed in open source since he first published his book ten years ago. We talked about the influence of Git and GitHub, and how they've changed both development workflows and our culture.
We also talked about changes in the wider perception of open source, whether open source has truly won, and the challenges that still remain.
So back in 2006 I started working at the Open Source Applications Foundation on the Chandler Project, and I remember we had to kind of put together a governance policy and how do we manage an open source project, how do we do it openly, and basically your book kind of got slapped on everybody's desk. [laughter] The Producing Open Source Software first edition, and it was like "This is how you run open source projects."
Wow, that's really nice to hear, thank you.
And it was... Especially at that time it was an amazing guide, and I know from talking with Jacob Kaplan-Moss that the Django project did something similar, as well. I'm very curious how you got to write that book and what preceded it. It's produced by O'Reilly, right?
I'm curious why O'Reilly wanted to do something... It's very deep and very nerdy, so...
Yeah, actually I wanna take a quick second to give a shout out to O'Reilly because... I mean, that was never a book that was gonna be a bestseller, and they sort of knew that from the beginning, and they not only decided to produce it anyway, they gave me a very good editor, Andy Oram, who made a lot of contributions to the book in terms of shaping it and giving good feedback. And they let me publish it under a free license, which to a publisher, that's a pretty big move, and it's not something that they do with all their books. So I really appreciated the support I got from them.
So the answer to your main question there I'm afraid is pure luck. I really think that in the early 2000s, 2005-2006 the time was ripe for some kind of long-form guide to the social and community management aspects of open source to come out, and my book just happened to come out. If someone else had written a long-form guide, then... You know, it's like in the early days of physics - if you just happen to be the first person to think of calculus, you'll get all this credit; but there were probably ten people who thought of it, it's just that someone published this first.
So yeah, I just got really lucky with the timing. And the way that I was motivated to write it, that O'Reilly had contacted me about doing a Subversion book... I was coming off five or six years as a founding developer in the Subversion project and it had been my full-time job, and I'd gone from being mostly a programmer and sort of project technical - not necessarily technical lead, but technical arbiter or technical cheerleader in some sense, to more and more community manager. I mean, I was still doing coding, but a lot of my time was spent on just organizing and coordinating the work of others and interjecting what I felt were the appropriate noises in certain contentious discussion threads and things like that.
[00:04:09.07] So when it came time to write a Subversion book, I had already written a book, I knew folks at O'Reilly, and they said "Would you like to be one of the authors?" There were a couple other Subversion developers that I worked with who were also interested in writing, and we had all agreed that we would co-author it.
Then as I started to write, I really let down my co-authors. I said, "Hey, folks, I'm really sorry. I don't wanna write another technical manual. I've already done that once. You folks go do it, it's gonna be great." And I wrote the introduction and they wrote a wonderful book that became one of O'Reilly's better sellers and is still quite popular.
So I thought, "Well, what was it that I wanted to write if that wasn't the book?" and I realized the book I wanted to write was not about Subversion the software, it was about the running of a Subversion project, and about open source projects in general - Subversion wasn't the only one that I was involved in. So I went back to O'Reilly and I said very meekly, "Could I write this other book instead? What do you think of that?" and they said yes. So I sort of backed into it... I was forced into the realization that I wanted to write this book through trying to write another book and failing.
Was that a popular view back then? Like, when you said that you wanted to write this non-technical, more management-focused book around open source, were people like "Why?"
Let me cast back my memory... No, but then again, the people that I talked with - that's a very biased sample, right? Most people were encouraging, and if they were mystified as to why I wanted to write this, they hid it very well and were nothing but encouraging. Then it took a little bit longer to write than I thought, and people were asking "How's it going?" and I'd always give the same answer, like "How's your book going?" "Never ask. Thank you." [laughter] No one ever listened to that, they would just ask the next time. But eventually it got done.
I think there was, among people involved in open source. For example, the role of community manager was already a title you started to see people having. You started to see a phenomenon where the coordinating people, the people doing that community management and projects were no longer also the most technically sharp people. I was definitely not the best programmer in the Subversion project; I could think of a lot of names - I've probably even forgotten some names of people who I just think are better coders than I am, who were working on that project.
And that was true across a lot of open source projects. I could see that the people who were doing technical and community work together were not the Linus Torvalds model - and Linus Torvalds isn't by any means a typical example... The Linux kernel in general is not a typical example of how open source projects have ever operated. It's been its kind of own weird, unique thing for a long time. But one thing you can say about it is that the leader of the project is also one of the best programmers in the project. Linus is a very technically sharp person. But that was not the case in a lot of open source projects, and that to me seemed like a clue that, "Okay, something's happening here where open source is maturing to the point where different sets of skills are needed", and you've got these crossover people who are often founding members of a project and active in coding and other technical tasks, but their main focus, their main energy is going to the social health and the overall community health of the project. [00:07:45.19] I wasn't the only person sensing that. A lot of people seemed to already understand the topic of the book before I explained it to them.
For that first book, I mean, you came up through the '90s open source scene and were clearly doing a lot of community work on the Subversion project - did you write it mostly just from your own experiences and memory, or did you go through a phase of research and reaching out to other projects?
That's a really good question. Yeah, I researched other projects. I did rely a lot on my own experiences, which were somewhat broad; I had worked on a lot of projects by that point. But I was worried that I would be biased, and particularly towards backend systems projects, because I was a C programmer, I didn't do a huge amount of graphical user interface programming or stuff like that. Web programming was kind of new then, but I still hadn't done a lot of it. So I deliberately sought out some other projects to talk to and people were very generous with their time. I think I listed them all, either in the acknowledgments or in the running text or the footnotes. So not all those were projects that I worked in, they were just places where people were willing to be informants.
Interesting. You mentioned that people were starting to come around and you were starting to see community manager as a title, but I do feel like the book addressed something and reset people's expectations about how open source projects run. It did bring a lot of this community stuff and not everything being purely technical to the forefront. If there was one presumption that projects had at the time, that the book was meant to address - is there one that you can point at? Or any kind of general stories that you might have heard about shifts in people's... What I really wanna get at is, people's conception of open source had been this pure meritocracy, pure technical side of things, right? Not a lot had been done in a formal way to address the role of people and people management, and processes and barriers to entry until your book, as far as I know.
I think I get the question you're asking, and it's a good one. I've never really thought of the book as addressing a sort of as yet unacknowledged need, but I guess in a way it was. The observation I had at the time in Subversion, and then as I started to talk to people in other projects I realized it was just as true for them as it was for Subversion, was that there's no such thing as a structureless meritocracy, and there's no such thing as a structureless community. We've all heard of the famous essay The Tyranny Of Structurelessness, in which the author points out that if you think you have a structureless organization, what you really have is an organization where the rules are not clear and people with certain kinds of personalities end up dominating by sometimes vicious or deceptive means. And that has certainly been the case in some open source projects. I don't wanna name names, but we could probably all think of some.
What I saw on Subversion was that managing a bunch of people who were not all under one management hierarchy, like they were coming from different companies, and some of them were true volunteers in the sense that there was no way in which they were being paid for their time, or only very indirectly, but a lot of them were being paid for it and they had their own priorities, and to make that scene work and to have the project make identifiable progress, you had to broker compromise; you had to convince people like, "Okay, this feature that you want needs some more design work, and the community has to accept it. That means it's not gonna be done in time for this upcoming release, but we don't wanna delay the release because there's another thing that this programmer or this company wants needs to be in that release and they're depending on it. And by the way, if you get them on your side by cooperating now, he'll be much more likely to review your changes and design when your stuff is ready." Things like that. Making sure that the right people meet or talk at in-person events. Occasional backchannel communications to ask someone to be a little bit less harsh toward a new developer who is showing promise or is perhaps representing an important organization that is going to bring more development energy to the project, but we need to not offend their first person who comes in, who is maybe not leading with his best code; it sometimes happens.
[00:12:17.09] There were all sorts of stuff that had to be done that was not necessarily visible from just watching the public mailing list. So the book was basically - I realize I'm giving a long answer, you should feel free to edit this down, by the way... Now I'm trying to be a little less verbose...
No, this is perfect.
Okay, I'm glad. [laughs] I guess the thing the book was meant to address was you get a lot of programmers who land in open source somehow, they find themselves running projects or occupying positions of influence, and both because no one has ever said it, and because it's not visible from the public activity on the project - or not entirely visible - and because there is a predisposition among programmers to be less aware of social things; statistically speaking, programmers I think are somewhat less socially adept people than most people. Obviously, there are exceptions to that, but I think it's a broad categorization that is statistically true. So for all of those reasons, I wanted there to be a document that said, "Hey, you need to start thinking about this as a system. You need to start thinking about the project in the same way you think about the codebase. Parts need to work together, and you need to pay attention to how people feel and to their long-term interest, and you've gotta put yourself in their shoes. Here's a rough guide to doing that."
That's what I was thinking when I was writing the book, and I never really articulated that until you asked the question, but I'm pretty sure that's more or less what I was thinking.
Yeah, I mean, we're still struggling with that today. [laughs] We're talking in the past tense because the book came out ten years ago, but I'm still struggling to get people to recognize that today...
Well, let's go right to the controversial stuff. The Linux kernel project is famous for kind of having a toxic atmosphere, right? And Linus has basically said that he equates the thing that most of us call toxicity with meritocracy. In other words, the kinds of people who write the kinds of code that he wants to end up in a Linux kernel are the kinds of people who flourish in the atmosphere he has set up.
Maybe that's actually true, but I just don't think the Linux kernel project has run the experiment of trying to... Forking the project and running a nice version, where everyone is welcomed warmly and not insulted personally by a charismatic leader, in which they can see whether that theory is actually true.
Right. I was actually not even thinking about projects that are more than ten years old, but even projects that start today struggle with this. Just acknowledging that soft skills matter and that somebody needs to pick up this community work.
I think it's interesting that you said that you wrote the book in 2005, around this time when you felt like people were starting to notice and care about the need for skills beyond coding, but I feel like that's almost what people would say about right now too, so I wonder if anything's even changed in ten years or not.
Well, just imagine how much worse things would be if we hadn't all been through that. [laughter]
You never have an alternate universe in which to run the experiment, unfortunately. But I think it will always be true, because the startup costs in open source are so low - although that's changing a little bit, and we can talk about that later - so that the people who start projects, they'll just land in it coming from a technical interest. They're not starting out by thinking of soft skills, so the projects are always launched in a way that's sort of biased towards a certain kind of culture, and then they end up having to correct toward a more socially functioning culture, even though that imposes a small amount of overhead on the technical running of the project.
[00:16:17.13] And if it's a useful project and people are like "Well, I'm gonna use it"... Or even if it's not useful, but it's just kind of a legacy being used, it's like, what incentive is there really? I think it's still very hard to tie together, and in some cases you can tie together the health of a project with its popularity, but sometimes it's a popular project and it's just not that kind of place.
Yeah... I can only make anecdotal studies there. One example is the LibreOffice project - it has really gone through a great deal of trouble to be welcoming to developers and to make their initial development process easier. Building a project is now way, way easier than it used to be; they've just really sunk a lot of time into making it easy to compile it from source, to welcome new developers. I think that's having a good effect, but how do you know how popular or how successful the project would be without that? You just don't.
You mentioned that you've released it under Creative Commons license, and I saw that you've actually kind of kept it a little bit up to date and you've kind of pushed small changes to it over time, but in 2013 you decided to actually do a full new edition of the book. What precipitated the need for an entire new edition, rather than just adjustments?
A few things. One, the adjustments that I had been doing in the years from 2006 roughly to 2013, they weren't that trivial. I mean, there were a lot of small scale changes that went in. I think most sections of the book got touched, some of them pretty heavily, but I was never thinking of it as a full rewrite. And then it was really partly my own feeling about certain things that were out of date and partly feedback I was getting from other people. One thing everyone noticed - and I noticed too, because I also use Git for my coding work, although I use Subversion for non-coding version control - was that all the examples used Subversion, which was totally the right thing to do in 2005, because that was the thing that you stored your open source code in, but it just wasn't by 2013; Git was the obvious answer, and frankly even though the site itself is not open source, GitHub was clearly the thing to use. For example, most active open source code is just on GitHub, and if the book doesn't acknowledge that fact, then it's just not reflecting reality and it's not giving people the smoothest entry into open source that they can have.
So one obvious thing was the revamping of all the examples to use Git instead of Subversion and to talk about GitHub. And also in general, the project hosting situation had changed. I'm sorry, I just don't consider SourceForge a thing anymore. [laughter] So many ads, too much visual noise, not compelling enough functionality, and that's despite the fact that the SourceForge platform itself finally went open source, as the Allure project - which is great, I'd love to be using it, but I'm afraid I just have a much better experience with GitHub and Git, so that's what I use.
So the recommendations about how to host projects really needed to change to be oriented more around the Git-based universe and to at least acknowledge and recommend GitHub, while acknowledging that it itself is not open source... Although I hope that they see a grand strategic vision whereby opening up their actual platform makes sense some day; I think that the real secret sauce there is the dev ops, it's not the code, so I hope they do that someday.
[00:19:53.06] The other thing that changed kind of in a big way was what I think of as the slow rise of business-to-business open source, which is... The old cliché was "Open source always starts when some individual programmer needs to scratch an itch"; she needs to analyze her log files better, so she writes a log analyzer and then she realizes that there are other sysadmins who need to do the same, so they start collaborating, and now you've got an open source log analyzer, and she's the de facto leader of this new open source project. Well, that did happen a lot, but now you have things like Android. You have businesses putting out open source codebases like TensorFlow. I don't mean to pick on Google examples only, it's just that those are the first things that come to mind, but Facebook also does this, Hewlett Packard does it... Lots of companies are releasing open source projects which are - I guess you could call it a corporation scratching a corporation's itch, but it is not a case of an individual developer; it's a management move, it is done for strategic reasons which they can articulate to themselves and sometimes they also articulate to the world.
And I thought that the rise of that kind of project needed to be covered better, and that that was a trend that if the book could explain it better to other managers in tech or tech-related companies, that perhaps it would encourage some of them to join that trend.
And sorry, I'm realizing that there's one component to the answer - the other thing that changed was that I expected governments to be doing more open source by 2013 than they were, and I had at that point been very active in trying to help some government agencies launch technical products as open source, because they were gonna need that technology anyway. It's taxpayer-funded, why not make it open source? And they were just really culturally not suited to it. There were just many, many things about the way governments do technology development, the way they do procurement, the way they tend to be risk-averse to the exclusion of many other considerations, really made open source an uphill struggle for them, and I wanted the book to talk a lot more about that, because I wanted it to be something that government decision-makers could circulate and use as a way to reassure themselves that they could do open source and that it could be successful, and that they didn't have to look at it as a risky move.
So there were some new trends that I wanted to cover and there were some new goals that I had for the book, and they just required ground-up reorganization and revamp.
Wow, that's great. We're gonna take a short break and when we come back Karl's gonna get into how GitHub has changed the open source landscape.
We're back with Karl Fogel. Karl, in your mind, what of Git and GitHub changed about open source today? What are the biggest shifts that happened from the Subversion Apache days to now?
[00:23:44.29] Well, so I might have to ask you for help answering this, because I wonder if I was so comfortable with old tools that maybe I was blind to something that was difficult about them. I didn't feel like GitHub changed the culture tremendously except in the sense that Twitter changed the culture of the internet, which is to say it gave everyone an agreed-on namespace. Right now Twitter is essentially the "Hey, if you have an internet username, it's whatever your username is on Twitter. That's your handle now." And in open source your GitHub handle, which for many people is the same as their Twitter handle, that's like your chief identifier. And it's not a completely unified namespace and there are plenty of projects that host in other places and many developers contribute to projects that are hosted in places other than GitHub... But it is sort of a unified namespace.
If you have an open source project and you don't have the project name somewhere on GitHub, someone else is sure to take it for their fork, right? So you've gotta get that real estate even if you're not hosting there.
But I think the way GitHub wants to think about it is that they made it a lot easier for people to track sources, to make lightweight, quick, so-called 'drive-by contributions' and to maintain what used to be called vendor branches, that is to say permanent non-hostile forks; internal, or sort of feature forks that are maintained in parallel with the project, where the upstream isn't going to ever take the patches, but they'd have otherwise no particular animosity toward the changes, or are even willing to make some adjustments so that the people who need to maintain that separate branch can do so.
So I think their goal was to make all that stuff easier, and also to make gazillions of dollars, which I'm happy to see they're doing. And I think that it is part of GitHub's self-identity - for the executive and upper management team, it's part of their self-identity to think of themselves as supporting open source, that they are doing good for open source. And as I said, I always remember that the platform itself is not open source, but that aside, I think in many ways it's true, they do a lot of things to support open source.
The moves that they made to give technical support and kind of a little nudge to projects to get real open source licenses in their repositories was a really helpful thing. Nowadays most active open source projects on GitHub do have a license file, and that's partly because GitHub made a push to help that happen, and they've done a lot to support getting open source into government agencies and things like that. So I think they had sort of cultural motivations, as well as technical and financial motivations.
So has it changed the culture of open source? That's the thing, I'm not really sure it was all that hard to contribute to an open source project before GitHub. Maybe that's because my specialty was working on one of the tools that is the main part of the contribution workflow, with the version control tools; I worked on CVS, which was the main open source version control system, and then I worked on Subversion, which was for a while the main open source version control system. So if I wanted to make a drive-by contribution to some other project, of course I never had any problem doing it, because the version control tool is probably something I hacked on; it was just no trouble. But maybe you could tell me, was it actually harder?
Well, there's a couple things here that we're glossing over. Just a couple. And I suffer from the same problem, where you'll jump through hoops without realizing that they're hoops, because you're just used to doing this kind of stuff... But the Twitter analogy works really well; so yes, there's a shared namespace - and before that, people had e-mail addresses, so it's not like we'd lacked identity, but it did sort of unify those, so you know where to find anybody by a particular name, where to find a project by a particular name. But another thing that Twitter does too is it has a set of norms around how you communicate and how you do things with DMs and @ replies and stuff like that, right?
Source control is certainly part of the contribution experience, but if GitHub was just Git, it wouldn't be the hub... It wouldn't be GitHub, right? There's an extension of the language and the tools around collaboration that they also unified. In Subversion I can create a diff, but how I send that diff to you and how you communicate that it may or may not go in or out, how we might communicate about that review process, that is not a unified experience across projects in older open source the way that it is in GitHub, right?
That's true, and that's a really good point. I mean, it was never hard to find out. Usually you mail the diff to the mailing list and people review it there, right? But you had to find out the address, you had to go read the project's contribution documentation, and maybe that didn't exist or was not easy to find... And you're right, on GitHub it's 'Submit a pull request'. You know what to do - fork the repository, make your branch, make your change, turn it into a pull request against the upstream, and now it's being tracked just like an issue, and by the way, the issue tracking is also integrated, so now you don't have to go searching for the project's issue tracker.
Yeah, I mean that workflow itself may not be more discoverable than sending a diff to a mailing list, but once you do it, it's the same everywhere. I think that's the bigger shift.
No, in fact I think it's less discoverable, in the sense that the actual... I mean, I've trained a lot of people in using Git; I go to a wonderful organization... In fact, I'm gonna do a shout out for them, ChiHackNight.org, the Chicago Hack Night, on Tuesday nights here. There are a lot of newcomers there who haven't used Git or GitHub before, or they've heard of it and tried it out. So I've had to walk people through this process of creating a PR, making their own fork or repository, and people get so confused, like "Wait, I'm forking the repository... But what's a branch? What's a repository? Where does the PR live?" It's conceptually actually not easy at all, but once they know it, they know it for every project on GitHub. And I think your point is very good, it's not that it's easier, it's just that you only have to learn it once now.
I think there's also something to be said for the friendliness of GitHub, even just visually, right? Twitter is again maybe a great analogy for that... It's just prettier. People feel more comfortable on a more consumer-facing website than navigating around the corners of the internet.
Yeah, and that's one thing that Subversion never had - a default visual web browser interface. There were several of them and your project had to pick, so the one you picked might be different from what some other project picked. With GitHub it's like... There are a lot of people who think of Git as GitHub. They think that that web interface that you see on GitHub, that is part of Git. Obviously, in some technical sense that's not correct, but in a larger sense, as far as their experience and their actual workflow is concerned, that's a pretty accurate way of looking at it.
Yeah. I think also - and this is one that is really easy to gloss over if you have any experience, but because we're in this new, publish-first mindset, newer people will publish stuff and put it up there, and they'll actually get contributions. And it actually takes a much broader skillset to take contributions than it takes to push them to other projects, especially in traditional tooling, and GitHub also makes that incredibly easy. Their diff view is quite nice. They have the image diff...
Yeah, it really is.
... and all of these other features, right? So if you're somebody that doesn't know Git very well and you just got your project up, getting a contribution and then having to pull it down locally and look at the diff, it's actually like a whole big extension of that collaboration toolchain, and they make that so easy for first-time publishers that are now dealing with contributions coming in. It makes that workflow for them really easy and it also just allows them to enjoy the process of getting contributions from people.
[00:32:00.07] Yeah, you're right. I've never thought about that, but the process of becoming an open source maintainer is a lot easier on GitHub, and it's so satisfying when you click that Merge Pull Request button and it just goes in. All you did was you clicked the green button and you've accepted a contribution from a perfect stranger on the internet. It's so empowering, right? And that was not an easy process for new maintainers. In the old system you'd manually apply a diff and then commit it, and you'd have to write their name by hand in the log message, or something.
I think we're also skipping over this entire generation of tools like Trac and JIRA, that in a lot of ways were much harder to use than sending a diff to a mailing list. [laughs]
Well yeah, I don't know, because I got so used to them. I don't think that they were a discrete generation; I think that they were a continuum of tools that as soon as the web came around, people started making bug trackers that... The original bug trackers worked by e-mail submission. You would communicate with them by sending them e-mail and getting responses back, and actually a lot of projects ran on that. Then people started making websites that would track bugs and you could just interact with the website directly, and then that was integrated with Wiki functionality, Wiki was invented, and it just took a while for interfaces to sort out the conventions that actually worked. In a lot of ways, GitHub is the beneficiary of all the mistakes that everyone made before GitHub was created. If GitHub had been invented in the year 2000, they would have made all those same series of mistakes themselves, but instead they could just look back and see what everyone else did and not make those mistakes. No libel on them, of course, that's what they should do, but that's why it worked out so well for them.
It's like MySpace and Facebook, or any sort of second adopter.
Well, I do think there's another element of this though, which is that those tools - and JIRA in particular is very good at this... It's developed for maintainers and for teams with a big project and a big process. So it is customizable to a project's process. That means that's great for that individual project if it exists alone by itself, but in an open source ecosystem where everytime I go to a JIRA there's a different workflow, that's incredibly daunting for individuals out there.
GitHub, because they were thinking about Git in the scale of people and contributions and forks and repos - you kind of take for granted that no, you can't have super customized workflows at the repository.
Yeah... One of the things I kind of admire about GitHub's management team is... I mean, if you look, GitHub has its own bug tracker. They don't have open source code, but you can file bugs against GitHub itself, and that tracker is public. If you look through there, there are like thousands of these feature requests and modifications that people want, that for each person requesting, this change would suit their needs, it would really make life easier for their project, and basically GitHub employees spend their lives saying no. You just look in those threads and they are polite and they explain why, but they have to turn down most of those requests because they have to really think about the big picture and keep GitHub simple for the majority of open source projects, and they do a really good job at that.
[00:35:44.27] One of the things that I hope is happening, and I assume it is and I would like to look into it more is that GitLab and other open sourcers - in GitLab's case there is an open source edition and also a proprietary edition - should be using GitHub as kind of like their free of charge research lab. All the things that are being requested in GitHub and all the decisions GitHub is making, and all the innovations that GitHub has to not experiment with because of their scale and all the existing customer base that they can't afford to tick off - that is a real opportunity for these other platforms to say, "Hey, GitHub made the wrong call there. We're gonna do that and try it out," because they have less to lose right now and a lot to gain, and I think that there could be a very productive interplay between the two, that is in the long run good for open source. We'll just have to see. But the fact that GitHub is making all these decisions in public is very useful, I think.
Yeah, I agree. So when you first got involved in open source in the '90s, it was sort of a counter-culture movement, and of all the things that you could say about open source today, I don't think that you could say that it was a counter-culture movement.
Well, it's funny... I think open source no longer thinks of itself as a counter-culture movement, especially in the United States. Well, actually let me back up a bit. So the term open source, at least for this usage of it, was coined in '97, I think.
And open source was going on for many years prior to that. I had run an open source company and had been a full-time open source developer long before the term was coined, and people just used the term 'free software' and got confused, because there was just widespread confusion about whether that meant free as in there's no charge. AOL used to ship CDs to everyone's doorstep and that software was free, but it wasn't free in the sense of free software, in the sense of freedom. So there was a lot of terminological confusion.
One of the things that I think is downplayed today, or there's a little bit of historical amnesia about is the degree to which the coining of the term 'open source' was not simply an attempt to separate a development methodology from the ideological drives of the Free Software Foundation and Richard Stallman, but was also just an attempt to resolve a real terminology problem that a lot of people - and especially people who ran open source businesses - were having, which was "What term do we use that won't confuse our customers and the people who use our software?"
Cygnus Solutions, which later got bought by Red Hat, tried to go with the term 'sourceware' for a while. That was an interesting coinage, and in fact my company, Cyclic Software, which I was running with Jim Blandy at the time, we actually contacted them to see about using that term, and we got a non-committal response where it wasn't quite clear if they were trying to trademark it or they intended for only Cygnus to use it.
That's even weirder.
That wasn't gonna work... If only Cygnus can use it, that's not gonna be [unintelligible 00:39:04.22]
That defeats the purpose, yeah.
Anyway, it didn't have a good adjectival form, so it wasn't [unintelligible 00:39:09.25] Eventually, when the term 'open source' came out, I just felt this tremendous relief. I was like, "Okay, no term is perfect. This term has some possible confusions and problems as well, but it is way easier for explanatory purposes than free software has been, so I'm just gonna start using it." And I didn't intend any ideological switch by that. I was still very pro free software, I ran only free software on my boxes, I only developed free software... But I just thought, "Okay, here's a term that also means freedom that will confuse people less."
[00:39:48.17] And then roughly a year after that coinage, when Stallman and the FSF (Free Software Foundation) realized that a lot of the people who were driving the term open source, who had founded the term - not necessarily people who were using the term, which was a lot of us - were also not on board with [unintelligible 00:40:08.29] did they start to make this distinction between free software and open source, and say "Just because you support one doesn't mean you support the other. They're not the same thing, even though it's the exact same set of licenses and software... So what do we mean by 'not the same thing'?"
So that ideological split is kind of a post-facto creation. It was not actually something that was going on to the degree that it was later alleged to be going on.
And in your book, I'm trying to remember - it's called Producing Open Source Software, but isn't the subtitle also How To Run A Free Software Project?
Yeah, the book is a total diplomatic 'split the difference'.
Yeah, you really went right down the middle there.
...How To Run A Successful Free Software Project. [laughs]
Yeah... You didn't commit to either one.
Well, I didn't want to, because to me it's the same - like if there were two words for the vegetable broccoli, I might use both words, but it's the same vegetable. Open source to me is one thing; I can call it 'free software', I can call it 'broccoli', I can call it 'open source', it is still the same thing. People have all sorts of different motivations for doing it. Someone's motivations for participating in a project or launching a project are not part of the project's license, and therefore they're not part of the term for me.
That's a good transition into our next section. We're gonna take a short break and when we come back we'll talk about the mainstream version of open source.
[00:43:57.03] We're back with Karl Fogel. Karl, today a lot of people are saying that open source has basically won, in the sense that a lot of companies are using it, a lot of people are throwing around the term 'open source' who might not have traditionally been engaged with open source... Do you think that open source has won, or are there just sort of different battles to be fought? Is that helpful vocabulary?
It has absolutely not won. I do not know why people think that. Where do you walk into a store and buy a mobile phone that's running a truly open source operating system? I mean yeah, Android Core is open source, or is derived from the Android open source project. I guess when people say it's won, what they mean is that if you think of software as a sphere where it's constantly expanding - or as Marc Andreessen said "eating the world" - the surface of that sphere is mostly proprietary.
The ratio of the volume to the surface is constantly increasing, and most of that volume is open source, so people who are exposed to the backend of software and who are aware of what's going on behind the scene in tech say, "Oh look, open source is winning" or "Open source has won" because so much of the volume inside the sphere is open source. But most of the world only has contact with the surface, and most of that surface is proprietary, and that surface is the only link that they're going to have with any kind of meaningful software freedom, or lack of software freedom; their ability to customize, their ability to learn from the devices that they use... Their ability - I mean, it's not the case that every person should be a programmer, but perhaps they should have the ability to hire someone else or bring something to a third party service that specializes in customization and get something fixed or made to behave in a different way. And for most of the surface of that sphere it's completely impenetrable and opaque and you just can't do that stuff; you have to accept what is handed to you. So no, I don't think open source has won in the meaningful ways.
I think there's a really important distinction there between software as infrastructure and software on the consumer-facing side. The research I've been doing and where I'm interested is almost exclusively on infrastructure, and I noticed there is this difference on maybe the ideals of free software to begin with, or around being able to change the Xerox printer, that was the Richard Stallman thing.
Right, that's the legendary story, which I think is true, of Stallman trying to fix a printer and not having source code to the printer driver.
Right. And so I wonder, is that frustrating for them...? In some ways it really won on maybe the infrastructure side, and it's almost even - I keep saying "won", or just been massively adopted almost because it's equivalent of free labour, like price-free stuff that startups can use, and so has the needle moved at all on the principle side of things? Or does it even matter?
Well, I have a very utilitarian view of the principle side of it; I do think that software freedom is important, but it's increasingly an issue of control over your personal life and your family's and friends' lives, or at least being able not to put them in harm's way. A great example is Karen Sandler, the executive director of the Software Freedom Conservancy, she has a heart device; she has a congenital heart condition, she has a device attached to her heart, and that device is running proprietary software. That software - I don't know the exact version running on her device, but that type of software has been shown to be extremely vulnerable to hacking, to remote control.
[00:48:03.02] In fact Dick Cheney, the Vice-President had a similar device in his heart and apparently had the wireless features on the device disabled for security reasons. Think about the fact that the Federal Agency in the U.S. that is responsible for approving medical devices not only does not review software source code, it does not even require that the source code be placed in escrow with the Agency in case an investigation is later necessary. It just evaluates the entire system as a black box and says, "Yes, approved" or "No, not approved", and they have nowhere near the resources or the confidence, let alone the mandate to review the software for vulnerabilities, when software vulnerabilities are increasingly affecting everyone. Everyone's had a credit card account that's been hacked in some way.
I wonder if those battles are gonna be addressed maybe not through software freedom or open source or those types of movements, but I guess as you're describing it, I'm thinking more around hacker/maker movements and hardware stuff, or they might come at it from the same angle, saying "Why can't I just modify anything?"
Yeah, and you do see a lot of that. I saw a keynote at the O'Reilly OSCON, the Open Source Convention, you probably saw it, too... The woman who had hacked her own insulin pump; the software that controls a device that dispenses a chemical into her bloodstream turned out to be hackable, so they hacked it.
So I think you're right, the maker movement is driving it, and they share a lot of language and people with the open source movement. I just used the open source movement unironically; to me it's largely the same as the free software movement.
So yeah, there are various pressures toward people having the ability to customize or to invite other people to help them customize the devices that run increasingly large swaths of our lives. I guess what's happened is open source kept winning individual battles, but the number of things that software took a controlling role in kept increasing so rapidly that the percentage of things that are open source on the surface has been going down, even as open source keeps winning area after area.
I think that if you separated it nicely into two camps, if you look at the production of software versus the consumption of software, the reason we keep talking about open source winning is because it really has won or very close to winning the production of software. If you were a developer in the early '90s, most if not all of your toolchain was proprietary. The way that you developed software was to use other proprietary software; that's completely turned on its head.
Yeah, that was probably true, although it didn't have to be.
It didn't have to be at the time, but now the predominant way that you develop any software, including proprietary software, is to use a bunch of open source software.
Right, that's a really good point. I think you're right.
I mean, that proprietary code that's on that heart device is probably compiled with GCC. [laughs]
Or one of the other free compilers.
Or LLVM, yeah. And so because the voices in our world are so dominated by the people that have actually produced the software, there is this mindset that "Hey, I live in this world all day that it's 99% open source." It feels like it has won. And I think the reason that it won though - in that space, and not in the consumer space - is that there is a utilitarian reason that you need something open source. It is infinitely more useful if it's open source, and more useable as a producer if it's open source. And there's all these network effects that make it better over time that I can evaluate as a producer.
[00:52:12.24] But if you're looking at products and the consumption of software, it being open source or not is not visible to the consumer of that software, at least not immediately. So there needs to be some kind of utilitarian argument around that, and I think it may be privacy and security. That's a very, very good argument and it's getting more tangible to consumers now.
Yeah, I think that's at least part of it, and that has been a winning argument. A lot of the open source privacy and security projects have seen a lot more adoption and a lot more funding; just for various reasons, many of those projects tend to be non-profit, or at least not plausibly for-profit. It's very clear that for all of his eloquence as a writer and speaker, which I think is considerable, the reason Richard Stallman succeeded was Emacs and GCC. He wrote or caused people to coalesce and help him write two really great programs, and then motivated a lot of people to write a lot of the pieces of the rest of a Unix-like system; didn't unfortunately get the kernel, Linus Torvalds got that, and that has caused some bad blood ever since. But it was writing good code, that people could actually use, that gave him influence.
That's why they took his other writings seriously, it was the utility of the code. But I think going back to the way you started presenting that idea, I think one of the important goals, one of the important motivating factors in the free software movement was keeping blurry the distinction between producers and consumers; the idea that there should not be a firm wall between these two camps, and that anyone who just thinks of themselves as only using software... I sort of prefer 'user' to 'consumer' because when you use software, you don't - it's not like apples, where once you use it, it's consumed. [laughter] The software is still there after you run it, so it's not being consumed. But the idea that any user has the potential, by very incremental degrees, to be invited into the production of the software... In fact, that's what happened to me, that's how I got into it. I was just using this stuff for writing my papers in college and exploring the nascent internet, and someone showed me how to use the info tree.
That was like the documentation tree for documentation that covered all of the GNU free software utilities, and right at the top of the introductory node, the top-level node in the info documentation browser was a paragraph that said "You can contribute to this documentation. To learn how to add more material to this info tree, click here", where 'click' meant navigating with the keyboard and hitting return; I don't think there was a mouse. There was no mouse on those terminals, they were VT-100 terminals, but the idea that the system was inviting me to participate in the creation of more of the system - that struck me as really interesting.
I think there are a couple of interesting things that might be happening in tandem around that now. We haven't talked about this at all, but just the definition of a software developer has changed radically in the past five years, where a lot more people are learning how to code. Maybe they're not at a very high technical level, but just enough that they are able to modify small things around them and see that power. I think learning how to code has just become so much more accessible, so you have so many people that are interested in modifying the world around them in much more casual ways. That is blurring the line between consumer and producer. Look at any child today, everybody is learning how to code, and just imagine when they grow up and they just expect that everything around them can be transformed. It's almost like people are coming at it from a different direction, but then at the same time you see all these very proprietary platforms that are basically exploiting network effects to centralize where people congregate on the internet, and those things are still total black boxes.
I don't know what happens when the youngest generation now grows up... Will they say, "This is bullshit!"? "This is not how we were raised to see the internet."
They'll say "This is bullshit", but they'll say it on Facebook.
I think that point about network effects is really important. What happened as an increasingly large percentage of humanity got internet connections was that the payoff ratio for building a proprietary system changed. It used to be that if you were building a system there was some reward for making it a little hackable, because the users you were likely to attract... Well, people on the internet at that time were already more likely to have potential to contribute to your system, so there was statistically some potential reward for making your system have a slightly open door to people coming in and helping out. But if you're launching something like Facebook or Snapchat in the age of most of humanity being online, then the trouble you go through to make that thing hackable versus the payoff when most of those users are not going to take advantage of that, the reward matrix just looks different now, and maybe it just doesn't make economic sense for those proprietary platforms to have a porous surface.
And oddly you see, like on Snapchat for example, where people are... Snapchat offers tons of things to make people essentially modify around them, like stickers, drawing on things or whatever. So it's that same behavior, but it's still on Snapchat's platform.
Right, and they control it and they track... Like, you can't fork Snapchat and make your stickers in the forked Snapchat, let alone do something else.
[00:59:41.22] The uncharitable way to say it is that everyone's creative and environmental improvement impulses are being coopted and redirected into limited and controlled actions that do not threaten the platform providers. Basically every platform provider's business model is "I wanna be like a phone carrier. I just wanna have total control over the user base and have people have to join in order to get access to the rest of my user base", and that creates a mentality that is antithetical to the way open source works. You don't fork a monopoly-based thing. You don't fork a thing that has network effects.
I have a hard time thinking that that is necessarily... That these things have to be in conflict. I don't think that users are ever gonna... I don't think that you can sell a product to users in a competitive market based on the values that will attract a community around people hacking it. You have to be a great product compared to everybody else on the terms that most users are using it, but that doesn't necessarily mean that you can't also be hackable. You just have to have a culture around the product of actually creating something good.
Look at the one success story that we have, for a short period of time, which was Mozilla. They won for a while and took a huge amount of market share away from Microsoft - enough that Microsoft actually came and participated at Web Standards again - because they made a better browser for users, and not just for people that were hacking on websites.
And it's because it's better, not necessarily because of those...
Oh no, my doom and gloom is not a moral condemnation, it's an observation of economic reality. I think what you're saying is correct, but it's still not good news for open source.
No, and I think that's what's so interesting about right now in even how people are using the term open source, and a lot of people say something is open source when it's not actually. So the term itself has sort of been coopted into different definitions, and for a lot of people now that are just coming into it, they say the term 'open source' and they just mean "Why shouldn't I share what I made with the world?" or "Why shouldn't I change something that I see?", but it doesn't necessarily carry all that other history or expectations with it.
Well yeah, that coopting has been going on. Ever since the term was coined, there have been groups and people using it in ways that don't mean what it originally meant. There have been people coopting the term since it was coined, but there's always been counter pressure to preserve its original meaning, because the original meaning is so unambiguous and so clear. It's so easy to identify when it's being correctly used, that the counter pressure usually is successful. So I don't see any more of that now than in the past. I think that's just a constant terminological tug of war that's going on, but mostly the meaning of the term is as strong now as it ever was.
Well, I think it's as strong now to a set of people that still hold on to that term really strongly, but to be frank I think they're almost putting blinders on to how so many other people are using it. We've talked about this - at what point does that new definition just become the definition because so many people are using it that way?
Yeah, that's how the language works and I'm totally on board with that, but I guess what I'm saying is I try to see that happening - and a number of people do, and then they actually go where possible... When it's an organizational source of terminology dilution, they'll go to that organization and say, "Hey, the term doesn't mean that. Stop doing that!" and in almost every case the organization reforms their usage, and that's the only reason that open source still means anything; it's because that constant process is going on, and I haven't actually seen the ratio changing that much lately, and of course it's a very hard thing to gather data on, and Nadia you have been trying to gather data on this and you've been out there doing research on this so you might be right, but the blinders are anyway not intentional. We are actually out actively looking for that, and to me it looks like it's about the same as it ever was, and we just have to stay vigilant.
[01:04:08.23] That's a nice recap of the problems of people misusing the term or using it for something that's not within the scope of what open source means. But there's also a fair amount of - I don't know how to say this without being mean...
Oh, go for it.
Corporations or projects that are open source within the definition of open source, but aren't what we would call open.
Actually, I think that's okay and I don't care. In other words, if you're forkable, you're open source. And if you run the project as a closed society and even the developers' names are kept top secret, as long as the source code is available and it's under an open source license and it could be forked, you're open source.
You're thinking more about the future of it, rather than the current reality. Like, even if I can't get anything done now, if it becomes a big enough problem, I have that option, right?
Well yeah. I mean, the fact that you have that option affects the behavior of the central maintainers, whether they admit it openly or not. The knowledge that your thing can be forked causes you to maintain it differently, even if you never respond to any of the pull requests, you never respond to any of the e-mails of anyone from outside the maintainer group. The mere fact that someone could fork it forces you to take certain decisions in certain directions so as not to increase the danger of forking, for example. So you still get open source dynamics, even when they're not visible.
Yeah, that's a good question, Nadia. I do think that some people put blinders on and try to ignore it, but they tend to get reminded of it. [laughs]
I didn't hear Nadia's question, I'm sorry.
I really wonder whether some companies actually see it that way, or whether they're actually acutely aware of the fear of a fork. Because again, like we talked about network effects, where even if nobody likes the thing anymore, if everybody is using a certain thing, it's very hard to actually switch off.
Well, it just requires... I mean, for business-to-business open source. Again, Android is a classic example. Google is very aware of the potential for forks; they are very aware of the business implications, to the extent that those are predictable, depending on who might fork it. And indeed, some forks have started to appear, and that is something that gets factored into their decisions as to how they run their copy of the Android project, which so far most companies still socially accept as the master copy, but they are not required to do that. So that means at least the Android Core code is indeed open source, even though it is not run in the way most open source projects are; although I think actually they have taken contributions from the outside. It's not quite as closed as the tech press indicates it is.
From what I understand of your views, you see it as like the license and these guaranteed freedoms are what makes it open source and that's all that really matters, because you're saying if need be you could always fork it.
I'm not quite saying that that's all that really matters, I'm just saying that it's a main thing... And sure, I would much rather have a project be run by a community, but that potential is always there as long as the open source license is there.
Yeah, the reason why I think collaboration and community is so intertwined is because, again, network effects... And it doesn't really matter whether something can technically be forked if there is actually no ability to change it, so I worry that relying too much on that core definition could act... It's sort of like this great hypothetical about whether that really happens. It's like anyone can create an alternative to Facebook in theory, but no one has successfully created an alternative, because everyone's on Facebook.
[01:08:04.11] Well, but I don't think that network effects in an open source development environment are quite the same... Let's take a couple of examples. GCC got forked years ago. It had a core group of maintainers, and then it had a bunch of revolutionaries who were not happy with how those maintainers were maintaining it. And from the beginning of the project there was no doubt about who this sort of socially accepted master copy was. It was the one maintained by the Free Software Foundation with a technical council that I don't know how they were selected, but I think Richard Stallman was involved in selecting them, and when these revolutionaries grew increasingly unhappy with technical decisions being made and with how contributions were being accepted or not accepted, they had corporate funding, they went off and created EGCS.
EGCS started accepting all those patches that the GCC copy wouldn't take, and eventually it kind of blew past GCC in terms of technical ability to the point where the FSF said "Well, I guess you're kind of where stuff is happening now, so we're just gonna take the next version of EGCS and call that GCC and merge the two, and you won." And it was totally successful, and it happened because the problems were big enough that people were willing to devote resources to forking and solving them. Could the same thing happen with the Linux kernel? Absolutely. If Linus started making bad decisions, or if he started ticking off too many people and enough kernel developers who had the technical plausibility to launch a fork chose to make a fork - yeah, it would succeed, there's no question. But it's just that Linus is running the project well enough that no one needs to do that.
Yeah, I see your point, it is different.
Yeah, but Facebook, on the other hand, that's a whole different kind of network effect. I don't mean to completely argue your point away because I think it's a good one, which is that there are network effects, and it is a lot of effort to fork a popular project that has a successful or at least a cohesive maintenance team and a clear leadership structure.
And you need to have a community that cares enough to fork it. Again, fast-forwarding to some sort of dystopian future that I don't actually know is the future or not, but if open source projects become more about users than about contributors, and people are just sort of using the thing, then it becomes a lot harder to mobilize people to change something. But maybe I'm just sort of making up...
Well, the degree... The ease with which it is possible to motivate people to make a fork or to change something will always be directly proportional to the amount of need for that change. If no one's motivated to change anything, that just means it's not important to someone for something to get changed, so why should we care?
I don't know if people can hate using something... There's a ton of legacy open source projects that are used in everybody's code and it's just really hard to switch out because everyone uses them.
I think the difference though is that there's just not enough people... Yes, people hate using it, but there's not enough people that want to be developing on it that can't, that would then fork it and fix it. And I think that there's a tension here between the people using it and the people that wanna contribute and can't, or wanna fix this and can't. And sometimes it really is too difficult to pull that out. But io.js was a pretty successful fork, and that was in large part because there were a lot of people that wanted to contribute that couldn't, and that wanted to take on ownership of the project and couldn't. So there was a thriving community actually working on it, and then people that were using were like "Oh great, I can come and use this."
[01:12:00.15] Unfortunately I don't know the details of that particular fork, it sounds like you do. If you think there are interesting lessons to draw from it, please explain more.
So I've said this on a couple occasions, but I think the size of the user base is proportional... There's some percentage of that that would contribute, that wanna contribute in some way, and if they're enabled to, you'll have a thriving community. If you don't, you eventually will increase the tension, not just with your overall user base, but also with these people that would be contributing. And eventually, if that tension rises enough, you get a fork.
I think that where that [unintelligible 01:12:37.24] when you look at Android, the users of the Android code base are not the users of Android. The users of the Android code base are companies that manufacture phones, for the most part.
And indeed, they started forking Android.
Yes, exactly. So they have the resources to do that, and their needs do not necessarily line up with the needs of Google. The problem is that their needs are in many cases counter to the users of Android, so it puts Google in a strange place where they're not satisfying the needs of the users of the Android code base, but they are satisfying the needs of the Android end users. If you talk to anybody who uses Android, they're like "Oh, I have to use the newest Google phone that only takes the Google Android, because the ones where manufacturers have forked them are pretty much terrible." Except, I heard Java is really good. I think we're getting into very specific things right now... [laughter]
Well we are, but just to make a quick point about that, in theory, in some sort of long arc of software justice, there should be a link between what those companies are doing with their forks of Android and users' needs, because otherwise they're not gonna sell phones. Of course, I would love all those phones to be running a fully open source operating system, and the reasons why they're not are an interesting topic in their own right, but there should be some connection eventually between those forks and some kind of technical need being solved.
So when you're looking towards the future though, do you see that tension rising, and users starting to come more in conflict with that model, or are you more pessimistic about it and you feel like the surface is going to continue to be dominated the way that it is now?
I wanna give the optimistic answer, but I have no justification for it. Because software is increasingly being tied to hardware devices, and the hackability for a hardware device is so much... Like, the hacktivation energy, the threshold for hacking on something other than a normal laptop or desktop computer is just so much higher that the ratio in any given pool, in any given user base, the number of those users who will be developers, the percentage is gonna be lower. Just to hack on an Android phone - alright, you've gotta set up an Android development environment, you've gotta plug into the phone using a special piece of software that gets you into a development environment, and all of that software might be open source, but it's not like just compiling and running a program and then hacking on the source code and running it again on your laptop. The overhead to get to the point of development is just so much higher. And that's just phones. Do you think hacking on your car is gonna be easier than that? No, it's gonna be a lot harder.
I think unfortunately we have to leave it there with this view of a dystopian future... [laughter]
Always happy to make it darker for you.
...but we'll be back next week. We're gonna continue with Karl and talk about some much happier things, like contributions and governance models...
Oh, I'll turn that dark, too.
Oh, okay. [laughter]
Our transcripts are open source on GitHub. Improvements are welcome. 💚