Sustain Episode #40: How Open Source Maintainers Don't Get Rich

How Open Source Maintainers Don’t Get Rich, with Bogdan Vasilescu. Take a listen.


@jdorfman | @eric | @RichardLitt


[00:14:02] “What we’re observing through this series of studies that we’ve done, and other people have done too, is that people’s behavior changes when you have this salience of information.”

[00:16:30] “On average, people submitting PRs, they are more likely to add tests to their PRs when the stuff is being displayed because then there’s some feedback loop that’s instant and very visible.”

Show Notes

@RichardLitt mentions how many other awesome papers/topics Bogdan wanted to talk about.

Here is what he was referring to (we really need to book him for another episode):

  • I would love to talk about the research that my group at Carnegie Mellon University has been doing on open source sustainability. We have published several papers on related topics:

  • How to Not Get Rich: An Empirical Study of Donations in Open Source

    • We quantify how commonly open-source projects ask for donations, statistically model the characteristics of projects that ask for and receive donations, analyze what the requested funds are needed and used for, and assess whether the received donations achieve the intended outcomes.
    • Justin’s annotations
  • Stress and Burnout in Open Source: Toward Finding, Understanding, and Mitigating Unhealthy Interactions

    • Toxic language in open source can manifest in multiple ways, including hate speech and microaggressions also found elsewhere online (e.g., YouTube), but also through open-source-specific displays of entitlement and urgency related to timing expectations.
    • Toxic GitHub issue discussions (in English) can be identified using a combination of pre-trained detectors of negative sentiment, anger, impoliteness, and toxicity.
  • Why do People Give Up FLOSSing? A Study of Contributor Disengagement in Open Source

    • From our survey:
      • Contributors who work nights and weekends tend to disengage for different reasons than those working more regular hours.
      • The most common reasons for complete disengagement relate to transitions in employment, such as graduating from academia, changing employers, and changing roles.
    • From the data mining:
      • Working predominantly during office hours and experiencing a job transition both increase a contributor’s risk of disengagement.
      • Increased levels of activity and working on more popular projects both decrease a contributor’s risk of disengagement.
  • The Signals that Potential Contributors Look for When Choosing Open-source Projects

    • Our results reveal several key signals used to inform the decision whether or not to contribute to a GitHub project:
      • a README file with thorough contents and clear structure, describing what the project does, how to get started using it, what a new contributor could work on, and what guidelines they should follow;
      • the availability of scaffolding, such as issue and pull request templates, or issue labels;
      • how actively maintained the project is, along multiple dimensions, such as the number of contributors and the recency of commits;
      • the friendliness of the maintainers in issue and pull request discussions;
      • project popularity.
    • Some signals can be considered both attractive and unattractive by different users.
      • The presence of detailed contributing guidelines is seen by some contributors as “off-putting”, as it can set a higher bar to participation and impose too much process overhead.
      • Some signals are important in the decision process but may be unclear to first-time GitHub contributors. For example, our model shows that politeness is an important signal for arbitrary new contributors but not for first-time GitHub contributors.
  • Going Farther Together: The Impact of Social Capital on Sustained Participation in Open Source

    • Contributing to projects where team members are more familiar with each other (from prior collaborations) is in general associated with decreased risk of disengagement.
    • Women are at higher risk of disengagement than men.
    • Higher team diversity along dimensions of programming language expertise is associated with a decreased risk of both short- and long-term disengagement. Moreover, gender and language diversity interact: when team members have more diverse programming language backgrounds, women are less likely than men to disengage early.
  • Gender and tenure diversity in GitHub teams

    • We use statistical modeling to analyze the relationship of gender and tenure diversity to productivity, controlling for team size and other confounds. We find that both gender and tenure diversity have a statistically significant, positive effect on productivity: gender diversity across all team sizes, and tenure diversity for teams larger than 10.
  • Adding Sparkle to Social Coding: An Empirical Study of Repository Badges in the npm Ecosystem

    • Infographic summary
    • Quality assurance badges displaying test coverage percentages correlate with developers increasing the size and, arguably, quality of their test suites.
    • Packages with a dependency-manager badge or an information badge tend to have overall fresher dependencies than packages without.
    • Popular packages with a quality-assurance badge tend to have about 2.2 times more downloads than comparable packages without.
    • Packages with too many badges tend to have fewer downloads.
  • Ecosystem-Level Determinants of Sustained Activity in Open-Source Projects: A Case Study of the PyPI Ecosystem

    • We interview maintainers of PyPI packages; integrate data from PyPI and GitHub, mining repositories and their interdependencies to assemble an ecosystem-level longitudinal data set; identify which packages became dormant; and estimate Cox proportional hazards survival regressions to model the factors affecting a package’s chances of entering this dormant state.
    • We find that the number of connections and the relative position in the dependency network are significant factors affecting the chances of a project becoming dormant.
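
For anyone unfamiliar with the method named in that last paper: a Cox proportional hazards model, in its standard textbook form, describes the risk (hazard) of an event at time t as a shared baseline hazard scaled by a unit's covariates. The notation below is generic and not taken from the paper; here the event would be a package going dormant, and the covariates x_1, ..., x_p would stand in for factors such as the dependency-network measures mentioned above.

```latex
% Generic Cox proportional hazards model (textbook form, assumed notation)
% h_0(t): baseline hazard shared by all packages
% x_1..x_p: covariates of one package; \beta_1..\beta_p: fitted coefficients
h(t \mid x) = h_0(t)\,\exp\left(\beta_1 x_1 + \dots + \beta_p x_p\right)
```

In this form, a positive coefficient means the factor is associated with a higher risk of the event, and exp(β_j) is the hazard ratio for a one-unit increase in x_j with the other covariates held fixed.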