Measuring the Cost of Open Source Software Innovation on GitHub

Abstract

Open source software (OSS) is software that anyone can study, inspect, modify, and distribute freely under very limited restrictions, generally attribution. While OSS is vital to virtually all aspects of modern society, there is no standard methodology to satisfactorily measure the scope and impact of these intangible assets. Today, GitHub is the world’s largest forge, with over 80 million users and 118 million public repositories. This study presents a framework based on GitHub’s administrative data to discover, profile, and measure the development of OSS. The data include over 7.75 million original, non-deprecated repositories with a machine-detectable OSI-approved license. For each repository, we collect metadata such as commits, license, and information about contributors. Adopting a cost estimation model from software engineering and national accounting methods for the measurement of software, we develop a methodology to generate estimates of investment in OSS that are consistent with measures of software investment in the U.S. national accounts. Our current estimates show that U.S. investment in OSS in 2019 was $36.2 billion.

Type
Publication
BEA Working Paper Series, WP2022–10
J. Bayoán Santiago Calderón
Research Economist

🇵🇷 Economist by training. Data scientist / software developer by accident.