Measuring the Cost and Impact of Open Source Software as Intangible Capital

Abstract

Open Source Software (OSS), as defined by the Open Source Initiative, is computer software whose source code is shared under a license in which the copyright holder grants the rights to study, change, and distribute the software to anyone and for any purpose. OSS is developed, maintained, and extended both within and outside of the private sector, through the contributions of independent developers as well as people from universities, government research institutions, businesses, and nonprofits. Examples include the Apache server software and the R statistical programming software. Despite its ubiquity and extensive use, reliable measures of the scope and impact of OSS developed outside of the business sector are scarce. Activities around OSS development, a vital component of science activity, are not well measured in existing federal statistics on innovation. Many OSS projects are developed and maintained in free repositories, such as GitHub, and the information embedded in these repositories, including the code, contributors, and development activity, is publicly available. In this paper, we use data from GitHub, the largest such platform, with 31 million users and developers worldwide, to obtain information about OSS projects. We collect 5.2 million project repositories, together with metadata such as author, license, commits (approved code edits), and lines of code. We adopt methods used in software engineering to estimate the resource cost associated with creating OSS. We use lines of code as the measure of effort to estimate the time spent on software development, and we calculate the monetary value using the average compensation for computer programmers from Bureau of Labor Statistics wage data and other costs based on national accounts methodologies. Finally, we use network analysis methods developed for bibliometrics and patent analysis to study the impact of these projects.
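
The cost-estimation step described in the abstract (lines of code, then development time, then monetary value) can be illustrated with a minimal sketch. The abstract does not name the specific software engineering model, so this sketch assumes the Basic COCOMO formula in organic mode (effort in person-months = 2.4 * KLOC^1.05) purely for illustration. The function names and the monthly wage figure are hypothetical and are not taken from the paper or from Bureau of Labor Statistics data.

```python
# Sketch: lines of code -> effort (person-months) -> labor cost.
# ASSUMPTION: Basic COCOMO, organic mode (Boehm, 1981). The abstract does not
# state which effort model the authors use; this is only an illustration.
COCOMO_A = 2.4   # organic-mode multiplier
COCOMO_B = 1.05  # organic-mode exponent

# ASSUMPTION: placeholder wage, not an actual Bureau of Labor Statistics figure.
HYPOTHETICAL_MONTHLY_WAGE = 7000.0


def effort_person_months(lines_of_code: int) -> float:
    """Estimate development effort from lines of code (Basic COCOMO, organic)."""
    kloc = lines_of_code / 1000.0  # the model works in thousands of lines
    return COCOMO_A * kloc ** COCOMO_B


def labor_cost(lines_of_code: int, monthly_wage: float) -> float:
    """Convert estimated effort into a labor cost at a given monthly wage."""
    return effort_person_months(lines_of_code) * monthly_wage


if __name__ == "__main__":
    for loc in (1_000, 10_000, 100_000):
        months = effort_person_months(loc)
        cost = labor_cost(loc, HYPOTHETICAL_MONTHLY_WAGE)
        print(f"{loc:>7} lines: {months:8.1f} person-months, cost {cost:12.0f}")
```

Because the exponent is greater than one, estimated effort grows slightly faster than project size. A full estimate along the lines the abstract describes would also add the non-wage costs drawn from national accounts methodologies, which this sketch leaves out.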

Date
2020-05-01 16:05 — 16:30
Location
Virtual