José Bayoán Santiago Calderón is a research economist in the national economic accounts research group at the Bureau of Economic Analysis. Before joining the federal statistical system, Dr. Santiago Calderón had years of experience in the private sector as a research scientist at various companies. Bayoán also held academic appointments with the Biocomplexity Institute and Initiative at the University of Virginia, where he started his career in public service.
His research has centered on improving decision-making, emphasizing the public good (e.g., science policy). His transdisciplinary approach has enabled him to collaborate routinely across disciplines and to develop diverse domain knowledge and a broad methodological toolset. He also participates in various open-source software communities (e.g., JuliaLang) and civic activism (e.g., Code4PR, Mentes Puertorriqueñas en Acción).
I read quite a bit of manga, manhwa & manhua as well as watching anime, donghua and KDramas. Sometimes, I even have time and energy to play some videogames. Check out the relevant profiles: PSN Profiles, MyAnimeList, MyDramaList.
PhD in Economics, 2019
Claremont Graduate University
Scalable Data Analysis
Julia, R, SQL, Git, Linux
Scientific Computing, Software Development
High-Performance Computing, Cloud Computing
Agent-Based Modeling (ABM)
Social Network Analysis
Geographic Information Systems (GIS)
Text Mining, Natural Language Processing (NLP)
I have a two-year-old doggo named Sadaharu
Manga (One Piece, One Punch-Man)
Currently watching some anime like Spy x Family and a bunch of isekai
Currently playing Baldur’s Gate III
My research focuses on the digital economy, intellectual property products (IPPs), and own account procurement. Some of my work includes exploring a range of measurement issues concerning intangible assets such as software (e.g., own account, open-source) and data.
Supervisor: Jon D. Samuels
My strategic & scientific consulting work included projects across multiple therapeutic areas such as rare diseases, metabolic diseases, pediatrics, oncology, and vaccines. I conducted multiple clinical trial evaluations of the safety and efficacy of formulations to support drug development strategies at the company (e.g., study design, stop/go decisions, model development, biomarker exploration, dose selection) and regulatory processes (e.g., type-C meetings).
My work in the product development team was primarily the development of the module for bioequivalence (BE) analysis in the Pumas ecosystem. This included the design, implementation, testing, documentation, maintenance, and coordination with the other components of the ecosystem.
Worked on multiple projects with federal and state agencies helping them meet their missions. These included:
Other work activities include:
Assisted the infrastructure team in helping the team make the best use of UVA computing resources (e.g., high-performance computing) and follow best practices (e.g., version control).
Served as project lead and instructor for the Data Science for the Public Good Young Scholars Program (DSPG).
Supervisor: Sallie Ann Keller, PhD
Assisted with the data collection and analysis of several experiments. Tasks included recruitment, training, and running experiments (human and animal subjects). Methods for the data collection and analysis included computer laboratory experiments, drug studies (e.g., alcohol, testosterone), biometric research such as electroencephalogram (EEG) and electrocardiogram (ECG), eye-tracking, and blood work. Tools used included z-Tree and iMotions-BIOPAC.
Supervisor: Paul Joseph Zak, PhD
Summer intern through the Agents of Change Empowerment and Retention Program (PARACa) fellowship, a Mentes Puertorriqueñas en Acción initiative. Worked on the annual report to the state senate on the status of the K-12 public education system titled “El estado actual de las escuelas públicas en Plan de Mejoramiento en Puerto Rico, año escolar 2010-2011”. Assisted the Coalition for Equity and High Quality Education (CECE, for its Spanish acronym) and members of the school community in choosing and designing the advocacy plan for the year 2011-2012.
Supervisor: David Ortiz
With the recent proliferation of data collection and uses in the digital economy, the understanding and statistical treatment of data stocks and flows is of interest among compilers and users of national economic accounts. In this paper, we measure the value of own-account data stocks and flows for the U.S. business sector by summing the production costs of data-related activities implicit in occupations. Our method augments the traditional sum-of-costs methodology for measuring other own-account intellectual property products in national economic accounts by proxying occupation-level time-use factors using a machine learning model and the text of online job advertisements (Blackburn 2021). In our experimental estimates, we find that annual current-dollar investment in own-account data assets for the U.S. business sector grew from $84 billion in 2002 to $186 billion in 2021, with an average annual growth rate of 4.2 percent. Cumulative current-dollar investment for the period 2002–2021 was $2.6 trillion. In addition to the annual current-dollar investment, we present historical-cost net stocks, real growth rates, and effects on value-added by the industrial sector.
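As a rough illustration of the sum-of-costs idea described in the abstract above, the sketch below adds up labor costs across occupations, weighting each by the share of work time spent on data-related activities. It is a minimal sketch under stated assumptions, not the paper's implementation: the `Occupation` fields, the `own_account_investment` function, the `markup` parameter (standing in for non-labor costs), and all numbers are hypothetical. In the paper the time-use factors are proxied with a machine learning model applied to job advertisement text; here they are simply given as inputs.

```python
from dataclasses import dataclass


@dataclass
class Occupation:
    name: str
    employment: float       # number of workers (hypothetical value)
    annual_wage: float      # average annual wage in dollars (hypothetical value)
    data_time_share: float  # assumed fraction of work time on data activities, in [0, 1]


def own_account_investment(occupations, markup=1.0):
    """Sum data-related labor costs over occupations, scaled by an assumed markup."""
    labor_cost = sum(
        o.employment * o.annual_wage * o.data_time_share for o in occupations
    )
    return markup * labor_cost


# Illustrative inputs only; none of these figures come from the paper.
occupations = [
    Occupation("data scientist", 1_000, 100_000.0, 0.5),
    Occupation("accountant", 2_000, 50_000.0, 0.25),
]
total = own_account_investment(occupations, markup=2.0)
print(total)
```

Keeping the time-use share as a per-occupation input makes it easy to swap in estimates from any source without changing the aggregation step.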
Open source software (OSS) is software that anyone can study, inspect, modify, and distribute freely under very limited restrictions, generally attribution. While OSS is vital to virtually all aspects of modern society, there is no standard methodology to satisfactorily measure the scope and impact of these intangible assets. Today, GitHub is the world’s largest forge with over 80 million users and 118 million public repositories. This study presents a framework based on GitHub’s administrative data to discover, profile, and measure the development of OSS. The data include over 7.75 million original, nondeprecated repositories with a machine-detectable OSI-approved license. For each repository, we collect metadata such as commits, license, and information about contributors. Adopting a cost estimation model from software engineering and national accounting methods for measurement of software, we develop a methodology to generate estimates of investment in OSS that are consistent with measures of software investment in the U.S. national accounts. Our current estimates show that the U.S. investment in OSS in 2019 was $36.2 billion.
Pharmacometric modeling establishes causal quantitative relationships between administered dose, tissue exposures, desired and undesired effects and patient’s risk factors. These models are employed to de-risk drug development and guide precision medicine decisions. However, pharmacometric tools have not been designed to handle today’s heterogeneous big data and complex models. We set out to design a platform that facilitates domain-specific modeling and its integration with modern analytics to foster innovation and readiness in healthcare. Pumas demonstrates estimation methodologies with dramatic performance advances. New ODE solver algorithms, such as coefficient-optimized higher order integrators and new automatic stiffness detecting algorithms which are robust to frequent discontinuities, give rise to a median 4x performance improvement across a wide range of stiff and non-stiff systems seen in pharmacometric applications. These methods combine with JIT compiler techniques, such as statically-sized optimizations and discrete sensitivity analysis via forward-mode automatic differentiation, to further enhance the accuracy and performance of the solving and parameter estimation process. We demonstrate that when all of these techniques are combined with a validated clinical trial dosing mechanism and non-compartmental analysis (NCA) suite, real applications like NLME fitting see a median 81x acceleration while retaining the same accuracy. Meanwhile in areas with less prior software optimization, like optimal experimental design, we see orders of magnitude performance enhancements over competitors. Further, Pumas combines these technical advances with several workflows that are automated and designed to boost productivity of the day-to-day user activity. Together we show a fast pharmacometric modeling framework for next-generation precision analytics.
This paper studies community formation in OSS collaboration networks. While most current work examines the emergence of small-scale OSS projects, our approach draws on a large-scale historical dataset of 1.8 million GitHub users and their repository contributions. OSS collaborations are characterized by small groups of users that work closely together, leading to the presence of communities defined by short cycles in the underlying network structure. To understand the impact of this phenomenon, we apply a pre-processing step that accounts for the cyclic network structure by using Renewal-Nonbacktracking Random Walks (RNBRW) and the strength of pairwise collaborations before implementing the Louvain method to identify communities within the network. Equipping Louvain with RNBRW and the contribution strength provides a more assertive approach for detecting small-scale teams and reveals nontrivial differences in community detection such as users’ tendencies toward preferential attachment to more established collaboration communities. Using this method, we also identify key factors that affect community formation, including the effect of users’ location and primary programming language, which was determined using a comparative method of contribution activities. Overall, this paper offers several promising methodological insights for both open-source software experts and network scholars interested in studying team formation.
Econometrics.jl is a package for econometric analysis. It provides the most common routines for applied econometrics, such as models for continuous, nominal, and ordinal outcomes, longitudinal estimators, and variable absorption, along with convenience functionality such as weights, rank deficiency handling, and robust variance-covariance estimators. This study complements the package by discussing its motivation, placing the contribution within the Julia ecosystem and econometrics software in general, and offering insights on current gaps and ways the Julia ecosystem can evolve.