A summary of Small World with High Risks: A Study of Security Threats in the npm Ecosystem by Markus Zimmermann et al.
Nicholas M. Synovic
- 8 minutes read - 1659 wordsA summary of Small World with High Risks: A Study of Security Threats in the npm Ecosystem
Markus Zimmermann et al. 28th USENIX Security Symposium; 2019 DOI
For the summary of the paper, go to the Summary section of this article.
Table of Contents
First Pass
Read the title, abstract, introduction, section and sub-section headings, and conclusion
Problem
What is the problem addressed in the paper?
This paper analyzes the security risks that the npm
package manager exposes
end users to directly and indirectly through dependency analysis.
Motivation
Why should we care about this paper?
The 2016 left-pad
and 2018 eslint-scope
caused many dependent packages to
become exposed to security vulnerabilities after being taken down and
compromised respectfully.
Additionally (and quoted from the paper):
- Installing an average npm package introduces an implicit trust on 79
third-party packages and 39 maintainers, creating a surprisingly large attack
surface.
- Highly popular packages directly or indirectly influence many other packages
(often more than 100,000) and are thus potential targets for injecting malware.
- Some maintainers have an impact on hundreds of thousands of packages. As a
result, a very small number of compromised maintainer accounts suffices to
inject malware into the majority of all packages.
- The influence of individual packages and maintainers has been continuously
growing over the past few years, aggravating the risk of malware injection
attacks.
- A significant percentage (up to 40%) of all packages depend on code with at
least one publicly known vulnerability.
Category
What type of paper is this work?
This is a security paper with particular focus on security analysis of software supply chains.
Context
What other types of papers is the work related to?
Papers that analyze and quantify the risks to software hosting platforms/ software ecosystems. Additionally, papers that discuss the threat models of software ecosystems are also related.
Contributions
What are the author’s main contributions?
Their main contributions can be found in Motivation. More
generally, they show that npm
is small in that packages are tightly dependent
upon one another, and that a single security vulnerability is enough to
seriously cripple the functionality of the ecosystem. Furthermore, they analyze
the different threat models to npm
, as well as the role of maintainers with
respect to the wider ecosystem. In addition, they propose several different
mitigations for their proposed threat models. These include:
- a vetting process to create “trusted” maintainers
- a vetting process to analyze newly contributed code of specific packages
If both process were to be created for a single package, that package would be considered to have, “perfect first-party security”. And if this was to be extended to all transitive packages of that sole package, then it would be considered to have “perfect third-party security” If both of the considerations were to be met, then the package would be considered to be a “fully secured package”.
Second Pass
A proper read through of the paper is required to answer this
Background Work
What has been done prior to this paper?
Work has been done understanding the usage of “micro packages”, or packages that accomplish a small functionality.
Work has been done to understand the server and client security vulnerabilities in JavaScript.
Work has been done to understand software ecosystems and to raise questions that need to be answered with respect to understanding the evolution of the ecosystems.
Figures, Diagrams, Illustrations, and Graphs
Are the axes properly labeled? Are results shown with error bars, so that conclusions are statistically significant?
All of the figures are clearly made, as well as well captioned.
Clarity
Is the paper well written?
This paper is well written and dense. I do wonder if this paper could have been broken up into potentially two smaller papers. But at the same time, if the author’s were to do that, it might be hard to justify the overall contribution of the work per paper.
Relevant Work
Mark relevant work for review
The following relevant work can be found in the Citations section of this article.
- Revisiting software ecosystems research: A longitudinal literature study [2]
- Challenges in software ecosystems research [3]
- An ecosystem and socio-technical view on software maintenance and evolution [4]
- A look at the dynamics of the JavaScript package ecosystem [5]
- Structure and evolution of package dependency networks [6]
- An empirical comparison of dependency network evolution in seven software packaging ecosystems [7]
- The evolution of the R software ecosystem [8]
- The evolution of project inter-dependencies in a software ecosystem: The case of Apache [9]
- Gentoo package dependencies over time [10]
Methodology
What methodology did the author’s use to validate their contributions?
The author’s used npm
package metadata from 2011 to April of 2018 to generate
several graphs of how packages are related to one another. Following this, they
then utilized graph metrics to measure the potential vulnerabilities npm
is
exposed to, as well as the actual reach of vulnerable packages within npm
.
Additionally, they utilized the package metadata to visualize and understand the
growth of npm
year over year. They utilized these metrics to understand how
potentially dangerous their proposed threat models are to engineers who use
npm
.
Author Assumptions
What assumptions does the author(s) make? Are they justified assumptions?
The author’s assume that all proposed threat models are of the same concern. For some engineers, different models can be of different levels of concern.
Correctness
Do the assumptions seem valid?
Yes, as this would have involved a survey of engineers to understand
Future Directions
My own proposed future directions for the work
While the study of npm
is useful as it is the world’s largest software package
ecosystem, I’d like to apply the metrics implemented in this work to
understanding PTM software ecosystems, such as Hugging Face and PyTorch Hub.
Open Questions
What open questions do I have about the work?
Will the author’s perform a survey to understand if developers feel like the proposed threat models are feasible?
What is the npm
community’s opinion on reducing the number of micro packages
hosted on npm
?
Author Feedback
What feedback would I give to the authors?
This work is very interesting and allows for easy expansion and exploration into other software ecosystems. I suggest to make their graphs publicly available, as well as to submit the graph to services such as Snyk so that they can further analyze the data for security concerns (if they haven’t already).
Summary
A summary of the paper
The paper Small World with High Risks: A Study of Security Threats in the npm
Ecosystem by Markus Zimmermann et al. [1] was a large scale study on npm
packages and package dependencies taken from 2011 to April 2018. This study was
done to understand the various different threat models that exist on npm
as
well as to understand how npm
has evolved. By studying the evolution of npm
,
the author’s were able to analyze the growth of potentially vulnerable software
that can be affected by the proposed threat models. These threat models target
the underlying software package supply chain, and as npm
is considered to be a
small world (packages are tightly coupled to one another often resulting in long
chains), their are high risks involved when a single package is compromised, as
potentially countless more are affected by it.
The author’s main contributions were (taken from the paper):
- Installing an average npm package introduces an implicit trust on 79
third-party packages and 39 maintainers, creating a surprisingly large attack
surface.
- Highly popular packages directly or indirectly influence many other packages
(often more than 100,000) and are thus potential targets for injecting malware.
- Some maintainers have an impact on hundreds of thousands of packages. As a
result, a very small number of compromised maintainer accounts suffices to
inject malware into the majority of all packages.
- The influence of individual packages and maintainers has been continuously
growing over the past few years, aggravating the risk of malware injection
attacks.
- A significant percentage (up to 40%) of all packages depend on code with at
least one publicly known vulnerability.
In addition, they propose several different mitigations for their proposed threat models. These include:
- a vetting process to create “trusted” maintainers
- a vetting process to analyze newly contributed code of specific packages
If both process were to be created for a single package, that package would be considered to have, “perfect first-party security”. And if this was to be extended to all transitive packages of that sole package, then it would be considered to have “perfect third-party security” If both of the considerations were to be met, then the package would be considered to be a “fully secured package”.
The threat models that the author’s identified were:
- Malicious packages
- Exploiting Unmaintained Legacy Code
- Package Takeover
- Account Takeover
- Collusion Attacks
They found that:
- The number of maintainers on
npm
is growing significantly slower than the number of released packages. In other words, maintainers are creating more and more packages and are there by creating a larger and larger threat space for an attacker to execute an Account or Package Takeover attack. - That packages on
npm
have a linear growth of direct dependencies, but a super linear growth of transitive dependencies - That the average package reach is growing at an exponential rate year over year
- That there is growth in implicitly trusting maintainers
- That there is fairly linear growth in the number of unpatched advisories year over year
- That the rate at which published vulnerabilities per 10,000 packages has been rapidly increasing year over year.
Summarization Technique
This paper was summarized using a modified technique proposed by S. Keshav in his work How to Read a Paper [0].