As these tactics have become more recognizable, current security tools and vigilant developers have improved at detecting them quickly. However, an often overlooked yet potentially dangerous feature remains: Entry points.
This blog post explores how attackers can leverage entry points across multiple programming ecosystems with an emphasis on Pypi to trick victims into running malicious code. While this method doesn’t allow for immediate system compromise like automatic scripts or malicious dependencies, it offers a subtler approach for patient attackers to infiltrate systems, potentially evading standard security measures.
By understanding this lesser-known vector, we can better defend against the evolving landscape of Open source supply chain attacks.
Entry points are a powerful feature of the packaging system that allows developers to expose specific functionality as a cli command without requiring users to know the exact import path or structure of the package.
Entry points serve several purposes which include:
The most popular kind of entry point is console_scripts, which points to a function that you want to be made available as a command-line tool to whoever installs your package.
While primarily designed to enhance modularity and plugin systems, entry points can, if misused, become a vector for malicious actors to embed and execute harmful code. To understand how attackers can leverage Python entry points in their favor, let’s first understand how entry points were originally meant to work.
The location and format of entry point definitions can vary depending on the package format (wheel or source distribution).
For source distributions, entry points are typically defined in a package’s setup configuration. This can be in setup.py, setup.cfg for traditional setups, or pyproject.toml for more modern packaging approaches.
Here’s an example of how entry points might be defined in setup.py:
In a wheel file, which is a built package format, entry points are defined in the entry_points.txt file within the .dist-info directory.
Here’s how the entry_points.txt file might look for the above example:
The syntax for entry points follows this pattern:
In the above examples, my_command is a console script that will be created during installation. Anytime after the package installation, when a user types my_command in their terminal, it will execute the my_function from mypackage.module.
The plugin_name is a custom entry point that could be used by my_package to discover plugins. It points to PluginClass in my_package.plugins.
When a package is installed, these entry points are recorded in the package’s metadata. Other packages or tools can then query this metadata to discover and use the defined entry points.
If an attacker can manipulate a legitimate package’s metadata or convince a user to install a malicious package, they can potentially execute arbitrary code on the user’s system whenever the defined command or plugin is invoked. In the following section, I will provide multiple methods an attacker could use to trick someone into executing their malicious code through entry points.
Command-line interface (CLI) commands are the primary means by which users interact with an operating system through a text-based interface. These commands are interpreted and executed by the shell, which acts as an intermediary between the user and the operating system.
When a user enters a command, the shell follows a specific resolution mechanism to locate and execute the corresponding program. The exact order can vary slightly between different shells. However, the process typically begins by checking in order the directories listed in the PATH environment variable and runs the first matching executable it finds. Users can view their current PATH by entering the command “echo $PATH” in their terminal (the exact command will differ between operating systems), which displays the list of directories the shell searches for executables.
This resolution process ensures that when a user types a command, the appropriate action is taken. Understanding this process is crucial when considering how Python entry points, which can create new CLI commands, might interact with or potentially interfere with existing system commands.
Terminal output on an Ubuntu system showing the ‘ls’ command execution, its PATH location using ‘which ls’, and the system’s PATH environment variable, displaying the ‘ls’ PATH priority.
Malicious actors can exploit Python entry points in several ways to trick users into executing harmful code. We’ll explore a number of tactics, including Command-Jacking, Malicious Plugins and Malicious Extensions.
Impersonating Popular Third-Party Commands
Malicious packages can use entry points to masquerade as widely-used third-party tools. This tactic is particularly effective against developers who frequently use these tools in their workflows.
For instance, an attacker might create a package with a malicious ‘aws’ entry point. When unsuspecting developers who regularly use AWS services install this package and later execute the aws command, the fake ‘aws’ command could exfiltrate their AWS access keys and secrets. This attack could be devastating in CI/CD environments, where AWS credentials are often stored for automated deployments—potentially giving the attacker access to entire cloud infrastructures.
Another example could be a malicious package impersonating the ‘docker’ command, targeting developers working with containerized applications. The fake ‘docker’ command might secretly send images or container specifications to the attacker’s server during builds or deployments. In a microservices architecture, this could expose sensitive service configurations or even lead to the exfiltration of proprietary container images.
Other popular third-party commands that could be potential targets for impersonation include but not limited to:
Each of these commands is widely used in various development environments, making them attractive targets for attackers looking to maximize the impact of their malicious packages.
Impersonating System Commands
By using common system command names as entry points, attackers can impersonate fundamental system utilities. Commands like ‘touch,’ ‘curl,’ ‘cd’, ‘ls’, and ‘mkdir’ just to name a few, could be hijacked, leading to severe security breaches when users attempt to use these fundamental tools.
While this method potentially provides the highest chances of the victim accidentally executing the malicious code, it also carries the highest risk of failure for the attacker. The success of this approach primarily depends on the PATH order. If the directory containing the malicious entry points appears earlier in the PATH than the system directories, the malicious command will be executed instead of the system command. This is more likely to occur in development environments where local package directories are prioritized.
Another thing to keep in mind is that globally installed packages (requiring root/admin privileges) might override system commands for all users, while user-installed packages would only affect that specific user’s environment.
Comparison of Ubuntu terminal outputs before and after installation of a malicious package. An ‘ls’ command is added to the PATH /home/ubuntu/.local/bin/ls, which takes priority over the PATH of the legitimate ls command.
In each of these Command-Jacking tactics, while it’s simpler for an attacker to merely override CLI commands, the chances of remaining undetected are quite low. The moment victims can’t execute a command, they’ll likely become suspicious immediately. However, these attacks can be made much more effective and stealthy through a technique called “command wrapping.” Instead of simply replacing a command, wrapping involves creating an entry point that acts as a wrapper around the original command. Here’s how it works:
This method of command wrapping is particularly dangerous as it executes malicious code without the user’s knowledge while maintaining the appearance of normal operation. Since the legitimate command still runs and its output and behavior are preserved, there’s no immediate sign of compromise, making the attack extremely difficult to detect through normal use. This stealthy approach allows attackers to maintain long-term access and potentially exfiltrate sensitive information without raising suspicion.
However, implementing command wrapping requires additional research by the attacker. They need to understand the correct paths for the targeted commands on different operating systems and account for potential errors in their code. This complexity increases with the diversity of systems the attack targets.
An alternative approach, depending on the command being hijacked, is for the malicious package to not only perform its covert operations but also replicate some or all of the functionality of the original command. Instead of calling the real command, the wrapper simulates its behavior. This method could further decrease suspicion, especially for simpler commands, but it requires more effort from the attacker to accurately mimic the original command’s behavior across various scenarios.
The success of these attacks ultimately depends on the malicious package being installed and its scripts directory being prioritized in the system’s PATH.
Malicious Plugins & Extensions
Another powerful technique for abusing entry points is through the creation of malicious plugins for popular Python tools and frameworks. This approach can be particularly dangerous as it targets the development and testing process itself.
As an example, let’s consider how an attacker might target pytest, a widely-used testing framework in the Python ecosystem. By creating a malicious pytest plugin, an attacker could potentially compromise the integrity of the entire testing process.
Here’s how such an attack could work:
The malicious plugin could then stealthily run malicious code in the background during testing. The malicious plugin could also override pytest’s assertion comparison, causing, for example, all equality checks to pass regardless of their actual values, leading to false positives in test results, allowing buggy or vulnerable code to pass quality checks unnoticed.
In the following video demonstration, we showcase how such a malicious plugin can target pytest’s assertion handling, allowing an attacker to manipulate test results without alerting the developers. In this example, a developer was attempting a simple test scan of a basic calculator package.
Attackers can also target popular development tools, manipulating them to run malicious extensions. Flake8, a widely-used linting tool in the Python ecosystem, is one such example. Since Flake8 uses entry points to discover and load extensions, it becomes a potential target for malicious actors.
An attacker might exploit Flake8 by creating a malicious extension disguised as helpful linting rules. This extension would be defined as an entry point in the package’s setup configuration. For example, the setup file might specify an entry point named ‘MCH’, pointing to a malicious checker class within the package.
The malicious checker’s implementation could include functionality to perform harmful actions on the victim’s system, inject malicious “fixes” into the code, or manipulate linting results to hide or create issues. When a user runs Flake8 on their codebase, this malicious extension would activate, allowing the attacker to execute their harmful code.
This attack is particularly dangerous because linting tools often run on entire codebases, giving the attacker broad access to the source code. Moreover, the attack can be perpetrated through seemingly helpful linting rules, making it less likely to raise suspicion. It could serve as part of a larger supply chain attack to gather intelligence or introduce vulnerabilities into the target’s codebase.
Python wheels (.whl files) have become increasingly prevalent due to their performance benefits in package installation. However, they present a unique challenge for attackers
While both .tar.gz and .whl files may contain a setup.py file, .whl files don’t execute setup.py during installation. This characteristic has traditionally made it more difficult for attackers to achieve arbitrary code execution during the installation process when using .whl files.
However, the entry point attack method we’ve discussed provides a workaround for this limitation. By manipulating entry points, attackers can ensure their code is executed when specific commands are run, even if the package is distributed as a .whl file. This is particularly significant because when developers build a Python package using commands like “pip -m build”, newer pip versions automatically create both .tar.gz and .whl files. Additionally, pip prioritizes delivering the .whl file to users during installation
This shift in package format and installation behavior presents a new opportunity for attackers. Many security tools focus on analyzing execution of preinstall scripts during installation, which are typically associated with .tar.gz files. As a result, they may miss malicious code in packages distributed as .whl files, especially when the malicious behavior is triggered through entry points rather than immediate execution.
While this blog primarily focuses on Python, the exploitation of entry points for malicious purposes extends beyond the Python ecosystem. Through our research, we have confirmed that this type of attack vector exists in several other major ecosystems, including:
npm (JavaScript), Ruby Gems, NuGet (.NET), Dart Pub, and Rust Crates, though the vulnerability may not be limited to these alone.
Understanding how entry points function across various programming languages and package managers is crucial for grasping the widespread nature of this potential security risk and for developing comprehensive defensive strategies.
Entry points, while a powerful and useful feature for legitimate package development, can also be manipulated to deliver malicious code across multiple programming ecosystems
Attackers could exploit this mechanism through various methods, including Command-Jacking and the creation of malicious plugins and extensions for popular development tools.
Moving forward, it’s crucial to develop comprehensive security measures that account for entry point exploitation. By understanding and addressing these risks, we can work towards a more secure Python packaging environment, safeguarding both individual developers and enterprise systems against sophisticated supply chain attacks.
As part of the Checkmarx Supply Chain Security solution, our research team continuously monitors suspicious activities in the open-source software ecosystem. We track and flag “signals” that may indicate foul play, including suspicious entry points, and promptly alert our customers to help protect them from potential threats.
]]>The attack targeted users of Atomic, Trust Wallet, Metamask, Ronin, TronLink, Exodus, and other prominent wallets in the crypto ecosystem. Presenting themselves as utilities for extracting mnemonic phrases and decrypting wallet data, these packages appeared to offer valuable functionality for cryptocurrency users engaged in wallet recovery or management. However, behind the scenes, these packages would fetch malicious code from dependencies to covertly steal sensitive cryptocurrency wallet data, including private keys and mnemonic phrases, potentially granting the attackers full access to victims’ funds.
This strategic use of dependencies allowed the main packages to appear harmless while harboring malicious intent in their underlying components.
The attacker behind this campaign implemented a multi-faceted strategy to disguise their intent and maximize the download count of their packages. This approach combined several deceptive techniques, each designed to exploit different aspects of user trust and package evaluation practices.
Package names were carefully crafted to appeal to developers and users working with various wallet types. Names like “AtomicDecoderss,” “TrustDecoderss,” and “ExodusDecodes” mimicked legitimate tools.
Each package was accompanied by a professionally crafted README file, further enhancing its apparent legitimacy. These READMEs included detailed installation instructions, usage examples, and in one case, even “best practices” for virtual environments.
The attacker went a step further in their deception by including fake package statistics in the README files. At first glance, the packages displayed impressive stats, making them appear to be part of very popular and well-maintained projects. However, these were merely images creating an illusion of popularity and active development, further encouraging users to trust and download the package.
This level of detail and apparent popularity not only made the packages seem genuine but also significantly increased the likelihood of users implementing and running the malicious code.
Functionality was distributed across multiple dependencies within the packages themselves. Six of the malicious packages relied on a dependency called “cipherbcryptors,” which contained the core malicious code. Some packages further obfuscated their functionality by utilizing an additional dependency, “ccl_leveldbases.” This approach served dual purposes.
Firstly, it made the main packages appear more innocuous upon initial inspection, as they themselves contained little to no overtly malicious code.
Secondly, it significantly complicated the analysis process for security researchers and vigilant users. The full scope of each package’s capabilities wasn’t immediately apparent, requiring a deeper dive into the dependency chain to uncover the true nature of the code.
Within the “cipherbcryptors” package, where the heart of the malicious functionality resided, the code was heavily obfuscated. This obfuscation makes it challenging for automated security tools and human reviewers alike to quickly identify the package’s true intent.
Original obfuscated malicious function within the “cipherbcryptors” package
After deobfuscation – malicious function within the “cipherbcryptors” package
Furthermore, the attacker employed an additional layer of security by not hardcoding the address of their command and control server within any of the packages. Instead, they used external resources to retrieve this information dynamically.
This technique not only makes static analysis of the code more difficult but also provides the attackers with the flexibility to change their infrastructure without needing to update the packages themselves.
By combining these various deceptive techniques – from package naming and detailed documentation to false popularity metrics and code obfuscation – the attacker created a sophisticated web of deception. This multi-layered approach significantly increased the chances of the malicious packages being downloaded and used, all while making detection and analysis more challenging.
The execution of this attack diverged from the typical pattern seen in malicious packages. Rather than triggering malicious actions immediately upon installation, these packages lay dormant until specific functions were called. This approach significantly reduced the chances of detection by security tools that scan packages at installation time.
When a user attempted to use one of the advertised functions, the malicious code would activate. The process typically began with the package attempting to access the user’s cryptocurrency wallet data. For different wallet types, this involved targeting specific file locations or data structures known to contain sensitive information.
Once the wallet data was accessed, the malware would attempt to extract critical information such as private keys, mnemonic phrases, and potentially other sensitive data like transaction histories or wallet balances. This stolen data would then be prepared for exfiltration.
The exfiltration process involved encoding the stolen data and sending it to a remote server controlled by the attacker.
The consequences for victims could be severe and far-reaching. The most immediate and obvious impact is the potential for financial losses. With access to private keys and mnemonic phrases, attackers can swiftly drain cryptocurrency wallets. The irreversible nature of blockchain transactions means that once funds are stolen, recovery is nearly impossible. Beyond immediate financial theft, compromised wallet data can lead to ongoing vulnerability, as attackers may monitor and exploit the wallet over time.
The packages’ ability to fetch external code adds another layer of risk. This feature allows attackers to dynamically update and expand their malicious capabilities without updating the package itself. As a result, the impact could extend far beyond the initial theft, potentially introducing new threats or targeting additional assets over time.
This sophisticated supply chain attack on the Python ecosystem serves as another stark reminder of the ever-evolving threats in the open-source world. The cryptocurrency space, in particular, continues to be a prime target, with attackers constantly devising new and innovative methods to compromise valuable digital assets. This incident underscores the critical need for ongoing vigilance of the entire open-source community to remain alert and proactive in identifying and mitigating such threats. The attack’s complexity – from its deceptive packaging to its dynamic malicious capabilities and use of malicious dependencies – highlights the importance of comprehensive security measures and continuous monitoring.
As part of the Checkmarx Supply Chain Security solution, our research team continuously monitors suspicious activities in the open-source software ecosystem. We track and flag “signals” that may indicate foul play and promptly alert our customers to help protect them.
Checkmarx One customers are protected from this attack.
This could be the perfect ending to a cybersecurity film, but it’s even better—it’s the exciting reality of ZAP’s core team joining Checkmarx to deliver the best in Dynamic Application Security Testing (DAST)! ZAP project leaders Simon Bennetts, Ricardo Pereira, and Rick Mitchell are joining Checkmarx to help develop the next generation of our enterprise-grade DAST solution. They will continue to support the open-source project and grow the ZAP community. This transition allows them to focus fully on advancing DAST.
For those unfamiliar, DAST is a technique used to test web applications by detecting vulnerabilities while they are running. Unlike static analysis, DAST doesn’t require access to source code, making it essential in identifying security risks that emerge in live environments. This widely adopted tool helps uncover vulnerabilities that may remain hidden during static analysis.
ZAP (Zed Attack Proxy) is the world’s most widely used web application scanner, having been downloaded millions of times. This popularity is one of the reasons Checkmarx initially integrated it into its DAST solution.
Simon Bennets, Ricardo Pereira and Rick Mitchell are ZAP’s project leaders and have been contributing, guiding and reviewing any contribution to the open source project.
For users of Checkmarx DAST, our commercial offering, this collaboration means they will now benefit from the unmatched expertise of the ZAP core team. Checkmarx will continue to empower organizations to secure their applications from code to cloud. The ZAP team’s deep knowledge, gained from contributing to nearly every aspect of ZAP, adds tremendous value to Checkmarx’s already robust offering.
Checkmarx will gain unique insights and focus, enabling us to drive faster enhancements to Checkmarx DAST. The Checkmarx research team, working alongside ZAP’s leadership, will improve the accuracy of the engine, reducing false positives, and helping customers focus only on relevant findings. The team will also enhance key features like scan rules, automation, and authentication.
For customers already using our DAST solution, the usage will remain seamless. The new features developed with the ZAP core team will be integrated into our existing solution and available “out of the box” in Checkmarx DAST.
By combining Static Application Security Testing (SAST), DAST, and API security testing into a single platform, Checkmarx provides a unified view of vulnerabilities, allowing for comprehensive analysis. This holistic approach to vulnerability management, which includes both static and dynamic analysis, along with API security, ensures better prioritization and more efficient remediation.
Integrating DAST in the Checkmarx One platform helps secure applications by detecting vulnerabilities in live environments and throughout the SDLC. Correlation between the different results adds a layer of prioritization, making remediation more efficient. It’s a key part of our vision for managing application risk, spanning from development to runtime insights (code-to-cloud).
Moving forward, ZAP will be known as “ZAP by Checkmarx” and will continue as a separate, community-driven project under the Apache v2 license. As Simon noted in his blog, “This is by far the biggest investment any one company has made in ZAP and ensures that ZAP will continue to thrive.”
For the open-source community, this collaboration is great news. Checkmarx has a proven track record with open-source projects such as KICS, 2MS, CxFlow, Vorpal, ImageX and many others, contributing significantly to the community.
Personally, as the previous Product Manager for our open-source secret detection solution (2MS) and through my work with our research team identifying malicious packages, I’ve always felt a close connection to the open source community. I’m happy to have another opportunity to contribute.
The ZAP community will benefit from our company’s expertise and resources, enabling the development of key features requested by the community. ZAP by Checkmarx will continue to be a community-driven, open-source DAST solution, while our enhanced Checkmarx DAST will build upon and improve on our existing solution.
This marks a significant step in expanding our dynamic application security testing capabilities and strengths our commitment to Checkmarx DAST, with the knowledge and support of the ZAP project leaders.
If you’d like to learn more, feel free to contact us.
]]>
You are a security professional who has been given the task of managing and mitigating the wide array of vulnerabilities within the open source software your organization uses. In an ideal world, you would have the time, energy, and resources necessary to immediately begin addressing and fixing each and every one of these vulnerabilities immediately – including all the new ones coming in.
However, reality is quite different. No one has all the necessary resources readily at their disposal. The number of vulnerabilities in open source software is insurmountable. The reality is that it would take an unprecedented amount of time, manpower, and financial resources to tackle each and every vulnerability.
The only viable solution is to prioritize. So, how do you effectively prioritize your efforts?
Having the right tools in your vulnerability management toolbox is critical. This is where the Exploit Prediction Scoring System (EPSS) comes in.
Developed by the Forum of Incident Response and Security Teams (FIRST) [1], EPSS [2] provides a data-driven approach to predicting the likelihood of a vulnerability being exploited in the wild within the next 30 days. By leveraging this industry-recognized standard, organizations can make informed decisions and allocate their limited resources much more efficiently.
At its core, EPSS is a probability score ranging from 0 to 1 (0 to 100%). A higher score indicates a greater likelihood of a vulnerability being exploited. But how is this score calculated, and what factors does it consider?
EPSS considers a wide range of information, including:
By analyzing this data[LZ1] [RG2] and monitoring exploitation activity through various methods, such as honeypots, IDS/IPS sensors, and host-based detection, EPSS provides a nuanced understanding of the urgency and impact of vulnerabilities. It uses machine learning to detect patterns and connections between vulnerability data and exploitation activities gathered over time.
EPSS data is refreshed daily and offers two key metrics: probability and percentiles.
Probability represents the likelihood of a vulnerability being exploited in the wild within the next 30 days, while the percentile indicates the percentage of vulnerabilities that have a score equal to, or lower than, a particular vulnerability’s score.
For example, an EPSS probability of just 0.10 (10%) rests at about the 88th percentile – meaning that 88% of all CVEs are scored lower. These percentiles are derived from probabilities and offer insights into how a particular EPSS probability compares to all other scores – all the other CVEs.
The figure below shows the probability distribution of EPSS scores for over 170,000 vulnerabilities as of March 2022. It illustrates that most vulnerabilities score below 10%, indicating a global measure of vulnerability exploitation in the wild.
While EPSS is a powerful tool, it’s essential to understand its limitations and use it in conjunction with other exploitability metrics and threat intelligence. Some key points to keep in mind:
EPSS and CVSS (Common Vulnerability Scoring System) are both measures to help organizations manage vulnerabilities, but they serve different purposes.
EPSS aims to predict the likelihood of a vulnerability being exploited, using a forward-looking and probabilistic approach. In contrast, CVSS evaluates the severity of a vulnerability by assessing its characteristics and potential impact, with a descriptive and deterministic approach.
The graph below shows the comparison between CVSS and EPSS scores for a sample of CVEs.
EPSS should be treated as one aspect of the overall vulnerability management picture, complementing other factors like CVSS.
FIRST compares two strategies, one prioritizing CVEs with CVSS 7 and higher, and another prioritizing CVEs with EPSS of 0.1 and higher.
They compared:
Effort – The proportion of vulnerabilities being prioritized for remediation
Efficiency – How efficiently resources were spent by measuring the percent of prioritized vulnerabilities that were exploited.
Coverage – The percent of exploited vulnerabilities that were prioritized, calculated as the number of exploited vulnerabilities prioritized (TP – correctly prioritized) divided by the total number of exploited vulnerabilities (TP + FN – correctly prioritized + incorrectly delayed).
By prioritizing vulnerabilities with an EPSS of 0.1 and higher, organizations can significantly reduce their remediation effort while improving efficiency.
They also analyzed EPSS and CVSS scores to understand their correlation [3]. The analysis showed that attackers do not exclusively target the most impactful or easiest-to-exploit vulnerabilities, challenging the assumption that they only focus on the most severe ones. Therefore, it’s recommended to use a combination of factors to effectively prioritize which vulnerabilities to patch first.
At Checkmarx, we understand the importance of effective vulnerability management. That’s why Checkmarx Customers using the Checkmarx One Platform can leverage the integrated EPSS scores for quicker and more effective risk triaging and remediation.
EPSS complements our existing array of exploitability metrics which include but are not limited to:
By combining these tools, our customers can achieve a comprehensive view of their vulnerability landscape and take proactive measures to mitigate risks with high potential to be exploited by malicious actors.
The following displays how EPSS, and other exploitability metrics, are integrated into Checkmarx’ product.
Staying ahead of current threats requires a proactive and data-driven approach. EPSS offers organizations a valuable tool to prioritize their vulnerability management efforts effectively. By leveraging EPSS alongside other exploitability metrics, security professionals can make informed decisions, allocate resources efficiently, and strengthen their overall security posture.
At Checkmarx, we are committed to providing our customers with the most comprehensive and cutting-edge vulnerability management solutions and empowering organizations to navigate the complex landscape of open-source vulnerabilities with confidence and precision.
[1] FIRST: https://www.first.org/
[2] EPSS: https://www.first.org/epss/
[3] EPSS: User Guide, Using EPSS and CVSS Together https://www.first.org/epss/user-guide
The attackers have employed a multi-faceted approach to craft an illusion of authenticity around their malicious packages.
One deceptive tactic combines brandjacking and combosquatting—two methods that fall under the broader category of typosquatting. This strategy creates the illusion that their packages are either extensions of or closely related to the legitimate “noblox.js” library.
For example: noblox.js-async, noblox.js-thread, and noblox.js-api.
Since libraries commonly have multiple versions or extensions for specific use cases, mimicking this naming pattern increases the likelihood that developers will install the attackers’ packages, assuming they’re official extensions of “noblox.js.”
Another tactic used is starjacking, a popular method attackers employ to fake their package stats. In this instance, the malicious packages were linked to the GitHub repository URL of the genuine “noblox.js” package. This tactic falsely inflates the perceived popularity and trustworthiness of the malicious packages.
The attackers also used tactics to disguise the malware within the package itself. They meticulously mimicked the structure of the legitimate “noblox.js” but introduced their malicious code in the “postinstall.js” file. They heavily obfuscated this code, even including nonsensical Chinese characters to deter easy analysis.
These combined techniques create a convincing façade of legitimacy, significantly increasing the chances of the malicious packages being installed and executed on developers’ systems. As reported in previous analyses by Socket, Stacklok and Reversinglabs, these tactics have been consistently employed and refined throughout the year-long campaign.
The malicious code exploits NPM’s postinstall hook, ensuring automatic execution when the package is installed. This hook, designed for legitimate setup processes, becomes a gateway for running the obfuscated malware without the user’s knowledge or consent.
At first glance, the obfuscated code appears daunting and impenetrable. However, by simply using an online automated JavaScript deobfuscation tool, we were able to gain significant insight into the malicious code’s operation. This initial deobfuscation revealed the general steps the malware takes to achieve its objectives. Yet, the resulting code still contained confusing elements and required additional cleaning to fully comprehend its functionality. This process of incremental deobfuscation and analysis allowed us to piece together the complete attack flow, with its most notable details listed below.
The malware searches for Discord authentication tokens in multiple locations.
The stolen tokens are then validated to ensure only active ones are exfiltrated.
The malware aggressively undermines the system’s security measures. It first targets Malwarebytes, attempting to stop its service if running. This is followed by a more comprehensive attack on Windows Defender: the script identifies all disk drives and adds them to Windows Defender’s exclusion list. This action effectively blinds Windows Defender to any file on the system. By disabling third-party antivirus and manipulating built-in Windows security, the malware creates an environment where it can operate freely, significantly increasing its potential for damage and persistence.
The malware expands its capabilities by downloading two additional executables from the attacker’s GitHub repository. These files, “cmd.exe” and “Client-built.exe”, are fetched using base64-encoded URLs and saved to the “C:\WindowsApi” directory with randomized names. The malware uses a combination of “nodeapi_” prefix and a unique hexadecimal string, likely to help the malicious files blend in with legitimate system files.
To ensure long-term access, the malware manipulates a Windows registry key to ensure it runs consistently on the infected system.
Specifically, it adds the path of the downloaded “Client-built.exe” to the strategic registry location:
“HKCU\Software\Classes\ms-settings\Shell\Open\command”
By modifying this key, the malware hijacks legitimate Windows functionality. As a result, whenever a user attempts to open the Windows Settings app, the system inadvertently executes the malware instead.
Throughout its execution, the malware collects various types of sensitive information from the infected system. This data is packaged and sent to the attacker’s command and control server using a Discord webhook.
The final stage involves deploying QuasarRAT, a remote access tool that gives the attacker extensive control over the infected system.
The second-stage malware originates from an active GitHub repository: https://github.com/aspdasdksa2/callback. As of this writing, the repository remains accessible and potentially in use for distributing malware through other packages. The repository, owned by user “aspdasdksa2”, contains multiple malicious executables. The repository’s continued existence and its content suggest ongoing malware development and distribution.
Previous malicious npm packages were found to be linked to a different repository of that user for their second stage, but that repository is no longer accessible.
Notably, the attacker maintains a second repository named “noblox-spoof”, which appears to house the latest malicious npm package content, directly referencing the target of this campaign.
The most recent malicious packages impersonating the popular noblox.js library (four packages published by user “bicholassancheck14” – noblox.js-async, noblox.js-threads, noblox.js-thread, and noblox.js-api) have been taken down after we promptly reported them to npm’s security team. While this is a positive development, it’s important to note that the threat is not entirely neutralized.
We strongly advise the developer community to remain vigilant. The attacker’s continued infrastructure presence and persistence pose an ongoing threat. Developers should exercise caution, particularly when working with packages that resemble popular libraries like noblox.js.
This malware campaign, targeting Roblox developers through npm packages, has persisted for over a year. By mimicking the popular “noblox.js” library, attackers published dozens of malicious packages designed to steal sensitive data and compromise systems.
Central to the malware’s effectiveness is its approach to persistence, leveraging the Windows Settings app to ensure sustained access.
The discovery of these malicious NPM packages serves as a stark reminder of the persistent threats facing the developer community. By masquerading as legitimate, helpful libraries, attackers continue to find new ways to exploit trust within the open-source ecosystem.
This campaign underscores the critical importance of thoroughly vetting packages before incorporation into projects. Developers must remain vigilant, verifying the authenticity of packages, especially those resembling popular libraries, to protect themselves and their users from such sophisticated supply chain attacks.
As part of the Checkmarx Supply Chain Security solution, our research team continuously monitors suspicious activities in the open-source software ecosystem. We track and flag “signals” that may indicate foul play and promptly alert our customers to help protect them.
A malicious campaign involving several python packages, most notably the “spl-types” Python package began on June 25th with the upload of an innocuous package to PyPI. This initial version, devoid of malicious content, was intended to establish credibility and avoid immediate detection. It was a wolf in sheep’s clothing, waiting for the right moment to reveal its true nature. The attacker’s patience paid off on July 3rd when they unleashed multiple malicious versions of the package. The heart of the malware was in the init.py file, obfuscated to evade casual inspection. Upon installation, this code would execute automatically, setting in motion a chain of events designed to compromise and control the victim’s systems, while also exfiltrating their data and draining their crypto wallets.
The initial payload acted as a springboard, reaching out to external resources to download and execute additional malicious scripts. These scripts formed the core of the operation, and meticulously scanned the victim’s system for valuable data. The malware cast a wide net, targeting an array of sensitive information – browser data was readily available, as the malware harvested saved passwords, cookies, browsing history, and even stored credit card information. Cryptocurrency wallets, including popular options like Exodus, Electrum, and Monero, were also prime targets. The attack didn’t stop there; it also sought out data from messaging applications such as Telegram, Signal, and Session. In a particularly invasive move, the malware captured screenshots of the victim’s system, providing the attacker with a visual snapshot of the user’s activities. It also scoured the system for files containing specific keywords related to cryptocurrencies and other sensitive information, including GitHub recovery codes and BitLocker keys. The final part of this digital heist involved compressing the stolen data and exfiltrating it to the attacker’s command and control server via several Telegram bots.
Also included in the malicious scripts was a backdoor that granted the attacker remote control over the victim’s system, allowing for ongoing access and potential future exploits.
One of the attacker’s telegram bots receiving screenshots and data from victims machines.
Profiling the Victims & Motives behind the attack
As we continued to follow this attack, a clear pattern emerged around the victims. What united them was not their profession or location, but their involvement with Raydium and Solana, two prominent players in the cryptocurrency space. This commonality suggests that the attacker had a specific target in mind.
The focus on users of these platforms indicates a level of strategic thinking on the part of the attacker. By targeting this specific group, they positioned themselves to potentially intercept or manipulate high-value transactions, pointing to clear financial motives behind the attack.
The attacker was strategic when thinking about how to deliver the malicious package to unsuspecting victims. Their approach was twofold: create a seemingly legitimate package and then lending it credibility through manipulative online engagement.
The first step involved creating a package that would raise minimal suspicion.
In screenshots from compromised systems, we observed victims using, or installing, a package named “Raydium”.
It’s crucial to note that while Raydium is a legitimate blockchain-related platform (an Automated Market Maker (AMM) and liquidity provider built on the Solana blockchain for the Serum Decentralized Exchange (DEX)), it does not have an official Python library.
Exploiting this gap, the attacker, used a separate username to publish a Python package named “Raydium” on PyPI.
The malicious packages of this campaign were dependencies within other seemingly legitimate packages.
This package included the malicious “spl-types” as a dependency, effectively disguising the threat within a seemingly relevant and legitimate package.
To lend credibility to this package and ensure its widespread adoption, the attacker scoured StackExchange, a popular Q&A platform similar to Stack Overflow, for highly viewed threads related to Raydium and Solana development. Upon identifying a suitable and popular thread, the attacker contributed what seemed to be a high-quality, detailed answer. This response, while ostensibly helpful, included references to their malicious “Raydium” package. By choosing a thread with high visibility—garnering thousands of views—the attacker maximized their potential reach.
Developers seeking solutions to Raydium-related questions would likely view the answer as credible and follow the suggestions with minimal suspicion.
This tactic underscores the importance of verifying the authenticity of packages, especially those recommended in forums by unknown individuals.
The impact of this attack goes beyond theory. Behind each compromised system is a real person, and their stories reveal the true cost of such breaches. There were many cases, but in this blog, let’s look at a couple notable ones that highlight different aspects of this attack:
In one of the malware-captured screenshots, we observed clear personal details of a victim. Cross-referencing this information with LinkedIn allowed us to identify the individual, who happened to also be employed at a respected IT company. This discovery prompted us to take the step of reaching out to warn them about the breach. During our subsequent communication, we learned the victim’s entire Solana crypto wallet had been drained shortly after unknowingly downloading the malicious package. This case vividly illustrates how such attacks can have immediate and severe financial consequences for individuals.
Bottom left: Screenshot of the victim’s screen. Top right: Windows Defender scan declaring in Dutch that the system is clear of threats after the scan. Top left: victim’s private key.
Another victim’s experience highlighted a critical weakness in current cybersecurity practices. A screenshot from their system showed a private key clearly visible – a goldmine for any attacker since these keys bypass any password or multi factor authentication active on the account that the private key is for.
In addition to that, and what made this image particularly alarming, was the Windows Virus and Threat Protection screen displayed alongside it, declaring that a scan had just been completed and that the system was clear of threats.
The revelation that Windows Virus and Threat Protection failed to detect the threat during active data exfiltration is particularly concerning. It emphasizes a critical blind spot in traditional security measures when it comes to malicious activities initiated through package managers. This failure of detection occurred not just before or after the attack, but during the very moment the malware was active and stealing data.
This incident provides a real-world example of how Endpoint Detection and Response (EDR) systems can fall short in stopping malicious package activity or sending relevant alerts.
Moreover, even if the malicious package is later taken down from the repository and not publicly declared as malicious, EDR systems typically won’t flag the package as vulnerable if it remains installed on a user’s system. This leaves users potentially exposed to ongoing threats from previously downloaded malicious packages.
PyPi for example, to this day, completely eliminates all traces of a package, leaving no placeholders behind. Although malicious usernames often remain, they appear without any associated malicious packages. Consequently, anyone who encounters these malicious usernames may be unaware of the user’s history of uploading malicious packages.
Just recently, we published a POC on the blind spots of current EDR solutions.
Attack Timeline
The “spl-types” malware incident is more than just another data breach. It serves as a stark reminder of the cost of cybersecurity failures and the ongoing challenges we face in securing the software supply chain.
This incident’s impact extends beyond individual users to enterprises. A single compromised developer can inadvertently introduce vulnerabilities into an entire company’s software ecosystem, potentially affecting the whole corporate network.
This attack serves as a wake-up call for both individuals and organizations to reassess their security strategies. Relying solely on traditional security measures is not sufficient. A more comprehensive approach is needed, one that includes rigorous vetting of third-party packages, continuous monitoring of development environments, and fostering a culture of security awareness among developers.
As part of the Checkmarx Supply Chain Security solution, our research team continuously monitors suspicious activities in the open-source software ecosystem. We track and flag “signals” that may indicate foul play and promptly alert our customers to help protect them.
The fight against such sophisticated threats is ongoing, and as we gather more insights into the attacker’s methods and infrastructure, we will continue to share our findings with the community. Together, we can work towards a safer digital future for all.
The malicious script follows a systematic approach to compromise the victim’s system and exfiltrate sensitive data. It begins by scanning the user’s file system and focusing on two specific locations: the root folder and the DCIM folder. The script searches for files during the scanning process, with extensions such as .py, .php, and .zip files, as well as photos with .png, .jpg, and .jpeg extensions.
Once identified, the script sends their file paths, along with the actual files and photos, to the attackers Telegram bot. This all occurs without the end-user’s knowledge or consent.
The following code snippet demonstrates the core functionality of the malicious script:
The inclusion of hardcoded sensitive information, such as the bot token and chat ID, allowed us to gain further valuable insights into the attacker’s infrastructure and operations.
Further analysis of the Telegram bot to which the exfiltrated data was being sent uncovered additional findings.
The hardcoded “chat_id” and “bot_token” present in the malicious packages allowed us to gain direct access to the attacker’s Telegram bot and monitor its activities. This technique (outlined here) proved to be a valuable tool in understanding the scope and nature of the attack.
This Telegram bot exhibited a significant history of activity. It had records dating back to at least 2022 – long before the malicious packages were released on PyPI, and contained over 90,000 messages.
The messages were primarily in Arabic. Further investigation with GitHub’s “TeleTracker” revealed that the bot operator maintained numerous other bots and was likely based in Iraq.
Initially, the bot appeared to function as a typical underground marketplace, offering various illicit services such as: purchasing Telegram and Instagram views, followers, spam services, and discounted Netflix memberships.
However, upon further examination of the bot’s message history, we found evidence of more sinister activities that appeared to be involved in financial theft. Additional messages suggested they were sent from systems compromised by the harmful Python packages.
The discovery of this elaborate Telegram-based cybercriminal operation underscores the importance of thorough and persistent investigation when researching the true extent of such attacks. Especially since the initial malicious packages served as a mere entry point to a much larger criminal ecosystem.
The discovery of the malicious Python packages on PyPI and the subsequent investigation into the Telegram bot have shed light on a sophisticated and widespread cybercriminal operation. What initially appeared to be an isolated incident of malicious packages turned out to be just the tip of the iceberg, revealing a well-established criminal ecosystem based in Iraq.
This use case describes simple data exfiltration, but think about what could happen if an attacker targeted an enterprise. While the initial compromise might occur on an individual developer’s machine, the implications for enterprises can be significant.
As the fight against malicious actors in the open source ecosystem persists, collaboration and information sharing among the security community will be critical in identifying and stopping these attacks. Through collective effort and proactive measures, we can work towards a safer and more secure open source ecosystem for all. Our research team is continuously investigating this attack to gain additional insights into the attacker’s modus operandi and will share further findings with the community as they emerge.
As part of the Checkmarx Supply Chain Security solution, our research team continuously monitors suspicious activities in the open-source software ecosystem. We track and flag “signals” that may indicate foul play and promptly alert our customers to help protect them.
In this blog, we will talk about some key concepts of JWTs and present CVE-2023-46943: a vulnerability related to JWTs found during one of our research activities.
A JWT consists of three Base64Url encoded parts separated by dots (.):
In its final form, it looks something like “xxxxx.yyyyy.zzzzz”.
The following image taken from jwt.io helps provide a better understanding of the JWT structure:
JWTs are mostly used in stateless applications, where no session is created on the server-side. When a user performs authentication and logs in, a token is generated and returned to the client.
Implicit authentication means that the user is then responsible for securely storing and managing the JWT. So, for subsequent requests to protected resources of the server, the client sends the token in the Authorization header using the Bearer schema.
This looks something like:
Authorization: Bearer <token>
After that, the server verifies the JWT validity and authenticity by checking its digital signature and if it contains the expected claims, such as the user data and allowed permissions. If the token is valid, the server grants access to the protected resources accordingly.
The following diagram depicts this authentication flow:
CVE-2023-46943 was discovered during research conducted on open source software by Checkmarx’ research group. It is a good example that shows how JWT works and how important it is to implement it securely.
This CVE deals with a weak implementation of the JWT’s HMAC secret. Since the security of the token depends heavily on the strength of the HMAC secret used, signing the tokens with a weak secret makes it easy for attackers to crack it. If they are successful, they can use the predictable secret to create valid JWTs, allowing them to access sensitive information and actions within the application.
Let’s see how this plays out:
First, the attacker saves a JWT issued by the app and brute-forces it with common tools.
As the image above demonstrates – this was successful and the HMAC secret used for signing the tokens is literally “secret” (to illustrate how weak this secret is, it can be cracked in less than a second).
Now it is simple for an attacker to craft a valid JWT that claims an admin identity. We have the secret to sign the token and we just need to add the necessary admin data to its payload, in this case, the key “user.isAdmin” equal to “true”.
At this point, given a valid JWT, access is still limited to all the admin-protected resources (because there is more validation in place). Plus, the token must contain additional user data in its payload, such as its uuid. However, it is already possible to query users’ information, which contains the data we need.
And that’s it. The missing admin user details are returned by the above response, enabling an attacker to build a perfectly valid JWT (by adding that data to the JWT payload and resigning the token) and perform any actions in the application as an administrator.
This CVE is just one example of a common JWT issue: Weak Token Secret. This can be mitigated in two ways:
Another potential problem is that hardcoding keys in shared open-source code can lead to a potential backdoor for unauthorized access. As multiple parties can use the application, when keys are hardcoded in the shared code, everyone can use it by default because it works out of the box. This represents a significant security risk, especially if there is nothing else forcing the users to configure the password, because everyone is then vulnerable and susceptible to exploitation. In this case, attackers could have complete unauthorized access to every application implemented.
Other security considerations include:
learn more about Checkmarx’ approach to SCA (Software Composition Analysis) security, and overall application security testing, request a demo of our Checkmarx One Application Security Platform today. Or sign up for a 14-day free trial here.
Imagine downloading a seemingly harmless AI model from a trusted platform like Hugging Face, only to discover that it has opened a backdoor for attackers to control your system. This is the potential risk posed by CVE-2024-34359. This critical vulnerability affects the popular llama_cpp_python package, which is used for integrating AI models with Python. If exploited, it could allow attackers to execute arbitrary code on your system, compromising data and operations. Over 6,000 models on Hugging Face were potentially vulnerable, highlighting the broad and severe impact this could have on businesses, developers, and users alike. This vulnerability underscores the fact that AI platforms and developers have yet to fully catch up to the challenges of supply chain security.
Jinja2: This library is a popular Python tool for template rendering, primarily used for generating HTML. Its ability to execute dynamic content makes it powerful but can pose a significant security risk if not correctly configured to restrict unsafe operations.
`llama_cpp_python`: This package integrates Python’s ease of use with C++’s performance, making it ideal for complex AI models handling large data volumes. However, its use of Jinja2 for processing model metadata without enabling necessary security safeguards exposes it to template injection attacks.
[image: jinja and llama]
CVE-2024-34359 is a critical vulnerability stemming from the misuse of the Jinja2 template engine within the `llama_cpp_python` package. This package, designed to enhance computational efficiency by integrating Python with C++, is used in AI applications. The core issue arises from processing template data without proper security measures such as sandboxing, which Jinja2 supports but was not implemented in this instance. This oversight allows attackers to inject malicious templates that execute arbitrary code on the host system.
The exploitation of this vulnerability can lead to unauthorized actions by attackers, including data theft, system compromise, and disruption of operations. Given the critical role of AI systems in processing sensitive and extensive datasets, the impact of such vulnerabilities can be widespread, affecting everything from individual privacy to organizational operational integrity.
This vulnerability underscores a critical concern: the security of AI systems is deeply intertwined with the security of their supply chains. Dependencies on third-party libraries and frameworks can introduce vulnerabilities that compromise entire systems. The key risks include:
With over 6,000 models on the HuggingFace platform using `gguf` format with templates—thus potentially susceptible to similar vulnerabilities—the breadth of the risk is substantial. This highlights the necessity for increased vigilance and enhanced security measures across all platforms hosting or distributing AI models.
The vulnerability identified has been addressed in version 0.2.72 of the llama-cpp-python package, which includes a fix enhancing sandboxing and input validation measures. Organizations are advised to update to this latest version promptly to secure their systems.
The discovery of CVE-2024-34359 serves as a stark reminder of the vulnerabilities that can arise at the confluence of AI and supply chain security. It highlights the need for vigilant security practices throughout the lifecycle of AI systems and their components. As AI technology becomes more embedded in critical applications, ensuring these systems are built and maintained with a security-first approach is vital to safeguard against potential threats that could undermine the technology’s benefits.
]]>The importance of Appsec is very well recognized in the software and security worlds. Organizations from all verticals must secure their software applications and need to start from the first line of code, and as a part of the software development process.
It is the responsibility of all organizations – big and small, regardless of industry – to educate their staff on best practices in AppSec and cybersecurity.
Cybersecurity training is not a “one size fits all” but requires organizations to provide the right training to their employees, and the training needs to relate to the responsibilities and job functions of the specific responsibilities of the employees.It’s important to recognize that different employees require different training programs. General security awareness programs are not enough for AppSec professionals and software developers. Codebashing was built to meet AppSec professionals and developers where they are.
This joint initiative by Checkmarx and OWASP comes to play in parallel with the recent publication in the US by the National Institute of Standards and Technology (NIST), which published the updated 2.0 version of the Cybersecurity Framework (CSF 2.0), along with the NIS2 regulation (EU) that go into effect in October 2024. Both regulations rely on cybersecurity training as a condition for compliance.
Codebashing is a toolbox that allows companies to tailor, run, and manage a training program based on their specific needs and goals. It also allows developers to consume micro-content on their own timeline and at their own pace, without disrupting their workflow.
OWASP, as a community, have been leaders in the AppSec world since the beginning, and this new initiative will allow members to continue to advance their knowledge and expertise – so they can continue to be leaders for years to come.
By making AppSec training accessible, Checkmarx is truly securing the applications that are driving our world.
Thank you to Andew van der Stock, Executive Director of OWASP, and to all the people at Checkmarx who have been instrumental at making this partnership come to life!
]]>