Overview
On July 18th, 2022, the Apache Spark project published an official statement about a newly discovered vulnerability in Apache Spark's ACL implementation, tracked as CVE-2022-33891. At the time of this article’s publishing, it has not yet received a CVSS score. The finding is credited to Kostya Kortchinsky, a security researcher at Databricks.
Apache Spark describes itself as a "Unified engine for large-scale data analytics". It runs data engineering, data science, and machine learning workloads, and it provides APIs in multiple programming languages.
When ACLs are enabled for the Spark UI (the spark.acls.enable configuration option), an attacker can exploit this vulnerability through impersonation by supplying an arbitrary username that carries a payload. If that input reaches a permission check function in Apache Spark that builds and executes a shell command, the result is arbitrary shell command execution as the user that Spark is running as.
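The class of bug described above can be illustrated with a minimal Python sketch. It is not Spark's code: the function names, the `id -Gn` group lookup, and the username pattern are all assumptions chosen for illustration. It contrasts a command line built by string concatenation with an argument list built from validated input.

```python
import re
from typing import List

# Hypothetical allow-list for usernames (an assumption for this sketch);
# a real deployment may need a different pattern.
_USERNAME_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,63}")


def unsafe_shell_line(username: str) -> str:
    """Anti-pattern: concatenate untrusted input into a shell command line.

    If this string is handed to a shell, metacharacters in `username`
    (such as ';' or backticks) are interpreted by the shell.
    """
    return "id -Gn " + username


def safe_argv(username: str) -> List[str]:
    """Validate the username, then build an argument vector.

    An argument vector passed directly to the OS (no shell involved)
    keeps the username as a single argument, so it is never parsed
    as shell syntax.
    """
    if not _USERNAME_RE.fullmatch(username):
        raise ValueError("rejected username: %r" % username)
    return ["id", "-Gn", username]


if __name__ == "__main__":
    payload = "alice; echo injected"  # benign stand-in for an attacker payload
    print(unsafe_shell_line(payload))
    try:
        safe_argv(payload)
    except ValueError as err:
        print(err)
```

In this sketch, the unsafe variant yields a single string in which the shell would treat everything after the semicolon as a second command, while the safe variant refuses the input before any process is started.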
Affected Versions
This vulnerability affects Apache Spark versions 3.0.3 and earlier, versions 3.1.1 to 3.1.2, and versions 3.2.0 to 3.2.1.
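To check a deployment against the ranges listed above, a small helper along the following lines may be useful. This is a sketch only: the function names are hypothetical, and it assumes version strings begin with a plain major.minor.patch prefix (any suffix such as a build qualifier is ignored).

```python
import re
from typing import Tuple

Version = Tuple[int, int, int]

# Inclusive ranges taken from the affected-versions statement above.
AFFECTED_RANGES = [
    ((0, 0, 0), (3, 0, 3)),  # 3.0.3 and earlier
    ((3, 1, 1), (3, 1, 2)),  # 3.1.1 to 3.1.2
    ((3, 2, 0), (3, 2, 1)),  # 3.2.0 to 3.2.1
]


def parse_version(version: str) -> Version:
    """Extract the leading major.minor.patch triple from a version string."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_affected(version: str) -> bool:
    """Return True if the version falls inside any listed affected range."""
    parsed = parse_version(version)
    return any(low <= parsed <= high for low, high in AFFECTED_RANGES)


if __name__ == "__main__":
    for candidate in ("2.4.8", "3.1.2", "3.2.1", "3.3.0"):
        print(candidate, is_affected(candidate))
```

Tuples compare element by element in Python, which is why the range check can use ordinary comparison operators instead of custom version logic.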
Mitigation at StackPath
StackPath has implemented a rule that detects and blocks requests matching this attack pattern, providing protection against it. If you are running Apache Spark, please upgrade to a fixed release (3.1.3, 3.2.2, or 3.3.0 or later).