Computer scientists from institutions including Google DeepMind, ETH Zurich, and OpenAI have unveiled an attack that pierces the secrecy of closed AI models, shedding light on their internal workings. Through a clever querying technique, they managed to recover the embedding projection layer of transformer models, the final layer that maps a model's internal hidden states to its output logits, a feat previously considered out of reach given how opaque these systems are.
Their attack, which builds on a model-extraction method first proposed in 2016, uncovers crucial details of these so-called "black box" models through nothing more than API queries. For as little as a few dollars, or up to several thousand depending on the model's size, the researchers were able to expose a model's hidden dimension and the parameters of its final projection layer, an unprecedented look at the structure of these AI systems.
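The core observation can be sketched in a few lines. The sketch below is a simplified illustration, not the researchers' actual code: it assumes an idealized API that returns a full logit vector per query (the real attack reconstructs logits from top-k log-probabilities using logit-bias tricks), and it simulates the target with a hypothetical secret projection matrix. Because every logit vector is the product of a (vocab × d) projection matrix and a d-dimensional hidden state, a stack of enough logit vectors has rank exactly d, and a singular value decomposition reveals it.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulated "black box" (stand-in for a real model API) -----------
# Hypothetical parameters: a secret hidden dimension and a secret
# (vocab x d) embedding projection matrix W_secret, neither of which
# the attacker ever sees directly.
VOCAB_SIZE = 8_192
SECRET_HIDDEN_DIM = 256
W_secret = rng.normal(size=(VOCAB_SIZE, SECRET_HIDDEN_DIM))

def query_logits(prompt_seed: int) -> np.ndarray:
    """Stand-in for one API call: returns the full logit vector for a
    prompt. Internally, logits = W @ h for some hidden state h."""
    h = rng.normal(size=SECRET_HIDDEN_DIM)  # hidden state for this prompt
    return W_secret @ h

# --- Attack: collect logit vectors and inspect their rank ------------
# Each logit vector lies in the d-dimensional column space of W_secret,
# so a stack of n > d of them has numerical rank exactly d.
NUM_QUERIES = 512
Q = np.stack([query_logits(i) for i in range(NUM_QUERIES)])

singular_values = np.linalg.svd(Q, compute_uv=False)
hidden_dim = int(np.sum(singular_values > 1e-6 * singular_values[0]))
print(f"recovered hidden dimension: {hidden_dim}")  # -> 256

# The top-`hidden_dim` right singular vectors of Q span the same
# subspace as the columns of W_secret, i.e. the projection matrix is
# recovered up to an unknown invertible linear transform.
```

Counting the singular values above the noise floor is what lets the attacker read off the hidden dimension without any inside knowledge; recovering the matrix itself only up to a linear transform is, as the researchers note, the best any attack limited to the model's outputs can hope for.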
For instance, they extracted the entire projection matrix of OpenAI's ada and babbage language models for under $20, confirming for the first time that those models have hidden dimensions of 1,024 and 2,048, respectively. Moreover, they estimated that it would cost less than $2,000 to recover the projection matrix of the gpt-3.5-turbo model. While the attack doesn't unveil a full model, only its final layer, it yields crucial information that could be leveraged for further probing and exploitation.
The implications are significant, raising concerns about the security and privacy of AI models, particularly those used in sensitive applications such as national security. Access to critical parameters could ease the replication of proprietary models, threatening both intellectual property and security. In response to these findings, some have urged governments to explore measures restricting the open-access release of advanced AI models and to strengthen the security that safeguards critical intellectual property.
As the AI landscape continues to evolve, addressing the vulnerabilities and risks of these powerful technologies becomes increasingly vital. That part of a closed AI model can be recovered from behind its API underscores the need for robust defenses and regulatory frameworks to ensure AI systems are developed and deployed responsibly and securely.