Data Security

Lock, Stock, and Two Neural Networks: Securing Your AI Model

Published: May 4, 2023

As we’ve noted, AI developers are under increasing scrutiny from regulators. While user privacy has been the focus, security is starting to get more attention.  

Most recently, Senator Mark Warner (D-VA)—the chairman of the Senate Intelligence Committee—sent letters to ten of the largest AI companies requesting information about each company’s security approach and practices.  

These requests are distinct from the potential use of AI as a defensive or offensive cyber tool. AI may prove uniquely useful to attackers and defenders attempting to: (i) identify cyber vulnerabilities (automating both sides of an attack); (ii) write malicious code; (iii) detect threats; (iv) employ and adjust to countermeasures; and (v) identify the scope of impacted data.

But what about securing the AI system itself? If you’re an AI developer, what should you be doing to reduce your security risks (and reduce potential legal risk and liability)?

1. Use secure coding standards and baseline security standards

The security practices required for AI systems and machine learning are largely the same as for any other application or algorithm. For example, essential parts of any secure-coding standard, whether or not AI is involved, include:

  • Limiting (and monitoring) third-party access;
  • Scanning and validating code; and
  • Tracking and mitigating risk.

Similarly, regulators expect AI developers to meet other baseline technical standards and corporate governance models to ensure that security is properly managed and resourced. State and federal laws also impose affirmative security obligations on the processing of personally identifiable data. Finally, contracts between organizations generally require adherence to sound information security practices. These generally applicable practices will go a long way toward preserving the confidentiality, integrity, and availability of AI systems.

2. Assess and manage risk using a third-party standard (like NIST)

Many standards could be used, but a good place to start is the NIST AI Risk Management Framework, released earlier this year. NIST—an agency within the U.S. Department of Commerce—develops and publishes industry standards that are often referenced as de facto requirements by courts and regulators. The AI Risk Management Framework incorporates other standards such as the NIST Cybersecurity Framework, but includes many AI-specific steps to improve the security and resilience of AI systems (for example, using watermarking technologies as a deterrent to data and model extraction attacks). Just as important is that risks are properly documented, shared with the appropriate departments and individuals, and mitigated and remediated throughout the lifecycle of the AI system. The AI Risk Management Framework provides documentation and tracking guidance which, if properly implemented, would demonstrate a programmatic commitment to managing security.  
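One AI-specific control the framework calls out, watermarking, is worth illustrating. A common approach in the research literature is to bias generation toward a pseudorandom “green list” of tokens so that a model’s output can later be recognized (for instance, if it turns up in a scraped data set used to clone the model). The sketch below is ours, not the framework’s; the function names, the 50% list size, and the simplified detection statistic are illustrative assumptions.

```python
import hashlib
import random

def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    """Derive a pseudorandom "green list" of tokens, seeded by the previous token."""
    seed = int.from_bytes(hashlib.sha256(prev_token.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return set(rng.sample(vocab, int(len(vocab) * fraction)))

def bias_logits(logits: dict[str, float], prev_token: str, delta: float = 2.0) -> dict[str, float]:
    """At each decoding step, nudge the model toward green-listed tokens."""
    green = green_list(prev_token, sorted(logits))
    return {tok: score + (delta if tok in green else 0.0) for tok, score in logits.items()}

def green_fraction(tokens: list[str], vocab: list[str]) -> float:
    """Share of tokens drawn from their green list: roughly 0.5 for ordinary text,
    noticeably higher for text generated with the bias above."""
    hits = sum(tok in green_list(prev, vocab) for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)
```

A real deployment would operate on token IDs inside the decoding loop and use a proper statistical test for detection, but the structure is the same: a hidden, reproducible bias at generation time and a cheap check at audit time.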

3. Limit what goes into your training data

AI and machine learning tools are only as good as the data on which they have been trained; imperfect or incomplete training sets lead to imperfect or incomplete outputs (“garbage in, garbage out”).

The Biden administration has begun to take action to prevent bias and discrimination from contaminating AI models, but the same concern extends to security. Among the inbound cyber threats to AI are adversarial examples (where attackers craft inputs that cause a model to misinterpret them) and data poisoning (where attackers compromise deep-learning training with intentionally malicious information). While these attacks typically require scale to corrupt a large, otherwise benign training set, the consequences can be significant (even dangerous, such as when AI is used to make decisions in a self-driving car). Input fuzzing, and input validation guidelines designed during development and testing, are established security techniques. Also critical are tools for establishing a high degree of confidence in the training data set, including means of detecting large-scale or automated attempts to break the model by feeding it misinformation.
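What does confidence in the training data look like in practice? At minimum, screening incoming examples before they are allowed to influence the model. The sketch below shows two crude, illustrative checks we have made up for this post, statistical outliers and single-source bursts; production pipelines would layer on provenance tracking, label-agreement checks, and human review.

```python
import numpy as np

def flag_outliers(features: np.ndarray, z_threshold: float = 4.0) -> np.ndarray:
    """Indices of candidate training rows whose features sit far outside the bulk
    of the data (a blunt screen for adversarial or corrupted examples)."""
    mu = features.mean(axis=0)
    sigma = features.std(axis=0) + 1e-9          # avoid division by zero
    z_scores = np.abs((features - mu) / sigma)
    return np.where(z_scores.max(axis=1) > z_threshold)[0]

def flag_burst_sources(sources: list[str], max_share: float = 0.05) -> set[str]:
    """Contributors supplying a suspiciously large share of new examples,
    a crude signal of automated, large-scale poisoning attempts."""
    counts: dict[str, int] = {}
    for source in sources:
        counts[source] = counts.get(source, 0) + 1
    return {s for s, c in counts.items() if c / len(sources) > max_share}
```

Examples flagged by either check would be quarantined for review rather than silently dropped, so the screening itself remains auditable.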

4. Limit what the model provides in outputs

AI systems are designed to respond in useful ways to end-user prompts, but not all users have benevolent intentions (e.g., the numerous subreddits dedicated to “jailbreaking” generative AI models). Bad actors may attempt to exfiltrate information about the AI model, training data sets, and other intellectual property by turning the tool against itself. Models should be trained to withstand this interrogation, walling off sensitive information that could expose how the model works and what it knows. Attackers may also attempt to exfiltrate information from other end users that has been absorbed into the training model. For example, if one company used the AI system to formulate a strategic plan, a competitor with a carefully crafted prompt may be able to observe what the AI system learned from the first user’s inputs.
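Training the model itself to resist interrogation is a modeling problem, but most deployments also add a post-hoc output filter as a backstop. The sketch below, with hypothetical patterns of our own invention, shows the idea: scan each response for material that should stay walled off (internal instructions, credentials, another tenant’s identifiers) before it ever reaches the user.

```python
import re

# Hypothetical denylist: material the deployer never wants echoed back to an end user.
SENSITIVE_PATTERNS = [
    re.compile(r"(?i)system prompt"),           # internal instructions
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),        # AWS-style access key IDs
    re.compile(r"(?i)tenant[-_]?id:\s*\S+"),    # another customer's identifiers
]

def guard_output(response: str) -> str:
    """Refuse responses that appear to leak walled-off material; pass everything else through."""
    if any(pattern.search(response) for pattern in SENSITIVE_PATTERNS):
        return "I can't help with that."
    return response
```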

Because end users can (and will) come up with novel uses for AI systems, ensure that sensitive user information is not regurgitated to future users. If the model cannot be adequately limited in this way, consider input sanitization to prevent users from volunteering information that cannot be protected later.
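On the input side, a minimal sanitization pass might redact obvious personal data and credentials before a prompt is logged or folded back into training data. The patterns below are illustrative placeholders; a real pipeline would rely on a maintained PII-detection library and per-tenant retention rules.

```python
import re

# Illustrative redaction rules; real deployments need broader, locale-aware coverage.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),             # email addresses
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),                  # US SSN format
    (re.compile(r"(?i)\b(?:api[_-]?key|secret)\s*[:=]\s*\S+"), "[CREDENTIAL]"),
]

def sanitize_prompt(prompt: str) -> str:
    """Replace obvious personal data and secrets before the prompt is stored or reused."""
    for pattern, placeholder in REDACTIONS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt
```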

* * *


Senator Warner has made clear that “beyond industry commitments, … some level of regulation is necessary” to ensure that security is built into AI systems. With other leaders in Congress (and in statehouses, agencies, foreign capitals, and basically every other regulatory body around the world) also considering statutory or regulatory restrictions on AI, if you’re an AI developer, now is the time to demonstrate that security is central to how your systems are designed, built, maintained, and operated.