TLP:CLEAR
Best Practices for Securing Data Used to Train & Operate AI Systems
Executive Summary
This Cybersecurity Information Sheet (CSI) provides essential guidance on securing data used in artificial intelligence (AI) and machine learning (ML) systems. It also highlights the importance of data security in ensuring the accuracy and integrity of AI outcomes and outlines potential risks arising from data integrity issues in various stages of AI development and deployment.
This CSI provides a brief overview of the AI system lifecycle and general best practices to secure data used during the development, testing, and operation of AI-based systems. These best practices include techniques such as data encryption, digital signatures, data provenance tracking, secure storage, and trust infrastructure. This CSI also provides an in-depth examination of three significant areas of data security risk in AI systems: the data supply chain, maliciously modified (“poisoned”) data, and data drift. Each section provides a detailed description of the risks and the corresponding best practices to mitigate them.
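To make the digital-signature and integrity-checking practices above concrete, the sketch below shows how a dataset's bytes can be fingerprinted and authenticated before use. This is a minimal illustration, not a method prescribed by this CSI: the dataset contents, key, and function names are assumptions, and a keyed HMAC stands in for a full public-key signature scheme.

```python
import hashlib
import hmac

def fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest usable as a provenance record for a dataset."""
    return hashlib.sha256(data).hexdigest()

def sign(data: bytes, key: bytes) -> str:
    """Produce an HMAC tag so later tampering with the dataset is detectable."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def verify(data: bytes, key: bytes, tag: str) -> bool:
    """Compare the stored tag against a freshly computed one in constant time."""
    return hmac.compare_digest(sign(data, key), tag)

# Hypothetical training data and key (in practice the key would come from a KMS/HSM).
dataset = b"label,feature\n1,0.42\n0,0.13\n"
key = b"example-secret-key"

tag = sign(dataset, key)
assert verify(dataset, key, tag)             # unmodified data passes
assert not verify(dataset + b"x", key, tag)  # any modification is detected
```

In a real deployment, the tag and digest would be recorded alongside the dataset's provenance metadata and re-verified at each stage of the AI lifecycle before the data is used for training or inference.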
This guidance is intended primarily for organizations using AI systems in their operations, with a focus on protecting sensitive, proprietary, or mission-critical data. The principles outlined in this information sheet provide a robust foundation for securing AI data and ensuring the reliability and accuracy of AI-driven outcomes.
This document was authored by the National Security Agency’s Artificial Intelligence Security Center (AISC), the Cybersecurity and Infrastructure Security Agency (CISA), the Federal Bureau of Investigation (FBI), the Australian Signals Directorate’s Australian Cyber Security Centre (ASD’s ACSC), New Zealand’s Government Communications Security Bureau’s National Cyber Security Centre (NCSC-NZ), and the United Kingdom’s National Cyber Security Centre (NCSC-UK). The goals of this guidance are to:
- Raise awareness of the potential risks related to data security in the development, testing, and deployment of AI systems;
- Provide guidance and best practices for securing AI data across various stages of the AI lifecycle, with an in-depth description of the three aforementioned significant areas of data security risks; and
- Establish a strong foundation for data security in AI systems by promoting the adoption of robust data security measures and encouraging proactive risk mitigation strategies.