As Congress recommends both guardrails and a “full steam ahead” mindset for federal AI deployments, agencies will feel the pressure to quickly deliver AI-based services to citizens. But how can they know that their robots will not cause harm and endanger individual team members, their organizations, and the citizens they serve?
Government agencies have an obligation to provide accurate information to citizens, and a malicious bot can have both legal and moral implications. Last year, for example, the The IRS was cited by the Government Accountability Office for its use of AI to flag tax returns for audit, after the technology was found to possibly have unintentional bias. The IRS had humans aware of this system, but other guidance from the Executive Order and other directives did not appear to have been implemented at the time the potential for bias was discovered.
The IRS incident is a reminder of how important it is for agencies to do everything possible to avoid risk to citizens and protect government and personal data, before the risk becomes a reality. This may seem intimidating, but federal direction And executives highlight what is needed, including understanding AI risks, running DevOps and DevSecOps teams simultaneously, creating an independent red team that ensures the model delivers the highest quality results, and much more, although the details of how to do this aren’t as clear. However, building on already defined best practices for data security and software development as a whole provides a clear path forward on what is needed to ensure AI does not introduce risk.
Keep risk at the forefront
Validating AI can be daunting, as many AI models trade off accuracy and explainability, but it is necessary to mitigate risk. Start by asking questions that quality assurance (QA) would ask for any application. What is the risk of failure and what is the potential impact of this failure? What potential results could your AI system produce? Who could he present them to? What impact could this have?
A risk-based approach to application development is not new, but it needs to be strengthened for AI. Many teams have become accustomed to simply producing or purchasing software that meets requirements. Additionally, DevOps processes integrate quality and security testing into the process from the start. But because AI requires careful consideration of the ways in which the system might “misbehave” relative to its intended use, simply applying current quality assurance processes is not a good approach. AI cannot simply be corrected if it makes a mistake.
Adopt an adversarial mindset
Red teams are routinely deployed to uncover weaknesses in systems and should be used for testing AI, but not in the same way as for traditional application development. An AI red team should be isolated from the day-to-day development team and its successes and failures.
AI red teams in government should include in-house technologists and ethicists, public lab participants, and, ideally, trusted external consultants – none of whom build or profit from the software. Everyone must understand the impact of the AI system on the broader technological infrastructure in place, as well as on citizens.
AI red teams must work with an adversarial mindset to identify harmful or discriminatory outcomes from an AI system as well as unanticipated or undesirable system behaviors. They should also specifically look for limitations or potential risks associated with misuse of the AI system.
Red teams should be freed from the pressures of release schedule and political expectations and report to a senior person, likely the Chief AI Officer (CAIO), who is outside of the development or implementation team. artwork. This will help ensure the effectiveness of the AI model and align with existing safeguards.
Rethink the validation/development ratio
Advances in AI have brought significant improvements in efficiency. A chatbot that might have taken months to build can now be produced in just days.
Don’t assume that AI testing can be done that quickly. Proper validation of AI systems is multi-faceted, and the ratio of testing time to development time will need to be closer to 70-80% for AI, rather than the usual 35-50% for AI systems. business software. Much of this improvement is because requirements are often surfaced during testing, and this cycle becomes more of a “mini-iterative development cycle” rather than a traditional “testing” cycle. DevOps teams should allow time to check training data, privacy violations, bias, error states, penetration attempts, data leaks, and responsibilities, such as the possibility that the results of AI make false or misleading statements. Additionally, red teams need their own time to make the system misbehave.
Establish AI data guidelines
Agencies should establish guidelines for which data will or will not be used to train their AI systems. If using internal data, agencies must maintain a record of the data and inform data generators that the data will be used to train an AI model. Guidelines should be specific to each unique use case.
AI models do not partition data internally like a database does, so data trained from one source can be accessed under another user account. Agencies should consider adopting a “one model per sensitive domain” policy if their organization trains AI models with sensitive data, which likely applies to most government implementations.
Be transparent about AI results
AI developers must communicate what content or recommendations are generated by an AI system. For example, if an agency’s clients interact with a chatbot, they should be informed that the content is generated by AI.
Similarly, if an AI system produces content such as documents or images, the agency might be required to keep a record of these assets so that they can then be validated as “real.” These assets may also require a digital watermark. Although it is not yet an obligation, many agencies are already adopting this best practice.
Agencies must continually monitor, pilot, refine and validate models to ensure they work as intended and provide accurate and unbiased information. By prioritizing independence, integrity and transparency, the models built today will provide foundation agencies with the means they need to improve their operations and serve citizens while maintaining security and safety. private life of the public.
David Colwell is Vice President of Artificial Intelligence and Machine Learning at Tricentis, a provider of automated software testing solutions designed to accelerate application delivery and digital transformation.
Copyright © 2024 Federal Information Network. All rights reserved. This website is not intended for users located in the European Economic Area.