The evolving nature of AI models makes their outputs ambiguous and hard to predict. Quality assurance methods must accommodate the complexity of AI/ML applications and address gaps in security, privacy, and trust. Let’s take a look at how AI/ML applications are tested and some of the important issues to be aware of.
Verifications & Validations of AI/ML Applications
The standard approach to creating AI models, known as the Cross-Industry Standard Process for Data Mining (CRISP-DM), starts with data collection, preparation, and cleaning. The resulting data is then used iteratively across multiple modeling approaches before the best-performing model is finalized. Testing such a model begins with a subset of the data that has gone through the above process: this test data is fed into the model while multiple combinations of hyperparameters, or variants of the model, are run, and its correctness or accuracy is measured with appropriate metrics. Because the test datasets are randomly drawn from the original dataset, this process closely simulates new, unseen data and indicates how the model will generalize in production.
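The split-then-tune loop described above can be sketched in a few lines. This is a minimal illustration with a toy threshold "model" and made-up data, not a real CRISP-DM pipeline: a random held-out test set is carved from the original data, several hyperparameter variants are scored on the training portion, and the winner is evaluated on unseen data.

```python
import random

# Toy dataset: (feature, label) pairs; a hypothetical stand-in for
# cleaned, prepared data coming out of the CRISP-DM steps.
random.seed(42)
data = [(x, int(x > 0.5)) for x in (random.random() for _ in range(100))]

# Randomly split a held-out test set from the original dataset.
random.shuffle(data)
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

def accuracy(threshold, dataset):
    """Fraction of examples this simple threshold 'model' classifies correctly."""
    return sum((x > threshold) == bool(y) for x, y in dataset) / len(dataset)

# Run several hyperparameter variants on the training data, keep the best,
# then report its accuracy metric on the unseen test split.
best = max([0.3, 0.4, 0.5, 0.6, 0.7], key=lambda t: accuracy(t, train))
print(f"best threshold={best}, test accuracy={accuracy(best, test):.2f}")
```

In a real project the threshold sweep would be replaced by a proper hyperparameter search over actual model families, but the shape of the loop, split, tune on train, measure on held-out data, is the same.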
Quality Assurance Challenges
There are countless issues that must be addressed with data-driven testing and quality assurance of AI/ML applications. Let’s take a look at a few:
Interpretability
The decision-making algorithm of an AI model has long been regarded as a black box. Recently, there has been a clear trend toward making models transparent by explaining how they arrive at a set of results from a given set of inputs. This transparency aids in understanding and improving model performance and helps recipients understand model behavior. It matters even more in areas where decisions are frequently contested, such as insurance or healthcare systems, and some countries now require explanations for decisions made with the help of AI models.
Post hoc analysis is the key to interpretability. By analyzing specific instances misclassified by the AI model, data scientists can understand which parts of the dataset the model actively focuses on when making decisions.
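One lightweight form of this analysis can be sketched as follows. The records, feature names, and labels here are all hypothetical; the idea is simply to isolate the misclassified instances and tally which feature values co-occur with errors, hinting at what the model may be fixating on.

```python
from collections import Counter

# Hypothetical evaluation records: (features, true_label, predicted_label).
records = [
    ({"age_group": "young", "region": "north"}, 1, 1),
    ({"age_group": "young", "region": "south"}, 0, 1),
    ({"age_group": "old",   "region": "south"}, 0, 1),
    ({"age_group": "old",   "region": "north"}, 1, 1),
    ({"age_group": "old",   "region": "south"}, 0, 1),
]

# Post hoc analysis: collect the misclassified instances and count which
# feature values appear most often among the errors.
errors = [f for f, y, pred in records if y != pred]
error_profile = Counter((k, v) for f in errors for k, v in f.items())
print(error_profile.most_common(2))
```

Here every error involves `region == "south"`, which would prompt a closer look at how that slice of the data is represented in training.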
Bias
The decision-making ability of an AI model depends mainly on the quality of the data it is exposed to. There are many cases where bias has seeped in through the input data or the way models are trained, such as Facebook’s sexist ad targeting or Amazon’s AI-based automated recruiting system, which was found to discriminate against women.
The historical data Amazon used for its system was heavily skewed by the dominance of men in the workforce and the tech industry over the past decade. Even large models like OpenAI’s GPT or GitHub Copilot suffer from real-world bias permeating their models, as they are trained on inherently biased global datasets. To remove bias, it is important to understand how the data was selected and which features contributed to the decision. Bias in a model can be detected by identifying the attributes that impact it excessively; once identified, these attributes are tested to see whether they represent the entire dataset.
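A simple way to test whether an attribute impacts outcomes excessively is to compare selection rates across its values. The sketch below uses hypothetical hiring decisions and the "four-fifths rule" heuristic (a ratio below 0.8 flags potential disparate impact); it is an illustration of the idea, not a complete fairness audit.

```python
# Hypothetical hiring outcomes keyed by a sensitive attribute value:
# (group, decision) where decision 1 = selected, 0 = rejected.
decisions = [
    ("men", 1), ("men", 1), ("men", 1), ("men", 0),
    ("women", 1), ("women", 0), ("women", 0), ("women", 0),
]

def selection_rate(group):
    """Fraction of candidates in the group who received a favorable outcome."""
    outcomes = [y for g, y in decisions if g == group]
    return sum(outcomes) / len(outcomes)

# Disparate impact ratio: a value under 0.8 suggests the attribute is
# influencing decisions disproportionately and warrants investigation.
ratio = selection_rate("women") / selection_rate("men")
print(f"disparate impact ratio: {ratio:.2f}")
```

Once an attribute fails a check like this, the next step the article describes, testing whether that attribute's distribution represents the entire dataset, tells you whether the skew comes from the data or the model.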
Safety
According to the Deloitte State of AI in Enterprise Survey, 62 percent of respondents believe cybersecurity risk is an important issue for AI adoption. Forrester Consulting’s Emergence of Offensive AI report found that 88 percent of security industry decision-makers believe offensive AI is on the horizon.
Since AI models are built on the principle of becoming more intelligent with each iteration over real data, attacks on such systems also tend to get smarter. Things are further complicated by the advent of adversarial attacks, which target AI models by modifying a single aspect of the input data, sometimes as little as one pixel in an image. Such small changes can severely disrupt the model, leading to misclassification and erroneous results.
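To make the single-pixel idea concrete, here is a toy sketch of an adversarial perturbation against a hypothetical linear classifier. The weights and "image" are invented, and the greedy one-pixel attack is a simplification of real techniques, but it shows how changing one input value can flip a prediction.

```python
# Toy linear classifier: predicts positive when the weighted sum is > 0.
# The weights and the flattened 4-"pixel" image are hypothetical.
weights = [0.1, -0.2, 0.9, 0.05]
image = [0.2, 0.4, 0.3, 0.1]

def predict(x):
    """Classify an input by the sign of its dot product with the weights."""
    return sum(w * v for w, v in zip(weights, x)) > 0

# Greedy single-pixel attack: pick the pixel the model weighs most heavily
# and push it in the direction that lowers the classification score.
i = max(range(len(weights)), key=lambda j: abs(weights[j]))
adversarial = list(image)
adversarial[i] -= 1.0 if weights[i] > 0 else -1.0

print(predict(image), predict(adversarial))
```

Real attacks on deep networks use gradients rather than raw weights, but the principle is the same: the model's own sensitivity tells the attacker exactly where a tiny change does the most damage.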
The starting point for overcoming such security issues is understanding the types of attacks and the vulnerabilities in the model that hackers can exploit. It is critical to gather literature and domain knowledge on such attacks and build a repository that helps anticipate them in the future. Employing AI-based cybersecurity systems is an effective deterrent: AI-based methods can predict how hackers will act, just as they predict other outcomes.
Privacy
As privacy regulations such as GDPR and CCPA become increasingly common across applications and data systems, AI models are also coming under scrutiny. This matters because AI systems rely heavily on massive amounts of real-time data to make intelligent decisions, data that can reveal a wealth of information about a person’s demographics, behavior, and consumption attributes.
To address privacy concerns, the AI model needs to be examined to assess how it might disclose information. Privacy-conscious AI models take appropriate steps to anonymize or pseudonymize data, or apply state-of-the-art privacy-enhancing techniques. A model can be evaluated for privacy violations by analyzing how an attacker could take training data inferred from the model and effectively use it to gain access to personally identifiable information. The two-step process of discovering derivable training data through an inference attack, then checking that data for the presence of PII, helps identify privacy concerns before deploying a model.
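The two-step probe described above can be sketched in miniature. Everything here is hypothetical: the confidence threshold is a stand-in for a real membership-inference test (models are often suspiciously confident on records they were trained on), and the PII field list is illustrative.

```python
# Step 1: toy membership-inference check. A model that is unusually
# confident on a record may have memorized it from training data.
# The 0.9 threshold is an assumed, illustrative value.
def likely_member(confidence, threshold=0.9):
    return confidence > threshold

# Step 2: scan records recovered via inference for PII-like fields.
PII_FIELDS = {"name", "email", "ssn"}

def contains_pii(record):
    """Flag inferred records that expose personally identifiable information."""
    return bool(PII_FIELDS & record.keys())

suspect = {"name": "J. Doe", "purchase": "laptop"}
print(likely_member(0.99), contains_pii(suspect))
```

A record that passes both checks, likely memorized and containing PII, is exactly the kind of privacy leak this evaluation is meant to surface before deployment.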
Accurate Testing
Accurate testing of AI-based applications requires extending the concept of quality assurance from the scope of performance, reliability, and stability to the new dimensions of explainability, security, bias, and privacy. The international standardization community is also working on this idea by extending the traditional ISO 25010 standard to include these aspects. As AI and ML model development continues, focusing on all of these aspects will result in more robust, continually learning, and compliant models capable of producing more accurate and realistic results.
- Artificial Intelligence
- Cybersecurity
- Device Testing
- IT and Security
- Machine Learning