





The past few years have brought much hand-wringing and arm-waving about artificial intelligence (AI), as business people and technologists alike worry about the outsize decisioning power they believe these systems to have.

As a data scientist, I am accustomed to being the voice of reason about the possibilities and limitations of AI. In this article I’ll explain how companies can use blockchain technology for model development governance, a breakthrough to better understand AI, make the model development process auditable, and identify and assign accountability for AI decisioning.

Using blockchain for model development governance

While there is widespread awareness about the need to govern AI, the discussion about how to do so is often nebulous, such as in “How to Build Accountability into Your AI” in Harvard Business Review:

Assess governance structures. A healthy ecosystem for managing AI must include governance processes and structures.... Accountability for AI means looking for solid evidence of governance at the organizational level, including clear goals and objectives for the AI system; well-defined roles, responsibilities, and lines of authority; a multidisciplinary workforce capable of managing AI systems; a broad set of stakeholders; and risk-management processes. Additionally, it is vital to look for system-level governance elements, such as documented technical specifications of the particular AI system, compliance, and stakeholder access to system design and operation information.

This exhaustive list of requirements is enough to make any reader’s eyes glaze over. How exactly does an organization go about obtaining “system-level governance elements” and provide “stakeholder access to system design and operation information”?

Here is actual, actionable advice: Use blockchain technology to ensure that all of the decisions made about an AI or machine learning model are recorded and are auditable. (Full disclosure: In 2018 I filed a US patent application [16/128,359 USA] around using blockchain for model development governance.)

How blockchain creates auditability

Developing an AI decisioning model is a complex process that comprises myriad incremental decisions—the model’s variables, the model design, the training and test data utilized, the selection of features, and so on. All of these decisions could be recorded to the blockchain, which could also provide the ability to view the model’s raw latent features. You could also record to the blockchain all scientists who built different portions of the variable sets, and who participated in model weight creation and model testing.

Model governance and transparency are essential in building ethical AI technology that is auditable. As enabled by blockchain technology, the sum and total record of these decisions provides the visibility required to effectively govern models internally, ascribe accountability, and satisfy the regulators who are definitely coming for your AI.
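
As a rough illustration of the idea above, the sketch below records each model development decision as a block whose hash covers the previous block, so any later edit to the record becomes detectable. It is a minimal, single-process sketch and not the patented method or a real distributed blockchain; the class name DecisionLedger, the field names, and the sample decisions are all assumptions made for illustration.

```python
import hashlib
import json


def block_hash(body: dict) -> str:
    """Hash a block body deterministically (sorted keys give stable JSON)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class DecisionLedger:
    """Append-only, hash-chained log of model decisions (hypothetical sketch)."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first block

    def __init__(self):
        self.blocks = []

    def record(self, actor: str, decision: str, details: dict) -> dict:
        """Append one decision, linking it to the hash of the previous block."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else self.GENESIS
        body = {
            "index": len(self.blocks),
            "actor": actor,
            "decision": decision,
            "details": details,
            "prev_hash": prev_hash,
        }
        block = dict(body, hash=block_hash(body))
        self.blocks.append(block)
        return block

    def verify(self) -> bool:
        """Recompute every hash and link; False means the record was altered."""
        prev_hash = self.GENESIS
        for block in self.blocks:
            body = {k: v for k, v in block.items() if k != "hash"}
            if block["prev_hash"] != prev_hash or block_hash(body) != block["hash"]:
                return False
            prev_hash = block["hash"]
        return True


# Hypothetical decisions; names and values are invented for the example.
ledger = DecisionLedger()
ledger.record("scientist_a", "add_variable", {"name": "txn_velocity_1h"})
ledger.record("manager_b", "approve_variable", {"name": "txn_velocity_1h"})
```

Calling ledger.verify() on the untouched ledger returns True; if any recorded field is changed afterward, the recomputed hash no longer matches and verify() returns False. A production system would add digital signatures and replication across parties, which this sketch omits.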

Before blockchain: Analytic models adrift

Before blockchain became a buzzword, I began implementing a similar analytic model management approach in my data science organization. In 2010 I instituted a development process centered on an analytic tracking document (ATD). This approach detailed model design, variable sets, scientists assigned, training and testing data, and success criteria, breaking down the entire development process into three or more agile sprints.

I recognized that a structured approach with ATDs was required because I’d seen far too many negative outcomes from what had become the norm across much of the financial industry: a lack of validation and accountability. Using banking as an example, a decade ago the typical lifespan of an analytic model looked like this:

· A data scientist builds a model, self-selecting the variables it contains. This leads to scientists creating redundant variables, ignoring validated variable designs, and introducing new errors in model code. In the worst cases, a data scientist might make decisions with variables that could introduce bias, model sensitivity, or target leaks.

· When the same data scientist leaves the organization, his or her development directories are typically either deleted or, if there are a number of different directories, it becomes unclear which directories are responsible for the final model. The bank often doesn’t have the source code for the model or might have just pieces of it. Just looking at code, no one definitively understands how the model was built, the data on which it was built, and the assumptions that factored into the model build.

· Ultimately the bank could be put in a high-risk situation by assuming the model was built properly and will behave well, without really knowing either to be true. The bank is unable to validate the model or understand under what conditions the model will be unreliable or untrustworthy. These realities result in unnecessary risk or in a large number of models being discarded and rebuilt, often repeating the journey above.

A blockchain to codify accountability

My patent-pending invention describes how to codify analytic and machine learning model development using blockchain technology to associate a chain of entities, work tasks, and requirements with a model, including testing and validation checks. It replicates much of the historical approach I used to build models in my organization—the ATD remains essentially a contract between my scientists, managers, and me that describes:

· What the model is

· The model’s objectives

· How we’d build that model, including the prescribed machine learning algorithm

· Areas that the model must improve upon, for example, a 30% improvement in card-not-present (CNP) credit card fraud detection at the transaction level

· The degrees of freedom the scientists have to solve the problem, and those which they don’t

· Re-use of trusted and validated variable and model code snippets

· Training and test data requirements

· Ethical AI procedures and tests

· Robustness and stability tests

· Specific model testing and model validation checklists

· Specific analytic scientists assigned to select the variables, build the models, and train them, and those who will validate code, confirm results, and test the model variables and model output

· Specific success criteria for the model and specific customer segments

· Specific analytic sprints, tasks, and scientists assigned, and formal sprint reviews/approvals of requirements met.

As you can see, the ATD informs a set of requirements that is very specific. The team includes the direct modeling manager, the group of data scientists assigned to the project, and me as owner of the agile model development process. Everyone on the team signs the ATD as a contract once we’ve all negotiated our roles, responsibilities, timelines, and requirements of the build. The ATD becomes the document by which we define the entire agile model development process. It then gets broken into a set of requirements, roles, and tasks, which are put on the blockchain to be formally assigned, worked, validated, and completed.
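
One way to picture the step described above, in which a signed ATD is broken into assignable tasks, is the sketch below. It assumes a much smaller ATD than the real one; the class names, fields, and the rule that tasks are generated only after every team member has signed are illustrative assumptions, not a description of the actual ATD format.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """One ATD requirement with a named builder and a named validator."""
    req_id: str
    description: str
    assignee: str
    validator: str


@dataclass
class AnalyticTrackingDocument:
    """Hypothetical, heavily simplified ATD used only for illustration."""
    model_name: str
    objective: str
    team: set
    requirements: list
    signatures: set = field(default_factory=set)

    def sign(self, person: str) -> None:
        """Record a signature; only team members may sign."""
        if person not in self.team:
            raise ValueError(f"{person} is not on the ATD team")
        self.signatures.add(person)

    def is_executed(self) -> bool:
        """The ATD acts as a contract only once everyone has signed."""
        return self.signatures == self.team

    def to_tasks(self) -> list:
        """Break the signed ATD into task records ready to be put on a ledger."""
        if not self.is_executed():
            raise RuntimeError("ATD must be signed by every team member first")
        return [
            {
                "req_id": r.req_id,
                "description": r.description,
                "assignee": r.assignee,
                "validator": r.validator,
                "status": "assigned",
            }
            for r in self.requirements
        ]


# Hypothetical team and requirements, invented for the example.
atd = AnalyticTrackingDocument(
    model_name="cnp_fraud_model",
    objective="Improve CNP fraud detection at the transaction level",
    team={"modeling_manager", "scientist_a", "process_owner"},
    requirements=[
        Requirement("R1", "Build velocity variables", "scientist_a", "modeling_manager"),
        Requirement("R2", "Run ethical AI tests", "scientist_a", "process_owner"),
    ],
)
for person in sorted(atd.team):
    atd.sign(person)
tasks = atd.to_tasks()
```

Keeping the builder and the validator as separate fields on every requirement mirrors the article's point that each piece of work has both an owner and a reviewer.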

With individuals tracked against each of the requirements, the team then assesses a set of existing collateral, typically pieces of previously validated variable code and models. Some variables have been approved in the past, others will be adjusted, and still others will be new. The blockchain then records each time a variable is used in this model (for example, any code that was adopted from code stores, any code that was newly written, and any changes that were made), along with who did it, which tests were done, which modeling manager approved it, and my sign-off.

A blockchain enables granular tracking

Importantly, the blockchain instantiates a trail of decision making. It shows whether a variable is acceptable, whether it introduces bias into the model, and whether the variable is utilized properly. The blockchain is not just a checklist of positive outcomes; it is a recording of the journey of building these models, in which mistakes, corrections, and improvements are all recorded. For example, outcomes such as failed Ethical AI tests are persisted to the blockchain, as are the remediation steps used to remove bias. We can see the journey at a very granular level:

· The pieces of the model

· The way the model functions

· The way the model responds to expected data, rejects bad data, or responds to a simulated changing environment

All of these items are codified in the context of who worked on the model and who approved each action. At the end of the project we can see, for example, that each of the variables contained in this critical model has been reviewed, put on the blockchain, and approved.
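
The end-of-project check described above can be sketched as a replay of the recorded events. The example below assumes a simplified event format (the keys type, variable, and passed, and the event types shown, are hypothetical) and an assumed policy that a failed test or a code change invalidates earlier approvals; it is an illustration of the audit idea, not the author's implementation.

```python
def unapproved_variables(events, model_variables):
    """Replay recorded events in order and list variables lacking full approval.

    A variable counts as approved only if its latest test passed, a modeling
    manager approved it after that, and the process owner then signed off.
    """
    state = {
        v: {"tests_passed": False, "manager_approved": False, "owner_signed_off": False}
        for v in model_variables
    }
    for event in events:
        status = state.get(event["variable"])
        if status is None:
            continue  # event concerns a variable that is not in this model
        kind = event["type"]
        if kind == "code_change":
            # Assumed policy: any change resets tests and approvals.
            status.update(tests_passed=False, manager_approved=False, owner_signed_off=False)
        elif kind == "test":
            status["tests_passed"] = event["passed"]
            if not event["passed"]:
                # Failures stay in the record and void earlier approvals.
                status.update(manager_approved=False, owner_signed_off=False)
        elif kind == "manager_approval":
            status["manager_approved"] = status["tests_passed"]
        elif kind == "sign_off":
            status["owner_signed_off"] = status["manager_approved"]
    return sorted(v for v, s in state.items() if not all(s.values()))


# Hypothetical event trail: a failed bias test, a remediation, then approvals.
events = [
    {"type": "test", "variable": "txn_velocity_1h", "passed": False},
    {"type": "code_change", "variable": "txn_velocity_1h"},
    {"type": "test", "variable": "txn_velocity_1h", "passed": True},
    {"type": "manager_approval", "variable": "txn_velocity_1h"},
    {"type": "sign_off", "variable": "txn_velocity_1h"},
    {"type": "test", "variable": "merchant_risk", "passed": True},
]
pending = unapproved_variables(events, ["txn_velocity_1h", "merchant_risk"])
```

Here pending contains only merchant_risk, which passed its test but was never approved or signed off, while the failed test and the remediation for txn_velocity_1h remain visible in the event trail.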

This approach provides a high level of confidence that no one has added a variable to the model that performs poorly or introduces some form of bias into the model. It ensures that no one has used an incorrect field in their data specification or changed validated variables without permission and validation. Without the critical review process afforded by the ATD (and now the blockchain) to keep my data science organization auditable, my data scientists could inadvertently introduce a model with errors, particularly as these models and associated algorithms become more and more complex.

Model development journeys that are transparent result in less bias

In sum, overlaying the model development process on the blockchain gives the analytic model its own entity, life, structure, and description. Model development becomes a structured process, at the end of which detailed documentation can be produced to ensure that all elements have gone through the proper review. These elements also can be revisited at any time in the future, providing essential assets for use in model governance. Many of these assets become part of the observability and monitoring requirements when the model is ultimately used, versus having to be discovered or assigned post-development.

In this way, analytic model development and decisioning become auditable, a critical factor in making AI technology, and the data scientists who design it, accountable. That accountability is an essential step in eradicating bias from the analytic models used to make decisions that affect people’s lives.

Scott Zoldi is chief analytics officer at FICO, responsible for the analytic development of FICO’s product and technology solutions. While at FICO, Scott has been responsible for authoring more than 110 analytic patents, with 71 granted and 46 pending. Scott is actively involved in the development of new analytic products and big data analytics applications, many of which leverage new streaming analytic innovations such as adaptive analytics, collaborative profiling, and self-calibrating analytics. Scott is most recently focused on the applications of streaming self-learning analytics for real-time detection of cybersecurity attacks. Scott serves on two boards of directors, Software San Diego and Cyber Center of Excellence. Scott received his PhD in theoretical and computational physics from Duke University.

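The main content of this article is reposted from the original author, Scott Zoldi, and is provided for readers' reference only. If it infringes your intellectual property or other rights, please contact me with evidence and I will remove it.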

