Explaining predictive models: the Evidence Counterfactual

news2024/12/20 15:05:47

Imagine being targeted with an advertisement for this blog. You’d like to know: why did the AI model predict you’d be interested in the Faculty of Business and Economics’ blog, based on the hundreds of web pages you visited? The answer could be: because you visited www.great-business-faculties.com, www.datascience-in-business.org and www.living-in-antwerp.com. If you had not visited these pages, you would no longer be targeted with this specific ad. This explanation is an imaginary example of an “Evidence Counterfactual”.

Such browsing data, as well as Facebook likes or location data, is often used in targeted advertising systems and beyond. The predictive models are becoming increasingly complex, while end users are becoming more vocal in their wish for transparent and explainable systems.

When they are the subject of such predictive models, businesses and citizens might ask: why am I being rejected for credit? Why am I being targeted with this ad? The European GDPR even provides the “right to obtain meaningful information about the logic involved” to data subjects who are involved in automated decision making.

In this post, you’ll learn more about the Evidence Counterfactual, an increasingly important approach within the “explainable AI” research domain that helps explain the decisions of predictive systems that use Big Data. The Applied Data Mining research group has developed algorithms to provide such explanations and validated them in a variety of business domains.

Behavioral big data

More and more companies are tapping into a large pool of human-generated data, or “behavioral big data”. Think of a person liking Instagram posts, visiting different locations captured by their mobile GPS, searching Google, making online payments, connecting to people on LinkedIn, and so on. All these behavioral traces lead to artificial intelligence (AI) systems with very high predictive performance in a variety of application areas, ranging from finance to risk to marketing.

The goal of these AI systems is to use this data to predict a variable of interest, for example, a person’s personality traits, product interests, creditworthiness, and so on. The model uses a large number of small pieces of evidence to make predictions. Let’s refer to all that data as the “evidence pool”. The pieces of evidence are either “present” or “missing”. All pieces that are present can be used to make predictions.

Tourist or citizen?

To illustrate how behavioral big data can be seen as a “pool of evidence,” imagine a model that uses location data of people in New York City to predict if someone is a tourist or a citizen. Out of all possible places to go to (the “evidence pool”), a person will only visit a relatively small number of places each month.

These are the pieces of evidence that are “present” and are represented by a value of 1 (see Figure 1). All places that are not visited by that person are “missing” and get a corresponding zero value in the data matrix.

In Figure 1, for example, Anna visited 85 places out of the 50,000 possible places used by the model. She visited Times Square and Dumbo, but she did not visit Columbia University, making this a missing piece of evidence. The model decides she’s a tourist.
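The present/missing encoding can be sketched in a few lines of Python. This is a minimal illustration, not the representation used by the actual model: the five-place `EVIDENCE_POOL` and the helper names `to_binary_row` and `present_evidence` are hypothetical, and a real pool would hold tens of thousands of places stored in a sparse format.

```python
# Hypothetical toy evidence pool (the model in the post uses 50,000 places).
EVIDENCE_POOL = [
    "Times Square",
    "Dumbo",
    "Columbia University",
    "Central Park",
    "Local Grocery",
]

def to_binary_row(visited, pool=EVIDENCE_POOL):
    """Encode one person as a 0/1 vector: 1 = evidence present, 0 = evidence missing."""
    visited = set(visited)
    return [1 if place in visited else 0 for place in pool]

def present_evidence(row, pool=EVIDENCE_POOL):
    """List the places whose value in the row is 1."""
    return [place for place, value in zip(pool, row) if value == 1]

# A toy version of Anna: she visited three of the five places.
anna_row = to_binary_row(["Times Square", "Dumbo", "Central Park"])
print(anna_row)
print(present_evidence(anna_row))
```

Columbia University ends up with a zero in Anna's row, which is what the text calls a missing piece of evidence.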

The intuition behind the Evidence Counterfactual

Explaining how predictive systems make decisions based on big data is challenging. Evidence Counterfactuals help us understand the reasons behind individual model predictions. This explanation approach identifies a causal relationship between two events: event A causes event B only if we observe a difference in B after changing A while keeping everything else constant.

The Evidence Counterfactual shows a subset of evidence (event A) that causally drives the model’s decision (event B). We imagine two worlds, identical in every way up until the point where the evidence set is present in one world, but not in the other. The first world is the “factual” world, and the unobserved world is the “counterfactual” world. To help clarify this, consider the following:

IF Anna did not visit Times Square and Dumbo, THEN the model’s prediction changes from tourist to NY citizen.

The pieces of evidence {Times Square, Dumbo} are a subset of Anna’s evidence (all the places she visited) and explain the model’s decision. Removing only Times Square or only Dumbo from her visited locations would not change the predicted class. Both locations need to be “removed” (feature value set to zero) to change the model’s decision.

The “factual world” is the one that’s observed and includes all the places Anna visited. The “counterfactual world” that results in a predicted class change is identical to the factual world in every way, except that the two locations Times Square and Dumbo are missing.
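To make the Anna example concrete, here is a small sketch with a made-up linear scoring function. The weights, the bias and the names `score`, `predict` and `is_counterfactual` are assumptions chosen purely for illustration; the post does not describe the real model's parameters.

```python
# Hypothetical weights: positive values push towards "tourist".
WEIGHTS = {
    "Times Square": 1.0,
    "Dumbo": 1.0,
    "Central Park": 0.5,
    "Columbia University": -2.0,
}
BIAS = -0.75  # assumed intercept

def score(visited):
    """Linear score over the evidence that is present."""
    return BIAS + sum(WEIGHTS.get(place, 0.0) for place in visited)

def predict(visited):
    return "tourist" if score(visited) > 0 else "citizen"

def is_counterfactual(visited, removed):
    """True if setting `removed` to missing changes the predicted class."""
    factual = set(visited)
    counterfactual = factual - set(removed)
    return predict(factual) != predict(counterfactual)

anna = {"Times Square", "Dumbo", "Central Park"}
print(predict(anna))                                       # factual world
print(is_counterfactual(anna, {"Times Square"}))           # one location only
print(is_counterfactual(anna, {"Times Square", "Dumbo"}))  # both locations
```

With these toy numbers, removing either location alone leaves the prediction at tourist, while removing both flips it to citizen, mirroring the IF ... THEN statement above.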

An important advantage of counterfactuals is that they do not require all features that are used in the model (the “evidence pool”) or all the evidence (e.g., all places Anna visited) to be part of the explanation. This is especially interesting in the context of human-generated big data: it allows us to explain predictions using concise and comprehensible explanations.

Computing Evidence Counterfactuals

The huge dimensionality of the behavioral data makes it infeasible to compute counterfactual explanations using a complete search algorithm (this search strategy would check all subsets of evidence as candidate explanations).
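A quick count shows why a complete search does not scale. The sketch below counts the non-empty subsets of an instance's present evidence; the function name `count_candidate_subsets` is hypothetical, and 85 is simply the number of places from the Anna example.

```python
from math import comb

def count_candidate_subsets(n_present, max_size=None):
    """Number of non-empty evidence subsets a complete search would have to check."""
    if max_size is None:
        max_size = n_present
    return sum(comb(n_present, k) for k in range(1, max_size + 1))

# All non-empty subsets of 85 present pieces of evidence.
print(count_candidate_subsets(85))
# Only candidate explanations with at most 2 pieces of evidence.
print(count_candidate_subsets(85, max_size=2))
```

For 85 present pieces of evidence this is 2^85 - 1 subsets, roughly 3.9 × 10^25, each of which would need its own model evaluation.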

Alternatively, a heuristic search algorithm can be used to efficiently find counterfactuals. One existing approach is based on a best-first search and makes use of the model’s scoring function to first consider subsets of evidence that, when removed, reduce the predicted score the most in the direction of the opposite predicted class.

There are at least two weaknesses of this strategy:

  1. for some nonlinear models, removing one feature does not result in a predicted score change, which results in the search algorithm picking a random feature to expand in the first iteration;
  2. the search time is very sensitive to the size of the counterfactual explanation: the more evidence that needs to be removed, the longer it takes the algorithm to find the explanation.
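The best-first idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the toy linear `score` function, its weights, the decision threshold of 0 and the function name `best_first_counterfactual` are all hypothetical.

```python
import heapq
from itertools import count

# Hypothetical linear scorer: score > 0 means "tourist".
WEIGHTS = {"Times Square": 1.0, "Dumbo": 1.0, "Central Park": 0.5}
BIAS = -0.75

def score(visited):
    return BIAS + sum(WEIGHTS.get(place, 0.0) for place in visited)

def best_first_counterfactual(present, score_fn, threshold=0.0, max_expansions=1000):
    """Always expand the removal set with the lowest score so far, one feature at a time."""
    present = frozenset(present)
    tie = count()  # tie-breaker so the heap never has to compare sets
    heap = [(score_fn(present), next(tie), frozenset())]
    seen = {frozenset()}
    for _step in range(max_expansions):
        if not heap:
            return None
        current_score, _tie, removed = heapq.heappop(heap)
        if current_score <= threshold:
            return set(removed)  # predicted class has changed
        for feature in sorted(present - removed):
            candidate = removed | {feature}
            if candidate not in seen:
                seen.add(candidate)
                heapq.heappush(
                    heap, (score_fn(present - candidate), next(tie), candidate)
                )
    return None  # no explanation found within the expansion budget

anna = {"Times Square", "Dumbo", "Central Park"}
explanation = best_first_counterfactual(anna, score)
print(explanation)
```

Both weaknesses are visible in the sketch. If single removals do not move the score, the first pop is decided by the tie-breaker alone. And every extra feature needed in the explanation adds another round of expansions, each with its own model evaluations.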

As an alternative to the best-first search, we proposed a search strategy that chooses features according to their overall importance for the predicted score. The idea is that the more accurate the importance ranking is, the more likely we are to find a counterfactual explanation by removing features in rank order, starting from the top-ranked feature, until the predicted class changes. The hybrid algorithm LIME-Counterfactual (LIME-C) appears to be a favorable alternative to the best-first search because of its good overall effectiveness and efficiency.
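The importance-ranked strategy can be sketched like this. Note the assumptions: the `importance` dictionary simply reuses the weights of a toy linear model as a stand-in for the importance scores an explainer such as LIME would produce, and `ranked_removal_counterfactual` is a hypothetical name. No call to the actual LIME library is made here.

```python
# Hypothetical linear scorer: score > 0 means "tourist".
WEIGHTS = {"Times Square": 1.0, "Dumbo": 1.0, "Central Park": 0.5}
BIAS = -0.75

def score(visited):
    return BIAS + sum(WEIGHTS.get(place, 0.0) for place in visited)

def ranked_removal_counterfactual(present, score_fn, importance, threshold=0.0):
    """Remove features in decreasing order of importance until the class changes."""
    # Sort by importance (descending), then by name so ties are deterministic.
    ranked = sorted(present, key=lambda f: (-importance.get(f, 0.0), f))
    remaining = set(present)
    removed = []
    for feature in ranked:
        if importance.get(feature, 0.0) <= 0:
            break  # remaining features do not support the predicted class
        remaining.discard(feature)
        removed.append(feature)
        if score_fn(remaining) <= threshold:
            return removed
    return None  # ranking exhausted without a class change

anna = {"Times Square", "Dumbo", "Central Park"}
importance = dict(WEIGHTS)  # assumed stand-in for explainer output
explanation = ranked_removal_counterfactual(anna, score, importance)
print(explanation)
```

The loop needs at most one model evaluation per present feature, which is why the quality of the ranking matters so much: a poor ranking produces a longer explanation than necessary, or none at all.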

Other data and models

Evidence Counterfactuals can address various data types, from textual data and tabular data (e.g. standard Excel files) to image data. The issue is to define what it means for evidence to be “present” or “missing.” To compute counterfactuals, we need to define the notion of “removing evidence” or setting evidence to “missing.” In this post, we focused on behavioral big data. For these data, which are very sparse (a lot of zero values in the data matrix), it makes sense to represent evidence that’s “present” by those features (e.g., word or behavior) having a nonzero value.
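For textual data, one possible reading of “removing evidence” is to set a word's count to zero in a bag-of-words representation. The sketch below assumes that reading; the helper names `bag_of_words` and `remove_evidence` and the sample sentence are made up for illustration.

```python
from collections import Counter

def bag_of_words(text):
    """Word counts for a document; words with a nonzero count are 'present'."""
    return Counter(text.lower().split())

def remove_evidence(features, to_remove):
    """Return a copy in which the given words are 'missing' (count of zero)."""
    to_remove = set(to_remove)
    return Counter({word: n for word, n in features.items() if word not in to_remove})

doc = bag_of_words("Cheap flights and cheap hotels")
doc_without_cheap = remove_evidence(doc, ["cheap"])
print(sorted(doc))                 # present evidence in the factual document
print(sorted(doc_without_cheap))   # present evidence after removing "cheap"
```

Other data types need their own definition. For images, for instance, one would first have to decide what a removed region looks like.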

Key takeaways

  • Predictive systems that are trained on human-generated Big Data have high predictive performance, but explaining them is challenging.
  • Explaining data-driven decisions is important for a variety of reasons (increase trust and acceptance, improve models, gain insights, etc.), and for many stakeholders (data scientists, managers, decision subjects, etc.).
  • The Evidence Counterfactual is an explanation approach that can be applied across many relevant applications and highlights a key subset of evidence that led to a particular model decision.
