政安晨:【Keras机器学习示例演绎】(二十)—— 综合梯度的模型可解释性

news2025/1/11 10:03:46

目录

综合梯度

设置

综合梯度算法

可视化渐变和集成渐变的辅助类

让我们试一试


政安晨的个人主页政安晨

欢迎 👍点赞✍评论⭐收藏

收录专栏: TensorFlow与Keras机器学习实战

希望政安晨的博客能够对您有所裨益,如有不足之处,欢迎在评论区提出指正!

本文目标:如何获得分类模型的综合梯度。

综合梯度


集成梯度是一种将分类模型的预测归因于其输入特征的技术。它是一种模型可解释性技术:您可以用它来直观显示输入特征与模型预测之间的关系。

综合梯度是计算预测输出与输入特征之间梯度的一种变体。要计算综合梯度,我们需要执行以下步骤:

1. 识别输入和输出。在我们的例子中,输入是图像,输出是模型的最后一层(具有 softmax 激活的密集层)。

2. 在对特定数据点进行预测时,计算哪些特征对神经网络很重要。要确定这些特征,我们需要选择一个基线输入。基线输入可以是黑色图像(所有像素值均设为零)或随机噪音。基线输入的形状必须与我们的输入图像相同,例如 (299, 299, 3)。

3. 按照给定的步数对基线进行插值。步数代表我们对给定输入图像进行梯度逼近时所需的步数。步数是一个超参数。作者建议使用 20 到 1000 步。

4. 预处理这些插值图像并进行前向传递。

5. 获取这些插值图像的梯度。

6. 使用梯形法则对梯度积分进行近似计算。

要深入了解综合梯度以及这种方法的工作原理。

设置

import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage
from IPython.display import Image, display

import tensorflow as tf
import keras
from keras import layers
from keras.applications import xception


# Size of the input image
img_size = (299, 299, 3)

# Load Xception model with imagenet weights
model = xception.Xception(weights="imagenet")

# The local path to our target image
img_path = keras.utils.get_file("elephant.jpg", "https://i.imgur.com/Bvro0YD.png")
display(Image(img_path))

综合梯度算法

def get_img_array(img_path, size=(299, 299)):
    # `img` is a PIL image of size 299x299
    img = keras.utils.load_img(img_path, target_size=size)
    # `array` is a float32 Numpy array of shape (299, 299, 3)
    array = keras.utils.img_to_array(img)
    # We add a dimension to transform our array into a "batch"
    # of size (1, 299, 299, 3)
    array = np.expand_dims(array, axis=0)
    return array


def get_gradients(img_input, top_pred_idx):
    """Computes the gradients of outputs w.r.t input image.

    Args:
        img_input: 4D image tensor
        top_pred_idx: Predicted label for the input image

    Returns:
        Gradients of the predictions w.r.t img_input
    """
    images = tf.cast(img_input, tf.float32)

    with tf.GradientTape() as tape:
        tape.watch(images)
        preds = model(images)
        top_class = preds[:, top_pred_idx]

    grads = tape.gradient(top_class, images)
    return grads


def get_integrated_gradients(img_input, top_pred_idx, baseline=None, num_steps=50):
    """Computes Integrated Gradients for a predicted label.

    Args:
        img_input (ndarray): Original image
        top_pred_idx: Predicted label for the input image
        baseline (ndarray): The baseline image to start with for interpolation
        num_steps: Number of interpolation steps between the baseline
            and the input used in the computation of integrated gradients. These
            steps along determine the integral approximation error. By default,
            num_steps is set to 50.

    Returns:
        Integrated gradients w.r.t input image
    """
    # If baseline is not provided, start with a black image
    # having same size as the input image.
    if baseline is None:
        baseline = np.zeros(img_size).astype(np.float32)
    else:
        baseline = baseline.astype(np.float32)

    # 1. Do interpolation.
    img_input = img_input.astype(np.float32)
    interpolated_image = [
        baseline + (step / num_steps) * (img_input - baseline)
        for step in range(num_steps + 1)
    ]
    interpolated_image = np.array(interpolated_image).astype(np.float32)

    # 2. Preprocess the interpolated images
    interpolated_image = xception.preprocess_input(interpolated_image)

    # 3. Get the gradients
    grads = []
    for i, img in enumerate(interpolated_image):
        img = tf.expand_dims(img, axis=0)
        grad = get_gradients(img, top_pred_idx=top_pred_idx)
        grads.append(grad[0])
    grads = tf.convert_to_tensor(grads, dtype=tf.float32)

    # 4. Approximate the integral using the trapezoidal rule
    grads = (grads[:-1] + grads[1:]) / 2.0
    avg_grads = tf.reduce_mean(grads, axis=0)

    # 5. Calculate integrated gradients and return
    integrated_grads = (img_input - baseline) * avg_grads
    return integrated_grads


def random_baseline_integrated_gradients(
    img_input, top_pred_idx, num_steps=50, num_runs=2
):
    """Generates a number of random baseline images.

    Args:
        img_input (ndarray): 3D image
        top_pred_idx: Predicted label for the input image
        num_steps: Number of interpolation steps between the baseline
            and the input used in the computation of integrated gradients. These
            steps along determine the integral approximation error. By default,
            num_steps is set to 50.
        num_runs: number of baseline images to generate

    Returns:
        Averaged integrated gradients for `num_runs` baseline images
    """
    # 1. List to keep track of Integrated Gradients (IG) for all the images
    integrated_grads = []

    # 2. Get the integrated gradients for all the baselines
    for run in range(num_runs):
        baseline = np.random.random(img_size) * 255
        igrads = get_integrated_gradients(
            img_input=img_input,
            top_pred_idx=top_pred_idx,
            baseline=baseline,
            num_steps=num_steps,
        )
        integrated_grads.append(igrads)

    # 3. Return the average integrated gradients for the image
    integrated_grads = tf.convert_to_tensor(integrated_grads)
    return tf.reduce_mean(integrated_grads, axis=0)

可视化渐变和集成渐变的辅助类

class GradVisualizer:
    """Plot gradients of the outputs w.r.t an input image."""

    def __init__(self, positive_channel=None, negative_channel=None):
        if positive_channel is None:
            self.positive_channel = [0, 255, 0]
        else:
            self.positive_channel = positive_channel

        if negative_channel is None:
            self.negative_channel = [255, 0, 0]
        else:
            self.negative_channel = negative_channel

    def apply_polarity(self, attributions, polarity):
        if polarity == "positive":
            return np.clip(attributions, 0, 1)
        else:
            return np.clip(attributions, -1, 0)

    def apply_linear_transformation(
        self,
        attributions,
        clip_above_percentile=99.9,
        clip_below_percentile=70.0,
        lower_end=0.2,
    ):
        # 1. Get the thresholds
        m = self.get_thresholded_attributions(
            attributions, percentage=100 - clip_above_percentile
        )
        e = self.get_thresholded_attributions(
            attributions, percentage=100 - clip_below_percentile
        )

        # 2. Transform the attributions by a linear function f(x) = a*x + b such that
        # f(m) = 1.0 and f(e) = lower_end
        transformed_attributions = (1 - lower_end) * (np.abs(attributions) - e) / (
            m - e
        ) + lower_end

        # 3. Make sure that the sign of transformed attributions is the same as original attributions
        transformed_attributions *= np.sign(attributions)

        # 4. Only keep values that are bigger than the lower_end
        transformed_attributions *= transformed_attributions >= lower_end

        # 5. Clip values and return
        transformed_attributions = np.clip(transformed_attributions, 0.0, 1.0)
        return transformed_attributions

    def get_thresholded_attributions(self, attributions, percentage):
        if percentage == 100.0:
            return np.min(attributions)

        # 1. Flatten the attributions
        flatten_attr = attributions.flatten()

        # 2. Get the sum of the attributions
        total = np.sum(flatten_attr)

        # 3. Sort the attributions from largest to smallest.
        sorted_attributions = np.sort(np.abs(flatten_attr))[::-1]

        # 4. Calculate the percentage of the total sum that each attribution
        # and the values about it contribute.
        cum_sum = 100.0 * np.cumsum(sorted_attributions) / total

        # 5. Threshold the attributions by the percentage
        indices_to_consider = np.where(cum_sum >= percentage)[0][0]

        # 6. Select the desired attributions and return
        attributions = sorted_attributions[indices_to_consider]
        return attributions

    def binarize(self, attributions, threshold=0.001):
        return attributions > threshold

    def morphological_cleanup_fn(self, attributions, structure=np.ones((4, 4))):
        closed = ndimage.grey_closing(attributions, structure=structure)
        opened = ndimage.grey_opening(closed, structure=structure)
        return opened

    def draw_outlines(
        self,
        attributions,
        percentage=90,
        connected_component_structure=np.ones((3, 3)),
    ):
        # 1. Binarize the attributions.
        attributions = self.binarize(attributions)

        # 2. Fill the gaps
        attributions = ndimage.binary_fill_holes(attributions)

        # 3. Compute connected components
        connected_components, num_comp = ndimage.label(
            attributions, structure=connected_component_structure
        )

        # 4. Sum up the attributions for each component
        total = np.sum(attributions[connected_components > 0])
        component_sums = []
        for comp in range(1, num_comp + 1):
            mask = connected_components == comp
            component_sum = np.sum(attributions[mask])
            component_sums.append((component_sum, mask))

        # 5. Compute the percentage of top components to keep
        sorted_sums_and_masks = sorted(component_sums, key=lambda x: x[0], reverse=True)
        sorted_sums = list(zip(*sorted_sums_and_masks))[0]
        cumulative_sorted_sums = np.cumsum(sorted_sums)
        cutoff_threshold = percentage * total / 100
        cutoff_idx = np.where(cumulative_sorted_sums >= cutoff_threshold)[0][0]
        if cutoff_idx > 2:
            cutoff_idx = 2

        # 6. Set the values for the kept components
        border_mask = np.zeros_like(attributions)
        for i in range(cutoff_idx + 1):
            border_mask[sorted_sums_and_masks[i][1]] = 1

        # 7. Make the mask hollow and show only the border
        eroded_mask = ndimage.binary_erosion(border_mask, iterations=1)
        border_mask[eroded_mask] = 0

        # 8. Return the outlined mask
        return border_mask

    def process_grads(
        self,
        image,
        attributions,
        polarity="positive",
        clip_above_percentile=99.9,
        clip_below_percentile=0,
        morphological_cleanup=False,
        structure=np.ones((3, 3)),
        outlines=False,
        outlines_component_percentage=90,
        overlay=True,
    ):
        if polarity not in ["positive", "negative"]:
            raise ValueError(
                f""" Allowed polarity values: 'positive' or 'negative'
                                    but provided {polarity}"""
            )
        if clip_above_percentile < 0 or clip_above_percentile > 100:
            raise ValueError("clip_above_percentile must be in [0, 100]")

        if clip_below_percentile < 0 or clip_below_percentile > 100:
            raise ValueError("clip_below_percentile must be in [0, 100]")

        # 1. Apply polarity
        if polarity == "positive":
            attributions = self.apply_polarity(attributions, polarity=polarity)
            channel = self.positive_channel
        else:
            attributions = self.apply_polarity(attributions, polarity=polarity)
            attributions = np.abs(attributions)
            channel = self.negative_channel

        # 2. Take average over the channels
        attributions = np.average(attributions, axis=2)

        # 3. Apply linear transformation to the attributions
        attributions = self.apply_linear_transformation(
            attributions,
            clip_above_percentile=clip_above_percentile,
            clip_below_percentile=clip_below_percentile,
            lower_end=0.0,
        )

        # 4. Cleanup
        if morphological_cleanup:
            attributions = self.morphological_cleanup_fn(
                attributions, structure=structure
            )
        # 5. Draw the outlines
        if outlines:
            attributions = self.draw_outlines(
                attributions, percentage=outlines_component_percentage
            )

        # 6. Expand the channel axis and convert to RGB
        attributions = np.expand_dims(attributions, 2) * channel

        # 7.Superimpose on the original image
        if overlay:
            attributions = np.clip((attributions * 0.8 + image), 0, 255)
        return attributions

    def visualize(
        self,
        image,
        gradients,
        integrated_gradients,
        polarity="positive",
        clip_above_percentile=99.9,
        clip_below_percentile=0,
        morphological_cleanup=False,
        structure=np.ones((3, 3)),
        outlines=False,
        outlines_component_percentage=90,
        overlay=True,
        figsize=(15, 8),
    ):
        # 1. Make two copies of the original image
        img1 = np.copy(image)
        img2 = np.copy(image)

        # 2. Process the normal gradients
        grads_attr = self.process_grads(
            image=img1,
            attributions=gradients,
            polarity=polarity,
            clip_above_percentile=clip_above_percentile,
            clip_below_percentile=clip_below_percentile,
            morphological_cleanup=morphological_cleanup,
            structure=structure,
            outlines=outlines,
            outlines_component_percentage=outlines_component_percentage,
            overlay=overlay,
        )

        # 3. Process the integrated gradients
        igrads_attr = self.process_grads(
            image=img2,
            attributions=integrated_gradients,
            polarity=polarity,
            clip_above_percentile=clip_above_percentile,
            clip_below_percentile=clip_below_percentile,
            morphological_cleanup=morphological_cleanup,
            structure=structure,
            outlines=outlines,
            outlines_component_percentage=outlines_component_percentage,
            overlay=overlay,
        )

        _, ax = plt.subplots(1, 3, figsize=figsize)
        ax[0].imshow(image)
        ax[1].imshow(grads_attr.astype(np.uint8))
        ax[2].imshow(igrads_attr.astype(np.uint8))

        ax[0].set_title("Input")
        ax[1].set_title("Normal gradients")
        ax[2].set_title("Integrated gradients")
        plt.show()

让我们试一试

# 1. Convert the image to numpy array
img = get_img_array(img_path)

# 2. Keep a copy of the original image
orig_img = np.copy(img[0]).astype(np.uint8)

# 3. Preprocess the image
img_processed = tf.cast(xception.preprocess_input(img), dtype=tf.float32)

# 4. Get model predictions
preds = model.predict(img_processed)
top_pred_idx = tf.argmax(preds[0])
print("Predicted:", top_pred_idx, xception.decode_predictions(preds, top=1)[0])

# 5. Get the gradients of the last layer for the predicted label
grads = get_gradients(img_processed, top_pred_idx=top_pred_idx)

# 6. Get the integrated gradients
igrads = random_baseline_integrated_gradients(
    np.copy(orig_img), top_pred_idx=top_pred_idx, num_steps=50, num_runs=2
)

# 7. Process the gradients and plot
vis = GradVisualizer()
vis.visualize(
    image=orig_img,
    gradients=grads[0].numpy(),
    integrated_gradients=igrads.numpy(),
    clip_above_percentile=99,
    clip_below_percentile=0,
)

vis.visualize(
    image=orig_img,
    gradients=grads[0].numpy(),
    integrated_gradients=igrads.numpy(),
    clip_above_percentile=95,
    clip_below_percentile=28,
    morphological_cleanup=True,
    outlines=True,
)

演绎展示:
 

 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1699486705.534012   86541 device_compiler.h:187] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.

Predicted: tf.Tensor(386, shape=(), dtype=int64) [('n02504458', 'African_elephant', 0.8871446)]


本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/1629532.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

debian配置BIND DNS服务器

前言 局域网内有很多台主机&#xff0c;IP难以记忆。 而修改hosts文件又难以做到配置共享和统一&#xff0c;需要一台内网的DNS服务器。 效果展示 这里添加了一个域名hello.dog&#xff0c;将其指向为192.168.1.100。 同时&#xff0c;外网的域名不会受到影响&#xff0c;…

实验8 NAT配置

实验8 NAT配置 一、 原理描述二、 实验目的三、 实验内容1.实验场景2.实验要求 四、 实验配置五、 实验步骤2.静态NAT配置3.NAT Outbound配置4.NAT Easy-IP配置 一、 原理描述 2019年11月26日&#xff0c;全球43亿个IPv4地址正式耗尽&#xff0c;这意味着没有更多的IPv4地址可…

Java 网络编程之TCP(五):分析服务端注册OP_WRITE写数据的各种场景(二)

接上文 二、注册OP_WRITE写数据 服务端代码&#xff1a; import java.io.IOException; import java.net.InetSocketAddress; import java.nio.ByteBuffer; import java.nio.channels.SelectableChannel; import java.nio.channels.SelectionKey; import java.nio.channels.S…

STM32H750外设之ADC开关控制功能介绍

目录 概述 1 ADC开关控制介绍 2 开关控制功能流程 2.1 软件使能ADC 的流程 2.2 软件禁止 ADC 的流程 3 相关寄存器 3.1 ADCx_ISR 3.2 ADCx_CR 4 使能/禁止ADC流程图 ​5 写入 ADC 控制位时的限制 概述 本文介绍STM32H750外设之ADC开关控制功能&#xff0c;该功能是…

禅道项目管理系统身份认证绕过漏洞

禅道项目管理系统身份认证绕过漏洞 1.漏洞描述 禅道项目管理软件是国产的开源项目管理软件&#xff0c;专注研发项目管理&#xff0c;内置需求管理、任务管理、bug管理、缺陷管理、用例管理、计划发布等功能&#xff0c;完整覆盖了研发项目管理的核心流程。 禅道项目管理系统…

每日OJ题_DFS回溯剪枝⑧_力扣494. 目标和

目录 力扣494. 目标和 解析代码&#xff08;path设置成全局&#xff09; 解析代码&#xff08;path设置全局&#xff09; 力扣494. 目标和 494. 目标和 难度 中等 给你一个非负整数数组 nums 和一个整数 target 。 向数组中的每个整数前添加 或 - &#xff0c;然后串联…

“无媒体,不活动”,这句话怎么理解?

传媒如春雨&#xff0c;润物细无声&#xff0c;大家好&#xff0c;我是51媒体网胡老师。 “无媒体&#xff0c;不活动”通常指的是在现代社会中&#xff0c;媒体对于各种活动&#xff0c;尤其是公共活动和事件的推广、宣传和影响力是至关重要的。它强调了媒体在塑造公众意识、…

【Redis 开发】(Feed流的模式,GEO数据结构,BitMap,HyperLogLog)

Redis FeedTimeline GEOBitMapHyperLogLog Feed Feed流产品有两种常见模式: Timeline:不做内容筛选&#xff0c;简单的按照内容发布时间排序&#xff0c;常用于好友或关注。例如朋友圈 优点:信息全面&#xff0c;不会有缺失。并且实现也相对简单 缺点:信息噪音较多&#xff0c…

池化整合多元数据库,zData X 一体机助力证券公司IT基础架构革新

引言 近期&#xff0c;云和恩墨 zData X 多元数据库一体机&#xff08;以下简称 zData X&#xff09;在某证券公司的OA、短信和CRM业务系统中成功上线&#xff0c;标志着其IT基础架构完成从集中式存储向池化高性能分布式存储的转变。zData X 成功整合了该证券公司使用的达梦、O…

【VBA】获取指定目录下的Excel文件,并合并所有excel中的内容。

1.新建一个excel表格。并创建两个Sheet&#xff0c;名字分别命名为FileList 和 All information。 2.按ALTF11进入 VBA编程模块&#xff0c;插入模块。 3.将如下 第五部分代码复制到模块中。 点击运行即可&#xff0c;然后就能提取指定目录下的所有excel文件信息并合并到一起…

2021东北四省赛补题/个人题解

Dashboard - The 15th Chinese Northeast Collegiate Programming Contest - Codeforces I 模拟 #include <bits/stdc.h> using i64 long long; using namespace std; #define int long long int mp[8] {0, 7, 27, 41, 49, 63, 78, 108}; void solve() {int n; cin …

如何有效的将丢失的mfc140u.dll修复,几种mfc140u.dll丢失的解决方法

当你在运行某个程序或应用程序时&#xff0c;突然遭遇到mfc140u.dll丢失的错误提示&#xff0c;这可能会对你的电脑运行产生一些不利影响。但是&#xff0c;不要担心&#xff0c;以下是一套详细的mfc140u.dll丢失的解决方法。 mfc140u.dll缺失问题的详细解决步骤 步骤1&#x…

Atcoder Beginner Contest351 A-E Solution题解

文章目录 [A - The bottom of the ninth](https://atcoder.jp/contests/abc351/tasks/abc351_a)[B - Spot the Difference ](https://atcoder.jp/contests/abc351/tasks/abc351_b)[D - Grid and Magnet](https://atcoder.jp/contests/abc351/tasks/abc351_d)E Note&#xff1a;…

盲人地图使用的革新体验:助力视障人士独立、安全出行

在我们日常生活中&#xff0c;地图导航已经成为不可或缺的出行工具。而对于盲人群体来说&#xff0c;盲人地图使用这一课题的重要性不言而喻&#xff0c;它不仅关乎他们的出行便利性&#xff0c;更是他们追求生活独立与品质的重要一环。 近年来&#xff0c;一款名为蝙蝠…

《HCIP-openEuler实验指导手册》1.7 Apache虚拟主机配置

知识点 配置步骤 需求 域名访问目录test1.com/home/source/test1test2.com/home/source/test2test3.com/home/source/test3 创建配置文件 touch /etc/httpd/conf.d/vhost.conf vim /etc/httpd/conf.d/vhost.conf文件内容如下 <VirtualHost *.81> ServerName test1.c…

python中如何用matplotlib写雷达图

#代码 import numpy as np # import matplotlib as plt # from matplotlib import pyplot as plt import matplotlib.pyplot as pltplt.rcParams[font.sans-serif].insert(0, SimHei) plt.rcParams[axes.unicode_minus] Falselabels np.array([速度, 力量, 经验, 防守, 发球…

AtCoder Beginner Contest 351 G. Hash on Tree(树剖维护动态dp 口胡题解)

题目 n(n<2e5)个点&#xff0c;给定一个长为a的初始权值数组&#xff0c; 以1为根有根树&#xff0c; 树哈希值f计算如下&#xff1a; &#xff08;1&#xff09;如果一个点u是叶子节点&#xff0c;则f[u]a[u] &#xff08;2&#xff09;否则&#xff0c; q(q<2e5)次…

【C++】从零开始认识继承

送给大家一句话&#xff1a; 其实我们每个人的生活都是一个世界&#xff0c;即使最平凡的人也要为他生活的那个世界而奋斗。 – 路遥 《平凡的世界》 ✩◝(◍⌣̎◍)◜✩✩◝(◍⌣̎◍)◜✩✩◝(◍⌣̎◍)◜✩ ✩◝(◍⌣̎◍)◜✩✩◝(◍⌣̎◍)◜✩✩◝(◍⌣̎◍)◜✩ ✩◝(◍…

详解如何品味品深茶的精髓

在众多的茶品牌中&#xff0c;品深茶以其独特的韵味和深厚的文化底蕴&#xff0c;赢得了众多茶友的喜爱。今天&#xff0c;让我们一同探寻品深茶的精髓&#xff0c;品味其独特的魅力。 品深茶&#xff0c;源自中国传统茶文化的精髓&#xff0c;承载着世代茶人的智慧与匠心。这…

linux kernel内存泄漏检测工具之slub debug

一、背景 slub debug 是一个debug集&#xff0c;聚焦于kmem_cache 分配机制的slub内存&#xff08;比如kmalloc&#xff09;&#xff0c;这部分内存在内核中使用最频繁&#xff0c;slub debug其中有相当部分是用来处理内存踩踏&#xff0c;内存use after free 等异常的&#x…