Retrieving and processing the Iris dataset with NumPy.

news2024/11/25 2:54:32

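This is a repost.

Working with Iris Data Using NumPy

I. Lab Preparation

1.1 Overview

This lab uses Python. We will apply what we have learned about NumPy to process a real dataset, which should deepen our understanding of the library.

NumPy:

NumPy (short for Numerical Python) is the most important foundational package for numerical computing in Python. Most packages that provide scientific computing functionality use NumPy arrays as their building block.

Some of NumPy's features: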

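1. ndarray, a fast and space-efficient multidimensional array with vectorized arithmetic and sophisticated broadcasting capabilities.
2. Standard mathematical functions for fast operations on entire arrays of data, without writing loops.
3. Tools for reading and writing data on disk and for working with memory-mapped files.
4. Linear algebra, random number generation, and Fourier transform capabilities.
5. A C API for integrating code written in C, C++, Fortran, and other languages.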
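The first two features can be sketched on a small array. This is only an illustration; the `sample` array and the variable names are assumptions made for this example, not part of the lab data files.

```python
import numpy as np

# Small illustrative sample: three flowers, columns are
# sepal length and sepal width in centimetres.
sample = np.array([[5.1, 3.5],
                   [4.9, 3.0],
                   [4.7, 3.2]])

# Vectorized arithmetic: every element is scaled to millimetres without a loop.
sample_mm = sample * 10

# Broadcasting: the per-column means (shape (2,)) are stretched
# across all three rows when subtracted from the (3, 2) array.
col_means = sample.mean(axis=0)
centered = sample - col_means

print(sample_mm)
print(col_means)
print(centered)
```

After centering, each column of `centered` averages to zero, which is a quick way to confirm that the subtraction was applied column by column.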

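Because NumPy provides a simple, easy-to-use C API, it is straightforward to pass data to external libraries written in a low-level language, and for those libraries to return data to Python as NumPy arrays. This makes Python a good choice for wrapping legacy C/C++/Fortran codebases and giving them a dynamic, easy-to-use interface.

NumPy itself does not provide much high-level data analysis functionality, but understanding NumPy arrays and array-oriented computing will help you use tools such as pandas more effectively. Because NumPy is a large topic, I will cover more advanced NumPy features, such as broadcasting, in Appendix A.

For most data analysis applications, the functionality we care about most is: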

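1. Fast vectorized array operations for data munging and cleaning, subsetting and filtering, transformation, and similar tasks.
2. Common array algorithms such as sorting, unique, and set operations.
3. Efficient descriptive statistics and data aggregation/summarization.
4. Data alignment and relational data manipulations for merging and joining heterogeneous datasets.
5. Expressing conditional logic as array expressions instead of loops with if-elif-else branches.
6. Group-wise data manipulations (aggregation, transformation, function application).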
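Several of the items above can be shown in a few lines. The sketch below uses made-up petal lengths and shortened species labels, so the arrays and names are assumptions for illustration only.

```python
import numpy as np

# Hypothetical petal lengths (cm) and species labels for six flowers.
petal_length = np.array([1.4, 4.7, 6.0, 1.3, 4.5, 5.1])
species = np.array(['setosa', 'versicolor', 'virginica',
                    'setosa', 'versicolor', 'virginica'])

# Conditional logic as an array expression instead of an if/else loop.
size_label = np.where(petal_length < 2.0, 'short', 'long')

# Uniqueness with counts.
names, counts = np.unique(species, return_counts=True)

# Sorting and boolean filtering.
sorted_lengths = np.sort(petal_length)
setosa_lengths = petal_length[species == 'setosa']

# A simple group-wise aggregation: mean petal length per species.
means = {name: petal_length[species == name].mean() for name in names}

print(size_label)
print(names, counts)
print(sorted_lengths)
print(means)
```

The group-wise mean here loops over the three unique labels only; the per-element work inside each group is still done by NumPy.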

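1.2 Objectives

  • Become familiar with common data file types
  • Apply NumPy methods flexibly
  • Learn how to process real data with NumPy
  • Learn the workflow for processing real data with NumPy

1.3 Environment

Environment: Python 3.6 or later, NumPy, Jupyter Notebook, and Google Chrome or IE.

II. Lab Steps

2.1 Reading the data

NumPy can read and write text or binary data on disk. In this lab we load the Iris dataset directly. It contains the sepal length and width, the petal length and width, and the species of each flower. We first import the data straight from the web (this may be slow; later steps use a downloaded copy) and then extract one column from the resulting array of tuples.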

    import numpy as np

    url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
    iris_1d = np.genfromtxt(url, delimiter=',', dtype=None)
    print(iris_1d)

    # Extract one column (the species, at index 4)
    species = np.array([row[4] for row in iris_1d])
    print(species[:5])

Partial output:

[(5.1, 3.5, 1.4, 0.2, b'Iris-setosa') (4.9, 3. , 1.4, 0.2, b'Iris-setosa')
 (4.7, 3.2, 1.3, 0.2, b'Iris-setosa') (4.6, 3.1, 1.5, 0.2, b'Iris-setosa')
 (5. , 3.6, 1.4, 0.2, b'Iris-setosa') (5.4, 3.9, 1.7, 0.4, b'Iris-setosa')
 (4.6, 3.4, 1.4, 0.3, b'Iris-setosa') (5. , 3.4, 1.5, 0.2, b'Iris-setosa')
 (4.4, 2.9, 1.4, 0.2, b'Iris-setosa') (4.9, 3.1, 1.5, 0.1, b'Iris-setosa')
 (5.4, 3.7, 1.5, 0.2, b'Iris-setosa') (4.8, 3.4, 1.6, 0.2, b'Iris-setosa')
 (4.8, 3. , 1.4, 0.1, b'Iris-setosa') (4.3, 3. , 1.1, 0.1, b'Iris-setosa')
 (5.8, 4. , 1.2, 0.2, b'Iris-setosa') (5.7, 4.4, 1.5, 0.4, b'Iris-setosa')
 (5.4, 3.9, 1.3, 0.4, b'Iris-setosa') (5.1, 3.5, 1.4, 0.3, b'Iris-setosa')
 (5.7, 3.8, 1.7, 0.3, b'Iris-setosa') (5.1, 3.8, 1.5, 0.3, b'Iris-setosa')
 (5.4, 3.4, 1.7, 0.2, b'Iris-setosa') (5.1, 3.7, 1.5, 0.4, b'Iris-setosa')
 (4.6, 3.6, 1. , 0.2, b'Iris-setosa') (5.1, 3.3, 1.7, 0.5, b'Iris-setosa')
 (4.8, 3.4, 1.9, 0.2, b'Iris-setosa') (5. , 3. , 1.6, 0.2, b'Iris-setosa')
 (5. , 3.4, 1.6, 0.4, b'Iris-setosa') (5.2, 3.5, 1.5, 0.2, b'Iris-setosa')
 (5.2, 3.4, 1.4, 0.2, b'Iris-setosa') (4.7, 3.2, 1.6, 0.2, b'Iris-setosa')
 (4.8, 3.1, 1.6, 0.2, b'Iris-setosa') (5.4, 3.4, 1.5, 0.4, b'Iris-setosa')
 (5.2, 4.1, 1.5, 0.1, b'Iris-setosa') (5.5, 4.2, 1.4, 0.2, b'Iris-setosa')
 (4.9, 3.1, 1.5, 0.1, b'Iris-setosa') (5. , 3.2, 1.2, 0.2, b'Iris-setosa')
 (5.5, 3.5, 1.3, 0.2, b'Iris-setosa') (4.9, 3.1, 1.5, 0.1, b'Iris-setosa')
 (4.4, 3. , 1.3, 0.2, b'Iris-setosa') (5.1, 3.4, 1.5, 0.2, b'Iris-setosa')
 (5. , 3.5, 1.3, 0.3, b'Iris-setosa') (4.5, 2.3, 1.3, 0.3, b'Iris-setosa')
 (4.4, 3.2, 1.3, 0.2, b'Iris-setosa') (5. , 3.5, 1.6, 0.6, b'Iris-setosa')
 (5.1, 3.8, 1.9, 0.4, b'Iris-setosa') (4.8, 3. , 1.4, 0.3, b'Iris-setosa')
 (5.1, 3.8, 1.6, 0.2, b'Iris-setosa') (4.6, 3.2, 1.4, 0.2, b'Iris-setosa')
 (5.3, 3.7, 1.5, 0.2, b'Iris-setosa') (5. , 3.3, 1.4, 0.2, b'Iris-setosa')
 (7. , 3.2, 4.7, 1.4, b'Iris-versicolor')
 (6.4, 3.2, 4.5, 1.5, b'Iris-versicolor')
 (6.9, 3.1, 4.9, 1.5, b'Iris-versicolor')
 (5.5, 2.3, 4. , 1.3, b'Iris-versicolor')
 (6.5, 2.8, 4.6, 1.5, b'Iris-versicolor')
 (5.7, 2.8, 4.5, 1.3, b'Iris-versicolor')
 (6.3, 3.3, 4.7, 1.6, b'Iris-versicolor')
 (4.9, 2.4, 3.3, 1. , b'Iris-versicolor')
 (6.6, 2.9, 4.6, 1.3, b'Iris-versicolor')
 (5.2, 2.7, 3.9, 1.4, b'Iris-versicolor')
 (5. , 2. , 3.5, 1. , b'Iris-versicolor')
 (5.9, 3. , 4.2, 1.5, b'Iris-versicolor')
 (6. , 2.2, 4. , 1. , b'Iris-versicolor')
 (6.1, 2.9, 4.7, 1.4, b'Iris-versicolor')
 (5.6, 2.9, 3.6, 1.3, b'Iris-versicolor')
 (6.7, 3.1, 4.4, 1.4, b'Iris-versicolor')
 (5.6, 3. , 4.5, 1.5, b'Iris-versicolor')
 (5.8, 2.7, 4.1, 1. , b'Iris-versicolor')
 (6.2, 2.2, 4.5, 1.5, b'Iris-versicolor')
 (5.6, 2.5, 3.9, 1.1, b'Iris-versicolor')
 (5.9, 3.2, 4.8, 1.8, b'Iris-versicolor')
 (6.1, 2.8, 4. , 1.3, b'Iris-versicolor')
 (6.3, 2.5, 4.9, 1.5, b'Iris-versicolor')
 (6.1, 2.8, 4.7, 1.2, b'Iris-versicolor')
 (6.4, 2.9, 4.3, 1.3, b'Iris-versicolor')
 (6.6, 3. , 4.4, 1.4, b'Iris-versicolor')
 (6.8, 2.8, 4.8, 1.4, b'Iris-versicolor')
 (6.7, 3. , 5. , 1.7, b'Iris-versicolor')
 (6. , 2.9, 4.5, 1.5, b'Iris-versicolor')
 (5.7, 2.6, 3.5, 1. , b'Iris-versicolor')
 (5.5, 2.4, 3.8, 1.1, b'Iris-versicolor')
 (5.5, 2.4, 3.7, 1. , b'Iris-versicolor')
 (5.8, 2.7, 3.9, 1.2, b'Iris-versicolor')
 (6. , 2.7, 5.1, 1.6, b'Iris-versicolor')
 (5.4, 3. , 4.5, 1.5, b'Iris-versicolor')
 (6. , 3.4, 4.5, 1.6, b'Iris-versicolor')
 (6.7, 3.1, 4.7, 1.5, b'Iris-versicolor')
 (6.3, 2.3, 4.4, 1.3, b'Iris-versicolor')
 (5.6, 3. , 4.1, 1.3, b'Iris-versicolor')
 (5.5, 2.5, 4. , 1.3, b'Iris-versicolor')
 (5.5, 2.6, 4.4, 1.2, b'Iris-versicolor')
 (6.1, 3. , 4.6, 1.4, b'Iris-versicolor')
 (5.8, 2.6, 4. , 1.2, b'Iris-versicolor')
 (5. , 2.3, 3.3, 1. , b'Iris-versicolor')
 (5.6, 2.7, 4.2, 1.3, b'Iris-versicolor')
 (5.7, 3. , 4.2, 1.2, b'Iris-versicolor')
 (5.7, 2.9, 4.2, 1.3, b'Iris-versicolor')
 (6.2, 2.9, 4.3, 1.3, b'Iris-versicolor')
 (5.1, 2.5, 3. , 1.1, b'Iris-versicolor')
 (5.7, 2.8, 4.1, 1.3, b'Iris-versicolor')
 (6.3, 3.3, 6. , 2.5, b'Iris-virginica')
 (5.8, 2.7, 5.1, 1.9, b'Iris-virginica')
 (7.1, 3. , 5.9, 2.1, b'Iris-virginica')
 (6.3, 2.9, 5.6, 1.8, b'Iris-virginica')
 (6.5, 3. , 5.8, 2.2, b'Iris-virginica')
 (7.6, 3. , 6.6, 2.1, b'Iris-virginica')
 (4.9, 2.5, 4.5, 1.7, b'Iris-virginica')
 (7.3, 2.9, 6.3, 1.8, b'Iris-virginica')
 (6.7, 2.5, 5.8, 1.8, b'Iris-virginica')
 (7.2, 3.6, 6.1, 2.5, b'Iris-virginica')
 (6.5, 3.2, 5.1, 2. , b'Iris-virginica')
 (6.4, 2.7, 5.3, 1.9, b'Iris-virginica')
 (6.8, 3. , 5.5, 2.1, b'Iris-virginica')
 (5.7, 2.5, 5. , 2. , b'Iris-virginica')
 (5.8, 2.8, 5.1, 2.4, b'Iris-virginica')
 (6.4, 3.2, 5.3, 2.3, b'Iris-virginica')
 (6.5, 3. , 5.5, 1.8, b'Iris-virginica')
 (7.7, 3.8, 6.7, 2.2, b'Iris-virginica')
 (7.7, 2.6, 6.9, 2.3, b'Iris-virginica')
 (6. , 2.2, 5. , 1.5, b'Iris-virginica')
 (6.9, 3.2, 5.7, 2.3, b'Iris-virginica')
 (5.6, 2.8, 4.9, 2. , b'Iris-virginica')
 (7.7, 2.8, 6.7, 2. , b'Iris-virginica')
 (6.3, 2.7, 4.9, 1.8, b'Iris-virginica')
 (6.7, 3.3, 5.7, 2.1, b'Iris-virginica')
 (7.2, 3.2, 6. , 1.8, b'Iris-virginica')
 (6.2, 2.8, 4.8, 1.8, b'Iris-virginica')
 (6.1, 3. , 4.9, 1.8, b'Iris-virginica')
 (6.4, 2.8, 5.6, 2.1, b'Iris-virginica')
 (7.2, 3. , 5.8, 1.6, b'Iris-virginica')
 (7.4, 2.8, 6.1, 1.9, b'Iris-virginica')
 (7.9, 3.8, 6.4, 2. , b'Iris-virginica')
 (6.4, 2.8, 5.6, 2.2, b'Iris-virginica')
 (6.3, 2.8, 5.1, 1.5, b'Iris-virginica')
 (6.1, 2.6, 5.6, 1.4, b'Iris-virginica')
 (7.7, 3. , 6.1, 2.3, b'Iris-virginica')
 (6.3, 3.4, 5.6, 2.4, b'Iris-virginica')
 (6.4, 3.1, 5.5, 1.8, b'Iris-virginica')
 (6. , 3. , 4.8, 1.8, b'Iris-virginica')
 (6.9, 3.1, 5.4, 2.1, b'Iris-virginica')
 (6.7, 3.1, 5.6, 2.4, b'Iris-virginica')
 (6.9, 3.1, 5.1, 2.3, b'Iris-virginica')
 (5.8, 2.7, 5.1, 1.9, b'Iris-virginica')
 (6.8, 3.2, 5.9, 2.3, b'Iris-virginica')
 (6.7, 3.3, 5.7, 2.5, b'Iris-virginica')
 (6.7, 3. , 5.2, 2.3, b'Iris-virginica')
 (6.3, 2.5, 5. , 1.9, b'Iris-virginica')
 (6.5, 3. , 5.2, 2. , b'Iris-virginica')
 (6.2, 3.4, 5.4, 2.3, b'Iris-virginica')
 (5.9, 3. , 5.1, 1.8, b'Iris-virginica')]
[b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa']
/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:4: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
  after removing the cwd from sys.path.
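The warning at the end of the output asks for the `encoding` argument to be set. A minimal sketch of doing so, assuming three rows copied from the output above and held in memory (so nothing is downloaded), looks like this. The variable names `sample_text` and `iris_sample` are made up for the example.

```python
import io
import numpy as np

# Three rows taken from the printed data, one per species.
sample_text = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)

# With an explicit encoding, the text column is read as str rather than bytes.
iris_sample = np.genfromtxt(io.StringIO(sample_text), delimiter=',',
                            dtype=None, encoding='utf-8')

# With dtype=None and no names given, the fields are auto-named f0 ... f4,
# so a whole column can be selected by field name instead of a list comprehension.
species = iris_sample['f4']

print(iris_sample.dtype.names)
print(species)
```

Selecting `iris_sample['f4']` is an alternative to the `row[4]` list comprehension used above; both give the species column, but here the values are ordinary strings.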

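Convert the 1D array of tuples into a 2D NumPy array

    import numpy as np

    iris_1d = np.genfromtxt('iris.data', delimiter=',', dtype=None)

    # Method 1: convert each row to a list and keep the first 4 items
    iris_2d = np.array([row.tolist()[:4] for row in iris_1d])
    # Print the first 5 rows of the resulting 2D NumPy array
    print(iris_2d[:5])

    # Method 2: import only the first 4 columns from the source
    iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0, 1, 2, 3])
    # Print the first 5 rows of the resulting 2D NumPy array
    print(iris_2d[:5])

Output: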

[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]
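A third option, sketched here as an assumption rather than as part of the original lab, is to stack the numeric fields of the structured array column by column. The three in-memory rows stand in for `iris.data`.

```python
import io
import numpy as np

# The first three rows of the dataset, held in memory for the example.
sample_text = ("5.1,3.5,1.4,0.2,Iris-setosa\n"
               "4.9,3.0,1.4,0.2,Iris-setosa\n"
               "4.7,3.2,1.3,0.2,Iris-setosa\n")

records = np.genfromtxt(io.StringIO(sample_text), delimiter=',',
                        dtype=None, encoding='utf-8')

# Take the first four (numeric) field names and stack those columns side by side.
# The loop runs over four field names, not over every row.
numeric_fields = records.dtype.names[:4]
iris_2d = np.column_stack([records[name] for name in numeric_fields])

print(iris_2d.shape)
print(iris_2d.dtype)
print(iris_2d)
```

Compared with method 1, this avoids converting every row to a Python list; compared with method 2, it reuses an array that has already been loaded instead of reading the file again.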

Compute the mean, median, and standard deviation of the iris sepal length (column 1)

    import numpy as np

    # First extract the column to compute on
    sepallength = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0])
    mu, med, sd = np.mean(sepallength), np.median(sepallength), np.std(sepallength)
    print(mu, med, sd)

Output:

5.843333333333334 5.8 0.8253012917851409
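Note that `np.std` computes the population standard deviation by default (it divides by n). If the sample standard deviation (dividing by n - 1) is wanted, `ddof=1` can be passed. A small sketch, using the first five sepal lengths from the data as an illustrative array:

```python
import numpy as np

# First five sepal lengths from the dataset.
x = np.array([5.1, 4.9, 4.7, 4.6, 5.0])

population_sd = np.std(x)        # ddof=0 (default): divide by n
sample_sd = np.std(x, ddof=1)    # ddof=1: divide by n - 1

print(np.mean(x), np.median(x))
print(population_sd, sample_sd)
```

The two values differ by a factor of sqrt(n / (n - 1)), so the gap matters for small arrays like this one and shrinks as the array grows.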

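2.2 Normalizing an array

Can we normalize an array in NumPy so that its values lie exactly between 0 and 1? The answer is yes.

    import numpy as np

    # Create a normalized form of the iris sepal length whose values lie
    # exactly between 0 and 1, so that the minimum is 0 and the maximum is 1
    sepallength = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0])
    Smax, Smin = sepallength.max(), sepallength.min()
    S = (sepallength - Smin)/(Smax - Smin)
    print(S)

    # Alternatively, np.ptp() returns max - min
    # (the ndarray.ptp() method was removed in NumPy 2.0)
    S = (sepallength - Smin)/np.ptp(sepallength)
    print(S)

Output: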

[0.22222222 0.16666667 0.11111111 0.08333333 0.19444444 0.30555556
 0.08333333 0.19444444 0.02777778 0.16666667 0.30555556 0.13888889
 0.13888889 0.         0.41666667 0.38888889 0.30555556 0.22222222
 0.38888889 0.22222222 0.30555556 0.22222222 0.08333333 0.22222222
 0.13888889 0.19444444 0.19444444 0.25       0.25       0.11111111
 0.13888889 0.30555556 0.25       0.33333333 0.16666667 0.19444444
 0.33333333 0.16666667 0.02777778 0.22222222 0.19444444 0.05555556
 0.02777778 0.19444444 0.22222222 0.13888889 0.22222222 0.08333333
 0.27777778 0.19444444 0.75       0.58333333 0.72222222 0.33333333
 0.61111111 0.38888889 0.55555556 0.16666667 0.63888889 0.25
 0.19444444 0.44444444 0.47222222 0.5        0.36111111 0.66666667
 0.36111111 0.41666667 0.52777778 0.36111111 0.44444444 0.5
 0.55555556 0.5        0.58333333 0.63888889 0.69444444 0.66666667
 0.47222222 0.38888889 0.33333333 0.33333333 0.41666667 0.47222222
 0.30555556 0.47222222 0.66666667 0.55555556 0.36111111 0.33333333
 0.33333333 0.5        0.41666667 0.19444444 0.36111111 0.38888889
 0.38888889 0.52777778 0.22222222 0.38888889 0.55555556 0.41666667
 0.77777778 0.55555556 0.61111111 0.91666667 0.16666667 0.83333333
 0.66666667 0.80555556 0.61111111 0.58333333 0.69444444 0.38888889
 0.41666667 0.58333333 0.61111111 0.94444444 0.94444444 0.47222222
 0.72222222 0.36111111 0.94444444 0.55555556 0.66666667 0.80555556
 0.52777778 0.5        0.58333333 0.80555556 0.86111111 1.
 0.58333333 0.55555556 0.5        0.94444444 0.55555556 0.58333333
 0.47222222 0.72222222 0.66666667 0.72222222 0.41666667 0.69444444
 0.66666667 0.66666667 0.55555556 0.61111111 0.52777778 0.44444444]
[0.22222222 0.16666667 0.11111111 0.08333333 0.19444444 0.30555556
 0.08333333 0.19444444 0.02777778 0.16666667 0.30555556 0.13888889
 0.13888889 0.         0.41666667 0.38888889 0.30555556 0.22222222
 0.38888889 0.22222222 0.30555556 0.22222222 0.08333333 0.22222222
 0.13888889 0.19444444 0.19444444 0.25       0.25       0.11111111
 0.13888889 0.30555556 0.25       0.33333333 0.16666667 0.19444444
 0.33333333 0.16666667 0.02777778 0.22222222 0.19444444 0.05555556
 0.02777778 0.19444444 0.22222222 0.13888889 0.22222222 0.08333333
 0.27777778 0.19444444 0.75       0.58333333 0.72222222 0.33333333
 0.61111111 0.38888889 0.55555556 0.16666667 0.63888889 0.25
 0.19444444 0.44444444 0.47222222 0.5        0.36111111 0.66666667
 0.36111111 0.41666667 0.52777778 0.36111111 0.44444444 0.5
 0.55555556 0.5        0.58333333 0.63888889 0.69444444 0.66666667
 0.47222222 0.38888889 0.33333333 0.33333333 0.41666667 0.47222222
 0.30555556 0.47222222 0.66666667 0.55555556 0.36111111 0.33333333
 0.33333333 0.5        0.41666667 0.19444444 0.36111111 0.38888889
 0.38888889 0.52777778 0.22222222 0.38888889 0.55555556 0.41666667
 0.77777778 0.55555556 0.61111111 0.91666667 0.16666667 0.83333333
 0.66666667 0.80555556 0.61111111 0.58333333 0.69444444 0.38888889
 0.41666667 0.58333333 0.61111111 0.94444444 0.94444444 0.47222222
 0.72222222 0.36111111 0.94444444 0.55555556 0.66666667 0.80555556
 0.52777778 0.5        0.58333333 0.80555556 0.86111111 1.
 0.58333333 0.55555556 0.5        0.94444444 0.55555556 0.58333333
 0.47222222 0.72222222 0.66666667 0.72222222 0.41666667 0.69444444
 0.66666667 0.66666667 0.55555556 0.61111111 0.52777778 0.44444444]
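The same min-max formula can be wrapped in a reusable function. The helper below is a hypothetical sketch (the name `min_max_scale` is not from the original lab); it also guards against the case where all values are equal, which would otherwise divide by zero.

```python
import numpy as np

def min_max_scale(values):
    """Rescale a 1D array to the range [0, 1] (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    value_range = np.ptp(values)  # max - min
    if value_range == 0:
        # All values are identical: return zeros instead of dividing by zero.
        return np.zeros_like(values)
    return (values - values.min()) / value_range

# 4.3 and 7.9 are the smallest and largest sepal lengths in the dataset.
scaled = min_max_scale([4.3, 5.1, 6.1, 7.9])
print(scaled)
print(min_max_scale([2.0, 2.0, 2.0]))
```

Returning zeros for a constant array is one possible convention; raising an error would be an equally reasonable choice depending on how the result is used.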

Find the percentiles of a NumPy array

    import numpy as np

    sepallength = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0])
    print(np.percentile(sepallength, q=[5, 95]))
    # [4.6   7.255]

Output:

[4.6   7.255]
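By default `np.percentile` interpolates linearly between neighbouring sorted values, which is why a percentile can fall between two measured values. A minimal sketch on a made-up five-element array shows the arithmetic:

```python
import numpy as np

# Illustrative array, already sorted.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# With the default linear interpolation, the q-th percentile sits at
# position (n - 1) * q / 100 in the sorted array.
# For q = 95: (5 - 1) * 0.95 = 3.8, i.e. 4.0 + 0.8 * (5.0 - 4.0).
p5, p95 = np.percentile(x, q=[5, 95])
print(p5, p95)

# The 50th percentile coincides with the median.
print(np.percentile(x, 50), np.median(x))
```

Passing a list to `q` returns all requested percentiles in one call, which is what the lab code does with `q=[5, 95]`.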
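
To make the percentile computation less of a black box, here is a minimal sketch of linear interpolation between neighbouring ranks, which is the default behaviour of np.percentile. The helper name percentile_linear and the sample values are assumptions made for illustration; they do not come from the iris file.

```python
import numpy as np

def percentile_linear(values, q):
    """q-th percentile using linear interpolation between the two nearest ranks."""
    data = np.sort(np.asarray(values, dtype=float))
    # Fractional position of the percentile within the sorted data
    pos = (len(data) - 1) * q / 100.0
    lower = int(np.floor(pos))
    upper = int(np.ceil(pos))
    frac = pos - lower
    return data[lower] + frac * (data[upper] - data[lower])

# Hypothetical sample, not taken from iris.data
sample = np.array([4.3, 4.9, 5.1, 5.8, 6.4, 7.0, 7.9])
manual = np.array([percentile_linear(sample, q) for q in (5, 95)])
builtin = np.percentile(sample, q=[5, 95])
print(manual, builtin)
```

The two results should agree up to floating-point rounding, which suggests that the 95th percentile of 7.255 above is an interpolated value rather than an observed measurement.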

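Inserting values at random positions in an array

import numpy as np

# Insert np.nan at 20 random positions of the iris_2d dataset
iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='object')

# Method 1
np.random.seed(100)
# i, j hold the row and column indices of every element of iris_2d
# (all entries are non-empty, so np.where treats them as truthy)
i, j = np.where(iris_2d)
# print(i, j)
# Draw 20 row indices and 20 column indices with replacement;
# repeated positions are possible, so fewer than 20 distinct cells may be set
iris_2d[np.random.choice(i, 20), np.random.choice(j, 20)] = np.nan
print(iris_2d[:10])

# Method 2: applied to the same array, so the NaNs from method 1 remain
np.random.seed(100)
iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan
print(iris_2d[:10])

Output (first ten rows after each method):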

[[b'5.1' b'3.5' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.9' b'3.0' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.7' b'3.2' b'1.3' b'0.2' b'Iris-setosa']
 [b'4.6' b'3.1' b'1.5' b'0.2' b'Iris-setosa']
 [b'5.0' b'3.6' b'1.4' b'0.2' b'Iris-setosa']
 [b'5.4' b'3.9' b'1.7' b'0.4' b'Iris-setosa']
 [b'4.6' b'3.4' b'1.4' b'0.3' b'Iris-setosa']
 [b'5.0' b'3.4' b'1.5' b'0.2' b'Iris-setosa']
 [b'4.4' b'2.9' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.9' b'3.1' b'1.5' b'0.1' b'Iris-setosa']]
[[b'5.1' b'3.5' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.9' b'3.0' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.7' b'3.2' b'1.3' b'0.2' b'Iris-setosa']
 [b'4.6' b'3.1' b'1.5' b'0.2' b'Iris-setosa']
 [b'5.0' b'3.6' b'1.4' b'0.2' b'Iris-setosa']
 [b'5.4' b'3.9' b'1.7' b'0.4' b'Iris-setosa']
 [b'4.6' b'3.4' b'1.4' b'0.3' b'Iris-setosa']
 [b'5.0' b'3.4' b'1.5' b'0.2' b'Iris-setosa']
 [b'4.4' nan b'1.4' b'0.2' b'Iris-setosa']
 [b'4.9' b'3.1' b'1.5' b'0.1' b'Iris-setosa']]
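
Both methods above sample row and column indices with replacement, so the same cell can be picked twice. If exactly 20 distinct cells are wanted, one possible approach is to sample flat indices without replacement and convert them back to (row, column) pairs. This is a sketch under assumptions: the array data is a random placeholder with the same shape as the four numeric iris columns, not the real dataset.

```python
import numpy as np

rng = np.random.default_rng(100)
# Placeholder float array with the shape of the numeric iris columns (assumption)
data = rng.uniform(0.1, 8.0, size=(150, 4))

# Sample 20 distinct flat positions, then map them to row/column indices
flat_idx = rng.choice(data.size, size=20, replace=False)
rows, cols = np.unravel_index(flat_idx, data.shape)
data[rows, cols] = np.nan

n_missing = int(np.isnan(data).sum())
print(n_missing)
```

Because the flat indices are distinct, the number of NaN cells here is always 20, whereas the randint-based version can produce fewer.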

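NumPy can also locate the missing values in an array

import numpy as np

# Count and locate the missing values in the sepallength column (column 1, index 0) of iris_2d
iris_2d = np.genfromtxt('iris.data', delimiter=',', usecols=(0, 1, 2, 3), dtype=float)
# No seed is set, so the NaN positions change on every run
iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan
print("Number of missing values: \n", np.isnan(iris_2d[:, 0]).sum())
print("Positions of missing values: \n", np.where(np.isnan(iris_2d[:, 0])))

Sample output (it varies between runs because no random seed is set):

Number of missing values: 
 2
Positions of missing values: 
 (array([36, 56]),)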
Output from another run:

Number of missing values: 
 5
Positions of missing values: 
 (array([ 38,  80, 106, 113, 121]),)
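
Since the random version gives different answers on every run, a small fixed array makes the behaviour of np.isnan easier to check by hand. The array below is a hypothetical example, not iris data; it shows per-column counts, the row indices for one column, and the full list of (row, column) positions.

```python
import numpy as np

# Hypothetical 4x4 array with NaNs placed by hand
data = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [np.nan, 3.0, 1.4, 0.2],
    [4.7, np.nan, 1.3, 0.2],
    [np.nan, 3.1, 1.5, np.nan],
])

mask = np.isnan(data)                        # boolean array, True where a value is missing
per_column = mask.sum(axis=0)                # number of missing values in each column
first_col_rows = np.flatnonzero(mask[:, 0])  # row indices of missing values in column 0
positions = np.argwhere(mask)                # every (row, column) pair that is NaN

print(per_column)
print(first_col_rows)
print(positions)
```

np.flatnonzero returns a plain index array, which can be more convenient than the one-element tuple returned by np.where on a 1-D input.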

Removing rows that contain missing values from a NumPy array

import numpy as np

iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0, 1, 2, 3])
# No seed is set, so the NaN positions change on every run
iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan
# print(iris_2d)

# Method 1: ~ negates the result, so the mask is True for rows without any NaN
no_nan_in_row = np.array([~np.any(np.isnan(row)) for row in iris_2d])
# print(no_nan_in_row)  # a boolean array
# print(iris_2d[no_nan_in_row])  # the matrix with NaN-containing rows removed
print(iris_2d[no_nan_in_row][:5])

# Method 2: np.isnan(iris_2d) is a boolean array; a row sum of 0 means the row has no NaN
print(iris_2d[np.sum(np.isnan(iris_2d), axis=1) == 0][:5])

Sample output (it varies between runs because no random seed is set):

[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [5.4 3.9 1.7 0.4]]
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [5.4 3.9 1.7 0.4]]
Output from another run (the surviving rows differ because the NaN positions are random):

[[4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [4.6 3.4 1.4 0.3]]
[[4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [4.6 3.4 1.4 0.3]]
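
The row-by-row list comprehension in method 1 can also be written as a single vectorized expression with any(axis=1). The sketch below uses a small hand-made array (an assumption, so the result is reproducible) instead of the randomly modified iris data.

```python
import numpy as np

# Hypothetical array with NaNs in rows 1 and 3
data = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, np.nan, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [np.nan, 3.1, 1.5, np.nan],
])

# True for rows in which no element is NaN
keep = ~np.isnan(data).any(axis=1)
clean = data[keep]

print(keep)
print(clean)
```

Here keep plays the same role as the boolean mask in method 1, but it is computed without a Python-level loop.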

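Finding the correlation between two columns of a NumPy array

import numpy as np
from scipy.stats import pearsonr

# Correlation between SepalLength (column 1, index 0) and PetalLength (column 3, index 2) in iris_2d
iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0, 1, 2, 3])

# Solution 1: off-diagonal entry of the 2x2 correlation matrix
print(np.corrcoef(iris_2d[:, 0], iris_2d[:, 2])[0, 1])
# 0.8717541573048718

# Solution 2: pearsonr returns the coefficient together with its p-value
corr, p_value = pearsonr(iris_2d[:, 0], iris_2d[:, 2])
print(corr)
# 0.8717541573048713

Output (the trailing digits may differ slightly depending on the environment):

0.8717541573048718
0.8717541573048714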
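
To show what the correlation coefficient measures, here is a minimal sketch that computes Pearson's r directly from its definition and compares it with np.corrcoef. The function name pearson_r and the two short arrays are assumptions for illustration; they are not rows of iris.data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Hypothetical measurements standing in for sepal and petal lengths
sepal = np.array([5.1, 4.9, 6.3, 5.8, 7.1, 6.5])
petal = np.array([1.4, 1.4, 6.0, 5.1, 5.9, 4.6])

manual = pearson_r(sepal, petal)
reference = np.corrcoef(sepal, petal)[0, 1]
print(manual, reference)
```

The two values should match up to rounding in the last digits, which is the same kind of small discrepancy visible between the two solutions above.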

Checking whether a given array contains any null values

import numpy as np

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0, 1, 2, 3])
# True if at least one element is NaN
print(np.isnan(iris_2d).any())
# False

Output:

False
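
The original iris file has no gaps, so the check above prints False. To see the True case, the sketch below parses a small in-memory CSV with one empty field; np.genfromtxt fills empty float fields with NaN. The two-line CSV text is made up for this example.

```python
import io
import numpy as np

# Hypothetical CSV text with one empty field in the second row
text = io.StringIO("5.1,3.5,1.4,0.2\n4.9,,1.4,0.2\n")
arr = np.genfromtxt(text, delimiter=',', dtype=float)

has_nan = bool(np.isnan(arr).any())
print(arr)
print(has_nan)
```

Note that np.isnan only works on numeric arrays; an array loaded with dtype='object' would need to be converted to float first.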

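Counting the unique values in a NumPy array

import numpy as np

# Find the unique iris species and how many times each one occurs
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
# Extract the species column (column 5, index 4)
species = np.array([row.tolist()[4] for row in iris])
print(species)
# u holds the distinct species, counts the number of rows for each
u, counts = np.unique(species, return_counts=True)
print(u, counts)

Output: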

[b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa' b'Iris-setosa' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'
 b'Iris-virginica' b'Iris-virginica']
[b'Iris-setosa' b'Iris-versicolor' b'Iris-virginica'] [50 50 50]
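The species names above print with a `b'...'` prefix because they are stored as byte strings. A minimal sketch of the same counting step on decoded text follows; the `species` array here is a small hypothetical stand-in rather than the real 150-row column, and `summary` is an illustrative name.

```python
import numpy as np

# Hypothetical stand-in for the species column (the real column has 150 rows).
species = np.array([b'Iris-setosa', b'Iris-virginica', b'Iris-setosa',
                    b'Iris-versicolor', b'Iris-virginica', b'Iris-setosa'])

# Convert the byte strings to ordinary text so they print without the b'' prefix.
species_text = species.astype(str)

# np.unique returns the sorted distinct values and, on request, their counts.
names, counts = np.unique(species_text, return_counts=True)

# Pair each name with its count for easier reading.
summary = {str(name): int(count) for name, count in zip(names, counts)}
print(summary)
```

Building a dictionary is only one way to present the result; the two arrays returned by `np.unique` can also be used directly.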

Convert a numeric array to a categorical (text) array

import numpy as np

# Bin the petal length (third column, index 2) of iris into text labels:
#   x < 3       --> 'small'
#   3 <= x < 5  --> 'medium'
#   x >= 5      --> 'large'
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
# The bin edges [0, 3, 5, 10] define the intervals [0,3), [3,5), [5,10) and [10, inf);
# np.digitize returns, for each element, the index (1 to 4) of the interval it falls in
petallength = np.digitize(iris[:, 2].astype('float'), [0, 3, 5, 10])
# print(petallength)
label_map = {1: 'small', 2: 'medium', 3: 'large', 4: np.nan}
petallength2 = [label_map[x] for x in petallength]
print(petallength2)

Output:

['small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'large', 'medium', 'medium', 'medium', 'medium', 'medium', 'large', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'large', 'large', 'large', 'large', 'large', 'large', 'medium', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'medium', 'large', 'medium', 'large', 'large', 'medium', 'medium', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'medium', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large']
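The binning step can also be written without a Python-level loop. The sketch below is one possible variant, not the original author's code: it uses a small hypothetical `petal_length` array and assumes every value lies in [0, 10), so that the bin index minus one is always a valid position in `labels`.

```python
import numpy as np

# Hypothetical petal lengths, chosen to include the bin edges 3.0 and 5.0.
petal_length = np.array([1.4, 2.9, 3.0, 4.7, 5.0, 6.9])

# Same bin edges as in the text: [0,3) -> 1, [3,5) -> 2, [5,10) -> 3.
bins = [0, 3, 5, 10]
idx = np.digitize(petal_length, bins)

# Look the labels up with array indexing instead of a dict and a list comprehension.
# Assumption: all values fall inside [0, 10); anything else would pick a wrong label
# or raise an IndexError.
labels = np.array(['small', 'medium', 'large'])
categories = labels[idx - 1]

print(idx.tolist())
print(categories.tolist())
```

Note that a value exactly equal to an edge, such as 3.0 or 5.0, falls into the interval that starts at that edge.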

Sort a 2D array by a column

import numpy as np

# Sort the dataset by the sepallength column (column index 0)
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
print(iris[iris[:, 0].argsort()])

The call prints the rows of iris reordered so that the first column (sepal length) is in ascending order.
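Because the columns are loaded as byte strings, `argsort` compares them byte by byte rather than numerically. For the iris sepal lengths this gives the same order, since every value has a single digit before the decimal point, but it would not hold in general. The sketch below illustrates the difference on a few hypothetical rows (the value `10.2` and the label `hypothetical` are invented for the demonstration) and converts the key column to float before sorting.

```python
import numpy as np

# Hypothetical rows in the same byte-string layout as the loaded dataset.
rows = np.array([[b'5.1', b'3.5', b'Iris-setosa'],
                 [b'10.2', b'3.0', b'hypothetical'],
                 [b'4.9', b'3.0', b'Iris-setosa']], dtype=object)

# Byte-wise ordering: b'10.2' sorts before b'4.9' because b'1' < b'4'.
bytewise_order = rows[:, 0].argsort()

# Numeric ordering: convert the key column to float first, then argsort.
numeric_order = rows[:, 0].astype(float).argsort()

sorted_rows = rows[numeric_order]
print(bytewise_order.tolist())
print(numeric_order.tolist())
print(sorted_rows[:, 0].tolist())
```

Only the sort key is converted; the rows themselves keep their original byte-string values.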

Find the most frequent value in a NumPy array

import numpy as np

# Find the most frequent petallength value (third column, index 2) in the iris dataset
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
vals, counts = np.unique(iris[:, 2], return_counts=True)
# print(np.argmax(counts))  # index of the largest count
print(vals[np.argmax(counts)])

Output:

b'1.5'
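The same pattern works on a numeric array, which avoids the byte-string result. This is a minimal sketch on a hypothetical `petal_length` array; `mode` is an illustrative name. When several values tie for the highest count, `np.argmax` returns the first of them, which is the smallest value because `np.unique` returns its values sorted.

```python
import numpy as np

# Hypothetical petal lengths; 1.5 appears three times, 1.4 twice, 4.7 once.
petal_length = np.array([1.4, 1.5, 1.5, 4.7, 1.4, 1.5])

# Distinct values (sorted) and how often each one occurs.
vals, counts = np.unique(petal_length, return_counts=True)

# The index of the largest count selects the most frequent value.
mode = vals[np.argmax(counts)]
print(mode)
```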

Find the position of the first value greater than a given value

import numpy as np

# Find the position of the first petalwidth value (fourth column, index 3) greater than 1.0
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
print(np.argwhere(iris[:, 3].astype(float) > 1.0)[0])
# print(np.argwhere(iris[:, 3].astype(float) > 1.0))  # returns an (N, 1) array, one row per match
# print(np.where(iris[:, 3].astype(float) > 1.0))     # returns a tuple containing one index array

Output:

[50]
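Indexing the result of `np.argwhere` with `[0]` raises an `IndexError` when no element exceeds the threshold. One possible alternative, sketched below, uses `np.argmax` on the boolean mask and handles the no-match case explicitly. The helper `first_index_above` and the sample `petal_width` list are hypothetical and not part of the original exercise.

```python
import numpy as np

def first_index_above(values, threshold):
    """Return the index of the first value greater than threshold, or None."""
    mask = np.asarray(values, dtype=float) > threshold
    # np.argmax returns 0 for an all-False mask, so check for a match first.
    if not mask.any():
        return None
    # On a boolean array, argmax gives the position of the first True.
    return int(np.argmax(mask))

# Hypothetical petal widths.
petal_width = [0.2, 0.4, 0.6, 1.4, 1.5, 0.3]
first_hit = first_index_above(petal_width, 1.0)
no_hit = first_index_above(petal_width, 9.0)
print(first_hit, no_hit)
```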

Replace all values beyond a given cutoff with the cutoff value

import numpy as np

# In array a, replace every element greater than 30 with 30 and every element less than 10 with 10
np.set_printoptions(precision=2)
np.random.seed(100)
# Generate 20 random floats drawn uniformly from the range 1 to 50
a = np.random.uniform(1, 50, 20)
print(a)
# [27.63 14.64 21.8  42.39  1.23  6.96 33.87 41.47  7.7  29.18 44.67 11.25
#  10.08  6.31 11.77 48.95 40.77  9.43 41.   14.43]

# Solution 1: Using np.clip
print(np.clip(a, a_min=10, a_max=30))
# [27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
#  10.08 10.   11.77 30.   30.   10.   30.   14.43]

# Solution 2: Using np.where
print(np.where(a < 10, 10, np.where(a > 30, 30, a)))
# [27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
#  10.08 10.   11.77 30.   30.   10.   30.   14.43]

The three print calls produce, in order:

[27.63 14.64 21.8  42.39  1.23  6.96 33.87 41.47  7.7  29.18 44.67 11.25
 10.08  6.31 11.77 48.95 40.77  9.43 41.   14.43]
[27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
 10.08 10.   11.77 30.   30.   10.   30.   14.43]
[27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
 10.08 10.   11.77 30.   30.   10.   30.   14.43]
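A third way to apply the same cutoffs is boolean-mask assignment on a copy of the array. The sketch below is an illustration under assumptions, not part of the original exercise: it uses a short hand-written array (a few values taken from the output above plus the boundary values 10 and 30) and checks that the result matches `np.clip`.

```python
import numpy as np

# Short hand-written array, including values exactly on the two cutoffs.
a = np.array([27.63, 42.39, 1.23, 9.43, 30.0, 10.0])

# Reference result: positional arguments are the lower and upper bound.
clipped = np.clip(a, 10, 30)

# Boolean-mask assignment on a copy, so the original array stays unchanged.
masked = a.copy()
masked[masked < 10] = 10
masked[masked > 30] = 30

print(clipped)
print(np.array_equal(clipped, masked))
```

Values that already sit exactly on a cutoff are left unchanged by both approaches.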
In [16]:
import numpy as np

# In array a, replace every element greater than 30 with 30 and every element less than 10 with 10
np.set_printoptions(precision=2)
np.random.seed(100)
# Generate 20 random floats drawn uniformly from the range 1 to 50
a = np.random.uniform(1, 50, 20)
print(a)
# [27.63 14.64 21.8  42.39  1.23  6.96 33.87 41.47  7.7  29.18 44.67 11.25
#  10.08  6.31 11.77 48.95 40.77  9.43 41.   14.43]

# Solution 1: Using np.clip
print(np.clip(a, a_min=10, a_max=30))
# [27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
#  10.08 10.   11.77 30.   30.   10.   30.   14.43]
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit">​</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># Solution 2: Using np.where</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#008000">print</span>(<span style="color:#000000">np</span>.where(<span style="color:#000000">a</span> <span style="color:#aa22ff"><strong><</strong></span> <span style="color:#008800">10</span>, <span style="color:#008800">10</span>, <span style="color:#000000">np</span>.where(<span style="color:#000000">a</span> <span style="color:#aa22ff"><strong>></strong></span> <span style="color:#008800">30</span>, <span style="color:#008800">30</span>, <span style="color:#000000">a</span>)))</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># </em></span><span style="color:#00bb00"><em>[</em></span><span style="color:#408080"><em>27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em>#  10.08 10.   11.77 30.   30.   10.   30.   14.43</em></span><span style="color:#00bb00"><em>]</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit">​</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit">​</span></span></span></span></span>
[27.63 14.64 21.8  42.39  1.23  6.96 33.87 41.47  7.7  29.18 44.67 11.25
 10.08  6.31 11.77 48.95 40.77  9.43 41.   14.43]
[27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
 10.08 10.   11.77 30.   30.   10.   30.   14.43]
[27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
 10.08 10.   11.77 30.   30.   10.   30.   14.43]
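
As a cross-check, both solutions can be compared with a third variant, boolean-mask assignment on a copy of the array. This is a minimal sketch and not part of the original exercise; the names b, clipped, nested, and same are illustrative.

```python
import numpy as np

np.random.seed(100)
a = np.random.uniform(1, 50, 20)

# Work on a copy so the original array stays untouched.
b = a.copy()
b[b < 10] = 10   # raise everything below the lower bound
b[b > 30] = 30   # cap everything above the upper bound

# Compare against the two solutions from the exercise.
clipped = np.clip(a, 10, 30)
nested = np.where(a < 10, 10, np.where(a > 30, 30, a))
same = np.array_equal(b, clipped) and np.array_equal(b, nested)
print(same)
```

The mask version modifies an array in place, which can be handy when no new array is wanted; np.clip is usually the shortest to read.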

Create group IDs from a given categorical variable

import numpy as np

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
species = np.genfromtxt(url, delimiter=',', dtype='str', usecols=[4])
np.random.seed(100)  # random seed
species_small = np.sort(np.random.choice(species, size=20))  # sample 20 labels, then sort them

# Method 1: list comprehension
# output = [np.argwhere(np.unique(species_small) == s).tolist()[0][0] for val in np.unique(species_small) for s in species_small[species_small==val]]

# Method 2: explicit loops
output = []
uniqs = np.unique(species_small)
for val in uniqs:  # each unique value in the sample
    for s in species_small[species_small == val]:  # each element that has this value
        groupid = np.argwhere(uniqs == s).tolist()[0][0]  # the group ID
        output.append(groupid)
print(output)

Output:

[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
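
The nested loops can also be replaced by one vectorized call. The sketch below is an alternative that the original does not use: np.unique with return_inverse=True returns, for every element, the index of its value in the sorted array of unique values, which serves as the group ID. The five-label array is a hypothetical stand-in for species_small so that the snippet runs without network access.

```python
import numpy as np

# Hypothetical stand-in for species_small (already sorted, like the original).
species_small = np.array([
    'Iris-setosa', 'Iris-setosa',
    'Iris-versicolor',
    'Iris-virginica', 'Iris-virginica',
])

# uniqs holds the sorted unique labels; group_ids[i] is the position of
# species_small[i] inside uniqs.
uniqs, group_ids = np.unique(species_small, return_inverse=True)
print(uniqs)
print(group_ids.tolist())
```

Unlike the loop version, this also works on unsorted input, because each ID is returned in the position of its element.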

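III. Experiment Summary

The main purpose of this experiment is twofold: to become more fluent with numpy and, more importantly, to get a first sense of what data processing involves by applying numpy to the iris data. Many of the methods used here will come up again and again in later data analysis work.

IV. Questions and Exercises

In this experiment we used numpy to process the iris data, covering correlation, normalization, missing-value handling, and more. Can you think of other processing methods? Try them out and verify them.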

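Summary:

1. Learned what the numpy library provides

1. ndarray, a fast and space-efficient multidimensional array with vectorized arithmetic and sophisticated broadcasting.
2. Standard mathematical functions for fast operations on entire arrays, with no need to write loops.
3. Tools for reading and writing data on disk and for working with memory-mapped files.
4. Linear algebra, random number generation, and Fourier transform capabilities.
5. A C API for integrating code written in C, C++, Fortran, and other languages.

2. Learned how to load and process data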

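Core function 1: genfromtxt reads data directly from a URL.

iris_1d = np.genfromtxt(url, delimiter=',', dtype=None)

Core function 2: np.array() creates an array. Here it extracts the fifth column (index 4, the species) from iris_1d and prints the first five values.

species = np.array([row[4] for row in iris_1d])
print(species[:5])

Core function 3: row.tolist()[:4] takes the first four columns of each row.

iris_2d = np.array([row.tolist()[:4] for row in iris_1d])

Core function 4: usecols=[0, 1, 2, 3] reads only the first four columns of each row from the data source.

iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0, 1, 2, 3])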
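
To try the usecols approach without a download, the sketch below feeds np.genfromtxt an in-memory file. It assumes three hand-typed rows in the same comma-separated layout as iris.data; the name sample is illustrative and not from the original.

```python
import io
import numpy as np

# Three rows typed in by hand, in the same layout as iris.data (assumption).
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
)

# Numeric part: only the first four columns, parsed as floats.
iris_2d = np.genfromtxt(sample, delimiter=',', dtype='float', usecols=[0, 1, 2, 3])

# Rewind the buffer and read the text column separately.
sample.seek(0)
species = np.genfromtxt(sample, delimiter=',', dtype='str', usecols=[4])

print(iris_2d.shape)
print(species)
```

Reading the numeric and text columns separately gives a plain 2-D float array, which avoids the per-row conversion used in core function 3.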

Core function 5: np.mean(), np.median(), and np.std() return the mean, median, and standard deviation of a given variable.

mu, med, sd = np.mean(sepallength), np.median(sepallength), np.std(sepallength)
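
One detail worth knowing here: np.std computes the population standard deviation by default (ddof=0), and ddof=1 gives the sample version. The sketch below uses made-up numbers rather than iris data; the names values and sd_sample are illustrative.

```python
import numpy as np

# Made-up values chosen so the results are easy to verify by hand.
values = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)

mu, med, sd = np.mean(values), np.median(values), np.std(values)
sd_sample = np.std(values, ddof=1)  # divides by n - 1 instead of n

print(mu, med, sd)
print(sd_sample)
```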

Core function 6: .max() and .min() return the largest and smallest values of the data, which can be used to normalize an array.

Smax, Smin = sepallength.max(), sepallength.min()
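
A minimal sketch of the normalization step, assuming min-max scaling to the range 0 to 1. The five values are a hypothetical stand-in for the sepallength column, and the name scaled is illustrative.

```python
import numpy as np

# Hypothetical stand-in for the sepal length column.
sepallength = np.array([5.1, 4.9, 7.0, 6.4, 5.8])

Smax, Smin = sepallength.max(), sepallength.min()

# Min-max scaling: the smallest value maps to 0 and the largest to 1.
scaled = (sepallength - Smin) / (Smax - Smin)
print(scaled)
```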

Core function 7: np.percentile(sepallength, q=...) returns the requested percentiles of a variable.

print(np.percentile(sepallength, q=[5, 95]))
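
As an illustration, the sketch below computes the 5th and 95th percentiles of the integers 1 to 100 and keeps only the values between them. The data and the names values, low, high, and inside are made up for this example; np.percentile interpolates linearly between neighbouring data points by default.

```python
import numpy as np

values = np.arange(1, 101, dtype=float)  # 1.0, 2.0, ..., 100.0

# Passing a list for q returns one percentile per entry.
low, high = np.percentile(values, q=[5, 95])

# Keep only the values that fall inside the 5%-95% band.
inside = values[(values >= low) & (values <= high)]
print(low, high, inside.size)
```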

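Core function 8: random.choice() draws 20 random samples (here i is the array, or integer range, to sample from).

np.random.choice((i), 20)

Core function 9: random.randint() generates random row and column indices, used here to insert missing values (np.nan) at random positions.

iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan

Core function 10: ~np.any(np.isnan(row)) is True for rows that contain no missing values, so it can be used to select the complete rows.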

any_nan_in_row = np.array([~np.any(np.isnan(row)) for row in iris_2d])
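
The same row filter can be written without a Python loop by applying np.isnan to the whole array and reducing along axis 1. This is a sketch on a small made-up array that stands in for iris_2d; the names data, has_nan, and complete_rows are illustrative.

```python
import numpy as np

# Small stand-in for iris_2d with two missing values placed by hand.
data = np.arange(20, dtype=float).reshape(5, 4)
data[1, 2] = np.nan
data[3, 0] = np.nan

# True for every row that contains at least one NaN.
has_nan = np.isnan(data).any(axis=1)

# Invert the mask to keep only the complete rows.
complete_rows = data[~has_nan]

print(np.flatnonzero(has_nan))
print(complete_rows.shape)
```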

Core function 11: np.corrcoef() computes the correlation coefficient between two columns of data.

print(np.corrcoef(iris_2d[:, 0], iris_2d[:, 2])[0, 1])
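
np.corrcoef returns a 2x2 matrix for two inputs, and the [0, 1] entry is the Pearson correlation between them. The sketch below checks that entry against the textbook formula on made-up data; x, y, and r_manual are illustrative names, not from the original.

```python
import numpy as np

# Made-up, nearly linear data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r = np.corrcoef(x, y)[0, 1]

# Pearson correlation written out: covariance over the product of spreads.
xc, yc = x - x.mean(), y - y.mean()
r_manual = (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())

print(r, r_manual)
```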

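Core function 12: np.isnan(...).any() checks whether an array contains any missing values.

print(np.isnan(iris_2d).any())

Core function 13: np.unique() returns the unique values and, with return_counts=True, how often each one occurs.

u, counts = np.unique(species, return_counts=True)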
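
A short sketch of how the two return values fit together, using a hypothetical five-element species array. The names most_common and as_dict are illustrative.

```python
import numpy as np

# Hypothetical labels standing in for the species column.
species = np.array([
    'Iris-setosa', 'Iris-virginica', 'Iris-setosa',
    'Iris-versicolor', 'Iris-setosa',
])

# u is sorted; counts[i] is the number of occurrences of u[i].
u, counts = np.unique(species, return_counts=True)

most_common = u[np.argmax(counts)]
as_dict = dict(zip(u.tolist(), counts.tolist()))

print(most_common)
print(as_dict)
```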

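Core function 14: np.digitize() assigns each number to a bin and returns the bin index; the indices can then be mapped to text labels.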

petallength = np.digitize(iris[:, 2].astype('float'), [0, 3, 5, 10])
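
The sketch below shows the bin indices that np.digitize returns for the same bin edges, and one possible way to turn them into text. The sample values and the label names small, medium, and large are assumptions for illustration. With the default right=False, a value equal to a bin edge falls into the bin that starts at that edge.

```python
import numpy as np

# Made-up petal lengths, including two values that sit exactly on an edge.
petal = np.array([1.4, 4.7, 6.0, 3.0, 5.0])
bins = [0, 3, 5, 10]

# Index i means bins[i-1] <= value < bins[i].
idx = np.digitize(petal, bins)

# Map bin indices 1, 2, 3 to labels (label names are an assumption).
names = np.array(['small', 'medium', 'large'])
labels = names[idx - 1]

print(idx.tolist())
print(labels.tolist())
```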

Core function 15: sort the two-dimensional array iris by its first column.

print(iris[iris[:, 0].argsort()])
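
The idea is that argsort returns the row order that would sort the chosen column, and indexing with that order rearranges whole rows. A minimal sketch on a made-up three-row table; table, order, and sorted_rows are illustrative names.

```python
import numpy as np

# Made-up table: each row is (sepal length, sepal width).
table = np.array([
    [5.1, 3.5],
    [4.9, 3.0],
    [7.0, 3.2],
])

# Positions of the rows in ascending order of column 0.
order = table[:, 0].argsort()

# Fancy indexing with that order keeps each row intact.
sorted_rows = table[order]

print(order.tolist())
print(sorted_rows)
```

Reversing the order with order[::-1] gives a descending sort.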

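Core function 16: argwhere() finds the positions in a column where the value exceeds a threshold; indexing the result with [0] gives the first one.

print(np.argwhere(iris[:, 3].astype(float) > 1.0)[0])

Core function 17: np.clip() limits values to a given range, replacing anything outside it with the nearest bound.

print(np.clip(a, a_min=10, a_max=30))

Core function 18: creating group IDs.

output = [np.argwhere(np.unique(species_small) == s).tolist()[0][0] for val in np.unique(species_small) for s in species_small[species_small==val]]

Taken together, practicing and rewriting these snippets repeatedly will improve your ability to load and process data with the numpy library.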
