Python Cool Library Tour - Third-Party Library Pandas (044)

I. Usage in Detail

151. pandas.Series.any method

151-1. Syntax

# 151. pandas.Series.any method
pandas.Series.any(*, axis=0, bool_only=False, skipna=True, **kwargs)
Return whether any element is True, potentially over an axis.

Returns False unless there is at least one element within a series or along a Dataframe axis that is True or equivalent (e.g. non-zero or non-empty).

Parameters:
axis
{0 or ‘index’, 1 or ‘columns’, None}, default 0
Indicate which axis or axes should be reduced. For Series this parameter is unused and defaults to 0.

0 / ‘index’ : reduce the index, return a Series whose index is the original column labels.

1 / ‘columns’ : reduce the columns, return a Series whose index is the original index.

None : reduce all axes, return a scalar.

bool_only
bool, default False
Include only boolean columns. Not implemented for Series.

skipna
bool, default True
Exclude NA/null values. If the entire row/column is NA and skipna is True, then the result will be False, as for an empty row/column. If skipna is False, then NA are treated as True, because these are not equal to zero.

**kwargs
any, default None
Additional keywords have no effect but might be accepted for compatibility with NumPy.

Returns:
scalar or Series
If level is specified, then, Series is returned; otherwise, scalar is returned.

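151-2. Parameters

151-2-1. axis (optional, default 0): Has no practical effect on a Series, because a Series is one-dimensional.

151-2-2. bool_only (optional, default False): For a DataFrame, restricts the check to boolean columns. It is not implemented for Series, so leave it at the default.

151-2-3. skipna (optional, default True): If True, NA/null values are skipped, and an all-NA Series yields False. If False, NA values are treated as True because they are not equal to zero.

151-2-4. **kwargs (optional): Additional keyword arguments. They have no effect and are accepted only for compatibility with NumPy.

151-3. Function

Checks whether a Series contains at least one element that is True or truthy (for example, non-zero or non-empty). Returns True if such an element exists and False otherwise.

151-4. Return value

A boolean scalar: True if any element of the Series is True or truthy, otherwise False. An empty Series yields False.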
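
The effect of skipna and of an empty Series can be checked directly. The following is a minimal sketch; the variable names are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# An empty Series has no truthy element.
empty_result = pd.Series([], dtype="float64").any()

# By default NaN is skipped, so an all-NaN Series behaves like an empty one.
all_nan_skipped = pd.Series([np.nan]).any()

# With skipna=False, NaN counts as truthy because it is not equal to zero.
all_nan_kept = pd.Series([np.nan]).any(skipna=False)

print(empty_result, all_nan_skipped, all_nan_kept)
```

Wrapping the result in bool() is a simple way to obtain a plain Python bool if a NumPy boolean is not wanted.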

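151-5. Notes

Typical use cases:

151-5-1. Data validation: confirm during analysis that a condition holds for at least one value.

151-5-2. Condition checks: test whether any element of a dataset satisfies a given condition.

151-5-3. Data cleaning: check whether a dataset contains any non-null value, or any value that meets a given criterion.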
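
These use cases usually combine any() with a boolean mask. A small sketch, assuming a hypothetical Series of temperature readings (the data and thresholds are made up for illustration):

```python
import pandas as pd

# Hypothetical sensor readings; None becomes NaN.
temperatures = pd.Series([21.5, 19.0, None, 35.2])

has_missing = temperatures.isna().any()      # is any reading missing?
has_heat_alert = (temperatures > 35).any()   # does any reading exceed 35?
has_negative = (temperatures < 0).any()      # does any reading fall below 0?

print(has_missing, has_heat_alert, has_negative)
```

Comparisons against NaN evaluate to False, so the missing reading does not trigger either threshold check.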

151-6. Usage

151-6-1. Data preparation

None

151-6-2. Code example

# 151. pandas.Series.any method
import pandas as pd
data = pd.Series([False, False, True, 0, 1])
result = data.any()
print(result)

151-6-3. Output

# 151. pandas.Series.any method
# True

152. pandas.Series.autocorr method

152-1. Syntax

# 152. pandas.Series.autocorr method
pandas.Series.autocorr(lag=1)
Compute the lag-N autocorrelation.

This method computes the Pearson correlation between the Series and its shifted self.

Parameters:
lag
int, default 1
Number of lags to apply before performing autocorrelation.

Returns:
float
The Pearson correlation between self and self.shift(lag).

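152-2. Parameters

152-2-1. lag (optional, default 1): The number of periods to shift the Series by before correlating, that is, how many time steps back each value is compared with.

152-3. Function

Computes the autocorrelation of a time series at the given lag: the Pearson correlation between the Series and a copy of itself shifted by lag periods.

152-4. Return value

A float: the autocorrelation coefficient of the Series at the given lag.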
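
Since the documentation describes the result as the Pearson correlation between self and self.shift(lag), the same number can be reproduced by hand. A minimal sketch using the sample data from this section:

```python
import pandas as pd

data = pd.Series([0.1, 0.4, 0.3, 0.7, 0.9])
lag = 1

# Built-in autocorrelation.
via_autocorr = data.autocorr(lag=lag)

# Manual equivalent: correlate the Series with its shifted copy.
# The NaN introduced by shift() is excluded by corr().
via_corr = data.corr(data.shift(lag))

print(via_autocorr, via_corr)
```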

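152-5. Notes

Typical use cases:

152-5-1. Time series analysis: autocorrelation analysis is a core part of time series analysis and helps reveal patterns and trends in the data.

152-5-2. Forecasting models: data with high autocorrelation may be predictable and can be used to build forecasting models.

152-5-3. Signal processing: in signal processing and control systems, autocorrelation is used to analyze delay effects in a signal.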
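
One way to see how autocorrelation exposes a repeating pattern is to compare several lags on a periodic signal. The sketch below uses a made-up signal with period 4; it is an illustration, not a full periodicity test.

```python
import pandas as pd

# Hypothetical signal that repeats every 4 steps.
signal = pd.Series([1, 2, 3, 4] * 5)

lag_2 = signal.autocorr(lag=2)  # half a period: values move in opposite directions
lag_4 = signal.autocorr(lag=4)  # a full period: values line up with themselves

print(lag_2, lag_4)
```

A coefficient close to 1 at a given lag suggests that the series repeats with that period, while a negative coefficient indicates that values at that distance tend to move in opposite directions.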

152-6. Usage

152-6-1. Data preparation

None

152-6-2. Code example

# 152. pandas.Series.autocorr method
import pandas as pd
data = pd.Series([0.1, 0.4, 0.3, 0.7, 0.9])
result = data.autocorr(lag=1)
print(result)

152-6-3. Output

# 152. pandas.Series.autocorr method
# 0.6657502859356824

153. pandas.Series.between method

153-1. Syntax

# 153. pandas.Series.between method
pandas.Series.between(left, right, inclusive='both')
Return boolean Series equivalent to left <= series <= right.

This function returns a boolean vector containing True wherever the corresponding Series element is between the boundary values left and right. NA values are treated as False.

Parameters:
left
scalar or list-like
Left boundary.

right
scalar or list-like
Right boundary.

inclusive
{“both”, “neither”, “left”, “right”}
Include boundaries. Whether to set each bound as closed or open.

Changed in version 1.3.0.

Returns:
Series
Series representing whether each element is between left and right (inclusive).

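153-2. Parameters

153-2-1. left (required): The left boundary of the interval.

153-2-2. right (required): The right boundary of the interval.

153-2-3. inclusive (optional, default 'both'): Specifies which boundaries are included. It takes one of the following four values:

153-2-3-1. 'both': include both left and right.

153-2-3-2. 'neither': exclude both left and right.

153-2-3-3. 'left': include left but exclude right.

153-2-3-4. 'right': include right but exclude left.

153-3. Function

Tests whether each element of the Series lies between left and right, and returns a boolean Series. NA values are treated as False.

153-4. Return value

A boolean pandas.Series in which each value indicates whether the corresponding element of the original Series falls within the specified interval.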
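
The four inclusive options differ only at the boundaries, which a side-by-side comparison makes visible. A minimal sketch with sample values chosen to sit exactly on both boundaries and to include one missing value:

```python
import pandas as pd

values = pd.Series([1, 5, None, 10])

both = values.between(5, 10)                           # 5 <= x <= 10
left_only = values.between(5, 10, inclusive="left")    # 5 <= x < 10
right_only = values.between(5, 10, inclusive="right")  # 5 < x <= 10
neither = values.between(5, 10, inclusive="neither")   # 5 < x < 10

comparison = pd.DataFrame(
    {"both": both, "left": left_only, "right": right_only, "neither": neither}
)
print(comparison)
```

The row for the missing value is False in every column, which matches the documented rule that NA values are treated as False.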

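153-5. Notes

Typical use cases:

153-5-1. Data filtering: quickly select the values that satisfy a range condition.

153-5-2. Conditional logic: build range conditions for more complex checks and filters during data processing.

153-5-3. Data analysis: select the data points that fall within a specific range.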
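
Filtering is typically done by using the boolean result as a mask. A short sketch, assuming a hypothetical Series of ages indexed by name:

```python
import pandas as pd

# Hypothetical data: ages indexed by person.
ages = pd.Series([15, 22, 37, 64, 70], index=["Ann", "Bob", "Cid", "Dee", "Eve"])

# Keep only the rows whose value lies in the closed interval [18, 64].
working_age = ages[ages.between(18, 64)]

print(working_age)
```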

153-6. Usage

153-6-1. Data preparation

None

153-6-2. Code example

# 153. pandas.Series.between method
import pandas as pd
data = pd.Series([3, 5, 6, 8, 10, 11, 24])
result = data.between(5, 11, 'neither')
print(result)

153-6-3. Output

# 153. pandas.Series.between method
# 0    False
# 1    False
# 2     True
# 3     True
# 4     True
# 5    False
# 6    False
# dtype: bool

154. pandas.Series.clip method

154-1. Syntax

# 154. pandas.Series.clip method
pandas.Series.clip(lower=None, upper=None, *, axis=None, inplace=False, **kwargs)
Trim values at input threshold(s).

Assigns values outside boundary to boundary values. Thresholds can be singular values or array like, and in the latter case the clipping is performed element-wise in the specified axis.

Parameters:
lower
float or array-like, default None
Minimum threshold value. All values below this threshold will be set to it. A missing threshold (e.g NA) will not clip the value.

upper
float or array-like, default None
Maximum threshold value. All values above this threshold will be set to it. A missing threshold (e.g NA) will not clip the value.

axis
{0 or ‘index’, 1 or ‘columns’, None}, default None
Align object with lower and upper along the given axis. For Series this parameter is unused and defaults to None.

inplace
bool, default False
Whether to perform the operation in place on the data.

*args, **kwargs
Additional keywords have no effect but might be accepted for compatibility with numpy.

Returns:
Series or DataFrame or None
Same type as calling object with the values outside the clip boundaries replaced or None if inplace=True.

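154-2. Parameters

154-2-1. lower (optional, default None): A scalar or array-like lower bound. If None, no lower bound is applied.

154-2-2. upper (optional, default None): A scalar or array-like upper bound. If None, no upper bound is applied.

154-2-3. axis (optional, default None): Unused for Series.

154-2-4. inplace (optional, default False): If True, the original Series is modified directly instead of a new Series being returned.

154-2-5. **kwargs (optional): Additional keyword arguments. They have no effect and are accepted only for compatibility with NumPy.

154-3. Function

Restricts the values of a Series to the range given by the lower and upper bounds: a value below lower is set to lower, and a value above upper is set to upper.

154-4. Return value

A new pandas.Series whose values have been restricted to the specified range, or None if inplace=True.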
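
Either bound can be omitted, and a bound can also be array-like so that each element gets its own threshold. A minimal sketch; the per-element caps are made-up values for illustration:

```python
import pandas as pd

data = pd.Series([-3, 0, 7, 12])

# One-sided clipping: only a floor, or only a cap.
floor_only = data.clip(lower=0)
cap_only = data.clip(upper=10)

# Element-wise clipping: each value is capped by the threshold at the same index.
per_element_caps = pd.Series([1, 1, 5, 20])
capped_elementwise = data.clip(upper=per_element_caps)

print(floor_only.tolist(), cap_only.tolist(), capped_elementwise.tolist())
```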

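154-5. Notes

Typical use cases:

154-5-1. Handling outliers: cap extreme values so that they do not distort the analysis.

154-5-2. Data cleaning: restrict data to a reasonable range to safeguard data quality.

154-5-3. Preprocessing: in machine learning, keep input data within a reasonable range before training.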
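
For outlier handling, the bounds are often derived from the data itself. The sketch below caps values at the 5th and 95th percentiles; the choice of these percentiles is an assumption made for illustration, not a recommendation from the original text.

```python
import pandas as pd

# Hypothetical data in which 100 is an outlier.
data = pd.Series([1, 2, 3, 4, 100])

# Derive the bounds from the data (percentile choice is illustrative).
lower_bound = data.quantile(0.05)
upper_bound = data.quantile(0.95)

# Values outside [lower_bound, upper_bound] are pulled back to the bounds.
trimmed = data.clip(lower=lower_bound, upper=upper_bound)

print(trimmed)
```

Values that already lie inside the bounds are left unchanged, so only the tails of the distribution are affected.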

154-6. Usage

154-6-1. Data preparation

None

154-6-2. Code example

# 154. pandas.Series.clip method
import pandas as pd
data = pd.Series([1, 2, 3, 4, 5])
clipped_data = data.clip(lower=2, upper=4)
print(clipped_data, end='\n\n')
data.clip(lower=2, upper=4, inplace=True)
print(data)

154-6-3. Output

# 154. pandas.Series.clip method
# 0    2
# 1    2
# 2    3
# 3    4
# 4    4
# dtype: int64
#
# 0    2
# 1    2
# 2    3
# 3    4
# 4    4
# dtype: int64

155. pandas.Series.corr method

155-1. Syntax

# 155. pandas.Series.corr method
pandas.Series.corr(other, method='pearson', min_periods=None)
Compute correlation with other Series, excluding missing values.

The two Series objects are not required to be the same length and will be aligned internally before the correlation function is applied.

Parameters:
other
Series
Series with which to compute the correlation.

method
{‘pearson’, ‘kendall’, ‘spearman’} or callable
Method used to compute correlation:

pearson : Standard correlation coefficient

kendall : Kendall Tau correlation coefficient

spearman : Spearman rank correlation

callable: Callable with input two 1d ndarrays and returning a float.

Warning

Note that the returned matrix from corr will have 1 along the diagonals and will be symmetric regardless of the callable’s behavior.

min_periods
int, optional
Minimum number of observations needed to have a valid result.

Returns:
float
Correlation with other.

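155-2. Parameters

155-2-1. other (required): The other Series with which the correlation is computed.

155-2-2. method (optional, default 'pearson'): The method used to compute the correlation coefficient. It can also be a callable that takes two 1-D ndarrays and returns a float. The named options are:

155-2-2-1. 'pearson': the Pearson correlation coefficient, the most commonly used one, which measures linear relationships.

155-2-2-2. 'kendall': the Kendall rank correlation coefficient, which measures rank correlation between the Series.

155-2-2-3. 'spearman': the Spearman rank correlation coefficient, which measures monotonic relationships between the Series.

155-2-3. min_periods (optional, default None): The minimum number of non-NA observations required for a valid result. If fewer valid data points are available, NaN is returned.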
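
The NaN fallback of min_periods can be seen by raising the threshold above the number of valid pairs. A minimal sketch reusing the sample data from this section, where one value is missing and only four pairs are usable:

```python
import pandas as pd

data1 = pd.Series([1, 2, 3, 4, None])
data2 = pd.Series([5, 4, 3, 2, 1])

# Four valid pairs are available, so a threshold of 4 is satisfied.
enough = data1.corr(data2, min_periods=4)

# A threshold of 5 cannot be met, so the result is NaN.
too_few = data1.corr(data2, min_periods=5)

print(enough, too_few)
```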

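155-3. Function

Computes the correlation coefficient between two Series, excluding missing values. The Pearson coefficient measures the strength of a linear relationship, while the Kendall and Spearman coefficients are rank-based and measure monotonic association.

155-4. Return value

A float: the correlation coefficient between the two Series.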
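
The difference between a linear and a monotonic measure shows up on data that is monotonic but not linear. The sketch below illustrates the idea behind Spearman's coefficient by applying the default Pearson method to the ranks; this rank-based shortcut is used here for illustration and is not how the original examples call the method.

```python
import pandas as pd

x = pd.Series([1, 2, 3, 4, 5])
y = x ** 3  # strictly increasing, but not linear in x

# Pearson on the raw values: high, but below 1 because the relation is curved.
pearson = x.corr(y)

# Pearson on the ranks: the ranks of x and y are identical, so the result is 1.
rank_based = x.rank().corr(y.rank())

print(pearson, rank_based)
```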

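155-5. Notes

Typical use cases:

155-5-1. Data analysis: assess the linear relationship between two variables.

155-5-2. Feature selection: in machine learning, select the features that are highly correlated with the target variable.

155-5-3. Financial analysis: assess the relationships between different financial assets.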
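
For feature selection, one simple approach is to rank candidate features by the absolute value of their correlation with the target. The following sketch uses made-up feature names and values; real feature selection usually involves further checks.

```python
import pandas as pd

# Hypothetical target and candidate features.
target = pd.Series([2, 4, 6, 8, 10])
features = {
    "feature_a": pd.Series([1, 2, 3, 4, 5]),
    "feature_b": pd.Series([2, 1, 4, 3, 5]),
    "feature_c": pd.Series([3, 5, 4, 1, 2]),
}

# Pearson correlation of each feature with the target.
scores = pd.Series({name: values.corr(target) for name, values in features.items()})

# Rank by strength of the relationship, ignoring its sign.
ranked = scores.abs().sort_values(ascending=False)

print(scores)
print(ranked.index.tolist())
```

Taking the absolute value keeps strongly negative correlations in consideration, since they can be as informative as positive ones.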

155-6. Usage

155-6-1. Data preparation

None

155-6-2. Code example

# 155. pandas.Series.corr method
# 155-1. Compute the Pearson correlation coefficient
import pandas as pd
data1 = pd.Series([1, 2, 3, 4, 5])
data2 = pd.Series([5, 4, 3, 2, 1])
corr_pearson = data1.corr(data2)
print(corr_pearson, end='\n\n')

# 155-2. Compute the Spearman rank correlation coefficient
import pandas as pd
data1 = pd.Series([1, 2, 3, 4, 5])
data2 = pd.Series([5, 4, 3, 2, 1])
corr_spearman = data1.corr(data2, method='spearman')
print(corr_spearman, end='\n\n')

# 155-3. Specify the minimum number of valid observations
import pandas as pd
data1 = pd.Series([1, 2, 3, 4, None])
data2 = pd.Series([5, 4, 3, 2, 1])
corr_with_min_periods = data1.corr(data2, min_periods=4)
print(corr_with_min_periods)

155-6-3. Output

# 155. pandas.Series.corr method
# 155-1. Compute the Pearson correlation coefficient
# -0.9999999999999999

# 155-2. Compute the Spearman rank correlation coefficient
# -0.9999999999999999

# 155-3. Specify the minimum number of valid observations
# -1.0