A Tour of Cool Python Libraries - Third-Party Library Pandas (059)

I. Usage in Detail

226. The pandas.Series.pad method

226-1. Syntax

# 226. The pandas.Series.pad method
pandas.Series.pad(*, axis=None, inplace=False, limit=None, downcast=_NoDefault.no_default)
Fill NA/NaN values by propagating the last valid observation to next valid.

Deprecated since version 2.0: Series/DataFrame.pad is deprecated. Use Series/DataFrame.ffill instead.

Returns:
Series/DataFrame or None
Object with missing values filled or None if inplace=True.

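226-2. Parameters

226-2-1. axis (optional, default None): The axis to operate along. It has little effect for a Series, which is one-dimensional, so it normally does not need to be specified.

226-2-2. inplace (optional, default False): If True, the Series is modified in place and the method returns None instead of a new object.

226-2-3. limit (optional, default None): An integer giving the maximum number of consecutive missing values to fill. In a run of missing values longer than limit, only the first limit values are filled and the rest remain NaN.

226-2-4. downcast (optional): Controls conversion of the result to a smaller data type where possible. You normally do not need to set this parameter.

226-3. Function

Fills missing values (such as NaN) in a Series with the most recent preceding valid value. This is particularly useful for time series data, where it keeps the data continuous.

226-4. Return value

Returns a Series in which each missing value has been replaced by the preceding valid value; leading missing values with no valid value before them stay missing. If inplace=True, the original Series is modified directly and None is returned.

226-5. Notes

Behaves the same as pandas.Series.ffill. Because pad is deprecated since pandas 2.0, ffill is the preferred spelling in new code.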
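
As a minimal sketch of the forward-fill behavior described in this section, the following uses ffill (the documented replacement for pad) and shows the effect of the limit parameter. The sample values are illustrative assumptions, not data from the original example.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): one gap of three consecutive NaN values
s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

# Forward-fill every missing value with the last valid observation
filled_all = s.ffill()

# Fill at most one missing value per gap; the rest of the gap stays NaN
filled_limited = s.ffill(limit=1)

print(filled_all.tolist())
print(filled_limited.tolist())
```

With limit=1 only the first NaN after the value 1.0 is filled, so two missing values remain in the result.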

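226-6. Usage

226-6-1. Data preparation

None

226-6-2. Code example

# 226. The pandas.Series.pad method
import pandas as pd
import numpy as np
# Create a Series that contains missing values
s = pd.Series([1, np.nan, 3, np.nan, 5])
# Fill the missing values with the pad method
# (deprecated since pandas 2.0; s.ffill() is the replacement)
result = s.pad()
print(result)

226-6-3. Output

# 226. The pandas.Series.pad method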
# 0    1.0
# 1    1.0
# 2    3.0
# 3    3.0
# 4    5.0
# dtype: float64

227. The pandas.Series.replace method

227-1. Syntax

# 227. The pandas.Series.replace method
pandas.Series.replace(to_replace=None, value=_NoDefault.no_default, *, inplace=False, limit=None, regex=False, method=_NoDefault.no_default)
Replace values given in to_replace with value.

Values of the Series/DataFrame are replaced with other values dynamically. This differs from updating with .loc or .iloc, which require you to specify a location to update with some value.

Parameters:
to_replace : str, regex, list, dict, Series, int, float, or None
How to find the values that will be replaced.

numeric, str or regex:

numeric: numeric values equal to to_replace will be replaced with value

str: string exactly matching to_replace will be replaced with value

regex: regexs matching to_replace will be replaced with value

list of str, regex, or numeric:

First, if to_replace and value are both lists, they must be the same length.

Second, if regex=True then all of the strings in both lists will be interpreted as regexs otherwise they will match directly. This doesn’t matter much for value since there are only a few possible substitution regexes you can use.

str, regex and numeric rules apply as above.

dict:

Dicts can be used to specify different replacement values for different existing values. For example, {'a': 'b', 'y': 'z'} replaces the value ‘a’ with ‘b’ and ‘y’ with ‘z’. To use a dict in this way, the optional value parameter should not be given.

For a DataFrame a dict can specify that different values should be replaced in different columns. For example, {'a': 1, 'b': 'z'} looks for the value 1 in column ‘a’ and the value ‘z’ in column ‘b’ and replaces these values with whatever is specified in value. The value parameter should not be None in this case. You can treat this as a special case of passing two lists except that you are specifying the column to search in.

For a DataFrame nested dictionaries, e.g., {'a': {'b': np.nan}}, are read as follows: look in column ‘a’ for the value ‘b’ and replace it with NaN. The optional value parameter should not be specified to use a nested dict in this way. You can nest regular expressions as well. Note that column names (the top-level dictionary keys in a nested dictionary) cannot be regular expressions.

None:

This means that the regex argument must be a string, compiled regular expression, or list, dict, ndarray or Series of such elements. If value is also None then this must be a nested dictionary or Series.

See the examples section for examples of each of these.

value : scalar, dict, list, str, regex, default None
Value to replace any values matching to_replace with. For a DataFrame a dict of values can be used to specify which value to use for each column (columns not in the dict will not be filled). Regular expressions, strings and lists or dicts of such objects are also allowed.

inplace : bool, default False
If True, performs operation inplace and returns None.

limit : int, default None
Maximum size gap to forward or backward fill.

Deprecated since version 2.1.0.

regex : bool or same types as to_replace, default False
Whether to interpret to_replace and/or value as regular expressions. Alternatively, this could be a regular expression or a list, dict, or array of regular expressions in which case to_replace must be None.

method : {‘pad’, ‘ffill’, ‘bfill’}
The method to use for replacement when to_replace is a scalar, list or tuple and value is None.

Deprecated since version 2.1.0.

Returns:
Series/DataFrame
Object after replacement.

Raises:
AssertionError
If regex is not a bool and to_replace is not None.

TypeError
If to_replace is not a scalar, array-like, dict, or None

If to_replace is a dict and value is not a list, dict, ndarray, or Series

If to_replace is None and regex is not compilable into a regular expression or is a list, dict, ndarray, or Series.

When replacing multiple bool or datetime64 objects and the arguments to to_replace does not match the type of the value being replaced

ValueError
If a list or an ndarray is passed to to_replace and value but they are not the same length.

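227-2. Parameters

227-2-1. to_replace (optional, default None): The values to be replaced. It can be a single value, a list, a dict, a regular expression, and so on, and defines what to search for.

227-2-2. value (optional): The replacement for the values matched by to_replace. It can be a single value, a list, or a dict. When to_replace is a dict that maps old values to new values, value should be omitted.

227-2-3. inplace (optional, default False): If True, the replacement is performed on the original Series and None is returned instead of a new Series.

227-2-4. limit (optional, default None): An integer giving the maximum size of the gap to forward or backward fill. Deprecated since version 2.1.0.

227-2-5. regex (optional, default False): If True, to_replace is interpreted as a regular expression and replacement is done by pattern matching. It can also be given a regular expression (or a list, dict, or array of them) directly, in which case to_replace must be None.

227-2-6. method (optional): The fill method ('pad', 'ffill', or 'bfill') to use for replacement when to_replace is a scalar, list, or tuple and value is None. Deprecated since version 2.1.0.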
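
The different forms that to_replace can take are easier to compare side by side. The sketch below reuses the fruit Series from this section's example; the replacement strings ('A', 'B', 'red', 'dark red') and the pattern are illustrative assumptions.

```python
import pandas as pd

s = pd.Series(['apple', 'banana', 'cherry', 'banana', 'apple'])

# Dict form: keys are the old values, dict values are the new ones;
# the value parameter is left out
by_dict = s.replace({'apple': 'A', 'banana': 'B'})

# List form: to_replace and value are lists of the same length,
# matched up element by element
by_list = s.replace(['apple', 'cherry'], ['red', 'dark red'])

# Regex form: every string matching the pattern is replaced
by_regex = s.replace(to_replace=r'^b.*', value='B', regex=True)

print(by_dict.tolist())
print(by_list.tolist())
print(by_regex.tolist())
```

Each call returns a new Series and leaves s unchanged, because inplace keeps its default of False.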

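227-3. Function

Replaces specific values in a Series with other specified values, which is particularly useful for data cleaning and preprocessing.

227-4. Return value

Returns a Series in which the specified values have been replaced. If inplace=True, the original Series is modified directly and None is returned.

227-5. Notes

None

227-6. Usage

227-6-1. Data preparation

None

227-6-2. Code example

# 227. The pandas.Series.replace method
import pandas as pd
# Create a Series containing several different values
s = pd.Series(['apple', 'banana', 'cherry', 'banana', 'apple'])
# Use the replace method to replace 'banana' with 'orange'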
result = s.replace(to_replace='banana', value='orange')
print(result)

227-6-3. Output

# 227. The pandas.Series.replace method
# 0     apple
# 1    orange
# 2    cherry
# 3    orange
# 4     apple
# dtype: object

228. The pandas.Series.argsort method

228-1. Syntax

# 228. The pandas.Series.argsort method
pandas.Series.argsort(axis=0, kind='quicksort', order=None, stable=None)
Return the integer indices that would sort the Series values.

Override ndarray.argsort. Argsorts the value, omitting NA/null values, and places the result in the same locations as the non-NA values.

Parameters:
axis : {0 or ‘index’}
Unused. Parameter needed for compatibility with DataFrame.

kind : {‘mergesort’, ‘quicksort’, ‘heapsort’, ‘stable’}, default ‘quicksort’
Choice of sorting algorithm. See numpy.sort() for more information. ‘mergesort’ and ‘stable’ are the only stable algorithms.

order : None
Has no effect but is accepted for compatibility with numpy.

stable : None
Has no effect but is accepted for compatibility with numpy.

Returns:
Series[np.intp]
Positions of values within the sort order with -1 indicating nan values.

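228-2. Parameters

228-2-1. axis (optional, default 0): Unused for a Series, which is one-dimensional; accepted only for compatibility with DataFrame.

228-2-2. kind (optional, default 'quicksort'): A string that selects the sorting algorithm. The options are:

'quicksort': quicksort, the default.
'mergesort': merge sort, a stable sorting algorithm.
'heapsort': heapsort, an unstable sorting algorithm.
'stable': a stable sorting algorithm.

228-2-3. order (optional, default None): Has no effect; accepted only for compatibility with NumPy.

228-2-4. stable (optional, default None): Has no effect; accepted only for compatibility with NumPy. To get a stable sort, which keeps equal elements in their original relative order, use kind='mergesort' or kind='stable'.

228-3. Function

Returns a sequence of integers with the same length as the original Series. The value at position i of the result is the position, in the original Series, of the element that comes i-th in sorted order. In other words, indexing the original values with the result yields them in ascending order; the result is not the rank of each element.

228-4. Return value

Returns a Series containing the integer positions that would sort the original Series.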
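
To make the meaning of the returned positions concrete, here is a small sketch using the same values as this section's example. It applies the positions with iloc to obtain the sorted values, and then applies argsort a second time to obtain zero-based ranks. The variable names are illustrative assumptions.

```python
import pandas as pd

s = pd.Series([3, 6, 5, 11, 10, 8, 10, 24])

# Positions that would sort s; a stable kind keeps the two 10s in original order
order = s.argsort(kind='stable')

# Indexing by position with the result yields the values in ascending order
sorted_values = s.iloc[order.to_numpy()]

# Applying argsort to the positions gives each element's zero-based rank
ranks = order.argsort()

print(order.tolist())
print(sorted_values.tolist())
print(ranks.tolist())
```

The element 11 sits at position 3 in s and has rank 6, which shows that the positions returned by argsort and the ranks of the elements are different things.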

228-5. Notes

None

228-6. Usage

228-6-1. Data preparation

None

228-6-2. Code example

# 228. The pandas.Series.argsort method
import pandas as pd
s = pd.Series([3, 6, 5, 11, 10, 8, 10, 24])
sorted_indices = s.argsort()
print(sorted_indices)

228-6-3. Output

# 228. The pandas.Series.argsort method
# 0    0
# 1    2
# 2    1
# 3    5
# 4    4
# 5    6
# 6    3
# 7    7
# dtype: int64

229. The pandas.Series.argmin method

229-1. Syntax

# 229. The pandas.Series.argmin method
pandas.Series.argmin(axis=None, skipna=True, *args, **kwargs)
Return int position of the smallest value in the Series.

If the minimum is achieved in multiple locations, the first row position is returned.

Parameters:
axis : {None}
Unused. Parameter needed for compatibility with DataFrame.

skipna : bool, default True
Exclude NA/null values when showing the result.

*args, **kwargs
Additional arguments and keywords for compatibility with NumPy.

Returns:
int
Row position of the minimum value.

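229-2. Parameters

229-2-1. axis (optional, default None): Unused for a Series, which is one-dimensional; accepted only for compatibility with DataFrame.

229-2-2. skipna (optional, default True): Whether to ignore missing values (NaN). If True, NaN values are skipped when searching for the minimum. If False and a NaN is present, no valid position is returned: pandas 2.x returns -1, and recent releases warn that this case will raise an error in a future version.

229-2-3. *args (optional): Additional positional arguments, accepted for compatibility with NumPy.

229-2-4. **kwargs (optional): Additional keyword arguments, accepted for compatibility with NumPy.

229-3. Function

Returns the integer position of the smallest value in the Series. If the minimum occurs more than once, the position of the first occurrence is returned.

229-4. Return value

Returns an integer: the zero-based position of the minimum value in the original Series. This is a position, not an index label; pandas.Series.idxmin returns the label.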
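
The difference between a position and an index label only becomes visible when the index is not the default 0, 1, 2, and so on. The following sketch assumes a Series with string labels (an illustrative choice) and a repeated minimum.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): string labels, a NaN, and a tied minimum
s = pd.Series([3.0, 1.0, np.nan, 1.0], index=['a', 'b', 'c', 'd'])

# argmin returns the zero-based position of the first minimum; NaN is skipped
pos = s.argmin()

# idxmin returns the index label of the same element
label = s.idxmin()

print(pos, label, s.iloc[pos])
```

Here argmin gives the position 1 while idxmin gives the label 'b', and both refer to the first of the two values equal to 1.0.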

229-5. Notes

None

229-6. Usage

229-6-1. Data preparation

None

229-6-2. Code example

# 229. The pandas.Series.argmin method
import pandas as pd
import numpy as np
s = pd.Series([3, 1, 2, np.nan, 4])
min_index = s.argmin()
print(min_index)

229-6-3. Output

# 229. The pandas.Series.argmin method
# 1

230. The pandas.Series.argmax method

230-1. Syntax

# 230. The pandas.Series.argmax method
pandas.Series.argmax(axis=None, skipna=True, *args, **kwargs)
Return int position of the largest value in the Series.

If the maximum is achieved in multiple locations, the first row position is returned.

Parameters:
axis : {None}
Unused. Parameter needed for compatibility with DataFrame.

skipna : bool, default True
Exclude NA/null values when showing the result.

*args, **kwargs
Additional arguments and keywords for compatibility with NumPy.

Returns:
int
Row position of the maximum value.

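230-2. Parameters

230-2-1. axis (optional, default None): Unused for a Series, which is one-dimensional; accepted only for compatibility with DataFrame.

230-2-2. skipna (optional, default True): Whether to ignore missing values (NaN). If True, NaN values are skipped when searching for the maximum. If False and a NaN is present, no valid position is returned: pandas 2.x returns -1, and recent releases warn that this case will raise an error in a future version.

230-2-3. *args (optional): Additional positional arguments, accepted for compatibility with NumPy.

230-2-4. **kwargs (optional): Additional keyword arguments, accepted for compatibility with NumPy.

230-3. Function

Returns the integer position of the largest value in the Series. If the maximum occurs more than once, the position of the first occurrence is returned.

230-4. Return value

Returns an integer: the zero-based position of the maximum value in the original Series. This is a position, not an index label; pandas.Series.idxmax returns the label.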
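
As a short sketch of how the returned position can be used, the example below assumes a Series with a non-default integer index (an illustrative choice) and a tied maximum, then maps the position back to its label and value.

```python
import pandas as pd

# Illustrative data (an assumption): integer labels that differ from positions
s = pd.Series([2, 9, 4, 9], index=[10, 20, 30, 40])

# Position of the first occurrence of the maximum value
pos = s.argmax()

# Translate the position into the corresponding index label and value
label = s.index[pos]
value = s.iloc[pos]

print(pos, label, value)
```

The label obtained through s.index[pos] is the same one that s.idxmax() returns, so argmax is the better fit when a position is needed (for example to index a NumPy array) and idxmax when a label is needed.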

230-5. Notes

None

230-6. Usage

230-6-1. Data preparation

None

230-6-2. Code example

# 230. The pandas.Series.argmax method
import pandas as pd
import numpy as np
s = pd.Series([3, 1, 4, np.nan, 2])
max_index = s.argmax()
print(max_index)

230-6-3. Output

# 230. The pandas.Series.argmax method
# 2