Previous post:
Python pandas library | Of three thousand li of water, I take just one ladleful (4), Hann Yang's blog on CSDN
S~W: Function46~56
Types['Function'][45:]
['set_eng_float_format', 'show_versions', 'test', 'timedelta_range', 'to_datetime', 'to_numeric', 'to_pickle', 'to_timedelta', 'unique', 'value_counts', 'wide_to_long']
Function46
set_eng_float_format(accuracy: 'int' = 3, use_eng_prefix: 'bool' = False) -> 'None'
Help on function set_eng_float_format in module pandas.io.formats.format:
set_eng_float_format(accuracy: 'int' = 3, use_eng_prefix: 'bool' = False) -> 'None'
Alter default behavior on how float is formatted in DataFrame.
Format float in engineering format. By accuracy, we mean the number of
decimal digits after the floating point.
See also EngFormatter.
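A rough illustration of the effect, not taken from the original post: the sample values are made up, and reading the formatter back through the `display.float_format` option is just one convenient way to see what the call installed.

```python
import pandas as pd

# Switch float display to engineering notation: one decimal digit, SI prefixes.
pd.set_eng_float_format(accuracy=1, use_eng_prefix=True)

# The call stores a formatter callable in the "display.float_format" option.
formatter = pd.get_option("display.float_format")
formatted = formatter(1234.5)  # engineering form of 1234.5, using the "k" prefix
print(formatted)

# Made-up sample data, only there to show the changed repr.
s = pd.Series([0.00012, 1234.5, 5.6e7])
print(s)

# Restore the default float formatting so later output is unaffected.
pd.reset_option("display.float_format")
```

Resetting the option at the end matters in a long session, because the setting is global and changes how every later DataFrame and Series is printed.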
Function47
show_versions(as_json: 'str | bool' = False) -> 'None'
Help on function show_versions in module pandas.util._print_versions:
show_versions(as_json: 'str | bool' = False) -> 'None'
Provide useful information, important for bug reports.
It comprises info about the host operating system, the pandas version,
and the versions of other relevant installed packages.
Parameters
----------
as_json : str or bool, default False
* If False, outputs info in a human readable form to the console.
* If str, it will be considered as a path to a file.
Info will be written to that file in JSON format.
* If True, outputs info in JSON format to the console.
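The string form of `as_json` can be sketched as follows. The temporary directory and the file name `versions.json` are assumptions made for this example; the code prints the top-level keys instead of assuming a particular layout.

```python
import json
import os
import tempfile

import pandas as pd

# Write the report to a temporary JSON file instead of the console.
with tempfile.TemporaryDirectory() as tmp:
    report_path = os.path.join(tmp, "versions.json")  # hypothetical file name
    pd.show_versions(as_json=report_path)
    with open(report_path, encoding="utf8") as fh:
        info = json.load(fh)

# Look at what was written rather than assuming a fixed structure.
print(sorted(info))
```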
Function48
test(extra_args=None)
Help on function test in module pandas.util._tester:
test(extra_args=None)
Function49
timedelta_range(start=None, end=None, periods: 'Optional[int]' = None, freq=None, name=None, closed=None) -> 'TimedeltaIndex'
Help on function timedelta_range in module pandas.core.indexes.timedeltas:
timedelta_range(start=None, end=None, periods: 'Optional[int]' = None, freq=None, name=None, closed=None) -> 'TimedeltaIndex'
Return a fixed frequency TimedeltaIndex, with day as the default
frequency.
Parameters
----------
start : str or timedelta-like, default None
Left bound for generating timedeltas.
end : str or timedelta-like, default None
Right bound for generating timedeltas.
periods : int, default None
Number of periods to generate.
freq : str or DateOffset, default 'D'
Frequency strings can have multiples, e.g. '5H'.
name : str, default None
Name of the resulting TimedeltaIndex.
closed : str, default None
Make the interval closed with respect to the given frequency to
the 'left', 'right', or both sides (None).
Returns
-------
TimedeltaIndex
Notes
-----
Of the four parameters ``start``, ``end``, ``periods``, and ``freq``,
exactly three must be specified. If ``freq`` is omitted, the resulting
``TimedeltaIndex`` will have ``periods`` linearly spaced elements between
``start`` and ``end`` (closed on both sides).
To learn more about the frequency strings, please see `this link
<https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases>`__.
Examples
--------
>>> pd.timedelta_range(start='1 day', periods=4)
TimedeltaIndex(['1 days', '2 days', '3 days', '4 days'],
dtype='timedelta64[ns]', freq='D')
The ``closed`` parameter specifies which endpoint is included. The default
behavior is to include both endpoints.
>>> pd.timedelta_range(start='1 day', periods=4, closed='right')
TimedeltaIndex(['2 days', '3 days', '4 days'],
dtype='timedelta64[ns]', freq='D')
The ``freq`` parameter specifies the frequency of the TimedeltaIndex.
Only fixed frequencies can be passed, non-fixed frequencies such as
'M' (month end) will raise.
>>> pd.timedelta_range(start='1 day', end='2 days', freq='6H')
TimedeltaIndex(['1 days 00:00:00', '1 days 06:00:00', '1 days 12:00:00',
'1 days 18:00:00', '2 days 00:00:00'],
dtype='timedelta64[ns]', freq='6H')
Specify ``start``, ``end``, and ``periods``; the frequency is generated
automatically (linearly spaced).
>>> pd.timedelta_range(start='1 day', end='5 days', periods=4)
TimedeltaIndex(['1 days 00:00:00', '2 days 08:00:00', '3 days 16:00:00',
'5 days 00:00:00'],
dtype='timedelta64[ns]', freq=None)
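The "exactly three of four" rule from the Notes can be checked with a small sketch. The inputs are made up for illustration.

```python
import pandas as pd

# start + end + periods: freq is left out, so the elements are linearly spaced.
idx = pd.timedelta_range(start="0 days", end="1 day", periods=5)
print(idx)  # five values, 6 hours apart

# Only start is given here. freq falls back to its default 'D',
# which still leaves the range under-specified, so pandas raises.
try:
    pd.timedelta_range(start="1 day")
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```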
Function50
to_datetime(arg: 'DatetimeScalarOrArrayConvertible', errors: 'str' = 'raise', dayfirst: 'bool' = False, yearfirst: 'bool' = False, utc: 'bool | None' = None, format: 'str | None' = None, exact: 'bool' = True, unit: 'str | None' = None, infer_datetime_format: 'bool' = False, origin='unix', cache: 'bool' = True) -> 'DatetimeIndex | Series | DatetimeScalar | NaTType | None'
Help on function to_datetime in module pandas.core.tools.datetimes:
to_datetime(arg: 'DatetimeScalarOrArrayConvertible', errors: 'str' = 'raise', dayfirst: 'bool' = False, yearfirst: 'bool' = False, utc: 'bool | None' = None, format: 'str | None' = None, exact: 'bool' = True, unit: 'str | None' = None, infer_datetime_format: 'bool' = False, origin='unix', cache: 'bool' = True) -> 'DatetimeIndex | Series | DatetimeScalar | NaTType | None'
Convert argument to datetime.
Parameters
----------
arg : int, float, str, datetime, list, tuple, 1-d array, Series, DataFrame/dict-like
The object to convert to a datetime.
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
- If 'raise', then invalid parsing will raise an exception.
- If 'coerce', then invalid parsing will be set as NaT.
- If 'ignore', then invalid parsing will return the input.
dayfirst : bool, default False
Specify a date parse order if `arg` is str or its list-likes.
If True, parses dates with the day first, eg 10/11/12 is parsed as
2012-11-10.
Warning: dayfirst=True is not strict, but will prefer to parse
with day first (this is a known bug, based on dateutil behavior).
yearfirst : bool, default False
Specify a date parse order if `arg` is str or its list-likes.
- If True parses dates with the year first, eg 10/11/12 is parsed as
2010-11-12.
- If both dayfirst and yearfirst are True, yearfirst takes precedence (same
as dateutil).
Warning: yearfirst=True is not strict, but will prefer to parse
with year first (this is a known bug, based on dateutil behavior).
utc : bool, default None
Return UTC DatetimeIndex if True (converting any tz-aware
datetime.datetime objects as well).
format : str, default None
The strftime to parse time, eg "%d/%m/%Y", note that "%f" will parse
all the way up to nanoseconds.
See strftime documentation for more information on choices:
https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior.
exact : bool, True by default
Behaves as:
- If True, require an exact format match.
- If False, allow the format to match anywhere in the target string.
unit : str, default 'ns'
The unit of the arg (D,s,ms,us,ns) denote the unit, which is an
integer or float number. This will be based off the origin.
Example, with unit='ms' and origin='unix' (the default), this
would calculate the number of milliseconds to the unix epoch start.
infer_datetime_format : bool, default False
If True and no `format` is given, attempt to infer the format of the
datetime strings based on the first non-NaN element,
and if it can be inferred, switch to a faster method of parsing them.
In some cases this can increase the parsing speed by ~5-10x.
origin : scalar, default 'unix'
Define the reference date. The numeric values would be parsed as number
of units (defined by `unit`) since this reference date.
- If 'unix' (or POSIX) time; origin is set to 1970-01-01.
- If 'julian', unit must be 'D', and origin is set to beginning of
Julian Calendar. Julian day number 0 is assigned to the day starting
at noon on January 1, 4713 BC.
- If Timestamp convertible, origin is set to Timestamp identified by
origin.
cache : bool, default True
If True, use a cache of unique, converted dates to apply the datetime
conversion. May produce significant speed-up when parsing duplicate
date strings, especially ones with timezone offsets. The cache is only
used when there are at least 50 values. The presence of out-of-bounds
values will render the cache unusable and may slow down parsing.
.. versionchanged:: 0.25.0
- changed default value from False to True.
Returns
-------
datetime
If parsing succeeded.
Return type depends on input:
- list-like: DatetimeIndex
- Series: Series of datetime64 dtype
- scalar: Timestamp
In case when it is not possible to return designated types (e.g. when
any element of input is before Timestamp.min or after Timestamp.max)
return will have datetime.datetime type (or corresponding
array/Series).
See Also
--------
DataFrame.astype : Cast argument to a specified dtype.
to_timedelta : Convert argument to timedelta.
convert_dtypes : Convert dtypes.
Examples
--------
Assembling a datetime from multiple columns of a DataFrame. The keys can be
common abbreviations like ['year', 'month', 'day', 'minute', 'second',
'ms', 'us', 'ns']) or plurals of the same
>>> df = pd.DataFrame({'year': [2015, 2016],
... 'month': [2, 3],
... 'day': [4, 5]})
>>> pd.to_datetime(df)
0 2015-02-04
1 2016-03-05
dtype: datetime64[ns]
If a date does not meet the `timestamp limitations
<https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html
#timeseries-timestamp-limits>`_, passing errors='ignore'
will return the original input instead of raising any exception.
Passing errors='coerce' will force an out-of-bounds date to NaT,
in addition to forcing non-dates (or non-parseable dates) to NaT.
>>> pd.to_datetime('13000101', format='%Y%m%d', errors='ignore')
datetime.datetime(1300, 1, 1, 0, 0)
>>> pd.to_datetime('13000101', format='%Y%m%d', errors='coerce')
NaT
Passing infer_datetime_format=True can often speed up parsing
if the input is not exactly in ISO8601 format but is in a regular format.
>>> s = pd.Series(['3/11/2000', '3/12/2000', '3/13/2000'] * 1000)
>>> s.head()
0 3/11/2000
1 3/12/2000
2 3/13/2000
3 3/11/2000
4 3/12/2000
dtype: object
>>> %timeit pd.to_datetime(s, infer_datetime_format=True) # doctest: +SKIP
100 loops, best of 3: 10.4 ms per loop
>>> %timeit pd.to_datetime(s, infer_datetime_format=False) # doctest: +SKIP
1 loop, best of 3: 471 ms per loop
Using a unix epoch time
>>> pd.to_datetime(1490195805, unit='s')
Timestamp('2017-03-22 15:16:45')
>>> pd.to_datetime(1490195805433502912, unit='ns')
Timestamp('2017-03-22 15:16:45.433502912')
.. warning:: For float arg, precision rounding might happen. To prevent
unexpected behavior use a fixed-width exact type.
Using a non-unix epoch origin
>>> pd.to_datetime([1, 2, 3], unit='D',
... origin=pd.Timestamp('1960-01-01'))
DatetimeIndex(['1960-01-02', '1960-01-03', '1960-01-04'],
dtype='datetime64[ns]', freq=None)
In case input is list-like and the elements of input are of mixed
timezones, return will have object type Index if utc=False.
>>> pd.to_datetime(['2018-10-26 12:00 -0530', '2018-10-26 12:00 -0500'])
Index([2018-10-26 12:00:00-05:30, 2018-10-26 12:00:00-05:00], dtype='object')
>>> pd.to_datetime(['2018-10-26 12:00 -0530', '2018-10-26 12:00 -0500'],
... utc=True)
DatetimeIndex(['2018-10-26 17:30:00+00:00', '2018-10-26 17:00:00+00:00'],
dtype='datetime64[ns, UTC]', freq=None)
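As a supplement to the docstring examples, here is a minimal sketch of two parameters that come up often in practice: an explicit `format` combined with `errors='coerce'`, and `dayfirst`. The input strings are invented for the example.

```python
import pandas as pd

# An explicit format plus errors='coerce': unparseable entries become NaT.
raw = ["2021-01-31", "not a date", "2021-02-28"]  # made-up sample input
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")
print(parsed)

# dayfirst=True reads the ambiguous string as 10 November, not 11 October.
ts = pd.to_datetime("10/11/2012", dayfirst=True)
print(ts)
```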
Function51
to_numeric(arg, errors='raise', downcast=None)
Help on function to_numeric in module pandas.core.tools.numeric:
to_numeric(arg, errors='raise', downcast=None)
Convert argument to a numeric type.
The default return dtype is `float64` or `int64`
depending on the data supplied. Use the `downcast` parameter
to obtain other dtypes.
Please note that precision loss may occur if really large numbers
are passed in. Due to the internal limitations of `ndarray`, if
numbers smaller than `-9223372036854775808` (np.iinfo(np.int64).min)
or larger than `18446744073709551615` (np.iinfo(np.uint64).max) are
passed in, it is very likely they will be converted to float so that
they can be stored in an `ndarray`. These warnings apply similarly to
`Series` since it internally leverages `ndarray`.
Parameters
----------
arg : scalar, list, tuple, 1-d array, or Series
Argument to be converted.
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
- If 'raise', then invalid parsing will raise an exception.
- If 'coerce', then invalid parsing will be set as NaN.
- If 'ignore', then invalid parsing will return the input.
downcast : {'integer', 'signed', 'unsigned', 'float'}, default None
If not None, and if the data has been successfully cast to a
numerical dtype (or if the data was numeric to begin with),
downcast that resulting data to the smallest numerical dtype
possible according to the following rules:
- 'integer' or 'signed': smallest signed int dtype (min.: np.int8)
- 'unsigned': smallest unsigned int dtype (min.: np.uint8)
- 'float': smallest float dtype (min.: np.float32)
As this behaviour is separate from the core conversion to
numeric values, any errors raised during the downcasting
will be surfaced regardless of the value of the 'errors' input.
In addition, downcasting will only occur if the size
of the resulting data's dtype is strictly larger than
the dtype it is to be cast to, so if none of the dtypes
checked satisfy that specification, no downcasting will be
performed on the data.
Returns
-------
ret
Numeric if parsing succeeded.
Return type depends on input. Series if Series, otherwise ndarray.
See Also
--------
DataFrame.astype : Cast argument to a specified dtype.
to_datetime : Convert argument to datetime.
to_timedelta : Convert argument to timedelta.
numpy.ndarray.astype : Cast a numpy array to a specified type.
DataFrame.convert_dtypes : Convert dtypes.
Examples
--------
Take separate series and convert to numeric, coercing when told to
>>> s = pd.Series(['1.0', '2', -3])
>>> pd.to_numeric(s)
0 1.0
1 2.0
2 -3.0
dtype: float64
>>> pd.to_numeric(s, downcast='float')
0 1.0
1 2.0
2 -3.0
dtype: float32
>>> pd.to_numeric(s, downcast='signed')
0 1
1 2
2 -3
dtype: int8
>>> s = pd.Series(['apple', '1.0', '2', -3])
>>> pd.to_numeric(s, errors='ignore')
0 apple
1 1.0
2 2
3 -3
dtype: object
>>> pd.to_numeric(s, errors='coerce')
0 NaN
1 1.0
2 2.0
3 -3.0
dtype: float64
Downcasting of nullable integer and floating dtypes is supported:
>>> s = pd.Series([1, 2, 3], dtype="Int64")
>>> pd.to_numeric(s, downcast="integer")
0 1
1 2
2 3
dtype: Int8
>>> s = pd.Series([1.0, 2.1, 3.0], dtype="Float64")
>>> pd.to_numeric(s, downcast="float")
0 1.0
1 2.1
2 3.0
dtype: Float32
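A typical cleaning step built on `errors='coerce'` might look like the sketch below. The column contents are made up; the idea is to count how many entries failed to parse, as opposed to entries that were already missing.

```python
import pandas as pd

raw = pd.Series(["1", "2.5", "abc", None])  # made-up column with one bad entry

# Unparseable strings become NaN; values that were already missing stay NaN.
num = pd.to_numeric(raw, errors="coerce")

# Entries that are NaN now but were not missing before are the parse failures.
n_failed = int(num.isna().sum() - raw.isna().sum())
print(num.dtype, n_failed)

# downcast='unsigned' picks the smallest unsigned integer dtype that fits.
small = pd.to_numeric(pd.Series([1, 2, 3]), downcast="unsigned")
print(small.dtype)
```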
Function52
to_pickle(obj: Any, filepath_or_buffer: Union[ForwardRef('PathLike[str]'), str, IO[~AnyStr], io.RawIOBase, io.BufferedIOBase, io.TextIOBase, _io.TextIOWrapper, mmap.mmap], compression: Union[str, Dict[str, Any], NoneType] = 'infer', protocol: int = 5, storage_options: Union[Dict[str, Any], NoneType] = None)
Help on function to_pickle in module pandas.io.pickle:
to_pickle(obj: Any, filepath_or_buffer: Union[ForwardRef('PathLike[str]'), str, IO[~AnyStr], io.RawIOBase, io.BufferedIOBase, io.TextIOBase, _io.TextIOWrapper, mmap.mmap], compression: Union[str, Dict[str, Any], NoneType] = 'infer', protocol: int = 5, storage_options: Union[Dict[str, Any], NoneType] = None)
Pickle (serialize) object to file.
Parameters
----------
obj : any object
Any python object.
filepath_or_buffer : str, path object or file-like object
File path, URL, or buffer where the pickled object will be stored.
.. versionchanged:: 1.0.0
Accept URL. URL has to be of S3 or GCS.
compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer'
If 'infer' and 'path_or_url' is path-like, then detect compression from
the following extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no
compression) If 'infer' and 'path_or_url' is not path-like, then use
None (= no decompression).
protocol : int
Int which indicates which protocol should be used by the pickler,
default HIGHEST_PROTOCOL (see [1], paragraph 12.1.2). The possible
values for this parameter depend on the version of Python. For Python
2.x, possible values are 0, 1, 2. For Python>=3.0, 3 is a valid value.
For Python >= 3.4, 4 is a valid value. A negative value for the
protocol parameter is equivalent to setting its value to
HIGHEST_PROTOCOL.
storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to ``urllib`` as header options. For other URLs (e.g.
starting with "s3://", and "gcs://") the key-value pairs are forwarded to
``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
.. versionadded:: 1.2.0
.. [1] https://docs.python.org/3/library/pickle.html
See Also
--------
read_pickle : Load pickled pandas object (or any object) from file.
DataFrame.to_hdf : Write DataFrame to an HDF5 file.
DataFrame.to_sql : Write DataFrame to a SQL database.
DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
Examples
--------
>>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})
>>> original_df
foo bar
0 0 5
1 1 6
2 2 7
3 3 8
4 4 9
>>> pd.to_pickle(original_df, "./dummy.pkl")
>>> unpickled_df = pd.read_pickle("./dummy.pkl")
>>> unpickled_df
foo bar
0 0 5
1 1 6
2 2 7
3 3 8
4 4 9
>>> import os
>>> os.remove("./dummy.pkl")
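The `compression='infer'` default can be seen in a short round trip. This is a sketch under assumptions: the file name `dummy.pkl.gz` and the temporary directory are chosen for the example only.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dummy.pkl.gz")  # hypothetical file name
    pd.to_pickle(df, path)            # compression='infer' picks gzip from '.gz'
    with open(path, "rb") as fh:
        magic = fh.read(2)            # gzip files start with the bytes 1f 8b
    restored = pd.read_pickle(path)   # read_pickle infers the same compression

print(magic == b"\x1f\x8b", restored.equals(df))
```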
Function53
to_timedelta(arg, unit=None, errors='raise')
Help on function to_timedelta in module pandas.core.tools.timedeltas:
to_timedelta(arg, unit=None, errors='raise')
Convert argument to timedelta.
Timedeltas are absolute differences in times, expressed in difference
units (e.g. days, hours, minutes, seconds). This method converts
an argument from a recognized timedelta format / value into
a Timedelta type.
Parameters
----------
arg : str, timedelta, list-like or Series
The data to be converted to timedelta.
.. deprecated:: 1.2
Strings with units 'M', 'Y' and 'y' do not represent
unambiguous timedelta values and will be removed in a future version
unit : str, optional
Denotes the unit of the arg for numeric `arg`. Defaults to ``"ns"``.
Possible values:
* 'W'
* 'D' / 'days' / 'day'
* 'hours' / 'hour' / 'hr' / 'h'
* 'm' / 'minute' / 'min' / 'minutes' / 'T'
* 'S' / 'seconds' / 'sec' / 'second'
* 'ms' / 'milliseconds' / 'millisecond' / 'milli' / 'millis' / 'L'
* 'us' / 'microseconds' / 'microsecond' / 'micro' / 'micros' / 'U'
* 'ns' / 'nanoseconds' / 'nano' / 'nanos' / 'nanosecond' / 'N'
.. versionchanged:: 1.1.0
Must not be specified when `arg` contains strings and
``errors="raise"``.
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
- If 'raise', then invalid parsing will raise an exception.
- If 'coerce', then invalid parsing will be set as NaT.
- If 'ignore', then invalid parsing will return the input.
Returns
-------
timedelta64 or numpy.array of timedelta64
Output type returned if parsing succeeded.
See Also
--------
DataFrame.astype : Cast argument to a specified dtype.
to_datetime : Convert argument to datetime.
convert_dtypes : Convert dtypes.
Notes
-----
If the precision is higher than nanoseconds, the precision of the duration is
truncated to nanoseconds for string inputs.
Examples
--------
Parsing a single string to a Timedelta:
>>> pd.to_timedelta('1 days 06:05:01.00003')
Timedelta('1 days 06:05:01.000030')
>>> pd.to_timedelta('15.5us')
Timedelta('0 days 00:00:00.000015500')
Parsing a list or array of strings:
>>> pd.to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan'])
TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015500', NaT],
dtype='timedelta64[ns]', freq=None)
Converting numbers by specifying the `unit` keyword argument:
>>> pd.to_timedelta(np.arange(5), unit='s')
TimedeltaIndex(['0 days 00:00:00', '0 days 00:00:01', '0 days 00:00:02',
'0 days 00:00:03', '0 days 00:00:04'],
dtype='timedelta64[ns]', freq=None)
>>> pd.to_timedelta(np.arange(5), unit='d')
TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'],
dtype='timedelta64[ns]', freq=None)
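The `errors` and `unit` parameters can be illustrated with a brief sketch. The inputs are invented for the example.

```python
import pandas as pd

# errors='coerce' turns an unparseable string into NaT instead of raising.
mixed = pd.to_timedelta(["1 days", "bogus", "2 hours"], errors="coerce")
print(mixed)

# For numeric input, unit says what the number counts: here 1.5 hours.
ninety = pd.to_timedelta(1.5, unit="h")
print(ninety)
```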
Function54
unique(values)
Help on function unique in module pandas.core.algorithms:
unique(values)
Hash table-based unique. Uniques are returned in order
of appearance. This does NOT sort.
Significantly faster than numpy.unique for long enough sequences.
Includes NA values.
Parameters
----------
values : 1d array-like
Returns
-------
numpy.ndarray or ExtensionArray
The return can be:
* Index : when the input is an Index
* Categorical : when the input is a Categorical dtype
* ndarray : when the input is a Series/ndarray
Return numpy.ndarray or ExtensionArray.
See Also
--------
Index.unique : Return unique values from an Index.
Series.unique : Return unique values of Series object.
Examples
--------
>>> pd.unique(pd.Series([2, 1, 3, 3]))
array([2, 1, 3])
>>> pd.unique(pd.Series([2] + [1] * 5))
array([2, 1])
>>> pd.unique(pd.Series([pd.Timestamp("20160101"), pd.Timestamp("20160101")]))
array(['2016-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
>>> pd.unique(
... pd.Series(
... [
... pd.Timestamp("20160101", tz="US/Eastern"),
... pd.Timestamp("20160101", tz="US/Eastern"),
... ]
... )
... )
<DatetimeArray>
['2016-01-01 00:00:00-05:00']
Length: 1, dtype: datetime64[ns, US/Eastern]
>>> pd.unique(
... pd.Index(
... [
... pd.Timestamp("20160101", tz="US/Eastern"),
... pd.Timestamp("20160101", tz="US/Eastern"),
... ]
... )
... )
DatetimeIndex(['2016-01-01 00:00:00-05:00'],
dtype='datetime64[ns, US/Eastern]',
freq=None)
>>> pd.unique(list("baabc"))
array(['b', 'a', 'c'], dtype=object)
An unordered Categorical will return categories in the
order of appearance.
>>> pd.unique(pd.Series(pd.Categorical(list("baabc"))))
['b', 'a', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> pd.unique(pd.Series(pd.Categorical(list("baabc"), categories=list("abc"))))
['b', 'a', 'c']
Categories (3, object): ['a', 'b', 'c']
An ordered Categorical preserves the category ordering.
>>> pd.unique(
... pd.Series(
... pd.Categorical(list("baabc"), categories=list("abc"), ordered=True)
... )
... )
['b', 'a', 'c']
Categories (3, object): ['a' < 'b' < 'c']
An array of tuples
>>> pd.unique([("a", "b"), ("b", "a"), ("a", "c"), ("b", "a")])
array([('a', 'b'), ('b', 'a'), ('a', 'c')], dtype=object)
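The "does NOT sort" and "includes NA values" remarks are easiest to see next to `numpy.unique`. A small sketch with made-up values:

```python
import numpy as np
import pandas as pd

values = np.array([3, 1, 3, 2, 1])

first_seen = pd.unique(values)   # order of first appearance
sorted_vals = np.unique(values)  # numpy sorts its result
print(first_seen, sorted_vals)

# NaN is kept as one of the unique values.
with_na = pd.unique(pd.Series([1.0, np.nan, 1.0, np.nan]))
print(with_na)
```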
Function55
value_counts(values, sort: 'bool' = True, ascending: 'bool' = False, normalize: 'bool' = False, bins=None, dropna: 'bool' = True) -> 'Series'
Help on function value_counts in module pandas.core.algorithms:
value_counts(values, sort: 'bool' = True, ascending: 'bool' = False, normalize: 'bool' = False, bins=None, dropna: 'bool' = True) -> 'Series'
Compute a histogram of the counts of non-null values.
Parameters
----------
values : ndarray (1-d)
sort : bool, default True
Sort by values
ascending : bool, default False
Sort in ascending order
normalize: bool, default False
If True then compute a relative histogram
bins : integer, optional
Rather than count values, group them into half-open bins,
convenience for pd.cut, only works with numeric data
dropna : bool, default True
Don't include counts of NaN
Returns
-------
Series
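The help text gives no example, so here is a minimal sketch of the main parameters. It calls the `Series.value_counts` method, which accepts the same keyword arguments and which, as far as I am aware, newer pandas releases recommend over the top-level function. The data is made up.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", None, "a"])  # made-up data with one missing value

counts = s.value_counts()                 # missing values are dropped by default
with_na = s.value_counts(dropna=False)    # also counts the missing entry
shares = s.value_counts(normalize=True)   # relative frequencies instead of counts
print(counts, with_na, shares, sep="\n\n")

# bins groups numeric data into intervals before counting.
binned = pd.Series([1, 2, 3, 10]).value_counts(bins=2)
print(binned)
```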
Function56
wide_to_long(df: 'DataFrame', stubnames, i, j, sep: 'str' = '', suffix: 'str' = '\\d+') -> 'DataFrame'
Help on function wide_to_long in module pandas.core.reshape.melt:
wide_to_long(df: 'DataFrame', stubnames, i, j, sep: 'str' = '', suffix: 'str' = '\\d+') -> 'DataFrame'
Wide panel to long format. Less flexible but more user-friendly than melt.
With stubnames ['A', 'B'], this function expects to find one or more
group of columns with format
A-suffix1, A-suffix2,..., B-suffix1, B-suffix2,...
You specify what you want to call this suffix in the resulting long format
with `j` (for example `j='year'`)
Each row of these wide variables are assumed to be uniquely identified by
`i` (can be a single column name or a list of column names)
All remaining variables in the data frame are left intact.
Parameters
----------
df : DataFrame
The wide-format DataFrame.
stubnames : str or list-like
The stub name(s). The wide format variables are assumed to
start with the stub names.
i : str or list-like
Column(s) to use as id variable(s).
j : str
The name of the sub-observation variable. What you wish to name your
suffix in the long format.
sep : str, default ""
A character indicating the separation of the variable names
in the wide format, to be stripped from the names in the long format.
For example, if your column names are A-suffix1, A-suffix2, you
can strip the hyphen by specifying `sep='-'`.
suffix : str, default '\\d+'
A regular expression capturing the wanted suffixes. '\\d+' captures
numeric suffixes. Suffixes with no numbers could be specified with the
negated character class '\\D+'. You can also further disambiguate
suffixes, for example, if your wide variables are of the form A-one,
B-two,.., and you have an unrelated column A-rating, you can ignore the
last one by specifying `suffix='(!?one|two)'`. When all suffixes are
numeric, they are cast to int64/float64.
Returns
-------
DataFrame
A DataFrame that contains each stub name as a variable, with new index
(i, j).
See Also
--------
melt : Unpivot a DataFrame from wide to long format, optionally leaving
identifiers set.
pivot : Create a spreadsheet-style pivot table as a DataFrame.
DataFrame.pivot : Pivot without aggregation that can handle
non-numeric data.
DataFrame.pivot_table : Generalization of pivot that can handle
duplicate values for one index/column pair.
DataFrame.unstack : Pivot based on the index values instead of a
column.
Notes
-----
All extra variables are left untouched. This simply uses
`pandas.melt` under the hood, but is hard-coded to "do the right thing"
in a typical case.
Examples
--------
>>> np.random.seed(123)
>>> df = pd.DataFrame({"A1970" : {0 : "a", 1 : "b", 2 : "c"},
... "A1980" : {0 : "d", 1 : "e", 2 : "f"},
... "B1970" : {0 : 2.5, 1 : 1.2, 2 : .7},
... "B1980" : {0 : 3.2, 1 : 1.3, 2 : .1},
... "X" : dict(zip(range(3), np.random.randn(3)))
... })
>>> df["id"] = df.index
>>> df
A1970 A1980 B1970 B1980 X id
0 a d 2.5 3.2 -1.085631 0
1 b e 1.2 1.3 0.997345 1
2 c f 0.7 0.1 0.282978 2
>>> pd.wide_to_long(df, ["A", "B"], i="id", j="year")
... # doctest: +NORMALIZE_WHITESPACE
X A B
id year
0 1970 -1.085631 a 2.5
1 1970 0.997345 b 1.2
2 1970 0.282978 c 0.7
0 1980 -1.085631 d 3.2
1 1980 0.997345 e 1.3
2 1980 0.282978 f 0.1
With multiple id columns
>>> df = pd.DataFrame({
... 'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
... 'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
... 'ht1': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
... 'ht2': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
... })
>>> df
famid birth ht1 ht2
0 1 1 2.8 3.4
1 1 2 2.9 3.8
2 1 3 2.2 2.9
3 2 1 2.0 3.2
4 2 2 1.8 2.8
5 2 3 1.9 2.4
6 3 1 2.2 3.3
7 3 2 2.3 3.4
8 3 3 2.1 2.9
>>> l = pd.wide_to_long(df, stubnames='ht', i=['famid', 'birth'], j='age')
>>> l
... # doctest: +NORMALIZE_WHITESPACE
ht
famid birth age
1 1 1 2.8
2 3.4
2 1 2.9
2 3.8
3 1 2.2
2 2.9
2 1 1 2.0
2 3.2
2 1 1.8
2 2.8
3 1 1.9
2 2.4
3 1 1 2.2
2 3.3
2 1 2.3
2 3.4
3 1 2.1
2 2.9
Going from long back to wide just takes some creative use of `unstack`
>>> w = l.unstack()
>>> w.columns = w.columns.map('{0[0]}{0[1]}'.format)
>>> w.reset_index()
famid birth ht1 ht2
0 1 1 2.8 3.4
1 1 2 2.9 3.8
2 1 3 2.2 2.9
3 2 1 2.0 3.2
4 2 2 1.8 2.8
5 2 3 1.9 2.4
6 3 1 2.2 3.3
7 3 2 2.3 3.4
8 3 3 2.1 2.9
Less wieldy column names are also handled
>>> np.random.seed(0)
>>> df = pd.DataFrame({'A(weekly)-2010': np.random.rand(3),
... 'A(weekly)-2011': np.random.rand(3),
... 'B(weekly)-2010': np.random.rand(3),
... 'B(weekly)-2011': np.random.rand(3),
... 'X' : np.random.randint(3, size=3)})
>>> df['id'] = df.index
>>> df # doctest: +NORMALIZE_WHITESPACE, +ELLIPSIS
A(weekly)-2010 A(weekly)-2011 B(weekly)-2010 B(weekly)-2011 X id
0 0.548814 0.544883 0.437587 0.383442 0 0
1 0.715189 0.423655 0.891773 0.791725 1 1
2 0.602763 0.645894 0.963663 0.528895 1 2
>>> pd.wide_to_long(df, ['A(weekly)', 'B(weekly)'], i='id',
... j='year', sep='-')
... # doctest: +NORMALIZE_WHITESPACE
X A(weekly) B(weekly)
id year
0 2010 0 0.548814 0.437587
1 2010 1 0.715189 0.891773
2 2010 1 0.602763 0.963663
0 2011 0 0.544883 0.383442
1 2011 1 0.423655 0.791725
2 2011 1 0.645894 0.528895
If we have many columns, we could also use a regex to find our
stubnames and pass that list on to wide_to_long
>>> stubnames = sorted(
... set([match[0] for match in df.columns.str.findall(
... r'[A-B]\(.*\)').values if match != []])
... )
>>> list(stubnames)
['A(weekly)', 'B(weekly)']
All of the above examples have integers as suffixes. It is possible to
have non-integers as suffixes.
>>> df = pd.DataFrame({
... 'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
... 'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
... 'ht_one': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
... 'ht_two': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
... })
>>> df
famid birth ht_one ht_two
0 1 1 2.8 3.4
1 1 2 2.9 3.8
2 1 3 2.2 2.9
3 2 1 2.0 3.2
4 2 2 1.8 2.8
5 2 3 1.9 2.4
6 3 1 2.2 3.3
7 3 2 2.3 3.4
8 3 3 2.1 2.9
>>> l = pd.wide_to_long(df, stubnames='ht', i=['famid', 'birth'], j='age',
... sep='_', suffix=r'\w+')
>>> l
... # doctest: +NORMALIZE_WHITESPACE
ht
famid birth age
1 1 one 2.8
two 3.4
2 one 2.9
two 3.8
3 one 2.2
two 2.9
2 1 one 2.0
two 3.2
2 one 1.8
two 2.8
3 one 1.9
two 2.4
3 1 one 2.2
two 3.3
2 one 2.3
two 3.4
3 one 2.1
two 2.9
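The point about disambiguating suffixes can be sketched as follows. The frame and the pattern `(?:one|two)` are assumptions made for this example, not taken from the pandas documentation. A broad pattern such as `\w+` would also match the unrelated `A-rating` column, while the narrower pattern leaves it alone so that it is carried along unchanged.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [0, 1],
    "A-one": [1.0, 2.0],
    "A-two": [3.0, 4.0],
    "A-rating": ["good", "bad"],  # unrelated column that shares the stub "A"
})

# Only the suffixes "one" and "two" are treated as wide variables.
long_df = pd.wide_to_long(df, stubnames="A", i="id", j="kind",
                          sep="-", suffix="(?:one|two)")
print(long_df)
```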
12个pandas子模块又包含310个库函数(含类、方法、子模块):
import pandas as pd
funcs = [_ for _ in dir(pd) if not _.startswith('_')]
types = type(pd.DataFrame), type(pd.array), type(pd)
Names = 'Type','Function','Module','Other'
Types = {}
count = 0
for f in funcs:
t = type(eval("pd."+f))
t = Names[-1 if t not in types else types.index(type(eval("pd."+f)))]
Types[t] = Types.get(t,[])+[f]
for j,n in enumerate(Types['Module'],1):
print(f"\n{j}:【{n}】")
fun = [_ for _ in dir(eval('pd.'+n)) if not _.startswith('_')]
count += len(fun)
for i,f in enumerate(fun,1):
print(f'{f:18} ',end='' if i%5 or i==len(fun) else '\n')
print("\n小计:",len(fun))
print("合计:",count)
1:【api】
extensions indexers types
小计: 32:【arrays】
ArrowStringArray BooleanArray Categorical DatetimeArray FloatingArray
IntegerArray IntervalArray PandasArray PeriodArray SparseArray
StringArray TimedeltaArray
小计: 123:【compat】
F IS64 PY310 PY38 PY39
PYPY chainmap get_lzma_file import_lzma is_numpy_dev
is_platform_arm is_platform_linux is_platform_little_endian is_platform_mac is_platform_windows
np_array_datetime64_compat np_datetime64_compat np_version_under1p18 np_version_under1p19 np_version_under1p20
numpy pa_version_under1p0 pa_version_under2p0 pa_version_under3p0 pa_version_under4p0
pickle_compat platform pyarrow set_function_name sys
warnings
小计: 314:【core】
accessor aggregation algorithms api apply
array_algos arraylike arrays base common
computation config_init construction describe dtypes
flags frame generic groupby indexers
indexes indexing internals missing nanops
ops reshape roperator series shared_docs
sorting strings tools util window
小计: 355:【errors】
AbstractMethodError AccessorRegistrationWarning DtypeWarning DuplicateLabelError EmptyDataError
IntCastingNaNError InvalidIndexError MergeError NullFrequencyError NumbaUtilError
OptionError OutOfBoundsDatetime OutOfBoundsTimedelta ParserError ParserWarning
PerformanceWarning UnsortedIndexError UnsupportedFunctionCall
小计: 186:【io】
api clipboards common date_converters excel
feather_format formats gbq html json
orc parquet parsers pickle pytables
sas spss sql stata xml
小计: 207:【offsets】
BDay BMonthBegin BMonthEnd BQuarterBegin BQuarterEnd
BYearBegin BYearEnd BaseOffset BusinessDay BusinessHour
BusinessMonthBegin BusinessMonthEnd CBMonthBegin CBMonthEnd CDay
CustomBusinessDay CustomBusinessHour CustomBusinessMonthBegin CustomBusinessMonthEnd DateOffset
Day Easter FY5253 FY5253Quarter Hour
LastWeekOfMonth Micro Milli Minute MonthBegin
MonthEnd Nano QuarterBegin QuarterEnd Second
SemiMonthBegin SemiMonthEnd Tick Week WeekOfMonth
YearBegin YearEnd
小计: 428:【pandas】
BooleanDtype Categorical CategoricalDtype CategoricalIndex DataFrame
DateOffset DatetimeIndex DatetimeTZDtype ExcelFile ExcelWriter
Flags Float32Dtype Float64Dtype Float64Index Grouper
HDFStore Index IndexSlice Int16Dtype Int32Dtype
Int64Dtype Int64Index Int8Dtype Interval IntervalDtype
IntervalIndex MultiIndex NA NaT NamedAgg
Period PeriodDtype PeriodIndex RangeIndex Series
SparseDtype StringDtype Timedelta TimedeltaIndex Timestamp
UInt16Dtype UInt32Dtype UInt64Dtype UInt64Index UInt8Dtype
api array arrays bdate_range compat
concat core crosstab cut date_range
describe_option errors eval factorize get_dummies
get_option infer_freq interval_range io isna
isnull json_normalize lreshape melt merge
merge_asof merge_ordered notna notnull offsets
option_context options pandas period_range pivot
pivot_table plotting qcut read_clipboard read_csv
read_excel read_feather read_fwf read_gbq read_hdf
read_html read_json read_orc read_parquet read_pickle
read_sas read_spss read_sql read_sql_query read_sql_table
read_stata read_table read_xml reset_option set_eng_float_format
set_option show_versions test testing timedelta_range
to_datetime to_numeric to_pickle to_timedelta tseries
unique util value_counts wide_to_long
Subtotal: 119
9:【plotting】
PlotAccessor andrews_curves autocorrelation_plot bootstrap_plot boxplot
boxplot_frame boxplot_frame_groupby deregister_matplotlib_converters hist_frame hist_series
lag_plot parallel_coordinates plot_params radviz register_matplotlib_converters
scatter_matrix table
Subtotal: 17
10:【testing】
assert_extension_array_equal assert_frame_equal assert_index_equal assert_series_equal
Subtotal: 4
11:【tseries】
api frequencies offsets
Subtotal: 3
12:【util】
Appender Substitution cache_readonly hash_array hash_pandas_object
version
Subtotal: 6
Total: 310
The 8th entry, pandas, is the main module itself:
>>> dir(pd)==dir(pd.pandas)
True
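The subtotals and the total in the listing above are simply counts of public names per submodule. A minimal sketch of that tally follows; the helper names `public_names` and `tally_submodules` are hypothetical, and the standard-library `json` package is used as a stand-in so the block runs without pandas installed. Passing `pd` instead should give per-module counts like those above, though the exact numbers depend on the pandas version.

```python
import types
import json  # stand-in package (assumption); pass pd to tally pandas itself


def public_names(obj):
    """Return the names in dir(obj) that do not start with an underscore."""
    return [n for n in dir(obj) if not n.startswith('_')]


def tally_submodules(package):
    """Map each public submodule of `package` to its number of public names."""
    counts = {}
    for name in public_names(package):
        member = getattr(package, name)
        if isinstance(member, types.ModuleType):
            counts[name] = len(public_names(member))
    return counts


counts = tally_submodules(json)
for name, n in counts.items():
    print(f'{name:18} {n}')
print('Total:', sum(counts.values()))
```

Keeping the counts in a dict makes it easy to confirm that the printed total really is the sum of the subtotals.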
Expanding the 4th submodule, core, one level further:
import pandas as pd
# Public names exposed by pd.core
funcs = [_ for _ in dir(pd.core) if not _.startswith('_')]
# Reference types: class, plain function, module
types = type(pd.DataFrame), type(pd.array), type(pd)
Names = 'Type','Function','Module','Other'
Types = {}
count = 0
# Group every name by the kind of object it refers to
for f in funcs:
    t = type(getattr(pd.core, f))
    t = Names[types.index(t) if t in types else -1]
    Types[t] = Types.get(t,[])+[f]
# List the public names of each submodule, five per row
for j,n in enumerate(Types['Module'],1):
    print(f"\n{j}:【{n}】")
    fun = [_ for _ in dir(getattr(pd.core, n)) if not _.startswith('_')]
    count += len(fun)
    for i,f in enumerate(fun,1):
        print(f'{f:18} ',end='' if i%5 or i==len(fun) else '\n')
    print("\nSubtotal:",len(fun))
print("Total:",count)
That turns up another 1299 names:
1:【accessor】
CachedAccessor DirNamesMixin PandasDelegate annotations delegate_names
doc register_dataframe_accessor register_index_accessor register_series_accessor warnings
Subtotal: 10
2:【aggregation】
ABCSeries AggFuncType Any Callable DefaultDict
FrameOrSeries Hashable Index Iterable Sequence
SpecificationError TYPE_CHECKING annotations com defaultdict
is_dict_like is_list_like is_multi_agg_with_relabel maybe_mangle_lambdas normalize_keyword_aggregation
partial reconstruct_func relabel_result validate_func_kwargs
Subtotal: 24
3:【algorithms】
ABCDatetimeArray ABCExtensionArray ABCIndex ABCMultiIndex ABCRangeIndex
ABCSeries ABCTimedeltaArray AnyArrayLike ArrayLike DtypeObj
FrameOrSeriesUnion PandasDtype Scalar SelectN SelectNFrame
SelectNSeries TYPE_CHECKING Union algos annotations
cast checked_add_with_arr construct_1d_object_array_from_listlike dedent diff
doc duplicated ensure_float64 ensure_object ensure_platform_int
ensure_wrapped_if_datetimelike extract_array factorize factorize_array final
get_data_algo htable iNaT infer_dtype_from_array is_array_like
is_bool_dtype is_categorical_dtype is_complex_dtype is_datetime64_dtype is_extension_array_dtype
is_float_dtype is_integer is_integer_dtype is_list_like is_numeric_dtype
is_object_dtype is_scalar is_timedelta64_dtype isin isna
lib mode na_value_for_dtype needs_i8_conversion np
operator pandas_dtype pd_array quantile rank
safe_sort sanitize_to_nanoseconds searchsorted take take_nd
union_with_duplicates unique unique1d validate_indices value_counts
value_counts_arraylike warn
Subtotal: 77
4:【api】
BooleanDtype Categorical CategoricalDtype CategoricalIndex DataFrame
DateOffset DatetimeIndex DatetimeTZDtype Flags Float32Dtype
Float64Dtype Float64Index Grouper Index IndexSlice
Int16Dtype Int32Dtype Int64Dtype Int64Index Int8Dtype
Interval IntervalDtype IntervalIndex MultiIndex NA
NaT NamedAgg Period PeriodDtype PeriodIndex
RangeIndex Series StringDtype Timedelta TimedeltaIndex
Timestamp UInt16Dtype UInt32Dtype UInt64Dtype UInt64Index
UInt8Dtype array bdate_range date_range factorize
interval_range isna isnull notna notnull
period_range set_eng_float_format timedelta_range to_datetime to_numeric
to_timedelta unique value_counts
Subtotal: 58
5:【apply】
ABCDataFrame ABCNDFrame ABCSeries AggFuncType AggFuncTypeBase
AggFuncTypeDict AggObjType Any Apply Axis
DataError Dict FrameApply FrameColumnApply FrameOrSeries
FrameOrSeriesUnion FrameRowApply GroupByApply Hashable Iterator
List NDFrameApply ResType ResamplerWindowApply SelectionMixin
SeriesApply SpecificationError TYPE_CHECKING abc annotations
cache_readonly cast com create_series_with_explicit_dtype ensure_wrapped_if_datetimelike
frame_apply inspect is_dict_like is_extension_array_dtype is_list_like
is_nested_object is_sequence lib np option_context
pd_array safe_sort warnings
Subtotal: 48
6:【array_algos】
masked_reductions putmask quantile replace take
transforms
Subtotal: 6
7:【arraylike】
Any OpsMixin array_ufunc extract_array lib
maybe_dispatch_ufunc_to_dunder_op np operator roperator unpack_zerodim_and_defer
warnings
Subtotal: 11
8:【arrays】
ArrowStringArray BaseMaskedArray BooleanArray Categorical DatetimeArray
ExtensionArray ExtensionOpsMixin ExtensionScalarOpsMixin FloatingArray IntegerArray
IntervalArray PandasArray PeriodArray SparseArray StringArray
TimedeltaArray base boolean categorical datetimelike
datetimes floating integer interval masked
numeric numpy_ period period_array sparse
string_ string_arrow timedeltas
Subtotal: 33
9:【base】
ABCDataFrame ABCIndex ABCSeries AbstractMethodError Any
ArrayLike DataError DirNamesMixin Dtype DtypeObj
ExtensionArray FrameOrSeries Generic Hashable IndexLabel
IndexOpsMixin NoNewAttributesMixin OpsMixin PYPY PandasObject
SelectionMixin Shape SpecificationError TYPE_CHECKING TypeVar
algorithms annotations cache_readonly cast create_series_with_explicit_dtype
doc duplicated final is_categorical_dtype is_dict_like
is_extension_array_dtype is_object_dtype is_scalar isna lib
nanops np nv remove_na_arraylike textwrap
unique1d value_counts
Subtotal: 47
10:【common】
ABCExtensionArray ABCIndex ABCSeries Any AnyArrayLike
Callable Collection Iterable Iterator NpDtype
Scalar SettingWithCopyError SettingWithCopyWarning T TYPE_CHECKING
abc all_none all_not_none annotations any_none
any_not_none apply_if_callable asarray_tuplesafe builtins cast
cast_scalar_indexer consensus_name_attr construct_1d_object_array_from_listlike contextlib convert_to_list_like
count_not_none defaultdict flatten get_callable_name get_cython_func
get_rename_function index_labels_to_array inspect is_array_like is_bool_dtype
is_bool_indexer is_builtin_func is_extension_array_dtype is_full_slice is_integer
is_null_slice is_true_slices isna iterable_not_string lib
maybe_iterable_to_list maybe_make_list not_none np np_version_under1p18
partial pipe random_state require_length_match standardize_mapping
temp_setattr warnings
Subtotal: 62
11:【computation】
align api check common engines
eval expr expressions ops parsing
pytables scope
Subtotal: 12
12:【config_init】
cf chained_assignment colheader_justify_doc data_manager_doc float_format_doc
is_bool is_callable is_instance_factory is_int is_nonnegative_int
is_one_of_factory is_terminal is_text max_cols max_colwidth_doc
os parquet_engine_doc pc_ambiguous_as_wide_doc pc_chop_threshold_doc pc_colspace_doc
pc_east_asian_width_doc pc_expand_repr_doc pc_html_border_doc pc_html_use_mathjax_doc pc_large_repr_doc
pc_latex_escape pc_latex_longtable pc_latex_multicolumn pc_latex_multicolumn_format pc_latex_multirow
pc_latex_repr_doc pc_max_categories_doc pc_max_cols_doc pc_max_info_cols_doc pc_max_info_rows_doc
pc_max_rows_doc pc_max_seq_items pc_memory_usage_doc pc_min_rows_doc pc_multi_sparse_doc
pc_nb_repr_h_doc pc_pprint_nest_depth pc_precision_doc pc_show_dimensions_doc pc_table_schema_doc
pc_width_doc plotting_backend_doc reader_engine_doc register_converter_cb register_converter_doc
register_plotting_backend_cb sql_engine_doc string_storage_doc styler_max_elements styler_sparse_columns_doc
styler_sparse_index_doc table_schema_cb tc_sim_interactive_doc use_bottleneck_cb use_bottleneck_doc
use_inf_as_na_cb use_inf_as_na_doc use_inf_as_null_doc use_numba_cb use_numba_doc
use_numexpr_cb use_numexpr_doc warnings writer_engine_doc
Subtotal: 69
13:【construction】
ABCExtensionArray ABCIndex ABCPandasArray ABCRangeIndex ABCSeries
Any AnyArrayLike ArrayLike DatetimeTZDtype Dtype
DtypeObj ExtensionDtype IntCastingNaNError Sequence TYPE_CHECKING
annotations array cast com construct_1d_arraylike_from_scalar
construct_1d_object_array_from_listlike create_series_with_explicit_dtype ensure_wrapped_if_datetimelike extract_array is_datetime64_ns_dtype
is_empty_data is_extension_array_dtype is_float_dtype is_integer_dtype is_list_like
is_object_dtype is_timedelta64_ns_dtype isna lib ma
maybe_cast_to_datetime maybe_cast_to_integer_array maybe_convert_platform maybe_infer_to_datetimelike maybe_upcast
np range_to_ndarray registry sanitize_array sanitize_masked_array
sanitize_to_nanoseconds warnings
Subtotal: 47
14:【describe】
ABC Callable DataFrameDescriber FrameOrSeries FrameOrSeriesUnion
Hashable NDFrameDescriberAbstract Sequence SeriesDescriber TYPE_CHECKING
Timestamp abstractmethod annotations cast concat
describe_categorical_1d describe_ndframe describe_numeric_1d describe_timestamp_1d describe_timestamp_as_categorical_1d
format_percentiles is_bool_dtype is_datetime64_any_dtype is_numeric_dtype is_timedelta64_dtype
np refine_percentiles reorder_columns select_describe_func validate_percentile
warnings
Subtotal: 31
15:【dtypes】
api base cast common concat
dtypes generic inference missing
Subtotal: 9
16:【flags】
Flags weakref
Subtotal: 2
17:【frame】
AggFuncType Any AnyArrayLike AnyStr Appender
ArrayLike ArrayManager Axes Axis BaseInfo
BlockManager CachedAccessor Callable CategoricalIndex ColspaceArgType
CompressionOptions DataFrame DataFrameInfo DatetimeArray DatetimeIndex
Dtype ExtensionArray ExtensionDtype FilePathOrBuffer FillnaOptions
FloatFormatType FormattersType FrameOrSeriesUnion Frequency Hashable
IO Index IndexKeyFunc IndexLabel Iterable
Iterator Level MultiIndex NDFrame NpDtype
OpsMixin PeriodIndex PythonFuncType Renamer Scalar
Sequence Series SparseFrameAccessor StorageOptions StringIO
Substitution Suffixes TYPE_CHECKING TimedeltaArray ValueKeyFunc
abc algorithms annotations arrays_to_mgr cast
check_bool_indexer check_key_length collections com console
construct_1d_arraylike_from_scalar construct_2d_arraylike_from_scalar convert_to_index_sliceable dataclasses_to_dicts datetime
dedent deprecate_kwarg deprecate_nonkeyword_arguments dict_to_mgr doc
duplicated ensure_index ensure_index_from_sequences ensure_platform_int extract_array
find_common_type fmt functools generic get_group_index
get_handle get_option ibase import_optional_dependency infer_dtype_from_object
infer_dtype_from_scalar invalidate_string_dtypes is_1d_only_ea_dtype is_1d_only_ea_obj is_bool_dtype
is_dataclass is_datetime64_any_dtype is_dict_like is_dtype_equal is_extension_array_dtype
is_float is_float_dtype is_hashable is_integer is_integer_dtype
is_iterator is_list_like is_object_dtype is_scalar is_sequence
isna itertools lexsort_indexer lib libalgos
ma maybe_box_native maybe_downcast_to_dtype maybe_droplevels melt
mgr_to_mgr mmap nanops nargsort ndarray_to_mgr
nested_data_to_arrays no_default notna np nv
ops overload pandas pandas_dtype properties
rec_array_to_mgr reconstruct_func relabel_result reorder_arrays rewrite_axis_style_signature
sanitize_array sanitize_masked_array take_2d_multi to_arrays treat_as_nested
validate_axis_style_args validate_bool_kwarg validate_numeric_casting validate_percentile warnings
Subtotal: 150
18:【generic】
ABCDataFrame ABCSeries AbstractMethodError Any AnyStr
ArrayManager Axis BlockManager Callable CompressionOptions
DataFrameFormatter DataFrameRenderer DatetimeIndex Dtype DtypeArg
DtypeObj Expanding ExponentialMovingWindow ExtensionArray FilePathOrBuffer
Flags FrameOrSeries Hashable Index IndexKeyFunc
IndexLabel InvalidIndexError JSONSerializable Level Manager
Mapping MultiIndex NDFrame NpDtype PandasObject
Period PeriodIndex RangeIndex Renamer Rolling
Sequence SingleArrayManager StorageOptions T TYPE_CHECKING
Tick TimedeltaConvertibleTypes Timestamp TimestampConvertibleTypes ValueKeyFunc
Window algos align_method_FRAME annotations arraylike
bool_t cast collections com concat
config create_series_with_explicit_dtype describe_ndframe doc ensure_index
ensure_object ensure_platform_int ensure_str extract_array final
find_valid_index fmt functools gc get_indexer_indexer
ibase import_optional_dependency indexing is_bool is_bool_dtype
is_datetime64_any_dtype is_datetime64tz_dtype is_dict_like is_dtype_equal is_extension_array_dtype
is_float is_hashable is_list_like is_nested_list_like is_number
is_numeric_dtype is_object_dtype is_re_compilable is_scalar is_timedelta64_dtype
isna json lib mgr_to_mgr missing
nanops notna np nv operator
overload pandas_dtype pickle pprint_thing re
rewrite_axis_style_signature timedelta to_offset validate_ascending validate_bool_kwarg
validate_fillna_kwargs warnings weakref
Subtotal: 118
19:【groupby】
DataFrameGroupBy GroupBy Grouper NamedAgg SeriesGroupBy
base categorical generic groupby grouper
numba_ ops
Subtotal: 12
20:【indexers】
ABCIndex ABCSeries Any AnyArrayLike ArrayLike
TYPE_CHECKING annotations check_array_indexer check_key_length check_setitem_lengths
deprecate_ndim_indexing is_array_like is_bool_dtype is_empty_indexer is_exact_shape_match
is_extension_array_dtype is_integer is_integer_dtype is_list_like is_list_like_indexer
is_scalar_indexer is_valid_positional_slice length_of_indexer maybe_convert_indices np
unpack_1tuple validate_indices warnings
Subtotal: 28
21:【indexes】
accessors api base category datetimelike
datetimes extension frozen interval multi
numeric period range timedeltas
Subtotal: 14
22:【indexing】
ABCDataFrame ABCSeries AbstractMethodError Any CategoricalIndex
Hashable Index IndexSlice IndexingError IndexingMixin
IntervalIndex InvalidIndexError MultiIndex NDFrameIndexerBase Sequence
TYPE_CHECKING algos annotations check_array_indexer check_bool_indexer
com concat_compat convert_from_missing_indexer_tuple convert_missing_indexer convert_to_index_sliceable
doc ensure_index extract_array infer_fill_value is_array_like
is_bool_dtype is_empty_indexer is_exact_shape_match is_hashable is_integer
is_iterator is_label_like is_list_like is_list_like_indexer is_nested_tuple
is_numeric_dtype is_object_dtype is_scalar is_sequence isna
item_from_zerodim length_of_indexer maybe_convert_ix need_slice needs_i8_conversion
np pd_array suppress warnings
Subtotal: 54
23:【internals】
ArrayManager Block BlockManager DataManager DatetimeTZBlock
ExtensionBlock NumericBlock ObjectBlock SingleArrayManager SingleBlockManager
SingleDataManager api array_manager base blocks
concat concatenate_managers construction create_block_manager_from_arrays create_block_manager_from_blocks
make_block managers ops
Subtotal: 23
24:【missing】
Any ArrayLike Axis F NP_METHODS
SP_METHODS TYPE_CHECKING algos annotations cast
check_value_size clean_fill_method clean_interp_method clean_reindex_fill_method find_valid_index
get_fill_func import_optional_dependency infer_dtype_from interpolate_1d interpolate_2d
interpolate_2d_with_fill interpolate_array_2d is_array_like is_numeric_v_string_like is_valid_na_for_dtype
isna lib mask_missing na_value_for_dtype needs_i8_conversion
np partial wraps
Subtotal: 33
25:【nanops】
Any ArrayLike Dtype DtypeObj F
NaT NaTType PeriodDtype Scalar Shape
Timedelta annotations bn bottleneck_switch cast
check_below_min_count disallow extract_array functools get_corr_func
get_dtype get_empty_reduction_result get_option iNaT import_optional_dependency
is_any_int_dtype is_bool_dtype is_complex is_datetime64_any_dtype is_float
is_float_dtype is_integer is_integer_dtype is_numeric_dtype is_object_dtype
is_scalar is_timedelta64_dtype isna itertools lib
make_nancomp na_accum_func na_value_for_dtype nanall nanany
nanargmax nanargmin nancorr nancov naneq
nange nangt nankurt nanle nanlt
nanmax nanmean nanmedian nanmin nanne
nanpercentile nanprod nansem nanskew nanstd
nansum nanvar needs_i8_conversion notna np
np_percentile_argname operator pandas_dtype set_use_bottleneck warnings
Subtotal: 75
26:【ops】
ABCDataFrame ABCSeries ARITHMETIC_BINOPS Appender COMPARISON_BINOPS
Level TYPE_CHECKING add_flex_arithmetic_methods algorithms align_method_FRAME
align_method_SERIES annotations arithmetic_op array_ops common
comp_method_OBJECT_ARRAY comparison_op dispatch docstrings fill_binop
flex_arith_method_FRAME flex_comp_method_FRAME flex_method_SERIES frame_arith_method_with_reindex get_array_op
get_op_result_name invalid invalid_comparison is_array_like is_list_like
isna kleene_and kleene_or kleene_xor logical_op
make_flex_doc mask_ops maybe_dispatch_ufunc_to_dunder_op maybe_prepare_scalar_for_op methods
missing np operator radd rand_
rdiv rdivmod rfloordiv rmod rmul
roperator ror_ rpow rsub rtruediv
rxor should_reindex_frame_op unpack_zerodim_and_defer warnings
Subtotal: 59
27:【reshape】
api concat melt merge pivot
reshape tile util
Subtotal: 8
28:【roperator】
operator radd rand_ rdiv rdivmod
rfloordiv rmod rmul ror_ rpow
rsub rtruediv rxor
Subtotal: 13
29:【series】
ABCDataFrame AggFuncType Any Appender ArrayLike
Axis CachedAccessor Callable CategoricalAccessor CategoricalIndex
CombinedDatetimelikeProperties DatetimeIndex Dtype DtypeObj ExtensionArray
FillnaOptions Float64Index FrameOrSeriesUnion Hashable IO
Index IndexKeyFunc InvalidIndexError Iterable MultiIndex
NDFrame NpDtype PeriodIndex Sequence Series
SeriesApply SingleArrayManager SingleBlockManager SingleManager SparseAccessor
StorageOptions StringIO StringMethods Substitution TYPE_CHECKING
TimedeltaIndex Union ValueKeyFunc algorithms annotations
base cast check_bool_indexer com convert_dtypes
create_series_with_explicit_dtype dedent deprecate_ndim_indexing deprecate_nonkeyword_arguments doc
ensure_index ensure_key_mapped ensure_platform_int ensure_wrapped_if_datetimelike extract_array
fmt generic get_option get_terminal_size ibase
is_bool is_dict_like is_empty_data is_hashable is_integer
is_iterator is_list_like is_object_dtype is_scalar isna
lib maybe_box_native maybe_cast_pointwise_result missing na_value_for_dtype
nanops nargsort no_default notna np
nv ops overload pandas pandas_dtype
properties remove_na_arraylike reshape sanitize_array to_datetime
tslibs unpack_1tuple validate_all_hashable validate_bool_kwarg validate_numeric_casting
validate_percentile warnings weakref
Subtotal: 103
30:【shared_docs】
annotations
Subtotal: 1
31:【sorting】
ABCMultiIndex ABCRangeIndex Callable DefaultDict IndexKeyFunc
Iterable Sequence Shape TYPE_CHECKING algos
annotations compress_group_index decons_group_index decons_obs_group_ids defaultdict
ensure_int64 ensure_key_mapped ensure_platform_int extract_array get_compressed_ids
get_flattened_list get_group_index get_group_index_sorter get_indexer_dict get_indexer_indexer
hashtable indexer_from_factorized is_extension_array_dtype is_int64_overflow_possible isna
lexsort_indexer lib nargminmax nargsort np
unique_label_indices
Subtotal: 36
32:【strings】
BaseStringArrayMethods StringMethods accessor base object_array
Subtotal: 5
33:【tools】
datetimes numeric timedeltas times
Subtotal: 4
34:【util】
hashing numba_
Subtotal: 2
35:【window】
Expanding ExpandingGroupby ExponentialMovingWindow ExponentialMovingWindowGroupby Rolling
RollingGroupby Window common doc ewm
expanding indexers numba_ online rolling
Subtotal: 15
Total: 1299
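The walk over pd.core sorts names into Type, Function, Module and Other by comparing exact types. The same grouping can be approximated with the standard `inspect` predicates. The sketch below makes a few assumptions: the helper name `classify_members` is hypothetical, and the standard-library `json` package stands in for `pd.core` so the block runs without pandas. Note one difference: `inspect.isclass` also accepts classes built with a custom metaclass, which an exact comparison against `type(pd.DataFrame)` would place under Other.

```python
import inspect
import json  # stand-in package (assumption); pass pd.core to group its names


def classify_members(module):
    """Group the public names of `module` by the kind of object they refer to."""
    groups = {}
    for name in dir(module):
        if name.startswith('_'):
            continue
        obj = getattr(module, name)
        if inspect.ismodule(obj):
            kind = 'Module'
        elif inspect.isclass(obj):
            kind = 'Type'
        elif inspect.isroutine(obj):
            kind = 'Function'
        else:
            kind = 'Other'
        groups.setdefault(kind, []).append(name)
    return groups


groups = classify_members(json)
for kind, names in groups.items():
    print(f'{kind}: {len(names)}')
```

Many of the names listed under each submodule (np, warnings, annotations, the typing aliases) are imports that happen to live in the module namespace, so the counts say more about a module's imports than about its public API.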
To be continued...
Next post:
https://hannyang.blog.csdn.net/article/details/128431737