Previous post:
Python pandas library | Of three thousand waters, I take only one ladle to drink (5), Hann Yang's blog on CSDN
DataFrame class methods (211 in total, including 18 subclasses and 2 submodules)
>>> import pandas as pd
>>> funcs = [_ for _ in dir(pd.DataFrame) if 'a'<=_[0]<='z']
>>> len(funcs)
211
>>> for i,f in enumerate(funcs,1):
...     print(f'{f:18}',end='' if i%5 else '\n')
abs add add_prefix add_suffix agg
aggregate align all any append
apply applymap asfreq asof assign
astype at at_time attrs axes
backfill between_time bfill bool boxplot
clip columns combine combine_first compare
convert_dtypes copy corr corrwith count
cov cummax cummin cumprod cumsum
describe diff div divide dot
drop drop_duplicates droplevel dropna dtypes
duplicated empty eq equals eval
ewm expanding explode ffill fillna
filter first first_valid_index flags floordiv
from_dict from_records ge get groupby
gt head hist iat idxmax
idxmin iloc index infer_objects info
insert interpolate isin isna isnull
items iteritems iterrows itertuples join
keys kurt kurtosis last last_valid_index
le loc lookup lt mad
mask max mean median melt
memory_usage merge min mod mode
mul multiply ndim ne nlargest
notna notnull nsmallest nunique pad
pct_change pipe pivot pivot_table plot
pop pow prod product quantile
query radd rank rdiv reindex
reindex_like rename rename_axis reorder_levels replace
resample reset_index rfloordiv rmod rmul
rolling round rpow rsub rtruediv
sample select_dtypes sem set_axis set_flags
set_index shape shift size skew
slice_shift sort_index sort_values sparse squeeze
stack std style sub subtract
sum swapaxes swaplevel tail take
to_clipboard to_csv to_dict to_excel to_feather
to_gbq to_hdf to_html to_json to_latex
to_markdown to_numpy to_parquet to_period to_pickle
to_records to_sql to_stata to_string to_timestamp
to_xarray to_xml transform transpose truediv
truncate tshift tz_convert tz_localize unstack
update value_counts values var where
xs
The Series class happens to have 211 as well:
>>> funcs = [_ for _ in dir(pd.Series) if 'a'<=_[0]<='z']
>>> len(funcs)
211
>>> for i,f in enumerate(funcs,1):
...     print(f'{f:18}',end='' if i%5 else '\n')
abs add add_prefix add_suffix agg
aggregate align all any append
apply argmax argmin argsort array
asfreq asof astype at at_time
attrs autocorr axes backfill between
between_time bfill bool cat clip
combine combine_first compare convert_dtypes copy
corr count cov cummax cummin
cumprod cumsum describe diff div
divide divmod dot drop drop_duplicates
droplevel dropna dt dtype dtypes
duplicated empty eq equals ewm
expanding explode factorize ffill fillna
filter first first_valid_index flags floordiv
ge get groupby gt hasnans
head hist iat idxmax idxmin
iloc index infer_objects interpolate is_monotonic
is_monotonic_decreasingis_monotonic_increasingis_unique isin isna
isnull item items iteritems keys
kurt kurtosis last last_valid_index le
loc lt mad map mask
max mean median memory_usage min
mod mode mul multiply name
nbytes ndim ne nlargest notna
notnull nsmallest nunique pad pct_change
pipe plot pop pow prod
product quantile radd rank ravel
rdiv rdivmod reindex reindex_like rename
rename_axis reorder_levels repeat replace resample
reset_index rfloordiv rmod rmul rolling
round rpow rsub rtruediv sample
searchsorted sem set_axis set_flags shape
shift size skew slice_shift sort_index
sort_values sparse squeeze std str
sub subtract sum swapaxes swaplevel
tail take to_clipboard to_csv to_dict
to_excel to_frame to_hdf to_json to_latex
to_list to_markdown to_numpy to_period to_pickle
to_sql to_string to_timestamp to_xarray tolist
transform transpose truediv truncate tshift
tz_convert tz_localize unique unstack update
value_counts values var view where
xs
The two classes share 181 names, and each has 30 names that the other lacks:
>>> A,B = [_ for _ in dir(pd.DataFrame) if 'a'<=_[0]<='z'],[_ for _ in dir(pd.Series) if 'a'<=_[0]<='z']
>>> len(set(A)&set(B))
181
>>> len(set(A)|set(B))
241
>>> len(set(A)-set(B))
30
>>> len(set(B)-set(A))
30
>>> for i,f in enumerate(set(A)-set(B),1):
...     print(f'{f:18}',end='' if i%5 else '\n')
boxplot to_html from_dict to_xml info
corrwith eval to_parquet to_records join
stack columns melt iterrows to_feather
applymap to_stata style pivot set_index
assign itertuples lookup query select_dtypes
from_records insert merge to_gbq pivot_table
>>>
>>> for i,f in enumerate(set(B)-set(A),1):
...     print(f'{f:18}',end='' if i%5 else '\n')
factorize nbytes between to_list str
argsort rdivmod argmax tolist item
is_monotonic_increasingdt autocorr is_monotonic_decreasingview
repeat name array map dtype
divmod to_frame unique ravel searchsorted
hasnans is_unique is_monotonic cat argmin
>>>
>>> for i,f in enumerate(set(A)&set(B),1):
...     print(f'{f:18}',end='' if i%5 else '\n')
lt get reorder_levels reindex_like rfloordiv
rtruediv gt diff index update
add_prefix swapaxes reset_index mod reindex
product apply set_flags to_numpy cumprod
min transpose kurtosis to_latex median
eq last_valid_index rename pow all
loc to_pickle squeeze divide duplicated
to_json sort_values astype resample shape
to_xarray to_period kurt ffill idxmax
plot to_clipboard cumsum nlargest var
add abs any tshift nunique
count combine keys values set_axis
isnull sparse first_valid_index combine_first ewm
notnull empty mask truncate to_csv
bool at clip radd to_markdown
value_counts first isna between_time replace
sample idxmin div iloc add_suffix
pipe to_sql items max rsub
flags sem to_string to_excel prod
fillna backfill align pct_change expanding
nsmallest append attrs rmod bfill
ndim rank floordiv unstack groupby
skew quantile copy ne describe
sort_index truediv mode dropna drop
compare tz_convert cov equals memory_usage
sub pad rename_axis ge mean
last cummin notna agg convert_dtypes
round transform asof isin asfreq
slice_shift xs mad infer_objects rpow
drop_duplicates mul cummax corr droplevel
dtypes subtract rdiv filter multiply
to_dict le dot aggregate pop
rolling where interpolate head tail
size iteritems rmul take iat
to_hdf to_timestamp shift hist std
sum at_time tz_localize axes swaplevel
explode
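The listings above fuse long names such as is_monotonic_decreasing with their neighbours, because the fixed width of 18 is shorter than the name, and the set differences print in arbitrary order. A minimal sketch of one way around both issues follows: it derives the column width from the longest name and sorts the names first. The helpers `public_names` and `format_columns` are hypothetical names introduced here, and the counts will differ between pandas versions.

```python
import pandas as pd


def public_names(cls):
    """Return attribute names of cls that start with a lowercase letter."""
    return sorted(name for name in dir(cls) if 'a' <= name[0] <= 'z')


def format_columns(names, per_row=5):
    """Lay names out in rows, padding each to the longest name plus two spaces."""
    names = list(names)
    width = max(map(len, names), default=0) + 2
    rows = []
    for start in range(0, len(names), per_row):
        row = names[start:start + per_row]
        rows.append(''.join(f'{name:{width}}' for name in row).rstrip())
    return '\n'.join(rows)


frame_names = set(public_names(pd.DataFrame))
series_names = set(public_names(pd.Series))

# Sorting makes the output reproducible; plain set iteration order is not.
only_in_series = sorted(series_names - frame_names)
print(format_columns(only_in_series))
```

Because the width is computed per call, the same helper can be reused for the intersection or for either class on its own.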
The help text for all of these functions has been uploaded to this site's resources section; feel free to download it:
https://download.csdn.net/download/boysoft2002/87343363
The to_ family of functions: 22 in total (1-11)
Function01
to_clipboard(self, excel: 'bool_t' = True, sep: 'str | None' = None, **kwargs) -> 'None'
Copy object to the system clipboard.
Help on function to_clipboard in module pandas.core.generic:
to_clipboard(self, excel: 'bool_t' = True, sep: 'str | None' = None, **kwargs) -> 'None'
Copy object to the system clipboard.
Write a text representation of object to the system clipboard.
This can be pasted into Excel, for example.
Parameters
----------
excel : bool, default True
Produce output in a csv format for easy pasting into excel.
- True, use the provided separator for csv pasting.
- False, write a string representation of the object to the clipboard.
sep : str, default ``'\t'``
Field delimiter.
**kwargs
These parameters will be passed to DataFrame.to_csv.
See Also
--------
DataFrame.to_csv : Write a DataFrame to a comma-separated values
(csv) file.
read_clipboard : Read text from clipboard and pass to read_table.
Notes
-----
Requirements for your platform.
- Linux : `xclip`, or `xsel` (with `PyQt4` modules)
- Windows : none
- OS X : none
Examples
--------
Copy the contents of a DataFrame to the clipboard.
>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'C'])
>>> df.to_clipboard(sep=',') # doctest: +SKIP
... # Wrote the following to the system clipboard:
... # ,A,B,C
... # 0,1,2,3
... # 1,4,5,6
We can omit the index by passing the keyword `index` and setting
it to false.
>>> df.to_clipboard(sep=',', index=False) # doctest: +SKIP
... # Wrote the following to the system clipboard:
... # A,B,C
... # 1,2,3
... # 4,5,6
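Since to_clipboard needs a working system clipboard, it can be handy to preview the text it would copy. The sketch below assumes that the excel=True path is roughly equivalent to calling to_csv with a tab separator, as the parameter description suggests; `clipboard_preview` is a hypothetical helper, not part of pandas.

```python
import io

import pandas as pd


def clipboard_preview(obj, sep='\t', **kwargs):
    """Build the delimited text without touching the system clipboard."""
    buffer = io.StringIO()
    # Extra keyword arguments are forwarded to to_csv, mirroring **kwargs above.
    obj.to_csv(buffer, sep=sep, **kwargs)
    return buffer.getvalue()


df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'C'])
preview = clipboard_preview(df, index=False)
print(preview)
```

The tab separator is what lets a paste land in separate spreadsheet cells; passing sep=',' reproduces the comma-separated text shown in the examples above.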
Function02
to_csv(self, path_or_buf: 'FilePathOrBuffer[AnyStr] | None' = None, sep: 'str' = ',', na_rep: 'str' = '', float_format: 'str | None' = None, columns: 'Sequence[Hashable] | None' = None, header: 'bool_t | list[str]' = True, index: 'bool_t' = True, index_label: 'IndexLabel | None' = None, mode: 'str' = 'w', encoding: 'str | None' = None, compression: 'CompressionOptions' = 'infer', quoting: 'int | None' = None, quotechar: 'str' = '"', line_terminator: 'str | None' = None, chunksize: 'int | None' = None, date_format: 'str | None' = None, doublequote: 'bool_t' = True, escapechar: 'str | None' = None, decimal: 'str' = '.', errors: 'str' = 'strict', storage_options: 'StorageOptions' = None) -> 'str | None'
Help on function to_csv in module pandas.core.generic:
to_csv(self, path_or_buf: 'FilePathOrBuffer[AnyStr] | None' = None, sep: 'str' = ',', na_rep: 'str' = '', float_format: 'str | None' = None, columns: 'Sequence[Hashable] | None' = None, header: 'bool_t | list[str]' = True, index: 'bool_t' = True, index_label: 'IndexLabel | None' = None, mode: 'str' = 'w', encoding: 'str | None' = None, compression: 'CompressionOptions' = 'infer', quoting: 'int | None' = None, quotechar: 'str' = '"', line_terminator: 'str | None' = None, chunksize: 'int | None' = None, date_format: 'str | None' = None, doublequote: 'bool_t' = True, escapechar: 'str | None' = None, decimal: 'str' = '.', errors: 'str' = 'strict', storage_options: 'StorageOptions' = None) -> 'str | None'
Write object to a comma-separated values (csv) file.
Parameters
----------
path_or_buf : str or file handle, default None
File path or object, if None is provided the result is returned as
a string. If a non-binary file object is passed, it should be opened
with `newline=''`, disabling universal newlines. If a binary
file object is passed, `mode` might need to contain a `'b'`.
.. versionchanged:: 1.2.0
Support for binary file objects was introduced.
sep : str, default ','
String of length 1. Field delimiter for the output file.
na_rep : str, default ''
Missing data representation.
float_format : str, default None
Format string for floating point numbers.
columns : sequence, optional
Columns to write.
header : bool or list of str, default True
Write out the column names. If a list of strings is given it is
assumed to be aliases for the column names.
index : bool, default True
Write row names (index).
index_label : str or sequence, or False, default None
Column label for index column(s) if desired. If None is given, and
`header` and `index` are True, then the index names are used. A
sequence should be given if the object uses MultiIndex. If
False do not print fields for index names. Use index_label=False
for easier importing in R.
mode : str
Python write mode, default 'w'.
encoding : str, optional
A string representing the encoding to use in the output file,
defaults to 'utf-8'. `encoding` is not supported if `path_or_buf`
is a non-binary file object.
compression : str or dict, default 'infer'
If str, represents compression mode. If dict, value at 'method' is
the compression mode. Compression mode may be any of the following
possible values: {'infer', 'gzip', 'bz2', 'zip', 'xz', None}. If
compression mode is 'infer' and `path_or_buf` is path-like, then
detect compression mode from the following extensions: '.gz',
'.bz2', '.zip' or '.xz'. (otherwise no compression). If dict given
and mode is one of {'zip', 'gzip', 'bz2'}, or inferred as
one of the above, other entries passed as
additional compression options.
.. versionchanged:: 1.0.0
May now be a dict with key 'method' as compression mode
and other entries as additional compression options if
compression mode is 'zip'.
.. versionchanged:: 1.1.0
Passing compression options as keys in dict is
supported for compression modes 'gzip' and 'bz2'
as well as 'zip'.
.. versionchanged:: 1.2.0
Compression is supported for binary file objects.
.. versionchanged:: 1.2.0
Previous versions forwarded dict entries for 'gzip' to
`gzip.open` instead of `gzip.GzipFile` which prevented
setting `mtime`.
quoting : optional constant from csv module
Defaults to csv.QUOTE_MINIMAL. If you have set a `float_format`
then floats are converted to strings and thus csv.QUOTE_NONNUMERIC
will treat them as non-numeric.
quotechar : str, default '\"'
String of length 1. Character used to quote fields.
line_terminator : str, optional
The newline character or character sequence to use in the output
file. Defaults to `os.linesep`, which depends on the OS in which
this method is called ('\\n' for linux, '\\r\\n' for Windows, i.e.).
chunksize : int or None
Rows to write at a time.
date_format : str, default None
Format string for datetime objects.
doublequote : bool, default True
Control quoting of `quotechar` inside a field.
escapechar : str, default None
String of length 1. Character used to escape `sep` and `quotechar`
when appropriate.
decimal : str, default '.'
Character recognized as decimal separator. E.g. use ',' for
European data.
errors : str, default 'strict'
Specifies how encoding and decoding errors are to be handled.
See the errors argument for :func:`open` for a full list
of options.
.. versionadded:: 1.1.0
storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to ``urllib`` as header options. For other URLs (e.g.
starting with "s3://", and "gcs://") the key-value pairs are forwarded to
``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
.. versionadded:: 1.2.0
Returns
-------
None or str
If path_or_buf is None, returns the resulting csv format as a
string. Otherwise returns None.
See Also
--------
read_csv : Load a CSV file into a DataFrame.
to_excel : Write DataFrame to an Excel file.
Examples
--------
>>> df = pd.DataFrame({'name': ['Raphael', 'Donatello'],
... 'mask': ['red', 'purple'],
... 'weapon': ['sai', 'bo staff']})
>>> df.to_csv(index=False)
'name,mask,weapon\nRaphael,red,sai\nDonatello,purple,bo staff\n'
Create 'out.zip' containing 'out.csv'
>>> compression_opts = dict(method='zip',
... archive_name='out.csv') # doctest: +SKIP
>>> df.to_csv('out.zip', index=False,
... compression=compression_opts) # doctest: +SKIP
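As a small illustration of how several of the parameters above combine, the following sketch selects columns, formats floats, and replaces missing values, then reads the text back. The column names and values are made up for the example.

```python
import io

import pandas as pd

df = pd.DataFrame({'name': ['Raphael', 'Donatello'],
                   'score': [3.14159, None],      # None becomes NaN
                   'mask': ['red', 'purple']})

# With no path_or_buf the CSV text is returned as a string.
csv_text = df.to_csv(index=False,
                     columns=['name', 'score'],   # drop the 'mask' column
                     na_rep='NA',                 # how missing values are written
                     float_format='%.2f')         # two decimals for floats
print(csv_text)

# 'NA' is one of the markers read_csv treats as missing by default.
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```

Note that float_format rounds on output, so the round trip is lossy for the score column by design.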
Function03
to_dict(self, orient: 'str' = 'dict', into=<class 'dict'>)
Help on function to_dict in module pandas.core.frame:
to_dict(self, orient: 'str' = 'dict', into=<class 'dict'>)
Convert the DataFrame to a dictionary.
The type of the key-value pairs can be customized with the parameters
(see below).
Parameters
----------
orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
Determines the type of the values of the dictionary.
- 'dict' (default) : dict like {column -> {index -> value}}
- 'list' : dict like {column -> [values]}
- 'series' : dict like {column -> Series(values)}
- 'split' : dict like
{'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
- 'records' : list like
[{column -> value}, ... , {column -> value}]
- 'index' : dict like {index -> {column -> value}}
Abbreviations are allowed. `s` indicates `series` and `sp`
indicates `split`.
into : class, default dict
The collections.abc.Mapping subclass used for all Mappings
in the return value. Can be the actual class or an empty
instance of the mapping type you want. If you want a
collections.defaultdict, you must pass it initialized.
Returns
-------
dict, list or collections.abc.Mapping
Return a collections.abc.Mapping object representing the DataFrame.
The resulting transformation depends on the `orient` parameter.
See Also
--------
DataFrame.from_dict: Create a DataFrame from a dictionary.
DataFrame.to_json: Convert a DataFrame to JSON format.
Examples
--------
>>> df = pd.DataFrame({'col1': [1, 2],
... 'col2': [0.5, 0.75]},
... index=['row1', 'row2'])
>>> df
col1 col2
row1 1 0.50
row2 2 0.75
>>> df.to_dict()
{'col1': {'row1': 1, 'row2': 2}, 'col2': {'row1': 0.5, 'row2': 0.75}}
You can specify the return orientation.
>>> df.to_dict('series')
{'col1': row1 1
row2 2
Name: col1, dtype: int64,
'col2': row1 0.50
row2 0.75
Name: col2, dtype: float64}
>>> df.to_dict('split')
{'index': ['row1', 'row2'], 'columns': ['col1', 'col2'],
'data': [[1, 0.5], [2, 0.75]]}
>>> df.to_dict('records')
[{'col1': 1, 'col2': 0.5}, {'col1': 2, 'col2': 0.75}]
>>> df.to_dict('index')
{'row1': {'col1': 1, 'col2': 0.5}, 'row2': {'col1': 2, 'col2': 0.75}}
You can also specify the mapping type.
>>> from collections import OrderedDict, defaultdict
>>> df.to_dict(into=OrderedDict)
OrderedDict([('col1', OrderedDict([('row1', 1), ('row2', 2)])),
('col2', OrderedDict([('row1', 0.5), ('row2', 0.75)]))])
If you want a `defaultdict`, you need to initialize it:
>>> dd = defaultdict(list)
>>> df.to_dict('records', into=dd)
[defaultdict(<class 'list'>, {'col1': 1, 'col2': 0.5}),
defaultdict(<class 'list'>, {'col1': 2, 'col2': 0.75})]
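Two orientations are worth comparing when the dictionary has to be turned back into a DataFrame or handed to other code row by row. The sketch below reuses the frame from the examples above; passing orient by keyword is a choice made here for clarity, not a requirement of the signature shown.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]},
                  index=['row1', 'row2'])

# 'index' keeps the row labels as the outer keys, so it can be reversed
# with DataFrame.from_dict using the same orientation.
by_row = df.to_dict(orient='index')
restored = pd.DataFrame.from_dict(by_row, orient='index')
print(restored)

# 'records' drops the index; reset_index() first keeps the labels
# as an ordinary column named 'index'.
records = df.reset_index().to_dict(orient='records')
print(records)
```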
Function04
to_excel(self, excel_writer, sheet_name: 'str' = 'Sheet1', na_rep: 'str' = '', float_format: 'str | None' = None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None, storage_options: 'StorageOptions' = None) -> 'None'
Help on function to_excel in module pandas.core.generic:
to_excel(self, excel_writer, sheet_name: 'str' = 'Sheet1', na_rep: 'str' = '', float_format: 'str | None' = None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None, storage_options: 'StorageOptions' = None) -> 'None'
Write object to an Excel sheet.
To write a single object to an Excel .xlsx file it is only necessary to
specify a target file name. To write to multiple sheets it is necessary to
create an `ExcelWriter` object with a target file name, and specify a sheet
in the file to write to.
Multiple sheets may be written to by specifying unique `sheet_name`.
With all data written to the file it is necessary to save the changes.
Note that creating an `ExcelWriter` object with a file name that already
exists will result in the contents of the existing file being erased.
Parameters
----------
excel_writer : path-like, file-like, or ExcelWriter object
File path or existing ExcelWriter.
sheet_name : str, default 'Sheet1'
Name of sheet which will contain DataFrame.
na_rep : str, default ''
Missing data representation.
float_format : str, optional
Format string for floating point numbers. For example
``float_format="%.2f"`` will format 0.1234 to 0.12.
columns : sequence or list of str, optional
Columns to write.
header : bool or list of str, default True
Write out the column names. If a list of string is given it is
assumed to be aliases for the column names.
index : bool, default True
Write row names (index).
index_label : str or sequence, optional
Column label for index column(s) if desired. If not specified, and
`header` and `index` are True, then the index names are used. A
sequence should be given if the DataFrame uses MultiIndex.
startrow : int, default 0
Upper left cell row to dump data frame.
startcol : int, default 0
Upper left cell column to dump data frame.
engine : str, optional
Write engine to use, 'openpyxl' or 'xlsxwriter'. You can also set this
via the options ``io.excel.xlsx.writer``, ``io.excel.xls.writer``, and
``io.excel.xlsm.writer``.
.. deprecated:: 1.2.0
As the `xlwt <https://pypi.org/project/xlwt/>`__ package is no longer
maintained, the ``xlwt`` engine will be removed in a future version
of pandas.
merge_cells : bool, default True
Write MultiIndex and Hierarchical Rows as merged cells.
encoding : str, optional
Encoding of the resulting excel file. Only necessary for xlwt,
other writers support unicode natively.
inf_rep : str, default 'inf'
Representation for infinity (there is no native representation for
infinity in Excel).
verbose : bool, default True
Display more information in the error logs.
freeze_panes : tuple of int (length 2), optional
Specifies the one-based bottommost row and rightmost column that
is to be frozen.
storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to ``urllib`` as header options. For other URLs (e.g.
starting with "s3://", and "gcs://") the key-value pairs are forwarded to
``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
.. versionadded:: 1.2.0
See Also
--------
to_csv : Write DataFrame to a comma-separated values (csv) file.
ExcelWriter : Class for writing DataFrame objects into excel sheets.
read_excel : Read an Excel file into a pandas DataFrame.
read_csv : Read a comma-separated values (csv) file into DataFrame.
Notes
-----
For compatibility with :meth:`~DataFrame.to_csv`,
to_excel serializes lists and dicts to strings before writing.
Once a workbook has been saved it is not possible to write further
data without rewriting the whole workbook.
Examples
--------
Create, write to and save a workbook:
>>> df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],
... index=['row 1', 'row 2'],
... columns=['col 1', 'col 2'])
>>> df1.to_excel("output.xlsx") # doctest: +SKIP
To specify the sheet name:
>>> df1.to_excel("output.xlsx",
... sheet_name='Sheet_name_1') # doctest: +SKIP
If you wish to write to more than one sheet in the workbook, it is
necessary to specify an ExcelWriter object:
>>> df2 = df1.copy()
>>> with pd.ExcelWriter('output.xlsx') as writer: # doctest: +SKIP
... df1.to_excel(writer, sheet_name='Sheet_name_1')
... df2.to_excel(writer, sheet_name='Sheet_name_2')
ExcelWriter can also be used to append to an existing Excel file:
>>> with pd.ExcelWriter('output.xlsx',
... mode='a') as writer: # doctest: +SKIP
... df1.to_excel(writer, sheet_name='Sheet_name_3')
To set the library that is used to write the Excel file,
you can pass the `engine` keyword (the default engine is
automatically chosen depending on the file extension):
>>> df1.to_excel('output1.xlsx', engine='xlsxwriter') # doctest: +SKIP
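The multi-sheet pattern above can also target an in-memory buffer instead of a file, which avoids leaving output.xlsx on disk. This is a sketch under two assumptions: the openpyxl engine is installed (the code skips the write otherwise), and `frames_to_workbook` is a hypothetical helper introduced only for illustration.

```python
import importlib.util
import io

import pandas as pd


def frames_to_workbook(frames, engine='openpyxl'):
    """Write each DataFrame in `frames` to its own sheet and return the bytes."""
    buffer = io.BytesIO()
    # Leaving the with-block saves the workbook into the buffer.
    with pd.ExcelWriter(buffer, engine=engine) as writer:
        for sheet_name, frame in frames.items():
            frame.to_excel(writer, sheet_name=sheet_name)
    return buffer.getvalue()


df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],
                   index=['row 1', 'row 2'],
                   columns=['col 1', 'col 2'])
frames = {'Sheet_name_1': df1, 'Sheet_name_2': df1.copy()}

workbook_bytes = None
if importlib.util.find_spec('openpyxl') is not None:
    workbook_bytes = frames_to_workbook(frames)
    print(len(workbook_bytes), 'bytes written')
else:
    print('openpyxl is not installed; skipping the Excel write')
```

An .xlsx workbook is a zip archive, so the returned bytes can be saved, attached, or served as-is.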
Function05
to_feather(self, path: 'FilePathOrBuffer[AnyStr]', **kwargs) -> 'None'
Help on function to_feather in module pandas.core.frame:
to_feather(self, path: 'FilePathOrBuffer[AnyStr]', **kwargs) -> 'None'
Write a DataFrame to the binary Feather format.
Parameters
----------
path : str or file-like object
If a string, it will be used as Root Directory path.
**kwargs :
Additional keywords passed to :func:`pyarrow.feather.write_feather`.
Starting with pyarrow 0.17, this includes the `compression`,
`compression_level`, `chunksize` and `version` keywords.
.. versionadded:: 1.1.0
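The help text for to_feather carries no example, so here is a minimal round-trip sketch. It assumes pyarrow is installed (the write is skipped otherwise) and uses an in-memory buffer in place of a file path; the sample data is made up.

```python
import importlib.util
import io

import pandas as pd

# Feather is a column-oriented format; the default integer index is used here.
df = pd.DataFrame({'A': [1, 2, 3], 'B': ['x', 'y', 'z']})

restored = None
if importlib.util.find_spec('pyarrow') is not None:
    buffer = io.BytesIO()
    df.to_feather(buffer)        # path may be a file-like object
    buffer.seek(0)               # rewind before reading back
    restored = pd.read_feather(buffer)
    print(restored)
else:
    print('pyarrow is not installed; skipping the Feather round trip')
```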
Function06
to_gbq(self, destination_table: 'str', project_id: 'str | None' = None, chunksize: 'int | None' = None, reauth: 'bool' = False, if_exists: 'str' = 'fail', auth_local_webserver: 'bool' = False, table_schema: 'list[dict[str, str]] | None' = None, location: 'str | None' = None, progress_bar: 'bool' = True, credentials=None) -> 'None'
Help on function to_gbq in module pandas.core.frame:
to_gbq(self, destination_table: 'str', project_id: 'str | None' = None, chunksize: 'int | None' = None, reauth: 'bool' = False, if_exists: 'str' = 'fail', auth_local_webserver: 'bool' = False, table_schema: 'list[dict[str, str]] | None' = None, location: 'str | None' = None, progress_bar: 'bool' = True, credentials=None) -> 'None'
Write a DataFrame to a Google BigQuery table.
This function requires the `pandas-gbq package
<https://pandas-gbq.readthedocs.io>`__.
See the `How to authenticate with Google BigQuery
<https://pandas-gbq.readthedocs.io/en/latest/howto/authentication.html>`__
guide for authentication instructions.
Parameters
----------
destination_table : str
Name of table to be written, in the form ``dataset.tablename``.
project_id : str, optional
Google BigQuery Account project ID. Optional when available from
the environment.
chunksize : int, optional
Number of rows to be inserted in each chunk from the dataframe.
Set to ``None`` to load the whole dataframe at once.
reauth : bool, default False
Force Google BigQuery to re-authenticate the user. This is useful
if multiple accounts are used.
if_exists : str, default 'fail'
Behavior when the destination table exists. Value can be one of:
``'fail'``
If table exists raise pandas_gbq.gbq.TableCreationError.
``'replace'``
If table exists, drop it, recreate it, and insert data.
``'append'``
If table exists, insert data. Create if does not exist.
auth_local_webserver : bool, default False
Use the `local webserver flow`_ instead of the `console flow`_
when getting user credentials.
.. _local webserver flow:
https://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html#google_auth_oauthlib.flow.InstalledAppFlow.run_local_server
.. _console flow:
https://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html#google_auth_oauthlib.flow.InstalledAppFlow.run_console
*New in version 0.2.0 of pandas-gbq*.
table_schema : list of dicts, optional
List of BigQuery table fields to which according DataFrame
columns conform to, e.g. ``[{'name': 'col1', 'type':
'STRING'},...]``. If schema is not provided, it will be
generated according to dtypes of DataFrame columns. See
BigQuery API documentation on available names of a field.
*New in version 0.3.1 of pandas-gbq*.
location : str, optional
Location where the load job should run. See the `BigQuery locations
documentation
<https://cloud.google.com/bigquery/docs/dataset-locations>`__ for a
list of available locations. The location must match that of the
target dataset.
*New in version 0.5.0 of pandas-gbq*.
progress_bar : bool, default True
Use the library `tqdm` to show the progress bar for the upload,
chunk by chunk.
*New in version 0.5.0 of pandas-gbq*.
credentials : google.auth.credentials.Credentials, optional
Credentials for accessing Google APIs. Use this parameter to
override default credentials, such as to use Compute Engine
:class:`google.auth.compute_engine.Credentials` or Service
Account :class:`google.oauth2.service_account.Credentials`
directly.
*New in version 0.8.0 of pandas-gbq*.
See Also
--------
pandas_gbq.to_gbq : This function in the pandas-gbq library.
read_gbq : Read a DataFrame from Google BigQuery.
Function07
to_hdf(self, path_or_buf, key: 'str', mode: 'str' = 'a', complevel: 'int | None' = None, complib: 'str | None' = None, append: 'bool_t' = False, format: 'str | None' = None, index: 'bool_t' = True, min_itemsize: 'int | dict[str, int] | None' = None, nan_rep=None, dropna: 'bool_t | None' = None, data_columns: 'bool_t | list[str] | None' = None, errors: 'str' = 'strict', encoding: 'str' = 'UTF-8') -> 'None'
Help on function to_hdf in module pandas.core.generic:
to_hdf(self, path_or_buf, key: 'str', mode: 'str' = 'a', complevel: 'int | None' = None, complib: 'str | None' = None, append: 'bool_t' = False, format: 'str | None' = None, index: 'bool_t' = True, min_itemsize: 'int | dict[str, int] | None' = None, nan_rep=None, dropna: 'bool_t | None' = None, data_columns: 'bool_t | list[str] | None' = None, errors: 'str' = 'strict', encoding: 'str' = 'UTF-8') -> 'None'
Write the contained data to an HDF5 file using HDFStore.
Hierarchical Data Format (HDF) is self-describing, allowing an
application to interpret the structure and contents of a file with
no outside information. One HDF file can hold a mix of related objects
which can be accessed as a group or as individual objects.
In order to add another DataFrame or Series to an existing HDF file
please use append mode and a different key.
.. warning::
One can store a subclass of ``DataFrame`` or ``Series`` to HDF5,
but the type of the subclass is lost upon storing.
For more information see the :ref:`user guide <io.hdf5>`.
Parameters
----------
path_or_buf : str or pandas.HDFStore
File path or HDFStore object.
key : str
Identifier for the group in the store.
mode : {'a', 'w', 'r+'}, default 'a'
Mode to open file:
- 'w': write, a new file is created (an existing file with
the same name would be deleted).
- 'a': append, an existing file is opened for reading and
writing, and if the file does not exist it is created.
- 'r+': similar to 'a', but the file must already exist.
complevel : {0-9}, optional
Specifies a compression level for data.
A value of 0 disables compression.
complib : {'zlib', 'lzo', 'bzip2', 'blosc'}, default 'zlib'
Specifies the compression library to be used.
As of v0.20.2 these additional compressors for Blosc are supported
(default if no compressor specified: 'blosc:blosclz'):
{'blosc:blosclz', 'blosc:lz4', 'blosc:lz4hc', 'blosc:snappy',
'blosc:zlib', 'blosc:zstd'}.
Specifying a compression library which is not available issues
a ValueError.
append : bool, default False
For Table formats, append the input data to the existing.
format : {'fixed', 'table', None}, default 'fixed'
Possible values:
- 'fixed': Fixed format. Fast writing/reading. Not-appendable,
nor searchable.
- 'table': Table format. Write as a PyTables Table structure
which may perform worse but allow more flexible operations
like searching / selecting subsets of the data.
- If None, pd.get_option('io.hdf.default_format') is checked,
followed by fallback to "fixed"
errors : str, default 'strict'
Specifies how encoding and decoding errors are to be handled.
See the errors argument for :func:`open` for a full list
of options.
encoding : str, default "UTF-8"
min_itemsize : dict or int, optional
Map column names to minimum string sizes for columns.
nan_rep : Any, optional
How to represent null values as str.
Not allowed with append=True.
data_columns : list of columns or True, optional
List of columns to create as indexed data columns for on-disk
queries, or True to use all columns. By default only the axes
of the object are indexed. See :ref:`io.hdf5-query-data-columns`.
Applicable only to format='table'.
See Also
--------
read_hdf : Read from HDF file.
DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
DataFrame.to_sql : Write to a SQL table.
DataFrame.to_feather : Write out feather-format for DataFrames.
DataFrame.to_csv : Write out to a csv file.
Examples
--------
>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]},
... index=['a', 'b', 'c'])
>>> df.to_hdf('data.h5', key='df', mode='w')
We can add another object to the same file:
>>> s = pd.Series([1, 2, 3, 4])
>>> s.to_hdf('data.h5', key='s')
Reading from HDF file:
>>> pd.read_hdf('data.h5', 'df')
A B
a 1 4
b 2 5
c 3 6
>>> pd.read_hdf('data.h5', 's')
0 1
1 2
2 3
3 4
dtype: int64
Deleting file with data:
>>> import os
>>> os.remove('data.h5')
Function08
to_html(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'ColspaceArgType | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'FormattersType | None' = None, float_format: 'FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool | str' = False, decimal: 'str' = '.', bold_rows: 'bool' = True, classes: 'str | list | tuple | None' = None, escape: 'bool' = True, notebook: 'bool' = False, border: 'int | None' = None, table_id: 'str | None' = None, render_links: 'bool' = False, encoding: 'str | None' = None)
Help on function to_html in module pandas.core.frame:
to_html(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'ColspaceArgType | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'FormattersType | None' = None, float_format: 'FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool | str' = False, decimal: 'str' = '.', bold_rows: 'bool' = True, classes: 'str | list | tuple | None' = None, escape: 'bool' = True, notebook: 'bool' = False, border: 'int | None' = None, table_id: 'str | None' = None, render_links: 'bool' = False, encoding: 'str | None' = None)
Render a DataFrame as an HTML table.
Parameters
----------
buf : str, Path or StringIO-like, optional, default None
Buffer to write to. If None, the output is returned as a string.
columns : sequence, optional, default None
The subset of columns to write. Writes all columns by default.
col_space : str or int, list or dict of int or str, optional
The minimum width of each column in CSS length units. An int is assumed to be px units.
.. versionadded:: 0.25.0
Ability to use str.
header : bool, optional
Whether to print column labels, default True.
index : bool, optional, default True
Whether to print index (row) labels.
na_rep : str, optional, default 'NaN'
String representation of ``NaN`` to use.
formatters : list, tuple or dict of one-param. functions, optional
Formatter functions to apply to columns' elements by position or
name.
The result of each function must be a unicode string.
List/tuple must be of length equal to the number of columns.
float_format : one-parameter function, optional, default None
Formatter function to apply to columns' elements if they are
floats. This function must return a unicode string and will be
applied only to the non-``NaN`` elements, with ``NaN`` being
handled by ``na_rep``.
.. versionchanged:: 1.2.0
sparsify : bool, optional, default True
Set to False for a DataFrame with a hierarchical index to print
every multiindex key at each row.
index_names : bool, optional, default True
Prints the names of the indexes.
justify : str, default None
How to justify the column labels. If None uses the option from
the print configuration (controlled by set_option), 'right' out
of the box. Valid values are
* left
* right
* center
* justify
* justify-all
* start
* end
* inherit
* match-parent
* initial
* unset.
max_rows : int, optional
Maximum number of rows to display in the console.
min_rows : int, optional
The number of rows to display in the console in a truncated repr
(when number of rows is above `max_rows`).
max_cols : int, optional
Maximum number of columns to display in the console.
show_dimensions : bool, default False
Display DataFrame dimensions (number of rows by number of columns).
decimal : str, default '.'
Character recognized as decimal separator, e.g. ',' in Europe.
bold_rows : bool, default True
Make the row labels bold in the output.
classes : str or list or tuple, default None
CSS class(es) to apply to the resulting html table.
escape : bool, default True
Convert the characters <, >, and & to HTML-safe sequences.
notebook : {True, False}, default False
Whether the generated HTML is for IPython Notebook.
border : int
A ``border=border`` attribute is included in the opening
`<table>` tag. Default ``pd.options.display.html.border``.
encoding : str, default "utf-8"
Set character encoding.
.. versionadded:: 1.0
table_id : str, optional
A css id is included in the opening `<table>` tag if specified.
render_links : bool, default False
Convert URLs to HTML links.
Returns
-------
str or None
If buf is None, returns the result as a string. Otherwise returns
None.
See Also
--------
to_string : Convert DataFrame to a string.
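The to_html docstring above has no Examples section, so here is a minimal sketch of a few of the listed parameters in use. The sample data, the CSS class name "report" and the table id "scores" are made up for illustration; only the parameter names come from the signature above.

```python
import pandas as pd

# Hypothetical sample data; the "<b>" markup is there to show escaping.
df = pd.DataFrame({"name": ["<b>Ann</b>", "Bob"], "score": [1.5, 2.0]})

# buf is left at its default (None), so the HTML comes back as a string.
html = df.to_html(
    index=False,        # do not print the row labels
    classes="report",   # extra CSS class on the <table> tag
    table_id="scores",  # id attribute on the <table> tag
    escape=True,        # convert <, > and & to HTML-safe sequences
)
print(html)
```

With escape=True (the default), the markup in the name column is written out as text rather than rendered as bold. Passing a path or buffer as buf writes the table there instead and returns None.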
Function09
to_json(self, path_or_buf: 'FilePathOrBuffer | None' = None, orient: 'str | None' = None, date_format: 'str | None' = None, double_precision: 'int' = 10, force_ascii: 'bool_t' = True, date_unit: 'str' = 'ms', default_handler: 'Callable[[Any], JSONSerializable] | None' = None, lines: 'bool_t' = False, compression: 'CompressionOptions' = 'infer', index: 'bool_t' = True, indent: 'int | None' = None, storage_options: 'StorageOptions' = None) -> 'str | None'
Help on function to_json in module pandas.core.generic:
to_json(self, path_or_buf: 'FilePathOrBuffer | None' = None, orient: 'str | None' = None, date_format: 'str | None' = None, double_precision: 'int' = 10, force_ascii: 'bool_t' = True, date_unit: 'str' = 'ms', default_handler: 'Callable[[Any], JSONSerializable] | None' = None, lines: 'bool_t' = False, compression: 'CompressionOptions' = 'infer', index: 'bool_t' = True, indent: 'int | None' = None, storage_options: 'StorageOptions' = None) -> 'str | None'
Convert the object to a JSON string.
Note NaN's and None will be converted to null and datetime objects
will be converted to UNIX timestamps.
Parameters
----------
path_or_buf : str or file handle, optional
File path or object. If not specified, the result is returned as
a string.
orient : str
Indication of expected JSON string format.
* Series:
- default is 'index'
- allowed values are: {'split', 'records', 'index', 'table'}.
* DataFrame:
- default is 'columns'
- allowed values are: {'split', 'records', 'index', 'columns',
'values', 'table'}.
* The format of the JSON string:
- 'split' : dict like {'index' -> [index], 'columns' -> [columns],
'data' -> [values]}
- 'records' : list like [{column -> value}, ... , {column -> value}]
- 'index' : dict like {index -> {column -> value}}
- 'columns' : dict like {column -> {index -> value}}
- 'values' : just the values array
- 'table' : dict like {'schema': {schema}, 'data': {data}}
Describing the data, where data component is like ``orient='records'``.
date_format : {None, 'epoch', 'iso'}
Type of date conversion. 'epoch' = epoch milliseconds,
'iso' = ISO8601. The default depends on the `orient`. For
``orient='table'``, the default is 'iso'. For all other orients,
the default is 'epoch'.
double_precision : int, default 10
The number of decimal places to use when encoding
floating point values.
force_ascii : bool, default True
Force encoded string to be ASCII.
date_unit : str, default 'ms' (milliseconds)
The time unit to encode to, governs timestamp and ISO8601
precision. One of 's', 'ms', 'us', 'ns' for second, millisecond,
microsecond, and nanosecond respectively.
default_handler : callable, default None
Handler to call if object cannot otherwise be converted to a
suitable format for JSON. Should receive a single argument which is
the object to convert and return a serialisable object.
lines : bool, default False
If 'orient' is 'records' write out line-delimited json format. Will
throw ValueError if incorrect 'orient' since others are not
list-like.
compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}
A string representing the compression to use in the output file,
only used when the first argument is a filename. By default, the
compression is inferred from the filename.
index : bool, default True
Whether to include the index values in the JSON string. Not
including the index (``index=False``) is only supported when
orient is 'split' or 'table'.
indent : int, optional
Length of whitespace used to indent each record.
.. versionadded:: 1.0.0
storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to ``urllib`` as header options. For other URLs (e.g.
starting with "s3://", and "gcs://") the key-value pairs are forwarded to
``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
.. versionadded:: 1.2.0
Returns
-------
None or str
If path_or_buf is None, returns the resulting json format as a
string. Otherwise returns None.
See Also
--------
read_json : Convert a JSON string to pandas object.
Notes
-----
The behavior of ``indent=0`` varies from the stdlib, which does not
indent the output but does insert newlines. Currently, ``indent=0``
and the default ``indent=None`` are equivalent in pandas, though this
may change in a future release.
``orient='table'`` contains a 'pandas_version' field under 'schema'.
This stores the version of `pandas` used in the latest revision of the
schema.
Examples
--------
>>> import json
>>> df = pd.DataFrame(
... [["a", "b"], ["c", "d"]],
... index=["row 1", "row 2"],
... columns=["col 1", "col 2"],
... )
>>> result = df.to_json(orient="split")
>>> parsed = json.loads(result)
>>> json.dumps(parsed, indent=4) # doctest: +SKIP
{
"columns": [
"col 1",
"col 2"
],
"index": [
"row 1",
"row 2"
],
"data": [
[
"a",
"b"
],
[
"c",
"d"
]
]
}
Encoding/decoding a DataFrame using ``'records'`` formatted JSON.
Note that index labels are not preserved with this encoding.
>>> result = df.to_json(orient="records")
>>> parsed = json.loads(result)
>>> json.dumps(parsed, indent=4) # doctest: +SKIP
[
{
"col 1": "a",
"col 2": "b"
},
{
"col 1": "c",
"col 2": "d"
}
]
Encoding/decoding a DataFrame using ``'index'`` formatted JSON:
>>> result = df.to_json(orient="index")
>>> parsed = json.loads(result)
>>> json.dumps(parsed, indent=4) # doctest: +SKIP
{
"row 1": {
"col 1": "a",
"col 2": "b"
},
"row 2": {
"col 1": "c",
"col 2": "d"
}
}
Encoding/decoding a DataFrame using ``'columns'`` formatted JSON:
>>> result = df.to_json(orient="columns")
>>> parsed = json.loads(result)
>>> json.dumps(parsed, indent=4) # doctest: +SKIP
{
"col 1": {
"row 1": "a",
"row 2": "c"
},
"col 2": {
"row 1": "b",
"row 2": "d"
}
}
Encoding/decoding a DataFrame using ``'values'`` formatted JSON:
>>> result = df.to_json(orient="values")
>>> parsed = json.loads(result)
>>> json.dumps(parsed, indent=4) # doctest: +SKIP
[
[
"a",
"b"
],
[
"c",
"d"
]
]
Encoding with Table Schema:
>>> result = df.to_json(orient="table")
>>> parsed = json.loads(result)
>>> json.dumps(parsed, indent=4) # doctest: +SKIP
{
"schema": {
"fields": [
{
"name": "index",
"type": "string"
},
{
"name": "col 1",
"type": "string"
},
{
"name": "col 2",
"type": "string"
}
],
"primaryKey": [
"index"
],
"pandas_version": "0.20.0"
},
"data": [
{
"index": "row 1",
"col 1": "a",
"col 2": "b"
},
{
"index": "row 2",
"col 1": "c",
"col 2": "d"
}
]
}
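The docstring examples cover every orient but not the lines parameter. The sketch below reuses the same small DataFrame to show line-delimited output, and the ValueError that the parameter description mentions for any orient other than 'records'. The variable names are my own.

```python
import json
import pandas as pd

df = pd.DataFrame(
    [["a", "b"], ["c", "d"]],
    index=["row 1", "row 2"],
    columns=["col 1", "col 2"],
)

# lines=True is only valid together with orient="records":
# each row becomes one JSON object on its own line.
text = df.to_json(orient="records", lines=True)
records = [json.loads(line) for line in text.splitlines() if line.strip()]
print(records)

# Any other orient is rejected with a ValueError.
try:
    df.to_json(orient="split", lines=True)
    raised = False
except ValueError:
    raised = True
print(raised)
```

As with plain orient="records", the index labels ("row 1", "row 2") are not part of the line-delimited output.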
Function10
to_latex(self, buf=None, columns=None, col_space=None, header=True, index=True, na_rep='NaN', formatters=None, float_format=None, sparsify=None, index_names=True, bold_rows=False, column_format=None, longtable=None, escape=None, encoding=None, decimal='.', multicolumn=None, multicolumn_format=None, multirow=None, caption=None, label=None, position=None)
Help on function to_latex in module pandas.core.generic:
to_latex(self, buf=None, columns=None, col_space=None, header=True, index=True, na_rep='NaN', formatters=None, float_format=None, sparsify=None, index_names=True, bold_rows=False, column_format=None, longtable=None, escape=None, encoding=None, decimal='.', multicolumn=None, multicolumn_format=None, multirow=None, caption=None, label=None, position=None)
Render object to a LaTeX tabular, longtable, or nested table/tabular.
Requires ``\usepackage{booktabs}``. The output can be copy/pasted
into a main LaTeX document or read from an external file
with ``\input{table.tex}``.
.. versionchanged:: 1.0.0
Added caption and label arguments.
.. versionchanged:: 1.2.0
Added position argument, changed meaning of caption argument.
Parameters
----------
buf : str, Path or StringIO-like, optional, default None
Buffer to write to. If None, the output is returned as a string.
columns : list of label, optional
The subset of columns to write. Writes all columns by default.
col_space : int, optional
The minimum width of each column.
header : bool or list of str, default True
Write out the column names. If a list of strings is given,
it is assumed to be aliases for the column names.
index : bool, default True
Write row names (index).
na_rep : str, default 'NaN'
Missing data representation.
formatters : list of functions or dict of {str: function}, optional
Formatter functions to apply to columns' elements by position or
name. The result of each function must be a unicode string.
List must be of length equal to the number of columns.
float_format : one-parameter function or str, optional, default None
Formatter for floating point numbers. For example
``float_format="%.2f"`` and ``float_format="{:0.2f}".format`` will
both result in 0.1234 being formatted as 0.12.
sparsify : bool, optional
Set to False for a DataFrame with a hierarchical index to print
every multiindex key at each row. By default, the value will be
read from the config module.
index_names : bool, default True
Prints the names of the indexes.
bold_rows : bool, default False
Make the row labels bold in the output.
column_format : str, optional
The columns format as specified in `LaTeX table format
<https://en.wikibooks.org/wiki/LaTeX/Tables>`__ e.g. 'rcl' for 3
columns. By default, 'l' will be used for all columns except
columns of numbers, which default to 'r'.
longtable : bool, optional
By default, the value will be read from the pandas config
module. Use a longtable environment instead of tabular. Requires
adding a \usepackage{longtable} to your LaTeX preamble.
escape : bool, optional
By default, the value will be read from the pandas config
module. When set to False prevents from escaping latex special
characters in column names.
encoding : str, optional
A string representing the encoding to use in the output file,
defaults to 'utf-8'.
decimal : str, default '.'
Character recognized as decimal separator, e.g. ',' in Europe.
multicolumn : bool, default True
Use \multicolumn to enhance MultiIndex columns.
The default will be read from the config module.
multicolumn_format : str, default 'l'
The alignment for multicolumns, similar to `column_format`
The default will be read from the config module.
multirow : bool, default False
Use \multirow to enhance MultiIndex rows. Requires adding a
\usepackage{multirow} to your LaTeX preamble. Will print
centered labels (instead of top-aligned) across the contained
rows, separating groups via clines. The default will be read
from the pandas config module.
caption : str or tuple, optional
Tuple (full_caption, short_caption),
which results in ``\caption[short_caption]{full_caption}``;
if a single string is passed, no short caption will be set.
.. versionadded:: 1.0.0
.. versionchanged:: 1.2.0
Optionally allow caption to be a tuple ``(full_caption, short_caption)``.
label : str, optional
The LaTeX label to be placed inside ``\label{}`` in the output.
This is used with ``\ref{}`` in the main ``.tex`` file.
.. versionadded:: 1.0.0
position : str, optional
The LaTeX positional argument for tables, to be placed after
``\begin{}`` in the output.
.. versionadded:: 1.2.0
Returns
-------
str or None
If buf is None, returns the result as a string. Otherwise returns
None.
See Also
--------
DataFrame.to_string : Render a DataFrame to a console-friendly
tabular output.
DataFrame.to_html : Render a DataFrame as an HTML table.
Examples
--------
>>> df = pd.DataFrame(dict(name=['Raphael', 'Donatello'],
... mask=['red', 'purple'],
... weapon=['sai', 'bo staff']))
>>> print(df.to_latex(index=False)) # doctest: +NORMALIZE_WHITESPACE
\begin{tabular}{lll}
\toprule
name & mask & weapon \\
\midrule
Raphael & red & sai \\
Donatello & purple & bo staff \\
\bottomrule
\end{tabular}
Function11
to_markdown(self, buf: 'IO[str] | str | None' = None, mode: 'str' = 'wt', index: 'bool' = True, storage_options: 'StorageOptions' = None, **kwargs) -> 'str | None'
Help on function to_markdown in module pandas.core.frame:
to_markdown(self, buf: 'IO[str] | str | None' = None, mode: 'str' = 'wt', index: 'bool' = True, storage_options: 'StorageOptions' = None, **kwargs) -> 'str | None'
Print DataFrame in Markdown-friendly format.
.. versionadded:: 1.0.0
Parameters
----------
buf : str, Path or StringIO-like, optional, default None
Buffer to write to. If None, the output is returned as a string.
mode : str, optional
Mode in which file is opened, "wt" by default.
index : bool, optional, default True
Add index (row) labels.
.. versionadded:: 1.1.0
storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to ``urllib`` as header options. For other URLs (e.g.
starting with "s3://", and "gcs://") the key-value pairs are forwarded to
``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
.. versionadded:: 1.2.0
**kwargs
These parameters will be passed to `tabulate <https://pypi.org/project/tabulate>`_.
Returns
-------
str
DataFrame in Markdown-friendly format.
Notes
-----
Requires the `tabulate <https://pypi.org/project/tabulate>`_ package.
Examples
--------
>>> s = pd.Series(["elk", "pig", "dog", "quetzal"], name="animal")
>>> print(s.to_markdown())
| | animal |
|---:|:---------|
| 0 | elk |
| 1 | pig |
| 2 | dog |
| 3 | quetzal |
Output markdown with a tabulate option.
>>> print(s.to_markdown(tablefmt="grid"))
+----+----------+
| | animal |
+====+==========+
| 0 | elk |
+----+----------+
| 1 | pig |
+----+----------+
| 2 | dog |
+----+----------+
| 3 | quetzal |
+----+----------+
To be continued...
Next post:
https://blog.csdn.net/boysoft2002/article/details/128433354