1. Set up a Python development environment in VS Code
2. Install the pdf2docx library [1]
pip install pdf2docx
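A quick way to confirm that the install worked is to ask Python for the installed version. The sketch below uses only the standard library (Python 3.8 or later); the helper name `installed_version` is made up for this example and is not part of pdf2docx.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pdf2docx")
    if found is None:
        print("pdf2docx is not installed; run: pip install pdf2docx")
    else:
        print("pdf2docx", found)
```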
3. Write the code
3.1 Convert a PDF to DOCX with parse
Create pdf2docxParse.py
from pdf2docx import parse
# File names
pdf_file = 'demo-image-overlap.pdf'
docx_file = 'demo-image-overlap.docx'
# Convert the PDF to DOCX
parse(pdf_file, docx_file)
Run pdf2docxParse.py
python pdf2docxParse.py
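parse also takes the same start and end arguments that appear in the Converter examples in this post, so it can convert part of a document. Below is a minimal sketch; `convert_page_range` is a hypothetical wrapper, and the zero-based, end-exclusive page indexing is an assumption worth checking against the pdf2docx documentation for your version.

```python
def convert_page_range(pdf_file, docx_file, start=0, end=None):
    """Convert pages from start up to (not including) end; illustrative helper."""
    if start < 0:
        raise ValueError("start must be >= 0")
    if end is not None and end <= start:
        raise ValueError("end must be greater than start")
    # Imported here so the argument checks above run even if pdf2docx is missing.
    from pdf2docx import parse
    # Assumption: indexes are zero-based and end is exclusive, like a Python slice.
    parse(pdf_file, docx_file, start=start, end=end)
```

For example, `convert_page_range('demo-image-overlap.pdf', 'first-two-pages.docx', end=2)` should convert only the first two pages under that assumption.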
3.2 Convert a PDF to DOCX with Converter.convert
3.2.1 Create pdf2docxConvert.py
from pdf2docx import Converter
# File names
pdf_file = 'demo-image-overlap.pdf'
docx_file = 'demo-image-overlap.docx'
# Convert all pages (start=0, end=None), then close the converter
cv = Converter(pdf_file)
cv.convert(docx_file, start=0, end=None)
cv.close()
3.2.2 Run pdf2docxConvert.py
python pdf2docxConvert.py
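If convert raises an exception, the cv.close() call in the script is never reached. One way to guard against that is a try/finally block, sketched below together with the pages argument for picking individual pages. `convert_selected_pages` is a hypothetical helper, and the zero-based page indexes are an assumption to verify against the pdf2docx documentation.

```python
from pathlib import Path


def convert_selected_pages(pdf_file, docx_file, pages=None):
    """Convert the listed page indexes, or every page when pages is None."""
    if not Path(pdf_file).is_file():
        raise FileNotFoundError(f"PDF not found: {pdf_file}")
    # Imported here so the path check above runs even if pdf2docx is missing.
    from pdf2docx import Converter
    cv = Converter(pdf_file)
    try:
        # Assumption: pages holds zero-based indexes, e.g. [0, 2].
        cv.convert(docx_file, pages=pages)
    finally:
        # Always close the converter, even when the conversion fails.
        cv.close()
```

A call such as `convert_selected_pages('demo-image-overlap.pdf', 'pages-1-and-3.docx', pages=[0, 2])` would then target the first and third pages.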
3.3 Convert a PDF passed on the command line
3.3.1 Create SMQHPdf2Docx.py
'''
@Description: Convert a PDF to DOCX from the command line
@Author: 少莫千华
@Time: 2023-06-11
'''
# import logging
import argparse
from pdf2docx import Converter


def main(pdf_file, docx_file):
    cv = Converter(pdf_file)
    cv.convert(docx_file, start=0, end=None)
    cv.close()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # --pdf is required; without it args.pdf would be None and the script would fail
    parser.add_argument("--pdf", type=str, required=True)
    args = parser.parse_args()
    # logging.debug(args.pdf)
    # The output name is the input name plus '.docx', e.g. demo.pdf.docx
    main(args.pdf, args.pdf + '.docx')
3.3.2 Run SMQHPdf2Docx.py
python SMQHPdf2Docx.py --pdf demo-image-overlap.pdf
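Note that the script appends '.docx' to the full input name, so the command above writes demo-image-overlap.pdf.docx. If you would rather get demo-image-overlap.docx, one option is to swap the suffix with pathlib. The sketch below also adds an optional --docx flag; that flag and the helper names are hypothetical additions, not part of the original script.

```python
import argparse
from pathlib import Path


def default_docx_path(pdf_path):
    """Replace the final suffix of pdf_path with .docx."""
    return str(Path(pdf_path).with_suffix(".docx"))


def build_parser():
    parser = argparse.ArgumentParser(description="Convert a PDF to DOCX")
    parser.add_argument("--pdf", required=True, help="input PDF file")
    # Hypothetical extra option: lets the caller choose the output name.
    parser.add_argument("--docx", default=None, help="output DOCX file")
    return parser


def resolve_paths(argv):
    """Return (pdf_path, docx_path) for the given argument list."""
    args = build_parser().parse_args(argv)
    return args.pdf, args.docx or default_docx_path(args.pdf)
```

The pair returned by `resolve_paths(sys.argv[1:])` can be passed straight to the script's main(pdf_file, docx_file).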
3.4 Conversion result
DOCX
[1] See the pdf2docx documentation for details.