Fixing the Unstructured error "cannot write mode RGBA as JPEG"
- 0. Error details
- 1. Solution
0. Error details
Image Extraction Error: Skipping the failed image
Traceback (most recent call last):
  File "/root/miniconda3/envs/learn-yolo/lib/python3.11/site-packages/PIL/JpegImagePlugin.py", line 639, in _save
    rawmode = RAWMODE[im.mode]
              ~~~~~~~^^^^^^^^^
KeyError: 'RGBA'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/e/workspace/learn-yolo/unstructured/partition/pdf_image/pdf_image_utils.py", line 222, in save_elements
    write_image(cropped_image, output_f_path)
  File "/mnt/e/workspace/learn-yolo/unstructured/partition/pdf_image/pdf_image_utils.py", line 52, in write_image
    image.save(output_image_path)
  File "/root/miniconda3/envs/learn-yolo/lib/python3.11/site-packages/PIL/Image.py", line 2568, in save
    save_handler(self, fp, filename)
  File "/root/miniconda3/envs/learn-yolo/lib/python3.11/site-packages/PIL/JpegImagePlugin.py", line 642, in _save
    raise OSError(msg) from e
OSError: cannot write mode RGBA as JPEG
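1. Solution
JPEG has no alpha channel, so Pillow refuses to write an RGBA image as JPEG. To fix the error, convert the image from RGBA to RGB before saving it by calling convert("RGB").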
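The failure and the fix can be reproduced outside Unstructured with Pillow alone. The snippet below is a minimal sketch: the helper name try_save_jpeg and the 8x8 test image are illustrative assumptions, not part of Unstructured.

```python
from io import BytesIO

from PIL import Image


def try_save_jpeg(image):
    """Return the JPEG bytes, or the error message if Pillow rejects the mode.

    Hypothetical helper used only for this demonstration.
    """
    buffer = BytesIO()
    try:
        image.save(buffer, format="JPEG")
    except OSError as exc:
        return str(exc)
    return buffer.getvalue()


# A small semi-transparent image stands in for the cropped PDF region.
rgba_image = Image.new("RGBA", (8, 8), (255, 0, 0, 128))

failure = try_save_jpeg(rgba_image)                 # RGBA is rejected
success = try_save_jpeg(rgba_image.convert("RGB"))  # RGB is accepted

print(failure)
print(len(success), "bytes written")
```

The first call returns the same message that appears at the end of the traceback; the second returns the encoded JPEG data.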
Edit unstructured/partition/pdf_image/pdf_image_utils.py and add cropped_image = cropped_image.convert("RGB") before each save, in both the payload branch and the file branch:
try:
    image_path = image_paths[page_index]
    image = Image.open(image_path)
    cropped_image = image.crop(padded_bbox)
    if extract_image_block_to_payload:
        buffered = BytesIO()
        # Added: JPEG has no alpha channel, so convert before saving
        cropped_image = cropped_image.convert("RGB")
        cropped_image.save(buffered, format="JPEG")
        img_base64 = base64.b64encode(buffered.getvalue())
        img_base64_str = img_base64.decode()
        el.metadata.image_base64 = img_base64_str
        el.metadata.image_mime_type = "image/jpeg"
    else:
        basename = "table" if el.category == ElementType.TABLE else "figure"
        assert output_dir_path
        output_f_path = os.path.join(
            output_dir_path,
            f"{basename}-{metadata_page_number}-{figure_number}.jpg",
        )
        # Added: same conversion before writing the .jpg file
        cropped_image = cropped_image.convert("RGB")
        write_image(cropped_image, output_f_path)
        # add image path to element metadata
        el.metadata.image_path = output_f_path
except (ValueError, IOError):
    logger.warning("Image Extraction Error: Skipping the failed image", exc_info=True)
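One caveat: convert("RGB") simply discards the alpha channel, so fully transparent pixels keep whatever color they carried underneath. If extracted figures come out with unexpected dark or colored backgrounds, compositing onto a solid color first may look better. The following is a sketch of that alternative, not what Unstructured does; the function name flatten_to_rgb and the white default background are assumptions made for illustration.

```python
from io import BytesIO

from PIL import Image


def flatten_to_rgb(image, background=(255, 255, 255)):
    """Composite an image onto a solid background and return it in RGB mode.

    Hypothetical helper; `background` is an (R, G, B) tuple.
    """
    if image.mode == "RGB":
        return image
    rgba = image.convert("RGBA")
    canvas = Image.new("RGBA", rgba.size, background + (255,))
    return Image.alpha_composite(canvas, rgba).convert("RGB")


# A fully transparent red image: alpha is 0, but the RGB values are still red.
transparent_red = Image.new("RGBA", (4, 4), (255, 0, 0, 0))

dropped = transparent_red.convert("RGB")     # alpha discarded, red remains
flattened = flatten_to_rgb(transparent_red)  # transparent area becomes white

buffer = BytesIO()
flattened.save(buffer, format="JPEG")        # saves without the RGBA error
```

In the patched code above, this would mean replacing cropped_image.convert("RGB") with flatten_to_rgb(cropped_image) in both branches.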