While scraping a website, I ran into this error: UnicodeEncodeError: 'gbk' codec can't encode character '\xa9' in position 19417: illegal multibyte sequence
I tried a number of fixes in the request code, for example:
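import requests

def get_page(self):
    # Send a browser-like User-Agent so the site serves the normal page
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36"
    }
    print(self.base_url)
    response = requests.get(self.base_url, headers=headers)
    # Tried both encodings; this only changes how the body is decoded into a str
    # response.encoding = "gbk"
    response.encoding = "utf-8"
    # The error is raised here, when print() encodes the text for the console
    print(response.text)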
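Every attempt still raised UnicodeEncodeError: 'gbk' codec can't encode character '\xa9' in position 19417: illegal multibyte sequence
Changing response.encoding had no effect at all. That makes sense in hindsight: response.encoding only controls how the response bytes are decoded into a str, while this error is raised when print() encodes that str for a console whose encoding is GBK ('\xa9' is the copyright sign, which GBK cannot represent). On checking, it turned out to be a PyCharm configuration problem.
Changing the encoding in PyCharm's settings to UTF-8 fixed it.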
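
If changing the IDE setting is not an option, the same problem can probably be worked around from the code side. The following is only a minimal sketch, not part of the original scraper: the sample string and the helper names gbk_can_encode, printable and force_utf8_stdout are made up for illustration, and sys.stdout.reconfigure assumes Python 3.7 or newer.

```python
import sys

# Hypothetical sample text; '\xa9' is the copyright sign from the error message.
text = "Copyright \xa9 2023"


def gbk_can_encode(s):
    """Return True if every character of s can be encoded as GBK."""
    try:
        s.encode("gbk")
        return True
    except UnicodeEncodeError:
        return False


def printable(s, encoding):
    """Replace characters that `encoding` cannot represent with '?'."""
    return s.encode(encoding, errors="replace").decode(encoding)


def force_utf8_stdout():
    """Switch stdout to UTF-8 when the stream supports it (Python 3.7+)."""
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8")


if __name__ == "__main__":
    # Workaround 1: keep the console encoding and drop unencodable characters.
    print(printable(text, "gbk"))
    # Workaround 2: make stdout UTF-8, then print the text unchanged.
    force_utf8_stdout()
    print(text)
```

The first workaround loses the unsupported characters, so it only suits debug output. The second keeps the text intact but relies on the console actually displaying UTF-8. Setting the environment variable PYTHONIOENCODING=utf-8 before starting Python is another way to get the same effect without touching the code.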