Several ways to extract Chinese text from images with Python

1. Call a cloud OCR service with an HTTP request library. The two mainstream services are:

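Baidu OCR API: Baidu provides an OCR API service that recognizes text in images, including Chinese. You need to register a Baidu developer account and obtain an API key, then use a Python HTTP request library to send the image and receive the recognition result.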
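
The request flow described above can be sketched roughly as follows, using only the standard-library urllib (the requests package would work the same way). The two endpoint URLs, the `image` form field, and the `words_result` response field are assumptions based on Baidu's general OCR REST interface, so check them against the current Baidu documentation. The function names and the key arguments are placeholders made up for this illustration.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumed endpoints; verify against the current Baidu AI Cloud documentation.
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"
OCR_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic"


def get_access_token(api_key, secret_key):
    """Exchange the API key pair for an access token."""
    query = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    })
    # Passing data (even empty) makes urllib send a POST request.
    with urllib.request.urlopen(f"{TOKEN_URL}?{query}", data=b"") as resp:
        return json.load(resp)["access_token"]


def build_ocr_body(image_bytes):
    """Form-encode the image as base64 (assumed field name: image)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return urllib.parse.urlencode({"image": encoded}).encode("ascii")


def extract_words(result):
    """Collect the recognized lines from a parsed JSON response."""
    return [item["words"] for item in result.get("words_result", [])]


def recognize_with_baidu(image_path, api_key, secret_key):
    """Send one image and return the recognized lines of text."""
    token = get_access_token(api_key, secret_key)
    with open(image_path, "rb") as f:
        body = build_ocr_body(f.read())
    request = urllib.request.Request(
        f"{OCR_URL}?access_token={token}",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urllib.request.urlopen(request) as resp:
        return extract_words(json.load(resp))
```

Calling `recognize_with_baidu('your_image.png', 'your-api-key', 'your-secret-key')` would return a list of strings, one per recognized line, provided the assumptions above hold for your account.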

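Microsoft Azure OCR service: Microsoft Azure also provides an OCR service that can extract Chinese text. As with the Baidu API, you need to register an Azure account and create an OCR resource, then use a Python HTTP request library to send the request and fetch the result.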
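
A rough sketch of that flow with the standard-library urllib is shown here. It assumes the Azure Computer Vision Read REST interface, version 3.2: the `/vision/v3.2/read/analyze` path, the `Ocp-Apim-Subscription-Key` header, the `Operation-Location` response header, and the `analyzeResult.readResults` layout are all assumptions to verify against the Azure documentation for your resource. The function names and polling parameters are made up for this illustration.

```python
import json
import time
import urllib.request


def extract_lines(result):
    """Flatten the recognized lines out of a finished Read result."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]


def recognize_with_azure(image_path, endpoint, key,
                         poll_seconds=1.0, max_polls=30):
    """Submit an image, then poll until the asynchronous job finishes."""
    with open(image_path, "rb") as f:
        submit = urllib.request.Request(
            f"{endpoint.rstrip('/')}/vision/v3.2/read/analyze",
            data=f.read(),
            headers={
                "Ocp-Apim-Subscription-Key": key,
                "Content-Type": "application/octet-stream",
            },
        )
    # The service replies with the URL where the result will appear.
    with urllib.request.urlopen(submit) as resp:
        result_url = resp.headers["Operation-Location"]

    poll = urllib.request.Request(
        result_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as resp:
            result = json.load(resp)
        status = result.get("status")
        if status == "succeeded":
            return extract_lines(result)
        if status == "failed":
            raise RuntimeError("Azure Read operation failed")
        time.sleep(poll_seconds)
    raise TimeoutError("Azure Read operation did not finish in time")
```

Unlike the single round trip in the Baidu description, this interface is assumed to be asynchronous, which is why the sketch submits the image first and then polls for the result.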

2. Use a third-party library. Four libraries are recommended below, each with sample code.

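Tesseract OCR (via pytesseract). Note that pytesseract is only a wrapper: the Tesseract engine itself and the chi_sim language data must be installed separately.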

pip install pytesseract

from PIL import Image
import pytesseract
# Open the image
image = Image.open('your_image.png')
# Extract text with Tesseract, using the Simplified Chinese model
text = pytesseract.image_to_string(image, lang='chi_sim')
# Print the extracted Chinese text
print(text)
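
Tesseract output for Chinese sometimes contains spaces between individual characters. If that happens with your images, a small cleanup step like the sketch below may help. The helper name `strip_cjk_spaces` is made up for this illustration, and the character ranges cover only the basic CJK ideographs plus common CJK and full-width punctuation.

```python
import re

# Basic CJK ideographs plus common CJK and full-width punctuation.
_CJK = r"[\u4e00-\u9fff\u3000-\u303f\uff00-\uffef]"
_SPACE_BETWEEN_CJK = re.compile(rf"(?<={_CJK})[ \t]+(?={_CJK})")


def strip_cjk_spaces(text):
    """Remove spaces that sit between two CJK characters; keep line breaks."""
    return _SPACE_BETWEEN_CJK.sub("", text)
```

For example, `print(strip_cjk_spaces(text))` could replace the final `print(text)`. Spaces between Latin words are left alone, since only spaces with a CJK character on both sides are removed.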

EasyOCR:

pip install easyocr

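import easyocr
# Create an EasyOCR Reader for Simplified Chinese
reader = easyocr.Reader(['ch_sim'])
# Path to the image
image = 'your_image.png'
# Extract text with EasyOCR
results = reader.readtext(image)
# Print the extracted Chinese text; each result is (bounding box, text, confidence)
for (bbox, text, prob) in results:
    print(text)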
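
Because each EasyOCR result carries a confidence score, low-confidence fragments can be dropped before the text is used. The helper below is a minimal sketch that works on the same (bbox, text, prob) tuples as the loop above. The name `join_confident_text` and the 0.5 threshold are illustrative assumptions, not part of EasyOCR.

```python
def join_confident_text(results, min_confidence=0.5):
    """Join recognized fragments whose confidence meets the threshold."""
    return "\n".join(
        text for _bbox, text, prob in results if prob >= min_confidence
    )
```

With the results from `reader.readtext(image)`, calling `print(join_confident_text(results))` prints one kept fragment per line. A suitable threshold depends on the image quality and is best tuned on your own data.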

PyOCR:

pip install pyocr

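import sys
import pyocr
import pyocr.builders
from PIL import Image
# Find the installed OCR tools (e.g. Tesseract); PyOCR is only a wrapper
tools = pyocr.get_available_tools()
if not tools:
    sys.exit('No OCR tool found; install Tesseract first')
tool = tools[0]
# Open the image
image = Image.open('your_image.png')
# Extract text with PyOCR, using the Simplified Chinese model
text = tool.image_to_string(
    image,
    lang='chi_sim',
    builder=pyocr.builders.TextBuilder()
)
# Print the extracted Chinese text
print(text)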

Google Cloud Vision API:

pip install google-cloud-vision

from google.cloud import vision
from google.oauth2 import service_account
# Load the service account credentials
credentials = service_account.Credentials.from_service_account_file(
    'your-service-account-key.json'
)
# Create the Vision API client
client = vision.ImageAnnotatorClient(credentials=credentials)
# Read the image bytes
with open('your_image.png', 'rb') as image_file:
    content = image_file.read()
# Wrap the bytes in an Image object
image = vision.Image(content=content)
# Run text detection with the Vision API
response = client.text_detection(image=image)
# Stop if the API reported an error
if response.error.message:
    raise RuntimeError(response.error.message)
# The first annotation holds the full text; the rest are individual words
if response.text_annotations:
    print(response.text_annotations[0].description)

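Note that for the Google Cloud Vision API you need to replace 'your-service-account-key.json' with the path to your own service account key file. Before running any of these examples, make sure the corresponding libraries and services are installed and configured correctly.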
