tesserocr is a simple, Pillow-friendly wrapper around the tesseract-ocr API for Optical Character Recognition (OCR). It integrates directly with Tesseract’s C++ API using Cython, which keeps the source code simple, Pythonic, and easy to read. It also enables real concurrent execution when used with Python’s threading module by releasing the GIL while Tesseract processes an image.
Installation
Linux and BSD/MacOS
$ pip install tesserocr
Windows
The downloads offered for Windows are stand-alone packages that contain all the Windows libraries needed for execution. The recommended installation method is Conda, as described below.
1) Conda
> conda install -c conda-forge tesserocr
2) pip
Download the wheel file corresponding to your Windows platform and Python installation from tesserocr-windows_build and install it via:
> pip install <package_name>.whl
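After either installation route, a quick import check can confirm that the package and its Tesseract libraries load. The sketch below is only an illustration: `tesserocr_status` is a hypothetical helper name, and it assumes the module-level functions `tesseract_version()` and `get_languages()` are available in your tesserocr version.

```python
def tesserocr_status():
    """Return a short, human-readable description of the tesserocr install."""
    try:
        import tesserocr
    except ImportError as exc:
        # The package (or one of its native libraries) could not be loaded.
        return "tesserocr is not importable: {}".format(exc)
    # Assumption: both helpers exist in the installed tesserocr version.
    version = tesserocr.tesseract_version()
    languages = tesserocr.get_languages()
    return "{}\nLanguages: {}".format(version, languages)


if __name__ == "__main__":
    print(tesserocr_status())
```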
Usage
Initialize and re-use the tesseract API instance to process multiple images:
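from tesserocr import PyTessBaseAPI

# Placeholder file names; replace them with your own image paths.
images = ['sample.jpg', 'sample2.jpg', 'sample3.jpg']

with PyTessBaseAPI() as api:
    for img in images:
        api.SetImageFile(img)
        print(api.GetUTF8Text())
        print(api.AllWordConfidences())
# api is automatically finalized when used in a with-statement (context manager).
# otherwise api.End() should be explicitly called when it's no longer needed.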
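Because the GIL is released while Tesseract processes an image, the same loop can be spread across threads. The following is a minimal sketch of one possible arrangement, not something tesserocr provides: each worker thread receives a chunk of paths and re-uses its own API instance for that chunk, on the assumption that a single instance should not be shared between threads. The names `split_into_chunks`, `ocr_chunk`, `ocr_concurrently`, and the `ocr` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n):
    """Split items into at most n order-preserving chunks of near-equal size."""
    if not items:
        return []
    n = max(1, min(n, len(items)))
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def ocr_chunk(paths):
    """Recognize every image in paths with one API instance owned by this call."""
    # Imported lazily so the chunking helper stays usable without tesserocr.
    from tesserocr import PyTessBaseAPI

    texts = []
    with PyTessBaseAPI() as api:
        for path in paths:
            api.SetImageFile(path)
            texts.append(api.GetUTF8Text())
    return texts


def ocr_concurrently(paths, workers=4, ocr=ocr_chunk):
    """Run ocr over chunks of paths in a thread pool, keeping input order."""
    chunks = split_into_chunks(list(paths), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(ocr, chunks))
    return [text for chunk in results for text in chunk]
```

Calling `ocr_concurrently(images)` with a list of image paths returns one text per path, in input order. Chunking keeps the number of API initializations equal to the number of workers rather than the number of images.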
Advanced API Examples
1) GetComponentImages example:
from PIL import Image
from tesserocr import PyTessBaseAPI, RIL

image = Image.open('/usr/src/tesseract/testing/phototest.tif')
with PyTessBaseAPI() as api:
    api.SetImage(image)
    boxes = api.GetComponentImages(RIL.TEXTLINE, True)
    print('Found {} textline image components.'.format(len(boxes)))
    for i, (im, box, _, _) in enumerate(boxes):
        # im is a PIL image object
        # box is a dict with x, y, w and h keys
        api.SetRectangle(box['x'], box['y'], box['w'], box['h'])
        ocrResult = api.GetUTF8Text()
        conf = api.MeanTextConf()
        print(u"Box[{0}]: x={x}, y={y}, w={w}, h={h}, "
              "confidence: {1}, text: {2}".format(i, conf, ocrResult, **box))
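The box dicts use an x, y, w, h layout, while Pillow's `Image.crop` expects a (left, upper, right, lower) tuple. The sketch below shows the conversion, plus an optional margin that is clamped to the image bounds. Both helper names, `box_to_crop` and `expand_box`, are hypothetical and not part of tesserocr.

```python
def box_to_crop(box):
    """Convert an x/y/w/h box dict to Pillow's (left, upper, right, lower)."""
    return (box['x'], box['y'], box['x'] + box['w'], box['y'] + box['h'])


def expand_box(box, margin, width, height):
    """Grow a box by margin pixels on every side, clamped to the image size."""
    left = max(0, box['x'] - margin)
    upper = max(0, box['y'] - margin)
    right = min(width, box['x'] + box['w'] + margin)
    lower = min(height, box['y'] + box['h'] + margin)
    return {'x': left, 'y': upper, 'w': right - left, 'h': lower - upper}
```

With the objects from the example above, `image.crop(box_to_crop(box))` would cut the text line out of the original image, and the dict returned by `expand_box(box, 2, image.width, image.height)` can be passed to `api.SetRectangle` in the same way as `box`.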
2) Orientation and script detection (OSD):
from PIL import Image
from tesserocr import PyTessBaseAPI, PSM

with PyTessBaseAPI(psm=PSM.AUTO_OSD) as api:
    image = Image.open("/usr/src/tesseract/testing/eurotext.tif")
    api.SetImage(image)
    api.Recognize()
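The snippet above stops after `Recognize()` and does not read any result. One way to inspect the detected orientation is sketched below, assuming the API's `AnalyseLayout()` returns a page iterator whose `Orientation()` yields the tuple (orientation, writing direction, textline order, deskew angle). The helper names `describe_layout` and `detect_layout` are hypothetical, and the name table mirrors the integer codes of Tesseract's Orientation enum.

```python
# Names for the integer codes of Tesseract's Orientation enum.
ORIENTATION_NAMES = {0: "PAGE_UP", 1: "PAGE_RIGHT", 2: "PAGE_DOWN", 3: "PAGE_LEFT"}


def describe_layout(orientation, direction, order, deskew_angle):
    """Format the four values reported for a page as one readable line."""
    name = ORIENTATION_NAMES.get(orientation, "UNKNOWN")
    return ("orientation={}, writing_direction={}, textline_order={}, "
            "deskew_angle={:.4f}".format(name, direction, order, deskew_angle))


def detect_layout(path):
    """Run OSD on the image at path and return a description, or None."""
    # Imported lazily so describe_layout() stays usable without tesserocr.
    from PIL import Image
    from tesserocr import PyTessBaseAPI, PSM

    with PyTessBaseAPI(psm=PSM.AUTO_OSD) as api:
        api.SetImage(Image.open(path))
        api.Recognize()
        it = api.AnalyseLayout()
        if it is None:
            # Assumption: no iterator is returned when layout analysis fails.
            return None
        return describe_layout(*it.Orientation())
```

For example, `print(detect_layout("/usr/src/tesseract/testing/eurotext.tif"))` would print a single summary line for the test image used above.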
3) Iterator over the classifier choices for a single symbol:
from tesserocr import PyTessBaseAPI, RIL, iterate_level

with PyTessBaseAPI() as api:
    api.SetImageFile('/usr/src/tesseract/testing/phototest.tif')
    api.SetVariable("save_blob_choices", "T")
    api.SetRectangle(37, 228, 548, 31)
    api.Recognize()

    ri = api.GetIterator()
    level = RIL.SYMBOL
    for r in iterate_level(ri, level):
        symbol = r.GetUTF8Text(level)  # r == ri
        conf = r.Confidence(level)
        if symbol:
            print(u'symbol {}, conf: {}'.format(symbol, conf), end='')
        indent = False
        ci = r.GetChoiceIterator()
        for c in ci:
            if indent:
                print('\t\t ', end='')
            print('\t- ', end='')
            choice = c.GetUTF8Text()  # c == ci
            print(u'{} conf: {}'.format(choice, c.Confidence()))
            indent = True
        print('----------')
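Instead of printing the alternatives, they can be collected and filtered. The sketch below gathers the same symbol and choice data into tuples and flags symbols whose two best choices are close in confidence. The helper names `collect_choices` and `find_ambiguous`, and the `max_gap` threshold of 5.0, are hypothetical and chosen only for illustration.

```python
def collect_choices(path):
    """Return [(symbol, confidence, [(choice, choice_confidence), ...]), ...]."""
    # Imported lazily so find_ambiguous() works without tesserocr installed.
    from tesserocr import PyTessBaseAPI, RIL, iterate_level

    collected = []
    with PyTessBaseAPI() as api:
        api.SetImageFile(path)
        api.SetVariable("save_blob_choices", "T")
        api.Recognize()
        ri = api.GetIterator()
        level = RIL.SYMBOL
        for r in iterate_level(ri, level):
            symbol = r.GetUTF8Text(level)
            if not symbol:
                continue
            choices = [(c.GetUTF8Text(), c.Confidence())
                       for c in r.GetChoiceIterator()]
            collected.append((symbol, r.Confidence(level), choices))
    return collected


def find_ambiguous(collected, max_gap=5.0):
    """Return symbols whose two best choices differ by at most max_gap."""
    ambiguous = []
    for symbol, _conf, choices in collected:
        confs = sorted((conf for _, conf in choices), reverse=True)
        if len(confs) >= 2 and confs[0] - confs[1] <= max_gap:
            ambiguous.append(symbol)
    return ambiguous
```

A call such as `find_ambiguous(collect_choices('/usr/src/tesseract/testing/phototest.tif'))` would list the symbols that may deserve manual review.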