
Stephen's Blog

Table Detection with Detectron2 & Mask R-CNN

 

Stephen Cheng

Intro

Detectron2 is Facebook AI Research’s library of state-of-the-art object detection and segmentation algorithms. It is a ground-up rewrite of the previous version, Detectron, and it originates from maskrcnn-benchmark, the earlier PyTorch implementation of Mask R-CNN and related detectors.

Table detection is a crucial step in many document analysis applications, because tables present essential information to the reader in a structured manner. It is a hard problem because table layouts and encodings vary widely. Researchers have proposed numerous table detection techniques based on layout analysis of documents, but most of them fail to generalize because they rely on hand-engineered features that are not robust to layout variations. In this post, we propose a Detectron2-based method for table detection.

Why use Detectron2?

  • It is powered by the PyTorch deep learning framework.
  • It includes more features, such as panoptic segmentation, DensePose, Cascade R-CNN, rotated bounding boxes, PointRend, DeepLab, etc.
  • It can be used as a library to support different projects on top of it.
  • It trains much faster.
  • Models can be exported to TorchScript or Caffe2 format for deployment.

How to implement?

The implementation consists of three parts:

  1. Create custom COCO dataset

Run the voc2coco.py script to convert the VOC-style annotations into a COCO-format JSON file.

python voc2coco.py ./dataset/annotations ./dataset/coco/output.json
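
The voc2coco.py script itself is not listed in this post. As a rough illustration of what such a conversion involves, here is a minimal sketch that reads Pascal VOC XML files and writes one COCO-style JSON file. The function name `voc_to_coco`, the single `table` category, and the assumed XML layout (`filename`, `size`, `object/bndbox`) are illustrative assumptions, not details taken from the actual script.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def voc_to_coco(annotation_dir, output_json, class_names=("table",)):
    """Convert a folder of Pascal VOC XML files into one COCO-style JSON file.

    Hypothetical helper: the category list and XML layout are assumptions.
    """
    # COCO category ids conventionally start at 1.
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": cid, "name": name} for name, cid in cat_ids.items()],
    }
    ann_id = 1
    xml_files = sorted(Path(annotation_dir).glob("*.xml"))
    for image_id, xml_path in enumerate(xml_files, start=1):
        root = ET.parse(xml_path).getroot()
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name not in cat_ids:
                continue  # skip classes that were not requested
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            width, height = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cat_ids[name],
                "bbox": [xmin, ymin, width, height],  # COCO boxes are [x, y, w, h]
                "area": width * height,
                "iscrowd": 0,
            })
            ann_id += 1
    # Create the output folder if needed, then write the JSON file.
    Path(output_json).parent.mkdir(parents=True, exist_ok=True)
    with open(output_json, "w") as f:
        json.dump(coco, f)
    return coco
```

The key step is the box conversion: VOC stores corner coordinates (xmin, ymin, xmax, ymax), while COCO expects the top-left corner plus width and height.
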

Then run the following Jupyter notebook to visualize the COCO annotations.

COCO_Image_Viewer.ipynb
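
Independently of the notebook, a quick textual sanity check can catch conversion problems before training. The sketch below is a hypothetical helper (`summarize_coco` is not part of the repository) that counts images and boxes per category, and flags annotations whose image is missing from the file.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Return basic counts for a COCO-style dict (hypothetical helper)."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    per_category = Counter(names[a["category_id"]] for a in coco["annotations"])
    image_ids = {img["id"] for img in coco["images"]}
    # Annotations that point at a missing image usually indicate a conversion bug.
    orphans = [a["id"] for a in coco["annotations"] if a["image_id"] not in image_ids]
    return {
        "num_images": len(image_ids),
        "num_annotations": len(coco["annotations"]),
        "per_category": dict(per_category),
        "orphan_annotation_ids": orphans,
    }


def summarize_coco_file(path):
    """Load a COCO JSON file from disk and summarize it."""
    with open(path) as f:
        return summarize_coco(json.load(f))
```

For example, `summarize_coco_file("./dataset/coco/output.json")` would summarize the file produced by the conversion command above.
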

  2. Training

python table_detect_train.py
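
The contents of table_detect_train.py are not shown in this post. A typical Detectron2 fine-tuning script looks roughly like the sketch below. The dataset name `table_train`, the choice of the Mask R-CNN R50-FPN model zoo config, and all solver values are assumptions for illustration, not the author's settings. Detectron2 is imported inside the functions so the file can be read without the library installed.

```python
import os


def build_train_cfg(train_json, image_dir, output_dir="./output",
                    num_classes=1, has_masks=True):
    """Register the dataset and return a Detectron2 config (values are assumptions)."""
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.data.datasets import register_coco_instances

    # Make the COCO JSON known to Detectron2 under a (hypothetical) dataset name.
    register_coco_instances("table_train", {}, train_json, image_dir)

    config_file = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(config_file))
    # Start from COCO-pretrained weights rather than training from scratch.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_file)
    cfg.DATASETS.TRAIN = ("table_train",)
    cfg.DATASETS.TEST = ()
    cfg.DATALOADER.NUM_WORKERS = 2
    cfg.SOLVER.IMS_PER_BATCH = 2
    cfg.SOLVER.BASE_LR = 0.00025
    cfg.SOLVER.MAX_ITER = 1000
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = num_classes  # e.g. a single "table" class
    # The mask head needs a "segmentation" field per annotation;
    # pass has_masks=False when the dataset only contains boxes.
    cfg.MODEL.MASK_ON = has_masks
    cfg.OUTPUT_DIR = output_dir
    return cfg


def train(cfg):
    """Run the default training loop; checkpoints are written to cfg.OUTPUT_DIR."""
    from detectron2.engine import DefaultTrainer

    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()
```

A call such as `train(build_train_cfg("./dataset/coco/output.json", "./dataset/images"))` would start training; the image directory here is a placeholder path.
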

  3. Evaluation

python table_detect_test.py
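
table_detect_test.py is likewise not listed here. Evaluation with Detectron2 is commonly done with its COCO evaluator, roughly as sketched below. The dataset name `table_val`, the score threshold, and the assumption that training left a `model_final.pth` in `cfg.OUTPUT_DIR` are illustrative. The `COCOEvaluator` call follows recent Detectron2 releases; older releases expected additional positional arguments.

```python
import os


def evaluate(cfg, val_json, image_dir, score_thresh=0.5):
    """Run COCO-style evaluation on a held-out split (names and values are assumptions)."""
    from detectron2.data import build_detection_test_loader
    from detectron2.data.datasets import register_coco_instances
    from detectron2.engine import DefaultPredictor
    from detectron2.evaluation import COCOEvaluator, inference_on_dataset

    # Register the validation split under a (hypothetical) dataset name.
    register_coco_instances("table_val", {}, val_json, image_dir)

    cfg = cfg.clone()  # leave the caller's config untouched
    # DefaultTrainer saves its final checkpoint as model_final.pth.
    cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_thresh
    predictor = DefaultPredictor(cfg)

    evaluator = COCOEvaluator("table_val", output_dir=cfg.OUTPUT_DIR)
    loader = build_detection_test_loader(cfg, "table_val")
    # Returns a dict of metrics such as box AP at several IoU thresholds.
    return inference_on_dataset(predictor.model, loader, evaluator)
```

The `cfg` argument is expected to be the same Detectron2 config object that was used for training, so that the model architecture and number of classes match the saved checkpoint.
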

Nov 22, 2020

