Database / Datasets

Last update: Dec. 18, 2020

The following datasets were developed at our laboratory and are publicly available for research purposes.

JPSC1400 - Japanese Scene Character Dataset

This dataset consists of images of Japanese characters (Hiragana, Katakana, and Kanji) photographed in real scenes in and around Sendai, Japan.

JPSC1400-20201218.zip (Rev.20201218, README)
(63.1MB, MD5SUM: e1b491acb8756302f4e92ea54b68c1a0)

Block-Based Ground Truth Dataset for Scene Text Detection

This dataset contains 4-class ground-truth data for the natural scene images with text provided at http://www.cs.osakafu-u.ac.jp/document/ . It is intended for evaluating block-based text detection algorithms.

scene-osakafu-u-GT4.tar.gz (Rev.20090417, README)
(4.9MB, MD5SUM: 0587aa7cedad42cd869e39d56fd04eef)

Block-Based Ground Truth Dataset for ICDAR2003 SceneTrialTrain Dataset

This dataset contains 4-class ground-truth data for the natural scene images with text from the ICDAR 2003 Robust Reading Competition. The original image data can be found at http://algoval.essex.ac.uk/icdar/Datasets.html . It is intended for evaluating block-based text detection algorithms.

ICDAR2003-SceneTrialTrain-GT4.tar.gz (Rev.20090417, README)
(61.4MB, MD5SUM: d161cae745923f828662ed3279588137)
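
Verifying a download

Each archive above is listed with an MD5 checksum, so a downloaded file can be checked for corruption before unpacking. The following is a minimal sketch in Python, assuming the archive keeps the file name shown on this page. The function names md5_of_file and verify_download are illustrative and are not part of any tool distributed with the datasets.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the listings on this page.
EXPECTED_MD5 = {
    "JPSC1400-20201218.zip": "e1b491acb8756302f4e92ea54b68c1a0",
    "scene-osakafu-u-GT4.tar.gz": "0587aa7cedad42cd869e39d56fd04eef",
    "ICDAR2003-SceneTrialTrain-GT4.tar.gz": "d161cae745923f828662ed3279588137",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's MD5 matches the checksum listed for its name.

    Hypothetical helper: raises KeyError if the file name is not one of the
    archives listed on this page.
    """
    path = Path(path)
    expected = EXPECTED_MD5.get(path.name)
    if expected is None:
        raise KeyError(f"no checksum listed for {path.name}")
    return md5_of_file(path) == expected
```

For example, verify_download("JPSC1400-20201218.zip") returns True only when the file in the current directory matches the listed checksum. A result of False usually means the download was incomplete and should be repeated.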