-----------------------------------------------------
JPSC1400: Japanese Scene Character Dataset
-----------------------------------------------------

1. Release

  Rel. 20201218 (18 Dec. 2020)

2. Keywords

  Scene character, Japanese character, character recognition

3. Description

  This is a Japanese scene character dataset intended to be used for
  evaluating the accuracy of scene character recognition algorithms.
  This dataset was compiled in our research group
  (https://www.imglab.org/db/) and has been used in some of our work.
  We release it as an open dataset aiming to contribute to the
  research community.

  Specifications:
  - Consisting of Hiragana, Katakana, and Kanji characters
    (1,400 character images and 523 classes).
  - A wide variety of sizes ranging from 25x25 to 616x710.
  - The images were taken in real scenes in and around Sendai, Japan.
  - Individual character images were manually cropped out from the
    natural images.

  JPSC1400 does not cover all characters and the average number of
  samples per class is limited due to the difficulties in data
  collection. Some Japanese characters are used only in some limited
  areas, and it is hard to collect all classes.

4. Files

  - label.txt: Text file in UTF-8 showing the ground truth labels.
      The first column indicates the identification numbers of the images.
      The second column indicates the character code in UTF-8.
      The third column shows the actual characters.
  - png/: Image files in Portable Network Graphics (PNG) format.
  - ppm/: Image files in Portable PixMap (PPM) format.

5. References

  [1] F. Horie and H. Goto, ``Synthetic Scene Character Generator and
      Multi-Scale Voting Classifier for Japanese Scene Character
      Recognition,'' Proc. of IVCNZ 2018, 2018.
      doi: 10.1109/IVCNZ.2018.8634801
  [2] F. Horie and H. Goto, ``Japanese Scene Character Recognition
      Using Random Image Feature and Ensemble Scheme,''
      Proc. of ICPRAM 2019, 2019.

6. Authors

  Fuma Horie
  Graduate School of Information Science, Tohoku University, Japan
  Email: fuma.horie.s3@dc.tohoku.ac.jp

  Hideaki Goto
  Cyberscience Center, Tohoku University, Japan
  Email: hgot@cc.tohoku.ac.jp


LICENSE
-----------------------------------------------------
Copyright (c) 2020 Fuma Horie
All Rights Reserved.

You may use, copy, modify, merge, and distribute this dataset
without restriction and free of charge, subject to the following
conditions.

- The above copyright notice and this permission notice shall be
  included in all copies or substantial portions of the dataset.
- Modification(s) made to the dataset and the reason(s) of the
  modification(s) must be clearly explained in a document, and the
  document shall be included in all copies or substantial portions
  of the dataset.

THE DATASET IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.
THE AUTHOR(S) OR COPYRIGHT HOLDER(S) WILL NOT BE RESPONSIBLE FOR
ANY DAMAGE CAUSED BY THIS DATASET.
-----------------------------------------------------
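
The three-column layout of label.txt described under "Files" could be read along the following lines. This is only a minimal sketch, not part of the dataset: it assumes the columns are separated by whitespace (the description does not state the delimiter), and the names LabelEntry, parse_labels, and load_labels are hypothetical helpers introduced here for illustration.

```python
from pathlib import Path
from typing import List, NamedTuple


class LabelEntry(NamedTuple):
    image_id: str  # first column: identification number of the image
    code: str      # second column: character code, kept as written in the file
    char: str      # third column: the actual character


def parse_labels(text: str) -> List[LabelEntry]:
    """Parse label.txt content, ASSUMING whitespace-separated columns."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        entries.append(LabelEntry(fields[0], fields[1], fields[2]))
    return entries


def load_labels(path: str = "label.txt") -> List[LabelEntry]:
    """Read the label file as UTF-8, as stated in the file description."""
    return parse_labels(Path(path).read_text(encoding="utf-8"))
```

With the file loaded, len({e.char for e in load_labels()}) would give the number of distinct classes, which can be compared against the 523 classes stated in the description.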