3.2.1. Bitmap Image Font Datasets
In AI’s illustrious journey, reading and generating letters has been its adorable kindergarten project. This involved early datasets such as MNIST (Lecun et al. 1998; Deng 2012) and its Chinese handwriting counterpart CASIA (Liu et al. 2011), which consist of bitmap images of characters or glyphs. Because data collection and preparation are effortless and the shapes appear topologically uncomplicated, fonts have been employed generously to advance computer vision.
Despite this vital role in image recognition tasks, in font and glyph classification with TMNIST (Magre and Brown 2022), and even in glyph image generation (Gao et al. 2019), such datasets are not examples of real fonts. Even recent attempts that introduce the hierarchical structure and compositional aspects of the Korean alphabet Hangul (Ko et al. 2021) still treat glyph generation as a bitmap image generation task.
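To make the representational gap concrete, an entry in such a dataset is nothing more than a fixed-resolution grid of pixel intensities. The following is a minimal sketch, assuming the standard torchvision loader for MNIST; the other bitmap datasets above would look analogous.

```python
import numpy as np
from torchvision import datasets

# Download MNIST and inspect a single sample; other bitmap glyph
# datasets (TMNIST, CASIA) follow the same image-plus-label pattern.
mnist = datasets.MNIST(root="./data", train=True, download=True)
image, label = mnist[0]          # PIL image and integer class label

pixels = np.array(image)         # shape (28, 28), dtype uint8
print(pixels.shape, pixels.dtype, label)

# The glyph exists only as rasterized intensities: there is no notion
# of contours, control points, or resolution-independent outlines.
print(pixels[10:14, 10:14])      # a small patch of raw grayscale values
```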
Since the current font industry standards (Wright 1998; Korpela 2006; Haralambous 2007) encode glyph contours as splines, predominantly Bézier curves, we do not consider bitmap datasets suitable for training font generation models.
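For contrast, a hedged sketch of the contour representation that outline formats rely on: a glyph segment is a parametric Bézier curve defined by a handful of control points and can be sampled at any resolution. The control-point values below are hypothetical, chosen only to illustrate the evaluation.

```python
import numpy as np

def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bézier segment at parameter values t in [0, 1]."""
    t = np.asarray(t)[:, None]
    return ((1 - t) ** 3 * p0
            + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2
            + t ** 3 * p3)

# Hypothetical control points for one outline segment of a glyph
# (real fonts store many such segments per contour, in font units).
p0, p1, p2, p3 = map(np.array, ([0.0, 0.0], [0.2, 0.8], [0.8, 0.8], [1.0, 0.0]))

# The same segment can be sampled at any density; the representation
# is resolution-independent, unlike a fixed 28x28 pixel grid.
coarse = cubic_bezier(p0, p1, p2, p3, np.linspace(0, 1, 8))
fine   = cubic_bezier(p0, p1, p2, p3, np.linspace(0, 1, 512))
print(coarse.shape, fine.shape)   # (8, 2) (512, 2)
```

It is this resolution-independent, control-point-based structure that font generation models must ultimately produce, which is why the bitmap datasets above fall short.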