While the survey on the application served as the white rabbit at the
beginning of the journey, the survey on datasets marks the point of
entering the rabbit hole.
Machine learning is data-driven by nature: models learn a domain from
the data they digest. In type design, that data comes from the
discipline's tangible output: fonts. This
study scrutinises the datasets employed in projects documented within
the literature database, revealing an intricate dialogue between type
designers and machine learning practitioners. Their mutual fascination
with letterforms suggests a shared territory waiting to be explored.
The investigation charts two distinct data landscapes: bitmap
datasets, where fonts exist as rasterised images of glyphs, and real
fonts datasets, where typefaces retain their original vector
outlines.