3.2.2. Real Font Datasets

With a small set of 46 standard fonts used in MF-Font (Campbell and Kautz 2014), Campbell and Kautz were the first to open the problem of sequence regularisation, proposing normalisation techniques to match the number of sequences for each glyph in the dataset. Instead of applying normalisation techniques to existing fonts, we propose creating a font library with a regularised number of Bézier curves from the start.
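To make the idea of a regularised curve count concrete, the following sketch pads a glyph contour to a fixed number of cubic Bézier segments by repeatedly splitting its longest segment with De Casteljau subdivision. This is an illustrative outline only, not the construction procedure of our library; the contour representation (a list of four-point cubic segments, ranked by chord length) and the target count are assumptions made for the example.

# Illustrative sketch (Python), not the thesis pipeline: pad a contour to a
# fixed number of cubic Bézier segments by splitting the longest segment.
# A contour is assumed to be a list of segments, each a tuple of four (x, y)
# control points.

def lerp(p, q, t):
    return ((1 - t) * p[0] + t * q[0], (1 - t) * p[1] + t * q[1])

def split_cubic(seg, t=0.5):
    """De Casteljau subdivision of one cubic segment into two halves."""
    p0, p1, p2, p3 = seg
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    f = lerp(d, e, t)
    return (p0, a, d, f), (f, e, c, p3)

def chord_length(seg):
    (x0, y0), _, _, (x3, y3) = seg
    return ((x3 - x0) ** 2 + (y3 - y0) ** 2) ** 0.5

def regularise(contour, target):
    """Return the contour padded to exactly `target` segments (target >= len(contour))."""
    segs = list(contour)
    while len(segs) < target:
        longest = max(range(len(segs)), key=lambda i: chord_length(segs[i]))
        left, right = split_cubic(segs[longest])
        segs[longest:longest + 1] = [left, right]
    return segs

Because subdivision does not change the drawn curve, padding every glyph to a common segment count in this way preserves the outlines exactly while giving each glyph a sequence of equal length.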

Later, with growing confidence in generative methods, Lopes et al. (Lopes et al. 2019) introduced the SVG-Fonts Dataset, a large-scale collection of 14 M font characters. Since the dataset was collected from the internet, the quality of the fonts is questionable, as are their licensing and authorship. Instead of collecting a large-scale dataset with such implicit problems, we strive for a smaller amount that is sufficient for training, providing font examples of solid quality.

With more complex architectures, DeepSVG (Carlier et al. 2020) and later DeepVecFont (Wang and Lian 2021) used only a subset of the original SVG-Fonts Dataset (Lopes et al. 2019). Working with this dataset of random fonts from the internet, Wang and Lian (Wang and Lian 2021) were among the first to mention the curve dislocation problem caused by the seemingly arbitrary lengths of curve sequences. Instead of providing a font library with a random number of curves per similar topological structure, we aim to provide a dataset with a regularised number of sequences.
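The dislocation problem is easy to observe: drawing the same character from two different fonts usually produces command sequences of different lengths, so there is no one-to-one correspondence between their curves. The short sketch below, a minimal illustration using the fontTools library rather than the pipeline of any of the cited works, counts the outline commands of one character in two fonts; the file names are placeholders.

# Illustrative sketch (Python): count outline commands for the same character
# in two fonts to expose mismatching sequence lengths.
from collections import Counter
from fontTools.ttLib import TTFont
from fontTools.pens.recordingPen import RecordingPen

def command_counts(font_path, char):
    font = TTFont(font_path)
    glyph_name = font["cmap"].getBestCmap()[ord(char)]
    pen = RecordingPen()
    font.getGlyphSet()[glyph_name].draw(pen)
    # pen.value is a list of (operator, arguments) tuples such as
    # ("moveTo", ...), ("lineTo", ...), ("curveTo", ...), ("closePath", ()).
    return Counter(op for op, _ in pen.value)

for path in ["FontA-Regular.ttf", "FontB-Regular.ttf"]:  # placeholder file names
    counts = command_counts(path, "a")
    print(path, dict(counts), "total:", sum(counts.values()))

Run on two arbitrary fonts, the totals for the same character rarely match, which is exactly the mismatch that a dataset with a regularised number of sequences removes.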

As a library of open-source fonts, Google Fonts (Google, n.d.) is integral in providing accessible data for font generation and classification projects (Nagata, Iwana, and Uchida 2023; Cao et al. 2023; Cho, Lee, and Choi 2022; Nagata et al. 2022; Yuan et al. 2021; Xie, Fujita, and Miyata 2021; Srivatsan et al. 2021). Similar libraries serve other scripts: Naver Fonts (Naver, n.d.) focuses on the Hangul alphabet, and Fontworks (Fontworks, n.d.) on Japanese typefaces. Although these curated libraries provide high-quality fonts that are accessible to the research community, they collect the work of many authors and, by their nature, cannot regulate the sequential length of the glyph outlines.

Campbell, Neill D. F., and Jan Kautz. 2014. “Learning a Manifold of Fonts.” ACM Transactions on Graphics 33 (4): 91:1–11. https://doi.org/10.1145/2601097.2601212.
Cao, Defu, Zhaowen Wang, Jose Echevarria, and Yan Liu. 2023. “SVGformer: Representation Learning for Continuous Vector Graphics Using Transformers.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10093–102. https://openaccess.thecvf.com/content/CVPR2023/html/Cao_SVGformer_Representation_Learning_for_Continuous_Vector_Graphics_Using_Transformers_CVPR_2023_paper.html.
Carlier, Alexandre, Martin Danelljan, Alexandre Alahi, and Radu Timofte. 2020. “DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation.” In Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
Cho, Junho, Kyuewang Lee, and Jin Young Choi. 2022. “Font Representation Learning via Paired-glyph Matching.” November 20, 2022. https://doi.org/10.48550/arXiv.2211.10967.
Fontworks Inc. n.d. “Fontworks.” Accessed January 13, 2023. https://en.fontworks.co.jp/.
Google. n.d. “Google Fonts.” Accessed January 26, 2023. https://fonts.google.com/noto.
Lopes, Raphael Gontijo, David Ha, Douglas Eck, and Jonathon Shlens. 2019. “SVG-VAE: A Learned Representation for Scalable Vector Graphics.” In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 7929–38. https://doi.org/10.1109/ICCV.2019.00802.
Nagata, Yusuke, Brian Kenji Iwana, and Seiichi Uchida. 2023. “Contour Completion by Transformers and Its Application to Vector Font Data.” April 27, 2023. https://doi.org/10.48550/arXiv.2304.13988.
Nagata, Yusuke, Jinki Otao, Daichi Haraguchi, and Seiichi Uchida. 2022. “TrueType Transformer: Character and Font Style Recognition in Outline Format.” March 10, 2022. http://arxiv.org/abs/2203.05338.
Naver. n.d. “Naver Fonts.” Accessed January 13, 2023. https://hangeul.naver.com/font.
Srivatsan, Nikita, Si Wu, Jonathan Barron, and Taylor Berg-Kirkpatrick. 2021. “Scalable Font Reconstruction with Dual Latent Manifolds.” In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 3060–72. Online; Punta Cana, Dominican Republic: Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.244.
Wang, Yizhi, and Zhouhui Lian. 2021. “DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.” October 13, 2021. https://doi.org/10.48550/arXiv.2110.06688.
Xie, Haoran, Yuki Fujita, and Kazunori Miyata. 2021. “Learning Perceptual Manifold of Fonts.” June 16, 2021. https://doi.org/10.48550/arXiv.2106.09198.
Yuan, Ye, Wuyang Chen, Zhaowen Wang, Matthew Fisher, Zhifei Zhang, Zhangyang Wang, and Hailin Jin. 2021. “Font Completion and Manipulation by Cycling Between Multi-Modality Representations.” August 29, 2021. https://doi.org/10.48550/arXiv.2108.12965.

Citation

If this work is useful for your research, please cite it as:

@phdthesis{paldia2025generative,
  title={Research and development of generative neural networks for type design},
  author={Paldia, Filip},
  year={2025},
  school={Academy of Fine Arts and Design in Bratislava},
  address={Bratislava, Slovakia},
  type={Doctoral thesis},
  url={https://lttrface.com/doctoral-thesis/},
  note={Department of Visual Communication, Studio Typo}
}