3.3.1. Brief Journey Into Numerical Representation

In the machine learning pipeline, data encoding serves as the crucial bridge between domain-specific data formats and the numerical representations required by learning algorithms. Consider the initial state of domain data, nestled comfortably in its native habitat.

The state of data exists in various domain-specific formats, each carrying its own conventions and structural peculiarities. The following table illustrates common domains and their corresponding native formats:

Table presents the native representations of the data in the machine learning pipeline

This native representation undergoes an encoding process to produce a numerical representation suitable for machine learning algorithms, typically stored in .npy or .npz files. For deep learning algorithms, these numerical representations manifest as tensors with specific encoding modalities:

Table presents the encoding modalities of the data in the machine learning pipeline

Once encoded, this numerical data serves as the foundation for model training. Based on the encoding modality, specific neural network architectures are chosen to recognise patterns and abstractions within the data – not unlike how a type designer learns to spot the subtle differences between Helvetica and Arial.

Figure illustrates the data encoding pipeline in machine learning.

The above-mentioned content presents a deliberately simplified overview of the machine learning pipeline, whilst the actual implementation harbours considerably more technical nuance and architectural sophistication.

@phdthesis{paldia2025generative, title={Research and development of generative neural networks for type design}, author={Paldia, Filip}, year={2025}, school={Academy of Fine Arts and Design in Bratislava}, address={Bratislava, Slovakia}, type={Doctoral thesis}, url={https://lttrface.com/doctoral-thesis/}, note={Department of Visual Communication, Studio Typo} }

3.3.1. Brief Journey Into Numerical Representation

License

Citation