Variants mining

A digital analysis of Authorial corrections.
Variants mining of Giacomo Leopardi.

About the project

With Digital Humanities now established as an independent field and computational methods increasingly applied to the analysis of literary texts, the project departs from the widespread approach, which focuses on representing Authorial variants, and turns instead to the much less practiced study of them.

The research aims to start from the critical apparatus of a manuscript and build a system that automatically detects pre-established categories of Authorial corrections. The work focuses on two case studies: Alessandro Manzoni’s Promessi Sposi for prose and Giacomo Leopardi’s Canti for poetry.

As a result, these newer techniques deepen the conventional philological reading, making it possible to gain linguistic and empirical insight into an Author’s creative process directly from the PDF version of the work. Such immediacy will allow the study to be extended to other Authors, in order to build a complete analysis of the types of Authorial corrections and to compare these different conceptions of writing statistically.

How to read the graph


The graph below represents two main categories of correction in order to identify a stratification of the editorial process: “immediate variants” (IC), corrections made at the time of writing, including inline and overwritten ones, and “late variants” (LC), corrections made shortly or long after the first draft. The taxonomy of both classes also takes into account corrections carried out through multiple interventions in phases. Finally, insertions (INS) indicate an addition that brings substantial innovation or a linguistic gap with respect to the former system.
The numbers next to the abbreviations indicate the level of complexity of each category: 1 denotes a single Authorial intervention, while 2-8 give the total number of sub-variants for interventions in phases.
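
For readers who want to work with the graph labels programmatically, the legend above can be sketched as a small parser. This is only an illustration, not the project's own code: it assumes each label is written as an abbreviation followed by its number (for example "IC1" or "LC4"), and the names `Label`, `parse_label` and `count_by_category` are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple

# Category names follow the legend; the written form "IC1", "LC4", "INS2"
# is an assumption made for this sketch.
CATEGORIES = {
    "IC": "immediate variant",
    "LC": "late variant",
    "INS": "insertion",
}

# An abbreviation, optional whitespace, then a complexity level from 1 to 8.
LABEL_RE = re.compile(r"^(IC|LC|INS)\s*([1-8])$")


class Label(NamedTuple):
    category: str    # "IC", "LC" or "INS"
    complexity: int  # 1 = single intervention, 2-8 = number of sub-variants

    @property
    def in_phases(self) -> bool:
        """True when the correction was made through several interventions."""
        return self.complexity >= 2


def parse_label(text: str) -> Label:
    """Split a label such as 'LC4' into its category and complexity level."""
    match = LABEL_RE.match(text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized label: {text!r}")
    return Label(match.group(1), int(match.group(2)))


def count_by_category(labels: Iterable[str]) -> Counter:
    """Count how many labels fall into each category."""
    return Counter(parse_label(label).category for label in labels)
```

Under these assumptions, `parse_label("LC4")` yields a late variant made of four sub-variants, and `count_by_category` gives a quick tally of how the corrections in a sample are distributed across the three categories.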