Artamène ou
le Grand Cyrus




MAKING THE “UNREADABLE” READABLE: THE LONGEST FRENCH NOVEL ON LINE


Madeleine de Scudéry’s novel Artamène ou le Grand Cyrus, written between 1649 and 1653, is exceptional by its length alone (over 8000 pages) and perhaps unequaled in the history of the printed book.

The novel's complex structure and plot make it difficult to obtain an overall grasp of the work: there are more than four hundred characters and over one hundred settings. Artamène was conceived for public, non-linear readings. For all of these reasons, as well as de Scudéry’s encyclopedic ambition, Le Grand Cyrus offers a remarkable challenge for the creation of a new kind of website: a site that offers the unabridged original text of this novel, whose importance in the history of seventeenth-century French literature is capital. The goal of such a website is not only to ensure the survival of a work whose sheer size would make modern paper publication unwieldy, but also to create a reading tool appropriate for a textual mass of such an imposing scale.

If our customary model for literary reading, linear and intimate, is inadequate for a work like de Scudéry's, the Internet seems able to provide an alternative mode of reading, somewhat analogous to that for which the novel was originally intended. In other words, the transposition from the printed medium to an electronic form should provide a reading experience closer to the seventeenth-century model (a public, oral reading) and thus give us access to a text that, through the evolution of reading practices over the centuries, had become unreadable.

This is the goal of the “Artamène” project, directed by Claude Bourqui and Alexandre Gefen (Université de Neuchâtel) and financed by the Fonds National Suisse de la recherche scientifique (Swiss National Science Foundation). Today we would like to discuss how, in the project’s initial phase, the TEI guidelines, beyond their well-earned reputation for reliability and stability, allowed the optimal management of such an immense text and, through the semantics that they permit, allowed the text to be processed in ways that would have been unimaginable with a traditional printed book.

1) The TEI and on-line publication

Our project, as we just mentioned, is based on a digital text using the TEI format, a standard whose use is not restricted to on-line publication.

One of our key initial requirements was to ensure enduring access to our work, regardless of the evolution of future computer technology and standards. For this reason we chose to use only open-source software and free and open standards recommended by the scientific and technical communities.

Enduring access is assured first of all because the rules governing the encoding of the digital text are explicitly laid out in a DTD (Document Type Definition), a formal description of the permitted markup written within the standardized XML framework. Thus the text is essentially inseparable from a simple set of standardized operating instructions.

The second set of requirements was reliability and citability. The use of an image mode (which I’ll show you shortly) is the ultimate guarantee of the first of these. By “citability” we mean the possibility for scholars to refer to the digital text just as one would cite a paper version. You can see that our concern was not simply to reproduce de Scudéry’s novel on the Internet, but to help the Internet become a new kind of reference tool that would fit directly into existing academic practices.

Last but not least, it was crucial to ensure that the text would be read, that is, to produce a universally accessible and useful version of the novel. XML allows texts to be easily converted to many different formats. For example, you can download Le Grand Cyrus from our website in several different formats, including an "ebook" version, a Microsoft Word version and a version ready to be printed.
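As a rough illustration of this convertibility, the sketch below renders one small XML fragment both as plain text and as HTML, using only the Python standard library. The fragment, its element names and the two helper functions are assumptions made for the example; they are not the project's actual encoding or conversion chain.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical TEI-like fragment (invented for this sketch).
SAMPLE = """<div type="book" n="1">
  <head>Livre premier</head>
  <p>First paragraph of the book.</p>
  <p>Second paragraph, with an <hi>emphasized</hi> word.</p>
</div>"""


def to_plain_text(div):
    """Render the heading (in capitals) and the paragraphs as plain text."""
    blocks = []
    for child in div:
        if child.tag in ("head", "p"):
            # itertext() flattens nested markup such as <hi>
            text = "".join(child.itertext()).strip()
            blocks.append(text.upper() if child.tag == "head" else text)
    return "\n\n".join(blocks)


def to_html(div):
    """Render the same division as a minimal HTML fragment."""
    parts = []
    for child in div:
        text = escape("".join(child.itertext()).strip())
        if child.tag == "head":
            parts.append(f"<h2>{text}</h2>")
        elif child.tag == "p":
            parts.append(f"<p>{text}</p>")
    return "\n".join(parts)


root = ET.fromstring(SAMPLE)
print(to_plain_text(root))
print(to_html(root))
```

Both outputs are derived from the same source file, so a correction made once in the XML reaches every published format.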

These requirements led us to adopt the following technical solutions:

a) We were able to avoid part of the complexity of the TEI guidelines by using TEI Lite, a simplified version of the standard that can nevertheless be extended.

b) The TEI guidelines are a standard for encoding text, but they cannot be used directly without specific conversion and presentation tools. Numerous frameworks exist for presenting XML, and most of them use the XSLT transformation language. However, we chose to use a technology known as XPathScript, a Perl-based stylesheet language for the AxKit platform, which itself runs on mod_perl. We adopted this solution primarily to simplify the deployment of our web server. XSLT would have been a technically “cleaner” solution, but would also have been much more difficult to implement.

Despite the fact that the TEI consortium provides ready-made XSLT style sheets, it remains extremely complex to develop applications in XSLT that go beyond the simple representation of texts. It is worth noting that a more practical alternative is on the horizon, with the introduction of native XML processing in the latest version (5.0) of PHP. This will allow programmers to integrate XML texts more easily into dynamic web pages.

2) The TEI as a publishing tool

The TEI guidelines were initially developed as a tool for publishing texts, and it is in this spirit that we proceeded to use them.

a) Establishing the text

The techniques that the TEI proposes for managing different versions of a text are extremely rich, in particular the tools for transcribing manuscripts, and the <alt> tag that allows for alternative versions.

However, these tools are rather awkward for such a long text. Their use can be simplified if the project is limited to the needs of a scholar, at least for texts that are readable after a simple transliteration from seventeenth-century spelling to modern French.

The possibility of viewing the text in image mode suffices to eliminate any ambiguity, provided the transition between modes is fast enough to permit an instant comparison.

You will notice that this solution is powerful enough to eliminate the need to provide a non-transliterated version. It would be possible to maintain parallel versions, but technically difficult.

b) Annotation and commentary

The Artamène website is not intended to be a scholarly edition of de Scudéry’s novel. In any case, annotation on the Internet creates numerous new problems, which are compounded by the use of XML: the impossibility for contributors to insert their comments directly into the source file; the fact that scholarly contributions often arrive slowly; and, above all, the immensity of the task. On the Internet, where we cannot know or control the reader’s entry point, annotating Le Grand Cyrus would require systematic notes, at least for the characters and settings.

Instead, to allow the development of critical discourse around Le Grand Cyrus, we opted for an external tool based on the idea of the “wiki”, which is currently the easiest way to create spontaneous encyclopedias of related notions, capable of representing overlapping ideas within a given conceptual universe. Our “encyclopedia of the world of Cyrus” allows free collaboration among a closed list of registered contributors. The entries of this encyclopedia are automatically furnished with lists of words and notions and can later be inserted into the main text. The XML presentation system then allows these notes to appear as text bubbles, a technique that makes for a convenient transition between the text and its commentaries without interrupting the underlying visual coherence.

3) The TEI as a reading tool

The TEI guidelines enforce a separation between the text and its visual presentation. This separation enables a potentially infinite variety of presentation formats depending on the intended public or the desired use: discovery, reading, printing, comparison between the digital text and the image of the original printed book.

a) Navigation and reading

By giving the text a logical tree structure and enabling an easy segmentation that follows the different modes of enunciation and the choices of genre, XML is particularly useful for manipulating gigantic texts like Le Grand Cyrus. XML allows the reader to visualize and manipulate complex vertical narrative structures that are quite different from the chapter divisions of the modern novel. It is possible to conceive of a complex system of tags that mixes, in a hierarchy of <div> tags, real textual entities and arbitrary divisions. The only limit to this approach is inherent to the markup rules of XML, which require a rigorous hierarchical structure in which the different branches of the tree cannot cross one another.

A navigation system that “unfolds” the leaves of the XML tree shows the power and the possibilities offered by XML. XML allows us to replace huge textual elements with brief summaries (there are three levels of summaries), thus affording us a comprehensive view and easy orientation in an otherwise inextricable textual jungle.
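The principle of such an unfolding outline can be sketched in a few lines of Python. The example assumes, purely for illustration, that each <div> carries a `type` and an `n` attribute and begins with an <argument> element holding its summary; the sample tree and the `outline` function are hypothetical and do not reproduce the site's actual markup or interface.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested divisions, each with a short summary (invented sample).
SAMPLE = """<div type="part" n="1">
  <argument><p>Summary of part 1.</p></argument>
  <div type="book" n="1">
    <argument><p>Summary of book 1.</p></argument>
    <div type="story" n="1">
      <argument><p>Summary of an embedded story.</p></argument>
      <p>Full text of the embedded story.</p>
    </div>
  </div>
  <div type="book" n="2">
    <argument><p>Summary of book 2.</p></argument>
    <p>Full text of book 2.</p>
  </div>
</div>"""


def outline(div, max_depth, depth=0):
    """List (depth, label, summary) for every division down to max_depth.

    Divisions deeper than max_depth stay "folded": they are represented
    only by the summary of their nearest listed ancestor.
    """
    label = f"{div.get('type')} {div.get('n')}"
    arg = div.find("argument")
    summary = "".join(arg.itertext()).strip() if arg is not None else ""
    rows = [(depth, label, summary)]
    if depth < max_depth:
        for child in div.findall("div"):
            rows.extend(outline(child, max_depth, depth + 1))
    return rows


root = ET.fromstring(SAMPLE)
for depth, label, summary in outline(root, max_depth=1):
    print("  " * depth + f"{label}: {summary}")
```

Raising `max_depth` by one unfolds the next level of the tree, which is possible only because the divisions are strictly nested.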

As we can see, the combination of the TEI and a navigation interface designed by and for scholars can result in colossal gains in ease of reading while overcoming the obstacles presented by such an immense text.

4) The TEI as a tool for textual analysis

Since each textual element is “encapsulated” within a pair of XML tags, the text can also be used as a database.

a) From traditional research tools to XML

Faced with such a massive text, a search engine seems perfectly justified. The relatively limited number of words and morphological varieties present in the text makes lemmatization unnecessary (truncated forms are used instead, which are more than sufficient). For the same reason, a contextual search engine is indispensable, so we chose the PhiloLogic engine, developed at the University of Chicago for the ARTFL project, a tool that is powerful and reliable, though specifically designed for the task at hand.
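To make the idea of searching by truncated form with context more concrete, here is a minimal keyword-in-context sketch in Python. It is not PhiloLogic and does not reflect its interface; the sample sentence, the `concordance` function and its `width` parameter are all invented for the illustration.

```python
import re

# Invented sample sentence, not a quotation from the novel.
TEXT = ("Cyrus aimait Mandane ; Mandane aimait la gloire de Cyrus, "
        "et les amants parlaient d'amour sans cesse.")


def concordance(text, stem, width=20):
    """Find every word beginning with `stem` and return
    (left context, matched word, right context) triples."""
    pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits


# The truncated form "am" retrieves both "amants" and "amour".
for left, word, right in concordance(TEXT, "am"):
    print(f"{left:>20} [{word}] {right}")
```

A truncated form stands in for a lemma here: one stem retrieves several related word forms without any morphological analysis.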

With a future version of the search engine, we will be able to search for a word in a particular generic or thematic context. This capability will become real when database software supports native XML queries.
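A minimal sketch of such a context-restricted search, written in Python with the standard library, might look as follows. It assumes that generic contexts are marked by a `type` attribute on <div> elements; the attribute values, the sample text and the `search_in_context` function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample: three divisions of different (hypothetical) generic types.
SAMPLE = """<text>
  <div type="narration"><p>Le combat fut long et cruel.</p></div>
  <div type="letter"><p>Je vous écris après le combat.</p></div>
  <div type="dialogue"><p>Parlons de votre lettre.</p></div>
</text>"""


def search_in_context(root, word, div_type):
    """Return the paragraphs containing `word` inside divisions of `div_type`."""
    results = []
    # ElementTree supports a limited XPath subset, including attribute tests.
    for div in root.iterfind(f".//div[@type='{div_type}']"):
        for p in div.iter("p"):
            text = "".join(p.itertext())
            if word.lower() in text.lower():
                results.append(text)
    return results


root = ET.fromstring(SAMPLE)
print(search_in_context(root, "combat", "letter"))
```

The word "combat" also occurs in the narration, but only the occurrence inside a letter is returned.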

Now we would like to briefly consider some of the possibilities offered by XML that we are currently working on and from which other projects using the TEI could perhaps benefit.

b) Semantization and textual cartography

In the future, we will implement a series of cartographic tools on our website that will, for example, produce a graphic rendering of the location of all the dialogs, or of all the occurrences of a given topic or narrative leitmotiv. A graphical representation using color codes will also be used in search engine results in order to provide rapid contextualization.
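A very reduced, text-only version of such a map can be sketched in Python. The example assumes that topics are marked with an `ana` attribute on <seg> elements, which is only one possible encoding; the sample and the `topic_map` function are hypothetical, and a real tool would draw colored graphics rather than characters.

```python
import xml.etree.ElementTree as ET

# Invented sample: four divisions, some containing topic-tagged segments.
SAMPLE = """<text>
  <div n="1"><p><seg ana="duel">A first duel.</seg></p></div>
  <div n="2"><p><seg ana="jealousy">A jealous rival.</seg></p></div>
  <div n="3"><p>Plain narration.</p></div>
  <div n="4"><p><seg ana="duel">A second duel</seg>, then
      <seg ana="jealousy">more jealousy</seg>.</p></div>
</text>"""


def topic_map(root, topic):
    """Return one character per top-level division:
    '#' if the topic occurs in it, '.' otherwise."""
    cells = []
    for div in root.findall("div"):
        found = any(seg.get("ana") == topic for seg in div.iter("seg"))
        cells.append("#" if found else ".")
    return "".join(cells)


root = ET.fromstring(SAMPLE)
for topic in ("duel", "jealousy"):
    print(f"{topic:>10} {topic_map(root, topic)}")
```

Each row shows at a glance where in the sequence of divisions a topic appears.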

c) Semantization and reorganizing the text

Ultimately, the semantization done with the original coding of the text will allow us to define topographically distant compositional units. With these units, the user will be able to reorganize and restructure the text. We will then be able to create versions of the novel that bring together all of the dialogs, or all of the texts related to the “duel” or “jealousy” leitmotiv, etc. And based on these reorganizations, one could imagine further manipulations of the original material.
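The following Python sketch illustrates the principle of such a reorganization: it gathers every segment tagged with a given leitmotiv into a new division while recording where each one came from. The `ana` attribute on <seg>, the `source` attribute on the output and the `anthology` function are assumptions made for the example, not the project's encoding.

```python
import xml.etree.ElementTree as ET

# Invented sample: tagged segments scattered across three divisions.
SAMPLE = """<text>
  <div n="1"><p>Narration. <seg ana="duel">The rivals draw swords.</seg></p></div>
  <div n="2"><p><seg ana="jealousy">He envies the stranger.</seg></p></div>
  <div n="3"><p><seg ana="duel">They fight again at dawn.</seg> Narration.</p></div>
</text>"""


def anthology(root, topic):
    """Build a new <div> collecting every segment tagged with `topic`."""
    out = ET.Element("div", {"type": "anthology", "ana": topic})
    for div in root.findall("div"):
        for seg in div.iter("seg"):
            if seg.get("ana") == topic:
                # Keep a pointer back to the division the segment came from.
                p = ET.SubElement(out, "p", {"source": div.get("n")})
                p.text = "".join(seg.itertext()).strip()
    return out


root = ET.fromstring(SAMPLE)
print(ET.tostring(anthology(root, "duel"), encoding="unicode"))
```

Keeping the source reference on each collected passage means that a reorganized version can always lead the reader back to the original location in the novel.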

Conclusion

a) From an archival point of view, the TEI guidelines are an effective tool but rather difficult to implement. This difficulty is compounded if we take into account the complexity of XSLT display software and native XML databases. More traditional solutions (image files to refer to the original text, Perl or PHP parsers, or existing search and indexing tools) provide considerable time savings in development and implementation. The development of a "pure" XML scholarly publishing system would only be feasible if the different TEI projects were to join together to build it.

b) From a scientific point of view, the primary innovation to look forward to is the possibility of treating a text as a database. This type of procedure, based on the definition of textual units (through the powerful annotation system that the TEI provides and through the structural properties of text markup), will allow us to create much more refined search queries. The TEI could then be considered not just as a standard for preserving, presenting and searching static digital texts, but as a means for dynamically manipulating and analyzing texts.

Claude Bourqui and Alexandre Gefen, Université de Neuchâtel (English translation: Joseph Fahey)



