The Scholastic Commentaries and Texts Archive: Creating the Foundation of a Critical Corpus

Jeffrey C. Witt (Loyola University Maryland) | @jeffreycwitt

2017 SIEPM, Porto Alegre, Brazil, July 25, 2017

# I.A. The Basic Problem
Today, traditional mechanisms of publishing scholarly editions of medieval scholastic texts are no longer helping to advance scientific research. Instead, considered in light of what is now possible, traditional publishing workflows are preventing progress. These workflows are preventing collaborative aggregation, computer-assisted analysis and discovery, open access, and data re-use.
When it is easier to access the manuscripts or early printed editions of Gregory of Rimini’s Lectura than the modern critical edition, it is clear that publication is failing in its fundamental task of making data **publicly** available for research. ![rimini](
What is more, these editions are fragmented across different publishers, who prepare data according to proprietary and incompatible data standards. This means, first, that the creation of an edition requires wasteful redundancies. Second, it means that the aggregation and analysis of connected content is impossible.

A quick glance around the internet at research groups attempting to create a web presence for scholastic texts reveals that the paradigm of isolation and redundancy is alive and well on the web.

The result of this approach, when speaking either of printed books or webpages, is a world of data silos ![silos](
**In sum**: research groups are currently choosing the most inefficient way possible to make data available on the web, while leaving us with results that barely go beyond the capabilities of the printed page.
# I.B Why This Problem Exists

The problem lies in our understanding of the heart of what an edition is.

# I.C Why This Problem Should No Longer Exist
Today, we have mechanisms of recording the semantic meaning of a text and the relationships between texts that are not tied to any particular presentational form. ![xml]( We should be using these tools.
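
As a rough illustration of what it means to record semantics independently of presentation, the sketch below parses one semantically marked-up sentence and derives two different presentations from it. The element names (`name`, `quote`), the attributes, and the sample sentence are assumptions chosen for illustration; they are not taken from the SCTA guidelines.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample: the markup says WHAT each span is, not how it should look.
SAMPLE = (
    '<p>As <name ref="#Aristotle">the Philosopher</name> says, '
    '<quote source="#Metaphysics">all men by nature desire to know</quote>.</p>'
)

def to_plain(elem):
    """Flatten an element to plain text, putting quotation marks around quotes."""
    parts = [elem.text or ""]
    for child in elem:
        inner = to_plain(child)
        parts.append(f'"{inner}"' if child.tag == "quote" else inner)
        parts.append(child.tail or "")
    return "".join(parts)

def to_html(elem):
    """Render the same element as HTML, mapping semantic tags to HTML tags."""
    tag_map = {"p": "p", "name": "span", "quote": "q"}
    html_tag = tag_map.get(elem.tag, "span")
    body = escape(elem.text or "")
    for child in elem:
        body += to_html(child) + escape(child.tail or "")
    return f'<{html_tag} class="{elem.tag}">{body}</{html_tag}>'

root = ET.fromstring(SAMPLE)
plain = to_plain(root)
html = to_html(root)
print(plain)
print(html)
```

The same source yields both outputs, and a third presentation (print, a search index, a citation list) would need only another small function rather than a new edition.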
# I.D Why This Problem Persists In Web Form
The problem with simple webpage publication is clear from the name. It encourages us to continue to think about our texts as “pages” confined to the page. Such pages are only usable to us as already presented pages, and not as re-usable data.
A shift to a "text-as-network" paradigm will let us see our editions first and foremost as data structures that can easily be connected to other networks or manipulated into any form of presentation. ![web](
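
A minimal sketch of the "text-as-network" idea, assuming a text can be reduced to subject-predicate-object statements. All identifiers and predicate names here (`text:A`, `work:X`, `isPartOf`, `quotes`) are hypothetical placeholders, not SCTA vocabulary.

```python
from collections import defaultdict

# Hypothetical statements contributed by two independent editions.
TRIPLES = [
    ("para:a1", "isPartOf", "text:A"),
    ("para:b1", "isPartOf", "text:B"),
    ("para:b2", "isPartOf", "text:B"),
    ("para:a1", "quotes", "work:X"),
    ("para:b2", "quotes", "work:X"),
    ("para:b1", "quotes", "work:Y"),
]

def build_index(triples):
    """Index subjects by (predicate, object) so connections can be looked up."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[(predicate, obj)].add(subject)
    return index

def texts_quoting(work, triples):
    """Return the sorted identifiers of every text with a paragraph quoting `work`."""
    index = build_index(triples)
    part_of = {s: o for s, p, o in triples if p == "isPartOf"}
    paragraphs = index[("quotes", work)]
    return sorted({part_of[para] for para in paragraphs if para in part_of})

shared = texts_quoting("work:X", TRIPLES)
print(shared)
```

Because the statements from both editions live in one structure, a connection between them (here, a shared source) can be found by a query instead of by a reader who happens to know both texts.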
# I.E Why This Problem Is So Important To Solve
The medieval scholastic corpus is a vast interconnected tapestry. The problem is that the current publication paradigm makes it impossible for us to see this tapestry. No individual reader will ever be able to read the entire corpus or discover all the connections within it. If we are ever to achieve the goal of seeing the entire tapestry, we must find a collaborative solution that allows the small contributions of individual editors to be aggregated over time into an organized and navigable network.
The Good News: We are already on our way to solving it, and we need your participation. Join and support the SCTA Community!
# I.F The Aim of the SCTA
First: to form a community of people who develop and maintain editorial guidelines and best practices for the semantic markup of a text and the ontologies that organize these texts.
Second: to aggregate and organize this text data so that it can be offered back to the entire community as a service that can be used and re-used for traditional and untraditional purposes.
# The SCTA At the Present
* Core Technical Team
  * Michael Stenskjær Christensen (University of Copenhagen)
  * Nicolas Vaughan (Universidad de los Andes)
  * Jeffrey C. Witt (Loyola University Maryland)
  * Ueli Zahnd (University of Basel)
* Editorial Board
  * Michael Stenskjær Christensen (University of Copenhagen)
  * Andrew Dunning (British Library)
  * John Slotemaker (Fairfield University)
  * Nicolas Vaughan (Universidad de los Andes)
  * Jeffrey C. Witt (Loyola University Maryland)
  * Ueli Zahnd (University of Basel)
* Advisory Board
  * Monica Brinzei (Institut de recherche et d'histoire des textes, Paris)
  * Marc Ozilou (ICP/Paris IV-Sorbonne; Director of the Groupe de Recherche Pierre Lombard, GRPL)
  * Lydia Schumacher (King's College London)
  * Jose Merinhos (University of Portugal)
  * Pascale Bermon (Laboratoire d’Études sur les Monothéismes)
  * Christophe Grellard (Ecole Pratique des Hautes Etudes)
  * José Filipe Pereira da Silva (University of Helsinki)
  * Samuel Huskey (University of Oklahoma)
  * Dot Porter (University of Pennsylvania)
  * Matthew Treskon (Loyola Notre Dame Library)

Archive Stats:
* 2,755,088 total assertions indexed.
* 7,000,000+ words indexed and searchable.
* 78 texts indexed.
* 15,763 file "items" indexed.
* 14,300 quotations indexed.
* 10,136 mentioned names indexed.
* 2,599 mentioned works indexed.
* And we're just getting started!
# 2. The SCTA in Practice

Establishing Guidelines

Semantic Editing according to Guidelines

Creating the data set with RDF extraction
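
One way such an extraction step could look, as a sketch only: walk a semantically encoded file and emit one RDF statement per marked-up relationship, serialized as N-Triples. The XML structure, the `http://example.org/` base URI, and the property names are hypothetical and do not reproduce the SCTA's actual schema or extraction code.

```python
import xml.etree.ElementTree as ET

BASE = "http://example.org/"  # hypothetical namespace, for illustration only

# Hypothetical encoded text: one division containing one paragraph.
SAMPLE = (
    '<div id="q1">'
    '<p id="q1-p1">As <name ref="Aristotle">the Philosopher</name> says, '
    '<quote ref="Metaphysics">all men by nature desire to know</quote>.</p>'
    '</div>'
)

def extract_triples(xml_text):
    """Return N-Triples lines describing each paragraph's relationships."""
    root = ET.fromstring(xml_text)
    parent = f"<{BASE}resource/{root.get('id')}>"
    lines = []
    for para in root.iter("p"):
        subject = f"<{BASE}resource/{para.get('id')}>"
        # Structural relationship: the paragraph belongs to its division.
        lines.append(f"{subject} <{BASE}prop/isPartOf> {parent} .")
        # Each marked-up name becomes a "mentions" statement.
        for name in para.iter("name"):
            lines.append(f"{subject} <{BASE}prop/mentions> <{BASE}person/{name.get('ref')}> .")
        # Each marked-up quotation becomes a "quotes" statement.
        for quote in para.iter("quote"):
            lines.append(f"{subject} <{BASE}prop/quotes> <{BASE}work/{quote.get('ref')}> .")
    return lines

triples = extract_triples(SAMPLE)
for line in triples:
    print(line)
```

Statements produced this way from many separately edited files can be loaded into a single store, which is what makes aggregate counts and cross-text queries possible.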

Critical Corpus Database Visualization

## Lbp Web Interface
## Lbp Print Interface
## Mirador Interface
## Quotation Analysis
