The Scholastic Commentaries and Texts Archive:
advancing research through a connected corpus.
Jeffrey C. Witt (Loyola University Maryland)
https://jeffreycwitt.com | jcwitt@loyola.edu
@jeffreycwitt
March 12, 2023, Evergreen Museum & Library, Johns Hopkins University, Baltimore, MD
# The Importance of a Machine Accessible Corpus
# Print Editions
## vs.
# Digital Editions
# ~~Print Editions~~
## ~~vs.~~
# ~~Digital Editions~~
---
# Data
## vs.
# Presentation
# New Discoveries
# Increased Transparency
The field needs a revolution in its expectations for the publication of critical editions. Regardless of how a text is presented, the field should expect a separate and distinct publication of data.
This expectation should be buttressed by new kinds of academic societies that take on the burden of defining the standards of how that data is formatted and organized.
1. Create and maintain data standards for the scholastic corpus and related texts
2. Aggregate data published according to the SCTA standards and generate new data insights from the aggregate
3. Publish this aggregated data for downstream use (e.g., data analysis and data visualizations, such as print or web renderings)
# We need a GPS for Text "Manifesting" Manuscripts
# Text Hierarchy as the GPS coordinates of our Text "Manifesting" Manuscripts
# Discovery of Citation Networks
## Through Bi-Directional Links
## What if we combine this citation network
## with our text-to-image network?
# Discovery of Successive Re-Use
## through n-gram similarity detection
“Petrus Gracilis...followed not only the footsteps but the very phrases of Hiltalingen in a way so deceptive that it does not cast the best light on Gracilis. He read secundum Hiltalingen without ever mentioning him. Only by a **lucky coincidence** [emphasis mine] was I enabled to "unmask" Gracilis' dubious literary honesty.” (See Trapp, Damasus, "Augustinian Theology of the 14th Century," Augustiniana 6 (1956): 147-274, p. 254.)
"The cat is on the mat"
---
4-grams
---
"the cat is on"
"cat is on the"
"is on the mat"
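The split shown above can be reproduced with a minimal sketch. The function name `word_ngrams` and the choice to lowercase and split on whitespace are assumptions made for illustration; they are not a description of how the SCTA pipeline tokenizes text.

```python
def word_ngrams(text, n=4):
    """Return the word-level n-grams of `text`, in order of appearance."""
    # Assumption: tokens are lowercased and separated by whitespace only.
    tokens = text.lower().split()
    # Slide a window of n tokens across the text, one token at a time.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


for gram in word_ngrams("The cat is on the mat"):
    print(gram)
```

A text shorter than `n` words yields no n-grams at all, which is why very short passages cannot be matched this way.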
![intersection](https://s3.amazonaws.com/lum-faculty-jcwitt-public/2023-02-01/image5.png)
Similarity: X is similar to Y if and only if
$$ \\#\\{ ng \mid \mathrm{IsFoundIn}(ng, X) \land \mathrm{IsFoundIn}(ng, Y) \land X \neq Y \\} \geq n $$
where n = 6
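As a rough sketch of this test, the following counts the distinct 4-grams two passages share and compares that count to the threshold of 6. The function names, the whitespace tokenization, and the two sample passages are all invented for illustration, and checking that the two passages are actually different texts is left to the caller.

```python
def ngram_set(text, n=4):
    """Return the set of distinct word-level n-grams in `text`."""
    # Assumption: tokens are lowercased and separated by whitespace only.
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngram_count(x, y, n=4):
    """Count the distinct n-grams found in both passages."""
    return len(ngram_set(x, n) & ngram_set(y, n))


def is_similar(x, y, threshold=6, n=4):
    """True if the passages share at least `threshold` distinct n-grams."""
    return shared_ngram_count(x, y, n) >= threshold


# Invented sample passages that share one ten-word run.
passage_a = "in the beginning god created the heaven and the earth and the earth was void"
passage_b = "as it is written in the beginning god created the heaven and the earth"

print(shared_ngram_count(passage_a, passage_b))
print(is_similar(passage_a, passage_b))
```

Using sets means a phrase repeated within one passage counts only once, so the threshold measures how many different shared phrases there are rather than how often one phrase recurs.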