The Scholastic Commentaries and Texts Archive: Creating the Foundation of a Critical Corpus
Jeffrey C. Witt (Loyola University Maryland) | @jeffreycwitt
2017 MLA, Philadelphia, Pennsylvania, January 7, 2017
Slide Deck: http://lombardpress.org/slides/2017-01-07-mla
# Problems and Aspirations
## At the current time...
## The medieval corpus is an invisible tapestry
## We want to provide a scientific view of each thread within the context of the whole.
### A scientific view through computer assisted:
- Analysis
- Search
- Access
- Synthesis
# The Current Paradigm
## Text-as-Document
## The problem with the text-as-document approach
### Data Isolation: in print and on the web
**In sum**: research groups are currently
choosing the most inefficient way possible
to make data available on the web,
while leaving us with results that go barely
beyond the capabilities of the printed page.
# A New Paradigm
## Text-as-Network
Despite the inarguable benefits the Web provides, until recently the same principles that
enabled the Web of documents to flourish have not been applied to data. Traditionally, data
published on the Web has been made available as raw dumps in formats such as CSV or
XML, or marked up as HTML tables, sacrificing much of its structure and semantics. In the
conventional hypertext Web, the nature of the relationship between two linked documents is
implicit, as the data format, i.e. HTML, is not sufficiently expressive to enable individual
entities described in a particular document to be connected by typed links to related
entities.
Christian Bizer, Tom Heath, Tim Berners-Lee, "Linked Data - The Story So Far", http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf
"Data is relationships." -- Tim Berners-Lee, TED Talk, February 2009
- WorkGroup
- Work
  - The idea of Moby Dick
- Expression
  - The idea of Melville's expression (as opposed to a screenplay expression)
- Manifestation
  - The idea of the 1959 edition of Moby Dick
- Item
  - One physical copy of the 1959 edition in a particular library
- Transcription
  - The idea of a digital transcription of the 1959 edition of Moby Dick
  - Includes properties like hasXML, hasJson, hasPlaintext, hasHtml
- Manifestation Surface
  - The idea of page 1 in the 1959 edition of Moby Dick
- Item Surface
  - The physical page 1 in a particular copy of the 1959 edition of Moby Dick
- IIIF Canvas
- IIIF image annotation
  - Images taken of the physical page 1 in a particular copy
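
A minimal sketch of how the hierarchy above could be modeled in code, assuming plain Python dataclasses. The class and attribute names (`items`, `transcriptions`, `has_xml`) and the example.org URL are illustrative assumptions, not the SCTA's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Transcription:
    """The idea of a digital transcription of one manifestation."""
    label: str
    has_xml: Optional[str] = None  # stands in for the hasXML property


@dataclass
class Item:
    """One physical copy held in a particular library."""
    label: str


@dataclass
class Manifestation:
    """The idea of a particular edition."""
    label: str
    items: List[Item] = field(default_factory=list)
    transcriptions: List[Transcription] = field(default_factory=list)


@dataclass
class Expression:
    """One author's expression of the work."""
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract idea of the work."""
    label: str
    expressions: List[Expression] = field(default_factory=list)


def items_of(work: Work) -> List[Item]:
    """Walk Work -> Expression -> Manifestation -> Item."""
    return [
        item
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
    ]


# Build the Moby Dick example; the URL is a placeholder, not a real resource.
library_copy = Item("One physical copy of the 1959 edition")
edition_1959 = Manifestation(
    "1959 edition of Moby Dick",
    items=[library_copy],
    transcriptions=[
        Transcription(
            "Digital transcription of the 1959 edition",
            has_xml="https://example.org/moby-dick-1959.xml",
        )
    ],
)
novel = Expression("Melville's expression", manifestations=[edition_1959])
moby_dick = Work("Moby Dick", expressions=[novel])

print([item.label for item in items_of(moby_dick)])
```

Because each level holds references to the level below it, a question such as "which physical copies witness this work?" becomes a traversal of relationships rather than a search through isolated documents.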
Creating the data set with RDF extraction
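
One way to picture RDF extraction: walk a structured description of a text and emit one subject-predicate-object statement per relationship. The sketch below assumes a hypothetical XML description, hypothetical `ex:` predicates, and an example.org base URL; it is not the SCTA's actual extraction pipeline or vocabulary.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical description file, invented for illustration.
SAMPLE = """
<work id="moby-dick">
  <title>Moby Dick</title>
  <expression id="moby-dick-novel">
    <manifestation id="moby-dick-1959"/>
  </expression>
</work>
"""

BASE = "http://example.org/resource/"  # placeholder namespace


def extract_triples(xml_text: str) -> List[Triple]:
    """Turn the nesting of the XML into explicit, typed statements."""
    root = ET.fromstring(xml_text.strip())
    work = BASE + root.attrib["id"]
    triples: List[Triple] = [(work, "rdf:type", "ex:Work")]

    title = root.findtext("title")
    if title:
        triples.append((work, "dc:title", title))

    for expression in root.findall("expression"):
        expression_id = BASE + expression.attrib["id"]
        triples.append((work, "ex:hasExpression", expression_id))
        for manifestation in expression.findall("manifestation"):
            manifestation_id = BASE + manifestation.attrib["id"]
            triples.append((expression_id, "ex:hasManifestation", manifestation_id))

    return triples


for subject, predicate, obj in extract_triples(SAMPLE):
    print(subject, predicate, obj)
```

The relationship that was only implicit in the XML nesting is stated explicitly in each triple, which is what lets the resulting data set be queried as a network.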
Critical Corpus Database Visualization
Building common libraries for accessing the text-network
https://github.com/lombardpress/lbp.rb
### Suggested Readings
#### Introduction
[http://lombardpress.org/2016/08/02/bcht-scta-lbp-overview/](http://lombardpress.org/2016/08/02/bcht-scta-lbp-overview/)
#### Modeling and Workflows
[http://lombardpress.org/2016/06/12/DTS-modeling-proposal/](http://lombardpress.org/2016/06/12/DTS-modeling-proposal/)
[http://lombardpress.org/2016/08/09/surfaces-canvases-and-zones/](http://lombardpress.org/2016/08/09/surfaces-canvases-and-zones/)
[http://lombardpress.org/placing-medieval-texts-within-a-critical-corpus/](http://lombardpress.org/placing-medieval-texts-within-a-critical-corpus/)
#### On Information Sharing and Linked Open Data
[http://lombardpress.org/2016/04/16/iiif-webmentions/](http://lombardpress.org/2016/04/16/iiif-webmentions/)
[http://lombardpress.org/2016/08/25/basel-workshop-report/](http://lombardpress.org/2016/08/25/basel-workshop-report/)
#### On Peer Review
[http://lombardpress.org/2016/05/19/the-traveling-imprimatur/](http://lombardpress.org/2016/05/19/the-traveling-imprimatur/)
#### On LombardPress Web
[http://lombardpress.org/bringing-ecodices-and-openn-together/](http://lombardpress.org/bringing-ecodices-and-openn-together/)
[http://lombardpress.org/conceiving-the-digital-critical-apparatus/](http://lombardpress.org/conceiving-the-digital-critical-apparatus/)
[http://lombardpress.org/demo-of-full-integration-of-iiif-images-into-lombardpress/](http://lombardpress.org/demo-of-full-integration-of-iiif-images-into-lombardpress/)