Connecting an Open and Decentralized Text Corpus with the Scholastic Commentaries and Texts Archive


Jeffrey C. Witt (Loyola University Maryland) | @jeffreycwitt


Münster, Germany, October 18, 2018

Slide Deck: http://jeffreycwitt.com/slides/2018-10-18-munster2

https://creativecommons.org/licenses/by-nc-sa/4.0/

* We want this data to be creatable in a decentralized fashion.
  * That is, there is no requirement that data be made in a particular program or on a particular platform. It should be possible to create it anywhere.
* We want this data to be published (made accessible) in a decentralized fashion.
  * That is, to be useful, data does not have to live in a particular place, and one does not have to ask permission to make it available or to access and use it.
* Despite this decentralization, we want this data to be able to be aggregated and indexed so that we can construct connections between texts and promote discovery across the corpus.
  * In sum, we want this decentralized data to be enhanced by the results of aggregation.
* And we want this enhanced data to be made available according to general standards and APIs that existing presentation clients can understand and consume.
  * That is, we want to make enhanced data available in ways that client applications understand, so as to dramatically reduce the cost of presenting this data digitally and of maintaining those presentations.
## Data Creation
## Data Standards
* Text Edition Schema: LombardPress-Schema [https://github.com/lombardpress/lombardpress-schema](https://github.com/lombardpress/lombardpress-schema)
* Expression Description Files (EDFs) [https://github.com/scta/edf-schema](https://github.com/scta/edf-schema)
* Transcription Description Files (TDFs) [https://github.com/scta/tdf-schema](https://github.com/scta/tdf-schema)
## Data Aggregation
## Aggregating from distributed data sources

[https://github.com/scta/scta-rdf/tree/master/data](https://github.com/scta/scta-rdf/tree/master/data)

* Anyone can run an SCTA aggregator and data store.
* Data integrity is maintained by community standards rather than central control.
* Other communities can re-use the aggregator for different corpora, simply by preparing their data appropriately and then pointing the RDF Aggregator to new data sources.
### RDFS Schema (Defining a Linked Data Graph)

#### Transition from text-as-document to text-as-network paradigm

[http://scta.github.io/scta-rdf-schema/](http://scta.github.io/scta-rdf-schema/)
### Text-Network as a Matrix of Hierarchies

* OHCO, Ordered Hierarchy of Content Objects
* FRBR, Functional Requirements for Bibliographic Records
* StructureType
* ExpressionType

# Material Hierarchies

* Parts
* FRBR, Functional Requirements for Bibliographic Records
* StructureType
### APIs and Presentation Clients

* SCTA Core
* SPARQL Endpoint
* Lbp.rb Library
* IIIF API
* CSV API
* Simple Presentation API
* OAI-PMH
* Peer Review as an API
### SCTA Linked Data Core API

    $ curl -iH "Accept: text/plain" https://scta.info/resource/lectio1
    $ curl -iH "Accept: text/turtle" https://scta.info/resource/lectio1
    $ curl -iH "Accept: application/rdf+xml" https://scta.info/resource/lectio1
    $ curl -iH "Accept: application/json" https://scta.info/resource/lectio1
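
The curl calls above rely on HTTP content negotiation: the same resource URL is requested each time, and only the `Accept` header changes. A minimal Python sketch of that idea follows, using only the standard library. The helper name `build_request` and the `FORMATS` keys are illustrative assumptions, not part of any SCTA library. The request is only constructed here; passing it to `urllib.request.urlopen` would actually send it.

```python
from urllib.request import Request

# Media types copied from the curl examples; the dictionary keys are
# hypothetical labels chosen for this sketch.
FORMATS = {
    "plain": "text/plain",
    "turtle": "text/turtle",
    "rdfxml": "application/rdf+xml",
    "json": "application/json",
}


def build_request(resource_url: str, fmt: str) -> Request:
    """Build a GET request for resource_url asking for one serialization."""
    return Request(resource_url, headers={"Accept": FORMATS[fmt]})


req = build_request("https://scta.info/resource/lectio1", "turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Keeping the URL fixed and varying only the header is what lets one identifier serve both human-facing and machine-facing clients.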
## SCTA SPARQL Endpoint

```
SELECT ?s ?p ?o
WHERE {
	?s ?p ?o
}
LIMIT 10
```

[http://sparql.scta.info/ds/query?query=SELECT+%3Fs+%3Fp+%3Fo%0D%0AWHERE+%7B%0D%0A%09%3Fs+%3Fp+%3Fo%0D%0A%7D%0D%0ALIMIT+10%0D%0A&output=text&stylesheet=](http://sparql.scta.info/ds/query?query=SELECT+%3Fs+%3Fp+%3Fo%0D%0AWHERE+%7B%0D%0A%09%3Fs+%3Fp+%3Fo%0D%0A%7D%0D%0ALIMIT+10%0D%0A&output=text&stylesheet=)
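
The link above is simply the query percent-encoded into a `query` URL parameter. As a rough sketch of how a client might build such a URL, the following uses only Python's standard library. The function name `sparql_url` is hypothetical, and the `output` parameter simply mirrors the one visible in the link; which other values the endpoint accepts is not stated in this deck.

```python
from urllib.parse import urlencode

# Endpoint taken from the link above.
ENDPOINT = "http://sparql.scta.info/ds/query"

QUERY = """SELECT ?s ?p ?o
WHERE {
  ?s ?p ?o
}
LIMIT 10"""


def sparql_url(query: str, output: str = "text") -> str:
    """Return a GET URL that carries the SPARQL query as a parameter."""
    return f"{ENDPOINT}?{urlencode({'query': query, 'output': output})}"


url = sparql_url(QUERY)
print(url)
```

Fetching the resulting URL with any HTTP client should return the first ten triples in the store, assuming the endpoint is reachable.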
# Lbp.rb
### Examples of Clients Using SPARQL Endpoint

#### RCS

[https://rcs.philsem.unibas.ch/](https://rcs.philsem.unibas.ch/)

#### LombardPress-Web Presentation Client

[http://scta.lombardpress.org](http://scta.lombardpress.org)

#### Ad fontes

[http://lombardpress.org/adfontes](http://lombardpress.org/adfontes)
# IIIF API

[http://iiif.io](http://iiif.io)

[https://scta.info/iiif/scta/collection](https://scta.info/iiif/scta/collection)
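
A IIIF collection is a JSON document that points to manifests and sub-collections, which is what allows generic viewers to walk the corpus. The sketch below lists the members of such a document. It assumes the IIIF Presentation API 2.x layout (`collections`, `manifests`, `members` keys with `@id`, `@type`, `label`); whether the SCTA collection uses exactly this layout is an assumption, and the sample data and the function name `list_members` are invented for illustration.

```python
# Hypothetical sample shaped like a IIIF Presentation 2.x collection.
SAMPLE_COLLECTION = {
    "@id": "https://example.org/iiif/collection/top",
    "@type": "sc:Collection",
    "label": "Example top-level collection",
    "collections": [
        {
            "@id": "https://example.org/iiif/collection/a",
            "@type": "sc:Collection",
            "label": "Sub-collection A",
        }
    ],
    "manifests": [
        {
            "@id": "https://example.org/iiif/m1/manifest",
            "@type": "sc:Manifest",
            "label": "Manifest 1",
        }
    ],
}


def list_members(collection: dict) -> list:
    """Return (type, id, label) for every child the collection lists."""
    found = []
    for key in ("collections", "manifests", "members"):
        for item in collection.get(key, []):
            found.append((item.get("@type"), item.get("@id"), item.get("label")))
    return found


for member in list_members(SAMPLE_COLLECTION):
    print(member)
```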
### Example of Client Using IIIF API #### Mirador
### Linked Data Notifications and IIIF

Example manifest from the BSB: [https://api.digitale-sammlungen.de/iiif/presentation/v2/bsb00103424/manifest](https://api.digitale-sammlungen.de/iiif/presentation/v2/bsb00103424/manifest)

Example SCTA transcription via IIIF API: [https://scta.info/iiif/hiltalingencommentary/clm26711/layer/transcription](https://scta.info/iiif/hiltalingencommentary/clm26711/layer/transcription)

### More Connections with Linked Data Notifications

[http://lombardpress.org/2017/01/24/linking-research/](http://lombardpress.org/2017/01/24/linking-research/)
# CSV API

[https://scta.info/csv/scta](https://scta.info/csv/scta)

# Simple Presentation API

[https://scta.info/api/presentation/1.0/plaoulcommentary](https://scta.info/api/presentation/1.0/plaoulcommentary)

### Example of Client Using Simple Presentation API

#### Pellego

[http://lombardpress.org/pellego/](http://lombardpress.org/pellego/)

[http://jeffreycwitt.com/pellego-demo/](http://jeffreycwitt.com/pellego-demo/)
# OAI-PMH API

[https://scta.info/oai](https://scta.info/oai)

[http://oaipmh.ekt.gr/#ListRecords](http://oaipmh.ekt.gr/#ListRecords)
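
OAI-PMH harvesters talk to a repository by appending a `verb` parameter, plus any verb-specific arguments, to a single base URL. A minimal sketch of building such requests is shown here. It assumes the SCTA endpoint follows the standard protocol, in which `oai_dc` is the metadata format every repository must support. The helper name `oai_request_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL taken from the link above.
BASE_URL = "https://scta.info/oai"


def oai_request_url(verb: str, **params: str) -> str:
    """Return an OAI-PMH request URL for the given verb and arguments."""
    return f"{BASE_URL}?{urlencode({'verb': verb, **params})}"


print(oai_request_url("Identify"))
print(oai_request_url("ListRecords", metadataPrefix="oai_dc"))
```

Because the protocol is this uniform, an aggregator such as a library catalogue can harvest the corpus without any SCTA-specific code.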
### Example of Client Using OAI-PMH API

#### WorldCat

[http://www.worldcat.org/search?q=on:DGCNT+http://scta.info/oai+SCTA+MDSCT&qt=results_page](http://www.worldcat.org/search?q=on:DGCNT+http://scta.info/oai+SCTA+MDSCT&qt=results_page)

### Peer Review as an API

[https://dll-review-registry.digitallatin.org/](https://dll-review-registry.digitallatin.org/)

[https://dll-review-registry.digitallatin.org/docs/index.html](https://dll-review-registry.digitallatin.org/docs/index.html)