Computational historiography
On 21, Dec 2015 | In Picturebooks | By Chris Vitale
Mimno, David. “Computational historiography: Data mining in a century of classics journals.” Journal on Computing and Cultural Heritage (JOCCH) 5(1), 2012.
Referrer: Scott Dexter
Categories: digital humanities, distant reading, data mining, computer science, methodology
Annotation:
David Mimno follows in Franco Moretti’s footsteps by data mining a massive archive of classics journals. Distant reading is Mimno’s preferred methodology because he regards close reading of such a wide array of documents as unrealistic. The paper discusses computational tools that allow for statistical analysis of the corpus. The work is explicitly complementary to traditional scholarship. The collection Mimno works with has been OCR-ed from over twenty classical philology and archaeology journals. Outlining the tools used in statistically driven text mining, Mimno discusses tokenization, removal of stopwords, word distance and divergence, and topic modeling. He gives algorithmic representations of these methods along with an introductory discussion of how they work and how they are used. Finally, Mimno presents his findings in the form of graphics, topic models, and observations.
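For readers unfamiliar with the preprocessing steps named in the annotation, here is a minimal Python sketch of tokenization, stopword removal, and a divergence measure between two word distributions. It is not Mimno's code. The stopword list is a toy one, the function names are hypothetical, and the choice of Jensen-Shannon divergence as the measure is an assumption made for illustration.

```python
import math
import re
from collections import Counter

# Toy stopword list, assumed for illustration only.
STOPWORDS = {"the", "of", "and", "a", "in", "to", "is"}


def tokenize(text):
    """Lowercase the text and split it into runs of ASCII letters.

    A real classics corpus would also need to handle Greek and
    accented characters, which this simple pattern drops.
    """
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop very common words that carry little topical signal."""
    return [t for t in tokens if t not in STOPWORDS]


def word_distribution(tokens):
    """Turn a token list into relative word frequencies."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions.

    Returns 0.0 for identical distributions and 1.0 for distributions
    that share no words.
    """
    vocab = set(p) | set(q)
    mean = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl_to_mean(a):
        return sum(a[w] * math.log2(a[w] / mean[w]) for w in a if a[w] > 0)

    return 0.5 * kl_to_mean(p) + 0.5 * kl_to_mean(q)
```

Two journals, or two decades of one journal, could then be compared by calling `jensen_shannon` on their word distributions, with higher values indicating more distinct vocabularies.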