Digital Humanities Day 3 (Notes)

The closing keynote, a rallying cry for a more global perspective (and action road-map), has come to an end. I need to put on a dress and go to the closing banquet, then pack my bag, have a short sleep, and head to the airport around 4:30 AM tomorrow.

I don’t really feel I have the mental energy to consolidate my thoughts about this amazing conference into anything useful, so I am going to once again just paste my barely-looked-over notes below. As often seems to be the case, the morning sessions seemed to get a lot more attention. This is on account of me, and not the presenters.


Anvil Academic – Forging New Traditions

  • @lisapiro @koreybjackson @MoodyFred
  • Governance and funding: many universities and colleges contribute financially and in-kind; all sit on the governing board
  • Liberal arts colleges – born-digital pedagogical materials
  • Digital scholars, analog metrics: young scholars do compelling work, but as they work to advance their careers, everything is measured in an analog fashion. The DH author’s predicament – authors need to do more traditional scholarly work for promotion and tenure, so they will always choose to publish a monograph over a cool DH project.
  • Some day – The Analog Humanities, with DH as the default
  • What do they publish? Stuff that can’t be made into a monograph – data driven projects, multi-modal titles, networked authorship projects, interactive and media-rich educational resources.
  • Why publish it? Scholarly work is produced and consumed digitally, and new forms of scholarship cannot be reduced to monograph form.
  • Need for a publisher in the digital world to fill the vetting and credentialing role publishers have traditionally filled – rigorous peer review and editorial processes.
  • What is publishing? Peer review, editorial services, distribution, impact metrics, imprimatur, cataloging and preservation.  (I would never trust a publisher to preserve!)
  • Criteria – scholarly contribution, rationale for being digital, contribution to the state of the digital art, searching for an appropriate balance between ‘ground-breaking’ content and media use.
  • Platform independence – works displayed in their native environments, marked as Anvil publications, listed in the Anvil catalogue, open licenses, open web platform
  • Short term – no hosting, preservation through Archive-It
  • Long term – prove concept, secure more funding
  • Sustainability – 5 year plan, have 3 years of support now

eBook as Ecosystem of Dialogical Scholarship – Christopher P. Long

  • @cplong – professor of philosophy and classics, dean of undergrad
  • Research question – What is Socratic politics?
  • “I think that with few Athenians, I attempt the political art truly, and I alone now living do political things” – what is “do political things” when you don’t take office?
  • Not institutional power, but turning individuals to questions of justice, the beautiful, the good – erotic in the Platonic sense…something we’re attracted to because it’s elusive
  • The Digital Dialogue – trying to do research out loud, testing new ideas in public at their very early stages, but the written word is too rigid, too easily citable. Digital Dialogue is a podcast – it taught him to listen better and ask better questions
  • A model that combines the virtues of digital and paper scholarship – but it became clear that Platonic writing is a political art: Socrates speaks with interlocutors, Plato writes for readers. Writing is analogous to, but different from, the oral tradition.
  • Phaedrus – readers have to bring the writing to life — Latour – “It is the reader who writes the text” “Written texts irreversibly dialogical”
  • Read Planned Obsolescence
  • Public annotations on his blog – new opportunities to learn from readers, create a collaborative community of readers
  • So – Reading is a deeply political activity

Joint and multi-authored publication patterns in the DH – Julianne Nyhan

  • Used Zotero to extract bibliographical data from Computers and the Humanities, Literary and Linguistic Computing, and the Annals of the Association of American Geographers
  • Computers and the Humanities – single-author papers hold steady over time – joint authorship and three-author papers went up
  • LLC (after Computers and the Humanities ceased) – decline in single-author papers, significant increase in three-author papers
  • Gender in LLC – 73% male authors
  • Decline in single-author papers
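
Out of curiosity, a tally like this could be sketched in a few lines of Python. The records below are invented, standing in for author counts exported from Zotero:

```python
from collections import Counter

# Hypothetical records: (year, number_of_authors) pairs extracted from
# bibliographic data, e.g. a Zotero export. Illustrative values only.
records = [
    (1990, 1), (1990, 1), (1990, 2),
    (2010, 1), (2010, 2), (2010, 3), (2010, 3),
]

def authorship_breakdown(records):
    """Count papers per (year, author-count) bucket."""
    counts = Counter(records)
    by_year = {}
    for (year, n_authors), n_papers in counts.items():
        by_year.setdefault(year, {})[n_authors] = n_papers
    return by_year

breakdown = authorship_breakdown(records)
```

Comparing `breakdown` across years then shows shifts such as the rise of three-author papers.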

Identifying the Real-time Impact of the Digital Humanities using Social Media Measures – Hamed Alhoori

  • How can social media tools support scholarly communities?
  • Can track tweets easily – but what about blogs, Zotero use, etc.?
  • Research questions – how can we get an early indication of research impact?
  • Well-known impact measure – citations – but there are concerns about it; alternatives include views, downloads, bookmarks, comments
  • What about readership? Zotero, CiteULike, Mendeley
  • Can social reference management software be used to predict a ranking of scholarly venues?
  • Citations are higher than readership for older articles, but newer articles have higher readership than citations – a possible predictor of future citations?
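
The “readership predicts citations” idea boils down to comparing two rankings. A toy sketch with invented numbers and a hand-rolled Spearman correlation (this is my illustration, not the speaker’s method):

```python
# Hypothetical article metrics: (citations, readers) per article.
# A Spearman rank correlation between the two orderings hints at whether
# readership tracks (and might anticipate) citation counts.
articles = {
    "A": (120, 40), "B": (80, 55), "C": (15, 60), "D": (5, 70),
}

def ranks(values):
    # Simple ranking, ties broken by position (fine for a sketch).
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(xs, ys):
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

cit = [c for c, _ in articles.values()]
read = [r for _, r in articles.values()]
rho = spearman(cit, read)
```

With these made-up numbers the two rankings are exact opposites, mimicking the old-article/new-article inversion in the notes.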


The Comédie-Française Registers Project

  • Hyperstudio – DH at MIT
  • Three integrated components – archive, faceted browser for searching, interactive data visualization tools
  • A case study in data visualization as part of historian’s research tool kit
  • Comédie-Française – France’s premier theatre troupe…documents previously stored at an archive in Paris, not digitized
  • Larger DH significance – how does machine reading enable new questions at scales previously unimaginable? What is the relationship between quant and qual analysis? (“toggling” between micro and macro). How can data viz be not just a tool for research (not just proving a specific hypothesis), but for exploring data in a less directed way?
  • Archive – 1680–present; interested in daily records of repertory and box office receipts
  • Visualization case studies – parallel-axis graph – interaction between CF and larger events like the death of King Louis XV – allows you to play with scale and elements, customize colour-coding; details of individual threads can be highlighted. Can see gaps – one for the king’s death – the other??
  • Theatre mapping – mapping ticket prices to plays and a theatre diagram – which kinds of tickets were sold at each play? socioeconomic status of the audience
  • New browser tool coming this year! (Chris Dessonville – undergrad at MIT) –
  • MIT Hyperlab are looking at ticket prices for 18th C French plays – can see which shows brought in which audiences #DH2013

ChartEx: Discovering Spatial Descriptions and Relationships in Medieval Charters – Sarah Rees Jones & Helen Petrie

  • Funded under Digging into Data Challenge
  • Working with U of T!  – DEEDS Latin charters of English provenance.
  • Charters – early legal documents – give descriptions of people owning and occupying property.
  • Interested in urban topography.
  • Problem – how to find and organize thousands of charters chronologically but also spatially.
  • ChartEx project – three modules –
    • language processing, training computer to read transactions, identify, interpret, annotate
    • Analyzed documents should be sent to data mining module, which will try to map across collections
    • which will deliver up automatic interpretation to a ‘workbench’ for historian to work with.
    • Have already figured out place names and simple relationships – more complicated relationships still need some work – can they identify people across charters – i.e. is ‘Thomas son of Jeremy’ in 1407 charter also ‘Thomas son of Jeremy’ in 1409 charter?
    • ChartEx Virtual Workbench – we want to create a system that not only allows historians to look at data but lets them actually work with it – do the reasoning, annotation, sharing…impossible with paper
    • Contextual inquiry around user tasks: We need to know more than just the content of the document, need to know what cognitive reasoning tasks people are doing
      • Get people to do their work, video tape it, query them, “why are you looking at that date?” etc. – elicit the process
      • One historian identified a set of shops, another identified trends in witness lists.
      • From this figured out a set of user requirements – they’re searching for documents in collection, interacting with individual documents, then relating information between documents.
      • “all users are fickle”
      • ChartEx takeaway – there is a way of describing historians’ value and skill sets in a standardized way that is clear to non-historians – this project helped Sarah Rees Jones figure this out
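
The “Thomas son of Jeremy” question is essentially record linkage. A crude sketch of the idea (my own illustration, not the ChartEx data-mining module): normalize the name string and treat two charter mentions as candidate matches when the names agree and the dates fall within a window.

```python
import re

def normalise(name):
    """Collapse whitespace and case so spelling variants compare equal."""
    return re.sub(r"\s+", " ", name.strip().lower())

def candidate_match(mention_a, mention_b, max_year_gap=30):
    """mention = (name, year). A crude plausibility test only."""
    (name_a, year_a), (name_b, year_b) = mention_a, mention_b
    return (normalise(name_a) == normalise(name_b)
            and abs(year_a - year_b) <= max_year_gap)

ok = candidate_match(("Thomas son of Jeremy", 1407),
                     ("Thomas  son of jeremy", 1409))
```

Real disambiguation would weigh more evidence (witnesses, properties, neighbours), but this captures the shape of the problem.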

Dyadic Pulsations as a Signature of Sustainability in Correspondence Networks – Frederic Kaplan

  • Patterns in correspondence networks are of great interest in DH; mathematical methods have been used to identify signatures, etc. A crucial variable is the reply time between two exchanges – this helps us see prioritization, etc.
  • Average response time is a general indication of the “health” of correspondence network
  • Sudden increase in response time often means the end of a discussion – usually an abrupt end.
  • Can we predict these events? – dyadic pulsation as a complementary measure to response time
  • ‘pulsation’ = the creation of a new communication dyad – i.e. the first direct communication between two users – ‘mutual pulsation’ means they have both initiated communication. A pulsation only happens when it’s new. A group that continuously integrates new members emits many dyadic pulsations
  • evolution in pulsation rhythms is an earlier predictor of the evolution of group dynamics
  • For dyadic pulsations to be reliable, it is mandatory that they be resistant to spam, which is common in group discussion. What characterizes spam? You don’t answer it. So dyadic pulsation measures are spam-safe
  • A general notion of ‘spamness’ beyond the spam/not-spam duality – the absence of dyadic pulsations characterizes a form of spamness in discussion groups
  • Now trying to build a machine approach to detect early signs of group disintegration – taking inspiration from spiking networks – “Pulsed Neural Networks”
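
The pulsation definition above is easy to operationalize. A minimal sketch (my own, over an invented message log – not Kaplan’s implementation):

```python
# A pulsation is the first direct message between an ordered pair of
# users; a "mutual pulsation" occurs once both directions have appeared.
def pulsations(messages):
    """messages: iterable of (sender, recipient) pairs, in time order."""
    seen = set()
    events = []  # ("pulsation", pair) or ("mutual", pair)
    for sender, recipient in messages:
        pair = (sender, recipient)
        if pair not in seen:
            seen.add(pair)
            events.append(("pulsation", pair))
            if (recipient, sender) in seen:
                events.append(("mutual", frozenset(pair)))
    return events

log = [("a", "b"), ("a", "b"), ("b", "a"), ("c", "a")]
ev = pulsations(log)
```

Note that the repeated `("a", "b")` message emits nothing – a pulsation only fires the first time – which is exactly why the measure ignores spam: spam gets no replies, so it creates no new dyads.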

Using the Social Web to Explore the Online Discourse Surrounding the South and the Civil War – Simon Appleford

  • Clemson University
  • “The social web” – interested in many platforms – twitter, fb, instagram, blogs, microblogs, comments on news articles
  • Long tail of social media – most people in this space talk about strategy (how to get more likes, retweets, etc.) – the “long tail” should be the focus of academic communities: working out strategies to get content out of this noise, developing metrics, making that info and data meaningful
  • Against this backdrop – how to tease out information needed to answer compelling questions? Search tools, theories, discourse, analytics
    • Humanities can inform each of these points and computing can inform how the humanities thinks about them.
      • Problems – Twitter limits a random search to 1,000 tweets, and even then it is primarily based on simple keywords and hashtags – useful for contained events such as TV shows and conferences, but how can you capture as much relevant information as possible? If people forget the hashtag, it’s lost in a search
      • Need to develop ways in which we can move beyond this, need to span computing and humanities disciplines, as well as other areas such as business and communication, to tackle that space
      • TAGS 5.0 – provides search interface, allows you to create a spreadsheet of tweets.
      • How to search “the south”? Too many Souths! How can we filter out noise to create a meaningful data set not defined by a hashtag?
      • Nine months ago Twitter changed its terms of service and API – but the model was helpful.
      • Created “topic profiles” – keywords centred around a topic – e.g. “the South” OR “Southern”, combined with AND/OR/NOT
      • Tool – take data from social networks, upload the results and visualize. Imports CSV, Twitter JSON, and activity streams (preferred – there is a program to convert CSV to an activity stream – more info)
      • How do we aggregate these vast quantities of data? How do we curate it? Who is responsible? The Library of Congress has it all – but hasn’t shared it with researchers because they haven’t figured out these questions
      • Need to partner more broadly!
      • Tool will be released in beta in a few months
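
A topic profile of the kind described is essentially a boolean keyword filter over tweet text rather than a single hashtag. A toy sketch (the profile terms and example tweets are invented for illustration):

```python
# Hypothetical "topic profile": OR terms, required AND terms,
# and NOT terms for filtering out the wrong "Souths".
PROFILE = {
    "any": ["the south", "southern"],          # OR
    "all": ["civil war"],                      # AND
    "none": ["south africa", "south korea"],   # NOT
}

def matches(text, profile=PROFILE):
    t = text.lower()
    return (any(term in t for term in profile["any"])
            and all(term in t for term in profile["all"])
            and not any(term in t for term in profile["none"]))

hits = [s for s in [
    "Lecture on the South and the Civil War memory",
    "Southern food is the best",
    "Civil war in South Korea drama recap",
] if matches(s)]
```

Only the first example survives: the second lacks the required co-term and the third never matches an include term.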

Expanding and Connecting the Annotation Tool ELAN

  • The Language Archive (TLA) – a unit in the Max Planck Institute for Psycholinguistics
  • Stores language-related resources for sharing and preservation, started tool-building unit in 2010.
  • ELAN – EUDICO Linguistic Annotator – tool for manual annotation of multimedia; produces XML files (.eaf); annotations are multi-tiered; available for Windows, Mac OS X and Linux; open source, written (mostly) in Java. Current version = 4.6.1.
  • Main user communities – language documentation, sign language research, gesture research, multimodality research, behavioural studies etc. These uses reflect the focus of local research groups
  • Different modes – annotation mode, synchronization mode, segmentation mode, transcription mode
  • Recently added functionality that allows you to work on multiple documents
  • FLEx – Fieldworks language explorer – maps to ELAN
  • Started working on ways of doing semi-automatic annotation…incorporate existing software? (speech/non-speech, speaker clustering, gesture recognition)
  • Web services – connect the desktop tool with online resources, apply algorithms to text, audio and video resources

Advantages of incorporating web services – use of applications and algorithms without the hassle of installing and configuring them, including tools not available on your platform; work can be distributed. Cons – need an internet connection (a problem for linguistic field research, where audio and video are bandwidth-heavy).
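
The multi-tiered .eaf files ELAN produces are plain XML, so they can be inspected with standard tools. A sketch based on the typical EAF layout (heavily simplified – real files carry headers and much more metadata):

```python
import xml.etree.ElementTree as ET

# A stripped-down .eaf fragment: a TIME_ORDER of time slots plus a tier
# of time-aligned annotations referencing them.
EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="speech">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

root = ET.fromstring(EAF)
# Resolve time-slot ids to millisecond values.
slots = {s.get("TIME_SLOT_ID"): int(s.get("TIME_VALUE"))
         for s in root.iter("TIME_SLOT")}
# Flatten every tier into (tier, start_ms, end_ms, value) tuples.
annotations = []
for tier in root.iter("TIER"):
    for ann in tier.iter("ALIGNABLE_ANNOTATION"):
        annotations.append((tier.get("TIER_ID"),
                            slots[ann.get("TIME_SLOT_REF1")],
                            slots[ann.get("TIME_SLOT_REF2")],
                            ann.findtext("ANNOTATION_VALUE")))
```

This indirection through time slots is what lets multiple tiers share one timeline.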


Open Notebook Humanities – Ryan Shaw and Patrick Golden (UNC Chapel Hill)

  • Goal: “to provide a medium by which much valuable information may become a sort of common property among those who can appreciate and use” – 1849
    • Not full articles, just short research notes, or questions they were struggling with
  • Common property – Zotero, crowdsourcing, open access, annotations, collaborative authoring
  • This project is focused on notes, after having worked on a number of document-editing projects
  • Documentary editing case study – the Emma Goldman Papers documentary editing project (Goldman: late 19th/early 20th-century Russian-born anarchist)
  • Editors prepare collections of documents…find every paper by or about someone, then combine them in a microfilm collection. Select the most evocative documents and provide context for them – produce a printed volume with footnotes, chronologies, explanations of people, places, events, images, etc.
  • Workflow: gather, contextualize selected items, publish final product, repeat as funding allows
  • Final product looks great, hides the messy, chaotic work that goes into producing such a thing.
  • Problems – published volumes & necessary work are expensive, funding not always accessible
  • Lack of space for all footnotes – much research gets glossed over or excluded
    • Means less fact checking, more dead ends, tangential biographical details
  • Example of a research question: he was asked whether any other Lenin siblings were imprisoned besides his brother – he did a bunch of research and found that no, there weren’t – but there’s no room for that kind of info in a book
  • Need an interface for taking notes like this. So…
  • Interface for adding a note/question – “Were any other siblings jailed?” – question status (open/closed/hibernating), related topics, assigned users. Can choose ‘add citation’, use Zotero metadata, and add a note (“this letter from Lenin indicates that no one else was in trouble”)
  • changes – blobs of free-text have become structured blocks, explicit linkable entities, open access instead of filing cabinets
  • Connections linking topics are free, indexed for anyone to see – especially useful for new research assistants; standardized records that can be revisited over the course of a project; evidence of intense scholarship
  • Data model – notes have sections, which may be associated with a document (a Zotero server stores the metadata, annotations, scans, transcripts). Notes and documents are associated with topics – connect assertions to topics
  • Most difficult – modeling notes
  • Built on Django, Google Refine, Haystack for full-text searching, Mozilla Persona for ID management
  • Next steps – better sorting, filtering and aggregating; improved naming control; creating temporal, geospatial and relational visualizations
  • Import and export capabilities – JSON through their API; no support for converting to another system
  • Does it scale well to longer video? Yes, but need a powerful computer.
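
The note/section/topic data model described above could be approximated with plain Python structures. The field names here are my guesses for illustration, not the project’s actual schema:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Document:
    zotero_key: str            # metadata itself lives on a Zotero server
    topics: List[str] = field(default_factory=list)

@dataclass
class NoteSection:
    text: str
    document: Optional[Document] = None  # a section may cite a document

@dataclass
class Note:
    title: str
    status: str                # "open" / "closed" / "hibernating"
    sections: List[NoteSection] = field(default_factory=list)
    topics: List[str] = field(default_factory=list)

# The Lenin-siblings example from the notes, as a structured record.
letter = Document(zotero_key="ABC123", topics=["Lenin family"])
note = Note(
    title="Were any other siblings jailed?",
    status="closed",
    sections=[NoteSection("Letter suggests no one else was in trouble.",
                          document=letter)],
    topics=["Lenin family"],
)
```

The shared topic strings are what turn free-text blobs into linkable, indexable entities.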
