On the 24th of April, Carleton University will host “Data Day”, an event celebrating strategic development in the data sciences. There will be a poster competition for student researchers in all categories (including Big Data, Data Analytics, Social Sentiment Analytics, and Business Analytics). Since my project falls under the category of big data, Dr Graham alerted me to the competition and told me what I needed to do to enter.
To enter, I first had to write an abstract describing what my project entails and how I am going about completing it. You will find my abstract below, where you can read about my project to date and what I hope to accomplish with it.
Data Mining THATCamp – Hollis Peirce, Graham Undergraduate Digital History Research Fellow,
Big data tools are not just for ‘big data’. In the humanities, they can provide a macroscopic view of patterns in materials that are otherwise difficult to analyze computationally. In this poster, I present the initial results of an analysis of the conversations at ‘THATCamp Accessibility 2012’, a conference held at Carleton in October 2012, using ‘overviewproject’, a system developed by data journalists for finding topics in data using term frequency-inverse document frequency methods.
THATCamp Accessibility 2012 was an ‘unconference’ (a series of free-form discussions) that explored issues of digital and physical access to humanistic research and materials. Sessions explored how digital tools help accessibility, designing accessible courses, digital museums and libraries, augmented reality, game-based learning, and other ideas. The sessions were recorded for future analysis. This project takes one of these conversations, on accessible museums and libraries, and analyzes it to identify underlying hidden themes and patterns of discourse.
Oral history normally transcribes the complete verbatim speech of a session; with these digital mining tools, we instead created an annotated bag-of-words by timestamp and analyzed this list via the overviewproject.org interface. While still in its early days, this project promises to accelerate the analytic step in oral history practice, removing one element of the subjectivity of oral history in favour of a dialectic between so-called ‘distant’ and ‘close’ readings of those transcripts.
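To give a rough idea of what a bag-of-words by timestamp and a term frequency-inverse document frequency score look like, here is a minimal Python sketch. It is only an illustration: the timestamps and words in `segments` are made up (they are not taken from the THATCamp recording), the function names are hypothetical, and the weighting is the plain textbook formula (term frequency times the log of inverse document frequency), which may differ from what overviewproject actually computes.

```python
import math
from collections import Counter

# Hypothetical timestamped notes; NOT from the actual THATCamp recording.
segments = {
    "00:00": "museum access ramps museum doors",
    "05:00": "library catalogue access screen reader",
    "10:00": "museum exhibit audio description access",
}

def bag_of_words(text):
    """Count how often each word occurs, ignoring word order."""
    return Counter(text.lower().split())

def tf_idf(segments):
    """Score each word in each timestamped segment with textbook TF-IDF."""
    bags = {ts: bag_of_words(text) for ts, text in segments.items()}
    n_docs = len(bags)

    # Document frequency: in how many segments does each word appear?
    doc_freq = Counter()
    for bag in bags.values():
        doc_freq.update(bag.keys())

    scores = {}
    for ts, bag in bags.items():
        total = sum(bag.values())
        scores[ts] = {
            # (share of the segment) * log(segments / segments containing word)
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in bag.items()
        }
    return scores

scores = tf_idf(segments)
for ts, words in scores.items():
    # Show the highest-scoring word for each timestamp.
    print(ts, max(words, key=words.get))
```

A word that turns up in every segment (here, "access") scores zero, while a word concentrated in one segment scores highest. That is why this kind of weighting can surface what is distinctive about each stretch of a conversation.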
Until Data Day, I will continue transcribing the conversation to improve the results.