Mining and Counting Files

In the most recent tutorial lesson from the Programming Historian (http://programminghistorian.org/) we learned all about how to mine and count through files using the Bash Command Line. In a dramatic turn of events over the past two weeks I have been gaining more confidence when going through these tutorials. The main reason for this is that I have been moving cautiously in order to ensure that I do not skip over crucial steps (and I have learned that when using the command line EVERY step is crucial).

In addition, I have stuck with it so that certain basic steps, such as navigating through the computer on the command line, have come to feel more and more natural. More importantly though, I have taken to writing down every step that I take. This has helped immensely as I have found that it forces me to think outside the box. What is meant by this is that I feel as though I am talking to myself when doing so, which allows me to more easily see where I am making mistakes.

Taking down notes helped me greatly when it came to getting through my last tutorial. This tutorial was all about learning how to instruct the computer to go through specified files and count certain things, such as number of words, or mine through it and tell you how many times certain words or numbers came up. The lesson also included instructions on how to create a subdirectory and how to move your results into that subdirectory. To get a better sense of what is meant by this please take a gander at my notes and let me know if there is anything that I can be doing more efficiently.

Digital History – Research Data with Unix

- the unix shell gives you access to a range of different commands that help you mine and count through research data

- the options for counting and mining data do depend, though, on the amount of metadata or file names given to you

- in order to get the most out of the unix shell it is important to remember to take the time to structure your filing system.

- downloaded the files to proghist-text successfully and am now about to open them in the command line

- Note: CSV files are those in which the units of data (or cells) are separated by commas (comma-separated-values) and TSV files are those in which they are separated by tabs. Both can be read in simple text editors or in spreadsheet programs such as Libre Office Calc or Microsoft Excel.

- to count the words in a file enter the command: wc -w “name of file” (worked correctly)

- if you want to know the number of lines instead of an actual word count, type wc -l “name of file”

- in addition if you want to know a character count enter command: wc -m “name of file”

- ALL OF THESE COMMANDS AND FLAGS ARE CASE SENSITIVE, SO TYPE THEM IN LOWER CASE

- the most frequent and useful use of the wc command for digital historians is to compare and contrast sizes of a source in digital format.

- wc can also be used with wildcards like *, which is an even easier way to compare multiple sources of research data.

- for instance wc -l 2014-01-31_JA_a*.tsv or wc -l “file name”_”file name”*.tsv

- REMEMBER THAT IT IS A SMALL “L” NOT A “1”

- if you wish to get the data put in a new file rather than just appearing in the terminal screen use the >

- for instance wc -l “file name”_”file name”*.tsv > results/”file name”_”file name”_wc.txt

- this will send the results to a new file in a subdirectory called results (the results subdirectory has to exist first, so create it with mkdir results if it is not there yet)

- As well as counting files, the unix shell can mine through files using the grep command

- For instance you can enter grep “string, or character clusters” (in this case 1999) *.tsv so: grep 1999 *.tsv

- If you add -c to the command it prints how many lines in each file contain the given character cluster or string, rather than printing the lines themselves. In this case grep -c 1999 *.tsv

- Just like earlier you can export this to a brand new file in the results subdirectory in this case though it would look like grep -c 1999 2014-01-31_JA_*.tsv > results/2014-01-31_JA_1999.txt

- It does not need to mine for numbers alone as it can also mine for words

- To do this you simply need to put the word that you are mining for after the flag -c

- So if you were looking for the word “revolution” it would look like this: grep -c revolution 2014-01-31_JA_america.tsv 2014-02-02_JA_britain.tsv

- I tried this and did not succeed, BUT i realized that it didn’t work because I didn’t get the file name correct! THIS IS CLEARLY IMPORTANT

- I kept getting the no such file or directory even with the correct file name so i am trying to go back a file, perhaps i am not in the correct directory

- IT WORKED!!! I was just in the wrong directory

- You can add the “i” flag after the “c” flag to run the query again and this time get results that are case insensitive, so for example -ci revolution will count both “revolution” and “Revolution”. THIS WORKED!

- You can also send these numbers into another file, as in the earlier example.

- grep can also create subsets of tabulated data.

- for instance grep -i revolution 2014-01-31_JA_america.tsv 2014-02-02_JA_britain.tsv > 2016-02-12_JA_america_britain_i_revolution.tsv (this worked just fine once i actually included all the information)

- Am going to skip the rm step because i am nervous as to what i will erase…

- continuing on though i am adding on the -v on to the command to exclude certain data elements

- you can also save the output under a different file name or extension using > (which is a redirection operator rather than a flag)
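For anyone who wants to try the counting steps without touching their own research data, here is a minimal sketch of the wc commands from my notes. The file names and contents are made-up stand-ins (not the tutorial's data), and everything happens in a temporary directory.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two tiny stand-in files (hypothetical names and contents, not the tutorial data).
printf 'id\ttitle\n1\tThe American Revolution\n2\tColonial trade\n' > sample_JA_america.tsv
printf 'id\ttitle\n1\tIndustrial revolution\n' > sample_JA_africa.tsv

wc -w sample_JA_america.tsv   # word count
wc -l sample_JA_america.tsv   # line count (lower-case L, not the digit 1)
wc -m sample_JA_america.tsv   # character count

# The * wildcard matches both files, so wc reports each file plus a total.
wc -l sample_JA_a*.tsv

# The results subdirectory has to exist before > can write into it.
mkdir -p results
wc -l sample_JA_a*.tsv > results/sample_JA_a_wc.txt
cat results/sample_JA_a_wc.txt
```

The last two commands show the same counts twice: once saved into results/ and once printed back with cat, which is a quick way to check that the redirect did what was expected.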

Summary

Within the Unix shell you can now:

- use the wc command with the flags -w and -l to count the words and lines in a file or a series of files.

- use the redirector and structure > subdirectory/filename to save results into a subdirectory.

- use the grep command to search for instances of a string.

- use with grep the -c flag to count instances of a string, the -i flag to return a case insensitive search for a string, and the -v flag to exclude a string from the results.

- combine these commands and flags to build complex queries in a way that suggests the potential for using the Unix shell to count and mine your research data and research projects.
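To make the grep part of the summary concrete, here is a rough sketch using made-up sample files (the names and contents are my own assumptions, not the tutorial's data). It runs in a temporary directory, so it is safe to experiment with.

```shell
# Throwaway directory with two small, invented TSV files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '1\tThe American Revolution\t1999\n2\tColonial trade\t2001\n3\ta quiet revolution\t1999\n' > sample_JA_america.tsv
printf '1\tIndustrial revolution\t1999\n2\tRailways\t1850\n' > sample_JA_britain.tsv

grep 1999 *.tsv             # print every line that contains 1999
grep -c 1999 *.tsv          # count the matching lines in each file
grep -c revolution *.tsv    # case sensitive: misses "Revolution"
grep -ci revolution *.tsv   # case insensitive: counts both spellings

# Save subsets into a results subdirectory (it must exist first).
mkdir -p results
grep -i revolution sample_JA_america.tsv sample_JA_britain.tsv > results/sample_JA_i_revolution.tsv
grep -iv revolution *.tsv > results/sample_JA_iv_revolution.tsv   # -v keeps the lines that do NOT match
```

One thing worth knowing: when grep searches more than one file, it puts the file name and a colon in front of each line it prints, so the saved subsets will carry that prefix too.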

Outline of Final Project

For my final project for Digital History (#hist5702w) I am building a game on Twine that takes themes from my honours thesis and puts them into a real world situation. The real world situation that I am choosing to work with is my own. So far it is a very basic concept in my head but I am hoping that it will develop into a game that will allow the player to gain a different perspective on how an individual with a physical disability gets through a post-secondary career.

To accomplish this goal I will create different options for the player/student to choose that will affect both the outcomes of their post-secondary career and their health. These options are based on my own experiences that I have gone through here in my career at Carleton University. Different options will include what format of textbook the student uses to whether or not they take notes in class themselves or have a note taker in class.

Again, this game will be created on Twine. At the present moment I have little experience with the program. For instance, I know how to do basic things like create new links, but I am hoping to work through the story and then perhaps work on background and other more advanced features later. For now though I will just continue on until the actual story is complete.

Recognizing & Solving Problems

Over this semester I have been learning quite a few new things. First, I have learned how to use new digital tools that will, although it may not seem like it now, help me complete future research more efficiently. Second, I have learned how to put my incomplete work online (as discussed in my previous post). Finally though, I have learned that I don’t need to suffer on my own. My mind was blown by this idea. “What?!?! Dr Graham you are saying that we can share our issues with other people and ask for help?” No word of a lie that was a hard concept to wrap my head around.

Therefore, following along with these lessons, I am moving forward and using them to help solve the problems I am having. For example, I am writing down every single thing I do for each tutorial in Notational Velocity, one of the many new tools that I have been turned on to and whose usefulness I am slowly learning more about. I am also posting all my notes on this blog. Below you can see the notes that I took yesterday (February 9, 2016) when I attempted the Command Line Tutorial for a second time.

When doing this tutorial for a second time I was able to complete it successfully. A couple of times I thought I had done something wrong, but I was later assured that I had not. Check out the notes and let me know if you have any suggestions.

Digital History – Command Line Tutorial Notes 2

- typed in pwd command to orient myself then hit the ls command to get a listing of the files and directories within my current location which is /users/hollispeirce1. These are:

Applications   Downloads   Movies   Public   projects   Desktop   Dropbox   Music   Sites   python   Documents   Library   Pictures   mallet-2.0.7

- Flags: these are additions to a command that provide the computer with a bit more guidance with what sort of output or manipulation that you want

- Playing around with changing files and directories but for some reason I still don’t understand how to move into files that have two words or more in the title. Tried _ , – , and . .

- Figured out how to do it! to get into files with two or more word titles i just have to use “quotation marks”

- Succeeded in using the -l flag to get more information on the main files and directories

- Adding an h to the -l flag (-lh) tells the computer to display the sizes of the files in a shorter, human-readable format (for example in kilobytes or megabytes rather than a raw number of bytes)

- successfully moved straight to mallet by typing: cd /users/hollispeirce1/mallet-2.0.7

- i also was able to open mallet by typing: open . once i was in mallet and the window opened up

- created a new directory on the desktop called ProgHist-Text by entering the command: mkdir ProgHist-Text

- can now move in and out of it as desired and successfully moved into it using the auto-complete with the tab button remember though auto-complete is case sensitive

- figured out how to read a file on the command line by typing the command: cat name-of-text.txt

- When I hit the up arrow it cycles through the most recent commands and the down arrow goes through the commands in the other direction

- successfully duplicated the file by using the command: cp name-of-text.txt name-of-text2.txt

- and moved it into a smaller one with: cat name-of-text

- to open vim and edit a txt file in terminal enter the command: vim name-of-text.txt

- to make an edit press the a key, which allows you to edit the text, and press escape to go back to reading

- to save anything in vim press escape, then type :w and press enter

- to leave vim press escape, then type :q and press enter

- you can also combine these two like all other commands BUT WATCH OUT AS YOU CAN QUIT WITHOUT SAVING, SO TO SAVE AND QUIT TOGETHER ENTER :wq

- create a back up before moving a file by entering cp file-name.txt file-name-backup.txt
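Here is a short sketch that strings together the navigation and file steps from these notes (everything except vim, which is interactive). The directory called "Two Words" and the file contents are invented for the example, and it all runs inside a temporary directory rather than on the desktop.

```shell
# Start somewhere safe and confirm where we are.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
pwd

# One directory with a plain name, one with a space in its name.
mkdir ProgHist-Text "Two Words"
ls -lh                        # long listing with human-readable sizes

# Quotation marks keep the space inside a single name.
cd "Two Words" || exit 1
cd ..                         # go back up one level

cd ProgHist-Text || exit 1
printf 'first line\nsecond line\n' > name-of-text.txt
cat name-of-text.txt          # read the file in the terminal

# Make a backup copy before moving or editing anything.
cp name-of-text.txt name-of-text-backup.txt
ls -lh
```

Making the backup with cp before any edit means there is always an untouched copy to fall back on if something goes wrong.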

Posting Notes Online

As mentioned in the post “Learning To Share Your Work”, I have learned from Dr Graham that there is a great benefit to posting your finished work online. However, it is also being pointed out in his class now that there is a great benefit to posting your notes online as well. This way you can put your frustrations out there and allow your fellow academics to have the opportunity to help you. Perhaps someone in the community has had similar issues as you and has figured out a solution to that very same problem. I must admit I was a little nervous about posting my finished work online let alone my gibberish notes that I take when working. This is even more nerve-racking as it would seem to me that my notes would only make sense to me.

I must admit though that the concept does seem to make sense. So here goes nothing! These notes are what I wrote down thus far when completing the Programming Historian’s tutorial on how to use the Bash Command Line.

Digital History – Command Line Tutorial

First command worked fine, pwd brought up: /Users/hollispeirce
– Had trouble with the next 1s command
– for some reason it said “command not found”
– No matter what I do to get it to tell me the cd desktop command it says permission denied
– So apparently I was hitting the wrong thing it was “ls” not “1s”
– Learning the hard way that command line only handles one step at a time
– Entered “ls” again and it for some reason gave me a huge list of where i was
– now for some reason my print working directory (pwd) has changed
– Figured out to do absolutely everything separately and it is fairly easy to navigate around
– Following instruction implicitly and being sure to check where I am by typing “ls” to keep track of where I am is important
– Creating new files worked just fine now to try copying files
– A little confused by copying files as it has told me no such file or directory
– seem to have figured it out but not sure how i did it… ASK ABOUT THIS
– if you delete an item in the command line it is gone for good unlike simply moving an item to the trash bin
– deleted a file successfully but now need to look at deleting directories
– for some reason it keeps telling me “-rm: command not found” after typing “-rm -rf anotherdir/”
– am so confused by how to download a developer package.
– installed brew but forgot to press Return to complete the download
– so i tried re entering the download but it keeps saying “400 bad request” so i am reverting back a few steps and figuring out python
– saved python snippet to digital history folder for now as i cannot locate python
– managed to download the most up to date python from https://www.python.org/
– have successfully created a new directory to save python files into
– and i have successfully saved the snippet inside of it
– Learning thanks to the Bash tutorial what a “flag” is when dealing with the command line
– * flag makes the command line display the directory as a list of text files
– Tried to list multiple files and exclude others with the command: ls *-Scan1.jpeg , Scan2.jpeg , Scan3.jpeg but it failed.
– Am going to try to solve problem by moving commas around
– Didn’t work. THIS IS SOMETHING TO ASK ABOUT WHEN THIS TUTORIAL COMES UP IN CLASS
– Not understanding the difference between the basic ls command and adding -1 or also h along with it as for me it just displays the directories in a list without the additional information it is supposed to
– IMPORTANT NOTE!!!!! COMMAND cd — WILL BRING YOU RIGHT BACK TO THE STARTING POINT ALMOST LIKE A RESET BUTTON
– Having trouble moving up and down through my file systems with cd whatever so i will have to investigate this further
– I tried jumping directly to my desired directory or file instead by typing in /users/hollispeirce1$/whatever but Terminal told me the same thing as it did with the other strategy: no such file or directory
– tried using the command: open whatever file . but it stated that that was not a line
– so i also tried another file with a one word title and it worked just fine so it may be that i am not writing the other titles correctly
– tried another file with a one word title and it worked just fine so that is definitely the problem
– using the tab button after writing half of the file name will prompt it to attempt auto complete, using the subdirectories or files in the current directory
– THIS IS CASE SENSITIVE
– Managed to download and save War and Peace from the gutenberg project website but for some reason it would not open up when i followed the instructions as directed by the tutorial INVESTIGATE THIS FURTHER
– did not get anywhere because of this with editing a file in the command line because of the above problems
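Looking back at these notes, a few of the stumbling blocks can probably be untangled as in the sketch below. This is my best guess at what was going wrong rather than a definitive answer, and the file names are invented for the example (only anotherdir comes from the notes above).

```shell
# Throwaway directory with a few empty stand-in files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch trip-Scan1.jpeg trip-Scan2.jpeg trip-Scan3.jpeg notes.txt
mkdir anotherdir

# Arguments are separated by spaces, never commas.
ls trip-Scan1.jpeg trip-Scan2.jpeg

# A wildcard can stand in for the part of the name that varies;
# [12] matches either 1 or 2, so Scan3 is left out.
ls *-Scan[12].jpeg

ls -1    # digit one: names only, one per line
ls -l    # lower-case L: long listing with sizes and dates

# The command itself is rm; the hyphen belongs only to the flags.
rm -r anotherdir
```

The difference between ls -1 and ls -l would explain why the extra information never showed up, and dropping the hyphen in front of rm is what gets rid of the “command not found” message.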

Let me know if any of that makes sense, or if you know how to solve any of my problems!

Under the Wire!

After a great deal of stress I can safely say that my graduate school application has been filed. Over the majority of late November, early December and the holidays I slowly gathered my materials and sorted through this arduous process. While it was stressful it was very helpful to have the full support of the History Department.

It is not surprising the amount of forms and papers that must be submitted during the process. What is surprising though is how complicated submitting the actual application is. At first glance it appears as though you just get your papers ready and then upload them all in one place. Then again, it could have just been my naive brain tricking me (yet again). Nevertheless, with a little over a week before the deadline I found myself in an absolute panic over learning where to actually upload the files and find out where I actually send the application fee. Again though, thanks to the great help of the department I was able to find answers.

In addition I met with the department’s current Graduate Studies Chair, Dr Lipsett-Rivera and Dr Graham to discuss my application. During this meeting I heard some very promising news that my lower grades from my earlier years (which were caused by lack of accommodation) will not hurt me as much so long as there is a general upward trend in my later years. This of course is no promise that I will get in to the program, but it was a relief to know that it is understood why those grades are where they are. Another interesting thing that came out of that meeting was a discussion over what possible TA roles I could see myself fulfilling.

To be honest being a TA has been on my mind over the past few years. At first I was a little hesitant as it seemed to be a daunting task to undertake on top of taking classes. However, over the later years of my undergrad, and now this year, I have gotten better and better at managing my time. Therefore I feel confident that I could take on a challenge like that. So during the meeting we agreed that I would be able to do such things as run tutorial groups, mark papers on the computer, or under Dr Graham’s suggestion run online forums.

So after a great deal of stress leading up to the later part of this week, I was able to successfully submit my application on the day of the deadline to qualify for funding. Now comes the hard part of waiting…

Revving Up To Apply To Grad School

This year has been filled with new adventures, whether it be living on my own, taking a graduate level seminar, or even most recently taking a spontaneous trip to Toronto. All of these experiences have helped me learn to grab the bull by the horns and take advantage of every opportunity presented to me. What I cannot forget is my major purpose for the year, which is to solidly transition into an independent life and apply for graduate school.

The former was my major focus of the first part of the year as I learned to adapt to living on campus and looking after my own needs. After twenty seven years of being able to depend on a parent or attendant to help me with those needs, it was (as embarrassing as it sounds) quite the challenge to get used to.

The latter though has been my most recent focus as November is upon us and graduate application deadlines are on the near horizon. At the same time however, I am also learning (there’s that word again) to manage my classwork, in this case finish a final paper.

The difficulty for me going into this project was to find a subject I felt passionate enough about to write a strong paper on. Being a graduate level class, my classmates and I were encouraged to relate our papers to our major thesis or MRE topics. However, since I am not yet in a graduate program I am not entirely certain about what my thesis is (or could be) about.

This thought was quite intimidating at first although after a recent discussion with my current professor, he helped me work out the idea that I could use this paper as a jump-start for my research proposal in my grad school application. Therefore, I have chosen to write this paper on how a historian does not need to depend upon the book as a physical entity for sources in order to be a successful historian.

In order to do so I am hoping to draw upon sources from my thesis last year as well as some new ones on historians who have utilized oral histories as well as historical film for their major sources. Wish me luck…

Projecting the Warning Signs

Seeing as it now is October 1st, I thought that I should take some time to reflect upon my experiences thus far with my new living situation. As an individual who faces daily challenges with even the simplest of physical tasks I thought that adapting to this new lifestyle would take much longer to become accustomed to. However, thanks to the excellent help of the Attendant Services program here at Carleton University as well as friends and family it has essentially seemed seamless.

In spite of this fact though, I still must be wary of early warning signs of exhaustion so that I can be sure to avoid major setbacks such as serious illness, which I know I am at a higher risk of coming down with. Thanks to the great tunnels here at Carleton I have managed to avoid any unfortunate weather on my way to and from class. Seeing as it is still early in the fall though, I have made a point of getting as much fresh air as I can when it has been reasonable for me to do so.

After two weeks of classes, and at the time I was only taking two, I could tell that they both would be too demanding for me to juggle at the same time. This is not to say that I was not enjoying them both, but due to the amount of reading and work that they were both requiring of me, I knew that I would be unable to complete all of the assigned tasks to my fullest potential. Following the first completed task of my History of Food and Drink In Early Modern Europe class, it was clear that I had not been able to give myself enough time to prepare. I am by no means pleading innocent in this, as one of the reasons for this is most likely that I am still learning how best to divide time equally between social time and my studies. Even so, I could tell that the workload would be too much.

I therefore had to make a choice between the two classes. While the History of Food and Drink In Early Modern Europe was an interesting class, I decided to stick with Historiography of Canada Part 1. Despite them both being interesting topics this was an easy choice for me as I want to focus this year on learning what graduate level classes are like.

Thus far, this has been a great decision for me as I have already completed my first major task of the semester by leading the discussion on last week’s readings through analysis and coming up with questions to discuss. I must say it was nerve-racking doing this for the first time at this next level of education, but it went fantastically well. I found it even easier at this level as students who are studying along with you want to be engaged, so it almost eliminates awkward silences from the audience completely. I would just ask a question and the conversation would naturally flow from there.

Now that that assignment is completed I can go on to do regular readings but also start to slowly work on my end of year term paper on defining what post colonialism is. I am planning on approaching it from a compare and contrast perspective so I can hopefully get a better idea of how post colonialist writers differentiate themselves. The other great thing about this decision to drop down to one class is that while I am managing my workload I do still have time for some social time.

This is a very important thing as one of the main reasons for me moving out and into residence is so that I can do just that. Up until this point it has worked out great. Firstly, I have been doing more regular social events like getting together for beers at the campus pub with my roommate. But I have also been pushed outside my comfort zone by the people around me, like my girlfriend and family, to do things like navigate OC Transpo on my own and walk (ok roll) home from Lansdowne on my own.

The things that I have done in just the first month on my own have been so exciting! I cannot wait to see what October has in store…

Taking the Next “Step”

For most university students, moving out and gaining more independence at residence occurs at a natural rate after high school. For someone like myself though, moving out and away from that high level of support can be a major step that takes that much more time to prepare for. In fact, my situation led me to not only wait a couple years but until after my first degree!

This was not only surprising to some of my friends, who despite being disabled like me moved in to residence in their first year, but was also surprising to me. However, it was surprising to me not for the fact that I waited until after my undergraduate degree (because to me it never felt right at the time) but because I never felt as though I would move at all.

This is because when I was growing up I always had the feeling that I would be living with my parents for the rest of my life. Ideas like this came to me in multiple ways. For the most part though it was because I felt comfortable where I was with my family and couldn’t imagine the frustrations that arise as an adult living with your parents. On top of it all when my parents separated it became quite apparent to me that I would need to become comfortable with some other living situation in the near future.

What I did not know then was that the “near future” was a lot nearer than I could ever imagine. Just last February for instance I was in my kitchen having a casual conversation with my mom that led me to take the first step towards stepping (ok rolling) out on my own. The Carleton University Attendant Services Program suddenly seemed to be the perfect fit for me as it offered me the independence I was desiring and the chance to continue my academic career at the same time.

It has now been one week of living here and my experiences have been nothing but positive. I have been surrounded by positive energy and familiar faces that have made me feel excited about trying out a new living situation in a familiar environment. Just one week in it has led me to do more things independently than ever before, including shocking my sister by bumping into her in a packed crowd getting geared up for an AC/DC concert. If this is just what I have been able to do in week one, I am eager to find out what the rest of the year has in store.

Learning To Share Your Work

One of the most irritating things in the life of a student, at least to my mind, is not knowing what to do with your work once it is complete. After spending an entire semester, or in the case of a thesis an entire year working on a major paper, it often will just sit there taking up space on your computer. I find it so frustrating because I don’t want to get rid of it, because as a historian I have a weird need to have a small archive of my work, but at the same time I don’t want it to just sit there.

The solution that I have found thanks to the advice of professors and fellow students is sharing your work by publishing it online. In order to do so there are many options out there. One of my favourites to date has been putting my work on academia.edu. However, as I have learned this year through the process of finishing my thesis there are other options out there as well. For instance, you can simply publish on a personal blog such as this one.

While it does not seem at first that this will accomplish much at all, it is a great networking opportunity also. By blogging about it and then perhaps Tweeting a link to your work with appropriate hashtags, academics of similar interest will be able to find your work. Helping others find your work allows your work to spread and be read by more individuals who can give you feedback on your work. This feedback will lead to more successful writing in your academic future. As an example I will publish my work on this blog to demonstrate how easy it is.

Deciding on how to publish my thesis was a difficult task. Luckily I had the help of my supervisor for suggestions. I could think of only one way of publishing: putting up a PDF of the file on a publishing website such as academia.edu. However, he was able to suggest to me some alternative forms that could prove useful as they could be a more accessible read to some. Seeing as my entire thesis is about accessibility and the written word, I thought this was a great idea.

So to begin I will publish a PDF file of my thesis right here (Thesis FINAL Copy) followed by a full copy in Notational Velocity. Notational Velocity will allow the work to look like a virtual book with chapters stacked on top of one another rather than the endless virtual scroll of a PDF file.

To do this download Notational Velocity (http://brettterpstra.com/projects/nvalt/). Then, download the attached zip folder (hollis-thesis) and unzip it. In Notational Velocity, go to Preferences -> Notes and select that folder under ‘read notes from folder’. Then, under Preferences -> Notes -> Storage, select ‘plain text files’ under ‘store and read notes on disk as’. My thesis will now become available. In the search bar, try ‘accessibility’ and see all the bits and pieces come together! An original, non-linear reading experience that breaks the tyranny of the infinite scroll shall appear.

Finally, I will also publish a copy using a program called Twine. The amazing thing about Twine is that it allows the reader to see a work from a more external perspective. Twine does this by allowing you to make a game puzzle using themes from your work. So for my puzzle I will attempt to bring the reader through the experiences of a disabled academic. This will make it so the reader can have a greater understanding of the struggles that were brought up in my thesis and therefore have a greater appreciation for the paper while reading it.

The Twine version is a little more complicated to make obviously so that will be published soon in my next blog. For now, enjoy the PDF and Notational Velocity versions…

Letting Your Research Guide You

When starting out on a research paper, or in my case a thesis, one always has a plan of attack already in mind. For example, in an earlier blog entry, “White Board of Ideas”, I introduced the topic of my thesis to be studying the evolution of digitization through six different technologies. These technologies were: the incunabula, telegraph, the telephone, radio, television, and the internet. After a short while of researching though I came to the conclusion that this would simply be too large of a task to undertake for an honours thesis that I only have one year to write. Instead, I decided to focus my efforts to what interested me most in my research. This is how I ended up comparing the evolution of digitization of the incunabula (early book) to the development of the ebook.

My research did not stop guiding me at that point. Once I began, I discovered that it was important to follow some tangents that needed clearing up. For instance, when I began discussing my experience with ebooks, I fell upon the topic of taking physically printed books off the shelf and making them into a PDF format that I can read on my computer. While this idea was the original inspiration for my thesis as a whole, it led me to write about another necessary step in the evolution of digitization, copying.

This sudden realization opened up a number of other tangents. Most importantly, it led me to the study of copying in general and how that process became digitized through a number of paradigm shifts in its evolution through history. While pursuing this tangent of research I discovered some amazing stories, my favourite of which was about a Jesuit named Father Busa who was born in Italy in 1913. While doing his doctorate in philosophy he went about creating the first digitized index of the work of Thomas Aquinas.

He estimated that he would need around thirteen million punch cards to do so. Because of this Busa knew that some kind of machine was necessary, so he teamed up with IBM to complete such a daunting task. It was a very interesting story that I discovered thanks to letting my research guide my work.

The point that I am trying to make is that at this level of academics it is not good to approach a work with a very narrow mindset from the beginning straight to the end. Do not be afraid of looking further into stories that you discover along the way that seem interesting. Never would I have discovered that the development of the ebook and the incunabula were so complex had I not been open to new ideas. I have been amazed at how my writing has improved since I have taken on this mindset.