Resident log, NDSR date 20161004.1
Resident: Lorena Ramirez-Lopez from WHUT
“Who just md5deep-ed and redirected all them checksums to a .csv file? This gal.”
AND this is how I did it!
Screenshot of my Twitter page (7 Sept. 2016, 11:07 am. Tweet) taken on 3 October 2016 on Mac OS X El Capitan Version 10.11.6 using Chrome 53.0.2785.116 (64-bit)
Wow, almost a month ago I was able to checksum a batch of 272 video files.
Also, woooow, this is my first blog post.
Thanks to Eira Tansey (@eiratansey)! It all started with a simple tweet/question (one that I should’ve responded to earlier. Sorry!), and I got overenthusiastic and decided to blog about it!
Let me clarify WHY I DID IT before I go into HOW I DID IT (you can totally skip this part and scroll down to HOW I DID IT!)
WHY I DID IT?
WHY I needed to checksum?
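1) Checksums are important! Or they can be for people who know what checksums are. A checksum is a short string calculated from a file’s contents, so if the file changes, the checksum changes too. Checksums are a good and simple step towards digital preservation.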
2) WHUT is part of the American Archive of Public Broadcasting project and in 2011, we were able to receive at least 200 hours of digitization! (Woot!)
As a result of the collaboration, we have:
– 136 MOV files (ranging from 170 MB to 500 MB)
– 136 MXF files (ranging from 13 GB to 28 GB)
– on a Lacie drive…
– from 2011…
so WHY not bagit?
Bagit is way more user-friendly for me after reading and researching a lot of documentation.
For example, Ethan Gates (@The_BFOOL) wrote “Using Bagit” on his wordpress site. Check it out here at: https://patchbaynyu.wordpress.com/2016/09/20/using-bagit/
And then there’s all the other documentation one would find when googling “bagit for mac”, that I couldn’t break down, but I know would/should make sense to someone:
“BagIt usage instructions” by Matt Schultz, Stephen Eisenhauer, and Nick Krabbenhoeft. April 29, 2014. Link: http://metaarchive.org/public/resources/neh/research/BagIt_Usage_Instructions.pdf
“Bagger” by Library of Congress on their GitHub account: https://github.com/LibraryOfCongress/bagger
There’s also a cool video series on how to use bagit on the State Archives of North Carolina channel. Part 1 of 10 link: https://www.youtube.com/watch?v=14ZPtYLtUYA&index=1&list=PL2OuHt89v00rWfthbk0qQqAPRhdDeNazZ
But it still doesn’t make complete and total sense in my head (YET!) And I had another problem: these hard drives were “locked”, which meant I couldn’t write to them and definitely couldn’t create a bag on them, and bagit froke* out when I tried redirecting the bag.
I would get this error:
If there is another way, SHARE IT (please), ’cause I could not find an option.
HOW I DID IT!
And you can too. You can even make it better and expand on it; I just hope you’ll share alike.
First, I used TERMINAL
If this is your first time using Terminal, sorry! Don’t worry though, it’s not too bad.
Once I opened Terminal, I made sure I was in the Desktop directory.
This part might or might not make sense to people. For those who might not get it, type: pwd into Terminal
pwd means print working directory and tells you which directory you are in. You should see /Desktop at the end of that line. If not, type: cd Desktop (this works when you are in your home folder; cd ~/Desktop works from anywhere)
OK now the part where I md5deeped.
command line used:
md5deep -be [dragged my files] > test_1.csv
What does each part mean?
|Need or optional?||Part||Reason|
|need||md5deep||This is the command that will generate the md5 checksum!|
|optional||-b||This flag means ‘bare’ and will not include the directory pathway. So instead of /Users/Me/Folder/Desktop/file name, it’ll just be file name|
|optional||-e||This flag shows the progress of the command, but don’t get too excited. It’s not like a progress bar. It’s an estimate of the time remaining for the file currently being checksummed. If you have one file, great! It’ll tell you how many minutes until that one file is done. If you have 272 files… it’ll take a while, BUT at least you see what’s happening in Terminal!|
|need||>||This means redirect the results to a different location (FOR THE FIRST TIME!)**|
|need||test_1.csv||This is the file I want the results to go into. I’m naming the .csv file test_1 because why not?|
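If dragging files into Terminal isn’t your thing, here is a rough sketch of the same command pointed at a folder instead. The folder sample_videos and the two tiny stand-in files are made up for this example (they are not my real video files), and it assumes md5deep is already installed. As far as I can tell, the -e progress text goes to the screen rather than into the file, so only the checksums should end up in test_1.csv.

```shell
# Hypothetical sample folder standing in for the real drive
mkdir -p sample_videos
printf 'hello\n' > sample_videos/clip_a.mov
printf 'world\n' > sample_videos/clip_b.mov

# Only run the checksum step if md5deep is actually installed
if command -v md5deep >/dev/null 2>&1; then
  # -b: bare file names, -e: time estimate, >: send results to test_1.csv
  md5deep -be sample_videos/* > test_1.csv
  cat test_1.csv
else
  echo "md5deep is not installed" >&2
fi
```

Swap sample_videos/* for the path to your own folder when you try it for real.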
Depending on the size of your files this can take a short amount of time (coffee break status) or a long time (you’re probably going to have to do this overnight)
But after that you have a .csv file with your checksums!
Although I should warn you that the checksum and title (and anything else you asked for with other flags, like a timestamp) will be put into one cell like this:
BUT I know how to separate!
HOWEVER, that’s another blog post which I’ll get to during the weekend. Sorry!
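In the meantime, here is one possible way to do it, as a rough sketch and not necessarily the way I’ll cover in that post. It assumes each line in the file is a 32-character checksum, some spaces, and then the file name, and that the file names have no double quotes in them. The names sample_1.csv and sample_1_split.csv and the two sample lines are made up for the example.

```shell
# Two made-up lines in the shape I'm assuming: checksum, two spaces, file name
printf '%s\n' \
  'd41d8cd98f00b204e9800998ecf8427e  clip_a.mov' \
  'b1946ac92492d2347c6235b8d2611184  my clip b.mxf' > sample_1.csv

# Write a header row, then turn each "checksum  name" line into checksum,"name"
{
  echo 'md5,filename'
  sed -E 's/^([0-9a-f]{32}) +(.*)$/\1,"\2"/' sample_1.csv
} > sample_1_split.csv

cat sample_1_split.csv
```

The quotes around the file name are there so a name with a comma in it still stays in one cell when the file is opened in a spreadsheet.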
Taking a page from @ablwr’s book, I will be uploading files (video, audio, text, files, etc.) so you can download and use them as tests!!! Like the GIF I just made for this blog post.
I am still figuring out where to upload them, but the videos and photos will be my own works that all have CC-BY-SA licenses.
So do what you want with them! Remix. Reuse. Recycle. wReck.
Have fun with them and don’t feel bad if it gets messed up 😉
I don’t have all the answers, but I’m always happy to lend a hand and find related articles to help. Just ask!
End of resident log, NDSR date 20161004.1
*Spelling of froke was intentional.
** 20161006: Again, thanks to @eiratansey for asking and pointing this out! > redirects output and always writes (or overwrites) the file from the beginning, which is great for a first-time redirect. BUT if you want to reuse the same document and ADD to it, use >> which appends the output instead of overwriting the information you already have. Hope that wasn’t too confusing!
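A tiny sketch of the difference, using printf in place of md5deep so it runs in seconds. The file name batch.csv and the three lines of text are made up for the example.

```shell
# First run: > creates batch.csv (or wipes it if it already exists)
printf 'first batch\n' > batch.csv

# Using > again overwrites: "first batch" is gone
printf 'second batch\n' > batch.csv

# Using >> appends: "third batch" is added after "second batch"
printf 'third batch\n' >> batch.csv

cat batch.csv
```

So if you are checksumming one drive today and another tomorrow, >> lets both sets of results live in the same file.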
Resident log update 20161006.2: I was freaking out about possible confusion. So here’s a video of me doing what I wrote: “blog_md5deep”