Digitization Image

So I finally made it to a scanner, and here is my image: page 15 of the January 2006 issue of the Brotherhood Beacon. For my final project for this class, I’m hoping to digitize five years of this monthly publication. I’m really not sure whether this is beyond my capabilities, but I’ve been told that there is software that will help with the process. I will use OCR software to create a text file from each page image and then upload everything to a website using Streetprint software, which creates a searchable database.

Brotherhood Beacon - January 2006 p. 15
For this particular image, I just scanned the page and cropped the scan down to the page itself. I scanned at 400 dpi to make sure the scans are good enough for OCR software to produce readable text. I may try 300 dpi as well to see if it makes any difference. 400 dpi also gives me a good enough file to keep for archives, even if I don’t end up uploading that large an image. When I save the scans as .jpg images (on medium quality), I’m currently getting file sizes around 2.5 megabytes; the same is true for medium quality .pdf files.
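
To get a rough feel for what 300 versus 400 dpi means, here is a small Python sketch of the pixel arithmetic. The page size is an assumption (US Letter, 8.5 x 11 inches); the Beacon has not been measured here, so the real dimensions should be swapped in.

```python
# Rough pixel math for a scanned page.
# ASSUMPTION: the page is US Letter (8.5 x 11 in); measure the real page and adjust.
PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0


def pixel_dimensions(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


for dpi in (300, 400):
    w, h = pixel_dimensions(PAGE_WIDTH_IN, PAGE_HEIGHT_IN, dpi)
    print(f"{dpi} dpi: {w} x {h} pixels ({w * h / 1_000_000:.1f} megapixels)")
```

Whatever the page size turns out to be, a 400 dpi scan holds (400/300) squared, or about 1.78 times, as many pixels as a 300 dpi scan of the same page, so the 300 dpi test is worth running if the OCR results hold up.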

I’m really not sure yet what file type or size I will use when actually uploading. It would be great if the uploaded files were smaller, so that people on dial-up connections can still use the archive. Finding the balance between file size and readability will be my main job, since I can save the larger archive-quality images on CDs/DVDs and/or hard drives for safekeeping (although I guess we all realize now how safe they actually are).
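
For a sense of what dial-up users would face, here is a back-of-the-envelope Python sketch of download times. The 56 kbps figure is the nominal modem speed and is an assumption, as are the two smaller file sizes, which are only hypothetical targets; real connections are usually slower, so these are best-case numbers.

```python
# Best-case download time estimate for a single page image.
# ASSUMPTIONS: 56 kbps nominal dial-up speed, 1 MB = 1,000,000 bytes,
# and no protocol overhead. The smaller sizes below are hypothetical targets.
DIALUP_KBPS = 56


def download_seconds(file_megabytes, link_kilobits_per_second):
    """Return the ideal transfer time in seconds for a file of the given size."""
    bits = file_megabytes * 1_000_000 * 8
    return bits / (link_kilobits_per_second * 1000)


for size_mb in (2.5, 0.5, 0.25):
    seconds = download_seconds(size_mb, DIALUP_KBPS)
    print(f"{size_mb} MB at {DIALUP_KBPS} kbps: about {seconds / 60:.1f} minutes")
```

By this estimate, a 2.5 megabyte page needs roughly six minutes at best, which is a long wait for one page out of five years of issues.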

I’m also considering whether I want to scan in color. It seems that the Brotherhood Beacon uses black and white plus one other color per issue, so I don’t think grayscale would lose too much. I would have to check how much file space that would save me.
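
As a starting point for that check, this Python sketch compares raw (uncompressed) sizes for color and grayscale. It assumes 8 bits per channel and a 3400 x 4400 pixel image (a letter-size page at 400 dpi, which is a guess at the page size). The savings after .jpg compression will likely be different, so only a real test scan can settle it.

```python
# Compare uncompressed image sizes for color versus grayscale.
# ASSUMPTIONS: 8 bits (1 byte) per channel; 3400 x 4400 pixels, i.e. a
# letter-size page at 400 dpi. Compressed .jpg sizes will not match these.
WIDTH_PX = 3400
HEIGHT_PX = 4400


def raw_size_bytes(width_px, height_px, channels):
    """Return the uncompressed size in bytes at one byte per channel per pixel."""
    return width_px * height_px * channels


color = raw_size_bytes(WIDTH_PX, HEIGHT_PX, channels=3)  # red, green, blue
gray = raw_size_bytes(WIDTH_PX, HEIGHT_PX, channels=1)   # single gray channel

print(f"Color:     {color / 1_000_000:.1f} MB uncompressed")
print(f"Grayscale: {gray / 1_000_000:.1f} MB uncompressed")
print(f"Grayscale is {gray / color:.0%} of the color size before compression")
```

Before compression, grayscale is one third the size of color, since it stores one channel per pixel instead of three.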

Okay, I don’t know why I can’t provide a link to my image and put it on another page on my blog. The image is too small to see what’s even on the page. Enough for now, must wait for class.
