Friday, December 29, 2006

To Scan, or not to Scan

Years ago, before Linux, I started scanning and storing all my paperwork electronically. Those who live in this century will attest that we are simply plagued with paperwork. Be it bank statements, receipts, tax records... we live in an age where paperwork can very quickly become overwhelming. Before Linux, I would happily use the software included with my scanner to scan to a searchable PDF, meaning that a simple OCR (optical character recognition) pass was performed on the document so its text could be searched. All was right in the world, that is of course until my "migration".

Although SANE supports scanning to PDF, the process is very messy, and the PDF has to be reduced further (a letter-sized document can run into several MB). After many hours of searching, I have not been able to find a reasonable solution. Reducing the PDF further is very simple using the tools provided by ImageMagick, but attaining the searchability is the main problem. If anyone should have thoughts on this, do let me know. In the meantime I'll look at DocMorph, an online service for converting documents to PDF... such a service might be the only feasible option if Linux's support is lacking. gscan2pdf is also a program I'll have to investigate.
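For anyone curious, the size-reduction step mentioned above can be sketched roughly as follows. This is only a minimal illustration, not a tested recipe: it assumes ImageMagick's convert is installed (with Ghostscript available so it can read PDFs), and the file names, the 150 dpi resolution and the quality setting are placeholder values of my own choosing. The function names are hypothetical. Note that it does nothing for searchability; the output is still just page images.

```python
import shutil
import subprocess


def build_shrink_command(src, dst, dpi=150, quality=60):
    """Build an ImageMagick `convert` argument list that re-renders a
    scanned PDF as lower-resolution, grayscale, JPEG-compressed pages.

    The dpi and quality defaults are illustrative guesses, not tuned values.
    """
    return [
        "convert",
        "-density", str(dpi),    # resolution used when rasterising each page
        src,
        "-colorspace", "Gray",   # drop colour information
        "-compress", "JPEG",     # JPEG-compress the page images
        "-quality", str(quality),
        dst,
    ]


def shrink_pdf(src, dst, **options):
    """Run the command built above; raises if convert is missing or fails."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick's convert was not found on PATH")
    subprocess.run(build_shrink_command(src, dst, **options), check=True)


if __name__ == "__main__":
    # Show the command that would be run for a hypothetical scan.pdf.
    print(" ".join(build_shrink_command("scan.pdf", "scan-small.pdf")))
```

Calling shrink_pdf("scan.pdf", "scan-small.pdf") would then write the smaller file. Lowering dpi or quality shrinks the result further at the cost of legibility, so it is worth checking a page by eye before filing anything.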

The Final Scan

Hours later, I've turned up nothing on the searchable PDF front. I can only find references to high-priced commercial solutions. The scanning-to-PDF process seems to be well documented; making the scanned document searchable, however, seems to be a requirement that attracts little interest.

During my search I found the Adobe PDF Online service, and took up a trial (5 conversions). I had very limited success with the service, although to its credit I was able to partially convert the test page, and there was a plethora of configuration options which might hold the key to a fuller conversion.

Adobe PDF Online can be subscribed to for $9.99 per month or $99.99 per year.

For the first time since I started using Linux, I've found myself having to settle for failure. There may be a solution out there, but for the time being I feel that I've exhausted every avenue. Sometimes walking with Linus can be a little challenging. For now I'll have to *gulp* boot into Windows so I can file my ever-building paperwork.
