Case study Government
Location: Ft. Belvoir, Virginia, USA
Industry: Government
Focus Area: Intelligent document conversion
URL:
Customer Background:
DTIC is a division of the Department of Defense and is responsible for storing, cataloging, and distributing documents from all areas of the DOD and the military service branches. DTIC distributes these documents via the public internet, hardcopy, CD and DVD. The department reports to the Director, Defense Research and Engineering.
Business Need:
DTIC has a vast stockpile of documents that are not easily accessible electronically. These documents are stored on microfiche and can only be distributed by manually photocopying the fiche. Although this is not its complete inventory, DTIC estimates that it has at least 71M pages of technical reports, university studies and other unclassified research documents stored on microfiche. There is also an unknown number of classified documents on fiche.
Currently, DTIC’s most visible document search and retrieval tool is Stinet, a web-based application that any US citizen can use to search DTIC’s archive. The archive content is primarily citations or abstracts of the actual documents. Should an online user want the actual document, DTIC must manually locate the fiche, make photocopies and ship the report to the recipient.
Increasingly, DTIC has been attempting to include TIFF and PDF versions of the documents so that researchers can immediately download and view the full text. But the vast majority of the Stinet documents are still represented only by citations.
DTIC also wanted to improve its search engine. It selected Olive’s business partner with DRS scanners to supply a third-generation search and automatic citation generation system.
Solution:
To meet DTIC’s needs, Olive developed a new version of an existing Olive product to automate the process of converting microfiche into TIFFs and PDFs. This system, called the Microfiche PipeX system, will allow DTIC to process the 71M unclassified document pages on fiche in approximately 6 years and provide digital versions of the full reports for all of these documents for distribution through Stinet.
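To give a sense of scale, the 71M-page, 6-year figure can be turned into a rough required throughput. The sketch below is back-of-the-envelope arithmetic only. The operating schedule (365 days a year, 24 hours a day) is an assumption made for illustration, not a figure from DTIC or Olive, and the function name is hypothetical.

```python
def required_throughput(total_pages, years, days_per_year=365, hours_per_day=24):
    """Return (pages per year, pages per day, pages per hour).

    days_per_year and hours_per_day are assumed operating parameters;
    the case study does not state the actual production schedule.
    """
    pages_per_year = total_pages / years
    pages_per_day = pages_per_year / days_per_year
    pages_per_hour = pages_per_day / hours_per_day
    return pages_per_year, pages_per_day, pages_per_hour


# 71M pages over 6 years, assuming round-the-clock unattended operation
per_year, per_day, per_hour = required_throughput(71_000_000, 6)
print(f"{per_year:,.0f} pages/year, {per_day:,.0f} pages/day, {per_hour:,.0f} pages/hour")
```

Under these assumptions the system would need to sustain roughly 11.8M pages a year. A shorter operating day raises the hourly rate proportionally.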
In August of 2004, DTIC selected Olive to build its microfiche processing system. The selection was based on Olive’s track record of converting hardcopy newspapers and magazines into an electronic form that subscribers could view on the Internet.
Microfiche PipeX was built on one of our foundation products, PipeX, an expandable system for automatically converting images on microfilm to our Olive PrXML repository. Once a publication is in the repository, our customers can view the online version through our Electronic Editions web application servers.
The PipeX system installed at DTIC consists of 9 IBM eServer processors and some networking equipment.
DTIC’s project presented Olive with a series of challenges. First, to make the system automated, we had to develop a technique for making the document ID number on the fiche machine readable. This required creating a sophisticated optical character recognition configuration.
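As an illustration of what making an ID machine readable can involve after the OCR step, the sketch below cleans a raw OCR string and checks it against an expected pattern. It is not Olive's implementation. The ID format (two letters followed by six digits), the substitution table, and the function name are all hypothetical, chosen only to show the idea of correcting common letter-for-digit confusions and rejecting anything that still does not fit.

```python
import re

# Hypothetical ID format: two letters followed by six digits.
# The real fiche ID format is not given in the case study.
ID_PATTERN = re.compile(r"^[A-Z]{2}\d{6}$")

# Letters that OCR commonly confuses with digits (assumed table).
DIGIT_FIXES = str.maketrans({"O": "0", "I": "1", "L": "1", "S": "5", "B": "8"})


def normalize_id(raw):
    """Return a cleaned document ID, or None if it cannot be validated."""
    # Drop spaces, hyphens and other punctuation, then uppercase.
    text = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    if len(text) < 3:
        return None
    # Only the numeric part gets letter-to-digit corrections.
    candidate = text[:2] + text[2:].translate(DIGIT_FIXES)
    return candidate if ID_PATTERN.match(candidate) else None


print(normalize_id("ab-12O45l"))  # cleaned ID
print(normalize_id("AB12"))       # too short to validate
```

In an unattended pipeline, IDs that fail validation would typically be routed to a manual review queue rather than guessed.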
We also needed to work with microfiche and fiche reading devices. Further, the reading device, or scanner, had to work in a production environment. It had to be able to process fiche unattended and had to have an automated feeder system to load and remove fiche.
We selected a microfiche scanner built by DRS out of Germany.
Benefits of the Solution:
Prior to the installation of the Olive Microfiche PipeX system, DTIC did not have a plan for converting these fiche and making the documents easily accessible to its customers. The benefit to DTIC’s Stinet service is that its users will increasingly be able to access the actual image of the document directly, enabling them to see more than just the citation.
DTIC is also considering using a full-text search engine to give researchers more flexibility and greater accuracy when attempting to locate information. The Olive PipeX system will help here as well. We already generate the full text for each document. Although DTIC does not currently use it, this text could give DTIC the means not only to convert the fiche documents into images but also to provide the text for search.
Prepared By: Tony Pompili and Kim Dail
