Content Cleansing and Categorization Platform

Imagine a company with hundreds of thousands of documents scattered across a multitude of shared folders over the past several years. Imagine further that you have been assigned the daunting task of migrating this content to a modern Enterprise Information Management (EIM) environment, such as SharePoint.

Within this huge volume of content can be found:

  • Business records – that must be retained and preserved for legal, audit and business continuity reasons
  • Knowledge records – information assets and reference material that are the basis for the day-to-day operations of the company; in some cases these information assets may be the primary source of revenue for the company
  • Electronic communications – some of which are linked to business records and must be retained and managed as such
  • ‘Junk Content’ – that should be deleted. If this is a typical company, the volume of this junk content is much larger than many would like to believe.

Record Types

Once the strategic records management or content management platform has been identified, the challenging initial question is “Where do we start?”

This is the question we have been hearing from many clients.

In response to this need, ACIS Consulting has developed an innovative search-based application for cleansing and categorizing legacy content in preparation for migration to modern content management platforms.

This application leverages several advanced capabilities of the FAST search technology to enable customers to quickly address this initial task of content inventory, rationalization, classification and export to the new EIM environment.

Content cleansing process

Key capabilities of this service tool include:

    • Ability to access and index all kinds of content repositories where information might be stored
    • Ability to classify and categorize processed content based on multiple business rules
    • Ability to extract metadata and keyword entities from within the entire textual content of the processed document and use it to further classify and categorize each document
    • Ability to identify duplicate and near duplicate documents
    • Capability to identify the owner department, the source repository and author name (if available)
    • Automated extraction of document creation date to enable date-based tagging for retention policy
    • Extraction of document size and content type that can be used for evaluation of how much storage is being used by what type of content
    • Capability for information analysts to use advanced content categorization queries to identify groups of documents that belong to a given department or function. Analysts can then assign a category tag to all of the returned search results, thereby refining and accelerating the content classification process.
    • Capability to export the list of documents, along with the additional classification metadata, for migration to the new content management environment
    • Capability to identify duplicate documents as candidates for deletion by the document owner.
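
To make the duplicate and near-duplicate capability above more concrete, the following is a minimal Python sketch of one common way such detection can be done. It is not the method used by the ACIS application or by the FAST search technology, neither of which is described here. It assumes document text has already been extracted into a dictionary keyed by path. The function names (content_hash, shingles, jaccard, find_duplicates), the shingle size of 3 words and the similarity threshold of 0.6 are all illustrative choices. Exact duplicates are grouped by a hash of the normalized text; near duplicates are flagged by Jaccard similarity over overlapping word windows.

```python
import hashlib
from collections import defaultdict


def content_hash(text: str) -> str:
    """Fingerprint case- and whitespace-normalized text.

    Two documents with the same fingerprint are treated as exact duplicates.
    """
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def shingles(text: str, k: int = 3) -> set:
    """Return the set of overlapping k-word windows found in the text."""
    words = text.lower().split()
    if len(words) < k:
        # Very short texts yield a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_duplicates(docs: dict, threshold: float = 0.6):
    """Return (exact_groups, near_pairs) for a mapping of path -> text.

    exact_groups: sorted lists of paths whose normalized text is identical.
    near_pairs:   (path_a, path_b, similarity) tuples at or above threshold,
                  computed between one representative per exact group.
    """
    by_hash = defaultdict(list)
    for path, text in docs.items():
        by_hash[content_hash(text)].append(path)

    exact_groups = sorted(
        sorted(paths) for paths in by_hash.values() if len(paths) > 1
    )

    # Pairwise comparison is quadratic; acceptable for a sketch, but a large
    # corpus would need an index (for example MinHash with banding).
    reps = sorted(min(paths) for paths in by_hash.values())
    sets = {path: shingles(docs[path]) for path in reps}
    near_pairs = []
    for i, a in enumerate(reps):
        for b in reps[i + 1:]:
            sim = jaccard(sets[a], sets[b])
            if sim >= threshold:
                near_pairs.append((a, b, round(sim, 2)))
    return exact_groups, near_pairs
```

Calling find_duplicates on a small dictionary of path-to-text entries returns the exact-duplicate groups and the near-duplicate pairs separately, so that exact copies can be offered to the document owner as deletion candidates while near duplicates are left for an analyst to review.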

In addition to offering this service tool, ACIS Consulting also provides consulting services and analyst resources to help clients complete this crucial first step towards more efficient Enterprise Information Management.