Sunday, September 2, 2007

Proposed Carrot2 Framework

The developed Carrot2 System is a modularised Java-based program. It makes extensive use of the Carrot2 Java libraries and API. It uses the Apache Lucene 2.2 library embedded in Carrot2 to handle much of the data pre-processing automatically, specifically employing the Porter stemmer for stemming. The developed Carrot2 System uses the parser developed by Weiss, which accounts for “names, e-mail addresses, web pages and such” (Weiss and Osinski, 2007b).
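To illustrate what it means for a parser to account for e-mail addresses and web pages, here is a minimal sketch in plain Java. It is not the Carrot2 parser and does not use any Carrot2 or Lucene classes; the class name `TokenSketch`, the `tokenize` method and the regular expression are all assumptions made for illustration. The idea is simply that URLs and e-mail addresses are matched first and kept as single tokens, so that later stages such as stemming do not break them apart.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a tokenizer; NOT the parser shipped with Carrot2.
class TokenSketch {

    // Alternatives are tried left to right at each position:
    // 1) URLs, 2) e-mail addresses, 3) plain words (optionally with an apostrophe suffix).
    private static final Pattern TOKEN = Pattern.compile(
            "(?:https?://|www\\.)\\S+"
            + "|[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)+"
            + "|[A-Za-z0-9]+(?:'[A-Za-z]+)?");

    // Splits text into lower-cased tokens, keeping URLs and e-mail addresses whole.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group().toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(
                tokenize("Mail bob@example.com or see http://www.carrot2.org today"));
    }
}
```

A known limitation of this sketch is that a URL immediately followed by punctuation (for example a full stop at the end of a sentence) keeps that punctuation as part of the token; a real parser would need to handle such cases.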

Relying on the Carrot2 API and source code for many of the peripheral text mining services provides a stable environment in which to test how changes to the assumed key phrase affect the outcomes of clustering.

