org.jbox.textCutter
Class CutterBox

java.lang.Object
  extended by org.jbox.textCutter.CutterBox

public class CutterBox
extends java.lang.Object

Container for Cutter objects.

Version:
1.0
Author:
YiBin.H
See Also:
Cutter, NoiseFilter, LanguageFilter

Constructor Summary
CutterBox()
          Constructs a new CutterBox.
 
Method Summary
 void addCutter(Cutter c)
          Adds a Cutter to this CutterBox.
static java.lang.String[] cutArticleToSentence(java.lang.String article)
          Static method for splitting text into sentences at ".", "。", or "！".
 void cutPage(Page p)
          Cuts the text of a Page object into words, calculates the term frequency (TF) of each word, and stores the words in the Page object.
 java.util.Collection<java.lang.String> cutText(java.lang.String text)
          Cuts text into words.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CutterBox

public CutterBox()
Constructs a new CutterBox.

Method Detail

addCutter

public void addCutter(Cutter c)
Adds a Cutter to this CutterBox.

Parameters:
c - a Cutter object.

cutPage

public void cutPage(Page p)
Cuts the text of a Page object into words, calculates the term frequency (TF) of each word, and stores the words in the Page object. Words that are listed in the noise file are not stored.

Parameters:
p - the Page object containing the text to be cut.
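
The counting and noise-filtering step described for cutPage can be illustrated with a minimal, self-contained sketch. It does not use the library's Page, Word, or NoiseFilter types, whose members are not shown on this page. The class TermFrequencySketch and its method termFrequencies are hypothetical names, and TF is assumed here to be the raw number of occurrences of each word; the library may normalize it differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of org.jbox.textCutter.
final class TermFrequencySketch {

    // Counts how often each word occurs, skipping words found in noiseWords.
    // Assumption: TF is the raw occurrence count.
    static Map<String, Integer> termFrequencies(List<String> words, Set<String> noiseWords) {
        Map<String, Integer> tf = new LinkedHashMap<>(); // keeps first-seen order
        for (String word : words) {
            if (noiseWords.contains(word)) {
                continue; // noise words are not stored
            }
            tf.merge(word, 1, Integer::sum);
        }
        return tf;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("the", "cat", "saw", "the", "cat");
        Set<String> noise = new HashSet<>(Arrays.asList("the"));
        System.out.println(termFrequencies(words, noise));
    }
}
```

In this sketch the noise list is passed in as a Set; how the library loads its noise file is not described here.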

cutText

public java.util.Collection<java.lang.String> cutText(java.lang.String text)
Cut text into words.

Parameters:
text - text to be cut.
Returns:
string collection containing words.
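
This page does not say how CutterBox combines the cutters registered through addCutter. As one possible reading, the sketch below assumes that every registered cutter is applied to the whole text and that the results are concatenated. SketchCutter and CutterBoxSketch are hypothetical stand-ins and do not reflect the real Cutter interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the library's Cutter type.
interface SketchCutter {
    Collection<String> cut(String text);
}

// Hypothetical container that mirrors the addCutter/cutText method names.
final class CutterBoxSketch {
    private final List<SketchCutter> cutters = new ArrayList<>();

    void addCutter(SketchCutter c) {
        cutters.add(c);
    }

    // Assumption: each cutter sees the full text; outputs are appended in order.
    Collection<String> cutText(String text) {
        List<String> words = new ArrayList<>();
        for (SketchCutter c : cutters) {
            words.addAll(c.cut(text));
        }
        return words;
    }

    public static void main(String[] args) {
        CutterBoxSketch box = new CutterBoxSketch();
        // A simple whitespace cutter, used only for demonstration.
        box.addCutter(text -> Arrays.asList(text.trim().split("\\s+")));
        System.out.println(box.cutText("cut this text"));
    }
}
```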

cutArticleToSentence

public static java.lang.String[] cutArticleToSentence(java.lang.String article)
Static method for splitting text into sentences at ".", "。", or "！".

Parameters:
article - text to be cut.
Returns:
string array containing sentences.
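
A comparable split can be sketched with java.util.regex, assuming the delimiters are the ASCII period, the ideographic full stop "。" (U+3002), and the full-width exclamation mark "！" (U+FF01). The class SentenceSplitSketch is a hypothetical name, and trimming whitespace and dropping empty pieces are choices made for this sketch; the library's actual behavior may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the library's implementation.
final class SentenceSplitSketch {

    // Character class matching ".", U+3002 and U+FF01.
    private static final Pattern DELIMITERS = Pattern.compile("[.\u3002\uFF01]");

    // Splits at each delimiter, trims whitespace, and drops empty pieces.
    static String[] splitSentences(String article) {
        List<String> sentences = new ArrayList<>();
        for (String piece : DELIMITERS.split(article)) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                sentences.add(trimmed);
            }
        }
        return sentences.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String s : splitSentences("First one. Second one\u3002Third\uFF01")) {
            System.out.println(s);
        }
    }
}
```

Note that a plain character-class split also breaks at periods inside abbreviations and decimal numbers.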