org.jbox.textCutter
Interface Cutter

All Known Implementing Classes:
AbstractCutter, SimpleCJKCutter, SimpleENCutter

public interface Cutter

The root interface for text cutters.

A Cutter must be placed in a CutterBox to do its work.

Version:
1.0
Author:
YiBin.H
See Also:
CutterBox, LanguageFilter

Method Summary
 java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.StringBuffer unCheckedString)
          Cuts the text in a StringBuffer into words.
 void setUnicode(java.lang.Character.UnicodeBlock[] unicodeBlocks, int[][] unicodeScopes)
          Sets the Unicode scope of this Cutter.
 

Method Detail

cutSentenceToWord

java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.StringBuffer unCheckedString)
Cuts the text in a StringBuffer into words.

Parameters:
unCheckedString - text that may contain characters from several languages, that is, from different Unicode ranges.
Returns:
a collection of the words in the text that fall within this Cutter's Unicode scope.
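
The contract described above can be illustrated with a minimal sketch. It is not the library's implementation: the Cutter interface below is a local stand-in that only mirrors the two documented signatures, and LatinSketchCutter and CutSketchDemo are hypothetical names invented for this example. The sketch assumes that a "word" is a maximal run of letters whose Unicode block is in the cutter's scope, which the original documentation does not state.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Local stand-in that mirrors the documented signatures.
// It is NOT the real org.jbox.textCutter.Cutter.
interface Cutter {
    Collection<String> cutSentenceToWord(StringBuffer unCheckedString);
    void setUnicode(Character.UnicodeBlock[] unicodeBlocks, int[][] unicodeScopes);
}

// Hypothetical cutter: keeps runs of letters whose Unicode block is in scope.
class LatinSketchCutter implements Cutter {
    private Character.UnicodeBlock[] blocks = { Character.UnicodeBlock.BASIC_LATIN };

    @Override
    public void setUnicode(Character.UnicodeBlock[] unicodeBlocks, int[][] unicodeScopes) {
        // This sketch only honours the block list; unicodeScopes is ignored here.
        if (unicodeBlocks != null) {
            blocks = unicodeBlocks.clone();
        }
    }

    private boolean inScope(int codePoint) {
        if (!Character.isLetter(codePoint)) {
            return false;
        }
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        for (Character.UnicodeBlock candidate : blocks) {
            if (candidate.equals(block)) {
                return true;
            }
        }
        return false;
    }

    @Override
    public Collection<String> cutSentenceToWord(StringBuffer unCheckedString) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < unCheckedString.length(); ) {
            int codePoint = unCheckedString.codePointAt(i);
            if (inScope(codePoint)) {
                current.appendCodePoint(codePoint);
            } else if (current.length() > 0) {
                // An out-of-scope character ends the current word.
                words.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(codePoint);
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }
}

class CutSketchDemo {
    public static void main(String[] args) {
        Cutter cutter = new LatinSketchCutter();
        // Mixed Latin and CJK input; the CJK characters are outside this cutter's scope.
        StringBuffer text = new StringBuffer("Hello, \u4E16\u754C world");
        System.out.println(cutter.cutSentenceToWord(text)); // prints [Hello, world]
    }
}
```

The CJK characters in the input are skipped rather than returned, which matches the documented behaviour that only words within the Cutter's Unicode scope appear in the result.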

setUnicode

void setUnicode(java.lang.Character.UnicodeBlock[] unicodeBlocks,
                int[][] unicodeScopes)
Sets the Unicode scope of this Cutter.

Parameters:
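unicodeBlocks - the Unicode blocks that belong to this Cutter's scope.
unicodeScopes - the Unicode scopes, given as int arrays, that belong to this Cutter's scope.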
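
Since the parameters are not described in the original, the following sketch shows one plausible way the two arguments could be combined. It rests on an assumption that the source does not confirm: that each row of unicodeScopes is an inclusive {start, end} code point range. UnicodeScopeSketch and inScope are hypothetical names and are not part of org.jbox.textCutter.

```java
// Hypothetical helper; not part of org.jbox.textCutter.
class UnicodeScopeSketch {

    // Returns true if codePoint lies in one of the blocks or in one of the ranges.
    // ASSUMPTION: each row of unicodeScopes is an inclusive {start, end} code point range.
    static boolean inScope(int codePoint,
                           Character.UnicodeBlock[] unicodeBlocks,
                           int[][] unicodeScopes) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        if (unicodeBlocks != null) {
            for (Character.UnicodeBlock candidate : unicodeBlocks) {
                if (candidate.equals(block)) {
                    return true;
                }
            }
        }
        if (unicodeScopes != null) {
            for (int[] range : unicodeScopes) {
                if (range.length == 2 && codePoint >= range[0] && codePoint <= range[1]) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments shaped like those of setUnicode(unicodeBlocks, unicodeScopes).
        Character.UnicodeBlock[] blocks = { Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS };
        int[][] scopes = { { 0x3040, 0x309F } }; // the Hiragana range, given explicitly

        System.out.println(inScope(0x4E16, blocks, scopes)); // CJK ideograph, matched by block
        System.out.println(inScope(0x3042, blocks, scopes)); // Hiragana letter, matched by range
        System.out.println(inScope('A', blocks, scopes));    // Latin letter, not in scope
    }
}
```

Under this reading, the block array covers whole named Unicode blocks, while the int arrays allow ranges that do not line up with a single block. The actual semantics should be checked against the implementing classes, such as AbstractCutter.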