org.jbox.textCutter
Class AbstractCutter
java.lang.Object
org.jbox.textCutter.AbstractCutter
- All Implemented Interfaces:
- Cutter
- Direct Known Subclasses:
- SimpleCJKCutter, SimpleENCutter
public abstract class AbstractCutter
- extends java.lang.Object
- implements Cutter
An abstract class that defines the default behavior of a Cutter.
- Version:
- 1.0
- Author:
- YiBin.H
- See Also:
- CutterBox, LanguageFilter
Method Summary
protected abstract java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.String checkedString)
- Cut text into words.
java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.StringBuffer unCheckedString)
- Cut text in a StringBuffer into words.
void setUnicode(java.lang.Character.UnicodeBlock[] unicodeBlocks, int[][] unicodeScopes)
- Specifies the Unicode scope that the Cutter uses to filter text.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
langFilter
protected LanguageFilter langFilter
AbstractCutter
public AbstractCutter()
cutSentenceToWord
protected abstract java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.String checkedString)
- Cut text into words.
- Parameters:
checkedString - text containing only characters that belong to the Unicode scope of the Cutter.
- Returns:
- words of text.
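As an illustration of what a concrete implementation of this method could look like, the following minimal sketch splits already-filtered text into maximal runs of letters. The class name LetterRunCutterSketch is hypothetical and the splitting rule is an assumption made for the example; it is not taken from SimpleENCutter or SimpleCJKCutter, whose actual logic is not documented here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for a concrete cutter; not part of org.jbox.textCutter.
final class LetterRunCutterSketch {

    // Assumed rule: a "word" is a maximal run of letter code points;
    // every other character acts as a separator and is dropped.
    static Collection<String> cutSentenceToWord(String checkedString) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < checkedString.length(); ) {
            int cp = checkedString.codePointAt(i);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                words.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        // Flush the last word if the text does not end with a separator.
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [cut, text, into, words]
        System.out.println(cutSentenceToWord("cut text, into words"));
    }
}
```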
setUnicode
public void setUnicode(java.lang.Character.UnicodeBlock[] unicodeBlocks,
int[][] unicodeScopes)
- Specifies the Unicode scope that the Cutter uses to filter text.
- Specified by:
- setUnicode in interface Cutter
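The page does not describe how the two arguments are interpreted. The sketch below shows one plausible reading, under two assumptions: a character is in scope if its Character.UnicodeBlock appears in unicodeBlocks, or if its code point falls inside one of the inclusive {start, end} pairs in unicodeScopes. The class UnicodeScopeSketch and its inScope method are hypothetical and stand in for whatever LanguageFilter actually does.

```java
// Hypothetical scope check; the {start, end} layout of unicodeScopes is an assumption.
final class UnicodeScopeSketch {
    private final Character.UnicodeBlock[] unicodeBlocks;
    private final int[][] unicodeScopes;

    UnicodeScopeSketch(Character.UnicodeBlock[] unicodeBlocks, int[][] unicodeScopes) {
        // Treat a missing array as "no entries" so either argument may be null.
        this.unicodeBlocks = unicodeBlocks != null ? unicodeBlocks : new Character.UnicodeBlock[0];
        this.unicodeScopes = unicodeScopes != null ? unicodeScopes : new int[0][];
    }

    // Returns true if the code point matches a listed block or an inclusive range.
    boolean inScope(int codePoint) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        for (Character.UnicodeBlock candidate : unicodeBlocks) {
            if (candidate == block) {
                return true;
            }
        }
        for (int[] range : unicodeScopes) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UnicodeScopeSketch scope = new UnicodeScopeSketch(
                new Character.UnicodeBlock[] { Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS },
                new int[][] { { 0x30, 0x39 } }); // ASCII digits 0-9
        System.out.println(scope.inScope(0x4E2D)); // CJK ideograph: true
        System.out.println(scope.inScope('7'));    // digit inside the range: true
        System.out.println(scope.inScope('a'));    // Basic Latin letter: false
    }
}
```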
cutSentenceToWord
public java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.StringBuffer unCheckedString)
- Description copied from interface: Cutter
- Cut text in a StringBuffer into words.
- Specified by:
- cutSentenceToWord in interface Cutter
- Parameters:
unCheckedString - text that may contain characters from different languages.
- Returns:
- a collection of strings containing the words of the text that belong to the Unicode scope of the Cutter.
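The parameter names suggest a two-step flow: the public method receives unchecked text, filters it down to the Cutter's Unicode scope, and hands the checked text to the protected abstract overload. The sketch below models that flow with stand-in classes. CutterFlowSketch, AsciiWordCutterSketch, and the inScope hook are hypothetical, and replacing out-of-scope characters with a space is an assumption; the real class delegates filtering to its langFilter field, whose API is not shown on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

// Hypothetical model of the filter-then-cut flow; not the library's source.
abstract class CutterFlowSketch {

    // Stand-in for the LanguageFilter check (assumed, not the real API).
    protected abstract boolean inScope(int codePoint);

    // Mirrors the protected abstract overload: receives already-filtered text.
    protected abstract Collection<String> cutSentenceToWord(String checkedString);

    // Mirrors the public overload: filter first, then delegate to the subclass.
    public Collection<String> cutSentenceToWord(StringBuffer unCheckedString) {
        String text = unCheckedString.toString();
        StringBuilder checked = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (inScope(cp)) {
                checked.appendCodePoint(cp);
            } else {
                checked.append(' '); // assumption: out-of-scope text becomes a separator
            }
            i += Character.charCount(cp);
        }
        return cutSentenceToWord(checked.toString());
    }
}

// Toy subclass: scope is ASCII, words are whitespace-separated tokens.
final class AsciiWordCutterSketch extends CutterFlowSketch {
    @Override
    protected boolean inScope(int codePoint) {
        return codePoint < 128;
    }

    @Override
    protected Collection<String> cutSentenceToWord(String checkedString) {
        String trimmed = checkedString.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    public static void main(String[] args) {
        StringBuffer mixed = new StringBuffer("hello \u4e16\u754c world");
        // prints [hello, world]
        System.out.println(new AsciiWordCutterSketch().cutSentenceToWord(mixed));
    }
}
```

Keeping the filtering in the base class means each subclass only has to handle text in its own scope, which matches the split between the unCheckedString and checkedString parameters.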