org.jbox.textCutter.CJK
Class SimpleCJKCutter
java.lang.Object
org.jbox.textCutter.AbstractCutter
org.jbox.textCutter.CJK.SimpleCJKCutter
- All Implemented Interfaces:
- Cutter
public class SimpleCJKCutter
- extends AbstractCutter
A concrete Cutter implementation for CJK text.
SimpleCJKCutter uses the dictionary files in the directory "DICT/WORD/" to cut text.
If the directory contains no dictionary file, the text is cut character by character.
Note that the directory "DICT/WORD/" must exist whenever this class is used, even if
it contains no dictionary file; otherwise a DictInitException is thrown.
Note: Only a Chinese dictionary is provided with jbox in the current version.
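Because the "DICT/WORD/" directory is required even when it is empty, one option is to create it before the cutter is used. The following is a minimal sketch, not part of jbox. It assumes the directory is resolved against a base directory such as the working directory, which this page does not state. The names DictDirSketch and ensureDictDir are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of jbox.
final class DictDirSketch {

    // Creates <base>/DICT/WORD if it does not exist yet and returns its path.
    // An existing directory is left untouched.
    static Path ensureDictDir(Path base) {
        Path dir = base.resolve("DICT").resolve("WORD");
        try {
            return Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: jbox looks for DICT/WORD/ relative to the working directory.
        Path dir = ensureDictDir(Path.of(""));
        System.out.println("Dictionary directory: " + dir.toAbsolutePath());
        // With jbox on the classpath, a SimpleCJKCutter could be created after this point.
    }
}
```

With an empty directory in place, the class description implies that text is cut character by character rather than failing with a DictInitException.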
- Version:
- 1.0
- Author:
- YiBin.H
- See Also:
- CutterBox, Dict, LanguageFilter
Method Summary
java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.String s)
    Cut text into words.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
dict
protected Dict dict
SimpleCJKCutter
public SimpleCJKCutter()
cutSentenceToWord
public java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.String s)
- Description copied from class: AbstractCutter
- Cut text into words.
- Specified by:
- cutSentenceToWord in class AbstractCutter
- Parameters:
- s - text whose characters fall within the Unicode range handled by this Cutter.
- Returns:
- the words of the text.
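
The class description says that text is cut using dictionary words and falls back to single characters when no dictionary is available. This page does not document the matching algorithm itself, so the sketch below uses forward maximum matching as a stand-in technique to show that kind of behavior. It is not the jbox implementation. CutSketch, cut, and maxLen are hypothetical names, and the in-memory Set replaces the dictionary files.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not part of jbox.
final class CutSketch {

    // Forward maximum matching: at each position, take the longest dictionary
    // word of at most maxLen chars; if none matches, emit a single char.
    // Works on UTF-16 chars, so it assumes text within the Basic Multilingual Plane.
    static Collection<String> cut(String text, Set<String> dict, int maxLen) {
        if (maxLen < 1) {
            throw new IllegalArgumentException("maxLen must be at least 1");
        }
        List<String> words = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxLen);
            // Shrink the window until it is a dictionary word or one char long.
            while (end > i + 1 && !dict.contains(text.substring(i, end))) {
                end--;
            }
            words.add(text.substring(i, end));
            i = end;
        }
        return words;
    }

    public static void main(String[] args) {
        Set<String> dict = Set.of("中国", "人民");
        System.out.println(cut("中国人民", dict, 4));     // prints [中国, 人民]
        System.out.println(cut("中国人民", Set.of(), 4)); // prints [中, 国, 人, 民]
    }
}
```

The second call uses an empty dictionary and shows the per-character fallback that the class description mentions for an empty "DICT/WORD/" directory.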