org.jbox.textCutter.CJK
Class SimpleCJKCutter

java.lang.Object
  extended by org.jbox.textCutter.AbstractCutter
      extended by org.jbox.textCutter.CJK.SimpleCJKCutter
All Implemented Interfaces:
Cutter

public class SimpleCJKCutter
extends AbstractCutter

A concrete Cutter implementation for CJK text.

SimpleCJKCutter uses the dictionary files in the directory "DICT/WORD/" to cut text. If the directory contains no dictionary files, the text is cut character by character. Note that the directory "DICT/WORD/" must exist whenever this class is used, even if it contains no dictionary files; otherwise a DictInitException is thrown.

Note: Only a Chinese dictionary is provided with jbox in the current version.
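
Because the directory must exist before the cutter is constructed, an application may want to create it up front. The following is a minimal sketch, not part of jbox: the class DictDirSetup and the method ensureWordDictDir are hypothetical names, and resolving "DICT/WORD/" against a caller-supplied base directory (for example the working directory) is an assumption, since this page does not say where the path is resolved.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; it is not part of jbox.
final class DictDirSetup {

    // Creates <base>/DICT/WORD if it is missing and returns its path.
    // Calling it again when the directory already exists is harmless.
    static Path ensureWordDictDir(Path base) {
        Path dir = base.resolve("DICT").resolve("WORD");
        try {
            return Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the relative path is resolved against the working directory.
        Path dir = ensureWordDictDir(Paths.get(""));
        System.out.println("Dictionary directory ready: " + dir.toAbsolutePath());
        // A SimpleCJKCutter would be constructed after this point.
    }
}
```

With an empty directory in place, the class description above implies that the cutter falls back to cutting character by character rather than throwing DictInitException.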

Version:
1.0
Author:
YiBin.H
See Also:
CutterBox, Dict, LanguageFilter

Field Summary
protected  Dict dict
           
 
Fields inherited from class org.jbox.textCutter.AbstractCutter
langFilter
 
Constructor Summary
SimpleCJKCutter()
           
 
Method Summary
 java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.String s)
          Cut text into words.
 
Methods inherited from class org.jbox.textCutter.AbstractCutter
cutSentenceToWord, setUnicode
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

dict

protected Dict dict
Constructor Detail

SimpleCJKCutter

public SimpleCJKCutter()
Method Detail

cutSentenceToWord

public java.util.Collection<java.lang.String> cutSentenceToWord(java.lang.String s)
Description copied from class: AbstractCutter
Cut text into words.

Specified by:
cutSentenceToWord in class AbstractCutter
Parameters:
s - text containing characters that belong to the Unicode range of the Cutter.
Returns:
the words of the text.
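
The class description says that text is cut with a dictionary when one is available and character by character otherwise. The sketch below illustrates that behaviour in isolation. It is not the jbox implementation: this page does not state the matching strategy, so greedy longest-match-first is assumed here, and the names CutSketch and cut are hypothetical. String literals use Unicode escapes so that the file compiles regardless of source encoding.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the jbox implementation.
final class CutSketch {

    // Greedy longest-match-first cut. At each position the longest dictionary
    // word starting there is taken; if none matches, a single char is emitted.
    // Works on UTF-16 code units, so supplementary characters are not handled.
    static List<String> cut(String s, Set<String> dict) {
        int maxLen = 1;
        for (String w : dict) {
            maxLen = Math.max(maxLen, w.length());
        }
        List<String> words = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            int end = i + 1; // fallback: cut a single char
            for (int len = Math.min(maxLen, s.length() - i); len > 1; len--) {
                if (dict.contains(s.substring(i, i + len))) {
                    end = i + len;
                    break;
                }
            }
            words.add(s.substring(i, end));
            i = end;
        }
        return words;
    }

    public static void main(String[] args) {
        String text = "\u6211\u7231\u5317\u4eac"; // "wo ai Beijing" (I love Beijing)
        Set<String> dict = new HashSet<>();
        dict.add("\u5317\u4eac"); // "Beijing"

        // With the dictionary: two single chars, then the two-char word.
        System.out.println(cut(text, dict));
        // With an empty dictionary: four single chars.
        System.out.println(cut(text, Collections.<String>emptySet()));
    }
}
```

With a dictionary containing only 北京, the input 我爱北京 is cut into 我, 爱, 北京; with an empty dictionary it is cut into 我, 爱, 北, 京.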