java.lang.Object
  org.jbox.textCutter.util.NoiseFilter
public class NoiseFilter
A filter that removes noise words. CutterBox uses it to filter noise words from the text it cuts.
All noise words must be defined in a file in the directory "DICT/NOISE/". For
example, to filter the word "fool", add it to an existing file in
"DICT/NOISE/" or to a new file there, such as "myNoise.txt".
The word "fool" will then be ignored when text is cut.
CutterBox invokes this filter when CutterBox.cutPage(Page) is called.
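The filtering behaviour described above can be illustrated with a minimal sketch. It does not use the real NoiseFilter, Dict, or Word classes: `NoiseFilterSketch` is a hypothetical stand-in that assumes the noise dictionary has already been loaded into an in-memory set of strings, and it works on plain strings instead of Word objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in; not the real org.jbox.textCutter.util.NoiseFilter.
final class NoiseFilterSketch {
    // Assumption: the noise dictionary is modelled as an in-memory set of strings
    // rather than being read from files under "DICT/NOISE/".
    private final Set<String> noise;

    NoiseFilterSketch(Collection<String> noiseWords) {
        this.noise = new HashSet<>(noiseWords);
    }

    // Removes every word found in the noise set, modifying the collection in place.
    void filterNoise(Collection<String> words) {
        words.removeIf(noise::contains);
    }

    public static void main(String[] args) {
        NoiseFilterSketch filter = new NoiseFilterSketch(Arrays.asList("fool"));
        List<String> words = new ArrayList<>(Arrays.asList("a", "fool", "and", "his", "money"));
        filter.filterNoise(words);
        System.out.println(words); // prints "[a, and, his, money]"
    }
}
```

Like the documented filterNoise method, the sketch returns nothing and mutates the collection passed to it, so the caller must supply a modifiable collection.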
See Also: Dict, CutterBox
Field Summary | |
---|---|
protected Dict | noise |
Constructor Summary | |
---|---|
NoiseFilter() | Constructs a new NoiseFilter with the default path "DICT/NOISE". |
NoiseFilter(java.lang.String path) | Constructs a new NoiseFilter with the given path. |
Method Summary | |
---|---|
void | filterNoise(java.util.Collection<Word> words): Filters noise words by removing the words defined in the noise dictionary from the specified collection. |
void | filterRedundancy(java.util.Collection<Word> unFilteredWords): Filters redundant words. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected Dict noise
Constructor Detail |
---|
public NoiseFilter()
public NoiseFilter(java.lang.String path)
path - path of the noise dictionary file or directory.
Method Detail |
---|
public void filterNoise(java.util.Collection<Word> words)
words - Collection containing the Word objects to be filtered.
public void filterRedundancy(java.util.Collection<Word> unFilteredWords)
Filters redundant words. If two Word objects have the same
string, one of them is removed from the collection and its
locations are added to the other. For example, suppose
there are two words with the same string "fun", the first with
locations {1,2} and the second with {5}. After this method is invoked, the second
word has been removed and the locations of the first word are {1,2,5}.
unFilteredWords
-
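The merge described for filterRedundancy can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `SketchWord` is a hypothetical stand-in for Word that holds only a string and a list of integer locations, and the sketch assumes the first occurrence of each string is the one kept, as in the "fun" example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class RedundancySketch {
    // Hypothetical stand-in for the library's Word class: a string plus its locations.
    static final class SketchWord {
        final String text;
        final List<Integer> locations;

        SketchWord(String text, Integer... locations) {
            this.text = text;
            this.locations = new ArrayList<>(Arrays.asList(locations));
        }
    }

    // Keeps the first word seen for each string, folds the locations of later
    // duplicates into it, and removes those duplicates from the collection.
    static void filterRedundancy(Collection<SketchWord> words) {
        Map<String, SketchWord> firstSeen = new HashMap<>();
        for (Iterator<SketchWord> it = words.iterator(); it.hasNext(); ) {
            SketchWord word = it.next();
            SketchWord kept = firstSeen.putIfAbsent(word.text, word);
            if (kept != null) {
                kept.locations.addAll(word.locations);
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<SketchWord> words = new ArrayList<>();
        words.add(new SketchWord("fun", 1, 2));
        words.add(new SketchWord("fun", 5));
        filterRedundancy(words);
        System.out.println(words.size() + " " + words.get(0).locations); // prints "1 [1, 2, 5]"
    }
}
```

Removing through the iterator avoids a ConcurrentModificationException while the collection is being traversed.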