public interface WebSpider
The root interface of WebSpider. It is used to crawl the Internet and fetch pages.
Method Summary

Return type | Method | Description |
---|---|---|
int | getMaxPageNum() | Return max page number defined in configuration file. |
boolean | hashNext() | Check if there is a next page to visit. |
Page | next() | Visit and return the next Page object. |
void | setMaxPageNum(int maxPageNum) | Set max number of pages that the spider will crawl. |
void | setRules(java.lang.String[] rules) | Set crawl rules of WebSpider. |
void | setStartUrls(java.lang.String[] startUrls) | Set start URLs of WebSpider. |
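
The methods listed in the summary suggest an iterator-style usage: configure the spider, then loop while hashNext() reports another page. The sketch below illustrates that pattern only. The interface is copied locally from the signatures in the summary, while the Page class with its getUrl() accessor, the InMemorySpider implementation, and the SpiderDemo helper are hypothetical stand-ins that are not part of the documented library. The stub performs no network access, ignores the rules, and simply returns whatever limit was set, whereas the real getMaxPageNum() is documented as reading the configuration file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the library's Page type, which is not documented here.
final class Page {
    private final String url;
    Page(String url) { this.url = url; }
    String getUrl() { return url; }
}

// Local copy of the signatures listed in the Method Summary.
interface WebSpider {
    void setStartUrls(String[] startUrls);
    void setRules(String[] rules);
    void setMaxPageNum(int maxPageNum);
    int getMaxPageNum();
    boolean hashNext();   // spelled as in the documented API
    Page next();
}

// Hypothetical in-memory implementation: "visits" the start URLs in order
// without any network access, and ignores the rules.
final class InMemorySpider implements WebSpider {
    private final List<String> queue = new ArrayList<>();
    private int maxPageNum = Integer.MAX_VALUE;
    private int visited = 0;

    public void setStartUrls(String[] startUrls) {
        queue.clear();
        queue.addAll(Arrays.asList(startUrls));
        visited = 0;
    }
    public void setRules(String[] rules) { /* ignored in this sketch */ }
    public void setMaxPageNum(int maxPageNum) { this.maxPageNum = maxPageNum; }
    public int getMaxPageNum() { return maxPageNum; }
    public boolean hashNext() {
        // Stop at the page limit or when the queue is exhausted.
        return visited < maxPageNum && visited < queue.size();
    }
    public Page next() {
        if (!hashNext()) throw new NoSuchElementException("no more pages");
        return new Page(queue.get(visited++));
    }
}

final class SpiderDemo {
    // Drain a spider with the iterator-style loop the interface suggests.
    static List<Page> crawlAll(WebSpider spider) {
        List<Page> pages = new ArrayList<>();
        while (spider.hashNext()) {
            pages.add(spider.next());
        }
        return pages;
    }

    public static void main(String[] args) {
        WebSpider spider = new InMemorySpider();
        spider.setStartUrls(new String[] {
            "http://example.com/a", "http://example.com/b", "http://example.com/c"});
        spider.setRules(new String[] {".*"});
        spider.setMaxPageNum(2);
        for (Page page : crawlAll(spider)) {
            System.out.println(page.getUrl());
        }
    }
}
```

With the limit set to 2, the loop in this stub stops after the first two start URLs even though three were supplied. Whether a real implementation enforces the limit inside hashNext() or elsewhere is not stated in this page.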
Method Detail

void setStartUrls(java.lang.String[] startUrls)
startUrls - String array containing start URLs of WebSpider.

void setRules(java.lang.String[] rules)
rules - String array containing rules written in REGEXP.

void setMaxPageNum(int maxPageNum)
maxPageNum - max number of pages the WebSpider will crawl.

boolean hashNext()

Page next()

int getMaxPageNum()
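
The rules passed to setRules are described only as being written in REGEXP, so how a spider applies them is not specified here. As one possible reading, the sketch below treats each rule as a java.util.regex pattern and accepts a URL when at least one rule matches the whole URL. The RuleFilter class, its accepts method, and the sample rules are hypothetical and are not part of the documented API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the documented API.
final class RuleFilter {
    private final List<Pattern> patterns = new ArrayList<>();

    RuleFilter(String[] rules) {
        // Compile each rule once so repeated checks are cheap.
        for (String rule : rules) {
            patterns.add(Pattern.compile(rule));
        }
    }

    // Assumption: a URL is accepted when at least one rule matches the whole URL.
    boolean accepts(String url) {
        for (Pattern p : patterns) {
            if (p.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] rules = {"https?://example\\.com/news/.*", ".*\\.html"};
        RuleFilter filter = new RuleFilter(rules);
        System.out.println(filter.accepts("http://example.com/news/today"));  // true
        System.out.println(filter.accepts("http://example.org/index.html"));  // true
        System.out.println(filter.accepts("http://example.org/about"));       // false
    }
}
```

A real implementation might instead use partial matching (Matcher.find()) or treat rules as exclusions; the whole-URL match above is only one plausible choice.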