public interface WebSpider
The root interface of WebSpider. It is used to crawl the Internet and fetch pages.
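The interface reads like an iterator over crawled pages: configure it, then loop with `hashNext()` and `next()`. Below is a minimal sketch of that loop. The `WebSpider` interface is copied locally from the signatures on this page. The `Page` stand-in with its `url` field, the `InMemorySpider` class, and the `crawl` helper are hypothetical and exist only for illustration. The sketch does no network access, ignores the rules, and assumes `getMaxPageNum()` returns the value passed to `setMaxPageNum` rather than reading a configuration file.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the library's Page type; its real members are not documented here.
class Page {
    final String url;
    Page(String url) { this.url = url; }
}

// Local copy of the interface, using the signatures listed on this page.
interface WebSpider {
    void setStartUrls(String[] startUrls);
    void setRules(String[] rules);
    void setMaxPageNum(int maxPageNum);
    boolean hashNext();
    Page next();
    int getMaxPageNum();
}

// Hypothetical implementation: "visits" only the start URLs, in order, without fetching anything.
class InMemorySpider implements WebSpider {
    private final Deque<String> queue = new ArrayDeque<>();
    private String[] rules = new String[0];
    private int maxPageNum = Integer.MAX_VALUE;
    private int visited = 0;

    public void setStartUrls(String[] startUrls) {
        queue.clear();
        queue.addAll(Arrays.asList(startUrls));
    }
    // Rules are stored but not applied in this sketch.
    public void setRules(String[] rules) { this.rules = rules.clone(); }
    public void setMaxPageNum(int maxPageNum) { this.maxPageNum = maxPageNum; }
    // A next page exists while the page limit is not reached and URLs remain.
    public boolean hashNext() { return visited < maxPageNum && !queue.isEmpty(); }
    public Page next() {
        if (!hashNext()) throw new NoSuchElementException("no more pages");
        visited++;
        return new Page(queue.poll());
    }
    public int getMaxPageNum() { return maxPageNum; }
}

class WebSpiderDemo {
    // Drains the spider with the hashNext()/next() loop and collects the visited URLs.
    static List<String> crawl(WebSpider spider) {
        List<String> urls = new ArrayList<>();
        while (spider.hashNext()) {
            urls.add(spider.next().url);
        }
        return urls;
    }

    public static void main(String[] args) {
        WebSpider spider = new InMemorySpider();
        spider.setStartUrls(new String[] {
            "http://example.com/a", "http://example.com/b", "http://example.com/c"});
        spider.setRules(new String[] {".*"});
        spider.setMaxPageNum(2);
        System.out.println(crawl(spider));
    }
}
```

With a page limit of 2 and three start URLs, the loop stops after the second page because `hashNext()` checks the limit before the queue.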
| Method Summary | |
|---|---|
| int | getMaxPageNum(): Return the maximum page number defined in the configuration file. |
| boolean | hashNext(): Check if there is a next page to visit. |
| Page | next(): Visit and return the next Page object. |
| void | setMaxPageNum(int maxPageNum): Set the maximum number of pages that the spider will crawl. |
| void | setRules(java.lang.String[] rules): Set the crawl rules of WebSpider. |
| void | setStartUrls(java.lang.String[] startUrls): Set the start URLs of WebSpider. |
| Method Detail |
|---|
void setStartUrls(java.lang.String[] startUrls)
Parameters: startUrls - String array containing the start URLs of WebSpider.

void setRules(java.lang.String[] rules)
Parameters: rules - String array containing rules written as regular expressions.

void setMaxPageNum(int maxPageNum)
Parameters: maxPageNum - maximum number of pages the WebSpider will crawl.

boolean hashNext()

Page next()

int getMaxPageNum()
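
The `rules` parameter of `setRules` is described only as an array of regular expressions, so how an implementation applies them is not specified here. The sketch below shows one plausible reading, assuming each rule is an allow pattern that must match the whole URL. The `RuleMatcher` class, the `matchesAnyRule` helper, and the sample rules are hypothetical.

```java
import java.util.regex.Pattern;

class RuleMatcher {
    // Returns true if the entire URL matches at least one rule (assumed allow-list semantics).
    static boolean matchesAnyRule(String url, String[] rules) {
        for (String rule : rules) {
            if (Pattern.matches(rule, url)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical rules: anything under /docs/ on example.com, or any .html page.
        String[] rules = {"https?://example\\.com/docs/.*", ".*\\.html"};
        System.out.println(matchesAnyRule("http://example.com/docs/intro", rules));
        System.out.println(matchesAnyRule("http://example.com/img/logo.png", rules));
    }
}
```

`Pattern.matches` compiles the pattern on every call. A spider that checks many URLs would more likely compile each rule once with `Pattern.compile` and reuse the resulting `Pattern` objects.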