Before starting our search engine journey, you should do the
following:
1. Make sure your JDK is version 5.0 or above;
2. Copy the Hibernate and log4j configuration files to your class path;
3. Copy the jbox configuration file and the "DICT" file to your
program path;
Your project tree might look like the one below (default paths):
4. Set the jbox configuration as shown below:
<spider class="org.jbox.spider.htmlSpider.SimpleSpider">
    <maxPageNum>10</maxPageNum>
    <startUrls>
        <property name="URL">http://localhost</property>
    </startUrls>
    <crawlRules>
        <property name="Rule">http://.*</property>
    </crawlRules>
</spider>
For more details about the jbox configuration, look at this.
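To get a feel for what a crawl rule such as http://.* would select, here is a minimal sketch. It assumes that each "Rule" value is a Java regular expression that must match the whole URL; jbox's actual matching logic may differ. The class CrawlRuleSketch and the method isAllowed are hypothetical names used only for illustration and are not part of jbox.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class CrawlRuleSketch {

    // Hypothetical helper: returns true if the URL fully matches
    // at least one rule. Assumes rules are Java regular expressions.
    static boolean isAllowed(List<String> rules, String url) {
        for (String rule : rules) {
            if (Pattern.matches(rule, url)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The single rule from the configuration above.
        List<String> rules = Arrays.asList("http://.*");

        System.out.println(isAllowed(rules, "http://localhost"));  // true
        System.out.println(isAllowed(rules, "ftp://localhost"));   // false
    }
}
```

Under this assumption, a rule like http://.* lets the spider follow any plain HTTP link, so a narrower rule (for example one limited to a single host) may be preferable for a first test crawl.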
5. Configure Hibernate for your database.
Make sure you have created the tables "Page" and "Word". You can find the
CREATE statements in "MYSQL.txt" in the SQL package.
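A quick way to confirm that the configuration files from step 2 really are on the class path is sketched below. The file names hibernate.cfg.xml and log4j.properties are the conventional defaults for Hibernate and log4j and are assumptions here; substitute the names your project actually uses. ClasspathCheck and isOnClasspath are hypothetical names, not part of jbox.

```java
public class ClasspathCheck {

    // Returns true if the named resource can be found on the class path.
    static boolean isOnClasspath(String resourceName) {
        return ClassLoader.getSystemResource(resourceName) != null;
    }

    public static void main(String[] args) {
        // Assumed default file names; adjust to match your setup.
        String[] required = { "hibernate.cfg.xml", "log4j.properties" };

        for (String name : required) {
            String status = isOnClasspath(name) ? "found" : "MISSING";
            System.out.println(name + ": " + status);
        }
    }
}
```

If a file is reported as missing, copy it into a directory that is on the class path (or add its directory to the class path) before continuing.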
If you have finished the steps above, let's begin our journey: Click Here