Table of Contents

  • Introduction
  • Administrator
    · Introduction
    · Directories
    · Installing
    · Configuring
    · New profile wizard
    · Unleash the crawler
    · Logs
    · Advanced profile configuration
  • User
  • Appendix
  • Unleash the crawler
    Select the Crawlers tab at the top of the page to enter the Crawler Control page. From this page you can start and stop crawlers, and view their current status. First on the page is a list of crawlers ready to start; below it is the list of active crawlers.

    • At this point your test profile should be in the list of ready crawlers. Select Launch Now to start a crawler.

    • Now, your profile should appear in the list of "Active Crawlers" with the status "starting up". Click Reload in your browser to update this information.

    • After a while (click Reload a few times, every few seconds) the status of your crawler will change to "Running!" and a few more options will appear.
      If this does not happen within a few seconds, something has probably gone wrong. The most likely cause is that Pike failed to start. Check:

      • That you have specified the correct Engine home location path in the IntraSeek Module Configuration.

      • Challenger's log (logs/debug/default.1) for any hints about what went wrong. (A short sketch for scanning this log from the command line follows this list.)

      If you find the error, click Zap! to return the crawler to the "Ready to launch" list.

    • If all went well, the crawler is now running. You can select either View status to see a summary of the progress, or View log to see a more verbose log.

      • If you select Zap!, the crawler will be killed by force and the entire data gathering aborted. The crawler will then be "Ready to launch" again, and none of the information gathered will be used. (The temporary files will also be deleted.)

      • If you select Halt!, the crawler will instead be stopped. It will then save all its information to the data bases, and also save a freeze-file (ID.scheduler_freeze.is) so that the data gathering can be continued later. It will say "Halting... (saving)" in the status box, and then "Halted" when everything has been saved.
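
    If you need to check Challenger's log outside the browser, a small command-line sketch along the following lines may help. The log path logs/debug/default.1 comes from the check-list above; the script itself and its keyword list are illustrative assumptions, not part of IntraSeek.

      # scan_debug_log.py - hypothetical helper, not part of IntraSeek.
      # Prints lines near the end of Challenger's debug log that hint at a failure.
      from pathlib import Path

      LOG = Path("logs/debug/default.1")        # Challenger's debug log, relative to the server directory
      KEYWORDS = ("error", "failed", "exception", "backtrace")   # assumed failure indicators

      def tail_errors(path, last=200):
          """Return lines among the last `last` log lines that contain a keyword."""
          lines = path.read_text(errors="replace").splitlines()[-last:]
          return [line for line in lines if any(k in line.lower() for k in KEYWORDS)]

      if __name__ == "__main__":
          if not LOG.exists():
              print("No log found at", LOG, "- check the Engine home location path.")
          else:
              hits = tail_errors(LOG)
              print("\n".join(hits) if hits else "No obvious errors in the last 200 lines.")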

    Afterwards
    When the crawler has finished running and indexed all the available web pages, two data bases named "n.ID.pages.yabu" and "n.ID.index.yabu", and a flag-file "ID.new.flag", will have been created (where ID is the name of the profile).

    These will be renamed to "a.ID.pages.yabu" and "a.ID.index.yabu" when the search engine finds the flag-file, whereupon the flag-file will be deleted and the new data bases put to use.

    More information on these files can be found in the Storage of Data Bases chapter of the technical documentation.
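
    The search engine performs this swap by itself, so nothing needs to be done by hand; the sketch below merely illustrates the rename-on-flag idea described above. The file names are taken from this section, while the script, its function name and its directory parameter are assumptions made for the example.

      # swap_databases.py - illustrative sketch only; the search engine does this itself.
      import os

      def swap_new_databases(data_dir, profile_id):
          """If ID.new.flag exists, rename n.ID.*.yabu to a.ID.*.yabu and remove the flag."""
          flag = os.path.join(data_dir, f"{profile_id}.new.flag")
          if not os.path.exists(flag):
              return False                          # nothing new to pick up yet
          for part in ("pages", "index"):
              new = os.path.join(data_dir, f"n.{profile_id}.{part}.yabu")
              active = os.path.join(data_dir, f"a.{profile_id}.{part}.yabu")
              os.replace(new, active)               # the old active data base is overwritten
          os.remove(flag)                           # the flag-file is deleted once the swap is done
          return True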

    You will notice a slight pause the first time you use the new data base. This is because the IntraSeek module replaces the old data base files with the new ones. The delay can last anywhere from seconds to minutes, depending on file system speed and data base size.

    Logs have been generated as well. Select the Logs tab in the configuration interface to see these. The main crawler log tells you how different crawlers have been started and stopped. This log can be cleared by selecting Delete the main log.