Linux Headquarters

I've installed Linux… now what???

Introduction
htdig is a webpage search engine licensed under the GNU Public License. It uses a very simple configuration file to allow it to search only the webpages you specify. For example, you can exclude the cgi-bin or a testing directory from the search engine. In addition to installing it on a webserver, some programs use it as a search engine plugin such as Glade, the GTK+ User Interface Builder. In addition, it will create a searchable database of any website. You just supply to URL.
Installing htdig

  1. Download the latest version from the htdig ftp server.
  2. tar -xvfz htdig-3.1.5.tar.gz
  3. cd htdig-3.1.5
  4. ./configure
  5. make
  6. make install

 

Configuring htdig

 

Once you have htdig installed, you must make a few changes to the configuration file and the HTML templates into which the search results are embedded.

Configuration File

The configuration file for htdig is located at /opt/www/htdig/conf/htdig.conf. It is pretty self-explanitory. The main attributes you need to configure are as follows. It will work if you leave the defaults for the other options or change them if you wish.

Attribute Value Example
start_url URL of your site http://www.mywebsite.com
exclude_urls Directories you do not want searched separated by white spaces /cgi-bin/ /testing/
maintainer Email address of maintainer maintainer@mywebsite.com
search_results_header HTML file to be used as header of search results. Only use this if you don’t want to use the default location for the header file: /opt/www/htdig/common/header.html /home/httpd/search/header.html
search_results_footer HTML file to be used as footer of search results. Only use this if you don’t want to use the default location for the header file: /opt/www/htdig/common/footer.html /home/httpd/search/footer.html
nothing_found_file HTML file to be displayed if there is no match to search string entered. Only use this if you don’t want to use the default location for the header file: /opt/www/htdig/common/nomatch.html /home/httpd/search/nomatch.html
syntax_error_file HTML file to be displayed if there is a syntax error in the search string entered. Only use this if you don’t want to use the default location for the header file: /opt/www/htdig/common/syntax.html /home/httpd/search/syntax.html

HTML Templates

 

If you don’t want to use the default look-and-feel of htdig, you can edit the following files to use the look-and-feel of your website. The paths may be different if you choose to change the paths of them in your configuration file.

  • /opt/www/htdig/common/header.html
  • /opt/www/htdig/common/footer.html
  • /opt/www/htdig/common/nomatch.html
  • /opt/www/htdig/common/syntax.html

Post-installation and configuration

  1. Next, you must setup the search database by running the script /opt/www/htdig/bin/rundig.
  2. Copy the default search.html and images from /opt/www/htdocs/htdig to a directory named htdig off of your webRoot. If the images are not in this directory, they will not appear unless you configure it otherwise it htdig.conf.
  3. Copy /opt/www/cgi-bin/htsearch to the cgi-bin for your webserver.
  4. Test the search engine by opening search.html in your browser and entering a search string.
  5. Because the search engine uses a database to return results, the database must be rebuilt with the rundig command used in step 1 every time any pages are added to the website.
  6. If you want to configure anything else, refer the the htdig website. Pretty much everything is configurable with htdig.

Where to download

What is Related

Comments are closed.

Subscribe to email feed

  • RSS
  • Delicious
  • Digg
  • Facebook
  • Twitter
  • Linkedin
  • Youtube

Dedicated Linux Serv

Linux is the popular system nowadays, offering all the benefits ...

StarOffice 5.1

Introduction StarOffice 5.1 is a complete office suite with a word ...

AxY FTP

Introduction AxY FTP, formerly known as wxFTP, is a graphical FTP ...

Adobe Acrobat PDF Re

Introduction Many of you are probably familar with Adobe Acrobat Reader ...

Macromedia Shockwave

Introduction Macromedia has developed a version of its popular Shockwave plugin ...

Twitter updates

No public Twitter messages.