Desktop Search

Frequently Asked Questions

  1. What is Pathena's basic concept of operation?
  2. How hard is it to install?
  3. What file types can Pathena index?
  4. Where does the name Pathena come from?

1. What is Pathena's basic concept of operation?

Pathena makes use of full-text indexing and search techniques. Its software architecture is based on the following components:

  • Server -- implemented using the PostgreSQL database management system, including the Tsearch2 module for full-text indexing. The server runs quietly as a background process on your desktop machine.
  • Client -- implemented as Python software using the Tk GUI toolkit. The client provides the primary interface, allowing users to invoke queries and display results. Additionally, users may view file contents, launch external viewers, invoke shell commands, and perform other tasks based on query results.
  • Scripts -- includes assorted items for conducting file indexing and performing various maintenance chores.
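To make the idea of full-text indexing concrete, here is a minimal sketch of an inverted index in plain Python. It is only an illustration of the general technique: Pathena itself delegates indexing to PostgreSQL and Tsearch2, and the function names and sample paths below are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each word to the set of document paths that contain it."""
    index = defaultdict(set)
    for path, text in documents.items():
        for word in tokenize(text):
            index[word].add(path)
    return index

def search(index, query):
    """Return the paths containing every word in the query (AND semantics)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical sample data standing in for indexed file contents.
docs = {
    "/home/user/notes.txt": "Meeting notes about the budget review",
    "/home/user/todo.txt": "Review the budget and send notes",
    "/home/user/recipe.txt": "Soup recipe with leeks",
}
index = build_index(docs)
```

Calling search(index, "budget notes") returns the two paths whose text contains both words. A real full-text engine adds stemming, stop words, and ranking on top of this basic structure.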

Currently, Pathena requires that file indexing be performed as a separate, explicitly managed activity. Users specify indexing profiles to indicate which portions of their file systems are subject to indexing and which file types to include. Indexing can be performed on demand or on a regular schedule. Future versions will aim for tighter integration with the underlying file system to reduce the indexing lag.
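As a rough illustration of what an indexing profile expresses, the sketch below selects files by directory root and file extension. The profile layout, the sample path, and the select_files helper are assumptions made for this example; Pathena's actual profile format is not described here and may differ.

```python
import os

# Hypothetical profile layout: which directory trees to scan and
# which file extensions to include.
profile = {
    "roots": ["/home/user/projects"],
    "extensions": {".txt", ".py", ".tex"},
}

def select_files(profile):
    """Yield paths under the profile's roots whose extension is listed."""
    for root in profile["roots"]:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                # Compare extensions case-insensitively.
                ext = os.path.splitext(name)[1].lower()
                if ext in profile["extensions"]:
                    yield os.path.join(dirpath, name)
```

A scheduled run (for example, from cron) could iterate over select_files(profile) and hand each path to the indexer, while an on-demand run would do the same when the user asks for it.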

Pathena is designed to index individual files as maintained by the operating system of a desktop or workstation. It is not intended to access application-specific data stored in proprietary database formats. Nor does it attempt to index your browser histories or track other types of recent access patterns. It works best when indexing data with a certain amount of permanence rather than ephemeral information. Also, Pathena's operation is strictly local; it makes no attempt to access any Internet sites or initiate external communication of any kind.

2. How hard is it to install?

Installation is trivial on many platforms (e.g., recent versions of Linux). Fedora Core 2 and later include suitable implementations of all required packages. If all RPMs from a Fedora Core 2, 3, or 4 distribution have been installed, Pathena should run out of the box. Many other Linux and BSD distributions should be similarly equipped. In any case, the installation script tells you if any required packages are missing. If you don't want to proceed, you can stop there, having lost only a minute of your day.

Pathena is written in Python, along with a little SQL and a few shell scripts. These components don't need lengthy compilation. If all required packages are present on your system, the build process takes only a minute or two.

3. What file types can Pathena index?

Currently, the following file types are supported:

  • Plain text files.
  • Source files for popular programming languages. Identifier syntax is taken into account: both complete identifiers and their subwords are indexed.
  • Markup files (html, xml, tex).
  • Archive types (tar, zip, rpm). Only path names of the archived files are indexed.
  • Encoded document types (PDF, PostScript, OO Writer, OO Impress, OO Calc, MS Word).
  • E-mail messages stored in 'maildir' format (attachments excluded).
  • File system nodes (directories and symbolic links).
  • Executable scripts.

More file types and better file converters will be added over time.
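The note above about indexing identifiers together with their subwords can be sketched as follows. The splitting rules (underscores, camelCase boundaries, digit runs) and the identifier_terms name are assumptions for illustration; Pathena's actual rules may differ.

```python
import re

def identifier_terms(identifier):
    """Return the full identifier plus its subwords, all lowercased."""
    subwords = []
    # Split on underscores first, then on camelCase boundaries and digit runs.
    for chunk in identifier.split("_"):
        subwords.extend(
            re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", chunk)
        )
    # The complete identifier comes first, followed by unique subwords.
    terms = [identifier.lower()]
    for word in subwords:
        word = word.lower()
        if word not in terms:
            terms.append(word)
    return terms
```

Under these rules, identifier_terms("max_file_size") yields the whole identifier followed by "max", "file", and "size", so a query for either the full name or one of its parts can match the source file.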

4. Where does the name Pathena come from?

Pathena was named for one of the lesser known figures from Geek mythology.

  Administrator: Ben Di Vito
  Last Updated: 31 October 2005