The following components are involved in a crawl:
- Content Host - This is the server that stores the content your indexer is crawling. For example, if you have a content source that crawls a SharePoint site, the content host is the web front-end server that hosts the site. If you are crawling a file share, the content host is the server where the file share is physically located.
- MSSDMN.EXE - When a crawl is running, you can see this process at work in Task Manager on the indexer. This process is called the "Search Daemon". When a crawl starts, this process is responsible for connecting to the content host (using the protocol handler and IFilter), requesting content from the content host, and crawling that content. The Search Daemon has the biggest impact on the indexer in terms of resource utilization because it does most of the work.
- MSSearch.exe - Once the MSSDMN.EXE process is done crawling the content, it passes the crawled content to MSSearch.exe (this process also runs on the indexer, so you should see it in Task Manager during a crawl). MSSearch.exe does two things: it writes the crawled content to the indexer's disk, and it passes the metadata properties of documents discovered during the crawl to the backend database. Crawling metadata properties (document title, author, etc.) is what allows the use of Advanced Search in MOSS 2007. Unlike the crawled content index, which is stored on the physical disk of the indexer, crawled metadata properties are stored in the database.
- SQL Server (Search Database) - The search database stores information such as the current status of the crawls and the metadata properties of documents and list items discovered during the crawl.
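
To make the hand-offs between these components easier to picture, here is a toy model of the flow described above. It is only an illustration, not how SharePoint is implemented: every class and function name (ContentHost, Indexer, SearchDatabase, search_daemon, ms_search, run_crawl) is made up for this sketch, the sample URL and properties are invented, and the "index" is just a word-to-URL dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    url: str
    text: str
    properties: dict  # metadata such as title and author


@dataclass
class ContentHost:
    """Hypothetical stand-in for the server that stores the content."""
    documents: list


@dataclass
class Indexer:
    """Hypothetical stand-in for the indexer's disk, where the content index lives."""
    index_on_disk: dict = field(default_factory=dict)


@dataclass
class SearchDatabase:
    """Hypothetical stand-in for the search database in SQL Server."""
    crawl_status: str = "Idle"
    properties: dict = field(default_factory=dict)


def search_daemon(host):
    """Toy Search Daemon: connect to the content host and request each item."""
    for doc in host.documents:
        yield doc


def ms_search(crawled, indexer, database):
    """Toy MSSearch.exe: content goes to the indexer's disk, metadata to the database."""
    database.crawl_status = "Crawling"
    for doc in crawled:
        for word in doc.text.lower().split():
            indexer.index_on_disk.setdefault(word, set()).add(doc.url)
        database.properties[doc.url] = dict(doc.properties)
    database.crawl_status = "Idle"


def run_crawl(host, indexer, database):
    """One crawl: the daemon fetches, MSSearch.exe persists."""
    ms_search(search_daemon(host), indexer, database)


host = ContentHost(documents=[
    Document("http://portal/docs/plan.docx", "Quarterly budget plan",
             {"Title": "Plan", "Author": "Ana"}),
])
indexer, database = Indexer(), SearchDatabase()
run_crawl(host, indexer, database)
print(sorted(indexer.index_on_disk))
print(database.properties)
```

The point of the sketch is the split in the last step: full-text content ends up only in the indexer's on-disk index, while metadata properties and crawl status end up only in the database.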