But others have written simple depth-first searches which, at the worst, can bring servers to their knees in minutes by recursively downloading information from CGI script-based pages that contain an infinite number of possible links. (Often robots can't realize this!) Imagine what happens when a robot decides to "index" the CONTENTS of several hundred mpeg movies. Shudder.
The moral: a robot that does what you want may already exist; if it doesn't, please study the document World Wide Web Robots, Wanderers and Spiders (URL is: http://web.nexor.co.uk/mak/doc/robots/robots.html ) and learn about the emerging standards for exclusion of robots from areas in which they are not wanted. You can also read about existing robots there.