robots.txt - what is it?
It popped up around the time the custom 404 and 403 pages appeared, but I'm wondering: what does it do?
- VileTerror
- Anti-Villain
- Posts: 3437
- Joined: Wed Sep 17, 2003 11:16 am
- Location: n. 1 a place where something is located. 2 the action of location. - DERIVATIVES locational adj.
Hmm . . .
As far as I can tell: it disallows 'bots.
What that means . . . I have no freakin' idea.
Haughty spirit and pride make for a wild roller coaster ride!
I mean, as long as you like fairly final endings.
- VileTerror
COOL!
In that case: I'm going to allow them!
johndisko wrote: the bots to which it refers are (according to my belief) spider bots that scan your webpage for email addresses to send spam and stuff. usually used by metasearchers and stuff
i think
You think correctly; it is indeed referring to spiders that crawl over your site and all the pages contained within (via links from various pages, like the front page), though not necessarily for harvesting email addresses. Usually it's for search engines.
In fact, 'malicious' spiders for that purpose probably ignore any robots.txt file and just do whatever they want as far as looking at your site~
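For the curious, here is a rough sketch of the "via links" part: how a spider might pull the links out of one page so it knows where to go next. It uses only Python's standard library. The sample page, the example.com address, and the names LinkCollector and extract_links are all made up for illustration; real crawlers are far more involved.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs linked from one page of HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical front page with two links on it.
SAMPLE_PAGE = '<a href="/comics/1.html">first</a> <a href="about.html">about</a>'
links = extract_links(SAMPLE_PAGE, "http://example.com/index.html")
print(links)
```

A spider repeats this on every page it finds, which is how it ends up reaching everything that is linked from the front page.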
- VileTerror
Niiiiiiiice.
That means I should put a list of every hotmail account which belongs to someone I dislike on my page. Should be nice and effective.
Heh.
Yeah, as summarized, robots.txt is a special file you can put in the top-level directory of your site (the main directory you upload to over FTP) that tells legitimate robots what to do if they happen upon your site. Every day, the Internet is crawled and traversed by robots, little automated programs that recursively blast through those billions of pages in search of new content. robots.txt can be used, for example, to tell a search engine's robot not to index your site.
The operative phrase here is "legitimate." As said, malicious robots will completely ignore robots.txt.
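To make that concrete, here is a small sketch of what a robots.txt might contain and how a well-behaved robot could check it before fetching a page. It uses Python's standard urllib.robotparser module. The rules, the bot names (ExampleBot, SomeCrawler), and the example.com addresses are invented for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: everyone stays out of /private/,
# and the hypothetical ExampleBot is asked to stay out entirely.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A polite robot asks before each fetch.
print(parser.can_fetch("SomeCrawler", "http://example.com/index.html"))
print(parser.can_fetch("SomeCrawler", "http://example.com/private/x.html"))
print(parser.can_fetch("ExampleBot", "http://example.com/index.html"))
```

Note that nothing here is enforced by the server. The check happens inside the robot itself, which is exactly why a malicious one can simply skip it.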

Dragon Angel Tai - http://dragonangel.keenspace.com
holy shit! all that book learnin actually paid off 
im not bein sarcastic. i was making an educated guess, but a guess nonetheless
so thank you all
id like to dedicate this award to... *fades*
-jd
im the "j" "o" to the "h" "n" and i cant even spell the rest. it takes too long and i need a friggin cigarette.