Robots.txt

Started by FlyingRagnar, May 13, 2011, 05:28:53 PM

FlyingRagnar

Here's something I don't understand.  I have done some searching but have yet to find a good answer.

If I google "Pikachu site:bulbapedia.bulbagarden.net" I get lots of articles.  If I switch to image search, I get lots of images.
If I google "Hero site:dragon-quest.org" I get lots of articles.  If I switch to image search, I get nothing.  What gives?

Does this have anything to do with the robots.txt for a site?  I'm not sure, because the two sites' robots.txt files look very similar.

http://bulbapedia.bulbagarden.net/robots.txt
http://dragon-quest.org/robots.txt

Tappy

Probably because all the images on your wiki are under dragon-quest.org/w/, which you have disallowed crawlers from going through.

For example, pages like the first one below are just placeholders. Crawlers will see the page, yes, but when they try to fetch the image directly, it sits under /w/, which your robots.txt tells crawlers not to visit.

http://www.dragon-quest.org/wiki/File:Main_wiki_logo.png

http://www.dragon-quest.org/w/images/c/cc/Main_wiki_logo.png
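To see the difference between those two URLs, here is a rough sketch using Python's standard urllib.robotparser. The rule set in ASSUMED_ROBOTS_TXT is an assumption (the actual robots.txt is only linked in this thread, not quoted), and is_crawlable is a made-up helper name for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules for illustration only; the real file is not quoted in this thread.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""

def is_crawlable(url, robots_txt=ASSUMED_ROBOTS_TXT, agent="*"):
    """Return True if the given rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# The two URLs from the post above.
file_page = "http://www.dragon-quest.org/wiki/File:Main_wiki_logo.png"
image_file = "http://www.dragon-quest.org/w/images/c/cc/Main_wiki_logo.png"

print("File page crawlable: ", is_crawlable(file_page))
print("Image file crawlable:", is_crawlable(image_file))
```

Under those assumed rules the File: page is allowed and the image file is not. Rules are matched as path prefixes, so the trailing slash matters: "Disallow: /w" without it would also block everything under /wiki/.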
Webmaster of Hyrule.net (a mastermind of ZeldaWiki.org)

FlyingRagnar

Yes, that's exactly right.  Your post led me to realize that Bulbapedia is storing media in a location outside of /w/, which is why images still get crawled there even though their rules are similar.  I guess the solution is to move the media somewhere else.

It looks like almost all the other NIWA wikis are doing the same.  All except Lylat and Nookipedia.  Nookipedia lets crawlers go everywhere, and I would assume Lylat does as well.
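A quick sketch of the two ways this could be fixed, again with assumed rules. The moved_path location is hypothetical (the thread does not say where Bulbapedia keeps its media), and the Allow line is an alternative not discussed above, offered only as a possibility.

```python
from urllib.robotparser import RobotFileParser

def allowed(rules, path, agent="*"):
    """Return True if `rules` (robots.txt text) let `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)

# Assumed rule sets; neither is copied from a real site.
SAME_RULES = "User-agent: *\nDisallow: /w/\n"
WITH_ALLOW = "User-agent: *\nAllow: /w/images/\nDisallow: /w/\n"

current_path = "/w/images/c/cc/Main_wiki_logo.png"
moved_path = "/images/c/cc/Main_wiki_logo.png"  # hypothetical new media location

# Option 1: keep the rules, move the media outside /w/.
print("moved, same rules:", allowed(SAME_RULES, moved_path))
# Option 2: keep the media where it is, add an Allow rule for it.
print("in place, with Allow:", allowed(WITH_ALLOW, current_path))
# Everything else under /w/ stays blocked either way.
print("index.php, with Allow:", allowed(WITH_ALLOW, "/w/index.php"))
```

Python's parser applies the first matching rule, while Google picks the most specific one, so listing the Allow line before the Disallow line gives the same result under both interpretations.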


Jake

I didn't realize our robots.txt file was set up like that. There were supposed to be some rules in there about special pages and skins. Thanks for letting me know. :)

FlyingRagnar

No problem.  The tough part of changing robots.txt is that you then have to wait around for Google or whatever other bots to visit again.  I think I have mine sorted out now, but I'm still waiting for Google to pick up all of our images.