Author Topic: Robots.txt  (Read 7245 times)

Offline FlyingRagnar

  • Dragon Quest Wiki Staff
  • ***
  • Posts: 111
    • View Profile
    • Dragon Quest Wiki
Robots.txt
« on: May 13, 2011, 05:28:53 PM »
Here's something I don't understand.  I have done some searching but have yet to find a good answer.

If I google "Pikachu site:bulbapedia.bulbagarden.net" I get lots of articles.  If I switch to image search, I get lots of images.
If I google "Hero site:dragon-quest.org" I get lots of articles.  If I switch to image search, I get nothing.  What gives?

Does this have anything to do with the robots.txt for a site?  I'm not sure because the 2 sites really are similar.

http://bulbapedia.bulbagarden.net/robots.txt
http://dragon-quest.org/robots.txt

Offline Tappy

  • Forum Administrator
  • *****
  • Posts: 270
  • Gender: Male
    • View Profile
    • Hyrule.net
Re: Robots.txt
« Reply #1 on: May 14, 2011, 01:31:31 AM »
Probably because all images on your wiki are under dragon-quest.org/w/, which you have disallowed crawlers from going through.

For example, pages like the first one below are just placeholders. Crawlers will see them, yes, but when they try to fetch the image directly (the second URL), it's under /w/, which your robots.txt doesn't allow crawlers to see.
 
http://www.dragon-quest.org/wiki/File:Main_wiki_logo.png

http://www.dragon-quest.org/w/images/c/cc/Main_wiki_logo.png
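
If you want to check this yourself, here is a rough sketch using Python's standard urllib.robotparser. The rules are an assumption reduced to the one line that matters here (Disallow: /w/); the real robots.txt almost certainly has more in it. The helper name is_crawlable is made up for this example.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: a minimal stand-in for the wiki's robots.txt. The thread only
# establishes that /w/ is disallowed; the real file may contain other rules.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two URLs from the post: the File: description page and the image itself.
file_page = "http://www.dragon-quest.org/wiki/File:Main_wiki_logo.png"
image_file = "http://www.dragon-quest.org/w/images/c/cc/Main_wiki_logo.png"

print("File page crawlable: ", is_crawlable(ASSUMED_ROBOTS_TXT, file_page))
print("Image file crawlable:", is_crawlable(ASSUMED_ROBOTS_TXT, image_file))
```

Under those assumed rules the /wiki/ page is allowed and the file under /w/images/ is blocked, which would explain articles showing up in search while images don't.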
Webmaster of Hyrule.net (a mastermind of ZeldaWiki.org)

Offline FlyingRagnar

  • Dragon Quest Wiki Staff
  • ***
  • Posts: 111
    • View Profile
    • Dragon Quest Wiki
Re: Robots.txt
« Reply #2 on: May 14, 2011, 04:33:02 AM »
Yes that's exactly right.  Your post led me to realize that Bulbapedia is storing media in a location outside of /w/, which is why their rules are similar, but images still get crawled.  I guess the solution is to move the media somewhere else.
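
For anyone comparing the options, here is a small sketch of two ways to get images crawled while keeping /w/ blocked. Both rule sets are assumptions for illustration, not the actual files of any wiki mentioned here. The /media/ path is a hypothetical new location for uploads, and the Allow line is an alternative that nobody in this thread has proposed; Allow is honored by the major crawlers but it is worth testing before relying on it.

```python
from urllib.robotparser import RobotFileParser


def path_allowed(rules: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


# Option 1 (ASSUMED rules): keep blocking /w/ and serve media from elsewhere.
BLOCK_W = "User-agent: *\nDisallow: /w/\n"

# Option 2 (ASSUMED rules): leave media under /w/images/ and allow just that.
# Allow is listed first because Python's parser uses the first matching rule.
ALLOW_IMAGES = "User-agent: *\nAllow: /w/images/\nDisallow: /w/\n"

# "/media/" is a hypothetical location, not a path taken from any real wiki.
print(path_allowed(BLOCK_W, "/media/c/cc/Main_wiki_logo.png"))
print(path_allowed(ALLOW_IMAGES, "/w/images/c/cc/Main_wiki_logo.png"))
print(path_allowed(ALLOW_IMAGES, "/w/index.php"))
```

In both cases the image path comes back as allowed while the rest of /w/ stays blocked, so the choice is mostly about whether moving files or editing robots.txt is less work.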

It looks like almost all the other NIWA wikis are doing the same, all except Lylat and Nookipedia.  Nookipedia lets crawlers go everywhere, and I would assume Lylat does as well.


Offline Jake

  • Nookipedia Staff
  • ***
  • Posts: 244
  • Gender: Male
  • Nookipedia Director / Server Admin
    • View Profile
    • Nookipedia
Re: Robots.txt
« Reply #3 on: May 20, 2011, 07:54:04 PM »
I didn't realize our robots.txt file was set up like that. There were supposed to be some rules in there about special pages and skins. Thanks for letting me know. :)

Offline FlyingRagnar

  • Dragon Quest Wiki Staff
  • ***
  • Posts: 111
    • View Profile
    • Dragon Quest Wiki
Re: Robots.txt
« Reply #4 on: May 20, 2011, 10:16:08 PM »
No problem.  The tough part of changing robots.txt is that you then have to wait around for Google or whatever other bots to visit again.  I think I have mine sorted out now, but I'm still waiting for Google to pick up all of our images.