- <!DOCTYPE html>
- <html>
- <head>
- <link rel="icon" type="image/png" href="https://owlman.neocities.org/favicon.ico" />
- <title>robots.txt</title>
- </head>
- <body bgcolor="#CCCCCC" text="#000000">
- <!-- -->
- <!-- Because you are reading this, it must mean only one thing; you -->
- <!-- are looking at our page source, well, hope you like the -->
- <!-- look of it! -->
- <!-- -->
- <!-- The Penny's Pages Wiki was made by members of the Neocities -->
- <!-- webhost, for your enjoyment, and our pain. We hope that you are -->
- <!-- enjoying reading our articles. -->
- <!-- -->
- <!-- Penny's Pages is composed of original material, and may be used -->
- <!-- as long as you follow CC BY-NC-SA 3.0 -->
- <!-- -->
- <!-- Our URL: https://thewikion.neocities.org/ -->
- <!-- -->
- <!-- Enjoy the rest of your night, young Internet search astronaut! -->
- <!-- -->
- <TABLE WIDTH=750><TD VALIGN=TOP>
- <h1>robots.txt</h1>
- <p>
- Simply put, the robots exclusion standard (also called the robots exclusion protocol or robots.txt protocol) is an easy way of telling Web crawlers and other Web robots which parts of a Web site they can and cannot view.
- <p>
- To give robots instructions about which parts of your site they can access, put a text (.txt) file called robots.txt in the main directory of your Web site, e.g. <tt><a href="https://owlman.neocities.org/robots.txt">https://owlman.neocities.org/robots.txt</a></tt>. This file tells robots which parts of your site they can view. However, some robots ignore such files, especially malicious (or bad) robots.
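- <p>
- As a rough sketch, a robots.txt file that asks all robots to stay out of one directory could look like the following. The directory name <tt>/private/</tt> is made up for this example; use the paths of your own site instead.
- <p>
- <pre>User-agent: *
- Disallow: /private/</pre>
- <p>
- The <tt>User-agent: *</tt> line means the rule applies to every robot, and each <tt>Disallow</tt> line names a path that robots are asked not to visit.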
- <p>
- If the robots.txt file does not exist, Web robots assume that they can see all parts of your site.
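- <p>
- For comparison, a robots.txt file with an empty <tt>Disallow</tt> line, as in the minimal sketch below, asks for the same thing as having no file at all: every robot may view every part of the site.
- <p>
- <pre>User-agent: *
- Disallow:</pre>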
- <p>
- An example of a good robot (and a good boy).
- <p>
- <pre> \ oo
- \____|\mm
- //_//\ \_\
- /K-9/ \/_/
- /___/_____\
- -----------</pre>
- <p>
- <b><h1><a href="#Outside links">Outside links</a><a name="Outside links"></a></h1></b>
- <p>
- Here are some useful links on robots.txt that may help you.
- <p>
- <a href="https://en.wikipedia.org/wiki/Robots_exclusion_standard">English Wikipedia article on robots.txt</a>
- <p>
- <a href="https://simple.wikipedia.org/wiki/Robots_exclusion_standard">Simple English Wikipedia article on robots.txt</a>
- <p>
- <a href="http://www.robotstxt.org/">The Web Robots Pages</a>
- </TD></TABLE>
- </body>
- </html>