# If the Joomla site is installed within a folder,
# e.g. www.example.com/joomla/, the robots.txt file MUST be
# moved to the site root, e.g. www.example.com/robots.txt,
# AND the joomla folder name MUST be prefixed to each disallowed
# path, e.g. the Disallow rule for the /administrator/ folder
# MUST be changed to read Disallow: /joomla/administrator/
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/orig.html
#
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html

# Block specific crawlers
User-agent: AhrefsBot
Disallow: /

User-agent: Dotbot
Disallow: /

User-agent: BlexBot
Disallow: /

User-agent: Dataprovider
Disallow: /

User-agent: Semantify.it
Disallow: /

User-agent: Sistrix
Disallow: /

User-agent: Cliqzbot
Disallow: /

User-agent: SiteSucker
Disallow: /

User-agent: wget
Disallow: /

User-agent: curl
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: Python-urllib
Disallow: /

User-agent: Java
Disallow: /

User-agent: BadBotNameHere
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: rogerbot
Disallow: /

User-agent: Screaming Frog SEO Spider
Disallow: /

# Allow all other crawlers to access the site, with the limits below.
# The rules for "*" are kept in a single group because some parsers
# honor only the first group that matches a given user agent.
User-agent: *

# Prevent crawlers (spiders/bots) from over-crawling specific URLs.
# Crawl-delay is a non-standard directive and some crawlers ignore it.
Crawl-delay: 10
Disallow: /@ningbomh8
Disallow: /@mhchin

# Keep script, stylesheet, and image files crawlable
Allow: /*.js*
Allow: /*.css*
Allow: /*.png*
Allow: /*.jpg*
Allow: /*.gif*

# Disallow access to specific directories
Disallow: /administrator/
Disallow: /bin/
# 4SEO-opt
Disallow: /cache/
Disallow: /cli/
# 4SEO-opt
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /logs/
# 4SEO-opt
# 4SEO-opt
Disallow: /question/
Disallow: /tmp/

# JSitemap entries
Sitemap: https://www.mh-chine.com/index.php?option=com_jmap&view=sitemap&format=xml
Sitemap: https://www.mh-chine.com/index.php?option=com_jmap&view=sitemap&format=mobile