Ahli Kompie

My Robots.txt Tricks

Don't want certain bots crawling your website and messing up its index? Here are a few robots.txt tips for SEO that you can use.

Allow Only Some Bots

Sources:

https://support.google.com/webmasters/answer/1061943?hl=en

https://www.bing.com/webmaster/help/which-crawlers-does-bing-use-8c184ec0

https://help.yahoo.com/kb/SLN22600.html

http://www.baiduguide.com/baidu-spider/

http://www.useragentstring.com/pages/useragentstring.php?typ=Crawler

http://www.user-agents.org/index.shtml

To let only specific bots (such as Google, Bing, and Yahoo) access the site, use the following rules:

User-agent: *
Disallow: /

User-agent: Googlebot
User-agent: Googlebot-News
User-agent: Googlebot-Image
User-agent: Googlebot-Video
User-agent: Googlebot-Mobile
User-agent: Mediapartners-Google
User-agent: Mediapartners
User-agent: AdsBot-Google
User-agent: AdsBot-Google-Mobile-Apps
User-agent: Bingbot
User-agent: MSNBot
User-agent: MSNBot-Media
User-agent: AdIdxBot
User-agent: BingPreview
User-agent: Yahoo! Slurp
User-agent: Slurp
User-agent: Ask
User-agent: Ask Jeeves
User-agent: Teoma
User-agent: Yandex
User-agent: YandexBot
User-agent: Baidu
User-agent: Baiduspider
Allow: /
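One way to sanity-check an allow-list like the one above before deploying it is to feed it to Python's standard-library robots.txt parser. This is only a minimal sketch: the rules string is a shortened copy of the list above, `is_allowed` and `ExampleScraper` are made-up names for illustration, and Python's parser only approximates how each real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the allow-list rules above (only a few of the bots).
ALLOW_LIST_RULES = """\
User-agent: *
Disallow: /

User-agent: Googlebot
User-agent: Bingbot
User-agent: Slurp
Allow: /
"""


def is_allowed(rules: str, user_agent: str, url: str = "/") -> bool:
    """Hypothetical helper: parse the rules and ask whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleScraper" is an invented name standing in for any unlisted bot.
    for agent in ("Googlebot", "Bingbot", "ExampleScraper"):
        print(agent, is_allowed(ALLOW_LIST_RULES, agent))
```

Listed bots should come back as allowed, while any unlisted agent falls through to the `User-agent: *` group and is disallowed.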

Block Site Scrapers

Source: https://www.blackhatworld.com/seo/anybody-with-the-code-to-block-majestic-ahrefs-and-all-others-from-crawling-a-site.584470/

User-agent: Rogerbot 
User-agent: Exabot 
User-agent: MJ12bot 
User-agent: Dotbot 
User-agent: Gigabot 
User-agent: AhrefsBot 
User-agent: BlackWidow 
User-agent: Bot mailto:craftbot@yahoo.com
User-agent: ChinaClaw 
User-agent: Custo 
User-agent: DISCo 
User-agent: Download Demon 
User-agent: eCatch 
User-agent: EirGrabber 
User-agent: EmailSiphon 
User-agent: EmailWolf 
User-agent: Express WebPictures 
User-agent: ExtractorPro 
User-agent: EyeNetIE 
User-agent: FlashGet 
User-agent: GetRight 
User-agent: GetWeb! 
User-agent: Go!Zilla 
User-agent: Go-Ahead-Got-It 
User-agent: GrabNet 
User-agent: Grafula 
User-agent: HMView 
User-agent: HTTrack 
User-agent: ia_archiver
User-agent: Image Stripper 
User-agent: Image Sucker 
User-agent: Indy Library
User-agent: InterGET 
User-agent: Internet Ninja 
User-agent: JetCar 
User-agent: JOC Web Spider 
User-agent: larbin 
User-agent: LeechFTP 
User-agent: Mass Downloader 
User-agent: MIDown tool 
User-agent: Mister PiX 
User-agent: Navroad 
User-agent: NearSite 
User-agent: NetAnts 
User-agent: NetSpider 
User-agent: Net Vampire 
User-agent: NetZIP 
User-agent: Octopus 
User-agent: Offline Explorer 
User-agent: Offline Navigator 
User-agent: PageGrabber 
User-agent: Papa Foto 
User-agent: pavuk 
User-agent: pcBrowser 
User-agent: RealDownload 
User-agent: ReGet 
User-agent: SEOkicks-Robot
User-agent: SiteSnagger 
User-agent: SmartDownload 
User-agent: SuperBot 
User-agent: SuperHTTP 
User-agent: Surfbot 
User-agent: tAkeOut 
User-agent: Teleport Pro 
User-agent: VoidEYE 
User-agent: Web Image Collector 
User-agent: Web Sucker 
User-agent: WebAuto 
User-agent: WebCopier 
User-agent: WebFetch 
User-agent: WebGo IS 
User-agent: WebLeacher 
User-agent: WebReaper 
User-agent: WebSauger 
User-agent: Website eXtractor 
User-agent: Website Quester 
User-agent: WebStripper 
User-agent: WebWhacker 
User-agent: WebZIP 
User-agent: Wget 
User-agent: Widow 
User-agent: WWWOFFLE 
User-agent: Xaldon WebSpider 
User-agent: Zeus
Disallow: /
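A long block list like the one above is easier to maintain if it is generated from a plain list of names. The sketch below is one possible way to do that in Python; `BLOCKED_BOTS`, `build_block_rules`, and `can_crawl` are hypothetical names, and only a handful of the bots above are included. Keep in mind that robots.txt is advisory, so a scraper that ignores it will not be stopped by these rules.

```python
from urllib.robotparser import RobotFileParser

# A few names taken from the list above; extend as needed.
BLOCKED_BOTS = ["MJ12bot", "AhrefsBot", "HTTrack", "Wget"]


def build_block_rules(bots):
    """Return a single robots.txt group that disallows every listed bot."""
    lines = [f"User-agent: {name}" for name in bots]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def can_crawl(rules, user_agent, url="/"):
    """Parse the rules with the standard-library parser and check one agent."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_block_rules(BLOCKED_BOTS)
    print(rules)
    print("AhrefsBot:", can_crawl(rules, "AhrefsBot"))
    print("Googlebot:", can_crawl(rules, "Googlebot"))
```

Because this file has no `User-agent: *` group, bots that are not on the list remain free to crawl, which is the opposite default from the allow-list approach.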
