# Robots.txt for RejexAI
# Public pages (/, /pricing, /about, /privacy, /terms, /docs/*) are allowed by default

# ============================================
# AI CRAWLERS - ALLOWED (for LLM discovery)
# ============================================

# OpenAI (ChatGPT, GPT-4)
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# Anthropic (Claude)
User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

# Google (Gemini, Bard)
User-agent: Google-Extended
Allow: /

User-agent: GoogleOther
Allow: /

# Apple Intelligence
User-agent: Applebot-Extended
Allow: /

# Perplexity AI
User-agent: PerplexityBot
Allow: /

# DeepSeek
User-agent: DeepSeek-Crawler
Allow: /

# ByteDance (Doubao, Coze)
User-agent: Bytedance
Allow: /

User-agent: ByteSpider
Allow: /

# Meta AI (Llama)
User-agent: FacebookBot
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

# Cohere
User-agent: cohere-ai
Allow: /

# Anthropic Claude (additional)
User-agent: Claude-Web
Allow: /

# You.com
User-agent: YouBot
Allow: /

# Diffbot
User-agent: Diffbot
Allow: /

# Common Crawl (used by many AI models)
User-agent: CCBot
Allow: /

# Omgili (news aggregator used by AI)
User-agent: Omgilibot
Allow: /

# AI2 (Allen Institute - open source models)
User-agent: AI2Bot
Allow: /

# Anthropic research crawler
User-agent: anthropic-research
Allow: /

# Mistral AI
User-agent: MistralBot
Allow: /

# Hugging Face
User-agent: HuggingFaceBot
Allow: /

# Stability AI
User-agent: StabilityBot
Allow: /

# Inflection AI (Pi)
User-agent: InflectionBot
Allow: /

# Character.AI
User-agent: CharacterAI
Allow: /

# Poe (Quora's AI platform)
User-agent: PoeBot
Allow: /

# Bing AI / Copilot
User-agent: BingPreview
Allow: /

# Amazon (Alexa, AWS AI)
User-agent: Amazonbot
Allow: /

# Baidu AI
User-agent: Baiduspider-render
Allow: /

# Yandex AI
User-agent: YandexBot
Allow: /

User-agent: YandexRenderBot
Allow: /

# ============================================
# SEARCH ENGINE CRAWLERS - ALLOWED
# ============================================

User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Slurp
Allow: /

User-agent: DuckDuckBot
Allow: /

# ============================================
# AGGRESSIVE SCRAPERS - BLOCKED
# ============================================

User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /

# ============================================
# DEFAULT RULES FOR ALL OTHER CRAWLERS
# ============================================

User-agent: *
# ALLOW - Documentation pages (explicitly allow for clarity)
Allow: /docs
# BLOCK - Authentication pages (should NOT be indexed)
Disallow: /login
Disallow: /signup
Disallow: /forgot-password
Disallow: /auth/
# BLOCK - Admin pages
Disallow: /admin
# BLOCK - Protected user pages (should NOT be indexed)
Disallow: /dashboard
Disallow: /analysis/
Disallow: /screenshots
Disallow: /appeals/
Disallow: /teams
Disallow: /settings
Disallow: /guidelines
Disallow: /checklist
Disallow: /compliance
Disallow: /analytics
Disallow: /support
# BLOCK - Payment processing pages
Disallow: /payment/
# BLOCK - API endpoints and functions
Disallow: /api/
Disallow: /.netlify/

# ============================================
# SITEMAPS
# ============================================

Sitemap: https://rejexai.netlify.app/sitemap.xml