A checklist for optimizing your site's technical foundation for AI search visibility. Content quality gets you cited — but if AI crawlers can't reach, render, or trust your pages, that content never gets a chance.
Why technical foundations matter for AI search
AI search engines like ChatGPT, Perplexity, and Google AI Overviews rely on web crawling and search indexes to discover content. If your pages are slow, blocked, poorly structured, or invisible to crawlers, no amount of content optimization will help — your pages simply won't be in the pool that AI models draw from.
Technical GEO is the foundation layer. Get this right first, then focus on content quality and authority signals. Most of these checks are one-time fixes with ongoing monitoring — high leverage for relatively low effort.
If AI crawlers can't reach your pages, nothing else matters. These are the baseline requirements for making your content discoverable.
Allow AI crawlers in robots.txt
Don't block GPTBot, ClaudeBot (Anthropic's crawler, which older guides list under the token anthropic-ai), PerplexityBot, or other AI user agents. Many sites inadvertently block them with broad disallow rules, such as Disallow: / under User-agent: *, making their content invisible to AI search engines entirely.
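A minimal sketch of this check in Python, using the standard library's robots.txt parser. The agent list and the sample robots.txt are assumptions for illustration; swap in the tokens and file you actually want to test.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens to test (an assumed list; adjust to the crawlers you care about).
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def blocked_agents(robots_txt: str, url: str, agents=AI_USER_AGENTS) -> list[str]:
    """Return the agents that this robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt: a broad disallow that also catches AI crawlers.
EXAMPLE_ROBOTS = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""
```

Calling blocked_agents(EXAMPLE_ROBOTS, "https://example.com/blog/post") reports all three agents as blocked, because none of them has its own group and the wildcard group disallows everything.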
Submit and maintain an XML sitemap
Ensure your sitemap is current and includes all important content pages. Submit it to Google Search Console and keep it updated as you publish new content. AI systems rely heavily on search indexes to discover your pages.
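If your CMS does not generate a sitemap for you, one can be produced with a few lines of Python. This is a sketch under stated assumptions: the function name and the (URL, last-modified date) input shape are hypothetical, and only the loc and lastmod fields of the sitemap protocol are written.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[tuple[str, str]]) -> str:
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # special characters are escaped for us
        ET.SubElement(url, "lastmod").text = lastmod  # W3C date format, e.g. 2024-05-01
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Regenerating the file whenever content is published or updated keeps lastmod meaningful, which is the part crawlers can actually use.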
Fix crawl errors aggressively
Resolve 404s, redirect chains, and server errors that prevent content from being indexed. A page that returns a 500 or loops through three redirects before loading will never make it into an AI response.
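Redirect chains are easy to spot once you trace them hop by hop. The sketch below takes the HTTP layer as an injected function so the logic can be shown without network access; FAKE_SITE and fake_fetch are hypothetical stand-ins for real requests.

```python
from typing import Callable, Optional

# A fetcher returns (status_code, redirect_target_or_None) for a URL.
Fetcher = Callable[[str], tuple[int, Optional[str]]]

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url: str, fetch: Fetcher, max_hops: int = 10) -> tuple[list[str], int]:
    """Follow redirects from url; return (URLs visited, final status).

    A final status of 0 signals a redirect loop or a chain longer than max_hops.
    """
    chain = [url]
    for _ in range(max_hops):
        status, target = fetch(chain[-1])
        if status not in REDIRECT_CODES or target is None:
            return chain, status
        looped = target in chain
        chain.append(target)
        if looped:
            return chain, 0
    return chain, 0


# Hypothetical responses standing in for real HTTP requests.
FAKE_SITE = {
    "http://example.com/old": (301, "https://example.com/old"),
    "https://example.com/old": (301, "https://example.com/new"),
    "https://example.com/new": (200, None),
    "https://example.com/a": (302, "https://example.com/b"),
    "https://example.com/b": (302, "https://example.com/a"),
}


def fake_fetch(url: str) -> tuple[int, Optional[str]]:
    return FAKE_SITE.get(url, (404, None))
```

A chain with more than two entries means at least one extra hop that could be collapsed into a single redirect to the final URL.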
Verify your pages are actually indexed
Use Google Search Console to check index status. Being crawlable isn't the same as being indexed. AI models pull from search indexes, so a page that's been crawled but not indexed is effectively invisible.
Speed is a ranking signal, and ranking signals determine which pages AI models see first. Slow pages also risk being partially crawled or timed out entirely.
Hit Core Web Vitals thresholds
Aim for good LCP (under 2.5s), INP (under 200ms), and CLS (under 0.1). INP replaced FID as the Core Web Vitals responsiveness metric in March 2024. Google uses these as ranking signals, and rankings influence which pages AI models see first. Use PageSpeed Insights to diagnose issues.
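Once you have field or lab numbers, the pass/fail logic is simple enough to automate. This sketch assumes the "good" thresholds of 2.5 s for LCP, 200 ms for INP (the responsiveness metric that superseded FID), and 0.1 for CLS; the metric key names are hypothetical.

```python
# "Good" thresholds; a value at or below the limit passes.
GOOD_THRESHOLDS = {"lcp_s": 2.5, "inp_ms": 200.0, "cls": 0.1}


def failing_vitals(measured: dict[str, float]) -> list[str]:
    """Return the metrics that are missing or above their "good" threshold."""
    return [
        name
        for name, limit in GOOD_THRESHOLDS.items()
        if measured.get(name, float("inf")) > limit
    ]
```

For example, a page measured at LCP 1.8 s, INP 250 ms, and CLS 0.05 fails only on inp_ms. A metric with no measurement is treated as failing so that gaps in the data are not mistaken for a pass.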
Optimize mobile rendering
AI models often see mobile-rendered content since mobile-first indexing is the default. If your page loads slowly or renders poorly on mobile, the content AI models receive may be incomplete or degraded.
Minimize server response time
TTFB under 200ms is ideal. Slow servers mean crawlers time out or deprioritize your pages. Consider a CDN if you serve a global audience, and optimize database queries that back your content pages.
Structured data gives AI models machine-readable context about your content — what type of page it is, what entities are on it, and how they relate to each other.
Implement Schema.org markup with JSON-LD
Add structured data for Organization, Product, Article, FAQ, HowTo, and Review schemas as relevant to your content. JSON-LD is the preferred format. This gives AI models machine-readable context about what your page is — not just what it says.
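As an illustration, here is one way to emit an Article block as JSON-LD from Python. The function name, the chosen properties, and the sample values are assumptions; check the properties your target schema type actually needs before using it.

```python
import json


def article_jsonld(headline: str, author: str, date_published: str, url: str) -> str:
    """Return a <script> tag containing Schema.org Article markup as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,  # ISO 8601 date, e.g. 2024-05-01
        "mainEntityOfPage": url,
    }
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

The returned string can be placed in the page head or body. Generating it from the same data that renders the page helps keep markup and visible content in sync.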
Validate structured data before deploying
Use Google's Rich Results Test to catch errors. Invalid structured data is worse than no structured data: parsers may ignore the markup, and it can cause your rich results to be dropped entirely.
Match schema to actual page content
Don't add FAQ schema unless there are actual FAQs on the page. Don't use Product schema on a blog post. Mismatched schema erodes trust with search engines and can trigger manual actions or filtering.
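A rough way to catch mismatches automatically is to confirm that every question declared in FAQPage markup also appears in the visible page text. The sketch below uses only the standard library and assumes each JSON-LD block is a single JSON object; the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class _PageScanner(HTMLParser):
    """Collects JSON-LD blocks and visible text from an HTML document."""

    def __init__(self):
        super().__init__()
        self.jsonld_blocks: list[str] = []
        self.visible_text: list[str] = []
        self._in_jsonld = False
        self._in_hidden = False  # inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._in_hidden = True
            script_type = (dict(attrs).get("type") or "").lower()
            self._in_jsonld = tag == "script" and script_type == "application/ld+json"
            if self._in_jsonld:
                self.jsonld_blocks.append("")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._in_hidden = self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.jsonld_blocks[-1] += data
        elif not self._in_hidden:
            self.visible_text.append(data)


def _normalize(text: str) -> str:
    return " ".join(text.split()).lower()


def unmatched_faq_questions(html: str) -> list[str]:
    """Return FAQPage questions from JSON-LD that are absent from the visible text."""
    scanner = _PageScanner()
    scanner.feed(html)
    scanner.close()
    page_text = _normalize(" ".join(scanner.visible_text))
    missing = []
    for block in scanner.jsonld_blocks:
        data = json.loads(block)
        if not isinstance(data, dict) or data.get("@type") != "FAQPage":
            continue
        for item in data.get("mainEntity", []):
            question = item.get("name", "")
            if _normalize(question) not in page_text:
                missing.append(question)
    return missing
```

An empty result means every marked-up question is on the page. Any question returned is a candidate for removal from the schema or addition to the content.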
How your site is organized and linked determines how effectively AI crawlers can discover and contextualize your content.
Use clean, descriptive URLs
Use keyword-rich, human-readable URLs. Avoid parameter strings, session IDs, and dynamic hash fragments. /blog/technical-seo-checklist is far more extractable than /p?id=4827&ref=nav.
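Readable URLs are usually derived from the page title. A minimal sketch of that conversion follows; the slugify name is hypothetical, and the transliteration simply drops characters that have no ASCII equivalent.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated URL slug."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
```

For example, slugify("Technical SEO Checklist") yields "technical-seo-checklist", which can then be appended to a path such as /blog/.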
Maintain a logical site hierarchy
Organize content in clear categories with a shallow structure. Important pages should be reachable within 3 clicks of the homepage. Deep nesting signals low priority to both search engines and AI crawlers.
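Click depth can be measured with a breadth-first search over your internal link graph. In this sketch the graph is a plain dictionary mapping each page to the pages it links to, which is an assumed input format; in practice it would come from a crawl of your own site.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str = "/") -> dict[str, int]:
    """Return the minimum number of clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links: dict[str, list[str]], home: str = "/", max_clicks: int = 3) -> list[str]:
    """Return pages that are unreachable or more than max_clicks from home."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(p for p in all_pages if depths.get(p, max_clicks + 1) > max_clicks)
```

Pages with no inbound path from the homepage are reported alongside the deeply nested ones, since an orphaned page is even harder for a crawler to find.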
Link related content contextually
Internal links help AI models understand relationships between your topics. Link from your pillar pages to supporting content and vice versa. Contextual links carry more weight than footer or sidebar navigation links.
Set canonical tags on all pages
Canonical URLs prevent duplicate content from splitting your authority across multiple URLs. Every page should have a self-referencing canonical tag, and duplicate or near-duplicate pages should point to the primary version.
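Canonical tags are easy to audit in bulk. The following sketch reads a page's HTML and classifies its canonical setup; the status labels and the trailing-slash normalization are assumptions made for illustration.

```python
from html.parser import HTMLParser


class _CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def canonical_status(html: str, page_url: str) -> str:
    """Classify a page as 'missing', 'conflicting', 'self', or 'points-elsewhere'."""
    finder = _CanonicalFinder()
    finder.feed(html)
    finder.close()
    if not finder.canonicals:
        return "missing"
    if len(set(finder.canonicals)) > 1:
        return "conflicting"
    same = finder.canonicals[0].rstrip("/") == page_url.rstrip("/")
    return "self" if same else "points-elsewhere"
```

A result of "points-elsewhere" is expected for duplicate or parameterized URLs and worth a second look on pages you consider primary.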
Even if your pages are crawlable, AI models need to be able to parse and understand the content. These practices ensure your content is readable by machines, not just humans.
Render content server-side
AI crawlers may not execute JavaScript. If your critical content only appears after client-side rendering, it's invisible to many AI systems. Use SSR or static generation for any page you want AI models to see.
Keep important text out of images
Text baked into images, infographics, or screenshots isn't parsed by AI models. If there's a key stat, recommendation, or data point in an image, make sure it also appears in the HTML as actual text.
Use semantic HTML throughout
Proper heading hierarchy (H1-H6), ordered and unordered lists, paragraph tags, and table elements help AI parse your content structure. A div soup with styled spans is harder for AI to extract clean answers from.
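One structural problem that is simple to detect automatically is a skipped heading level, such as an H2 followed directly by an H4. This sketch checks only that one rule; the function name and message format are hypothetical.

```python
from html.parser import HTMLParser


class _HeadingCollector(HTMLParser):
    """Records the level of each h1-h6 tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels: list[int] = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_heading_levels(html: str) -> list[str]:
    """Return a message for each place where a heading jumps down more than one level."""
    collector = _HeadingCollector()
    collector.feed(html)
    collector.close()
    problems = []
    for prev, cur in zip(collector.levels, collector.levels[1:]):
        if cur > prev + 1:
            problems.append(f"h{prev} is followed by h{cur}")
    return problems
```

Moving back up the hierarchy (an H4 followed by an H2) is normal and is not flagged.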
Write descriptive alt text for images
Alt text helps AI models understand what an image shows and how it relates to surrounding content. Be specific: "Screenshot of Core Web Vitals report showing LCP at 1.8s" is more useful than "report screenshot."
Security isn't just about protecting users — it's a trust signal that affects how search engines and AI models treat your content.
Serve all pages over HTTPS
HTTPS is a baseline trust signal. All pages should be served over HTTPS with HTTP-to-HTTPS redirects in place. Search engines and AI models treat HTTP pages as less trustworthy, and some won't index them at all.
Keep SSL certificates valid and current
An expired or misconfigured SSL certificate kills trust immediately. Check expiration dates, certificate chain validity, and that your certificate covers all subdomains you serve content from.
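Expiration checks are a good candidate for automation. The sketch below separates the date arithmetic from the network call so the former can be verified offline; the function names are hypothetical, and fetch_not_after needs network access and will raise if the certificate chain or hostname fails verification.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter date.

    not_after uses the format returned by getpeercert(),
    e.g. 'Jan  5 09:34:43 2018 GMT'. A negative result means it has expired.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to hostname and return the notAfter field of its certificate."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

Running days_until_expiry(fetch_not_after(host)) for each subdomain you serve, and alerting below a threshold such as 30 days, covers both expiry and coverage gaps.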
Eliminate mixed content
Don't load HTTP resources (images, scripts, stylesheets) on HTTPS pages. Mixed content warnings signal a site that isn't fully secured, and browsers may block the insecure resources entirely.
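Mixed content can be found by scanning a page's HTML for subresources loaded over plain HTTP. This sketch covers a handful of common tags and is not exhaustive (it ignores CSS url() references and srcset, for instance); the function name is hypothetical.

```python
from html.parser import HTMLParser

# Tags whose src attribute causes the browser to load a resource.
SRC_TAGS = ("img", "script", "iframe", "audio", "video", "source")


class _InsecureResourceFinder(HTMLParser):
    """Collects subresource URLs that use the http:// scheme."""

    def __init__(self):
        super().__init__()
        self.insecure: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        url = None
        if tag in SRC_TAGS:
            url = attr.get("src")
        elif tag == "link" and "stylesheet" in (attr.get("rel") or "").lower().split():
            url = attr.get("href")
        if url and url.lower().startswith("http://"):
            self.insecure.append(url)


def insecure_resources(html: str) -> list[str]:
    """Return http:// resource URLs that an HTTPS page would load as mixed content."""
    finder = _InsecureResourceFinder()
    finder.feed(html)
    finder.close()
    return finder.insecure
```

Ordinary anchor links to HTTP pages are deliberately ignored, since navigating to another page is not mixed content.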
The JavaScript rendering gap: Many modern sites rely on client-side JavaScript to render content. While Googlebot can execute JS, most AI crawlers cannot. If you use a JavaScript framework (React, Vue, Angular), make sure critical content pages use server-side rendering or static generation. Test by disabling JavaScript in your browser — if the content disappears, AI crawlers can't see it either.
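The browser test can be approximated in a script: take the raw HTML your server returns, before any JavaScript runs, and check whether key phrases from the page are present in its text. This is a sketch with hypothetical names, and it assumes you have already fetched the raw HTML by whatever means you prefer.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text outside <script>, <style>, and <template> elements."""

    SKIPPED = ("script", "style", "template")

    def __init__(self):
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def missing_without_js(raw_html: str, key_phrases: list[str]) -> list[str]:
    """Return the phrases that do not appear in the HTML's text before JS runs."""
    extractor = _TextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    text = " ".join(" ".join(extractor.parts).split()).lower()
    return [phrase for phrase in key_phrases if phrase.lower() not in text]
```

If a phrase that is clearly visible in the browser comes back as missing, that content is being injected client-side and is a candidate for server-side rendering or static generation.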
Monthly audit checklist
Run these checks monthly to keep your technical foundation solid.
Test that robots.txt allows AI crawlers (GPTBot, ClaudeBot, PerplexityBot)
Verify sitemap is current, valid, and submitted to Search Console
Check Core Web Vitals scores in Search Console
Validate structured data with Rich Results Test
Review crawl stats and errors in Search Console
Test key pages render without JavaScript enabled
Confirm SSL certificate is valid and not expiring soon