XML Sitemaps

TL;DR: A sitemap is a map of your website for Google. Without one, Google has to discover pages by following links - which means some pages might never get found.

What is an XML sitemap?

An XML sitemap is a file that lists every important page on your website along with metadata about each one - when it was last updated, how often it changes, and how important it is relative to other pages.

It usually lives at yoursite.com/sitemap.xml (the location is a convention, not a requirement) and is written in a format search engines understand. Think of it as handing Google a table of contents instead of making it read the whole book to find the chapters.

Why it matters for your rankings

Google discovers pages by crawling - following links from one page to another. But if a page isn't linked from anywhere (an "orphan page"), Google may never find it.

New sites benefit the most. When your site is brand new, Google doesn't know it exists. A sitemap submitted to Google Search Console says "here's everything I have - please come look." Without it, you're waiting for Google to discover pages naturally, which can take weeks.

Large sites need them. If you have hundreds of pages (blog posts, product pages, landing pages), some will inevitably be buried deep in your site structure. The sitemap tells Google those pages exist even when few links point to them, although listing a URL does not guarantee it gets indexed.

Fresh content can get indexed faster. Each sitemap entry can include a lastmod date. When you publish a new blog post, the sitemap updates, and Google has a signal that it should come back and check what changed.

What happens without a sitemap:

  • New pages take longer to appear in Google
  • Deep pages may never get indexed
  • Google doesn't know which pages you consider important
  • No signal about when content was last updated

How it actually works

A basic sitemap looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-03-18</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/blog/</loc>
    <lastmod>2026-03-18</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.9</priority>
  </url>
</urlset>

Key fields:

  • loc - the URL
  • lastmod - when the page was last meaningfully changed (not cosmetic edits)
  • changefreq - how often the page typically changes (optional; Google ignores it)
  • priority - relative importance within your site, 0.0 to 1.0 (optional; Google ignores it)

Google ignores changefreq and priority; other search engines may treat them as hints at most, never as instructions. The lastmod date is the most useful signal - but only if it reflects real content changes. Sites that update lastmod on every build (even without content changes) train Google to ignore it.
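To make the field list concrete, here is a minimal Python sketch that writes a sitemap containing only loc and lastmod, the two fields that carry weight. The function name build_sitemap and the (url, date) input shape are assumptions made for illustration, not part of any particular tool.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as UTF-8 bytes.

    pages: iterable of (url, last_modified) tuples, where last_modified
    is a datetime.date (hypothetical input shape for this sketch).
    """
    # Use the sitemap namespace as the default so tags are written unprefixed.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # lastmod uses the W3C date format; YYYY-MM-DD is enough for most sites.
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        ("https://example.com/", date(2026, 3, 18)),
        ("https://example.com/blog/", date(2026, 3, 18)),
    ])
    print(xml_bytes.decode("utf-8"))
```

ElementTree escapes characters such as & in URLs automatically, which is one reason to prefer it over string concatenation. The xml_declaration argument requires Python 3.8 or later.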

Common mistakes:

  • Including noindex pages in the sitemap (contradictory signals)
  • Never updating lastmod (Google stops trusting it)
  • Updating lastmod on every deploy regardless of changes (same problem)
  • Forgetting to submit the sitemap to Google Search Console
  • Not including all language versions for multilingual sites
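Two of the mistakes above can be caught with a small check before deploying. The sketch below is one possible approach, assuming you already have a set of URLs that carry a noindex directive; the function name lint_sitemap and that input set are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lint_sitemap(xml_text, noindex_urls):
    """Return a list of human-readable problems found in a sitemap.

    noindex_urls: set of URLs known to be noindex (assumed to come from
    your own build data; this sketch does not fetch pages).
    """
    problems = []
    root = ET.fromstring(xml_text)
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = entry.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if loc in noindex_urls:
            # Contradictory signals: "index this" and "do not index this".
            problems.append(f"{loc}: listed in sitemap but marked noindex")
        if not lastmod:
            problems.append(f"{loc}: missing lastmod")
    return problems
```

Running a check like this in CI and failing the build on a non-empty result keeps contradictory entries from reaching production.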

How Webentity handles this

Webentity generates your sitemap automatically at build time. Every page - static pages, blog posts, landing pages, translations - is included with accurate lastmod dates pulled from git history (the actual last time content changed, not deploy time).
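As an illustration of the general idea (not a description of Webentity's internal implementation), a lastmod date can be read from git with the standard git log command. The helper names below are hypothetical.

```python
import subprocess
from datetime import date


def parse_git_date(output):
    """Convert `git log --format=%cI` output to YYYY-MM-DD, or None if empty."""
    line = output.strip()
    if not line:
        return None  # the file has no commits yet
    # %cI is strict ISO 8601, so the first 10 characters are the calendar date.
    return date.fromisoformat(line[:10]).isoformat()


def git_lastmod(path):
    """Date of the most recent commit that touched `path` (run inside the repo)."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%cI", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_git_date(result.stdout)
```

Two caveats apply to this sketch: any commit that touches the file counts, including cosmetic edits, and a shallow clone in CI may not contain enough history to give the right date.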

When you publish a new blog post or add a page, the sitemap updates on the next deploy. For faster discovery on Bing and Yandex, our IndexNow integration notifies those search engines as soon as new content goes live. Google does not use IndexNow, so it still relies on the sitemap and regular crawling.
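For readers curious what an IndexNow notification looks like, the sketch below builds the JSON POST request defined by the IndexNow protocol. It is an illustration only, not Webentity's code; the key and key file URL are placeholders you would replace with your own.

```python
import json
import urllib.request
from urllib.parse import urlsplit

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(urls, key, key_location):
    """Build (but do not send) an IndexNow submission for a list of URLs.

    All URLs are assumed to be on the same host; key and key_location are
    placeholders for the key file you host on your own site.
    """
    payload = {
        "host": urlsplit(urls[0]).netloc,
        "key": key,
        "keyLocation": key_location,
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen would send it. Keeping construction separate from sending makes the payload easy to inspect and test.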

Your sitemap is also referenced in robots.txt so crawlers find it automatically - no manual Search Console submission needed (though we recommend submitting there too for monitoring).
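If you want to confirm that a robots.txt file really does point to a sitemap, Python's standard library can read the Sitemap lines for you. This is a small sketch; the function name sitemaps_in_robots is made up for the example, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    example = "User-agent: *\nAllow: /\n\nSitemap: https://example.com/sitemap.xml\n"
    print(sitemaps_in_robots(example))
```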