The Definitive Guide To Sitemaps With Python

Posted by Dmitry

sitemaps python sitemapa

Sitemaps are important, especially for big websites. It is always a good idea to develop your website with SEO in mind. Unfortunately, most developers ignore this part. This article describes the general idea and how to implement your sitemaps with Python. I wrote it for myself in the first place, because I tend to forget things.

What Is a Sitemap

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="">

Sitemaps help search engines discover your website's pages. You collect your most important URLs in a set of XML files. Different sitemaps can contain different types of entries: plain URLs, images, videos, and news. Image, video, and news entries are just URLs with additional metadata.

Sitemaps are especially important if you have a website with a lot of pages. Now, I will not go into details, because obviously you're a smart person and will find everything at Google Search Central or

Just a few simple rules for you:

  • You can combine sitemaps in index sitemaps.
  • A sitemap must not exceed 50 MB (uncompressed) or 50,000 URLs.
  • A sitemap can be compressed with gzip.
  • Don't forget to link your sitemaps in robots.txt.
  • All sitemaps must be in the same domain.
  • "priority" and "changefreq" are ignored by Google, so don't waste space on them.
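
The URL limit means a large list of pages has to be split across several sitemap files. Here is a minimal sketch of that split, assuming the URLs are already collected in a list. The helper name chunk_urls and the example.com URLs are hypothetical, for illustration only.

```python
from itertools import islice

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the rules above


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive lists of at most `size` URLs."""
    iterator = iter(urls)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# hypothetical URLs, for illustration only
all_urls = [f"https://example.com/page/{n}" for n in range(120_000)]
chunks = list(chunk_urls(all_urls))
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each chunk can then be written to its own sitemap file and listed in an index sitemap.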

Don't forget to sign up for Google Search Console and submit your sitemaps.

Can I Link To Multiple Sitemaps In robots.txt?

Yes, you can. The Sitemap directive can be used multiple times. Here is a real-world example:
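
If you generate robots.txt from code, repeating the directive is just a loop. The sketch below is a hypothetical illustration, not a real site's file: the helper name build_robots_txt and the example.com URLs are made up.

```python
def build_robots_txt(sitemap_urls):
    """Return robots.txt text that allows all crawlers and lists every sitemap."""
    lines = ["User-agent: *", "Allow: /", ""]
    # one Sitemap directive per sitemap; the directive may be repeated
    lines += [f"Sitemap: {url}" for url in sitemap_urls]
    return "\n".join(lines) + "\n"


# hypothetical sitemap URLs, for illustration only
robots_txt = build_robots_txt([
    "https://example.com/sitemaps/pages.xml",
    "https://example.com/sitemaps/images.xml.gz",
])
print(robots_txt)
```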


Create Your First Sitemap With Python

Here is the idea. You'll need three modules: xml, os, and, optionally, gzip. This snippet shows how a sitemap can be created.

import os
import gzip

# cElementTree was removed in Python 3.9; ElementTree offers the same API
from xml.etree import ElementTree as cElementTree

def add_url(root_node, url, lastmod):
    # each page is a <url> node with <loc> and <lastmod> children
    doc = cElementTree.SubElement(root_node, "url")
    cElementTree.SubElement(doc, "loc").text = url
    cElementTree.SubElement(doc, "lastmod").text = lastmod

    return doc

def save_sitemap(root_node, save_as, **kwargs):
    compress = kwargs.get("compress", False)

    # split "sitemaps/sitemap" into directory and file name
    dest_path, sitemap_name = os.path.split(save_as)

    sitemap_name = f"{sitemap_name}.xml"
    if compress:
        sitemap_name = f"{sitemap_name}.gz"

    save_as = os.path.join(dest_path, sitemap_name)

    # create sitemap path if it does not exist
    if dest_path:
        os.makedirs(dest_path, exist_ok=True)

    tree = cElementTree.ElementTree(root_node)
    if not compress:
        tree.write(save_as, encoding='utf-8', xml_declaration=True)
    else:
        # gzip sitemap
        with gzip.open(save_as, 'wb') as gzipped_sitemap_file:
            tree.write(gzipped_sitemap_file, encoding='utf-8', xml_declaration=True)

    return sitemap_name

# create root XML node
sitemap_root = cElementTree.Element('urlset')
sitemap_root.attrib['xmlns'] = "http://www.sitemaps.org/schemas/sitemap/0.9"

# add urls
add_url(sitemap_root, "", "2022-04-07")
add_url(sitemap_root, "", "2022-04-07")
add_url(sitemap_root, "", "2022-04-07")

# save sitemap. xml extension will be added automatically
save_sitemap(sitemap_root, "sitemaps/sitemap")

# if you want to gzip sitemap
save_sitemap(sitemap_root, "sitemaps/sitemap", compress=True)


If you want to add image, video, or news sections, you'll need to add the matching XML namespace attributes to your root node.

# create root XML node
sitemap_root = cElementTree.Element('urlset')
sitemap_root.attrib['xmlns'] = "http://www.sitemaps.org/schemas/sitemap/0.9"

# for images add
sitemap_root.attrib["xmlns:image"] = "http://www.google.com/schemas/sitemap-image/1.1"

# for videos add
sitemap_root.attrib["xmlns:video"] = "http://www.google.com/schemas/sitemap-video/1.1"

# for news add
sitemap_root.attrib["xmlns:news"] = "http://www.google.com/schemas/sitemap-news/0.9"

# add this snippet to attach image to url
def add_url_image(url_node, image_url):
    image_node = cElementTree.SubElement(url_node, "image:image")
    cElementTree.SubElement(image_node, "image:loc").text = image_url

    return image_node

# now when you want to add image to url
# (no trailing comma here, otherwise url_1 becomes a tuple)
url_1 = add_url(sitemap_root, "", "2022-04-07")
add_url_image(url_1, "")

I will not describe here how to add videos or news to your url, because with this code you can easily do it yourself.
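
As a starting point, a video entry can follow the same pattern as add_url_image. This is only a sketch: the function name add_url_video, its parameters, and the example.com URLs are hypothetical, and it writes just four video tags (thumbnail, title, description, content location). Check Google's video sitemap documentation for the tags you actually need.

```python
from xml.etree import ElementTree


def add_url_video(url_node, thumbnail_url, title, description, content_url):
    """Attach a <video:video> entry to an existing <url> node."""
    video_node = ElementTree.SubElement(url_node, "video:video")
    ElementTree.SubElement(video_node, "video:thumbnail_loc").text = thumbnail_url
    ElementTree.SubElement(video_node, "video:title").text = title
    ElementTree.SubElement(video_node, "video:description").text = description
    ElementTree.SubElement(video_node, "video:content_loc").text = content_url
    return video_node


# standalone demo with hypothetical URLs
root = ElementTree.Element("urlset")
root.attrib["xmlns"] = "http://www.sitemaps.org/schemas/sitemap/0.9"
root.attrib["xmlns:video"] = "http://www.google.com/schemas/sitemap-video/1.1"

url_node = ElementTree.SubElement(root, "url")
ElementTree.SubElement(url_node, "loc").text = "https://example.com/videos/grilling"

add_url_video(
    url_node,
    "https://example.com/thumbs/grilling.jpg",
    "Grilling steaks for summer",
    "How to get perfectly done steaks",
    "https://example.com/video/grilling.mp4",
)

xml_text = ElementTree.tostring(root, encoding="unicode")
print(xml_text)
```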

How To Create Index Sitemap

If you have a lot of pages on your website, or you simply want to split your sitemaps into sections, you'll need an index sitemap. An index sitemap is just an XML file with the root tag sitemapindex, which holds sitemap tags containing the URLs of your sitemaps.

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="">

Let's improve our code to create index sitemap. Add function add_sitemap_url at the beginning of your file.

def add_sitemap_url(root_node, sitemap_url):
    sitemap_url_node = cElementTree.SubElement(root_node, "sitemap")
    cElementTree.SubElement(sitemap_url_node, "loc").text = sitemap_url

    return sitemap_url_node

Then use it whenever you need it.

# create sitemapindex tag
sitemap_index_node = cElementTree.Element('sitemapindex')
sitemap_index_node.attrib['xmlns'] = "http://www.sitemaps.org/schemas/sitemap/0.9"

# append links to other sitemaps
add_sitemap_url(sitemap_index_node, "")
add_sitemap_url(sitemap_index_node, "")

# save index sitemap (same node that was created above)
save_sitemap(sitemap_index_node, "sitemaps/sitemap")

You can find code here. Feel free to comment or ask questions.

Sitemapa Library

Create sitemaps with Python

Here is the code: GitHub
And package: PyPI

Now, for small sitemaps, it's all pretty easy. If you need to generate lots of sitemaps with image, video, or news metadata, your code will become messy at some point. I created sitemapa as a little abstraction over the XML burden.

Sitemapa is a small package that reduces your work when generating sitemaps. You describe your sitemaps with a JSON structure. Sitemapa is framework-agnostic and does not index your website; it just generates sitemaps from your description. Nothing more. I use it to generate sitemaps for millions of URLs on my websites.

Keep in mind that it's your job to validate your URLs and lastmod dates.
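
A minimal sketch of such a check, using only the standard library. The helper names are hypothetical, and the rules (absolute http/https URL, YYYY-MM-DD date) are a simplifying assumption: the sitemap protocol also accepts full W3C datetime values for lastmod, which this sketch would reject.

```python
from datetime import date
from urllib.parse import urlsplit


def is_valid_url(url):
    """Accept only absolute http(s) URLs with a host part."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_valid_lastmod(value):
    """Accept only calendar dates that parse as ISO dates, e.g. 2022-06-04."""
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


# hypothetical input, for illustration only
candidates = [
    ("https://example.com/about", "2022-06-04"),
    ("example.com/no-scheme", "2022-06-04"),
    ("https://example.com/bad-date", "2022-13-01"),
]
valid = [(u, d) for u, d in candidates if is_valid_url(u) and is_valid_lastmod(d)]
print(valid)
```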


  • Use JSON to describe your sitemaps. Don't waste your time with XML.
  • No extra dependencies.
  • Create regular sitemaps. URLs, Images, Videos and News are supported.
  • Create index sitemaps to combine your regular sitemaps.
  • Create extra attributes for your tags like <video:price currency="EUR">1.99</video:price>.
  • Compress sitemaps with gzip.
  • Image, video, and news xmlns attributes are added automatically.


pip install sitemapa

# import in your script
from sitemapa import Sitemap, IndexSitemap

Create Standard Sitemap. Sitemap Class API.

You need to import Sitemap class to create a standard sitemap: from sitemapa import Sitemap. Sitemap class has two methods: append_url and save.

append_url(url, url_data=None)
Parameters: url(str) — Website URL
            url_data(Optional[dict]) — URL Description
            url_data can contain the following keys:
              - lastmod
              - changefreq. Ignored by Google
              - priority. Ignored by Google
              - images. To describe URL images
              - videos. To describe URL videos
              - news. To describe URL news

Return type: dict. Dictionary with all urls and url_data

# ------

save(save_as, **kwargs)
Parameters: save_as(str) — Sitemap name and where to save. For example: sitemap1.xml or sitemap1.xml.gz

Return type: str. For example sitemap1.xml or sitemap1.xml.gz

Let's create a sitemap like this and save it as sitemap1.xml.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="">

And this is the implementation with sitemapa:

from sitemapa import Sitemap

standard_sitemap = Sitemap()

standard_sitemap.append_url("", {
    "lastmod": "2022-06-04"
})

# method 'save' will reset inner dictionary with URLs
sitemap1_name = standard_sitemap.save("sitemap1.xml")

# now, if you want to create new sitemap, just do this:
sitemap2_name = standard_sitemap.save("sitemap2.xml")

Add Images To Your Standard Sitemap

Let's take this example from Google.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns=""

To do so, we'll use url_data description.

from sitemapa import Sitemap

sitemap_with_images = Sitemap()

sitemap_with_images.append_url("", {
    "images": [
        ""
    ]
})

# you can also describe like this
sitemap_with_images.append_url("", {
    "images": [
        {
            "loc": "",
            "lastmod": "2022-05-05"
        }
    ]
})

As you can see you can use a list of images or a list of dictionaries. I prefer the first option, since Google deprecated all keys except loc.

Add Videos To Your Standard Sitemap

This is where it gets a little tricky. Videos have a more complex structure. Let's dive into details, using an example from Google.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns=""
       <video:title>Grilling steaks for summer</video:title>
       <video:description>Alkis shows you how to get perfectly done steaks every
       <video:restriction relationship="allow">IE GB US CA</video:restriction>
       <video:price currency="EUR">1.99</video:price>

from sitemapa import Sitemap

sitemap = Sitemap()

sitemap.append_url("", {
    "videos": [
        {
            "thumbnail_loc": "",
            "title": "Grilling steaks for summer",
            "description": "Alkis shows you how to get perfectly done steaks every time",
            "content_loc": "",
            "player_loc": "",
            "duration": "600",
            "expiration_date": "2021-11-05T19:20:30+08:00",
            "rating": "4.2",
            "view_count": "12345",
            "publication_date": "2007-11-05T19:20:30+08:00",
            "family_friendly": "yes",
            "restriction": {
                "$value": "IE GB US CA",
                "relationship": "allow"
            },
            "price": {
                "$value": "1.99",
                "currency": "EUR"
            },
            "requires_subscription": "yes",
            "uploader": {
                "info": "",
                "$value": "GrillyMcGrillerson"
            },
            "live": "no"
        }
    ]
})

You can see that each item in the videos list is a description of a <video:video> tag. Take a look at the "restriction" key. Each property (except $value) adds an attribute to <video:restriction>. $value is a special property: it is the content of the tag. So basically it works like this: <video:restriction relationship="allow">restriction[$value]</video:restriction>.
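
To make the mapping concrete, here is a sketch of how such a dictionary could be turned into an XML element with the standard library. It illustrates the idea only and is not sitemapa's actual code; dict_to_element is a hypothetical helper.

```python
from xml.etree import ElementTree


def dict_to_element(tag, description):
    """Build an element: "$value" becomes the text, other keys become attributes."""
    element = ElementTree.Element(tag)
    for key, value in description.items():
        if key == "$value":
            element.text = value
        else:
            element.set(key, value)
    return element


restriction = dict_to_element("video:restriction", {
    "$value": "IE GB US CA",
    "relationship": "allow",
})
print(ElementTree.tostring(restriction, encoding="unicode"))
# <video:restriction relationship="allow">IE GB US CA</video:restriction>
```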

Add Google News To Your Standard Sitemap

Keep in mind that Google requires you to include only recent articles in a news sitemap. Read more about this here.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="" xmlns:news="">
                <news:name>The Example Times</news:name>
            <news:title>Companies A, B in Merger Talks</news:title>

And this is the implementation with sitemapa:

from sitemapa import Sitemap

sitemap = Sitemap()

sitemap.append_url("", {
    "news": [
        {
            "publication": {
                "$tags": {
                    "name": "The Example Times",
                    "language": "en"
                }
            },
            "publication_date": "2008-12-23",
            "title": "Companies A, B in Merger Talks"
        }
    ]
})

As you can see, we just added new tags (<news:name> and <news:language>) inside <news:publication> using the $tags key.
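
The same idea can be sketched with the standard library: every entry of a $tags dictionary becomes a child element of the parent tag. Again, this is an illustration under my own assumptions, not sitemapa's internals; tags_to_children and its prefix parameter are hypothetical.

```python
from xml.etree import ElementTree


def tags_to_children(parent_tag, tags, prefix="news"):
    """Build <prefix:parent_tag> with one child element per entry in `tags`."""
    parent = ElementTree.Element(f"{prefix}:{parent_tag}")
    for name, text in tags.items():
        ElementTree.SubElement(parent, f"{prefix}:{name}").text = text
    return parent


publication = tags_to_children("publication", {
    "name": "The Example Times",
    "language": "en",
})
publication_xml = ElementTree.tostring(publication, encoding="unicode")
print(publication_xml)
```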

Can I Describe Images, Videos and Google News all at once?

sitemap.append_url("", {
    "lastmod": "",
    "images": [],
    "videos": [],
    "news": []
})

Create Index Sitemap with Sitemapa

We'll use an example sitemap from the beginning of this article. Import IndexSitemap from sitemapa. IndexSitemap class has two methods: append_sitemap and save.

from sitemapa import IndexSitemap

index_sitemap = IndexSitemap()



This article is my summary for sitemaps. I hope it helps you on your journey. Don't forget to verify everything with official resources. If you have any questions or you see mistakes in this text, don't be shy and drop me a line.
