Why to add indexes?
- We add indexes on database table columns to speed up the select query as indexes allow faster lookup.
- Adding indexes on multiple columns will slow down the write query, so we need to add indexes only on required columns.
Why do we face downtime when adding indexes?
- In most of the databases, adding an index requires an exclusive lock on the table.
- Because, we don’t want to execute any
UPDATE
,INSERT
, andDELETE
operations while the index is created. - Locking a table might create a problem if the system needs to be available when creating the indexes. If the table has more data, then it will take more time to create the indexes.
- As a result, the system will be unavailable till the index creation process completes.
What can be done to prevent the database unavailability when adding indexes?
- If the database vendors didn’t provide the functionality to create the indexes without locking the table then we can’t do anything.
- But, some database vendors like PostgreSQL provide the functionality to create the indexes without locking the table.
- We can create indexes on PostgreSQL concurrently using keyword
CONCURRENTLY
. - Let’s see a simple SQL query to create an index concurrently.
CREATE INDEX CONCURRENTLY "my_index" ON my_table("my_column");
Django migration to apply indexes concurrently on a PostgreSQL database
Let’s consider a simple scenario of product table.
from django.db import models
class Product(models.Model):
name = models.CharField()
description = models.TextField()
class Meta:
db_table="product"
Let’s say we didn’t add the indexes on the table and the data is on the table is huge and it’s resulting in slow queries. so, we need to create the indexes for the column name
.
By defult django runs the migrations by acquiring the lock on the table. To avoid it, we need to write a custom migration.
Let’s create a migration to create indexes for the table concurrently.
002_product_index.py
import django.contrib.postgres.indexes
from django.db import migrations, models
from django.contrib.postgres.operations import AddIndexConcurrently
class Migration(migrations.Migration):
atomic = False
dependencies = [
("app_name", "001_initial"),
]
operations = [
AddIndexConcurrently(
model_name="product",
index=models.Index(
fields=["name"], name="name_idx"
),
),
]
- When we run the concurrent migrations on django we need to set
atomic = False
on the migration file. - To create the index concurrently on PostgreSQL django ORM provides class
AddIndexConcurrently
- By using
AddIndexConcurrently
andatomic=False
we can create the indexes concurrently.
Let’s create a SQL for the above migration to confirm the SQL
python manage.py sqlmigrate app_name 002_product_index
It will generate the SQL like below
CREATE INDEX CONCURRENTLY "name_idx" ON product("name");
Now, we can confirm that the migration will create the indexes concurrently. Let’s apply the migration with the below command
python manage.py migrate app_name 002_product_index
Now, check the database. We should be able to see the index created for the table.
That’s it folks. stay tuned for more articles.
References:
- https://docs.djangoproject.com/en/dev/ref/contrib/postgres/operations/#django.contrib.postgres.operations.AddIndexConcurrently
- https://docs.djangoproject.com/en/3.2/howto/writing-migrations/#non-atomic-migrations
- https://www.postgresql.org/docs/9.1/sql-createindex.html
Thank you for reading the Agiliq blog. This article was written by Anjaneyulu Batta on Jul 31, 2021 in django , postgresql .
You can subscribe ⚛ to our blog.
We love building amazing apps for web and mobile for our clients. If you are looking for development help, contact us today ✉.
Would you like to download 10+ free Django and Python books? Get them here