Set up PostgreSQL and MongoDB in Django using Docker

In this post, you'll learn how to integrate multiple databases with the Django framework and route incoming data with a DB router that automatically writes each object to the required database.

Real-world example scenario

Most projects use a relational database such as Postgres or MySQL, but sometimes we also need a NoSQL database to hold large, loosely structured data and take that load off the relational database.

Assume that your project generates tons of logs while processing heavy tasks in a queue. These log objects are better stored in a non-relational database than written to the relational database each time, where huge, messy log objects would overload it. I guess you can spot the problem here, so let's take a look at what we can do about it.

Setting up environment

Create an empty directory named app, then start by creating a Dockerfile that copies our working directory into the image and installs the system dependencies that our Python packages need.

app/Dockerfile

FROM python:3.8-slim

RUN apt-get update \
    && apt-get upgrade -y \
    && apt-get install -y \
    build-essential \
    libssl-dev \
    libffi-dev \
    python3-dev \
    libjpeg-dev \
    zlib1g-dev \
    gcc \
    libc-dev \
    bash \
    git \
    && pip3 install --upgrade pip


ENV LIBRARY_PATH=/lib:/usr/lib

ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1

WORKDIR /app

COPY . /app

RUN pip3 --no-cache-dir install -r requirements.txt

These system packages are needed to build Python packages that ship without prebuilt wheels, and git lets pip install djongo straight from a repository. Now we need to add requirements.txt to install the required packages for our project.

app/requirements.txt

celery==5.0.2
Django==3.1.3
git+https://github.com/thepylot/djongo.git#egg=djongo
mongoengine==0.20.0
pylint-mongoengine==0.4.0
pymongo==3.11.0
psycopg2-binary==2.8.5
redis==3.5.3
Faker

We are going to use djongo, which translates SQL queries into MongoDB queries so that MongoDB can serve as a backend database for our Django project. At the time of writing, djongo does not support Django versions above 3.0.5, but this is easy to work around: fork the package to your own repository and change the supported version there. The fork contains a file named setup.py that holds the package configuration, and its install_requires block lists the supported Django versions.

setup.py

install_requires = [
    'sqlparse==0.2.4',
    'pymongo>=3.2.0',
    'django>=2.1,<=3.0.5',
]

We just need to adjust it to fit our current version of Django. I am using 3.1.3, so I will replace 3.0.5 with 3.1.3 and it will look like below:

install_requires = [
    'sqlparse==0.2.4',
    'pymongo>=3.2.0',
    'django>=2.1,<=3.1.3',
]
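
To see why the original constraint blocks Django 3.1.3 and the relaxed one does not, here is a minimal sketch of an inclusive range check. The helper names parse_version and in_range are made up for illustration, and the comparison is a simplification that only handles plain dotted numbers; pip itself follows the full PEP 440 rules.

```python
def parse_version(version):
    """Turn '3.1.3' into (3, 1, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, minimum, maximum):
    """Return True when minimum <= version <= maximum (both ends inclusive)."""
    return parse_version(minimum) <= parse_version(version) <= parse_version(maximum)


# Original pin: django>=2.1,<=3.0.5
print(in_range("3.1.3", "2.1", "3.0.5"))
# Relaxed pin: django>=2.1,<=3.1.3
print(in_range("3.1.3", "2.1", "3.1.3"))
```

The first call reports that 3.1.3 falls outside the original range, which is why the install fails until the upper bound is raised.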

Once you have finished, open the requirements.txt of the fork and change the Django version there as well:

requirements.txt (in djongo)

...
django>=2.0,<=3.1.3
...

That's it! Commit your changes to the forked repository, then reference it in your own requirements.txt so that pip installs djongo from your fork:

git+https://github.com/YOUR_GITHUB_USERNAME/djongo.git#egg=djongo

You don't have to go through all these steps yourself, because my fork is already included in the requirements.txt above. Feel free to use it if your Django version is 3.1.3 as well.

There are some other dependencies as well: celery is the task queue that runs time-consuming tasks in the background, and redis is the message broker for celery. This topic is out of scope for this post, but you can visit Dockerizing Django with Postgres, Redis and Celery to understand it better.

Now it's time to set up our services by configuring the compose file. Create this file at the root level of your current directory, which means it will be outside of the app directory. There are going to be 5 services in total:

  1. mongo - for setting up MongoDB
  2. postgres - for setting up PostgreSQL
  3. app - Django project
  4. celery - Queue for tasks
  5. redis - Message broker that is required for celery

/docker-compose.yml

version: '3'

services:

  mongo:
    image: mongo
    container_name: mongo
    restart: always
    env_file: .env
    environment: 
      - MONGO_INITDB_ROOT_USERNAME=root
      - MONGO_INITDB_ROOT_PASSWORD=root
      - MONGO_INITDB_DATABASE=${MONGO_DB_NAME}
      - MONGO_INITDB_USERNAME=${MONGO_DB_USERNAME}
      - MONGO_INITDB_PASSWORD=${MONGO_DB_PASSWORD}
    volumes:
      - ${PWD}/_data/mongo:/data/db
      - ${PWD}/docker/_mongo/fixtures:/import
      - ${PWD}/docker/_mongo/scripts/init.sh:/docker-entrypoint-initdb.d/setup.sh
    ports:
      - 27017:27017

  postgres:
    container_name: postgres
    image: postgres:12
    restart: always
    env_file: .env
    environment:
      - POSTGRES_DB=app_db
      - POSTGRES_USER=app_db_user
      - POSTGRES_PASSWORD=supersecretpassword
      - POSTGRES_PORT=5432
    ports:
      - 5432:5432
    volumes:
      - ${PWD}/_data/postgres:/var/lib/postgresql/data
      - ${PWD}/docker/_postgres/scripts/create_test_db.sql:/docker-entrypoint-initdb.d/docker_postgres_init.sql

  redis:
    image: redis:6
    container_name: redis
    restart: always
    env_file: .env
    command: redis-server --requirepass $REDIS_PASSWORD
    ports:
      - 6379:6379
    volumes:
      - ${PWD}/_data/redis:/var/lib/redis

  app:
    build: ./app
    image: app:latest
    container_name: app
    restart: always
    command: "python manage.py runserver 0.0.0.0:8000"
    env_file: .env
    volumes:
      - ${PWD}/app:/app
    ports:
      - 8000:8000
    depends_on:
      - postgres
      - redis

  celery:
    build: ./app
    image: app:latest
    container_name: celery
    restart: always
    command: [
      "celery",
      "-A",
      "app",
      "worker",
      "-c",
      "1",
      "-l",
      "INFO",
      "--without-heartbeat",
      "--without-gossip",
      "--without-mingle",
    ]
    env_file: .env
    environment:
      - DJANGO_SETTINGS_MODULE=app.settings
      - DJANGO_WSGI=app.wsgi
      - DEBUG=False
    volumes:
      - ${PWD}/app:/app
    depends_on:
      - postgres
      - redis

networks:
  default:

I will not go through this configuration in detail, assuming you already know Docker and compose files. In short, we pull the required images for the services and set the main environment variables and ports to complete the configuration.

Now we also need to add a .env file that supplies the values of the environment variables when the services are built and started:

/.env

# Mongo DB
MONGO_DB_HOST=mongo
MONGO_DB_PORT=27017
MONGO_DB_NAME=mongo_db
MONGO_DB_USERNAME=root
MONGO_DB_PASSWORD=root
MONGO_DB_URI=mongodb://root:root@mongo:27017

# PostgreSQL
POSTGRES_HOST=postgres
POSTGRES_DB=app_db
POSTGRES_USER=app_db_user
POSTGRES_PASSWORD=supersecretpassword
POSTGRES_PORT=5432


# Redis
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=supersecretpassword
BROKER_URL=redis://:supersecretpassword@redis:6379/0
REDIS_CHANNEL_URL=redis://:supersecretpassword@redis:6379/1
CELERY_URL=redis://:supersecretpassword@redis:6379/0
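
Because os.environ.get returns None for a name that is not defined in .env, a typo in a variable name only shows up later as a confusing connection error. A minimal sketch of a fail-fast lookup follows; the helper name require_env and the example_env dictionary are hypothetical and not part of the project.

```python
import os


def require_env(name, environ=os.environ):
    """Return the value of an environment variable or raise a clear error."""
    value = environ.get(name)
    if value is None or value == "":
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demonstration with a plain dict standing in for the real environment.
example_env = {"POSTGRES_DB": "app_db"}
print(require_env("POSTGRES_DB", example_env))
```

In settings.py this could replace os.environ.get for values the project cannot run without, so a missing variable stops the container at startup with a readable message.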

Next, we'll create a new Django project inside our app folder:

docker-compose run app sh -c "django-admin startproject app ."

The project structure should be like below:

.
├── app
│   ├── app
│   │   ├── asgi.py
│   │   ├── __init__.py
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   ├── Dockerfile
│   ├── manage.py
│   └── requirements.txt
├── docker-compose.yml
└── .env

If you see db.sqlite3 in the project files, you should delete it, since we'll use Postgres as the relational database. You will also notice a _data directory, which holds the volumes of MongoDB and Postgres.

Integration with PostgreSQL

We are ready to add our primary relational database, which is going to be Postgres. Navigate to settings.py and update the DATABASES configuration like below:

import os  # Django 3.1 settings.py does not import os by default

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'HOST': os.environ.get('POSTGRES_HOST'),
        # Database name, read from POSTGRES_DB as defined in .env
        'NAME': os.environ.get('POSTGRES_DB'),
        'USER': os.environ.get('POSTGRES_USER'),
        'PASSWORD': os.environ.get('POSTGRES_PASSWORD'),
        'PORT': os.environ.get('POSTGRES_PORT', 5432),
    }
}

The default database is set to Postgres, and the environment variables are read from the .env file. Sometimes Django fails to connect because of a race between the two containers: Django starts before Postgres is ready to accept connections. To prevent this, we'll implement a custom command and add it to the command block in the compose file, so that Django waits for Postgres before launching.

The Django documentation expects custom commands to live in a management/commands/ directory inside an app. So let's create an app named core and then create a management/commands directory inside it.

docker-compose run app sh -c "django-admin startapp core"

Then add the following command to hold Django until Postgres is available:

/core/management/commands/wait_for_db.py

import time

from django.core.management.base import BaseCommand
from django.db import connections
from django.db.utils import OperationalError


class Command(BaseCommand):
    """Django command to pause execution until the database is available."""

    def handle(self, *args, **options):
        self.stdout.write('Waiting for database...')
        while True:
            try:
                # connections['default'] is lazy and never fails on its own;
                # ensure_connection() forces a real connection attempt.
                connections['default'].ensure_connection()
                break
            except OperationalError:
                self.stdout.write('Database unavailable, waiting 1 second...')
                time.sleep(1)

        self.stdout.write(self.style.SUCCESS('Database available!'))
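
The command above retries forever. If you would rather give up after a while, the same idea can be sketched as a bounded retry loop. This is an illustrative helper, not part of Django: the name wait_for, its parameters, and the flaky_check stand-in are all assumptions made for the example.

```python
import time


def wait_for(check, retries=30, delay=1.0, sleep=time.sleep):
    """Call check() until it succeeds or the retries run out.

    check is any callable that raises an exception while the service is down.
    Returns the number of attempts used and re-raises the last error on timeout.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            check()
            return attempt
        except Exception:
            if attempt == retries:
                raise
            sleep(delay)


# Stand-in for a database that becomes reachable on the third attempt.
calls = {"count": 0}


def flaky_check():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("database not ready")


# sleep is replaced with a no-op so the demonstration finishes instantly.
print(wait_for(flaky_check, retries=5, delay=0, sleep=lambda _seconds: None))
```

Inside a management command, check could be a bound method such as connections['default'].ensure_connection, so the container exits with an error instead of waiting forever when Postgres never comes up.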

Make sure you include an __init__.py in each of the sub-directories you created. Now update the app service in the compose file by adding this command:

docker-compose.yml

  app:
    build: ./app
    image: app:latest
    container_name: app
    restart: always
    command: >
        sh -c "python manage.py wait_for_db &&
               python manage.py migrate &&
               python manage.py runserver 0.0.0.0:8000"
    env_file: .env
    volumes:
      - ${PWD}/app:/app
    ports:
      - 8000:8000
    depends_on:
      - postgres
      - redis

Look at the command block only: wait_for_db and migrate now run before the development server starts.

Integration with MongoDB

Integrating MongoDB is simple thanks to djongo, which handles everything behind the scenes. Switch to settings.py again and add our second database under the alias nonrel, which stands for non-relational.

settings.py

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'HOST': os.environ.get('POSTGRES_HOST'),
        # Database name, read from POSTGRES_DB as defined in .env
        'NAME': os.environ.get('POSTGRES_DB'),
        'USER': os.environ.get('POSTGRES_USER'),
        'PASSWORD': os.environ.get('POSTGRES_PASSWORD'),
        'PORT': os.environ.get('POSTGRES_PORT', 5432),
    },
    "nonrel": {
        "ENGINE": "djongo",
        "NAME": os.environ.get('MONGO_DB_NAME'),
        "CLIENT": {
            "host": os.environ.get('MONGO_DB_HOST'),
            # Fall back to MongoDB's default port if the variable is unset
            "port": int(os.environ.get('MONGO_DB_PORT', 27017)),
            "username": os.environ.get('MONGO_DB_USERNAME'),
            "password": os.environ.get('MONGO_DB_PASSWORD'),
        },
        'TEST': {
            'MIRROR': 'default',
        },
    }
}

The same logic applies here as for the default Postgres database: the connection details are read from the environment variables defined in .env.

Setting up DB router

The DB router automatically writes objects to the proper database: for example, whenever a log object is created, it should go to MongoDB instead of Postgres. Setting up a DB router is simple. We just need to implement the router methods that Django looks for and list our non-relational models so that the router returns the proper database for them.

Create a new directory named utils inside the core app and add an __init__.py to mark it as a Python package. Then add the DB router in a new file:

/core/utils/db_routers.py

class NonRelRouter:
    """
    A router that decides whether a model uses the
    primary database or the non-relational one.
    """

    nonrel_models = {'log'}

    def db_for_read(self, model, **_hints):
        if model._meta.model_name in self.nonrel_models:
            return 'nonrel'
        return 'default'

    def db_for_write(self, model, **_hints):
        if model._meta.model_name in self.nonrel_models:
            return 'nonrel'
        return 'default'

    def allow_migrate(self, _db, _app_label, model_name=None, **_hints):
        if _db == 'nonrel' or model_name in self.nonrel_models:
            return False
        return True

nonrel_models - the lowercase names of the models that belong to the non-relational database, MongoDB.

db_for_read - used for read operations. Each time we fetch records, it checks which database the model belongs to and returns the proper alias.

db_for_write - the same logic applies here. It picks the proper database for writing objects.

allow_migrate - decides whether migrations for a model may run on a given database. It returns False for the nonrel database and for non-relational models, since MongoDB does not need migrations.
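
The routing rules can be checked without a running database. The sketch below repeats the router logic in a self-contained form and feeds it stand-in objects; fake_model is a hypothetical helper that only mimics the _meta.model_name attribute the router reads, so this illustrates the decisions and is not a test against real Django models.

```python
from types import SimpleNamespace


class NonRelRouter:
    """Same routing rules as above, repeated so this sketch runs on its own."""

    nonrel_models = {"log"}

    def db_for_read(self, model, **hints):
        if model._meta.model_name in self.nonrel_models:
            return "nonrel"
        return "default"

    def db_for_write(self, model, **hints):
        if model._meta.model_name in self.nonrel_models:
            return "nonrel"
        return "default"

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if db == "nonrel" or model_name in self.nonrel_models:
            return False
        return True


def fake_model(name):
    """Stand-in for a model class: only _meta.model_name is needed here."""
    return SimpleNamespace(_meta=SimpleNamespace(model_name=name))


router = NonRelRouter()
print(router.db_for_write(fake_model("log")))    # log objects go to MongoDB
print(router.db_for_read(fake_model("post")))    # everything else uses Postgres
print(router.allow_migrate("default", "core", model_name="post"))
print(router.allow_migrate("default", "core", model_name="log"))
```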

Next, we should add extra configuration in settings.py to activate our custom router:

settings.py

...

DATABASE_ROUTERS = ['core.utils.db_routers.NonRelRouter', ]

...

Great! Our database configurations are finished and now it's time to make a few changes in Django as well before launching everything.

Setting up Celery and Redis

This part is a bit out of scope, but I want to illustrate a real-world app, and these tools are almost always present in projects to handle heavy and time-consuming tasks. Let's add Celery to our project. The file should be placed in the project folder, alongside the settings file:

celery.py

import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'app.settings')

app = Celery('app')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

Basically, it discovers the tasks defined in the installed apps and registers them so they can be sent to the queue. Next, we also need to update the __init__.py file inside the same directory, which is our Django project package:

__init__.py

from .celery import app as celery_app

__all__ = ['celery_app']

Celery requires a broker URL for tasks so in this case, we will use Redis as a message broker. Open your settings file and add the following configurations:

settings.py

...

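# With namespace='CELERY', each setting is CELERY_ plus the upper-cased
# new-style Celery setting name (task_ignore_result, worker_hijack_root_logger).
CELERY_TASK_TRACK_STARTED = True
CELERY_TASK_TIME_LIMIT = 30 * 60
CELERY_TASK_IGNORE_RESULT = True
CELERY_BROKER_URL = os.environ.get('CELERY_URL')
CELERY_WORKER_HIJACK_ROOT_LOGGER = False
REDIS_CHANNEL_URL = os.environ.get('REDIS_CHANNEL_URL')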

...
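
The .env file repeats the Redis host, port and password inside each URL. As an alternative, the broker URL could be assembled from the individual variables. The following is a minimal sketch under that assumption; build_redis_url is a hypothetical helper, and percent-encoding the password is a precaution for passwords that contain characters such as @ or /.

```python
from urllib.parse import quote


def build_redis_url(host, port, password, db=0):
    """Assemble a redis:// URL; the password is percent-encoded for safety."""
    return f"redis://:{quote(password, safe='')}@{host}:{port}/{db}"


# Values taken from the .env file shown earlier.
print(build_redis_url("redis", 6379, "supersecretpassword", 0))
```

In settings.py the arguments would come from REDIS_HOST, REDIS_PORT and REDIS_PASSWORD, which keeps the password in a single place.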

Now try to run docker-compose up -d and all services should start successfully.

Adding models and celery tasks

In this section, we'll pass a new task to the Celery queue and test whether our DB router works properly and writes objects to the required database. Create a new directory named models inside the core app (remove the generated core/models.py so the names don't clash) and add the following file to hold the MongoDB models:

core/models/mongo_models.py

from djongo import models as mongo_models



class Log(mongo_models.Model):
    _id = mongo_models.ObjectIdField()
    message = mongo_models.TextField(max_length=1000)

    created_at = mongo_models.DateTimeField(auto_now_add=True)
    updated_at = mongo_models.DateTimeField(auto_now=True)

    class Meta:
        _use_db = 'nonrel'
        ordering = ("-created_at", )

    def __str__(self):
        return self.message

As you can see, it's a very simple Log model that lets us know what's going on inside our application. Then add a model for Postgres as well:

core/models/postgres_models.py

from django.db import models

class Post(models.Model):
    title = models.CharField(max_length=255)
    description = models.TextField()

    def __str__(self):
        return self.title

We are going to generate random posts and make some of them fail in order to write error logs for failed data.

Lastly, don't forget to include __init__.py inside models directory:

models/__init__.py

from .postgres_models import *
from .mongo_models import *

Now we need to create a celery task that will generate tons of posts with random values by using Faker generators:

core/tasks.py

import logging
import random
from faker import Faker
from .models import Post
from celery import shared_task

from core.utils.log_handlers import LoggingHandler

logger = logging.getLogger(__name__)
logger.addHandler(LoggingHandler())

@shared_task
def create_random_posts():
    fake = Faker()
    number_of_posts = random.randint(5, 100)
    for i in range(number_of_posts):
        try:
            # Every fifth post gets no title, which violates the NOT NULL
            # constraint on Post.title and forces a failure to log.
            title = None if i % 5 == 0 else fake.sentence()
            description = fake.text()
            Post.objects.create(
                title=title,
                description=description,
            )
        except Exception as exc:
            logger.error("The post number %s failed due to the %s", i, exc)

By setting the title to None for every fifth post, we force some of the posts to fail so that we can catch the exceptions and write them to MongoDB. As you may have noticed, we are using a custom log handler that writes the log data as soon as it's produced. So, create a new file named log_handlers.py inside the utils folder:

core/utils/log_handlers.py

import logging

from core.models import Log


class LoggingHandler(logging.Handler):
    """Save log messages to MongoDB
    """

    def emit(self, record):
        Log.objects.create(
            message=self.format(record),
        )
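
A handler that writes to a database can itself fail, for example when MongoDB is unreachable, and that failure would then surface inside the code that merely tried to log. The standard logging module provides Handler.handleError for this case. The sketch below shows the pattern with an in-memory list standing in for Log.objects.create; ListHandler and the demo.posts logger name are made up for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted log messages in a list (stand-in for a database write)."""

    def __init__(self, store):
        super().__init__()
        self.store = store

    def emit(self, record):
        try:
            self.store.append(self.format(record))
        except Exception:
            # Let logging report the problem instead of crashing the caller.
            self.handleError(record)


messages = []
demo_logger = logging.getLogger("demo.posts")
demo_logger.setLevel(logging.ERROR)
demo_logger.propagate = False  # keep the demo output out of the root logger
demo_logger.addHandler(ListHandler(messages))

demo_logger.error("The post number %s failed due to the %s", 0, "null title")
print(messages)
```

Applied to the handler above, the same try/except around Log.objects.create would keep a MongoDB outage from breaking the Celery task that is doing the logging.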

Great! We are almost ready to launch.

Lastly, let's finish by creating a very simple view and URL path to trigger the task from the browser and return a JSON response once the task has been queued.

core/views.py

from django.http.response import JsonResponse
from .tasks import create_random_posts


def post_generator(request):
    create_random_posts.delay()
    return JsonResponse({"success": True})

urls.py

from django.contrib import admin
from django.urls import path

from core.views import post_generator

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', post_generator)
]

Include the core app in the INSTALLED_APPS configuration in settings.py, then create the migrations and we're done! The app service applies them with migrate the next time it starts.

docker-compose run app sh -c "python manage.py makemigrations core"

Now, if you navigate to 127.0.0.1:8000, the task will be passed to the queue and post objects will start to be generated. Visit the admin (with the models registered in core/admin.py) and you'll see that log objects were created successfully. This way we avoid overloading Postgres, which handles only relational data.

Source Code

GitHub - thepylot/django-mongodb-postgres: Integration of multiple databases with Django framework and navigate incoming data using DB router which automatically writes them to required database.
