Django Excessive Data Exposure: Examples and Prevention


StackHawk|May 9, 2022

Learn and understand what excessive data exposure is and how you can prevent it in your Django application using REST and GraphQL.

We can perform all kinds of activities online, such as shopping, internet surfing, reading books, banking, and more. But have you considered how we're able to receive so much information? It's possible thanks to Application Programming Interfaces (APIs) that enable data sharing between an application's different entities. Even though tasks are easy to accomplish with the help of APIs, we still have to deal with security issues. Is the data that's shared over the internet secure? What if it's hacked? This brings us to excessive data exposure. In this article, we'll define what it is, explain how it can be harmful, and show how it applies to Django, one of the Python web frameworks.

What Is Excessive Data Exposure?

When you build an application, there are certain data requirements. Some applications require real-time synchronization. In every application, you generally want to have access control over the data. But there are scenarios where there is no control over specific information, which causes excessive data exposure. Excessive data literally means extra data. Sometimes the client gets access to more data than what was requested. Exposing this extra data causes security and privacy issues. Sensitive information is prone to be leaked during communication, and hackers can easily take advantage. Let's look at the cause of this vulnerability. 

What Causes Excessive Data Exposure in Django REST?

An API is responsible for communication between software or websites. REST has been the standard way to design various web APIs, but it comes with some weaknesses, including the over-fetching of data. Over-fetching simply means that the client downloads more information than required. The extra data becomes a liability because it can be intercepted or leaked along the way. How does this happen? A plain REST endpoint returns a fixed representation of a resource, and the front end has no way to ask for only the fields it needs. When the client requests data, the server responds with everything related to that resource, including fields the client never uses. Therefore, the client gets a response with extra data that can lead to data leakage. Let's look at this problem using an example.

Example Use Case

Consider a fashion store that has online services. Customers purchase outfits from the website by first logging in and submitting some relevant information. While purchasing, they use internet banking services where the data is stored on the server. Now, say the store is running a marketing campaign that requires information about its customers for further analysis to improve its sales. Let's say it needs the names and addresses of customers from the database on the server. It would send the request below:

https://apparel.com/api/customers/show?customer_id=1

The API would then send the request to the server and generate a response. The response includes all the information stored about the customer.

{
  "id": 1,
  "name": "Jacob",
  "phone": "123-123-5345",
  "address": "XYZ street, House - 2, California, USA",
  "creditCard": "1234 5678 1234 5678",
  "CVV": "123",
  "validUntil": "2025"
}

The above data contains certain confidential information. If exposed, it can create a big problem. Therefore, you'll need to remove the excess data before it reaches the client. If you want to dive deeper into this vulnerability, you can find a detailed guide on it here. Now let's discuss some of the techniques that can prevent excessive data exposure.
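To make the idea of removing excess data on the server concrete, here is a minimal sketch in plain Python. It assumes the campaign only needs the name and address; the `CAMPAIGN_FIELDS` allow-list and the `filter_record` helper are hypothetical names used for illustration, not part of Django or of the store's API.

```python
# Hypothetical allow-list: the marketing campaign only needs these two fields.
CAMPAIGN_FIELDS = ("name", "address")


def filter_record(record, allowed_fields):
    """Return a new dict that contains only the allowed fields of record."""
    return {key: record[key] for key in allowed_fields if key in record}


# The full customer record from the example response above.
customer = {
    "id": 1,
    "name": "Jacob",
    "phone": "123-123-5345",
    "address": "XYZ street, House - 2, California, USA",
    "creditCard": "1234 5678 1234 5678",
    "CVV": "123",
    "validUntil": "2025",
}

# Only the allow-listed fields are kept; card details never leave the server.
safe_response = filter_record(customer, CAMPAIGN_FIELDS)
print(safe_response)
```

An allow-list is used here rather than a block-list so that any field added to the record later stays private until someone explicitly decides to expose it.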

How to Prevent Excessive Data Exposure in Django

There are various ways to deal with data exposure.  

  1. The client should not be responsible for filtering data. If the server sends everything and relies on the client to hide it, an attacker can simply read the full response. Filtering the information on the server side prevents this. If, on the other hand, you do need to send sensitive fields, you should mask the data. 

  2. Encrypt the data during the transfer of information from one end to the other. If the traffic is intercepted, the attacker won't be able to read it. 

  3. Automate API security testing with security tools so that you're alerted whenever something like this occurs. 
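The masking mentioned in the first point could look something like the sketch below. It is only an illustration in plain Python: the `mask_card_number` helper is a hypothetical name, and showing the last four digits is an assumption rather than a rule taken from this article.

```python
def mask_card_number(card_number, visible=4):
    """Replace all but the last `visible` digits with '*', keeping separators."""
    total_digits = sum(ch.isdigit() for ch in card_number)
    to_hide = max(total_digits - visible, 0)

    masked = []
    for ch in card_number:
        if ch.isdigit() and to_hide > 0:
            # Hide this digit and count it against the number left to hide.
            masked.append("*")
            to_hide -= 1
        else:
            # Keep separators and the trailing visible digits unchanged.
            masked.append(ch)
    return "".join(masked)


masked = mask_card_number("1234 5678 1234 5678")
print(masked)
```

Masking should happen on the server before the response is built, so the full number is never sent over the network.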

Next let's look at how excessive data exposure can occur in a REST API and how to fix it. 

REST Application in Django

First, let's create a simple REST API in Django. 

Create a New Django Project

In order to create a new Django project, run the following command inside a directory of your choice: 

django-admin startproject user

Next, jump into the project directory. This is where you'll create a new Django application that consists of user information.

django-admin startapp info
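The steps in this article do not show a settings change, but the serializer and view used later rely on Django REST Framework being installed (for example with pip install djangorestframework) and on both it and the new app being listed in INSTALLED_APPS. A minimal sketch of that part of the project's settings.py, assuming the default apps generated by startproject:

```python
# settings.py (fragment): the default apps plus the two additions below.
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "rest_framework",  # Django REST Framework, used by the serializer and view
    "info",            # the app created with startapp above
]
```

Without the info entry, makemigrations will not detect the models defined in the next step.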

Create Models in Django

Models are the classes in Django that describe your data and give you create, read, update, and delete (CRUD) access to the corresponding database tables. To define one, open the models.py file in the info directory and add the following code: 

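from django.db import models


class user_info(models.Model):
    # Basic contact details stored for each user
    firstname = models.CharField(max_length=30)
    lastname = models.CharField(max_length=20)
    # Kept as an integer for this example; a CharField would preserve leading zeros
    phone_no = models.IntegerField()

    def __str__(self):
        return self.lastname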

Then open the admin.py file and register the data model as shown below: 

from django.contrib import admin

from .models import user_info

# Register your models here.
admin.site.register(user_info)

Finally, migrate these models into the database using the following command: 

python manage.py makemigrations
python manage.py migrate

Run the Server

To run your Django server, use the following command: 


python manage.py runserver

You should see the following update on your terminal.

[Image: terminal output after starting the development server]

Now let's see what the application looks like.

[Image: the application in the browser]

Creating a Superuser

It's time to create a superuser that will allow access to the admin page.

python manage.py createsuperuser

Then enter a username and password. 

Django Administration

Now go to the admin page to add some data to the database by appending /admin to the server URL: http://127.0.0.1:8000/admin. The page should look like this:

[Image: Django admin login page]


Next, input your credentials to enter your admin page with access to the database and start adding data to your table.

[Image: Django admin page for adding data]

Serialization

After adding all the required data, create a serialize.py file in the app directory. The serializer converts model instances into native Python data types, which can then be rendered into formats like JSON or XML for the response.

from rest_framework import serializers

from info.models import user_info


class user_infoSerializer(serializers.ModelSerializer):
    class Meta:
        model = user_info
        fields = '__all__'

Views in Django

The view is an important component in Django. It receives a request from the client and returns a response, so it is where the API endpoint is implemented. Update the views.py file as follows.

from rest_framework.response import Response
from rest_framework.views import APIView

from .models import user_info
from .serialize import user_infoSerializer


class userList(APIView):
    def get(self, request):
        # Fetch every user and serialize all of their fields
        user1 = user_info.objects.all()
        serializer = user_infoSerializer(user1, many=True)
        return Response(serializer.data)

Final Output

The last step is to create the URL to view the JSON data, i.e., the user list. The urls.py of the main project should contain the following code:

from django.contrib import admin
from django.urls import path

from info import views

urlpatterns = [
    path('admin/', admin.site.urls),
    path('user/', views.userList.as_view()),
]


You can check the final result by opening the URL http://127.0.0.1:8000/user in your browser.

[Image: JSON response listing all users]


The API returned all the data from the database. But what if you want only some data? Django can return only the data a client wants after removing all the extra. Let's check it out. 

Django RESTQL

Django RESTQL is a third-party package for Django REST Framework that lets the client specify which fields the server should return in the response. To use this package, first install it on your system. 

pip install django-restql

In order to work with this package, the serialize.py file will require some changes. 

from django_restql.mixins import DynamicFieldsMixin
from rest_framework import serializers

from info.models import user_info


class user_infoSerializer(DynamicFieldsMixin, serializers.ModelSerializer):
    class Meta:
        model = user_info
        fields = '__all__'

The package allows querying the data according to the requirements of the client. The response generated by the REST API earlier consisted of all the data fields. Let's suppose the client only wants to see the names of the users, meaning that the other fields, like ID and phone number, are extra. To handle this, you can add the query to the URL. 

http://127.0.0.1:8000/user/?query={firstname, lastname}

[Image: JSON response containing only first and last names]

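You can see in the figure above that the output contains only the users' first and last names, so the response carries just what this client asked for. Keep in mind that the query is chosen by the client: any field that should never leave the server must also be left out of the serializer, for example by listing explicit field names in Meta.fields instead of '__all__'. 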
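The interplay between a client-supplied query and a server-side limit can be sketched in plain Python. This is a simplified illustration and not how django-restql is implemented: `parse_query`, `select_fields`, and `EXPOSED_FIELDS` are hypothetical names, the sample user is made up, and only flat queries such as {firstname, lastname} are handled.

```python
# Hypothetical server-side allow-list: phone_no is deliberately not exposed.
EXPOSED_FIELDS = {"id", "firstname", "lastname"}


def parse_query(query):
    """Parse a flat query such as '{firstname, lastname}' into field names."""
    inner = query.strip().strip("{}")
    return [name.strip() for name in inner.split(",") if name.strip()]


def select_fields(record, query):
    """Return only the requested fields that the server allows to be exposed."""
    requested = parse_query(query)
    return {
        name: record[name]
        for name in requested
        if name in EXPOSED_FIELDS and name in record
    }


# Made-up sample row for illustration.
user = {"id": 1, "firstname": "Jacob", "lastname": "Smith", "phone_no": 1231235345}

print(select_fields(user, "{firstname, lastname}"))
print(select_fields(user, "{firstname, phone_no}"))
```

The second call shows why the server-side list matters: even when a client asks for phone_no, it is not returned.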


Conclusion: Preventing Excessive Data Exposure in Django

With RESTful APIs, you have to carefully structure your responses and database queries so that you don't accidentally send sensitive information back in the API. Filtering your back-end APIs according to what data the client requests will go a long way toward helping you avoid excessive data exposure. 


This post was written by Siddhant Varma. Siddhant is a full-stack JavaScript developer with expertise in frontend engineering. He’s worked with scaling multiple startups in India and has experience building products in the Ed-Tech and healthcare industries. Siddhant has a passion for teaching and a knack for writing. He's also taught programming to many graduates, helping them become better future developers.

