Protecting JSON endpoints in Bottle


Bottle is a great framework for serving quick APIs. For example, serving a "Hello world" web page takes only the following code, taken from the Bottle home page.

In [1]:
from bottle import route, run, template

@route('/hello/<name>')
def index(name):
    return template('<b>Hello {{name}}</b>!', name=name)

# Run this line if you want to run the server
#run(host='localhost', port=8080)

Sometimes you expose an API and want to document what it expects, and to raise an error when a request does not match.

There are people who will abuse your API by providing things you did not expect.

There are some questions on Stack Overflow about which status code to send when someone sends you things you don't expect.

One way is to raise an error every time a request does not match a prescribed schema. I found a neat way to do this with decorators and JSON Schema, which I'll describe in this post.

For a minimal example we will implement an API which says hello but only if you give it a certain token.

In [3]:
import bottle
app = bottle.Bottle()


@app.post("/hello")
def hello():
    token = bottle.request.json['token']
    if token.startswith('please'):
        return 'hello'
    else:
        return 'good bye'

With this defined, we can run our app with app.run(port=8080) and it will work. But what if you want to impose certain limitations on the token? If the checks go inside the handler, the actual API code gets wonky and cluttered once the specification is large.

For now, let's say we want our token to be a string exactly 100 characters long. Here's how you do it with JSON Schema and decorators. We define a new function called json_validate and decorate our API with it.

In [8]:
from functools import wraps

import bottle
from jsonschema import validate
from jsonschema.exceptions import ValidationError


def json_validate(function):
    # The schema is the text between the two '#-#-#-#' markers in the docstring
    schema = function.__doc__.split('#-#-#-#')[1].strip()
    # Eval poses no threat here since we are running on a known string
    schema = eval(schema)

    @wraps(function)
    def newfunction(*args, **kwargs):
        # Checked outside the try block so that the 415 is not caught
        # by the except clause below and turned into a 422
        if bottle.request.json is None:
            raise bottle.HTTPError(415, body='only JSON content is allowed')
        try:
            validate(bottle.request.json, schema)
        except ValidationError as e:
            error_message = 'JSON does not satisfy schema'
            error_message += '\n' + str(e)
            raise bottle.HTTPError(422, body=error_message)
        return function(*args, **kwargs)
    return newfunction

The decorator expects the API function's docstring to contain the schema between two special marker strings, #-#-#-#. It splits the docstring on these markers, picks out the schema text, and evaluates it into a schema object. This happens once, when the function is decorated. There's no harm in running eval here since it only runs on strings that we have written ourselves.
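
Since the schema in the docstring is plain JSON, the extraction step could also be done with json.loads instead of eval, which would additionally accept JSON literals such as true and null. The following is only a minimal sketch of that alternative, not the code used in this post; the names extract_schema, MARKER and example are made up for illustration.

```python
import json

MARKER = "#-#-#-#"  # same marker string used in the docstrings of this post


def extract_schema(function):
    """Parse the JSON schema found between the markers in a docstring."""
    parts = (function.__doc__ or "").split(MARKER)
    if len(parts) < 3:
        # Fewer than three parts means the two markers are not both present
        raise ValueError("no schema block in docstring of " + function.__name__)
    return json.loads(parts[1])


def example():
    """
    Hypothetical endpoint used only to show the extraction.

    #-#-#-#
    {"type": "object", "required": ["token"]}
    #-#-#-#
    """


print(extract_schema(example))
```

The explicit length check gives a readable error at decoration time if someone forgets the markers, instead of an IndexError.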

On every request, the wrapper checks whether the incoming JSON matches the schema using the jsonschema library. If it does not match, an HTTP error carrying the validation message is raised.
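
To see what the jsonschema check does on its own, outside of Bottle, here is a small sketch using the 100-character token requirement from above. The names TOKEN_SCHEMA and check are made up for this illustration.

```python
from jsonschema import validate
from jsonschema.exceptions import ValidationError

# Schema for an object with a required 100-character string called "token"
TOKEN_SCHEMA = {
    "type": "object",
    "properties": {
        "token": {"type": "string", "minLength": 100, "maxLength": 100},
    },
    "required": ["token"],
}


def check(payload):
    """Return None if payload is valid, else the validation error message."""
    try:
        validate(payload, TOKEN_SCHEMA)
    except ValidationError as e:
        return e.message
    return None


print(check({"token": "please" + "x" * 94}))  # exactly 100 characters
print(check({"token": "please"}))             # too short
print(check({}))                              # token missing
```

The first call passes validation, while the other two produce a message describing the violated rule. That message is what ends up in the body of the HTTP error.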

If the JSON satisfies the schema, however, the request continues to the function we have decorated. The hello API now becomes the following code. We define the docstring and decorate the function. That's all that needs to change in existing code.

In [9]:
@app.post("/hello")
@json_validate
def hello():
    """
    This API returns hello only if the token starts with 'please'
    Schema is described below
    
    #-#-#-#
    {
        "type": "object",
        "properties": {
            "token": {
                "type": "string",
                "minLength": 100,
                "maxLength": 100
            }
        },
        "required": ["token"]
    }
    #-#-#-#
    
    The API returns a string.
    """
    token = bottle.request.json['token']
    if token.startswith('please'):
        return 'hello'
    else:
        return 'good bye'

Nice things about this

  1. Our schema gets documented when a documentation builder like Sphinx collects docstrings.
  2. We know that this exact schema is being used to validate the incoming requests. There's no confusion about whether the documentation matches the code being executed.
  3. It's a single function we can reuse anywhere, as opposed to a hefty library.

Bad things about this

  1. There are probably better ways to do validation.
  2. It works for JSON only.
  3. It adds a jsonschema dependency.
  4. The docstring might become very long for huge schemas.

Hence, use this with a pinch of salt.