Tuesday, January 3, 2012

Make validation suck less

Validation sucks

One thing I hate in web application code is that big fat block of validation code that sits at the top of your controller (the C in MVC). I tend to add a BIG COMMENT so I can tell at a glance where the business logic actually starts. And it is not rare for the validation code to be longer than the meaty part.

Validation sucks. It's boring. It adds absolutely zero value to your business. It is only there to protect you from human error...

Embracing validation

That being written, you will need validation, so you are better off embracing it, otherwise...

[Image: creepy dog]

Since I write back-end apps, I only deal with JSON input. In a previous post, I wrote about validictory, a JSON validator. It condensed the validation code in my controller to a few lines. But I still had the boilerplate: wrapping the JSON deserialization in a try/except block, setting an error message on failure, and returning an error response to the caller.
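To make that boilerplate concrete, here is a minimal sketch of a view with its validation inline. It uses only the standard library: validate_prefs is a hypothetical stand-in for a schema validator such as validictory, and the (status, payload) return value is an assumption standing in for a real framework response.

```python
import json


def validate_prefs(data):
    """Hypothetical stand-in for a schema validator such as validictory."""
    if not isinstance(data, dict) or not isinstance(data.get('theme'), str):
        raise ValueError("'theme' must be a string")


def save_user_prefs(body):
    """View with inline validation; returns a (status, payload) pair."""
    # ---- validation boilerplate starts here ----
    try:
        data = json.loads(body)
        validate_prefs(data)
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        return 400, {'error': str(err)}
    # ---- business logic starts here ----
    return 200, {'saved': data['theme']}
```

Every view that accepts JSON ends up repeating the first half of this function, which is the part worth factoring out.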

So I wrote a decorator meant to validate requests against expected JSON schemas. Here, the decorator decorates a Pyramid view:

from myproj.lib.decorators import validate_json_input
from myproj.lib import jsonschemas

@validate_json_input(jsonschemas.save_user_prefs_schema)
def save_user_prefs(request):
    data = request.json

    # Do interesting stuff with the data ...

My view has only business logic, with validation kept out of the way. In case of a buggy schema, I can tell at a glance which schema is faulty. The decorator makes the deserialized and validated data available under request.json.
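For illustration, a schema such as save_user_prefs_schema might look like the sketch below. The field names are made up for this example, not taken from a real project. It assumes validictory's convention that properties are required unless marked otherwise.

```python
# Hypothetical contents of myproj/lib/jsonschemas.py.
# The field names below are illustrative assumptions.
save_user_prefs_schema = {
    "type": "object",
    "properties": {
        # Required by default under validictory's conventions.
        "language": {"type": "string", "enum": ["en", "fr"]},
        "items_per_page": {"type": "integer", "minimum": 1, "maximum": 100},
        # Explicitly optional.
        "newsletter": {"type": "boolean", "required": False},
    },
}
```

Because the schema is plain data that lives next to the other schemas, a validation bug points straight at one dictionary rather than at a tangle of "if" statements.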

And here is the decorator code, which returns a "400 Bad Request" response in case of error:

import simplejson as json
import validictory
from functools import wraps
from pyramid.httpexceptions import HTTPBadRequest

def validate_json_input(schema):
    """
    Validate the request's JSON input and make the deserialized data available
    under ``request.json`` before calling the decorated view.

    Return ``HTTPBadRequest`` on error.

    It will ensure:
    - request Content-Type is ``application/json``
    - JSON input (request.body) validates against the given JSON ``schema``

    """
    def decorator(func):
        @wraps(func)
        def wrapper(request):
            error = None
            try:
                if request.content_type != 'application/json':
                    msg = 'Request Content-Type must be application/json'
                    raise ValueError(msg)
                # Decode explicitly as UTF-8 rather than relying on the
                # default (ASCII) codec.
                data = json.loads(request.body.decode('utf-8'))
                validictory.validate(data, schema)
            # JSONDecodeError and ValidationError both subclass ValueError,
            # so they must be caught before the generic ValueError handler.
            except json.JSONDecodeError as err:
                error = {'error': "Corrupted/malformed JSON: %s" % err}
            except validictory.ValidationError as err:
                error = {'error': "Invalid JSON: %s" % err}
            except ValueError as err:
                error = {'error': str(err)}

            if error is not None:
                return HTTPBadRequest(body=json.dumps(error),
                                      content_type='application/json')

            request.json = data
            return func(request)
        return wrapper
    return decorator

Whenever the underlying view gets called, I am confident that the data is clean. Trust me, it removes a fair number of "if" statements from your controller/view code.

So far, I have been quite happy with this approach, making my validation process suck less.

Peppercorn

For front-end tasks, you may wish browsers could natively submit web forms in JSON to validate them using the above technique. But I don't think that will happen anytime soon. According to the current HTML5 draft, the only possible form encoding types are:
  • application/x-www-form-urlencoded (default)
  • multipart/form-data
  • text/plain
Of course, AJAX requests can submit JSON. But that is not my point.

There are many form validation libraries out there. Each has its own way of validating submitted data. But why not validate data against a JSON schema, an IETF draft on its way to becoming a standard?

Peppercorn is a clever little Python library that relies on the fact that, according to the HTML specification, form values are submitted by the browser in the order they appear in the HTML document. Building on this, Peppercorn defines a simple tokenization protocol that turns your flat and boring form submission data into richer data structures, with nested lists and dictionaries, à la JSON. Check out the Peppercorn documentation to get a feel for it!
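To give a feel for how ordered fields can encode nesting, here is a simplified sketch of the idea. It is not Peppercorn's actual implementation: parse_ordered_fields and _add are hypothetical names, and only the __start__/__end__ markers with :mapping and :sequence suffixes are handled.

```python
def _add(parent, name, value):
    """Attach a value to the innermost open container."""
    if isinstance(parent, list):
        parent.append(value)  # field names are ignored inside a sequence
    else:
        parent[name] = value


def parse_ordered_fields(fields):
    """Turn an ordered list of (name, value) pairs into nested data.

    ('__start__', 'name:mapping') opens a dict, ('__start__', 'name:sequence')
    opens a list, and ('__end__', ...) closes the innermost open container.
    """
    root = {}
    stack = [(None, root)]  # (container name, container) pairs
    for key, value in fields:
        if key == '__start__':
            name, _, kind = value.partition(':')
            stack.append((name, [] if kind == 'sequence' else {}))
        elif key == '__end__':
            if len(stack) == 1:
                raise ValueError('unbalanced __end__ marker')
            name, container = stack.pop()
            _add(stack[-1][1], name, container)
        else:
            _add(stack[-1][1], key, value)
    if len(stack) != 1:
        raise ValueError('missing __end__ marker')
    return root


fields = [
    ('name', 'Ada'),
    ('__start__', 'emails:sequence'),
    ('email', 'a@example.com'),
    ('email', 'b@example.com'),
    ('__end__', 'emails:sequence'),
    ('__start__', 'address:mapping'),
    ('city', 'Paris'),
    ('__end__', 'address:mapping'),
]
result = parse_ordered_fields(fields)
```

The marker fields would typically be hidden inputs in the form, so the trick only works because the browser preserves document order when submitting.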

To use the above decorator with peppercorn-submitted forms, just substitute json.loads() with peppercorn.parse(), adapt the enforced content type, and keep validating your input data with validictory!
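One way to avoid maintaining two near-identical decorators is to make the parser and the expected content type parameters. The sketch below is framework-free and rests on stated assumptions: FakeRequest, require_name, request.data and the (status, payload) return values are hypothetical stand-ins for Pyramid's request, validictory and HTTPBadRequest.

```python
import json
from functools import wraps


def validate_input(parse, validate, content_type):
    """Sketch: like validate_json_input, but with a pluggable parser."""
    def decorator(func):
        @wraps(func)
        def wrapper(request):
            if request.content_type != content_type:
                msg = 'Request Content-Type must be %s' % content_type
                return 400, {'error': msg}
            try:
                data = parse(request)
                validate(data)
            except ValueError as err:
                return 400, {'error': str(err)}
            request.data = data  # parsed and validated input
            return func(request)
        return wrapper
    return decorator


class FakeRequest(object):
    """Hypothetical stand-in for a framework request object."""
    def __init__(self, content_type, body):
        self.content_type = content_type
        self.body = body


def require_name(data):
    """Hypothetical stand-in for a schema validator."""
    if 'name' not in data:
        raise ValueError("missing 'name'")


@validate_input(parse=lambda req: json.loads(req.body),
                validate=require_name,
                content_type='application/json')
def save(request):
    return 200, {'name': request.data['name']}
```

For Peppercorn-submitted forms, the parse argument would presumably become something like lambda req: peppercorn.parse(req.POST.items()), with application/x-www-form-urlencoded as the content type.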

2 comments:

  1. Nice approach, though what about integrating client side validation with error labels for erroneous fields as well as server side validation? Something like jquery.validation? Sending back a 400 is not the most user friendly approach.

    Replies
    1. Returning an HTTPBadRequest is only an example which works for my task. You could return anything you want that makes more sense to you. Having client side validation working with validictory (or JSON schemas in general) would be nice to see happen, although it would be part of a larger scope. It would require a structured response so that error messages can be mapped back to form fields. Moreover, if it's directed at end-users, you might not want to expose validictory's error messages as they are very technical. Maybe you'd want to provide an alternate error message (per error type that validictory could potentially raise), and support internationalization. It's a challenging problem that I am not interested in solving. ;)