Python dictionaries are great! They give you a simple yet powerful data structure, they're easy to reference in code, and they provide an easy way to pass data to a function. Python dictionaries might be my best friends when my so-called real friends are off getting married and starting families. I especially love using them to pass several parameters to a function: instead of declaring every value as its own parameter, you can accept a single argument, such as user_stats, and just expect a dictionary with the right values.
However, this assumes that you're always going to be passed the dictionary you expect. While you can simply use user_stats.get("name"), which returns the value if the key is found and None otherwise, I did some googling around and found a faster, more formal way to declare what you want: a package called Voluptuous, which lets you create a schema that you can validate your dictionary against.
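To make the comparison concrete, here is a minimal sketch of the two styles. The key names come from the user_stats example used later in this post; the function bodies and the name hows_my_driving_separate are made up for illustration.

```python
# Style 1: every value is its own parameter (hypothetical name, for illustration).
def hows_my_driving_separate(name, driving_status, arrests, outstanding_tickets):
    return f"{name} is a {driving_status} driver with {arrests} arrests"


# Style 2: one dictionary carries all of the values.
def hows_my_driving(user_stats):
    # .get() returns None instead of raising KeyError when a key is missing.
    name = user_stats.get("name")
    status = user_stats.get("driving_status")
    arrests = user_stats.get("arrests")
    return f"{name} is a {status} driver with {arrests} arrests"


stats = {
    "name": "Pat",
    "driving_status": "careful",
    "arrests": 0,
    "outstanding_tickets": 1,
}
print(hows_my_driving(stats))
print(hows_my_driving({}))  # no error, but every value is None
```

The second call shows the weakness of relying on .get() alone: nothing fails, and the bad data just flows through.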
A quick and easy use of Voluptuous is to ensure that all values have the types that we want. Suppose we have the function hows_my_driving(user_stats) from above. We can verify its dictionary with a Voluptuous schema that maps each key to the type we expect.
So we're saying that the keys name and driving_status MUST be strings while arrests and outstanding_tickets MUST be integers. If the schema is violated, Voluptuous raises an exception that names the key and the expected type. Using a try/except statement, we can validate the dictionary and report the problem.
If we pass in a valid dictionary, with strings for name and driving_status and integers for arrests and outstanding_tickets, then we'll pass the try statement without error and proceed to the next line.
But if we pass in a dictionary where the value for the key arrests is formatted as a string, then an exception will be raised and stop the program's execution with:
The user_stats dictionary was not formatted correctly. Error: expected int for dictionary value @ data['arrests']
Of course, you can choose to handle this differently and, rather than raising an exception, let the program continue.
What if the method or user calling our method forgets a key? The keys in the schema above are optional by default, so a dictionary that is missing name will still pass the schema, but it will lead to an unhandled exception later on when we attempt to print the results:
KeyError: 'name'
name and driving_status are required for the print statement not to error out. Luckily, Voluptuous can handle this: wrapping a key in Required marks it as mandatory.
We'll also want to rework our exception handling slightly.
Now we're back to a handled exception:
Exception: The user_stats dictionary was not formatted correctly. Error: required key not provided @ data['name']
Taking this further, we may have a longer dictionary where we want to require all of the keys. Instead of explicitly marking each key as required, we can pass required=True to the Schema object.
And we can go the other way and explicitly mark some keys as optional by wrapping them in Optional.
We can take this a little further and say that there should be no more than 5 arrests by adding a Range validator.
Notice that we want to use All to combine multiple requirements on the same field.
So if we pass a dictionary with more than 5 arrests, we will receive:
The user_stats dictionary was not formatted correctly. Error: value must be at most 5 for dictionary value @ data['arrests']
Now, let's say that we want to have a URL in our dictionary that we can give to the program. Voluptuous has a solution for that: the Url validator.
Notice that until now we've been using types for the dictionary values. Here, we're using a callable to validate the url key: its value needs to be a string that is a valid URL. In the schema validation, a complete URL with a scheme and host will pass, but a string that is not a URL will not.
Let's say that you want all users to enter a date of birth, and you'd like these dates to all be entered in the correct format. You can use a custom validation function to ensure that the date is in the format that you want.
The date of birth is going to be validated against our custom Date() function.
Recently, I've been working on a Udacity Nanodegree and I decided to use Voluptuous to validate my API requests. I wanted to have an easy reusable function to validate that the expected JSON data is sent with a POST, PUT, or PATCH request. What better way than to basically validate against a dictionary?
When I first started learning how to validate forms, the idea that I learned was to start with an array of errors and then append to that array whenever data wasn't in the correct form. I still use this method of validation today! So, this will be the general idea for validating the data that our users pass to create a user object.
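As a sketch of that errors-list idea, without any web framework: validate_new_user is a hypothetical helper, and data stands in for the parsed JSON body of the request.

```python
def validate_new_user(data):
    """Collect every problem found instead of stopping at the first one."""
    errors = []
    if "username" not in data:
        errors.append("username is required")
    return errors


print(validate_new_user({"username": "pat"}))  # no problems found
print(validate_new_user({}))                   # one problem found
```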
The above code is fine if you have only a few parameters. Let's say that in addition to username, we also want the email. Unlike username, the email of the user isn't required, and we'll just store 'Not Provided' if it isn't passed. Also, we need to verify that the username is a string and that the email, if passed, is a string.
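Written by hand, those extra rules might look like the sketch below. The helper name and the error wording are hypothetical.

```python
def validate_new_user(data):
    """Return (errors, email) after checking presence and types by hand."""
    errors = []

    # username: required, and must be a string.
    if "username" not in data:
        errors.append("username is required")
    elif not isinstance(data["username"], str):
        errors.append("username must be a string")

    # email: optional with a default, but must be a string when present.
    email = data.get("email", "Not Provided")
    if not isinstance(email, str):
        errors.append("email must be a string")

    return errors, email
```

Every new parameter adds another branch or two, which is how this style gets out of hand.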
Our code is quickly becoming harder to manage. You want a quick and easy way to keep it consistent so that another developer, or even yourself after not touching the code for a while, can easily build on it. Naturally, we want to create a function so that our documentation can tell other developers, and future you, how to work with the code. Here's where a data validation package like Voluptuous comes in handy.
To start, we can define a schema at the top of the module file.
Now instead of checking the errors directly inside of the API definition, we will call another function to generate our error. This makes it easy to tell future developers to just call a function.
We're using a new function, validate_against_api, that will validate the request and return any errors. This takes the place of checking all parameters and types manually.
So,
1. We start with an api_errors list, which we'll return at the end of the function to pass any errors back to the API definition.
2. We catch the DictInvalid error, which I find usually occurs when my data object is empty. I used to send data as form data to my API before learning that sending the data as a JSON body is easier to parse and debug. Still, old habits die hard, and once or twice I've sent data as form data by mistake.
3. We use the Exception class as the next error to catch because I'll always want to iterate over the errors.
4. We iterate over the errors.errors object. It is errors.errors because a general errors object is sent, and that object carries an errors attribute because more than one error may be thrown.
5. If "required key not provided" is the message, then a required attribute on the schema was violated. We can use the path attribute of the error to get the name of the required parameter that was violated. We can also use the path and method attributes on the request object to pull more information that may be useful to an API consumer.
6. If expected is found in the message, we can tell that the violation was an incorrect type.
Now we have a reusable function that we can use to validate our API requests. This will help to keep the code consistent so that developers aren't using their own code to validate.
However, there is one difference with PATCH requests. Because PATCH requests are conventionally used so that the user can update only a single attribute of an object, we update the above validate function slightly.
We're passing an additional optional boolean parameter so that we can later not generate an error if a required parameter is not passed by setting this to True. Again, PATCH requests are recognized as a partial update so that we're not updating the entire object. This optional parameter helps us to make the required constraint a little bit more "loose".
In our main API definition, we'll also update our call slightly so that we use data.get() to get the parameter if it's defined and return None if not defined. We'll also set optional as True when we call our validate function.
Now, our validate_against_api() function is just a little bit more universal and can deal with more requests.
The above are just a few features of Voluptuous that I've found to be useful, but the GitHub repo has even more examples of functionality: https://github.com/alecthomas/voluptuous
Type checking is an important advantage that statically typed languages have always had over "hobbyist" languages. Even though a lot of the burden of type checking has always fallen to the programmer, packages like this, as well as the growing popularity of TypeScript, have really helped to ease that burden.