{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Data validation with Voluptuous (schema definitions)\n", "\n", "In this notebook we use [Voluptuous](https://github.com/alecthomas/voluptuous) to define schemas for our data. We can then use schema checking at various points in our cleanup to ensure that we meet the criteria. Finally, we can use schema checking exceptions to flag, set aside or remove impure or invalid data.\n", "\n", "
|   | Unnamed: 0 | timestamp | city | store_id | sale_number | sale_amount | associate |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 2018-09-10 05:00:45 | Williamburgh | 6 | 1530 | 1167.0 | Gary Lee |
| 1 | 1 | 2018-09-12 10:01:27 | Ibarraberg | 1 | 2744 | 258.0 | Daniel Davis |
| 2 | 2 | 2018-09-13 12:01:48 | Sarachester | 2 | 1908 | 266.0 | Michael Roth |
| 3 | 3 | 2018-09-14 20:02:19 | Caldwellbury | 14 | 771 | -108.0 | Michaela Stewart |
| 4 | 4 | 2018-09-16 01:03:21 | Erikaland | 11 | 1571 | -372.0 | Mark Taylor |