{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Hypothesis: Property-based testing\n", "\n", "In this notebook we use property-based testing to find problems in our code. [Hypothesis](https://hypothesis.readthedocs.io/en/latest/) is a library similar to Haskell’s [Quickcheck](https://hackage.haskell.org/package/QuickCheck). We’ll get to know it in more detail later, along with other test libraries: Hypothesis. [Hypothesis](https://jupyter-tutorial.readthedocs.io/en/latest/notebook/testing/hypothesis.html) can also provide mock objects and tests for numpy data types." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import re\n", "\n", "from hypothesis import assume, given\n", "from hypothesis.strategies import emails, integers, tuples" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Find range" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def calculate_range(tuple_obj):\n", " return max(tuple_obj) - min(tuple_obj)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Test with `strategies` and `given`\n", "\n", "With [hypothesis.strategies](https://hypothesis.readthedocs.io/en/latest/data.html) you can create different test data. For this, Hypothesis provides strategies for most types and arguments restrict the possibilities to suit your needs. In the example below, we use the [integers](https://hypothesis.readthedocs.io/en/latest/data.html#hypothesis.strategies.integers) strategy, which is applied to the function with the [Python-Decorator](https://docs.python.org/3/glossary.html#term-decorator) `@given`. More specifically, it takes our test function and converts it into a parameterised one to run over wide ranges of matching data:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "@given(tuples(integers(), integers(), integers()))\n", "def test_calculate_range(tup):\n", " result = calculate_range(tup)\n", " assert isinstance(result, int)\n", " assert result > 0" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "ename": "AssertionError", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mAssertionError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[4], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mtest_calculate_range\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n", "Cell \u001b[0;32mIn[3], line 2\u001b[0m, in \u001b[0;36mtest_calculate_range\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;129m@given\u001b[39m(tuples(integers(), integers(), integers()))\n\u001b[0;32m----> 2\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mtest_calculate_range\u001b[39m(tup):\n\u001b[1;32m 3\u001b[0m result \u001b[38;5;241m=\u001b[39m calculate_range(tup)\n\u001b[1;32m 4\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(result, \u001b[38;5;28mint\u001b[39m)\n", " \u001b[0;31m[... skipping hidden 1 frame]\u001b[0m\n", "Cell \u001b[0;32mIn[3], line 5\u001b[0m, in \u001b[0;36mtest_calculate_range\u001b[0;34m(tup)\u001b[0m\n\u001b[1;32m 3\u001b[0m result \u001b[38;5;241m=\u001b[39m calculate_range(tup)\n\u001b[1;32m 4\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(result, \u001b[38;5;28mint\u001b[39m)\n\u001b[0;32m----> 5\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m result \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m0\u001b[39m\n", "\u001b[0;31mAssertionError\u001b[0m: ", "\u001b[0mFalsifying example: test_calculate_range(\n tup=(0, 0, 0),\n)" ] } ], "source": [ "test_calculate_range()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we correct the test with `>=` and check it again:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "@given(tuples(integers(), integers()))\n", "def test_calculate_range(tup):\n", " result = calculate_range(tup)\n", " assert isinstance(result, int)\n", " assert result >= 0" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "test_calculate_range()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Check against regular expressions\n", "\n", "[Regular expressions](https://en.wikipedia.org/wiki/Regular_expression) can be used to check strings for certain syntactical rules. In Python, you can use [re.match](https://docs.python.org/3/library/re.html#re.match) to check regular expressions.\n", "\n", "
\n", "\n", "**Note:**\n", "\n", "On the website [regex101](https://regex101.com/) you can first try out your regular expressions.\n", "
\n", "\n", "As an example, let’s try to find out the `username` and the `domain` from email addresses:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def parse_email(email):\n", " result = re.match(\n", " r\"(?P\\w+).(?P[\\w\\.]+)\",\n", " email,\n", " ).groups()\n", " return result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we write a test `test_parse_email` to check our method. As input values we use the [emails](https://hypothesis.readthedocs.io/en/latest/data.html#hypothesis.strategies.emails) strategy of Hypothesis. As result we expect for example:\n", "```\n", "('0', 'A.com')\n", "('F', 'j.EeHNqsx')\n", "…\n", "```\n", "In the test, we assume on the one hand that two entries are always returned and that a dot (`.`) occurs in the second entry. " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "@given(emails())\n", "def test_parse_email(email):\n", " result = parse_email(email)\n", " # print(result)\n", " assert len(result) == 2\n", " assert \".\" in result[1] " ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ " + Exception Group Traceback (most recent call last):\n", " | File \"/Users/veit/sandbox/py313/.venv/lib/python3.13/site-packages/IPython/core/interactiveshell.py\", line 3577, in run_code\n", " | exec(code_obj, self.user_global_ns, self.user_ns)\n", " | ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " | File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/3120039226.py\", line 1, in \n", " | test_parse_email()\n", " | ~~~~~~~~~~~~~~~~^^\n", " | File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/102862131.py\", line 2, in test_parse_email\n", " | def test_parse_email(email):\n", " | ^^^\n", " | File \"/Users/veit/sandbox/py313/.venv/lib/python3.13/site-packages/hypothesis/core.py\", line 1705, in wrapped_test\n", " | raise the_error_hypothesis_found\n", " | ExceptionGroup: Hypothesis found 2 distinct failures. (2 sub-exceptions)\n", " +-+---------------- 1 ----------------\n", " | Traceback (most recent call last):\n", " | File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/102862131.py\", line 6, in test_parse_email\n", " | assert \".\" in result[1]\n", " | ^^^^^^^^^^^^^^^^\n", " | AssertionError\n", " | Falsifying example: test_parse_email(\n", " | email='0/0@A.ac',\n", " | )\n", " +---------------- 2 ----------------\n", " | Traceback (most recent call last):\n", " | File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/102862131.py\", line 3, in test_parse_email\n", " | result = parse_email(email)\n", " | File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/3679374379.py\", line 5, in parse_email\n", " | ).groups()\n", " | ^^^^^^\n", " | AttributeError: 'NoneType' object has no attribute 'groups'\n", " | Falsifying example: test_parse_email(\n", " | email='/@A.com',\n", " | )\n", " +------------------------------------\n" ] } ], "source": [ "test_parse_email()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With Hypothesis, two examples were found that make it clear that our regular expression in the `parse_email` method is not yet sufficient: `0/0@A.ac` and `/@A.ac`. After we have adapted our regular expression accordingly, we can call the test again:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def parse_email(email):\n", " result = re.match(\n", " r\"(?P[\\.\\w\\-\\!~#$%&\\|{}\\+\\/\\^\\`\\=\\*']+).(?P[\\w\\.\\-]+)\",\n", " email,\n", " ).groups()\n", " return result" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "test_parse_email()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.13 Kernel", "language": "python", "name": "python313" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.0" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }