{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Hypothesis: Property-based testing\n",
    "\n",
    "In this notebook we use property-based testing to find problems in our code. [Hypothesis](https://hypothesis.readthedocs.io/en/latest/) is a library similar to Haskell’s [Quickcheck](https://hackage.haskell.org/package/QuickCheck). We’ll get to know it in more detail later, along with other test libraries: Hypothesis. [Hypothesis](https://jupyter-tutorial.readthedocs.io/en/latest/notebook/testing/hypothesis.html) can also provide mock objects and tests for numpy data types."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "from hypothesis import assume, given\n",
    "from hypothesis.strategies import emails, integers, tuples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Find range"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def calculate_range(tuple_obj):\n",
    "    return max(tuple_obj) - min(tuple_obj)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Test with `strategies` and `given`\n",
    "\n",
    "With [hypothesis.strategies](https://hypothesis.readthedocs.io/en/latest/data.html) you can create different test data. For this, Hypothesis provides strategies for most types and arguments restrict the possibilities to suit your needs. In the example below, we use the [integers](https://hypothesis.readthedocs.io/en/latest/data.html#hypothesis.strategies.integers) strategy, which is applied to the function with the [Python-Decorator](https://docs.python.org/3/glossary.html#term-decorator) `@given`. More specifically, it takes our test function and converts it into a parameterised one to run over wide ranges of matching data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "@given(tuples(integers(), integers(), integers()))\n",
    "def test_calculate_range(tup):\n",
    "    result = calculate_range(tup)\n",
    "    assert isinstance(result, int)\n",
    "    assert result > 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "ename": "AssertionError",
     "evalue": "",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mAssertionError\u001b[0m                            Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[4], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mtest_calculate_range\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n",
      "Cell \u001b[0;32mIn[3], line 2\u001b[0m, in \u001b[0;36mtest_calculate_range\u001b[0;34m()\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;129m@given\u001b[39m(tuples(integers(), integers(), integers()))\n\u001b[0;32m----> 2\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mtest_calculate_range\u001b[39m(tup):\n\u001b[1;32m      3\u001b[0m     result \u001b[38;5;241m=\u001b[39m calculate_range(tup)\n\u001b[1;32m      4\u001b[0m     \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(result, \u001b[38;5;28mint\u001b[39m)\n",
      "    \u001b[0;31m[... skipping hidden 1 frame]\u001b[0m\n",
      "Cell \u001b[0;32mIn[3], line 5\u001b[0m, in \u001b[0;36mtest_calculate_range\u001b[0;34m(tup)\u001b[0m\n\u001b[1;32m      3\u001b[0m result \u001b[38;5;241m=\u001b[39m calculate_range(tup)\n\u001b[1;32m      4\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(result, \u001b[38;5;28mint\u001b[39m)\n\u001b[0;32m----> 5\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m result \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m0\u001b[39m\n",
      "\u001b[0;31mAssertionError\u001b[0m: ",
      "\u001b[0mFalsifying example: test_calculate_range(\n    tup=(0, 0, 0),\n)"
     ]
    }
   ],
   "source": [
    "test_calculate_range()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we correct the test with `>=` and check it again:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "@given(tuples(integers(), integers()))\n",
    "def test_calculate_range(tup):\n",
    "    result = calculate_range(tup)\n",
    "    assert isinstance(result, int)\n",
    "    assert result >= 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_calculate_range()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Check against regular expressions\n",
    "\n",
    "[Regular expressions](https://en.wikipedia.org/wiki/Regular_expression) can be used to check strings for certain syntactical rules. In Python, you can use [re.match](https://docs.python.org/3/library/re.html#re.match) to check regular expressions.\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "\n",
    "**Note:**\n",
    "\n",
    "On the website [regex101](https://regex101.com/) you can first try out your regular expressions.\n",
    "</div>\n",
    "\n",
    "As an example, let’s try to find out the `username` and the `domain` from email addresses:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "def parse_email(email):\n",
    "    result = re.match(\n",
    "        r\"(?P<username>\\w+).(?P<domain>[\\w\\.]+)\",\n",
    "        email,\n",
    "    ).groups()\n",
    "    return result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we write a test `test_parse_email` to check our method. As input values we use the [emails](https://hypothesis.readthedocs.io/en/latest/data.html#hypothesis.strategies.emails) strategy of Hypothesis. As result we expect for example:\n",
    "```\n",
    "('0', 'A.com')\n",
    "('F', 'j.EeHNqsx')\n",
    "…\n",
    "```\n",
    "In the test, we assume on the one hand that two entries are always returned and that a dot (`.`) occurs in the second entry. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "@given(emails())\n",
    "def test_parse_email(email):\n",
    "    result = parse_email(email)\n",
    "    # print(result)\n",
    "    assert len(result) == 2\n",
    "    assert \".\" in result[1]                                                                                                                                      "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "  + Exception Group Traceback (most recent call last):\n",
      "  |   File \"/Users/veit/sandbox/py313/.venv/lib/python3.13/site-packages/IPython/core/interactiveshell.py\", line 3577, in run_code\n",
      "  |     exec(code_obj, self.user_global_ns, self.user_ns)\n",
      "  |     ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
      "  |   File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/3120039226.py\", line 1, in <module>\n",
      "  |     test_parse_email()\n",
      "  |     ~~~~~~~~~~~~~~~~^^\n",
      "  |   File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/102862131.py\", line 2, in test_parse_email\n",
      "  |     def test_parse_email(email):\n",
      "  |                    ^^^\n",
      "  |   File \"/Users/veit/sandbox/py313/.venv/lib/python3.13/site-packages/hypothesis/core.py\", line 1705, in wrapped_test\n",
      "  |     raise the_error_hypothesis_found\n",
      "  | ExceptionGroup: Hypothesis found 2 distinct failures. (2 sub-exceptions)\n",
      "  +-+---------------- 1 ----------------\n",
      "    | Traceback (most recent call last):\n",
      "    |   File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/102862131.py\", line 6, in test_parse_email\n",
      "    |     assert \".\" in result[1]\n",
      "    |            ^^^^^^^^^^^^^^^^\n",
      "    | AssertionError\n",
      "    | Falsifying example: test_parse_email(\n",
      "    |     email='0/0@A.ac',\n",
      "    | )\n",
      "    +---------------- 2 ----------------\n",
      "    | Traceback (most recent call last):\n",
      "    |   File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/102862131.py\", line 3, in test_parse_email\n",
      "    |     result = parse_email(email)\n",
      "    |   File \"/var/folders/hk/s8m0bblj0g10hw885gld52mc0000gn/T/ipykernel_6234/3679374379.py\", line 5, in parse_email\n",
      "    |     ).groups()\n",
      "    |       ^^^^^^\n",
      "    | AttributeError: 'NoneType' object has no attribute 'groups'\n",
      "    | Falsifying example: test_parse_email(\n",
      "    |     email='/@A.com',\n",
      "    | )\n",
      "    +------------------------------------\n"
     ]
    }
   ],
   "source": [
    "test_parse_email()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With Hypothesis, two examples were found that make it clear that our regular expression in the `parse_email` method is not yet sufficient: `0/0@A.ac` and `/@A.ac`. After we have adapted our regular expression accordingly, we can call the test again:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "def parse_email(email):\n",
    "    result = re.match(\n",
    "        r\"(?P<username>[\\.\\w\\-\\!~#$%&\\|{}\\+\\/\\^\\`\\=\\*']+).(?P<domain>[\\w\\.\\-]+)\",\n",
    "        email,\n",
    "    ).groups()\n",
    "    return result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_parse_email()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.13 Kernel",
   "language": "python",
   "name": "python313"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.0"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autoclose": false,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}