{
"cells": [
{
"cell_type": "markdown",
"id": "89151575",
"metadata": {},
"source": [
"# `grep` and `find`"
]
},
{
"cell_type": "markdown",
"id": "875e0caf",
"metadata": {},
"source": [
"## `grep`\n",
"\n",
"`grep` finds and prints lines in files that match a [regular expression](https://python-basics-tutorial.readthedocs.io/en/latest/appendix/regex.html). In the following example, we search for the string `Python`:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "a851f685",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"IPython\n",
"`IPython `_, or *Interactive Python*, was initially an\n",
"advanced Python interpreter that has now grown into an extensive project\n",
"Today, IPython is not only an interactive interface to Python, but also offers a\n",
"number of useful syntactic additions for the language. In addition, IPython is\n",
" * `Miki Tebeka - IPython: The Productivity Booster\n"
]
}
],
"source": [
"!grep Python ../index.rst"
]
},
{
"cell_type": "markdown",
"id": "abcef71d",
"metadata": {},
"source": [
"The option `-w` limits the matches to the word boundaries so that `IPython` is ignored:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "971e4687",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"`IPython `_, or *Interactive Python*, was initially an\n",
"advanced Python interpreter that has now grown into an extensive project\n",
"Today, IPython is not only an interactive interface to Python, but also offers a\n"
]
}
],
"source": [
"!grep -w Python ../index.rst"
]
},
{
"cell_type": "markdown",
"id": "c26ef503",
"metadata": {},
"source": [
"`-n` shows the line numbers that match:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "a6c50f5e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"8:`IPython `_, or *Interactive Python*, was initially an\n",
"9:advanced Python interpreter that has now grown into an extensive project\n",
"11:Today, IPython is not only an interactive interface to Python, but also offers a\n"
]
}
],
"source": [
"!grep -n -w Python ../index.rst"
]
},
{
"cell_type": "markdown",
"id": "7765ec9b",
"metadata": {},
"source": [
"`-v` inverts our search"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "ee50ed37",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1:.. SPDX-FileCopyrightText: 2020 Veit Schiele\n",
"2:..\n",
"3:.. SPDX-License-Identifier: BSD-3-Clause\n",
"4:\n",
"5:IPython\n",
"6:=======\n",
"7:\n",
"8:`IPython `_, or *Interactive Python*, was initially an\n",
"9:advanced Python interpreter that has now grown into an extensive project\n",
"10:designed to provide tools for the entire life cycle of research computing.\n",
"11:Today, IPython is not only an interactive interface to Python, but also offers a\n",
"12:number of useful syntactic additions for the language. In addition, IPython is\n",
"13:closely related to the `Jupyter project `_.\n",
"14:\n",
"15:.. seealso::\n",
"18:\n",
"19:.. toctree::\n",
"23:\n"
]
}
],
"source": [
"!grep -n -v \"^ \" ../index.rst"
]
},
{
"cell_type": "markdown",
"id": "8650231d",
"metadata": {},
"source": [
"`grep` has lots of other options. To find out what they are, you can type:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e8716f88",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"usage: grep [-abcdDEFGHhIiJLlMmnOopqRSsUVvwXxZz] [-A num] [-B num] [-C[num]]\n",
"\t[-e pattern] [-f file] [--binary-files=value] [--color=when]\n",
"\t[--context[=num]] [--directories=action] [--label] [--line-buffered]\n",
"\t[--null] [pattern] [file ...]\n"
]
}
],
"source": [
"!grep --help"
]
},
{
"cell_type": "markdown",
"id": "8f7f4935",
"metadata": {},
"source": [
"In the following example we use the `-E` option and put the pattern in quotes to prevent the shell from trying to interpret it. The `^` in the pattern anchors the match to the start of the line and the `.` matches a single character."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b3557f33",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"5:IPython\n"
]
}
],
"source": [
"!grep -n -E \"^.Python\" ../index.rst"
]
},
{
"cell_type": "markdown",
"id": "ac12ed53",
"metadata": {},
"source": [
"## `find`\n",
"\n",
"`find .` searches in this directory whereby the search is restricted to directories with `-type d`."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "a9490066",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"..\n",
"../mypackage\n",
"../unix-shell\n",
"../unix-shell/.ipynb_checkpoints\n"
]
}
],
"source": [
"!find .. -type d"
]
},
{
"cell_type": "markdown",
"id": "e17d432c",
"metadata": {},
"source": [
"With `-type f` the search is restricted to files."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "486e9b52",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"./index.rst\n",
"./shell-variables.ipynb.license\n",
"./sorted-length.txt\n",
"./create-delete.ipynb\n",
"./grep-find.ipynb.license\n",
"./create-delete.ipynb.license\n",
"./length.txt\n",
"./file-system.ipynb\n",
"./pipes-filters.ipynb\n",
"./shell-variables.ipynb\n",
"./pipes-filters.ipynb.license\n",
"./file-system.ipynb.license\n",
"./grep-find.ipynb\n"
]
}
],
"source": [
"!find . -type f"
]
},
{
"cell_type": "markdown",
"id": "d4a18766",
"metadata": {},
"source": [
"With `-mtime` the search is limited to the last `X` days, in our example to the last day:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "fcd814cd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
".\n",
"./sorted-length.txt\n",
"./create-delete.ipynb\n",
"./length.txt\n",
"./file-system.ipynb\n",
"./pipes-filters.ipynb\n",
"./.ipynb_checkpoints\n",
"./grep-find.ipynb\n"
]
}
],
"source": [
"!find . -mtime -1"
]
},
{
"cell_type": "markdown",
"id": "5367a566",
"metadata": {},
"source": [
"With `-name` you can filter the search by name."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "5730e75a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"../index.rst\n",
"../unix-shell/index.rst\n",
"../extensions.rst\n",
"../start.rst\n"
]
}
],
"source": [
"!find .. -name \"*.rst\""
]
},
{
"cell_type": "markdown",
"id": "01e677f7",
"metadata": {},
"source": [
"Now we count the characters in the files with the suffix `.rst`:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "e4e869f3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 923 ../index.rst\n",
" 540 ../unix-shell/index.rst\n",
" 2229 ../extensions.rst\n",
" 820 ../start.rst\n",
" 4512 total\n"
]
}
],
"source": [
"!wc -c $(find .. -name \"*.rst\")"
]
},
{
"cell_type": "markdown",
"id": "86ddb15f",
"metadata": {},
"source": [
"It is also possible to search for a regular expression in these files:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "2cb7faf2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"../index.rst:`IPython `_, or *Interactive Python*, was initially an\n"
]
}
],
"source": [
"!grep \"ipython.org\" $(find .. -name \"*.rst\")"
]
},
{
"cell_type": "markdown",
"id": "a141f23a",
"metadata": {},
"source": [
"Finally, we filter out all results whose path contains `ipynb_checkpoints`:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "af943d5f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"./create-delete.ipynb\n",
"./file-system.ipynb\n",
"./pipes-filters.ipynb\n",
"./shell-variables.ipynb\n",
"./grep-find.ipynb\n"
]
}
],
"source": [
"!find . -name \"*.ipynb\" | grep -v ipynb_checkpoints"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.13 Kernel",
"language": "python",
"name": "python313"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.0"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}