.. SPDX-FileCopyrightText: 2020 Veit Schiele
..
.. SPDX-License-Identifier: BSD-3-Clause

Introduction
============

Target groups
-------------

The target groups are diverse, from data scientists to data engineers and
analysts to system engineers. Their skills and workflows are very different.
However, one of the great strengths of Python for Data Science is that it allows
these different experts to work closely together in cross-functional teams.

* Data scientists explore data with different parameters and summarise the
  results.
* Data engineers check the quality of the code and make it more robust,
  efficient and scalable.
* Data analysts use the code provided by data engineers to systematically
  analyse the data.
* System engineers provide the research platform based on the
  :doc:`jupyter-tutorial:hub/index` on which the other roles can perform their
  work.

In this tutorial we address system engineers who want to build and run a
platform based on Jupyter notebooks. We then explain how this platform can be
used effectively by data scientists, data engineers and analysts.

Structure of the Python for Data Science tutorial
-------------------------------------------------

From Chapter 2, the tutorial follows the prototype of a research project:

2. :doc:`workspace/index` with the installation and configuration of
   :doc:`workspace/ipython/index`, :doc:`Jupyter notebooks ` with
   :doc:`jupyter-tutorial:nbextensions/index` and
   :doc:`jupyter-tutorial:ipywidgets/index`.
#. :doc:`data-processing/index` either through a :doc:`REST API ` or directly
   from an :doc:`HTML page `.
#. :doc:`clean-prep/index` is a recurring task that involves removing or
   changing redundant, inconsistent or incorrectly formatted data.
#. :doc:`viz/index`, with its many different possibilities, has been moved to a
   separate tutorial.
#. :doc:`performance/index` introduces ways to make your code run faster.
#. :doc:`productive/index` shows what is necessary to achieve reproducible
   results: this requires not only :doc:`reproducible environments ` but also
   versioning of the :doc:`source code ` and :doc:`data `. The source code
   should be :doc:`packed into programme libraries ` with
   :doc:`documentation `, :doc:`licence(s) `, :doc:`tests ` and
   :doc:`logging `. Finally, the chapter includes advice on
   :doc:`improving code quality ` and :doc:`secure operation `.
#. :doc:`web/index` can either generate dashboards from Jupyter notebooks or
   require more comprehensive application logic, as demonstrated in
   :doc:`pyviz:bokeh/embedding-export/flask`, or provide data via a
   `RESTful API `_.

.. include:: ../README.rst
   :start-after: badges
   :end-before: first-steps

.. include:: ../README.rst
   :start-after: follow-us