How to Use Notebooks in Python
Kateryna Koidan, 24th Aug 2022
Tags: python notebooks

When reading or watching Python tutorials online, you might notice that many data scientists use Python notebooks for their projects. Let’s explore what a notebook is and how to use notebooks in Python.

Python is a general-purpose programming language that can be used for just about anything. It is also one of the key tools used in data science these days. There are several popular Python IDEs for data science, but if you follow this field for a while, you’ll notice that many data scientists choose to carry out their projects in Jupyter Notebook.

In this article, I’ll discuss what a notebook is, how to install Jupyter Notebook on your computer, how to work with notebooks, and when to use them.

If you don’t have much experience with Python yet, you may want to start with our interactive online course Introduction to Python for Data Science. With 141 coding challenges, it covers key topics that you’ll need to work with data in Python.

What Is a Python Notebook?

Jupyter Notebook (formerly IPython Notebook) is a web-based interactive computational environment that you can use to create documents that contain both computer code (e.g. Python) and rich text elements such as paragraphs of text, equations, visualizations, and links.

Instead of writing and rewriting entire Python programs, you can use notebooks for a more interactive experience. With Jupyter Notebook, you can:

- Write one or more lines of code in a cell and, if necessary, run each cell separately to see the output immediately and spot any issues.
- Display the code output, including data tables and charts, between the corresponding code snippets.
- Add explanatory text to your code.

As a result, you get a single document corresponding to your Python project. It has all the code, data, and visualizations nicely organized and structured.
It doesn’t look like a complex Python program that only programmers can understand, but rather like a ready-to-be-shared report that is easy to follow.

Python notebooks are especially popular in data science, where they make the workflow easier and more efficient. When working with data in Python, it’s very convenient to see the output of separate code lines or code snippets. Some common ways to do this are:

- df.head(): Returns the first five rows of the data table, helping you get a better understanding of your data.
- df.describe(): Returns descriptive statistics for the numerical columns of the data table.
- df.plot(x="column_1", y="column_2", kind="scatter"): Produces a scatter plot of column_2 against column_1, which shows how the two columns relate to each other.

All the methods above are part of the pandas library, one of the key libraries for data analysis. You can read more about it in our list of the top 15 Python libraries for data science.

We’ll have more examples below, but you can already see how using Jupyter notebooks speeds up your workflow and makes it easier to communicate and share your results. Now, let’s see how you can start using this tool.

How to Install a Jupyter Notebook on Your Computer?

Jupyter Notebook is not included with Python; if you want to use it, you will need to install Jupyter on your computer. There are a few options here.

Aspiring data scientists are often encouraged to use the Anaconda distribution of Python. It comes with several useful data science packages pre-installed, including Jupyter Notebook and JupyterLab. If this sounds interesting to you, go to Anaconda’s official website, choose Products → Anaconda Distribution, and download the installer corresponding to your operating system. After you complete the installation and launch the Anaconda Navigator, you’ll be able to run Jupyter Notebook from this navigator.

If you want to start using Jupyter without downloading and installing Anaconda, you can do so using the pip package manager.
The pip tool lets you install any Python package that is part of the Python Package Index, including Jupyter Notebook. For this installation, you’ll need to run a few commands from the Windows PowerShell or Command Prompt (if you’re working on Windows) or from the Terminal (if you’re working on a Mac).

1. Check the version of Python installed on your system. You’ll want to have Python 3 (Python 3.X) installed. This is the most commonly used Python version; plus, almost all installations of Python 3 ship with the pip package manager pre-installed. To check your Python version, type:

    python --version

   or

    python3 --version

2. Use pip to install Jupyter with the following command:

    pip3 install jupyter

The installation might take a few minutes, but then you’ll have Jupyter Notebook installed on your computer. Now let’s see how to work with notebooks once you have everything installed.

How to Work With Jupyter Notebooks

Each time you want to work with a notebook, you’ll need to launch the notebook server. This is done by running the following command in the Terminal (macOS) or Windows PowerShell (Windows):

    jupyter notebook

The command launches the notebook server and opens the Notebook Dashboard in a new browser window (or a new tab). This dashboard allows you to create a new notebook or select and open one of the existing notebooks:

- To open an existing notebook, simply browse the folders and click on the file with the .ipynb extension (which corresponds to Jupyter notebooks). The notebook will be opened in a new browser tab.
- To create a new notebook, click New → Python 3, and a blank notebook will be opened in a new tab.

Let’s do something with this new notebook! Specifically, let’s have a brief look at the Starbucks Menu dataset. It contains nutritional information for Starbucks food and drinks, including the calories, fat, carbs, fiber, and protein in each serving. Here, we’ll do a very basic exploratory data analysis on the drinks served at Starbucks.
For simplicity, I’ve already cleaned the dataset and removed rows with missing data. Now let’s have some fun with the Jupyter notebook:

1. Import all the necessary libraries. We’ll import pandas for processing and analyzing tabular data and Seaborn to create data visualizations. If you don’t have these libraries installed yet, you’ll need to install them before importing; use pip, as described earlier.
2. Create a data frame. Next, we load our dataset from a .csv file into a pandas data frame so that we can process and analyze it with Python. Here, df is the name we assign to our dataset.
3. View the dataset. To understand our data better, we want to have a look at the top rows of the dataset. To this end, we apply the head() method to our dataset df. By default, the first 5 rows are returned.
4. Get descriptive statistics. Next, we get descriptive statistics for all numerical columns of our dataset using the df.describe() command.
5. Draw a chart. Finally, we want to see some visualizations. For example, we can use the Seaborn library to plot calories against carbs and understand how these two attributes relate to each other.

As you can see, it’s very convenient to work with data in Python notebooks. You can select any cell and run it separately by either clicking Run in the top menu or pressing Shift + Enter. Of course, you can always run all cells at once.

In this example, I’ve only scratched the surface. There is much more that you can do with Jupyter notebooks, like adding headers or lists, highlighting code syntax, and exporting notebooks to PDF and other formats. Let’s discuss a few typical use cases for Python notebooks.

Typical Use Cases for Python Notebooks

As you saw above, Jupyter notebooks are very effective for any data-related project. If you want to work (or are working) with data, I would highly recommend this tool for your next project.
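The five exploratory steps from the Starbucks walkthrough (import, load, view, summarize, plot) can be sketched in code as follows. This is only an illustration under stated assumptions, not the exact code from the walkthrough: the file name starbucks_drinks.csv, the column names carbs and calories, and the helper names summarize and plot_relationship are all hypothetical, so adjust them to match your own copy of the dataset.

```python
from pathlib import Path

import pandas as pd  # Step 1: pandas for processing tabular data


def summarize(csv_path):
    """Load a CSV file and return the data frame, its top rows, and its statistics."""
    df = pd.read_csv(csv_path)  # Step 2: create a data frame from the .csv file
    top_rows = df.head()        # Step 3: the first 5 rows by default
    stats = df.describe()       # Step 4: statistics for the numerical columns
    return df, top_rows, stats


def plot_relationship(df, x="carbs", y="calories"):
    """Step 5: scatter plot of one numerical column against another."""
    # Seaborn is imported here so that summarize() also works without it installed.
    import seaborn as sns

    return sns.scatterplot(data=df, x=x, y=y)


if __name__ == "__main__":
    csv_file = Path("starbucks_drinks.csv")  # hypothetical file name
    if csv_file.exists():
        df, top_rows, stats = summarize(csv_file)
        print(top_rows)
        print(stats)
        plot_relationship(df)  # assumes columns named "carbs" and "calories"
```

In a notebook, each step would typically go in its own cell. The value of the last expression in a cell (for example, df.head()) is displayed automatically, so the print calls are only needed when running the code as a script.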
There are certain cases where Python notebooks are especially useful:

- Building a data science project for your portfolio. If you are looking for a job in data science, building a project portfolio is essential. With Jupyter notebooks, you can demonstrate a variety of your skills with a single document, including how you approach a project, your coding and data visualization skills, your ability to interpret the results, and your presentation skills. Get some ideas for your next data science project here.
- Documenting and sharing your coding project. Jupyter notebooks are often used by professional programmers to share their experience. Notebooks can be downloaded and studied by Python newbies. They are usually easy to follow because:
  - The output is displayed right after the code.
  - There can be extensive commenting.
  - Everything is nicely formatted.
- Sharing visualizations. Usually, we share the results of a data visualization as a static image. With Jupyter notebooks, you can share not only charts but also the code behind these charts so that others can dive in and play around.

It’s Time to Practice Python!

Now you can see how handy Jupyter notebooks can be for your next data science project. However, if you are only starting to learn Python, it would be a good idea to learn and practice the basic concepts first.

I would recommend starting with LearnPython.com’s Introduction to Python for Data Science. This interactive online course has many coding challenges covering data cleaning, basic data analysis, and simple data visualizations.

If you want to learn more, check out our Python for Data Science learning track. It includes 5 interactive courses with hundreds of exercises. These also cover topics beyond simple data analysis, like working with strings in Python and reading or writing CSV, Excel, and JSON files. Read more about our data science learning path in this article.

Thanks for reading, and happy learning!