Python, Virtual Environments and Jupyter Notebooks

In a recent issue of CEP, Dr. Jacob Albrecht published an interesting article with the title “Step Into the Digital Age with Python” (CEP, September 2021, pp. 47–53). Dr. Albrecht gives an excellent overview of the many uses of Python in chemical engineering. He also provides several examples of typical engineering calculations and the packages used to solve them. The examples are captured in Jupyter Notebooks.

Notebooks, containing a mix of text and code, are great for documenting as well as running code while showing the results in tables and graphs. If you have never used a Jupyter Notebook you should definitely try it. Dr. Albrecht’s code is a very good start and freely available for download on GitHub (https://github.com/chepyle/Python4ChEs). To follow along in this blog post, I suggest you download the code from GitHub and put the folder Python4ChEs somewhere on your hard disk. However, to modify and run these notebooks yourself, you need to have Python installed along with the required packages. In this post I show how to do this both on a Mac (macOS) and a PC (Windows).

Background

Python is an interpreted language with a straightforward syntax and powerful handling of lists and dictionaries. You can write interesting programs with native Python, but its real strength comes from the seamless interface to thousands of packages. These packages are not part of a standard installation but must be installed and imported as needed. While most projects make use of a core set of packages, other projects need many more. Since some packages depend on other packages, there needs to be a mechanism for keeping track of dependencies and avoiding version conflicts.

Virtual Environments to the Rescue

A Virtual Environment (venv) is an elegant solution for avoiding possible problems with package conflicts. It is a mechanism to collect all packages needed for a project in one folder associated with one selected version of Python. After the environment is activated, all packages you subsequently install end up in the active environment, respecting both the Python version and the other packages already in the environment. This magic is performed by Python’s package installer and manager (pip). But before I show you how to set up and activate a virtual environment, let’s go through installation of Python itself.

Installing Python and the venv

There are several ways of getting Python on your system, but I’m going to show the one I prefer and that has worked well for me. It starts with a web browser directed to https://www.python.org/downloads/. You are presented with a page from which you can download a version of Python. The latest version is shown in a big yellow button where it says “Download Python x.xx.x”. At the time of writing this post, x.xx.x = 3.10.0. I now deal with macOS and Windows separately since the installation differs slightly.

MacOS

The download comes down as a package (python-3.10.0…pkg). I double click on this file and go through the installation. You will find that the installation goes directly into the “Applications” folder ready to be used. However, when you look there you might find that you also have other, older versions of Python installed. So which one is being used? I’ll show you how we select and decide.

At this point we open a Terminal window. Navigate into the Python4ChEs folder you created earlier. We are now creating an environment for this project. But first, check what version of python is being used. On my laptop it looked like this:

MacBook:python4ches myusername$ python --version

Here I typed the part after the $ prompt. There is a good chance the answer comes back:

Python 2.7.16

or something like that. The reason is that macOS comes loaded with Python, but on older machines like mine it is version 2, which is not what you want. We will fix this with the following commands.

MacBook:python4ches myusername$ python3.10 -m venv env

This creates a Virtual Environment (venv) in the folder env. We will now activate this environment.

MacBook:python4ches myusername$ source env/bin/activate

Now if you type python --version at the $ prompt, you’ll find that we are using the latest version. You will also see that the command prompt ($) is preceded by (env). On my MacBook it looks as follows:

(env) MacBook:python4ches myusername$
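
The same check can also be made from inside Python. The following is a minimal sketch, not part of the original project; the helper name in_virtual_env is made up for this illustration. It relies on the fact that, inside an environment created with venv, sys.prefix points at the environment folder while sys.base_prefix points at the Python installation the environment was created from.

```python
import sys


def in_virtual_env():
    """Return True when the running interpreter belongs to a venv.

    Hypothetical helper: inside a venv, sys.prefix (the environment folder)
    differs from sys.base_prefix (the underlying Python installation).
    """
    return sys.prefix != sys.base_prefix


print("Python version:", sys.version.split()[0])
print("Interpreter:", sys.executable)
print("Virtual environment active:", in_virtual_env())
```

With the environment activated, the interpreter path should point into the env folder and the last line should report True.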

The first thing to do is to update the package manager (called pip) to the latest version:

(env) MacBook:python4ches myusername$ python -m pip install --upgrade pip

From this point on we can install packages that we need. We’ll start with two that I always use.

(env) MacBook:python4ches myusername$ pip install numpy

(env) MacBook:python4ches myusername$ pip install matplotlib

There is a file inside the python4ches folder called requirements.txt. It lists all the packages used for the project. You can batch install packages directly from this file (pip install -r requirements.txt), but I have found it straightforward to do it manually just as I showed for numpy and matplotlib. You then get the latest versions, and pip keeps track of potential version conflicts between the packages.

After all the required packages are installed we have two more installs to do before we can use Jupyter Notebooks.
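
When installing manually, it can help to see which of the listed packages are already present in the active environment. The sketch below is one way to do that with the standard library only. The function names are hypothetical, and the parser assumes simple lines such as numpy or numpy>=1.20; it deliberately ignores the more advanced requirements syntax.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(text):
    """Return package names from requirements-style text (simple lines only)."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is whatever comes before a version specifier or marker.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def installed_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    req_file = Path("requirements.txt")  # assumed to be in the current folder
    if req_file.exists():
        for name in requirement_names(req_file.read_text()):
            print(name, "->", installed_version(name) or "not installed")
```

Run from the project folder with the environment activated, this prints one line per package, so you can see what is still missing.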

(env) MacBook:python4ches myusername$ pip install jupyter

This installs the Notebook app and all the files required for it. The very last install I do is not with pip but with python:

(env) MacBook:python4ches myusername$ python -m ipykernel install --name=env4che

This last command ensures that I get a “kernel” for a Jupyter notebook that is associated with my virtual environment.

Now I can start a notebook page by typing:

(env) MacBook:python4ches myusername$ jupyter notebook

It starts a local server and opens a page in my web browser. I can then navigate into the “Ten_problems” folder where I find a bunch of notebooks. Open any one of them, examine it and execute it. Make sure the “env4che” kernel is selected since this ensures that you use the packages in the correct environment. (You can have many venvs, all with different packages and different versions of Python too.)
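
A quick way to confirm that a notebook really runs in the intended environment is to execute a small cell like the one below. This is only a sketch: the helper missing_packages is made up for this post, and the package list is just an example. It uses importlib.util.find_spec, which looks a module up without importing it.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The environment folder should appear in this path when the right kernel is used.
print("Environment:", sys.prefix)
print("Missing:", missing_packages(["numpy", "matplotlib"]))
```

If the environment path is not the env folder of the project, or if packages you installed show up as missing, the notebook is most likely using a different kernel.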

Windows

Installation on Windows is very similar to that on MacOS with a few subtle differences that I now point out.

The download is an .exe file such as “python-3.10.0-amd64.exe”. Double click on this file and an installation window appears. Select the “Install Now” option. It installs Python in

C:\Users\myusername\AppData\Local\Programs\Python\Python310

and says that Setup was successful.

Close the installer window and open a terminal (Command Prompt). Navigate to the python4ches folder and create the virtual environment the following way:

C:\users\myusername\python4ches>\users\myusername\appdata\local\programs\python\python310\python.exe -m venv env

Now activate the environment:

C:\users\myusername\python4ches> env\scripts\activate.bat

Go ahead and update pip.

(env) C:\users\myusername\python4ches> env\scripts\python.exe -m pip install --upgrade pip

From this point forward, installation of both packages and Jupyter is the same as for MacOS, in other words using pip install package_name.

Installation of an associated kernel for the jupyter notebook is done as follows:

(env) C:\users\myusername\python4ches> env\scripts\python.exe -m ipykernel install --name=env4che

Shutting down

After we are done running the notebooks it is a good idea to shut down the local server. In the terminal where the server is running, press:

Control-C followed by y. This will stop the server and shut down all kernels.

To exit from the virtual environment we simply type:

deactivate

We are now back to a regular command prompt with no servers running.

Generalizing

You only have to install Python once (unless you specifically want a different version from the one you now use), but you can create as many virtual environments as you need. It is definitely a good idea to create a new venv for each major project. And it is a very good idea to have a separate environment for projects you download from GitHub. The reason is that each project will have its own set of requirements (e.g. as in requirements.txt) and might rely on specific versions of the various packages used. If you had all packages installed in one place on your system, the risk of conflict would be real. With separate environments you can come back to old projects and know that they work even if you have installed new versions of several packages (in other venvs) since then. If you create your environments as “env” alongside your project files, you know which environment to activate for each project.

Summary

I have shown here how you can download Python and arrange packages needed for a project in separate folders called virtual environments. At first glance it might appear cumbersome to go through these steps for each new project you create, but believe me, you will be happy you did once you start collecting or creating new Python projects.

Swift vs Python for Dynamic Simulation

Swift and Python are two programming languages I’ve recently used to write process models for interactive, dynamic simulations. They are both modern and powerful, with Swift being the younger of the two, still evolving and improving. Python is very popular overall and has found use in a number of areas, more recently Machine Learning.

Both Python and Swift support Object-Oriented programming, which is essential for building modular dynamic simulation models. While the syntax for the two languages is different, it is still relatively straightforward to construct a function or a class in one language given the design in the other. But I must say that working from models implemented in Swift towards designs in Python seems easier than the other way around. This could have to do with the fact that Swift is statically typed and Python is dynamically typed.
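
To make the idea of a modular, object-oriented process model concrete, here is a minimal sketch in Python. It is not taken from any of the models discussed in this post: the Tank class, its parameters and the explicit Euler integration are all assumptions chosen for illustration. The model is a gravity-drained tank described by A·dh/dt = f_in − k·√h.

```python
import math


class Tank:
    """Gravity-drained tank, A*dh/dt = f_in - k*sqrt(h) (hypothetical model)."""

    def __init__(self, area, k, level):
        self.area = area    # cross-sectional area
        self.k = k          # outflow coefficient
        self.level = level  # current liquid level (the state variable)

    def outflow(self):
        """Outflow driven by the liquid level."""
        return self.k * math.sqrt(max(self.level, 0.0))

    def step(self, f_in, dt):
        """Advance the level one time step with explicit Euler integration."""
        dhdt = (f_in - self.outflow()) / self.area
        self.level = max(self.level + dhdt * dt, 0.0)
        return self.level


def simulate(tank, f_in, dt, n_steps):
    """Run the tank for n_steps and return the level after each step."""
    return [tank.step(f_in, dt) for _ in range(n_steps)]


tank = Tank(area=1.0, k=1.0, level=1.0)
levels = simulate(tank, f_in=2.0, dt=0.1, n_steps=2000)
print(f"final level: {levels[-1]:.3f}")  # approaches the steady state (f_in/k)**2
```

Each unit operation keeps its own state and exposes a step method, so larger flowsheets can be assembled from such objects. The same structure carries over to a Swift class almost line by line.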

The dynamic nature of Python makes it ideal for rapid construction and experimentation. You can create a class in Python, add some attributes and member functions (also called methods) and test it right away. This is because Python is an interpreted language. What that means is that the keywords and instructions you enter as your program are interpreted by the Python executable and then sent to various pre-compiled functions within the application for execution. The big advantage of this approach is that you don’t have to specify the types (e.g. int, double, etc.) of the variables you use; the interpreter figures that out for you from the context of how they are used. However, this flexibility comes at a cost. First, since variables are not explicitly typed, you get little to no help or warnings before you run the program. Although debugging is relatively easy, I often find that many of my errors could readily have been caught by a compiler before I even attempted to hit the “run” button.
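
The following sketch illustrates both sides of this flexibility. The Valve class and the helper try_flow are made up for the example. Attributes can be added on the fly, but a misspelled attribute name or a value of the wrong type is only discovered, if at all, when the affected line actually runs.

```python
class Valve:
    """Hypothetical valve model used only for this illustration."""

    def __init__(self, opening):
        self.opening = opening  # intended: a number between 0 and 1

    def flow(self, max_flow):
        return self.opening * max_flow


def try_flow(valve, max_flow):
    """Return the flow, or the error message Python raises at run time."""
    try:
        return valve.flow(max_flow)
    except TypeError as err:
        return f"TypeError: {err}"


good = Valve(0.5)
good.tag = "FV-101"  # attributes can be added on the fly
good.opning = 0.8    # typo: silently creates a NEW attribute

bad = Valve("0.5")   # a string slipped in, e.g. from a text field

print(try_flow(good, 10.0))  # opening is still 0.5, so this prints 5.0
print(try_flow(bad, 10.0))   # fails only now, when the line actually runs
```

A compiler for a statically typed language would reject both the misspelled attribute and the string argument before the program ever started.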

The second, and possibly most serious drawback of an interpreted language is execution speed. This is particularly true for simulations of complex models with plenty of math. While there are some areas of computation where execution speed is of secondary importance compared to flexibility, the area of dynamic simulations is not one of them.

At this point you might wonder why I even considered Python for dynamic simulations instead of using fast languages like C++, C#, and Java. And certainly, 20-30 years ago those were my workhorses. But back then computers were a lot slower than they are today and every bit of help from a compiled language was needed in order to make an interactive simulation fast enough to be useful. Today the situation is different. Most laptops or even tablets are fast enough to run an interactive, dynamic simulation written in Swift or Python at an acceptable speed. Notice the emphasis on interactive. It means that a process simulation runs much faster than real time but not so fast that you, the “operator”, don’t have time to interact and make changes. Here is an example of what I’m talking about.

Example of an interactive, dynamic simulation running on an iPad and written in Swift.

So with today’s computers Python cannot be dismissed solely on the basis of execution speed. And once we decide to keep it in the running we can focus on the many advantages offered by the language itself and its rich set of libraries. One of those is the plotting library Matplotlib (https://matplotlib.org). The library is described on its website in one sentence: “Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python”. I would add that it is powerful and easy to use. And very importantly, like Python itself, it’s open source and freely available. You will find many examples in my blogs of how I use Matplotlib in my own research.

In addition to Matplotlib the other library that is practically mandatory for dynamic simulations is NumPy (https://numpy.org). The single sentence characterization of this library is: “The fundamental package for scientific computing with Python”. NumPy is a set of routines written in C that significantly boosts the speed of mathematical expressions used to write dynamic simulation models. In my code examples you will see how I practically always use NumPy.

Another area of great importance for interactive, dynamic simulations is graphical user interfaces (GUIs). If you are writing code in Swift for iOS devices you get help in creating slick interfaces from the tools in Apple’s development environment, Xcode. Personally I have never found it particularly easy to get it exactly right, but it’s probably my old desktop/laptop background getting in the way of modern iOS thinking. Presumably for the same reasons I have found it easier to work with some of the external libraries that integrate with Python, for example PyQt5 (https://pypi.org/project/PyQt5/). This library contains a set of graphical interface routines written in C++ with a Python API. I will show in a separate blog how I have used PyQt5 to build a flexible and reusable interface for dynamic simulations.

It may feel that I’m writing mostly about Python, almost to justify its use. This is partly true because Swift does not need much in terms of justification; Apple got it right from the beginning. Swift not only supports Object-Oriented programming but also advocates what Apple calls protocol-oriented design. A Swift protocol is like a C++ or Java interface (if that helps anybody?) and does not have a direct counterpart in Python. Python’s abstract base class has some features along the lines of an interface, but not really. And Swift’s protocols are quite powerful, even more so than a classic Java interface. In fact, you can create a whole design based just on protocols that can later be implemented in various ways.
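
For readers who want something protocol-like on the Python side, the closest standard-library tool is probably typing.Protocol (available since Python 3.8). The sketch below uses made-up names (Steppable, Heater, Label) and is not a full counterpart of a Swift protocol: the isinstance check enabled by runtime_checkable only verifies that the required methods exist, while checking signatures is left to an external type checker.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Steppable(Protocol):
    """Anything that can be advanced in time (hypothetical protocol)."""

    def step(self, dt: float) -> None: ...


class Heater:
    """Conforms to Steppable without inheriting from it."""

    def __init__(self) -> None:
        self.temperature = 20.0

    def step(self, dt: float) -> None:
        self.temperature += 0.5 * dt  # made-up heating rate


class Label:
    """Has no step method, so it does not conform."""

    text = "not a model"


def advance(models, dt):
    """Step every object that satisfies the protocol, skip the rest."""
    for model in models:
        if isinstance(model, Steppable):
            model.step(dt)


heater = Heater()
advance([heater, Label()], dt=2.0)
print(heater.temperature)  # 20.0 + 0.5 * 2.0 = 21.0
```

The design can be written against Steppable first and the concrete classes supplied later, which is the spirit of a protocol-based design even if the guarantees are weaker than in Swift.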

Swift is a statically typed, compiled language. What that means is that all class variables you introduce must be of some type (e.g. Int, Double, some class or even a protocol) and be initialized before they are cleared for compilation. This is a huge advantage since you are guaranteed not to send messages to objects that can’t receive them. Of course you can still make logical mistakes just like you can in all programming languages. But that’s up to you and a good debugger to figure out. The compilation into machine code, and linking with runtime libraries, is not overly fast, but once your program is compiled it is really fast! We are talking about orders of magnitude faster than Python, especially for simulation tasks. And that is with use of NumPy. Certainly you can take steps to speed up your Python code, but then you have lost some time in development, made the code a little more fragile and perhaps lost some generality. Swift is consistently fast right from the get-go.

Now you might think that Swift is the clear winner. Well, if runtime speed were everything it would be. But there are other considerations. While Swift has been made open source (https://www.swift.org), it is still very young and does not yet have nearly the same ecosystem of support that Python has. What I’m especially missing today is a good plotting package, like Matplotlib, that is freely available and integrates readily with Swift on all platforms. It will come, I’m sure.

Swift was designed by Apple to replace Objective-C, their other language for which many excellent libraries were written. When you run Swift on iOS and macOS devices you take advantage of all that code and enjoy great performance. The open source team is porting Swift (both development and runtime) to other platforms like Windows. I have not tried Swift on Windows yet, but I know that Python runs equally well on both platforms.

Conclusions

Swift and Python are both excellent programming languages for writing dynamic simulation models. Both are modern and emphasize “people time” over run time. Python is senior to Swift and enjoys a rich set of high quality, open source libraries. Swift is statically typed, logical and fast. I use both in my work and research. For production and customer-requested models I prefer Swift. For my own research and experimentation I prefer Python. Most of my process models are implemented in both languages, developed and tested in one language first and then ported to the other.