4) Data types in python

Related text:

We will run some python today using variables and libraries.

Comments, Integers and floats

As you saw last time, basic math can be performed without importing any libraries, and python will decide if you have an integer or a float.

As engineers know, tracking units is of utmost importance. Comments can be a great way to both explain your code and track units.

In [1]:
"""
Blocks of comments can be placed between lines containing three
double-quotes, like this.
They often appear at the beginning of a function, explaining what the
   function does and noting the expected input and output.
   We'll come back to this later.
This is an example of commenting and using variables in a formula.
We'll use more advanced math (exponents, etc.) so we'll import numpy,
   an excellent math library.
"""
import numpy as np  #it is very common to import numpy as the shorthand alias, 'np'
from scipy import special  # we can also import just part of a library; this is needed to get erfc

# One of the most important syntax characters to know: '#' is the comment character
# They can start a line of comments only, or comments can follow code, as shown above and below.

# I'm old-school in certain ways; I explicitly make values floats that
#   I want treated as such, even if python will treat an integer
#   variable as a float when doing the math
gas_constant = 0.00198588  # the ideal gas constant in units of kcal / (g-mol K)
temp = 300.  # temp in K
energy = 5.  # kcal/mol

# Save a little computation by creating a term that is repeatedly needed
ea_over_rt = energy / (gas_constant * temp)

# The (integrated form of the) Maxwell-Boltzmann distribution for the
#    fraction of molecules with energy of at least E
# If consistent units were used, they should cancel out in each term
#    (fractions are unitless!)
frac_mol = np.sqrt(4. / np.pi * ea_over_rt) * np.exp(-ea_over_rt) + special.erfc(np.sqrt(ea_over_rt))
print("At {} K, {}% of molecules have energy >= {} kcal/mol.".format(temp, round(frac_mol*100.,2), energy))
At 300.0 K, 0.08% of molecules have energy >= 5.0 kcal/mol.
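
The calculation above can also be wrapped in a small function so it can be reused at other temperatures. This is only a sketch: the name `fraction_above_energy` and its parameter names are hypothetical, and it uses `math.erfc` from the standard library in place of `scipy.special.erfc` so that it runs without extra installs.

```python
import math

GAS_CONSTANT = 0.00198588  # kcal / (mol K), the same value used above


def fraction_above_energy(energy, temp, gas_constant=GAS_CONSTANT):
    """Return the fraction of molecules with energy of at least `energy`.

    energy is in kcal/mol and temp is in K.
    (Hypothetical helper, written for illustration only.)
    """
    ea_over_rt = energy / (gas_constant * temp)  # units cancel: unitless
    return (math.sqrt(4. / math.pi * ea_over_rt) * math.exp(-ea_over_rt)
            + math.erfc(math.sqrt(ea_over_rt)))


frac_300 = fraction_above_energy(5., 300.)  # same inputs as the cell above
frac_600 = fraction_above_energy(5., 600.)  # a hotter gas, for comparison
print("300 K: {}%".format(round(frac_300 * 100., 2)))
print("600 K: {}%".format(round(frac_600 * 100., 2)))
```

The fraction at 600 K should come out larger than at 300 K, since a hotter gas has more molecules above any given energy.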

Identifying and changing types

As has already appeared in the notebooks for this course, we can find out the type of any variable by using the built-in function type().

In [2]:
print(type(temp), type(ea_over_rt))
<class 'float'> <class 'float'>

We can also change the variable type with functions such as:

In [3]:
temp = int(temp)
print(temp, type(temp))
temp = str(temp)
print(temp, type(temp))
temp = float(temp)
print(temp, type(temp))
temp = str(temp)
print(temp, type(temp))
300 <class 'int'>
300 <class 'str'>
300.0 <class 'float'>
300.0 <class 'str'>

The ability to change numbers to strings and strings to numbers can be useful when reading and writing variables.
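
As a rough illustration of the reading-and-writing case, the sketch below pulls a number out of a line of text and then builds a new string from it. The text in `line` is made up for this example, and `split()` is a string method that has not been introduced yet; it simply breaks a string into pieces at the white space.

```python
line = "temp 300.0 K"        # hypothetical text, e.g. one line read from a file
parts = line.split()         # ['temp', '300.0', 'K']
value = float(parts[1])      # string -> float, so we can do math with it
print(value, type(value))

label = "T = " + str(value) + " " + parts[2]  # float -> string for writing
print(label)

# int() will not parse a string that contains a decimal point
try:
    int("300.0")
except ValueError as err:
    print("ValueError:", err)
```

Converting in two steps, `int(float("300.0"))`, is one way around that last error.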

A few more syntax tips

As with many other languages, there are shortcut ways of performing simple math on a variable:

In [4]:
x = 4
print("x =", x, type(x))
x += 3  # same as x = x + 3.
print("x =", x, type(x))
x -= 1  # same as x = x - 1.
print("x =", x, type(x))
x *= 3  # same as x = x * 3.
print("x =", x, type(x))
x /= 3  # same as x = x / 3. Note that division returns a float.
print("x =", x, type(x))
x != 3  # NOT an assignment! This is the comparison "x is not equal to 3"; x is unchanged.
print("x =", x, type(x))
x = 4 <class 'int'>
x = 7 <class 'int'>
x = 6 <class 'int'>
x = 18 <class 'int'>
x = 6.0 <class 'float'>
x = 6.0 <class 'float'>

Combining an operator with the equals sign assigns a new value to the variable on the left. For numbers, any other variable that was set from the old value is unaffected. What do you expect the following output to be?

In [5]:
my_favorite_number = 8
x = my_favorite_number
y = x
x *= 2
print("x =", x, "and y =", y)
x = 16 and y = 8
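
One way to see what happened is with the `is` operator, which checks whether two names refer to the very same object. It has not been covered in these notes, so treat this as a small sketch rather than something you need to know yet.

```python
my_favorite_number = 8
x = my_favorite_number
y = x
same_before = x is y   # both names refer to the same integer object
x *= 2                 # x now refers to a new value; y is left alone
same_after = x is y
print(same_before, same_after)
print("x =", x, "and y =", y)
```

The first `print` shows `True False`: after `x *= 2`, the name `x` refers to 16 while `y` still refers to 8.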

Note that a statement ends at the end of its line, unlike C or C++, which end statements with ;. You can continue a statement on the next line by ending the line with the \ character. Even more convenient: enclose the expression in parentheses.

In [6]:
y = 1 + 2 + 3 + 4 + 5 + 6.
print("y = ", y, type(y))
y = 1 + 2 + 3 +\
    4 + 5 + 6
print("y = ", y, type(y))
y = (1 + 2 + 3 +
     4 + 5 + 6)
print("y = ", y, type(y))
y =  21.0 <class 'float'>
y =  21 <class 'int'>
y =  21 <class 'int'>

In the code above, note the indentation. As you will see even more in the future, white space is meaningful in python.

The convention is to use four spaces for indentation (and yes, there can be blocks inside blocks; the meaningful indentation can make it much easier to read!). Note that many programs automatically convert a tab to four spaces. Check whether that is happening when you tab in your notebook!

In [7]:
total = 0
for i in range(10):
    # indentation indicates code block, and the first line of the block ends with ':'
    print("i =", i)
    total += i

print("total =", total)
i = 0
i = 1
i = 2
i = 3
i = 4
i = 5
i = 6
i = 7
i = 8
i = 9
total = 45
In [8]:
# Note the difference between this and the above code block.
total = 0
for i in range(10):
    # indentation indicates code block, and the first line of the block ends with ':'
    print("i =", i)
    total += i

    print("total =", total)
i = 0
total = 0
i = 1
total = 1
i = 2
total = 3
i = 3
total = 6
i = 4
total = 10
i = 5
total = 15
i = 6
total = 21
i = 7
total = 28
i = 8
total = 36
i = 9
total = 45

White space within lines will not affect the output. However, best practice is to put one space between each term and each operator.

Best practices in the python community are codified in the PEP 8 style guide for python code.

In [9]:
y=1+2+3+4+5+6  # saves space but is non-standard
print("y = ", y, type(y))
y       =   1    +2     + 3   +        4  + 5    + 6  # this does not help readability
print("y = ", y, type(y))
y = 1 + 2 + 3 + 4 + 5 + 6  # Much better readability. Follow this lead!
print("y = ", y, type(y))
y =  21 <class 'int'>
y =  21 <class 'int'>
y =  21 <class 'int'>

Other built-in operators

Given the amount of material to cover in this class, we can’t spend time going over all of the built-in operators. Note that I am skipping built-in math operators including modulo, floor division, and bitwise operations. If you want to learn about them, you can start with this tutorial or always just ask the internet.

Comparison operators (== for equal to, >, <, !=, >=, and <=) will come up when we get more into writing our own functions.

Boolean operators are very readable in python. The options are True and False (note the capitalization), and the standard operations are simply and, or, and not. For example:

In [10]:
x = 4
(x < 6) and (x >= 2)
Out[10]:
True
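
A few more combinations, as a quick sketch. The variable names `in_range` and `chained` are just for illustration. The chained form `2 <= x < 6` is standard python shorthand for joining two comparisons with `and`.

```python
x = 4
print(x == 4, x != 4, x > 6)       # comparisons give True or False

in_range = (x < 6) and (x >= 2)    # same expression as the cell above
chained = 2 <= x < 6               # shorthand for (2 <= x) and (x < 6)
print(in_range, chained)

print(not in_range)                # not flips True to False
print((x > 6) or (x == 4))         # or needs only one side to be True
```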

Parentheses Are for Grouping or Calling (or tuples! covered below)

You’ve seen two ways of using parentheses above: grouping statements or calling functions.

The terms inside the parentheses used to call a function (like the print function below) are called arguments. The number of arguments that a function takes depends on the function. The print function will print as many arguments as you give it. However, the type function only takes certain arguments (here, we give it just one argument; it is possible to give it 2 specific additional arguments, but that is unusual and out of scope for this course).

In [11]:
y = (2 + 1 + 3)
print("y = ", y, type(y))
y = 2 * (1 + 3)
print("y = ", y, type(y))
y = (2 * 1) + 3
print("y = ", y, type(y))
y =  6 <class 'int'>
y =  8 <class 'int'>
y =  5 <class 'int'>
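
To make the point about arguments concrete, here is a small sketch: print accepts any number of arguments, while type complains if it is given two. The variable names `kind` and `raised` are only for illustration, and the try/except syntax used to catch the error has not been covered yet.

```python
print()                 # zero arguments: prints an empty line
print("a")              # one argument
print("a", 1, 2.5)      # several arguments, separated by spaces when printed

kind = type(2.5)        # one argument: returns the type
print(kind)

raised = False
try:
    type(1, 2)          # two arguments is not allowed
except TypeError as err:
    raised = True
    print("TypeError:", err)
```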

Other collections in python: lists, tuples, and dicts

Multiple values (strings, ints, floats, …) can be stored together in a collection.

The main kinds are tuples, lists, and dicts. You’ve already seen lists, remember?

In [12]:
print(type(In))
<class 'list'>

New lists can be defined without any items in them yet (an empty list) or with items. Python makes a list when it sees square brackets. Items in a list do not have to be of the same type.

The built-in len() function will return the length of the list.

In [13]:
list_1 = []
print(list_1, type(list_1), len(list_1))
list_2 = [4, 5.2, 8, 'ten', list_1]
print("list_2 = {}, has type {}, and length {}".format(list_2, type(list_2), len(list_2)))
[] <class 'list'> 0
list_2 = [4, 5.2, 8, 'ten', []], has type <class 'list'>, and length 5

As we also previously saw, we can access entries in a list. Remember that counting starts with zero. Negative numbers access items from the end of the list.

Items in a list can be changed.

In [14]:
print("At the moment, the full list_2 = ", list_2)
print(list_2[0], type(list_2[0]))
print(list_2[1], type(list_2[1]))
print(list_2[-1], type(list_2[-1]))
print(list_2[-2], type(list_2[-2]))
list_2[-2] = 10
print(list_2[-2], type(list_2[-2]))
list_2[-1] = 2
print(list_2[-1], type(list_2[-1]))
print("Now, the full list_2 = ", list_2)
print(list_2[:2])
print(list_2[1:3])
print(list_2[2:])
At the moment, the full list_2 =  [4, 5.2, 8, 'ten', []]
4 <class 'int'>
5.2 <class 'float'>
[] <class 'list'>
ten <class 'str'>
10 <class 'int'>
2 <class 'int'>
Now, the full list_2 =  [4, 5.2, 8, 10, 2]
[4, 5.2]
[5.2, 8]
[8, 10, 2]

Built-in methods

In addition to being able to use functions like print and type on lists, lists have built-in methods. These are particular actions that python knows how to perform on any list. Here are a few examples; more can be found many places, including here.

These methods are called with parentheses, like a function, and may or may not need arguments.

In [15]:
print("list_2 = ", list_2)
list_2.sort()
print("list_2 = ", list_2)
list_2.append(7.1)
print("list_2 = ", list_2)
list_2.extend([5.3, 9.2, 4.8])
print("list_2 = ", list_2)
list_2 =  [4, 5.2, 8, 10, 2]
list_2 =  [2, 4, 5.2, 8, 10]
list_2 =  [2, 4, 5.2, 8, 10, 7.1]
list_2 =  [2, 4, 5.2, 8, 10, 7.1, 5.3, 9.2, 4.8]
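
Two details about these methods are easy to trip over, so here is a short sketch using made-up lists (`values`, `a`, and `b` are names chosen just for this example). First, sort() changes the list in place and returns None. Second, append() adds its argument as a single item, while extend() adds each item of its argument.

```python
values = [4, 5.2, 8, 10, 2]
result = values.sort()     # sorts in place; the method itself returns None
print(values, result)

a = [1, 2]
a.append([3, 4])           # one new item, which is itself a list
print(a, len(a))

b = [1, 2]
b.extend([3, 4])           # two new items
print(b, len(b))
```

Because sort() returns None, writing `values = values.sort()` would throw the list away.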

If you accidentally try to make a list by using parentheses, you’ll get a tuple instead!

In [16]:
not_a_list = ([4.3, 5.2, 8,
               2, 10], [3])
print(type(not_a_list))
<class 'tuple'>

Some things are the same about tuples and lists, like how to print parts of them:

In [17]:
print(not_a_list[0], type(not_a_list[0]))
print(not_a_list[-1], type(not_a_list[-1]))
print(not_a_list[:2])
print(not_a_list[1:3])
print(not_a_list[2:])
[4.3, 5.2, 8, 2, 10] <class 'list'>
[3] <class 'list'>
([4.3, 5.2, 8, 2, 10], [3])
([3],)
()

The big difference is that tuples cannot be changed (they are immutable); you cannot change the value of an existing item or add new items.

In [18]:
not_a_list[0] = 11.11
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-18-d1adc261984b> in <module>()
----> 1 not_a_list[0] = 11.11

TypeError: 'tuple' object does not support item assignment
In [19]:
not_a_list.append(5.5)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-19-1038aa509a89> in <module>()
----> 1 not_a_list.append(5.5)

AttributeError: 'tuple' object has no attribute 'append'
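
Since a tuple cannot be edited, the usual workaround is to build a new tuple and assign it to the same name. The sketch below uses a made-up tuple called `point`; the try/except lines just catch the error so the rest of the code can run. It also shows that a one-item tuple needs a trailing comma, because parentheses alone only group.

```python
point = (1.0, 2.0)

try:
    point[0] = 5.0                 # not allowed: tuples are immutable
except TypeError as err:
    print("TypeError:", err)

point = (5.0,) + point[1:]         # build a new tuple and reuse the name
print(point)

print(type((3)), type((3,)))       # grouping only vs. a one-item tuple
```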

However, a tuple can contain mutable items.

In [20]:
tuple_2 = ([], [1, 2, 3])
print(tuple_2)
tuple_2[0].append(5)
print(tuple_2)
tuple_2[-1][1] = 4
print(tuple_2)
([], [1, 2, 3])
([5], [1, 2, 3])
([5], [1, 4, 3])

Combinations of collections (lists of lists, tuples of lists, etc.) took me a while to wrap my head around. We’ll have more practice in the future. Also, once we get past some of these basics, we’ll get to even more fun material.

Next up: more on variables and collections