Lecture 4: Conditionals and Loops [FINISHED]
Before we can start working with data, we need to work out some of the basics of Python. The goal is to learn enough so that we can do some interesting data work --- we do not need to be Python Jedi.
We now know about the basic data structures in python, how types work, and how to do some basic computation and string manipulation. What we need next is flow control.
A python program is a list of statements. The python interpreter reads those statements from top to bottom and executes them. Depending on what is happening in our program, we may want to skip some statements, or repeat some other statements. Flow control statements manage this process.
In this notebook we will cover:
1. Booleans
2. Conditional statements
3. Loops
Remember: Ask questions as we go.
1. Booleans
Flow control often requires asking whether a statement is true or false and then taking an action conditional on the answer. For example: Is this data a string? If yes, convert to a float. If not, do nothing.
The python type bool can take on two values: True or False. Let's see it in action.
my_age = 40 # I am still old...
is_a_minor = my_age < 18 # compare age to 18; the result of the comparison is a bool
print(is_a_minor)
print(type(is_a_minor))
False
<class 'bool'>
The comparison operators we will often use are
- < (less than)
- > (greater than)
- <= (less than or equal to)
- >= (greater than or equal to)
- == (equal)
- != (not equal)
Important: We use a double equal sign == to check for equality and a single equal sign for assignment.
# a bit of code to see if the variable year is equal to the current year
year = 2019
is_current_year = (2019 == year) # the parentheses are not needed, but I like them for clarity
print(is_current_year)
True
Go back and change the third line to is_current_year = (2020 = year). What happened?
More complicated comparisons
We can build more complicated expressions using 'and' and 'or'. For 'and', all the sub-comparisons need to evaluate to True for the whole comparison to be True. For 'or', at least one of the sub-comparisons needs to be True for the whole comparison to be True.
x = (2 < 3) and (1 > 0) # Both sub-comparisons are true
print('Is 2<3 and 1>0?', x)
y = (2 < 3) and (1 < 0) # Only one sub-comparison is true
print('Is 2<3 and 1<0?', y)
z = (2 < 3) or (1 < 0) # Only one sub-comparison is true
print('Is 2<3 or 1<0?', z)
Is 2<3 and 1>0? True
Is 2<3 and 1<0? False
Is 2<3 or 1<0? True
Alternatively, when both sides are booleans, we can use a "pipe" (|) for the "or" operator and the ampersand (&) for the "and" operator. These are bitwise operators that bind more tightly than the comparison operators, so with them the parentheses around each sub-comparison are required.
x = (2 < 3) & (1 > 0) # Both sub-comparisons are true
print('Is 2<3 and 1>0?', x)
y = (2 < 3) & (1 < 0) # Only one sub-comparison is true
print('Is 2<3 and 1<0?', y)
z = (2 < 3) | (1 < 0) # Only one sub-comparison is true
print('Is 2<3 or 1<0?', z)
Is 2<3 and 1>0? True
Is 2<3 and 1<0? False
Is 2<3 or 1<0? True
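The word operators (and, or) and the symbol operators (&, |) differ in two ways that are easy to trip over: precedence and short-circuiting. The cell below is only a sketch to illustrate this; the numbers and variable names are made up for the example.

```python
# Precedence: | binds tighter than <, so without parentheses python
# reads 1 < 2 | 4 < 3 as the chained comparison 1 < (2 | 4) < 3.
with_parens = (1 < 2) | (4 < 3)
without_parens = 1 < 2 | 4 < 3
print('With parentheses:', with_parens)
print('Without parentheses:', without_parens)

# Short-circuiting: 'and' stops as soon as the answer is known,
# so the division on the right is never attempted when x is 0.
x = 0
safe = (x != 0) and (1 / x > 1)
print('Using and:', safe)

# '&' always evaluates both sides, so the same test divides by zero.
try:
    unsafe = (x != 0) & (1 / x > 1)
    raised = False
except ZeroDivisionError:
    raised = True
print('Did & raise an error?', raised)
```

For plain True/False tests like the ones in this notebook, and/or are usually the safer default.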
Comparing strings
Given the nature of data, we might need to compare strings. Remember, programming languages are picky...
gender = 'Male'
is_male = ('male' == gender)
print(is_male)
False
Case matters. Luckily, python has lots of features to manipulate strings. We will learn some of these as we go along. In this case we use the lower() method of the string class to make the string all lower case.
We are using the 'dot' notation again without really explaining it yet, but that explanation is coming.
gender_lowcase = gender.lower() # we are applying the lower() method to the variable gender
print('gender, after being lowered:', gender_lowcase)
is_male = 'male' == gender_lowcase
print(gender_lowcase, is_male)
is_male = 'male' == gender.lower() # NB, we're pointing out that you don't have to store the lowered string separately
print(gender.lower(), is_male)
gender, after being lowered: male
male True
male True
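Case is not the only thing that can break a string comparison; stray white space does too. Here is a small sketch that chains strip() with lower(). The value of entry is invented for illustration and is not from any dataset.

```python
entry = '  MALE \n'   # hypothetical messy value, with extra spaces and a newline

# strip() removes leading and trailing white space; lower() then fixes the case
cleaned = entry.strip().lower()
print(repr(entry), '->', repr(cleaned))

is_male = ('male' == cleaned)
print(is_male)
```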
2. Conditional statements
Conditional statements check a condition. If the condition is true, one block of code runs. If the condition is false, a different block of code runs.
Important: Earlier, I mentioned that white space doesn't matter around operators like + or * and that we can insert blank lines wherever we want. Here comes a caveat: When we form a conditional, the lines following the condition statement must be indented, and every line in the same branch must be indented by the same amount. Four spaces is the standard convention. The indents define the lines of code that are executed in each branch of the if statement.
quantity = 10
if quantity > 0:
    print('This print statement occurred because the statement is true.') # this indented code is the 'if branch'
    print('The quantity is positive.')
    temp = quantity + 5
    print('When I add 5 to the quantity it is:', temp, '\n')
else:
    print('This print statement occurred because the statement is false.') # this indented code is the 'else branch'
    print('The quantity is not positive.\n')
print('This un-indented code runs no matter what.')
This print statement occurred because the statement is true.
The quantity is positive.
When I add 5 to the quantity it is: 15

This un-indented code runs no matter what.
Now go back to the code and change quantity to 0, or -10 and run the cell. What happens?
Now go back to the code and change the indentation of the first print statement after if quantity > 0: to be two spaces. Run the cell. What happened?
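What python checks is that indentation is consistent within a block, not that it is a particular width. The sketch below illustrates this by handing two small pieces of code, stored as strings, to the built-in compile() function. The strings and the '<demo>' label are invented for this example, and try/except is a tool we have not covered yet.

```python
# Two spaces on every line of the branch: consistent, so python accepts it.
consistent = "if True:\n  a = 1\n  b = 2\n"

# Two spaces, then four, inside the same branch: python rejects it.
mixed = "if True:\n  a = 1\n    b = 2\n"

compile(consistent, '<demo>', 'exec')   # no error is raised here
consistent_ok = True

try:
    compile(mixed, '<demo>', 'exec')
    mixed_ok = True
except IndentationError as err:
    mixed_ok = False
    print('python complained:', err.msg)

print('consistent indentation accepted?', consistent_ok)
print('mixed indentation accepted?', mixed_ok)
```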
# the else is optional.
size = 'md'
if (size == 'sm') or (size == 'md') or (size == 'lg'):
    print('A standard size was requested.\n')
print('This un-indented code runs no matter what.')
A standard size was requested.

This un-indented code runs no matter what.
Change size to 'xxl'. Run the cell.
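When a condition is a chain of == tests joined by or, one common alternative is the in operator, which asks whether a value appears in a list. This is a sketch of that idea, not a replacement for the cell above; standard_sizes is a name chosen for this example.

```python
standard_sizes = ['sm', 'md', 'lg']   # hypothetical list of allowed sizes

size = 'md'
is_standard = size in standard_sizes   # True if size matches any element
if is_standard:
    print('A standard size was requested.')

xxl_is_standard = 'xxl' in standard_sizes
print("Is 'xxl' a standard size?", xxl_is_standard)
```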
Practice: Conditionals
Take a few minutes and try the following. Feel free to chat with those around you if you get stuck.
- Edit this markdown cell and write True, False, or error next to each statement
1 > 2
'bill' = 'Bill'
(1 > 2) or (2*10 < 100)
'Dennis' == 'Dennis'
x = 2; 0 < x < 5
x = 0.10; y = 1/10; x == y
- Before you run the code cell below: do you think it will be true or false?
- Run the code cell.
x = 1/3
y = 0.333333333 # This is an approximation of 1/3
print(x == y)
False
In the previous cell, add a few more 3s to the end of the definition of y so you get a better approximation of x. Can you get x==y to be true?
Representing a number that has no finite base-2 fractional representation (such as 1/3 or 0.1) as a floating point value is a problem in all programming languages. It is a limitation of how computer hardware stores numbers. The python documentation has a nice discussion: https://docs.python.org/3.7/tutorial/floatingpoint.html
This will not likely be an issue for us (although it could crop up) but it is a big deal in numeric computing.
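If exact equality of floats does become an issue, a common workaround is to ask whether two numbers are close rather than equal. The sketch below uses math.isclose() from the standard library; the tolerance of 1e-6 is an arbitrary choice for illustration.

```python
import math

total = 0.1 + 0.2
exactly_equal = (total == 0.3)             # exact comparison of two floats
nearly_equal = math.isclose(total, 0.3)    # comparison with a small tolerance
print('0.1 + 0.2 == 0.3?', exactly_equal)
print('isclose(0.1 + 0.2, 0.3)?', nearly_equal)

x = 1/3
y = 0.333333333   # the approximation of 1/3 used above
close_enough = math.isclose(x, y, abs_tol=1e-6)   # allow a difference up to 0.000001
print('x close to y?', close_enough)
```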
- Let's introduce a new function that is built into python: the len() function. This computes the length of an object. In the code cell below try print(len('hello world'))
print(len('hello world'))
11
- In the cell below, write some code (use an if statement) that prints out the longer string in all lower case letters and prints out the shorter string in all upper case letters. [Hint: the companion to lower() is upper()].
string1 = 'Terry'
string2 = 'CoLLege'
if len(string1) < len(string2): # true if string2 is longer than string1
    print(string1.lower())
    print(string2.upper())
else: # if above is false, string1 is at least as long as string2
    print(string1.upper())
    print(string2.lower())
terry
COLLEGE
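The solution above sends two strings of equal length to the else branch. One possible variation, which is not part of the original exercise, checks for a tie first by nesting one if statement inside another. The strings here are invented for the example.

```python
string1 = 'Terry'
string2 = 'Texas'   # hypothetical string with the same length as string1

if len(string1) == len(string2):
    result = 'tie'
    print('Both strings have', len(string1), 'characters.')
else:
    # this inner if only runs when the lengths differ
    if len(string1) < len(string2):
        result = string2.lower()
    else:
        result = string1.lower()
    print('The longer string is', result)
```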
3. Loops
The conditional statement allows us to selectively run parts of our program. Loops allow us to re-run parts of our code several times, perhaps making some changes each time the code is run. There are several types of loops. The for loop runs a block of code 'for' a fixed number of times.
Here is a basic example.
# loop three times and print out the value of 'i'
for i in range(3): # The counter variable 'i' can be named anything.
    print('i =', i)
i = 0
i = 1
i = 2
Important: Notice the 4-space indent again. In general, the colon tells us that the indented lines below 'belong' to the line of code with the colon.
Ranges
The function range() creates a sequence of whole numbers. With a single argument, it starts at zero, but it can do more. Examples:
- range(3) returns 0, 1, 2
- range(2,7) returns 2, 3, 4, 5, 6
- range(0, 10, 2) returns 0, 2, 4, 6, 8 [the third argument is the 'step' size]
Change the code above to try out these ranges.
# a range is a python type, like a float or a str
my_range = range(5)
print(type(my_range))
# what happens if I print the range?
print(my_range)
<class 'range'>
range(0, 5)
That last print out might not be what you expected. If you want to see the sequence, convert it to a list first.
print(list(my_range)) #Remember what list() does?
[0, 1, 2, 3, 4]
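Wrapping a range in list() is also a quick way to check what a range will produce before using it in a loop. The cell below is a small sketch along those lines; the negative step and the empty range go slightly beyond the examples listed above.

```python
evens = list(range(0, 10, 2))       # start at 0, stop before 10, step by 2
countdown = list(range(5, 0, -1))   # a negative step counts down; the stop value 0 is excluded
empty = list(range(3, 3))           # start equals stop, so there is nothing to return

print(evens)
print(countdown)
print(empty)
print(len(range(2, 7)))             # len() works on a range without converting it
```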
Looping over lists and strings
In languages like C, a for loop is usually written by stepping a counter through a range of numbers. Python gives us a very easy way to loop directly over many kinds of things. Here, we loop over a list.
var_names = ['GDP', 'POP', 'INVEST', 'EXPORTS']
# Here is a clunky, C-style way to do this
print('The old-school way:')
for i in range(4): # i = 0, 1, 2, 3
    print(var_names[i])
# The python way
print('\nThe python way:')
for var in var_names: # again, 'var' can be named anything
    print(var)
The old-school way:
GDP
POP
INVEST
EXPORTS

The python way:
GDP
POP
INVEST
EXPORTS
We can do the same kind of thing for a string.
lake = 'observatory drive'
for letter in lake:
    print(letter)
o b s e r v a t o r y d r i v e
Wow.
Ranges, lists, and strings are all 'iterable objects'. An iterable object (a type) is an object that knows how to return the 'next' element within it. When we iterate over a list variable, each time the for loop 'asks' for the next element in the list, the variable knows how to answer with the next element.
- Ranges iterate over whole numbers
- Lists iterate over the elements of the list
- Strings iterate over the characters
- Dicts iterate over the keys
- and more...
Iterators are used in places besides loops, too. We will see other uses as we go on. Powerful stuff.
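To make the idea of asking for the 'next' element concrete, the sketch below does by hand what a for loop does for us, using the built-in functions iter() and next(). It then loops over a dict to show that the loop variable receives the keys. The units dict is invented for this example.

```python
var_names = ['GDP', 'POP']

it = iter(var_names)   # ask the list for an iterator
first = next(it)       # ask for the next element
second = next(it)      # and the one after that
print(first, second)
# One more next(it) would raise StopIteration, which is how a for loop knows when to stop.

units = {'GDP': 'dollars', 'POP': 'people'}   # hypothetical dict
keys_seen = []
for key in units:      # looping over a dict gives us its keys
    keys_seen.append(key)
    print(key, '->', units[key])
```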
Practice: Loops
Take a few minutes and try the following. Feel free to chat with those around you if you get stuck.
Remember this example from earlier?
- We have 5 integer observations in our dataset: 1, 3, 8, 3, 9. Unfortunately, the data file ran all the observations together and we are left with the variable raw_data in the cell below.
- What type is raw_data?
- Turn raw_data into a list.
raw_data = '13839'
print(raw_data)
print(type(raw_data))
13839
<class 'str'>
raw_data = list(raw_data)
print(raw_data)
print(type(raw_data))
['1', '3', '8', '3', '9']
<class 'list'>
Is your data ready to be analyzed? Why not?
- In the cell below, convert your list to a list of integers.
You might try repeating statements like list_data[0]=int(list_data[0]). Put a loop to work!
for v in range(len(raw_data)):
    raw_data[v] = int(raw_data[v])
print(raw_data)
[1, 3, 8, 3, 9]
# another way using enumerate, which loops through the index and value simultaneously
for v, s in enumerate(raw_data):
    raw_data[v] = int(raw_data[v])
    print('index: ', v)
    print('value: ', s)
    print('')
print(raw_data)
index: 0 value: 1 index: 1 value: 3 index: 2 value: 8 index: 3 value: 3 index: 4 value: 9 [1, 3, 8, 3, 9]
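Python also has a compact idiom for this kind of conversion, called a list comprehension, which builds a new list by looping inside square brackets. We have not covered it in this notebook, so treat the cell below as a preview sketch. It loops over the original string directly, and the names clean_data and total are chosen for this example.

```python
raw_data = '13839'

# read as: make a list of int(ch), for each character ch in raw_data
clean_data = [int(ch) for ch in raw_data]
print(clean_data)

# now that the values are integers, we can do arithmetic with them
total = sum(clean_data)
print('The sum of the observations is', total)
```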
- Loop through the following list:
commands = ['go', 'go', 'go', 'stop', 'go', 'go']
If the command is 'go' print out the word 'Green'. If the command is 'stop' print out the word 'Red'.
commands = ['go', 'go', 'go', 'stop', 'go', 'go']
for c in commands:
    if c == 'go':
        print('Green')
    else:
        print('Red')
print('program complete!')
Green
Green
Green
Red
Green
Green
program complete!