
Python Exercises

For our first exercise, we will apply some of the skills we learned in Python. Please use all of the resources available to you, including Google, StackOverflow, and the Canvas Discussion Board.

For these exercises, we will be using data from O’Connell, et al. (2021). Reduced social distancing during the COVID-19 pandemic is associated with antisocial behaviors in an online United States sample. PLoS ONE.

This study assessed whether social distancing behaviors early in the COVID-19 pandemic were associated with self-reported antisocial behavior. To measure one index of social distancing behavior, participants were presented with an image of an adult silhouette surrounded by a rectangular border. They were asked to click a point in the image that represented how far away they typically stood from other individuals.

Here is a heatmap showing how far participants reported standing from other individuals in the past week, with dark maroon indicating a higher density of responses obtained from a kernel density estimation. The mean response coordinate, +, represents a distance of approximately 98 inches (8.2 feet; 2.5 m).

Figure 1
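
As a quick arithmetic check on the reported mean distance, the unit conversions can be sketched as below. The constant and variable names are illustrative and do not come from the original notebook or paper.

```python
# Check the unit conversions for the reported mean distance (98 inches)
INCHES_PER_FOOT = 12
METERS_PER_INCH = 0.0254  # exact by definition of the inch

mean_distance_in = 98
mean_distance_ft = mean_distance_in / INCHES_PER_FOOT
mean_distance_m = mean_distance_in * METERS_PER_INCH

print(f'{mean_distance_in} in = {mean_distance_ft:.1f} ft = {mean_distance_m:.1f} m')
# prints: 98 in = 8.2 ft = 2.5 m
```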

Table of Contents

  1. Basics (importing modules, basic syntax, types of variables)

  2. If statements, For loops

  3. Functions

Key

  • # [INSERT CODE BELOW]: indicates where you should insert your own code; feel free to replace it with a comment of your own

  • ...: indicates a location where you should insert your own code

  • raise NotImplementedError("Student exercise: *"): delete this line once you have added your code

Basics

# We usually start a notebook with a brief overview 
# in the first cell using Markdown (see above)

# Then, it is common practice to load all the 
# packages/modules that we will use in our first 
# code cell. Please import pandas and numpy 
# below so we can load our data:

# [INSERT CODE BELOW]
raise NotImplementedError("Student exercise: import pandas and numpy, then delete this line")

import ... as ...
import ... as ...
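
For reference, the general `import <module> as <alias>` pattern can be sketched with two standard-library modules. The modules and aliases here are chosen purely for illustration; they are not the ones requested in the exercise.

```python
# Illustration of the `import <module> as <alias>` pattern
# using standard-library modules (hypothetical aliases)
import math as m
import statistics as st

values = [2, 4, 6]

# Functions are then called through the alias rather than the full module name
print(m.sqrt(16))       # 4.0
print(st.mean(values))  # 4
```

An alias is just a shorter name bound to the same module, so `m.sqrt` and `math.sqrt` would refer to the same function.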
# Now, we will load our DataFrame into
# a variable called `df` and view the first few rows:

# here, we are just going to read data from the web 
# as a Pandas DataFrame
url = 'https://raw.githubusercontent.com/shawnrhoads/gu-psyc-347/master/docs/static/data/OConnell_COVID_MTurk_noPII_post_peerreview.csv'
df = pd.read_csv(url)

# [INSERT CODE BELOW]
raise NotImplementedError("Student exercise: display the contents of the DataFrame")

display(...)
# Great, now that we have our data, let's store 
# two variables of interest in lists:
    # - silhouette_dist_X_min81 : 
    #      distance from others in pixels (x-axis)
    # - STAB_total_min32 : 
    #      antisocial behavior measured using 
    #      the Subtypes of Antisocial 
    #      Behavior Questionnaire (STAB)

# No need to add any code here. Just execute this cell!

distance = list(df['silhouette_dist_X_min81'].values)
antisociality = list(df['STAB_total_min32'].values)
# Let's verify that both of these variables are 
# indeed stored in memory as lists using the 
# `print()` and `type()` functions

# [INSERT CODE BELOW]
raise NotImplementedError("Student exercise: print the type of each variable, then delete this line")

print(type(...))
print(type(...))
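
A minimal sketch of how `type()` reports what kind of object a variable holds, using made-up variables rather than the exercise data:

```python
# Hypothetical variables for illustration only
toy_scores = [1.5, 2.0, 3.25]
toy_label = 'scores'

print(type(toy_scores))  # <class 'list'>
print(type(toy_label))   # <class 'str'>

# isinstance() answers the yes/no question directly
print(isinstance(toy_scores, list))  # True
```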
# Let's also explore the data a bit more. 
# Remember, both of these lists should 
# contain the same number of observations. 
# Let's store the number of elements of each 
# list and print them out. 

# [INSERT CODE BELOW]
raise NotImplementedError("Student exercise: store number of elements of each list, then delete this line")

length_of_dist_data = ...
length_of_stab_data = ...

print(f'list containing distance data contains {length_of_dist_data} observations')
print(f'list containing STAB data contains {length_of_stab_data} observations')
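
The built-in `len()` returns the number of elements in a list. A small sketch on toy lists (the names and values are invented for illustration):

```python
# Hypothetical paired measurements: one entry per participant
toy_heights = [150, 162, 171]
toy_weights = [50, 61, 70]

n_heights = len(toy_heights)
n_weights = len(toy_weights)

print(f'heights: {n_heights} observations; weights: {n_weights} observations')
# prints: heights: 3 observations; weights: 3 observations
```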

If statements, For loops

# Rather than printing out the lengths of each 
# list above and qualitatively assessing whether 
# they contain the same number of observations, 
# we could have just used an if-statement. 
# Let's do that now. If they are the same length, 
# then print one line with the number of observations; 
# if they are not, then print two lines with the 
# number of observations for each list.

# [INSERT CODE BELOW]
raise NotImplementedError("Student exercise: use if-statement to check if lists contain the same number of elements, then delete this line")

length_of_dist_data = ...
length_of_stab_data = ...

if ...:
    print(...)
else:
    print(...)
    print(...)
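
As a reminder of the syntax, an if/else statement needs a colon after the condition and after `else`, with the body of each branch indented. A sketch on two toy lists of deliberately different lengths (all names here are hypothetical):

```python
# Two toy lists with different numbers of elements
list_a = [1, 2, 3]
list_b = [4, 5]

# `==` compares the two lengths; note the colon ending each branch header
if len(list_a) == len(list_b):
    message = f'both lists contain {len(list_a)} observations'
else:
    message = f'lengths differ: {len(list_a)} vs. {len(list_b)} observations'

print(message)  # lengths differ: 3 vs. 2 observations
```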
# We might be missing data for some of the 
# observations in these lists (i.e., a 
# participant did not complete this question, 
# so the element in the list is a `nan` 
# or not a number). Let's write a for-loop 
# to loop through the observations in `distance` 
# and then check whether each observation is a nan. 
# If the observation is a nan, then print 
# out the location of that observation in the list

# Hint: this will require you to put an 
# if-statement within the for-loop

# [INSERT CODE BELOW]
raise NotImplementedError("Student exercise: loop through elements in list and check if any are nans, then delete this line")

for index, ... in enumerate(...):
    if ...:
        print(f'observation #{index} is nan')
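
One detail worth knowing before writing the check: a nan is not equal to anything, including itself, so comparing with `==` will not find it. The sketch below uses the standard-library function `math.isnan()` on a toy list; NumPy's `np.isnan()` behaves similarly for numeric values. The list and variable names are made up for illustration.

```python
import math

# Hypothetical data with missing values at positions 1 and 3
toy_data = [0.5, float('nan'), 1.2, float('nan')]

# nan never compares equal, even to another nan
print(float('nan') == float('nan'))  # False

# enumerate() yields (position, value) pairs
nan_positions = []
for index, value in enumerate(toy_data):
    if math.isnan(value):
        nan_positions.append(index)

print(nan_positions)  # [1, 3]
```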
# Okay (spoiler alert), `distance` contains nans.
# Let's take the same for-loop code from above 
# and add a "counter" to count how 
# many nans we actually have

# [INSERT CODE BELOW]
raise NotImplementedError("Student exercise: loop through elements in list, check if any are nans, and update counter for each nan, then delete this line")

counter = 0 #initialize counter with 0
for index, ... in enumerate(...):
    if ...:
        counter = ... #update counter if nan
        print(f'observation #{index} is nan')

# Let's print out the number of nans. Note that this final line is outside of the for-loop
print(f'the list contains {counter} nans') 
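
The counter pattern in general form: start at zero before the loop, add one inside the loop whenever the condition holds, and read the total after the loop ends. This sketch uses a toy list and `math.isnan()`; it is one possible approach, not the only one.

```python
import math

# Hypothetical data containing three nans
toy_data = [float('nan'), 3.0, float('nan'), 7.5, float('nan')]

counter = 0  # initialize before the loop
for value in toy_data:
    if math.isnan(value):
        counter += 1  # shorthand for counter = counter + 1

# An equivalent one-liner: True counts as 1 when summed
counter_alt = sum(math.isnan(value) for value in toy_data)

print(counter, counter_alt)  # 3 3
```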

Functions

# We can also make our code above "general-purpose"
# so that we can apply it to any list. In this cell,
# write a function called `check_for_nans()`
# that takes two inputs [a list and a string
# ("the list name")] and returns two outputs [a boolean
# indicating whether the list contains any nans (i.e., if
# the counter is greater than 0) and the
# number of nans in the list (zero if no nans)].

# Note that there are many ways to accomplish
# this task, so feel free to experiment
# with different approaches.

# Fill out this function, then try to
# execute the next cell to see if it works.

def check_for_nans(list_input, list_name='list'):
    """Check whether a list contains any nans

    Args:
        list_input (list): a list that contains the observations
        list_name (string): a string containing the name of the variable
    
    Returns:
        boolean: True if the list contains nans, False if not
        int: number of nans found in list, zero if no nans
    """

    ############################
    # [INSERT CODE BELOW]
    raise NotImplementedError("Student exercise: check if any inputted list contains nans, then delete this line")
    ############################

    # loop through elements/observations in list
    counter = 0 #initialize counter with 0
    for index, ... in enumerate(...):
        if ...:
            counter = ... #update counter if nan
    
    # check if list contains any nans
    contains_nans = ...
    
    # report whether the list contains nans
    if contains_nans:
        print(f'{list_name} contains {counter} nans')
    else:
        print(f'{list_name} contains no nans')

    return contains_nans, counter
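
To illustrate the mechanics of a function with a default argument and two return values, here is a sketch of a different, hypothetical helper that counts negative numbers instead of nans. The function name and data are invented for this example.

```python
def check_for_negatives(list_input, list_name='list'):
    """Hypothetical helper: report whether a list contains negative numbers."""
    count = 0
    for value in list_input:
        if value < 0:
            count += 1

    # A comparison evaluates to a boolean (True or False)
    contains_negatives = count > 0
    print(f'{list_name} contains {count} negative values')

    # Returning two values separated by a comma returns them as a tuple
    return contains_negatives, count

# The two returned values can be unpacked into two variables
has_neg, n_neg = check_for_negatives([3, -1, 4, -1, 5], list_name='toy_list')
print(has_neg, n_neg)  # True 2
```

Because `list_name` has a default value, calling `check_for_negatives([1, 2])` without a name also works and labels the output with 'list'.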
# Run this cell to check your work. 
# This cell should output the line:
# "CONGRATS! LOOKS LIKE YOU DID IT!"
# No need to edit, just execute cell!

antisociality_contains_nans, antisociality_nan_count = check_for_nans(antisociality, 
                                                                      list_name='antisociality')
distance_contains_nans, distance_nan_count = check_for_nans(distance, 
                                                            list_name='distance')

# This is a check to see if it works; 
# bonus point if you can summarize what we do here!
##############
new_list = [[1, np.nan, 2, 3, np.nan, 4, 5, 6, np.nan,7, 8, 9, np.nan],           # 4 
            [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],                                      # 0 
            [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 10]] # 8

list_of_booleans = []
list_of_counts = []
for index, item in enumerate(new_list):
    nans, counts = check_for_nans(item, list_name=f'list{index}')
    list_of_booleans.append(nans)
    list_of_counts.append(counts)

if (list_of_booleans==[True,False,True]) and (list_of_counts==[4,0,8]):
    print("CONGRATS! LOOKS LIKE YOU DID IT!")
##############
# Please convert this cell to a Markdown cell. 
# Create a Heading named "Notebook Feedback," then provide 1-2 sentences about your experience with this Jupyter Notebook (e.g., Did you enjoy the exercises? Were they too easy/difficult? Would you have liked to see anything different? Were you able to apply some skills we learned during class? Is anything still confusing?). Finally, please rate your experience from (0) "did not enjoy at all" to (10) "enjoyed a great deal." Only your instructor will see these responses. 

Woohoo! You finished your first Jupyter Notebook exercise!

If you are working on Google Colab, go to File > Download > Download .ipynb to download your work. Then, save the file as “Lastname_Exercise01.ipynb” and submit it on Canvas.

Solutions

See the solutions at the following links: