 # Prosocial RL Exercises¶

We will now fit your behavioral data to the models developed during class (as well as a few others) and then compare the models to assess which model best explains the data. This exercise is a bit different from previous exercises. I will only provide you with minimal code/text to guide you through, but you should be able to complete it all on your own using the course resources, previous tutorials/exercises, previous papers, and previous lectures!

Once you complete this exercise, you will be well on your way to becoming a pro computational modeler!

You can download the data from GitHub or by using the following code.

# import relevant modules
from scipy.optimize import minimize # finding optimal params in models
from scipy import stats             # statistical tools
import os                           # operating system tools
import numpy as np                  # matrix/array functions
import matplotlib.pyplot as plt     # plotting
%matplotlib inline

np.random.seed(2021)                # set seed for reproducibility

# this function will load the data into memory
'''
input: n/a
output: dictionary of DataFrames containing the data
'''
urls = [f'https://raw.githubusercontent.com/shawnrhoads/gu-psyc-347/master/docs/static/data/{x:02}_psyc-347-prosocial-learning.csv' for x in range(1,12)]

subject_data = {}
for index, file in enumerate(urls):
subject_data[index] = df_filtered = df[df['outcomeDescr'] != 'practice'][['block',
'trial_num',
'true_accuracy',
'outcome',
'outcomeDescr',
'cumulativePts_self',
'cumulativePts_social']]

return subject_data

# let's load in our data using the function above


## Relevant Data¶

### 4 conditions/blocks¶

• self, win

• self, avoid loss

• social, win

• social, avoid loss

### Each block contains 24 trials¶

• 2 stimuli per trial (counterbalanced wrt side)

• outcomes associated with each stimulus is probabilistic (75%, 25%)

### Data output key¶

• block: (play for self to win); (play for self to avoid loss); (play for next student to win); (play for next student to avoid loss)

• trial_num: order in which trials are displayed within a block (0-23)

• true_accuracy: 1 if selected stimulus with highest probability of winning or avoid losing

• outcome: outcome for trial (+1, 0, -1); blank if missed

• outcomeDescr: text description of outcome (practice, social win, social avoid win, social loss, social avoid loss, self win, self avoid win, self loss, self avoid loss)

• cumulativePts_self: running total of points for self

• cumulativePts_social: running total of points for social

# display data from first subject
display(subject_data)

block trial_num true_accuracy outcome outcomeDescr cumulativePts_self cumulativePts_social
subject
1.0 social_win 0 0.0 0.0 social avoid win 1000 1000
1.0 social_win 1 1.0 1.0 social win 1000 1100
1.0 social_win 2 1.0 1.0 social win 1000 1200
1.0 social_win 3 1.0 1.0 social win 1000 1300
1.0 social_win 4 1.0 0.0 social avoid win 1000 1300
... ... ... ... ... ... ... ...
1.0 self_avoidloss 19 1.0 0.0 self avoid loss 2100 1300
1.0 self_avoidloss 20 1.0 0.0 self avoid loss 2100 1300
1.0 self_avoidloss 21 1.0 0.0 self avoid loss 2100 1300
1.0 self_avoidloss 22 1.0 -1.0 self lose 2000 1300
1.0 self_avoidloss 23 0.0 -1.0 self lose 1900 1300

96 rows × 7 columns

# we will need this function to grab indices of specific elements
def get_index_positions(list_of_elems, element):
''' Returns the indexes of all occurrences of given element in
the list- listOfElements '''
index_pos_list = []
index_pos = 0
while True:
try:
# Search for item in list from indexPos to the end of list
index_pos = list_of_elems.index(element, index_pos)
# Add the index position in list
index_pos_list.append(index_pos)
index_pos += 1
except ValueError as e:
break

return index_pos_list


## Specifying plausible models¶

Below we will begin to specify some plausible models to characterize behavior during the task. Typically, each of these models can be written as a class objects (see Python Classes) with methods (e.g., model.fit()), but we will use a more-procedural approach for clarity during these exercises.

This first model is a simple Rescorla-Wagner Model that ignores information that is socially-relevant (e.g., deciding self or other) or outcome-relevant (e.g., positive or negative outcomes).

# Completed Model 1
def Simple_1a1t(params, choices, outcomes, block, plot=False):
'''
Inputs:
- params: list of 2 guesses (float) for each parameter (alpha, theta)
- choices: list of 96 choices (int) on each trial (0, 1)
- outcomes: list of 96 outcomes (int) on each trial (-1, 0, 1)
- block: list of 96 conditions (string) on each trial (self_win, self_avoidloss, social_win, social_avoidloss)
Outputs:
- negLL: negative loglikelihood computed
from the choice probabilities (float)
'''
alpha, theta = params

if np.isnan(alpha) or np.isnan(theta): # check inputs
return np.inf
else:
blocks = list(block)

# extracts list of four strings corresponding to conditions
unique_conditions = list(set(block))

# init choice probs
choiceProb = np.zeros((len(blocks)), dtype = float)
Q_out = {}

count = 0
for condition in unique_conditions:

T_temp = blocks.count(condition)
Q = [0.5, 0.5] # Q at trial 0
Q_stored = np.zeros((2, T_temp), dtype = float)

cur_indices = get_index_positions(blocks, condition)
c = np.array(choices)[cur_indices]
r = np.array(outcomes)[cur_indices]

# check if self vs social
#    Note: this bulky if-statement
#    is intentional to make it clear
#    what we are doing
if 'self' in condition:
# check if win vs avoid loss
if 'win' in condition:
r = r
elif 'avoidloss' in condition:
r = np.array([x+1 for x in r])
elif 'social' in condition:
# check if win vs avoid loss
if 'win' in condition:
r = r
elif 'avoidloss' in condition:
r = np.array([x+1 for x in r])

# loop through trials within condition
for t in range(T_temp):

if np.isnan(c[t]):
#don't update
choiceProb[count] = np.nan
Q_stored[:,t] = Q

else:
# compute choice probabilities for k=2
# use the softmax rule
ev = np.exp(theta*np.array(Q))
sum_ev = np.sum(ev)
p = ev / sum_ev

# compute choice probability for actual choice
choiceProb[count] = p[int(c[t])]

# update values
delta = r[t] - Q[int(c[t])]
Q[int(c[t])] = Q[int(c[t])] + alpha * delta

# store Q_t+1
Q_stored[:,t] = Q

count += 1

Q_out[condition] = Q_stored

negLL = -np.nansum(np.log(choiceProb))

if plot: #plot mean across 4 blocks
Q0 = np.mean(np.stack((Q_out['self_win'], Q_out['social_win'], Q_out['self_avoidloss'], Q_out['social_avoidloss']),axis=0),axis=0)
Q1 = np.mean(np.stack((Q_out['self_win'], Q_out['social_win'], Q_out['self_avoidloss'], Q_out['social_avoidloss']),axis=0),axis=0)

plt.plot(range(T_temp),Q0)
plt.plot(range(T_temp),Q1)
plt.title('Mean Q across conditions')
plt.xlabel('trial')
plt.ylabel('Q')
plt.show()

return negLL


Let’s test this function using a guess for alpha and theta for one subject.

# use one subject (feel free to change)
behavior = subject_data

# guess for alpha and theta
params = [.12, 2.11]

# specify subject data
choices = behavior.true_accuracy
outcomes = behavior.outcome
block = behavior.block

# compute negative log likelihood given
# data and guessed parameters
subj_negll = Simple_1a1t(params, choices, outcomes, block, plot=True) Now, let’s put this into a for-loop to minimize the negative log likelihood across all subjects.

#initialize dataframe to store results
df1 = pd.DataFrame(index=range(11), columns=['alpha','theta','NLL'])

# initialize list of algorithms
# (use one during class, but you can try a few others on your own)
algorithms = ['L-BFGS-B'] #['Powell','TNC','SLSQP','trust-constr']

# loop through subjects
for index, behavior in enumerate(subject_data.values()):
c, o, block = behavior.true_accuracy, behavior.outcome, behavior.block
bounds = ((0,1),(0,12))

# gradient descent to minimize neg LL
res_nll = np.inf # set initial neg LL to be inf

# guess several different starting points for alpha
for alpha_guess in np.linspace(0,1,3):
for theta_guess in np.linspace(1,12,4):

# guesses for alpha, theta will change on each loop
init_guess = (alpha_guess, theta_guess)

for algorithm in algorithms:

# minimize neg LL
result = minimize(Simple_1a1t,
init_guess,
(c, o, block),
bounds=bounds,
method=algorithm)

# if current negLL is smaller than the last negLL,
# then store current data
if result.fun < res_nll and result.success:
res_nll = result.fun
param_fits = result.x

# also, compute BIC
BIC = -2 * res_nll + len(init_guess) * np.log(len(c))

#store in dataframe
df1.at[index, 'alpha'] = param_fits
df1.at[index, 'theta'] = param_fits
df1.at[index, 'NLL'] = res_nll
df1.at[index, 'BIC'] = BIC

# print/plot Q values for subject
print(fr'subject {index+1:02}: alpha={param_fits:.2f}, theta={param_fits:.2f}; negLL={res_nll:.2f}; BIC={BIC:.2f}')
nll = Simple_1a1t(param_fits, c, o, block, plot=True)

subject 01: alpha=0.91, theta=3.13; negLL=41.24; BIC=-73.34 subject 02: alpha=0.03, theta=12.00; negLL=51.42; BIC=-93.71 subject 03: alpha=0.06, theta=12.00; negLL=41.13; BIC=-73.13 subject 04: alpha=0.00, theta=1.00; negLL=66.54; BIC=-123.96 subject 05: alpha=0.23, theta=7.15; negLL=35.25; BIC=-61.37 subject 06: alpha=0.44, theta=5.81; negLL=28.37; BIC=-47.61 subject 07: alpha=0.13, theta=12.00; negLL=19.16; BIC=-29.18 subject 08: alpha=0.16, theta=5.38; negLL=48.77; BIC=-88.42 subject 09: alpha=0.31, theta=7.58; negLL=28.84; BIC=-48.54 subject 10: alpha=0.31, theta=11.00; negLL=18.06; BIC=-27.00 subject 11: alpha=0.25, theta=12.00; negLL=12.57; BIC=-16.00 At a quick glance at the negative log likelihood and BIC values, we can see that this model performs better for some subjects relative to others. Some subjects even perform so poorly that we can’t estimate parameters for them. We can see this all at once in the DataFrame we created to store the outputs.

display(df1)

alpha theta NLL BIC
0 0.910384 3.125691 41.235016 -73.341337
1 0.034217 12.0 51.419038 -93.709379
2 0.063137 12.0 41.131208 -73.133719
3 0.0 1.0 66.542129 -123.955562
4 0.234915 7.149005 35.249369 -61.370042
5 0.438106 5.812434 28.370059 -47.611421
6 0.129681 12.0 19.155706 -29.182715
7 0.163722 5.379729 48.774263 -88.419830
8 0.30879 7.577298 28.83555 -48.542404
9 0.305454 11.001908 18.063611 -26.998525
10 0.248679 12.0 12.56593 -16.003163

Let’s try a different model…

Our task is very similar to the task used by Lockwood et al. (2016). Think about how the authors modeled human social behavior in their task!

One variation included learning rate parameters (e.g., alpha) that indexed sensitivity to prediction errors for both (1) self (e.g., alpha_self) and (2) other (e.g., alpha_other). How about we try that for our data?

def Social_2a1t(params, choices, outcomes, block, plot=False):
# 1 alpha_self + 1 alpha_other + 1 theta
'''
Inputs:
- params: list of 3 guesses (float) for each parameter (alpha_self, alpha_other, theta)
- choices: list of 96 choices (int) on each trial (0, 1)
- outcomes: list of 96 outcomes (int) on each trial (-1, 0, 1)
- block: list of 96 conditions (string) on each trial (self_win, self_avoidloss, social_win, social_avoidloss)
Outputs:
- negLL: negative loglikelihood computed
from the choice probabilities (float)
'''
alpha_self, alpha_other, theta = params

# check inputs
if np.isnan(alpha_self) or np.isnan(alpha_other) or np.isnan(theta):
return np.inf
else:
blocks = list(block)

# extracts list of four strings corresponding to conditions
unique_conditions = list(set(block))

# init choice probs
choiceProb = np.zeros((len(blocks)), dtype = float)
Q_out = {}

count = 0
for condition in unique_conditions:

T_temp = blocks.count(condition)
Q = [0.5, 0.5] # Q at trial 0
Q_stored = np.zeros((2, T_temp), dtype = float)

cur_indices = get_index_positions(blocks, condition)
c = np.array(choices)[cur_indices]
r = np.array(outcomes)[cur_indices]

# check if self vs social
#    Note: this bulky if-statement
#    is intentional to make it clear
#    what we are doing
if 'self' in condition:
# check if win vs avoid loss
if 'win' in condition:
r = r
elif 'avoidloss' in condition:
r = np.array([x+1 for x in r])

elif 'social' in condition:
# check if win vs avoid loss
if 'win' in condition:
r = r
elif 'avoidloss' in condition:
r = np.array([x+1 for x in r])

# loop through trials within condition
for t in range(T_temp):

if np.isnan(c[t]):
# don't update if nan
choiceProb[count] = np.nan
Q_stored[:,t] = Q

else:
# compute choice probabilities for k=2
# use the softmax rule
ev = np.exp(theta*np.array(Q))
sum_ev = np.sum(ev)
p = ev / sum_ev

# compute choice probability for actual choice
choiceProb[count] = p[int(c[t])]

# update values
delta = r[t] - Q[int(c[t])]

if 'self' in condition:
Q[int(c[t])] = Q[int(c[t])] + alpha_self * delta
elif 'social' in condition:
Q[int(c[t])] = Q[int(c[t])] + alpha_other * delta

# store Q_t+1
Q_stored[:,t] = Q

count += 1

Q_out[condition] = Q_stored

negLL = -np.nansum(np.log(choiceProb))

if plot: #plot mean across 4 blocks
Q0 = np.mean(np.stack((Q_out['self_win'], Q_out['social_win'], Q_out['self_avoidloss'], Q_out['social_avoidloss']),axis=0),axis=0)
Q1 = np.mean(np.stack((Q_out['self_win'], Q_out['social_win'], Q_out['self_avoidloss'], Q_out['social_avoidloss']),axis=0),axis=0)

plt.plot(range(T_temp),Q0)
plt.plot(range(T_temp),Q1)
plt.title('Mean Q across conditions')
plt.xlabel('trial')
plt.ylabel('Q')
plt.show()

return negLL


Now, let’s put this into a for-loop to minimize the negative log likelihood across all subjects.

#initialize dataframe to store results
df2 = pd.DataFrame(index=range(11), columns=['alpha_self','alpha_other','theta','NLL'])

# initialize list of algorithms
# (use one during class, but you can try a few others on your own)
algorithms = ['L-BFGS-B'] #['Powell','TNC','SLSQP','trust-constr']

# loop through subjects
for index, behavior in enumerate(subject_data.values()):
c, o, block = behavior.true_accuracy, behavior.outcome, behavior.block
bounds = ((0,1),(0,1),(0,12))

# gradient descent to minimize neg LL
res_nll = np.inf # set initial neg LL to be inf

# guess several different starting points for alpha
for alpha_self_guess in np.linspace(0,1,3):
for alpha_other_guess in np.linspace(0,1,3):
for theta_guess in np.linspace(1,12,4):

# guesses for alpha, theta will change on each loop
init_guess = (alpha_self_guess, alpha_other_guess, theta_guess)

for algorithm in algorithms:

# minimize neg LL
result = minimize(Social_2a1t,
init_guess,
(c, o, block),
bounds=bounds,
method=algorithm)

# if current negLL is smaller than the last negLL,
# then store current data
if result.fun < res_nll and result.success:
res_nll = result.fun
param_fits = result.x

# also, compute BIC
BIC = -2 * res_nll + len(init_guess) * np.log(len(c))

#store in dataframe
df2.at[index, 'alpha_self'] = param_fits
df2.at[index, 'alpha_other'] = param_fits
df2.at[index, 'theta'] = param_fits
df2.at[index, 'NLL'] = res_nll
df2.at[index, 'BIC'] = BIC

# print/plot Q values for subject
print(fr'subject {index+1:02}: alpha_self={param_fits:.2f}, alpha_other={param_fits:.2f}, theta={param_fits:.2f}; negLL={res_nll:.2f}; BIC={BIC:.2f}')
nll = Social_2a1t(param_fits, c, o, block, plot=True)

subject 01: alpha_self=0.76, alpha_other=1.00, theta=3.30; negLL=40.68; BIC=-67.67 subject 02: alpha_self=0.08, alpha_other=0.02, theta=12.00; negLL=46.42; BIC=-79.14 subject 03: alpha_self=0.11, alpha_other=0.04, theta=12.00; negLL=38.84; BIC=-64.00 subject 04: alpha_self=0.00, alpha_other=0.00, theta=12.00; negLL=66.41; BIC=-119.12 subject 05: alpha_self=0.26, alpha_other=0.18, theta=7.69; negLL=35.02; BIC=-56.34 subject 06: alpha_self=0.30, alpha_other=0.65, theta=6.24; negLL=27.35; BIC=-41.00 subject 07: alpha_self=0.14, alpha_other=0.12, theta=12.00; negLL=19.08; BIC=-24.46 subject 08: alpha_self=0.16, alpha_other=0.69, theta=3.99; negLL=47.11; BIC=-80.52 subject 09: alpha_self=0.42, alpha_other=0.24, theta=7.91; negLL=28.21; BIC=-42.73 subject 10: alpha_self=0.29, alpha_other=0.80, theta=12.00; negLL=17.71; BIC=-21.73 subject 11: alpha_self=0.33, alpha_other=0.20, theta=12.00; negLL=12.02; BIC=-10.35 display(df2)

alpha_self alpha_other theta NLL BIC
0 0.760689 1.0 3.295711 40.681883 -67.670721
1 0.081382 0.015783 12.0 46.417372 -79.141698
2 0.107811 0.037979 12.0 38.844495 -63.995945
3 0.0 0.003832 12.0 66.406572 -119.120098
4 0.260908 0.175293 7.693504 35.018243 -56.343442
5 0.296518 0.651074 6.241567 27.348106 -41.003168
6 0.139729 0.11758 12.0 19.077929 -24.462814
7 0.160885 0.688449 3.992985 47.106963 -80.520881
8 0.424338 0.242731 7.908341 28.212399 -42.731752
9 0.285959 0.801353 12.0 17.710507 -21.727969
10 0.326473 0.201148 12.0 12.022076 -10.351106

What do we think? At a glance, do we think this model was better at characterizing the behavior?

Now, we will split into our groups (3-4 students per group) to brainstorm other plausible models and then collaboratively write code to fit the data to these new models!

Here are some hints/ideas:

• Think Lockwood et al. (2016)! The authors had more than one learning rate. What else did they include in their best-fitting model? Why?

• What else is different about this task? Do participants always “play to win” like they do in Lockwood et al. (2016)? What other conditions are there?

# Model 3(12 out of 30 points)
def Valence_2a1t(params, choices, outcomes, block):
# 1 alpha_positive + 1 alpha_negative + 1 theta

return negLL

# fit model here and store output in df3

# Model 4 (12 out of 30 points)
def Social_2a2t(params, choices, outcomes, block, plot=False):
# 1 alpha + 1 beta_self + 1 beta_other + 1 theta

return negLL

# fit model here and store output in df4

# Bonus Model 5 (not required for exercise)
def SocialValence_4a1t(params, choices, outcomes, block, plot=False):
# 1 alpha_self_pos + 1 alpha_self_neg +
# 1 alpha_other_pos + 1 alpha_other_neg + 1 theta

return negLL

# fit model here and store output in df5


Now, we should have run 3-4 models that we can compare to assess which model might explain the data best. Remember we store the parameter fits in Pandas DataFrames: df1, df2, df3, df4, and maybe df5 (if you completed the bonus).

Let’s use this information to compute Bayesian Information Criterion ($$BIC$$) across subjects for each model. One way in which we can do so is by summing the negative log likelihoods across subjects for each model and using that value to compute a $$BIC$$ score. Then, we can plot each $$BIC$$ to assess model fits. Please do this below.

Hints

• to compute the sum of the negative likelihoods for the first model, you would use the command: np.nansum(df1['NLL'].values).

• You can refer to this page to review Bayesian Information Criterion ($$BIC$$), but note that tutorial used $$MSE$$ to compute the $$BIC$$. Here, we will use the negative log likelihood (like in this tutorial). Therefore, we won’t need to multiply by a $$-1$$! Also, remember to factor in the number of free parameters in each of your models!

• Remember we will have 4-5 $$BIC$$ values by the end of this tutorial.

# Compute BIC for each Model and Plot (6 out of 30 points)
def calculate_bic(n, nll, num_params):

return bic

def plot_bic(list_of_dfs, list_of_num_params, df_names):
# 96 trials across 11 subjects
all_num_trials = 96 * 11

#init list to input bic values
bic_list = []

all_mods = ...
all_num_params = ...

# get integrated BIC for each model
for mod, num_params in zip(all_mods, all_num_params):
bic_temp = calculate_bic(n=..., nll=..., num_params=...)
bic_list.append(bic_temp)

#plot these values:
plt.bar(range(0,len(all_mods)), bic_list)
plt.xlabel("Model")
plt.ylabel("BIC Score")
plt.title("BIC Scores for Different Models")

plt.xticks(range(0,len(all_mods)), df_names)
plt.show()

list_of_dfs = [df1, df2, df3, df4, df5]
df_names = ['df1', 'df2', 'df3', 'df4', 'df5']
list_of_num_params = [2, 3, 3, 4, 5]
plot_bic(list_of_dfs, list_of_num_params, df_names)


Which of these models fit the data best? Think about whether there be others that could do better!

# Finally, please convert this cell to a Markdown cell.

# Create a Heading named "Notebook Feedback," then delete this text