Nonlinear Modeling¶

This tutorial was inspired by and adapted from the Neuromatch Academy tutorials [CC BY 4.0]. It uses a nonlinear hyperbolic model to assess social discounting.

Goals of this tutorial¶

Specifying a nonlinear model
Fitting data to a nonlinear model
Comparing models
Working with actual data

from scipy.optimize import minimize
from scipy import stats
import numpy as np, pandas as pd
import requests
import matplotlib.pyplot as plt
%matplotlib inline

What is a nonlinear model?¶

Recall the general linear model, in which a dependent variable (\(y\)) is expressed as a linear combination of independent variables (\(x_d\)), each multiplied by a weight or slope (\(\beta_d\)), plus some noise (\(\epsilon\)):

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... +\beta_d x_d + \epsilon \]

where \(\beta_0\) is the intercept and \(d\) is the number of features (it is also the dimensionality of our input).
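
As a quick illustration of this equation, the sketch below simulates data from a linear model with \(d = 2\) features and recovers the betas with ordinary least squares. The parameter values, the noise level, and the variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate n observations of d = 2 features (illustrative values only)
n = 200
X = rng.normal(size=(n, 2))
true_betas = np.array([1.0, 0.5, -2.0])   # [beta_0, beta_1, beta_2]
noise = rng.normal(scale=0.1, size=n)

# Design matrix with a leading column of ones for the intercept beta_0
design = np.column_stack([np.ones(n), X])
y = design @ true_betas + noise

# Ordinary least-squares estimate of the betas
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
print(np.round(beta_hat, 2))
```

Because the model is linear in the betas, the estimate has a closed-form solution and no starting values are needed. The nonlinear models below are fit with an iterative optimizer instead.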

Nonlinear modeling simply means that the relationship between \(y\) and \(x_d\) cannot be written as a linear combination like the one above. Some common examples of nonlinear models include

Sigmoid function:

\(y=\frac{1}{1 + \exp(\beta_0 + \beta_1x_1)}\)

Exponential function:

\(y = \beta_0 \exp(-\beta_1x_1)\)

Hyperbolic function:

\(y=\frac{\beta_0}{1 + \beta_1x_1}\)

# Let's plot some of these examples:
np.random.seed(2021)

b0 = 1
b1 = .04
b2 = 1.5
b3 = .0125
b4 = 2.67

x1 = np.random.normal(10, 20,
                      size=(100,1))

# noise-free values of each function evaluated at x1 (each has shape (100, 1))
y = {'Linear': b0 - b1*x1,
     'Sigmoid': 1 / (1 + np.exp(b2 + b4*x1)),
     'Exponential': 80*np.exp(-b1*x1),
     'Hyperbolic': 80/(1 + b3*x1)}

fig, axes = plt.subplots(ncols=4, figsize=(18, 3))

for (key, values), ax in zip(y.items(), axes):

    # scatter plot of each function's values against x1
    ax.scatter(x1, values)

    ax.set(title=key, xlabel='x')
    ax.set_ylabel('y', rotation=0)

    # hide the tick labels; only the shape of each curve matters here
    ax.tick_params(labelbottom=False, labelleft=False)

plt.show()

[Figure: four scatter plots of y against x, one each for the Linear, Sigmoid, Exponential, and Hyperbolic functions]
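
To make the hyperbolic example more concrete, here is a minimal sketch of fitting a nonlinear model: it simulates noisy data from a hyperbolic function and recovers the two parameters by minimizing the mean squared error with scipy.optimize.minimize. The helper names (hyperbolic, mse), the true parameter values, the noise level, and the starting guess are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2021)

# Simulated data from a hyperbolic function (illustrative parameter values)
x = np.linspace(0, 100, 50)
true_b0, true_b1 = 80.0, 0.05
y_obs = true_b0 / (1 + true_b1 * x) + rng.normal(scale=1.0, size=x.size)

def hyperbolic(params, x):
    """Hyperbolic function y = b0 / (1 + b1 * x)."""
    b0, b1 = params
    return b0 / (1 + b1 * x)

def mse(params, x, y):
    """Mean squared error between the observations and the model prediction."""
    return np.mean((y - hyperbolic(params, x)) ** 2)

# Minimize the MSE from a rough starting guess, keeping both parameters non-negative
fit = minimize(mse, x0=(70.0, 0.03), args=(x, y_obs),
               bounds=((0, 200), (0, 1)))

print(f'b0 = {fit.x[0]:.2f}, b1 = {fit.x[1]:.3f}, MSE = {fit.fun:.2f}')
```

Unlike the linear case, there is no closed-form solution here, so the result can depend on the starting guess and the bounds.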

Model comparison¶

Above, we can see that the hyperbolic model fits the data best, but typically the best-fitting model isn’t so obvious. To compare model fits, we can use measures such as \(R^2\) or the Bayesian Information Criterion (\(BIC\)). Among a finite set of candidate models, the one with the lowest \(BIC\) is preferred.

The \(BIC\) penalizes free parameters (see the penalty term \(p\ln(n)\)). For a least-squares fit with normally distributed errors, it can be written (up to an additive constant that is the same for every model fit to the same data) as:

\[ BIC = n\ln(MSE) + p\ln(n)\]

Here, \(n\) is the total number of observations in your sample (e.g., sample size), \(p\) is the number of parameters estimated by the model (we are estimating two parameters in our model: \(v_0\) and \(k\)), and \(MSE\) is the mean squared error of the model. Remember that a lower \(BIC\) value is better: a smaller \(MSE\) lowers the first term, while the term \(p\ln(n)\) penalizes the model for each free parameter.

def calculate_bic(sample_size, mse, num_params):
    # n*ln(MSE) rewards a better fit; num_params*ln(n) penalizes free parameters
    bic = sample_size * np.log(mse) + num_params * np.log(sample_size)
    return bic

# calculate the Bayesian Information Criterion (bic)
for key, output in results.items():
    bic = calculate_bic(len(v_mean), output.fun, 2)
    print(f'{key} (BIC): {bic:.3f}')

All three models have the same number of parameters and are fit to the same data, so the \(BIC\) ranks them in the same order as their \(MSE\). The hyperbolic model has the lowest \(MSE\), and therefore the lowest \(BIC\), of the models tested above.
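
The sketch below illustrates the two fit measures mentioned in this section. It assumes the common least-squares form of the \(BIC\) for normally distributed errors, \(n\ln(MSE) + p\ln(n)\), and the usual definition of \(R^2\); gaussian_bic and r_squared are hypothetical helper names, and the numbers are made up.

```python
import numpy as np

def gaussian_bic(n_obs, mse, n_params):
    """BIC for a least-squares fit with Gaussian errors (up to an additive constant)."""
    return n_obs * np.log(mse) + n_params * np.log(n_obs)

def r_squared(y, y_hat):
    """Proportion of variance in y explained by the predictions y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

# Same fit quality, different numbers of free parameters (made-up values)
n_obs, mse = 50, 4.0
bic_2 = gaussian_bic(n_obs, mse, n_params=2)
bic_3 = gaussian_bic(n_obs, mse, n_params=3)

# With equal MSE, each extra parameter raises the BIC by ln(n_obs)
print(bic_3 - bic_2, np.log(n_obs))

# With equal parameter counts, the model with the smaller MSE has the lower BIC
print(gaussian_bic(n_obs, 2.0, 2) < gaussian_bic(n_obs, 4.0, 2))
```

When every candidate model has the same number of parameters and is fit to the same data, the penalty term is identical across models, so the comparison reduces to comparing their MSEs.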

Fitting actual data to models¶

Now that we’ve seen two examples of simulating data and recovering model parameters, let’s try to fit actual data to these models.

We will use a subset of the data from Vekaria et al. (2017).

# First let's load in the data

# here, we are just going to download data from the web (no need to edit these lines, but try to figure out what they are doing)
url = 'https://raw.githubusercontent.com/shawnrhoads/gu-psyc-347/master/docs/static/data/Vekaria-et-al-2017_data.csv'
df = pd.read_csv(url, index_col='subject')
print(df.head())

          1   2   5  10  20  50  100
subject
102      85  85  85  85  85  65   85
106      85  85  -5   5   5   5    5
107      85  85  85  85  85   5    5
113      85  55  65  55  25  15    5
114      65  55  45  55  45  15   15

We can see that our data are formatted with participants as rows and amounts willing to forgo at each social distance as columns. To use our functions above, we will need to make sure our \(v\) data have shape (n_subjects, 7).

Let’s convert the pd.DataFrame to a np.array:

vekaria_data = df.values

print(vekaria_data.shape)

(25, 7)
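
To see why the (n_subjects, 7) shape matters, here is a minimal sketch of how an MSE objective for the hyperbolic model might broadcast the seven social distances against every subject's row. The names mse_hyperbolic_sketch, N_sketch, and v_sketch are hypothetical and exist only for this illustration; the tutorial's own mse_hyperbolic may be implemented differently, and the social distances are assumed from the column labels of the data.

```python
import numpy as np

# Assumed social distances, read off the column labels of the data frame
N_sketch = np.array([1, 2, 5, 10, 20, 50, 100])

# The first two rows printed by df.head(), used as a stand-in data matrix
v_sketch = np.array([[85, 85, 85, 85, 85, 65, 85],
                     [85, 85, -5,  5,  5,  5,  5]])

def mse_hyperbolic_sketch(params, N, v):
    """Pooled MSE of v0 / (1 + k*N) against every row of v (hypothetical helper)."""
    v0, k = params
    v_hat = v0 / (1 + k * N)           # predicted values, shape (7,)
    return np.mean((v - v_hat) ** 2)   # (n_subjects, 7) minus (7,) broadcasts row-wise

print(v_sketch.shape)
print(mse_hyperbolic_sketch((85, 0.0), N_sketch, v_sketch))
```

Because the prediction has one value per social distance, it lines up with the last axis of the data matrix. A matrix with subjects as columns would need to be transposed first.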

First, let’s fit all of the data together, with fixed intercepts and slopes, for both the hyperbolic model and the exponential model.

fit1 = minimize(mse_hyperbolic, # objective function
               (85, .05), # estimated starting points
               args=(N, df.iloc[:,0:7].values), # arguments
               bounds=((0,80),(0,1)),
               tol=1e-3)

# minimize MSE for exponential function using scipy.optimize.minimize
fit2 = minimize(mse_exponential, # objective function
               (85, .05), # estimated starting points
               args=(N, df.iloc[:,0:7].values), # arguments
               bounds=((0,80),(0,1)),
               tol=1e-3)

fig = plt.figure()

plt.scatter(N, np.mean(df.iloc[:,0:7].values, axis=0), label='Observed (mean)')
plt.plot(N, fit1.x[0] / (1 + fit1.x[1]*N), label='Hyperbolic')
plt.plot(N, fit2.x[0] * np.exp(-fit2.x[1]*N), label='Exponential') # use the exponential fit's own decay rate
plt.legend()

plt.show()

[Figure: observed mean amounts forgone at each social distance, with the fitted hyperbolic and exponential curves]

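Based on this plot, we can compare how well the two models explain the variance in the data. We can also quantify this using the \(BIC\).

# calculate the Bayesian Information Criterion (bic)
for label, output in zip(['hyperbolic', 'exponential'], [fit1, fit2]):
    bic = calculate_bic(len(vekaria_data), output.fun, 2)
    print(f'{label} (BIC): {bic:.3f}')

Both models have two parameters and are fit to the same data, so the comparison comes down to their \(MSE\)s. The hyperbolic model has the lower \(MSE\) (the exponential model's is roughly 5% higher), so it is preferred, although the two fits are much closer than in the simulated example.

Next, let’s fit the hyperbolic model separately for each participant.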

# initialize a DataFrame, with columns corresponding to params ['v0', k] and rows corresponding to subjects
res_vekaria = pd.DataFrame(columns=['v0', 'k'])

for subj_id, subj_v in zip(df.index, vekaria_data):
    
    # minimize MSE for hyperbolic function using scipy.optimize.minimize
    fit = minimize(mse_hyperbolic, # objective function
                   (85, .05), # estimated starting points
                   args=(N, subj_v), # arguments
                   bounds=((0,80),(0,1)),
                   tol=1e-3)
        
    res_vekaria.loc[subj_id, 'v0'] = fit.x[0]
    res_vekaria.loc[subj_id, 'k'] = fit.x[1]
    res_vekaria.loc[subj_id, 'MSE'] = fit.fun
    
    print(f'subject {subj_id}: v0 = {fit.x[0]:.2f}, k = {fit.x[1]:.3f}, MSE = {fit.fun:.2f}')

subject 102: v0 = 80.00, k = 0.000, MSE = 53.53
subject 106: v0 = 80.00, k = 0.301, MSE = 488.09
subject 107: v0 = 80.00, k = 0.024, MSE = 430.10
subject 113: v0 = 79.88, k = 0.074, MSE = 75.61
subject 114: v0 = 62.64, k = 0.033, MSE = 42.86
subject 116: v0 = 80.00, k = 0.000, MSE = 25.00
subject 119: v0 = 80.00, k = 0.021, MSE = 370.64
subject 120: v0 = 79.94, k = 0.087, MSE = 224.61
subject 121: v0 = 80.00, k = 0.000, MSE = 25.00
subject 122: v0 = 80.00, k = 0.003, MSE = 62.99
subject 123: v0 = 79.62, k = 0.110, MSE = 358.13
subject 124: v0 = 79.97, k = 0.062, MSE = 55.33
subject 125: v0 = 80.00, k = 0.019, MSE = 294.43
subject 126: v0 = 80.00, k = 0.006, MSE = 820.72
subject 127: v0 = 80.00, k = 0.030, MSE = 138.29
subject 128: v0 = 80.00, k = 0.000, MSE = 25.00
subject 132: v0 = 80.00, k = 0.024, MSE = 196.97
subject 135: v0 = 80.00, k = 0.056, MSE = 503.05
subject 136: v0 = 79.98, k = 0.089, MSE = 271.09
subject 137: v0 = 80.00, k = 0.021, MSE = 351.20
subject 138: v0 = 80.00, k = 0.024, MSE = 149.32
subject 139: v0 = 80.00, k = 0.220, MSE = 273.89
subject 141: v0 = 79.93, k = 0.083, MSE = 135.05
subject 143: v0 = 56.54, k = 0.036, MSE = 16.84
subject 147: v0 = 80.00, k = 0.361, MSE = 178.55

We can see that the model did not fit some participants very well. For some, this is because their “amounts willing to forgo” do not vary across social distances.

To account for this, let’s check which subjects these are.

for subj_id, subj_v in zip(df.index, vekaria_data):
    if all(x==subj_v[0] for x in subj_v):
        print(f'no variation for subject #{subj_id}, {subj_v}')

no variation for subject #116, [85 85 85 85 85 85 85]
no variation for subject #121, [85 85 85 85 85 85 85]
no variation for subject #128, [85 85 85 85 85 85 85]

Three participants sacrificed all of their resources for all social others. Let’s assign k=0 and v0=85 to these participants since there is no variation in their preferences. This is equivalent to a straight horizontal line (no discounting) at \(v=85\).

# initialize a DataFrame, with columns corresponding to params ['v0', k] and rows corresponding to subjects
hyp_vekaria = pd.DataFrame(columns=['v0', 'k'])

for subj_id, subj_v in zip(df.index, vekaria_data):
    
    if all(x==subj_v[0] for x in subj_v):
        if np.sum(subj_v)>=595:
            
            hyp_vekaria.loc[subj_id, 'v0'] = 85 # flat line at the maximum amount (no discounting)
            hyp_vekaria.loc[subj_id, 'k'] = 0
            hyp_vekaria.loc[subj_id, 'MSE'] = np.nan
            
            print(f'assigning k=0 for subject #{subj_id}, {subj_v}')
    else:
    
        # minimize MSE for hyperbolic function using scipy.optimize.minimize
        fit = minimize(mse_hyperbolic, # objective function
                       (85, .05), # estimated starting points
                       args=(N, subj_v), # arguments
                       bounds=((0,80),(0,1)),
                       tol=1e-3)

        hyp_vekaria.loc[subj_id, 'v0'] = fit.x[0]
        hyp_vekaria.loc[subj_id, 'k'] = fit.x[1]
        hyp_vekaria.loc[subj_id, 'MSE'] = fit.fun

res_vekaria = pd.concat([df, hyp_vekaria], axis=1)

assigning k=0 for subject #116, [85 85 85 85 85 85 85]
assigning k=0 for subject #121, [85 85 85 85 85 85 85]
assigning k=0 for subject #128, [85 85 85 85 85 85 85]

fig, axes = plt.subplots(ncols=2, figsize=(15,5), sharey=True)

axes[0].hist(res_vekaria['v0'], bins=20)
axes[0].set(ylabel="Number of Subjects", xlabel="v0")

axes[1].hist(res_vekaria['k'], bins=20)
axes[1].set(ylabel="Number of Subjects", xlabel="k")

plt.show()

[Figure: histograms of the fitted v0 (left) and k (right) values across subjects]

Yay! Now, we can use these data for subsequent analyses! Note how little variation there is in \(v_0\). Also note that \(k\) is not normally distributed, so we would need to conduct subsequent analyses using non-parametric approaches.
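
One way to check the distribution of \(k\) is sketched below with a Shapiro-Wilk test and the sample skewness from scipy.stats. The k values are copied, rounded to three decimals, from the per-subject output printed earlier; the choice of test is an assumption for illustration, not necessarily the check used for the original analysis.

```python
import numpy as np
from scipy import stats

# k estimates copied (rounded to three decimals) from the per-subject output above
k = np.array([0.000, 0.301, 0.024, 0.074, 0.033, 0.000, 0.021, 0.087, 0.000,
              0.003, 0.110, 0.062, 0.019, 0.006, 0.030, 0.000, 0.024, 0.056,
              0.089, 0.021, 0.024, 0.220, 0.083, 0.036, 0.361])

# Shapiro-Wilk test: a small p-value suggests the sample is unlikely to be normal
w_stat, p_value = stats.shapiro(k)
print(f'Shapiro-Wilk W = {w_stat:.3f}, p = {p_value:.4f}')

# Positive skewness indicates a long right tail; the median is a robust summary
print(f'skewness = {stats.skew(k):.2f}, median k = {np.median(k):.3f}')
```

For a skewed variable like this, rank-based statistics such as the median or a Spearman correlation are less sensitive to the few large values than the mean or a Pearson correlation.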
