R Code

This is a Jupyter Notebook with an R kernel.

titanic <- read.csv(url("https://raw.githubusercontent.com/shawnrhoads/executable-book-template/main/docs/data/titanic.csv"))
print(head(titanic[2:5]))
  survived pclass    sex age
1        0      3   male  22
2        1      1 female  38
3        1      3 female  26
4        1      1 female  35
5        0      3   male  35
6        0      3   male  NA
x <- c(1,2,3,4,6,7,8,9)
print(x)
[1] 1 2 3 4 6 7 8 9

Logistic Regression

This is a logistic regression predicting survival of passengers.

Independent variables:

  • age (\(x_1\))

  • sex (\(x_2\))

  • class (\(x_3\))

Dependent variable:

  • survived (\(y\))
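As a sketch of the model being fit, with \(x_1\), \(x_2\), \(x_3\) standing for age, sex, and class, the logistic regression can be written as below. The \(\beta\) symbols are notation assumed here for illustration and do not appear in the original notebook. Because sex and class are categorical, R codes them as indicator variables, so \(\beta_3 x_3\) is shorthand for one term per non-reference class level.

```latex
\[
\operatorname{logit}\bigl(P(y = 1)\bigr)
  = \log\frac{P(y = 1)}{1 - P(y = 1)}
  = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3
\]
```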

logit_m <- glm(survived ~ age + sex + class, data = titanic, family = binomial(link = "logit"))
summary(logit_m)
Call:
glm(formula = survived ~ age + sex + class, family = binomial(link = "logit"), 
    data = titanic)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.7303  -0.6780  -0.3953   0.6485   2.4657  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  3.777013   0.401123   9.416  < 2e-16 ***
age         -0.036985   0.007656  -4.831 1.36e-06 ***
sexmale     -2.522781   0.207391 -12.164  < 2e-16 ***
classSecond -1.309799   0.278066  -4.710 2.47e-06 ***
classThird  -2.580625   0.281442  -9.169  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 964.52  on 713  degrees of freedom
Residual deviance: 647.28  on 709  degrees of freedom
  (177 observations deleted due to missingness)
AIC: 657.28

Number of Fisher Scoring iterations: 5
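The summary notes that 177 observations were deleted due to missingness: by default, glm drops any row with a missing value in a model variable (the age column contains NA values, as the head() output shows). The sketch below illustrates that behaviour on a small made-up data frame. The `toy` data and the `fit` name are hypothetical and are not part of the Titanic file.

```r
# Hypothetical toy data (not the Titanic file): one passenger has a missing age
toy <- data.frame(
  survived = c(0, 1, 1, 0, 1, 0, 1, 0),
  age      = c(22, 38, 26, 35, NA, 54, 2, 27)
)

# glm silently drops incomplete rows before fitting
fit <- glm(survived ~ age, data = toy, family = binomial(link = "logit"))

n_used    <- nobs(fit)                  # rows actually used in the fit
n_dropped <- sum(!complete.cases(toy))  # rows with at least one NA
print(c(used = n_used, dropped = n_dropped))
```

Running the same two checks on `logit_m` and `titanic` (restricted to the model variables) should reproduce the counts reported in the summary.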

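Age, sex, and class were each significantly associated with survival (all p < 0.001). The log-odds of surviving decreased with age, were lower for male than for female passengers, and were lower for second- and third-class passengers than for first-class passengers.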
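Coefficients on the log-odds scale are easier to read as odds ratios. The sketch below copies the estimates from the summary output so that it runs on its own. With the fitted model in memory, `exp(coef(logit_m))` would give the same quantities. The example passenger (a 30-year-old woman in first class) is a hypothetical chosen for illustration, and the names `coefs`, `odds_ratios`, `eta`, and `p_survive` are not from the original notebook.

```r
# Estimates copied from the summary(logit_m) output
coefs <- c(
  "(Intercept)" = 3.777013,
  "age"         = -0.036985,
  "sexmale"     = -2.522781,
  "classSecond" = -1.309799,
  "classThird"  = -2.580625
)

# Odds ratios: multiplicative change in the odds of survival
# per one-unit increase (age) or relative to the reference level (sex, class)
odds_ratios <- exp(coefs[-1])
print(round(odds_ratios, 3))

# Hypothetical passenger: 30-year-old woman in first class.
# Female and first class are the reference levels, so only the
# intercept and the age term contribute to the linear predictor.
eta <- coefs[["(Intercept)"]] + coefs[["age"]] * 30
p_survive <- plogis(eta)  # inverse logit
print(round(p_survive, 3))
```

An odds ratio below 1 means lower odds of survival, which matches the negative signs of all four coefficients.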