R Code
This is a Jupyter Notebook with an R kernel.
titanic <- read.csv(url("https://raw.githubusercontent.com/shawnrhoads/executable-book-template/main/docs/data/titanic.csv"))
print(head(titanic[2:5]))
  survived pclass    sex age
1        0      3   male  22
2        1      1 female  38
3        1      3 female  26
4        1      1 female  35
5        0      3   male  35
6        0      3   male  NA
x <- c(1,2,3,4,6,7,8,9)
print(x)
[1] 1 2 3 4 6 7 8 9
Logistic Regression
This section fits a logistic regression predicting whether each Titanic passenger survived.
Independent variables:
age (\(x_1\))
sex (\(x_2\))
class (\(x_3\))
Dependent variable:
survived (\(y\))
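Written out, the model fitted in the next cell can be sketched as follows. The \(\beta\) symbols are notation introduced here for illustration, and the sketch assumes R's default dummy coding, so that \(x_2\) is 1 for male passengers and class enters as two indicators (second and third class), with female and first class as the reference levels:

\[
\log\frac{P(y = 1)}{1 - P(y = 1)} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{3,\text{Second}}\, x_{3,\text{Second}} + \beta_{3,\text{Third}}\, x_{3,\text{Third}}
\]

Each coefficient is therefore a change in the log-odds of survival, not a change in the probability itself.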
logit_m <- glm(survived ~ age + sex + class, data = titanic, family = binomial(link = "logit"))
summary(logit_m)
Call:
glm(formula = survived ~ age + sex + class, family = binomial(link = "logit"),
data = titanic)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.7303  -0.6780  -0.3953   0.6485   2.4657
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.777013   0.401123   9.416  < 2e-16 ***
age         -0.036985   0.007656  -4.831 1.36e-06 ***
sexmale     -2.522781   0.207391 -12.164  < 2e-16 ***
classSecond -1.309799   0.278066  -4.710 2.47e-06 ***
classThird  -2.580625   0.281442  -9.169  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 964.52 on 713 degrees of freedom
Residual deviance: 647.28 on 709 degrees of freedom
(177 observations deleted due to missingness)
AIC: 657.28
Number of Fisher Scoring iterations: 5
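Age, sex, and class were all significantly associated with the probability of surviving (all p < .001). Holding the other predictors constant, the log-odds of survival decreased with age, were lower for male than for female passengers, and were lower for second- and third-class passengers than for first-class passengers.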
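Because the estimates are on the log-odds scale, exponentiating them can make them easier to read. The following is a minimal sketch rather than part of the original analysis: the coefficients are copied by hand from the summary output so that the snippet runs without downloading the data, and the 30-year-old first-class female passenger is a hypothetical example.

```r
# Coefficient estimates copied from the summary(logit_m) output
# (hard-coded here so the snippet is self-contained).
coefs <- c(
  "(Intercept)" = 3.777013,
  age           = -0.036985,
  sexmale       = -2.522781,
  classSecond   = -1.309799,
  classThird    = -2.580625
)

# Exponentiating a log-odds coefficient gives an odds ratio;
# values below 1 mean lower odds of survival.
odds_ratios <- exp(coefs[-1])
print(round(odds_ratios, 3))

# Predicted survival probability for a hypothetical 30-year-old
# first-class female passenger (all indicator variables are 0).
log_odds <- coefs[["(Intercept)"]] + coefs[["age"]] * 30
p_survive <- plogis(log_odds)  # plogis() is the inverse logit
print(round(p_survive, 3))
```

With the fitted model still in the session, `exp(coef(logit_m))` should give the same odds ratios without retyping the estimates.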