Dear all WISE R Club members. :smile: Here comes the solution to assignment2. Because assignment 2 is more difficult than assignment 1, I advise you to read this solution manual more carefully to make sure you understand every single code.
I define the factorial function in this way.
Fac <- function(n) if (n==1|n==0) 1 else n*Fac(n-1)
Now test it!
c(Fac(0),Fac(3),Fac(5))
## [1] 1 6 120
To start with, I read the grades.txt
dataset to create data frame.
grades <- read.table("grades.txt", header=TRUE,stringsAsFactors=FALSE)
To plot the boxplot, we should reshape the data frame grades
at first.
library(reshape2)
gradex <- melt(grades,id="name")
We can see what is in the new data frame gradex
.
head(gradex)
## name variable value
## 1 丁聪华 Math 63
## 2 孙蝶妃 Math 73
## 3 许娇翔 Math 78
## 4 崔子希 Math 74
## 5 樊瑶芳 Math 81
## 6 阮恭琴 Math 88
str(gradex)
## 'data.frame': 900 obs. of 3 variables:
## $ name : chr "丁聪华" "孙蝶妃" "许娇翔" "崔子希" ...
## $ variable: Factor w/ 3 levels "Math","Chinese",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ value : int 63 73 78 74 81 88 84 96 78 96 ...
Then we plot the boxplot just like Figure 6.12.
boxplot(value ~ variable, data=gradex, col="lightblue",
main="The boxplot of the test scores",
xlab="Subject", ylab="Test Scores")
To calculate Sum
and get the 80th, 60th, 40th and 20th percentile of it,
Sum <- rowSums(grades[,2:4])
x <- quantile(Sum, c(0.8,0.6,0.4,0.2))
To grade the students,
grades <- cbind(grades, Sum)
grades$grade[Sum >= x[1]] <-"A+"
grades$grade[Sum < x[1] & Sum >= x[2]] <-"A"
grades$grade[Sum < x[2] & Sum >= x[3]] <-"B"
grades$grade[Sum < x[3] & Sum >= x[4]] <-"C"
grades$grade[Sum < x[4]] <-"D"
str(grades)
## 'data.frame': 300 obs. of 6 variables:
## $ name : chr "丁聪华" "孙蝶妃" "许娇翔" "崔子希" ...
## $ Math : int 63 73 78 74 81 88 84 96 78 96 ...
## $ Chinese: int 80 81 88 76 84 83 85 89 82 76 ...
## $ English: int 77 74 82 86 86 92 92 98 81 91 ...
## $ Sum : num 220 228 248 236 251 263 261 283 241 263 ...
## $ grade : chr "D" "D" "B" "D" ...
Grade <- function(nm){
names <- grades$name
grades$grade[nm==names]
}
## see a few examples
c(Grade("王书棋"), Grade("姚仕能"), Grade("夏 良"))
## [1] "C" "B" "D"
Now I draw the pie chart of the grade.
y <- table(grades$grade)
lb <- paste(names(y), " ", y, sep="")
pie(y, labels=lb, main= "Pie chart of the grade \n (with sample size)")
box()
Now calculate the mean and standard error for each subject in each group.
options(digits=2)
dstats <- function(x)(c(mean=mean(x), sd=sd(x)))
aggregate(grades[,2:4], by=list(grade=grades$grade), dstats)
## grade Math.mean Math.sd Chinese.mean Chinese.sd English.mean English.sd
## 1 A 83.9 6.7 82.2 5.6 87.3 5.6
## 2 A+ 90.1 6.4 83.3 4.3 92.4 5.3
## 3 B 80.6 5.7 80.9 5.7 84.8 4.7
## 4 C 76.5 5.9 82.1 4.7 81.5 5.1
## 5 D 70.7 6.0 81.6 5.4 74.7 6.5
Regress Math
on Chinese
and English
.
fit<-lm(Math~Chinese+English,data=grades)
summary(fit)
##
## Call:
## lm(formula = Math ~ Chinese + English, data = grades)
##
## Residuals:
## Min 1Q Median 3Q Max
## -20.59 -5.53 -0.32 5.50 20.95
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 80.0629 9.6124 8.33 3.0e-15 ***
## Chinese -0.3751 0.0919 -4.08 5.7e-05 ***
## English 0.3719 0.0596 6.24 1.5e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.2 on 297 degrees of freedom
## Multiple R-squared: 0.176, Adjusted R-squared: 0.171
## F-statistic: 31.8 on 2 and 297 DF, p-value: 3.14e-13
par(mfrow=c(2,2))
plot(fit)
Now I plot a histogram of the studentized residuals below and superimposes a normal curve, kernel density curve and rug plot.
z <- rstudent(fit)
hist(z, breaks=13, freq=FALSE,ylim=c(0,0.43), xlab="Studentized Residual",main="Distribution of Errors")
rug(jitter(z), col="brown")
curve(dnorm(x, mean=mean(z), sd=sd(z)), add=TRUE, col="blue", lwd=2)
lines(density(z)$x, density(z)$y, col="red", lwd=2, lty=2)
legend("topright",legend = c( "Normal Curve", "Kernel Density Curve"),lty=1:2, col=c("blue","red"), cex=1)
box()
WISE R Club project is proudly maintained by XiaojunSun.