How to Grade R Scripts using R
Using a computer to aid the marking of student assignments
TL;DR
This post explains how I grade an R assignment in Programming with Data, an elective course in the SMU Master of Professional Accounting program. The assignment is to write a function that returns the number of digits of an input number.
Best Practices for R Programming Assignment
The following is my preference:
- For a pure coding assignment, use an .R script rather than an RMarkdown file.
- The RMarkdown format is great for assignments that combine code with explanations, interpretations, or visualizations. It is hard to auto-grade, though.
- For a coding exercise, remind students not to change the names of any objects, such as data frames and functions. This makes auto-grading by computer possible.
- If you use RMarkdown for a pure coding assignment, don’t use the eval = FALSE setting, and always remind students not to use this setting in their code chunks.
- Tell students in advance that submissions in the wrong format may cause auto-grading to fail and hence earn zero marks.
Convert Rmarkdown to R Script
If the submissions are in RMarkdown format, you first need to convert them into .R scripts. This is easily done with the purl() function from the knitr package. Unfortunately, it converts only one file at a time (let me know if you know how to convert multiple files in one go).
Let’s launch the usual libraries first.
library(tidyverse)
library(knitr)
Depending on how you design your folder structure, you may need to set your working directory for easy file manipulation. For example, I stored all the students' RMarkdown files in one folder and planned to convert them all to .R scripts in a subfolder of it. You can set the working directory with the function setwd(), for example setwd("D:/RMarkdown_Files/R_Files/").
Now let’s convert all the RMarkdown files into R scripts. The first step is to extract all the file names using the list.files() function, and then convert each file into an .R script using the function purl(). Note that full.names = TRUE keeps the full path of each file so that it can be read later.
# list all Rmd files, specify the path relative to the working dir
# (pattern is a regular expression, so escape the dot and anchor the end)
filenames <- list.files("../", pattern = "\\.Rmd$", full.names = TRUE)
# convert Rmd to R
for (filename in filenames) {
  knitr::purl(filename, documentation = 0)
}
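As a variation on the loop above, here is a minimal sketch that writes each converted script into an explicit output folder, so it does not depend on the working directory. The helper name purl_all() and its arguments rmd_dir and out_dir are hypothetical; the output, documentation, and quiet arguments are passed to knitr::purl() as documented for knitr.

```r
library(knitr)

# Hypothetical helper: convert every .Rmd file in `rmd_dir` into an .R script
# with the same base name inside `out_dir`.
purl_all <- function(rmd_dir, out_dir) {
  rmd_files <- list.files(rmd_dir, pattern = "\\.Rmd$", full.names = TRUE)
  # swap the extension and point each output at out_dir
  out_files <- file.path(out_dir, sub("\\.Rmd$", ".R", basename(rmd_files)))
  for (k in seq_along(rmd_files)) {
    # documentation = 0 keeps only the code, without chunk headers or text
    knitr::purl(rmd_files[k], output = out_files[k],
                documentation = 0, quiet = TRUE)
  }
  invisible(out_files)
}
```

A call would look like purl_all("D:/RMarkdown_Files", "D:/RMarkdown_Files/R_Files"), assuming both folders already exist.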
Generate Random Numbers with Different Digits
The assignment is to write a function that returns the number of digits of a given integer. The procedure we use to evaluate the function is:
- generate random numbers with different number of digits
- apply the function with each random number generated above and check its accuracy
- the final marks will be the percentage of correct output from the above steps
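The three steps above can be sketched on a tiny example. Both digits_ref() (a stand-in reference solution) and percent_correct() are hypothetical names used only for illustration, and the sketch assumes the function under test runs without error.

```r
# Hypothetical reference solution: count the characters of the number
# printed in fixed (non-scientific) notation.
digits_ref <- function(x) {
  nchar(format(abs(x), scientific = FALSE, trim = TRUE))
}

# Apply a function f to each test value and return the percentage of
# answers that match the expected number of digits.
percent_correct <- function(f, test_values, expected) {
  answers <- vapply(test_values, f, numeric(1))
  100 * mean(answers == expected)
}

# Hand-picked test values with known numbers of digits
test_values <- c(0, 7, 42, 999, 1000, 9999999999)
expected    <- c(1, 1, 2, 3, 4, 10)

percent_correct(digits_ref, test_values, expected)
```

Running a known-good reference through the same scoring code is a cheap way to check the grader itself before applying it to student submissions.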
We will use the sample() function to generate random numbers. For more options of random number generation, read here.
The following code defines a function gen_random_test_num(max_n_digit), where the input max_n_digit is the maximum number of digits you want to test. In my case, I generated 10 numbers for each number of digits from 1 to 10. The output is therefore a 10 x 10 data frame of 100 random numbers, in which column j holds ten j-digit numbers. You may increase or decrease the max_n_digit argument.
# Define a function to generate 10 random numbers for each number of
# digits from 1 to max_n_digit, one column per number of digits
# Use the sample() function
gen_random_test_num <- function(max_n_digit) {
  for (i in seq_len(max_n_digit)) {
    if (i == 1) {
      # 1-digit numbers: 0 to 9
      test_num <- data.frame(Digit = sample((10 ^ (i - 1) - 1):(10 ^ i - 1), 10,
                                            replace = FALSE))
    } else {
      # i-digit numbers: 10^(i-1) to 10^i - 1
      num_temp <- data.frame(Digit = sample((10 ^ (i - 1)):(10 ^ i - 1), 10,
                                            replace = FALSE))
      test_num <- bind_cols(test_num, num_temp)
    }
  }
  return(test_num)
}
# generate the 10x10 data frame
df <- gen_random_test_num(10)
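An alternative sketch, in case sampling from very long ranges such as 10^9 to 10^10 - 1 is a concern: draw uniform random numbers with runif() and round them down. The name gen_test_num_alt() and the argument n_per_digit are hypothetical. Unlike sample() with replace = FALSE, this version can produce duplicate values within a column, which I assume is acceptable for testing.

```r
# Hypothetical alternative generator: one column per number of digits,
# n_per_digit random whole numbers in each column.
gen_test_num_alt <- function(max_n_digit, n_per_digit = 10) {
  cols <- lapply(seq_len(max_n_digit), function(i) {
    lower <- if (i == 1) 0 else 10 ^ (i - 1)
    upper <- 10 ^ i - 1
    # runif() draws from the open interval (lower, upper + 1), so
    # floor() yields whole numbers between lower and upper
    floor(runif(n_per_digit, min = lower, max = upper + 1))
  })
  # unique column names avoid the renaming message from bind_cols()
  names(cols) <- paste0("digits_", seq_len(max_n_digit))
  as.data.frame(cols)
}

set.seed(2021)  # fix the seed so every student is graded on the same numbers
df_alt <- gen_test_num_alt(10)
```

Fixing the seed is a design choice: it makes the marks reproducible if a student asks for a regrade.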
Test the Function from Students' Assignments
I tried to use lapply() but could not get it to work, so I use brute-force for and while loops. Let me know if you have a better idea.
First, read all the .R file names. As I set the working directory to the folder that contains all the .R files, I just use "." for the current directory. You may need to change this to the folder of your .R files.
filenames <- list.files(".", pattern = "\\.R$", full.names = FALSE)
The complication in the test procedure is that some students' functions may not work and will throw an error. Base R has the function try() for this. I am using tryLog() from the tryCatchLog package, which its author claims is better than try(). To understand more about how to handle errors, read here.
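For comparison, here is a minimal base R sketch of the same idea using try() and tryCatch(). The function broken_digits() is a hypothetical stand-in for a faulty submission.

```r
# A deliberately broken function standing in for a faulty submission
broken_digits <- function(x) stop("not implemented")

# try() returns an object of class "try-error" instead of stopping the script
result <- try(broken_digits(123), silent = TRUE)
inherits(result, "try-error")

# tryCatch() lets you substitute a fallback value when an error occurs
safe_result <- tryCatch(broken_digits(123), error = function(e) NA_integer_)
is.na(safe_result)
```

Either way, the grading script keeps running after a failed call, which is the property the loop below relies on.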
The drawback is that students will get zero marks if their submitted functions cannot be run by the computer. This may be harsh but it is the reality. Wrong code will never work in practice.
If the submitted function can be run but with wrong output, we will assign marks according to the percentage of correct output using the 100 random numbers. I think this is very fair.
As the students' file names are in the format Student Name.R, I use the str_sub(filename, 1, -3) function from the stringr package to extract the students' names.
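If you prefer not to count characters, a base R sketch of the same step uses tools::file_path_sans_ext(), which drops the extension whatever its length. The file names below are made up for illustration.

```r
# Hypothetical file names in the "Student Name.R" format
example_files <- c("Alice Tan.R", "Bob Lee.R")

# basename() strips any folder path; file_path_sans_ext() strips ".R"
student_names <- tools::file_path_sans_ext(basename(example_files))
student_names
```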
The output data frame marks will contain two columns: name and marks.
Do remember to remove the function from the environment using rm() before the loop moves on to a new student.
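library(tryCatchLog)
results <- list()
for (filename in filenames) {
  # a script that fails to load leaves digits() undefined,
  # so every test below scores 0 for that student
  tryLog(source(filename))
  # 10 x 10 matrix of results: 1 = correct, 0 = wrong or error
  test <- matrix(0, nrow = 10, ncol = 10)
  i <- 1
  while (i <= 10) {
    j <- 1
    while (j <= 10) {
      # column j of df holds j-digit numbers, so the expected answer is j
      tryLog(test[i, j] <- as.numeric(isTRUE(digits(df[i, j]) == j)))
      j <- j + 1
    }
    i <- i + 1
  }
  results[[filename]] <- data.frame(name = str_sub(filename, 1, -3),
                                    marks = sum(test))
  # remove the function so it cannot leak into the next student's test
  tryLog(rm(digits))
}
# combine the per-student rows into one data frame
marks <- bind_rows(results)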
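One possible way to avoid both the nested counters and the rm() call is to load each script into its own environment. The sketch below is an assumption-laden alternative, not the method used for the actual grading: grade_file() and grade_all() are hypothetical names, and it uses base R tryCatch() rather than tryLog().

```r
# Hypothetical helper: grade one script against a test data frame in which
# column j holds j-digit numbers. Returns the count of correct answers.
grade_file <- function(filename, test_df) {
  env <- new.env()
  # a script that fails to load simply leaves `env` without digits()
  tryCatch(sys.source(filename, envir = env), error = function(e) NULL)
  if (!is.function(env$digits)) return(0)
  correct <- 0
  for (j in seq_along(test_df)) {
    for (x in test_df[[j]]) {
      # errors, NA results, and wrong answers all count as incorrect
      ok <- tryCatch(isTRUE(env$digits(x) == j), error = function(e) FALSE)
      correct <- correct + ok
    }
  }
  correct
}

# Hypothetical helper: grade every file and return one row per student
grade_all <- function(filenames, test_df) {
  data.frame(
    name  = tools::file_path_sans_ext(basename(filenames)),
    marks = vapply(filenames, grade_file, numeric(1), test_df = test_df,
                   USE.NAMES = FALSE)
  )
}
```

Because each student's digits() lives in a fresh environment, a function from one submission cannot be picked up by accident when grading the next, so no cleanup step is needed.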
Conclusion and Credit
I shared how I auto-grade an R coding assignment in Programming with Data, an elective course in the SMU Master of Professional Accounting program. I hope you find this useful.
For my students: your marks are based on similar Python code developed by your tutor Lei Yu (thanks for his excellent work) from the SMU Master of Science in Accounting (Data & Analytics) program. The above R code also borrows some ideas from his Python code.
One caveat of the above code is that I did not consider the situation where the submitted RMarkdown file has the eval = FALSE setting. The purl() function automatically converts everything in an R code chunk with the eval = FALSE setting into comments in the .R script. This means the computer will not read the function in this special case, and hence the student will get zero. This is not an issue with the Python code, as it reads the R code literally from the RMarkdown files. I should have done the same, but it takes time.
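For readers who want to try the literal-reading approach in R, here is a rough sketch. It is my own illustration, not a port of the Python code: extract_r_chunks() is a hypothetical name, and it assumes chunk fences start at the beginning of a line and that every R chunk header begins with {r.

```r
# Hypothetical helper: return the lines of code inside all R chunks of an
# RMarkdown file, ignoring chunk options such as eval = FALSE.
extract_r_chunks <- function(rmd_file) {
  lines <- readLines(rmd_file, warn = FALSE)
  fence <- strrep("`", 3)  # the three-backtick chunk delimiter
  open_pat  <- paste0("^", fence, "[[:space:]]*\\{[rR]")
  close_pat <- paste0("^", fence, "[[:space:]]*$")
  in_chunk <- FALSE
  code <- character(0)
  for (line in lines) {
    if (!in_chunk && grepl(open_pat, line)) {
      in_chunk <- TRUE            # chunk header: start collecting
    } else if (in_chunk && grepl(close_pat, line)) {
      in_chunk <- FALSE           # closing fence: stop collecting
    } else if (in_chunk) {
      code <- c(code, line)       # a line of code inside a chunk
    }
  }
  code
}
```

The returned lines can be written to an .R file with writeLines(), or evaluated directly with eval(parse(text = code), envir = some_env), where some_env is an environment you create for that student.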
Last updated on 24 October, 2021