Using paste to read and write multiple files in R
This post is a quick tip on how to use the paste function to read and write multiple files. First, let’s create some data.
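# expand.grid() pairs each of the 20 Trt values with each of the 20 Sex values,
# giving 400 rows; Trt and Sex are stored as factors.
dataset = data.frame(expand.grid(Trt=c(rep("Low",10),rep("High",10)),
                                 Sex=c(rep("Male",10),rep("Female",10))),
                     Response=rnorm(400))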
The next step is not necessary, but makes the subsequent code more readable.
trt = levels(dataset$Trt)
sex = levels(dataset$Sex)
The following example is contrived, since you would rarely want to split your data this way, but (hopefully) it clearly illustrates the general idea of using paste to create dynamic file names when writing files.
for (i in seq_along(trt)){
  for (j in seq_along(sex)){
    # Write one file per Trt/Sex combination, e.g. "LowMale.csv"
    write.csv(subset(dataset, Trt==trt[i] & Sex==sex[j]),
              paste(trt[i],sex[j],".csv",sep=""),
              row.names=FALSE)
  }
}
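As a small aside on the file names built in the loop above, here is a minimal sketch of how paste behaves on its own. The names combos and file_names, and the "output" folder, are made up for illustration and are not part of the original code.

```r
trt <- c("Low", "High")
sex <- c("Male", "Female")

# paste() with sep = "" and paste0() build the same string.
paste(trt[1], sex[1], ".csv", sep = "")   # "LowMale.csv"
paste0(trt[1], sex[1], ".csv")            # "LowMale.csv"

# paste0() is vectorized, so all four names can be built in one call.
combos <- expand.grid(Trt = trt, Sex = sex, stringsAsFactors = FALSE)
file_names <- paste0(combos$Trt, combos$Sex, ".csv")
print(file_names)

# file.path() prepends a directory; "output" is a hypothetical folder name.
print(file.path("output", file_names))
```

Because paste0 is vectorized, the vector of names could also drive a single loop instead of two nested ones.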
The result of the nested loop is four CSV files: HighFemale.csv, HighMale.csv, LowFemale.csv, and LowMale.csv. We can use the same basic idea to read those same four files into a single data frame. The key is to initialize an empty data frame and then append, via rbind, the data from each of the four files.
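dataset2 = data.frame()   # start empty, then grow one file at a time
for (i in seq_along(trt)){
  for (j in seq_along(sex)){
    # Rebuild the same file name used when writing, then append its rows
    dataset2 = rbind(dataset2,
                     read.csv(paste(trt[i],sex[j],".csv",sep="")))
  }
}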
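Appending with rbind inside a loop copies the growing data frame on every pass. That is harmless for four files, but with many files one possible alternative is to read everything into a list and bind once at the end. The sketch below is self-contained: it writes four small stand-in files to a temporary directory first, and the names out_dir, combos, files, and pieces are assumptions made for the example.

```r
# Write four stand-in files to a temporary directory so the sketch runs anywhere.
out_dir <- tempfile("paste_demo_")
dir.create(out_dir)

combos <- expand.grid(Trt = c("Low", "High"), Sex = c("Male", "Female"),
                      stringsAsFactors = FALSE)
files <- file.path(out_dir, paste0(combos$Trt, combos$Sex, ".csv"))

for (k in seq_len(nrow(combos))) {
  write.csv(data.frame(Trt = combos$Trt[k], Sex = combos$Sex[k],
                       Response = rnorm(100)),
            files[k], row.names = FALSE)
}

# Read every file into a list, then bind the pieces together in one step.
pieces <- lapply(files, read.csv)
dataset2 <- do.call(rbind, pieces)
print(dim(dataset2))
```

The file names are still built with paste0, so the idea is the same as in the loop; only the way the pieces are combined changes.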
I found this approach useful when I used a supercomputer to conduct many, many runs of an agent-based model. My jobs were queued more quickly on the supercomputer if they were small, so I broke my simulation experiments into many small jobs. This produced many files that I needed to combine into one data frame for analysis in R.
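When the output files come from many small jobs, the names may be easier to discover than to rebuild. Here is a rough sketch of that variation using list.files. The run_N.csv naming scheme, the step and value columns, and the source_file column are all hypothetical and only stand in for whatever the real jobs produce.

```r
# Simulate the output of five small jobs in a temporary directory.
run_dir <- tempfile("runs_")
dir.create(run_dir)

for (run in 1:5) {
  write.csv(data.frame(step = 1:3, value = rnorm(3)),
            file.path(run_dir, paste0("run_", run, ".csv")),
            row.names = FALSE)
}

# Find every file that matches the (hypothetical) naming pattern.
run_files <- list.files(run_dir, pattern = "^run_[0-9]+\\.csv$",
                        full.names = TRUE)

# Read each file, tag its rows with the file they came from, and combine.
runs <- do.call(rbind, lapply(run_files, function(f) {
  piece <- read.csv(f)
  piece$source_file <- basename(f)
  piece
}))
print(dim(runs))
```

Keeping the file name as a column makes it possible to trace each row back to the job that produced it.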