R - Repeating sequential functions when creating multiple lists/matrices/dataframes within `lapply`


This question is an elaboration on a previous question I'd asked about repeating functions on sequentially-labeled dataframes.

In the past, I needed to make minor alterations to data.tables read in from a folder in R (e.g. changing dates, recoding).

Now, however, my goals are a bit more complex: I'd like to read in several text files from a folder, take a random sample of the character vectors, read the random sample into a corpus (using the package tm), and generate a new data.frame that has a list of words/phrases and their frequencies.

The code I've developed so far is as follows:

    library(tm)     # Corpus, tm_map, TermDocumentMatrix
    library(RWeka)  # NGramTokenizer, Weka_control

    # Finds words or phrases (n-grams of length 1 to 5)
    BigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 1, max = 5))

    # Lists the .txt files in the folder
    files <- list.files("~/path/", full.names = TRUE, pattern = "\\.txt$")

    out <- lapply(seq_along(files), function(x) {
      df <- scan(files[x], what = "", sep = "\n")      # read in one file, line by line
      df <- sample(df, size = 1500, replace = FALSE)   # take random sample of lines
      corpus <- Corpus(VectorSource(df))               # create corpus
      corpus <- tm_map(corpus, stripWhitespace)
      # recent tm versions need content_transformer() around base functions
      corpus <- tm_map(corpus, content_transformer(tolower))
      corpus <- tm_map(corpus, removeWords, stopwords("english"))
      # create term document matrix
      tdm <- TermDocumentMatrix(corpus, control = list(tokenize = BigramTokenizer))
      m <- as.matrix(tdm)
      v <- sort(rowSums(m), decreasing = TRUE)
      d <- data.frame(word = names(v), freq = v)       # new data frame of words & frequencies
    })

However, although the function works, I'm not sure how to access the data.frames `d` while discarding the rest. Does `out` contain all of the objects created in the `lapply`?

The `lapply` function returns a list containing the values returned by the specified function. In your example, the function returns the data frame assigned to `d`, so `out` is a list containing the `d` data frames, one per file. All of the other objects created by the function (such as `tdm`, `m`, and `v`) are discarded, which seems to be what you want.
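To see this behaviour without the tm pipeline, here is a minimal sketch in base R. The `texts` list and the word-counting body are made-up stand-ins for the files and the term-document matrix, not the code from the question.

```r
# Toy stand-in for the files: each element is a character vector of lines
texts <- list(c("a b a", "b c"), c("x y", "x x"))

out <- lapply(texts, function(lines) {
  words <- unlist(strsplit(lines, " "))        # intermediate object, local to the function
  v <- sort(table(words), decreasing = TRUE)   # intermediate object, local to the function
  # Only the value of the last expression is returned and kept in `out`
  data.frame(word = names(v), freq = as.vector(v), stringsAsFactors = FALSE)
})

length(out)       # 2: one element per input
class(out[[1]])   # "data.frame"
names(out[[1]])   # "word" "freq"
```

The local objects `words` and `v` exist only while each call runs, so nothing besides the returned data frames ends up in `out`.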

You can access the data frames in `out` by indexing them, as in `out[[1]]`, with `lapply`, as in `lapply(out, function(d) d$word)`, or by combining them with `do.call('rbind', out)`.
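The three access patterns can be tried on a small hand-built list. The data frames below are invented for illustration and only mimic the `word`/`freq` layout from the question.

```r
# Hypothetical stand-in for the list returned by lapply()
out <- list(
  data.frame(word = c("the cat", "cat"), freq = c(3, 2), stringsAsFactors = FALSE),
  data.frame(word = c("dog", "a dog"), freq = c(5, 1), stringsAsFactors = FALSE)
)

first    <- out[[1]]                           # a single data frame
words    <- lapply(out, function(d) d$word)    # list of character vectors, one per data frame
combined <- do.call("rbind", out)              # one data frame with all rows stacked

nrow(first)      # 2
length(words)    # 2
nrow(combined)   # 4
```

Note that `rbind` stacks the rows without recording which file each row came from, so if that matters you would probably want to add an identifying column inside the function before combining.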

