r - Idiom for dropping a single column in a data.table -


I need to drop one column from a data.frame containing a few hundred columns.

With a data.frame, I'd use subset to do this conveniently:

> dat <- data.table( data.frame(x=runif(10),y=rep(letters[1:5],2),z=runif(10)),key='y' )
> subset(dat,select=c(-z))
            x y
 1: 0.1969049 a
 2: 0.7916696 a
 3: 0.9095970 b
 4: 0.3529506 b
 5: 0.4923602 c
 6: 0.5993034 c
 7: 0.1559861 d
 8: 0.9929333 d
 9: 0.3980169 e
10: 0.1921226 e

Obviously this still works, but it does not seem like a very data.table-like idiom. I could manually construct a list of the column names I want to keep, which seems a little more data.table-like:

> dat[,list(x,y)]
            x y
 1: 0.1969049 a
 2: 0.7916696 a
 3: 0.9095970 b
 4: 0.3529506 b
 5: 0.4923602 c
 6: 0.5993034 c
 7: 0.1559861 d
 8: 0.9929333 d
 9: 0.3980169 e
10: 0.1921226 e

But then I have to construct such a list, which is clunky.

Is subset the proper way to conveniently drop a column or two, or does it cause a performance hit? If not, what's the better way?

Edit

Benchmarks:

> dat <- data.table( data.frame(x=runif(10^7),y=rep(letters[1:10],10^6),z=runif(10^7)),key='y' )
> microbenchmark( subset(dat,select=c(-z)), dat[,list(x,y)] )
Unit: milliseconds
                         expr       min        lq    median        uq      max
1           dat[, list(x, y)] 102.62826 167.86793 170.72847 199.89789 792.0207
2 subset(dat, select = c(-z))  33.26356  52.55311  53.53934  55.00347 180.8740

But it may matter more for memory if subset copies the whole data.table.

If you want to remove the column permanently, use := NULL:

dat[, z := NULL]

If you have the columns to drop as a character string, wrap the variable in () to force it to be evaluated as a character string rather than taken as the literal column name:

todrop <- c('z')
dat[, (todrop) := NULL]
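The two forms above can be tried on a small table. This is only a minimal sketch: the table `dt` and its column names are made up for illustration, and it assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical toy table; the column names are illustrative only.
dt <- data.table(x = 1:3, y = letters[1:3], z = c(0.1, 0.2, 0.3))

# Drop a single column by its bare name. This modifies dt in place.
dt[, z := NULL]

# Drop columns whose names are stored in a character vector.
# The parentheses make data.table use the value of todrop
# instead of looking for a column literally called "todrop".
todrop <- c("x")
dt[, (todrop) := NULL]

names(dt)
```

Because := works by reference, no assignment back to `dt` is needed, and there is no way to get the dropped columns back afterwards short of rebuilding the table.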

If you want to limit the columns available in .SD, you can pass the .SDcols argument:

dat[, lapply(.SD, somefunction), .SDcols = setdiff(names(dat), 'z')]
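As a rough illustration of the .SDcols form, here is a sketch in which `sum` stands in for `somefunction` and the table `dt` is a made-up example (both are assumptions, not part of the original question):

```r
library(data.table)

# Hypothetical numeric table used only for this sketch.
dt <- data.table(x = c(1, 2, 3), y = c(10, 20, 30), z = c(100, 200, 300))

# Apply a function to every column except z.
# setdiff() builds the vector of column names to keep in .SD.
res <- dt[, lapply(.SD, sum), .SDcols = setdiff(names(dt), "z")]

res
```

The result has one row with the columns x and y only, and `dt` itself still contains z, since nothing was assigned by reference.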

However, data.table inspects the j argument and only gets the columns you use anyway. See FAQ 1.12:

When you write X[Y,sum(foo*bar)], data.table automatically inspects the j expression to see which columns it uses.

and it doesn't try to load all of the data for .SD (unless you have .SD within your call to j).


subset.data.table is processing the call and eventually evaluating dat[, c('x','y'), with=FALSE].

Using := NULL should be basically instantaneous; however, it does permanently delete the column.
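If the column should only be left out of a result rather than deleted, a non-destructive selection with with=FALSE is one option. The sketch below uses a made-up table `dt`; both calls return a new data.table and leave the original untouched.

```r
library(data.table)

# Hypothetical toy table; the column names are illustrative only.
dt <- data.table(x = 1:3, y = letters[1:3], z = c(0.1, 0.2, 0.3))

# Keep everything except z by naming the columns to keep.
kept1 <- dt[, setdiff(names(dt), "z"), with = FALSE]

# Same result by negating the column to drop.
kept2 <- dt[, !"z", with = FALSE]

names(kept1)
names(dt)   # z is still present in the original
```

This costs a copy of the kept columns, which is the trade-off against := NULL: the latter avoids the copy but cannot be undone.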

