Python pandas DataFrame: filter columns using a list?


I have a large DataFrame: 100000 rows * 10000 columns.

Now I'm given a list of labels (call it list1) that do not exactly match the column labels in the DataFrame; they match only part of those labels. For example, a label in the DataFrame might be "string1,d111", while the corresponding label in list1 might be "d111".

So I want to find the corresponding columns using list1 and sum these columns. What is an efficient way to do this?

dataframe:

     string1,d111   string2,d222   string3,d333   ......   stringn,dnnn
1    ..             ..             ..                      ..
2
3
4
5
6
...

list1: d111, d333,...dxxx

You can select columns by pattern with DataFrame.filter and a regex (this session assumes DataFrame is imported from pandas and randn from numpy.random):

In [28]: df = DataFrame(randn(10,10), columns=['c_%s' % i for i in range(3)] + ['d_%s' % i for i in range(3)] + ['e_%s' % i for i in range(4)])

In [3]: df.filter(regex='d_|e_')
Out[3]:
        d_0       d_1       d_2       e_0       e_1       e_2       e_3
0 -0.022661 -0.504317  0.279227  0.286951 -0.126999 -1.658422  1.577863
1  0.501654  0.145550 -0.864171 -0.374261 -0.399360  1.217679  1.357648
2 -0.608580  1.138143  1.228663  0.427360  0.256808  0.105568 -0.037422
3 -0.993896 -0.581638 -0.937488  0.038593 -2.012554 -0.182407  0.689899
4  0.424005 -0.913518  0.405155 -1.111424 -0.180506  1.211730  0.118168
5  0.701127  0.644692 -0.188302 -0.561400  0.748692 -0.585822  1.578240
6  0.475958 -0.901369 -0.734969  1.090093  1.297208  1.140128  0.173941
7 -0.679514 -0.790529 -2.057733  0.420175  1.766671 -0.797129 -0.825583
8 -0.918645  0.916237  0.992001 -0.440573 -1.875960 -1.223502  0.084821
9  1.096687 -1.414057 -0.268211  0.253461 -0.175931  1.481261 -0.200600
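To apply the same idea to the labels in the question, the regex can be built from list1 instead of being typed by hand. The following is only a sketch under some assumptions: the small frame and its column names are made up for illustration, each list1 label is assumed to be the part of the column name after the last comma, and "sum these columns" is read both ways (a total per column, and the matched columns added together row by row).

```python
import re

import numpy as np
import pandas as pd

# Hypothetical small frame standing in for the 100000 x 10000 one.
df = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    columns=["string1,d111", "string2,d222", "string3,d333", "string4,d444"],
)
list1 = ["d111", "d333"]

# Option 1: build one regex from list1. re.escape guards any regex
# metacharacters in the labels; the leading comma and trailing $ assume
# the label is the whole part after the last comma.
pattern = ",(?:" + "|".join(re.escape(label) for label in list1) + ")$"
matched = df.filter(regex=pattern)

# Option 2: split each column name on its last comma and test the suffix
# for membership in list1, with no regex involved.
suffixes = df.columns.str.rsplit(",", n=1).str[-1]
matched_exact = df.loc[:, suffixes.isin(list1)]

column_sums = matched.sum()      # one total per matched column
row_sums = matched.sum(axis=1)   # matched columns added together, per row
```

With a very long list1, the second option may be the better fit, since it does a set-style membership test on the suffixes rather than trying a large regex alternation against every column name; it would be worth timing both on the real data.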
