python - Matrix multiplication in pandas


I have numeric data stored in two DataFrames, x and y. The inner product from NumPy works, but the dot product from pandas does not.

    In [63]: x.shape
    Out[63]: (1062, 36)

    In [64]: y.shape
    Out[64]: (36, 36)

    In [65]: np.inner(x, y).shape
    Out[65]: (1062L, 36L)

    In [66]: x.dot(y)
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-66-76c015be254b> in <module>()
    ----> 1 x.dot(y)

    c:\programs\winpython-64bit-2.7.3.3\python-2.7.3.amd64\lib\site-packages\pandas\core\frame.pyc in dot(self, other)
        888             if (len(common) > len(self.columns) or
        889                     len(common) > len(other.index)):
    --> 890                 raise ValueError('matrices not aligned')
        891
        892             left = self.reindex(columns=common, copy=False)

    ValueError: matrices not aligned

Is this a bug, or am I using pandas wrong?

Not only must the shapes of x and y be compatible, but the column names of x must also match the index names of y. Otherwise this code in pandas/core/frame.py raises a ValueError:

    if isinstance(other, (Series, DataFrame)):
        common = self.columns.union(other.index)
        if (len(common) > len(self.columns) or
                len(common) > len(other.index)):
            raise ValueError('matrices not aligned')

If you just want to compute the matrix product without making the column names of x match the index names of y, use the NumPy dot function:

    np.dot(x, y)

The column names of x must match the index names of y because the pandas dot method reindexes x and y: if the column order of x and the index order of y do not naturally match, they are made to match before the matrix product is performed:

    left = self.reindex(columns=common, copy=False)
    right = other.reindex(index=common, copy=False)

The NumPy dot function does no such thing. It just computes the matrix product from the values in the underlying arrays.
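
To make the difference concrete, here is a small sketch in which the two approaches give different numbers. The frames and their labels ('a', 'b', 'u', 'v') are made up for illustration; the index of y holds the same labels as the columns of x, but in reverse order.

```python
import numpy as np
import pandas as pd

# Illustrative data only: x has columns 'a', 'b'.
x = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
# y uses the same labels for its index, but in the opposite order.
y = pd.DataFrame([[10, 20], [30, 40]], index=['b', 'a'], columns=['u', 'v'])

# pandas lines up x's column 'a' with y's row 'a' (and 'b' with 'b')
# before multiplying.
by_label = x.dot(y)

# NumPy ignores the labels and multiplies the arrays as stored.
by_position = np.dot(x, y)

print(by_label)
print(by_position)
```

Both calls succeed here because the labels match as sets, yet the results differ, so it is worth deciding which behavior you actually want before switching from one to the other.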


Here is an example which reproduces the error:

    import pandas as pd
    import numpy as np

    columns = ['col{}'.format(i) for i in range(36)]
    x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
    y = pd.DataFrame(np.random.random((36, 36)))

    print(np.dot(x, y).shape)
    # (1062, 36)

    print(x.dot(y).shape)
    # ValueError: matrices not aligned
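
If you would rather stay in pandas, the following is a minimal sketch of two ways to get the product for the same x and y. It assumes the rows of y are already in the same order as the columns of x; the names y_labeled, result_labeled and result_raw are only for illustration.

```python
import numpy as np
import pandas as pd

columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
y = pd.DataFrame(np.random.random((36, 36)))

# Option 1: give y the same index labels as x's columns, so that the
# alignment check in DataFrame.dot passes.
y_labeled = y.copy()
y_labeled.index = x.columns
result_labeled = x.dot(y_labeled)

# Option 2: pass the raw array, so pandas has no labels on the right-hand
# side to align and only checks the shapes.
result_raw = x.dot(y.values)

print(result_labeled.shape)
print(result_raw.shape)
```

Both results are DataFrames that keep the index of x. With option 1 the columns come from y, while with option 2 they fall back to a default integer range.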
