python - Matrix multiplication in pandas
I have numeric data stored in two DataFrames, x and y. The inner product from NumPy works, but the dot product from pandas does not.
    In [63]: x.shape
    Out[63]: (1062, 36)

    In [64]: y.shape
    Out[64]: (36, 36)

    In [65]: np.inner(x, y).shape
    Out[65]: (1062L, 36L)

    In [66]: x.dot(y)
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-66-76c015be254b> in <module>()
    ----> 1 x.dot(y)

    c:\programs\winpython-64bit-2.7.3.3\python-2.7.3.amd64\lib\site-packages\pandas\core\frame.pyc in dot(self, other)
        888             if (len(common) > len(self.columns) or
        889                 len(common) > len(other.index)):
    --> 890                 raise ValueError('matrices not aligned')
        891
        892             left = self.reindex(columns=common, copy=False)

    ValueError: matrices not aligned

Is this a bug, or am I using pandas wrong?
Not only must the shapes of x and y be correct, but the column names of x must also match the index names of y. Otherwise this code in pandas/core/frame.py raises a ValueError:
    if isinstance(other, (Series, DataFrame)):
        common = self.columns.union(other.index)
        if (len(common) > len(self.columns) or
                len(common) > len(other.index)):
            raise ValueError('matrices not aligned')

If you only want to compute the matrix product, without making the column names of x match the index names of y, use the NumPy dot function:
    np.dot(x, y)

The column names of x must match the index names of y because the pandas dot method reindexes x and y: if the column order of x and the index order of y do not naturally match, they are made to match before the matrix product is performed:
    left = self.reindex(columns=common, copy=False)
    right = other.reindex(index=common, copy=False)

The NumPy dot function does no such thing. It simply computes the matrix product from the values in the underlying arrays.
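To make the difference concrete, here is a small sketch of label alignment versus positional multiplication. The labels 'a', 'b', 'c', the fixed seed, and the variable names are illustrative assumptions, not part of the original question.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)  # fixed seed, chosen only for repeatability
labels = ['a', 'b', 'c']        # hypothetical labels for illustration
x = pd.DataFrame(rng.random_sample((2, 3)), columns=labels)
y = pd.DataFrame(rng.random_sample((3, 2)), index=labels)

# Reverse the row order of y; each label stays attached to its row.
y_reversed = y.iloc[::-1]

# Reference result computed purely from the original values.
expected = np.dot(x.values, y.values)

# pandas aligns the columns of x with the index of y by label,
# so the row order of y does not change the result.
aligned = x.dot(y_reversed)

# NumPy multiplies by position, so the reversed rows give a different result.
positional = np.dot(x.values, y_reversed.values)

print(np.allclose(aligned.values, expected))   # True
print(np.allclose(positional, expected))       # False
```

So DataFrame.dot is the safer choice when the labels carry meaning, while np.dot is appropriate when only the positions matter.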
Here is an example that reproduces the error:
    import pandas as pd
    import numpy as np

    columns = ['col{}'.format(i) for i in range(36)]
    x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
    y = pd.DataFrame(np.random.random((36, 36)))

    print(np.dot(x, y).shape)
    # (1062, 36)

    print(x.dot(y).shape)
    # ValueError: matrices not aligned
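Starting from the same setup, the following sketch shows two possible ways around the error. It assumes that the rows of y are already in the same order as the columns of x, so matching them by position is what you intend. The names y_labeled, product_labeled and product_positional are made up for this example.

```python
import numpy as np
import pandas as pd

columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
y = pd.DataFrame(np.random.random((36, 36)))

# Option 1: give y the same index labels as x's columns, then use DataFrame.dot.
# This assumes row i of y corresponds to column i of x.
y_labeled = y.copy()
y_labeled.index = x.columns
product_labeled = x.dot(y_labeled)

# Option 2: pass a plain array, so no label alignment is attempted.
# DataFrame.dot accepts an ndarray and keeps x's index on the result.
product_positional = x.dot(y.values)

print(product_labeled.shape)      # (1062, 36)
print(product_positional.shape)   # (1062, 36)
```

Option 1 keeps the column labels of y on the result, which can help if later steps rely on them. Option 2 is shorter, but the result gets default integer column labels.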