python - Fill NaN in candlestick OHLCV data


I have a DataFrame like this:

                           open    high     low   close         vol
    2012-01-01 19:00:00  449000  449000  449000  449000  1336303000
    2012-01-01 20:00:00     NaN     NaN     NaN     NaN         NaN
    2012-01-01 21:00:00     NaN     NaN     NaN     NaN         NaN
    2012-01-01 22:00:00     NaN     NaN     NaN     NaN         NaN
    2012-01-01 23:00:00     NaN     NaN     NaN     NaN         NaN
    ...
                             open      high       low     close          vol
    2013-04-24 14:00:00  11700000  12000000  11600000  12000000  20647095439
    2013-04-24 15:00:00  12000000  12399000  11979000  12399000  23997107870
    2013-04-24 16:00:00  12399000  12400000  11865000  12100000   9379191474
    2013-04-24 17:00:00  12300000  12397995  11850000  11850000   4281521826
    2013-04-24 18:00:00  11850000  11850000  10903000  11800000  15546034128

I need to fill the NaN values according to the following rule.

When open, high, low and close are all NaN:

  • set vol to 0
  • set open, high, low and close to the close value of the previous candle

Otherwise, keep the NaN.

Here's how to do it via masking.

Simulate a frame with some holes (a is your 'close' field):

    In [20]: df = pd.DataFrame(np.random.randn(10,3),
                               index=pd.date_range('20130101',periods=10,freq='min'),
                               columns=list('abc'))

    In [21]: df.iloc[1:3,:] = np.nan

    In [22]: df.iloc[5:8,1:3] = np.nan

    In [23]: df
    Out[23]:
                                a         b         c
    2013-01-01 00:00:00 -0.486149  0.156894 -0.272362
    2013-01-01 00:01:00       NaN       NaN       NaN
    2013-01-01 00:02:00       NaN       NaN       NaN
    2013-01-01 00:03:00  1.788240 -0.593195  0.059606
    2013-01-01 00:04:00  1.097781  0.835491 -0.855468
    2013-01-01 00:05:00  0.753991       NaN       NaN
    2013-01-01 00:06:00 -0.456790       NaN       NaN
    2013-01-01 00:07:00 -0.479704       NaN       NaN
    2013-01-01 00:08:00  1.332830  1.276571 -0.480007
    2013-01-01 00:09:00 -0.759806 -0.815984  2.699401

These are the rows that are all NaN:

    In [24]: mask_0 = pd.isnull(df).all(axis=1)

    In [25]: mask_0
    Out[25]:
    2013-01-01 00:00:00    False
    2013-01-01 00:01:00     True
    2013-01-01 00:02:00     True
    2013-01-01 00:03:00    False
    2013-01-01 00:04:00    False
    2013-01-01 00:05:00    False
    2013-01-01 00:06:00    False
    2013-01-01 00:07:00    False
    2013-01-01 00:08:00    False
    2013-01-01 00:09:00    False
    Freq: T, dtype: bool

These are the rows where we want to propagate a:

    In [26]: mask_fill = pd.isnull(df['b']) & pd.isnull(df['c'])

    In [27]: mask_fill
    Out[27]:
    2013-01-01 00:00:00    False
    2013-01-01 00:01:00     True
    2013-01-01 00:02:00     True
    2013-01-01 00:03:00    False
    2013-01-01 00:04:00    False
    2013-01-01 00:05:00     True
    2013-01-01 00:06:00     True
    2013-01-01 00:07:00     True
    2013-01-01 00:08:00    False
    2013-01-01 00:09:00    False
    Freq: T, dtype: bool

Propagate first:

    In [28]: df.loc[mask_fill,'c'] = df['a']

    In [29]: df.loc[mask_fill,'b'] = df['a']
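The assignment above relies on pandas aligning the right-hand Series on the index of the selected rows, so only the masked rows receive values from a. A minimal sketch of that behaviour, using a small made-up frame (the values are illustrative, not taken from the question):

```python
import numpy as np
import pandas as pd

# Made-up values; 'a' plays the role of the 'close' column.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, np.nan, 30.0]})
mask = df["b"].isna()

# The full Series df['a'] is on the right-hand side, but pandas aligns it
# on the index of the rows selected by the mask, so only row 1 is written.
df.loc[mask, "b"] = df["a"]

print(df["b"].tolist())  # [10.0, 2.0, 30.0]
```

Rows where the mask is False keep their original values in b.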

Then fill in the 0's:

    In [30]: df.loc[mask_0] = 0

Done:

    In [31]: df
    Out[31]:
                                a         b         c
    2013-01-01 00:00:00 -0.486149  0.156894 -0.272362
    2013-01-01 00:01:00  0.000000  0.000000  0.000000
    2013-01-01 00:02:00  0.000000  0.000000  0.000000
    2013-01-01 00:03:00  1.788240 -0.593195  0.059606
    2013-01-01 00:04:00  1.097781  0.835491 -0.855468
    2013-01-01 00:05:00  0.753991  0.753991  0.753991
    2013-01-01 00:06:00 -0.456790 -0.456790 -0.456790
    2013-01-01 00:07:00 -0.479704 -0.479704 -0.479704
    2013-01-01 00:08:00  1.332830  1.276571 -0.480007
    2013-01-01 00:09:00 -0.759806 -0.815984  2.699401
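For the OHLCV frame in the question, the same masking idea could be sketched roughly as follows. This is only a sketch under some assumptions: the sample values are made up, "previous close" is read as the last non-NaN close (hence the forward fill), and rows where only some of the prices are missing are left untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical sample shaped like the question's frame (values are made up).
idx = pd.to_datetime([
    "2012-01-01 19:00", "2012-01-01 20:00", "2012-01-01 21:00",
    "2012-01-01 22:00", "2012-01-01 23:00",
])
df = pd.DataFrame(
    {
        "open":  [449000, np.nan, np.nan, 450000, np.nan],
        "high":  [449000, np.nan, np.nan, 451000, 451000],
        "low":   [449000, np.nan, np.nan, 448000, 449000],
        "close": [449000, np.nan, np.nan, 450500, 450000],
        "vol":   [1336303000, np.nan, np.nan, 5000, 7000],
    },
    index=idx,
)

price_cols = ["open", "high", "low", "close"]

# Rows where every price column is NaN (an "empty" candle).
empty = df[price_cols].isna().all(axis=1)

# Assumption: "previous close" means the last non-NaN close seen so far.
prev_close = df["close"].ffill()

# Fill all four prices of the empty candles with that close, then zero the volume.
for col in price_cols:
    df.loc[empty, col] = prev_close[empty]
df.loc[empty, "vol"] = 0

print(df)
```

If the very first row is an empty candle there is no earlier close to carry forward, so its prices stay NaN while vol is still set to 0.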
