Version 0.22.0
Enable Arrow 0.15.1+
Apache Arrow 0.15.0 did not work well with PySpark 2.4, so it was disabled in the previous version.
With Arrow 0.15.1, it now works in Koalas (#902).
Expanding and Rolling
We also added expanding() and rolling() APIs in all groupby(), Series and Frame (#985, #991, #990, #1015, #996, #1034, #1037)
- min
- max
- sum
- mean
- std
- var
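As a rough illustration of the difference between the two window types, here is a plain-Python sketch. It does not use Koalas or Spark; the helper names expanding_apply and rolling_apply are hypothetical and exist only for this example. It assumes the pandas convention that a rolling window yields no value until it is full.

```python
def expanding_apply(values, func):
    """Apply func to every prefix values[:i + 1] (an expanding window)."""
    return [func(values[: i + 1]) for i in range(len(values))]


def rolling_apply(values, window, func):
    """Apply func to each trailing window of fixed size.

    Positions where the window is not yet full produce None.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough observations yet
        else:
            out.append(func(values[i + 1 - window : i + 1]))
    return out


data = [1, 2, 3, 4, 5]

expanding_sum = expanding_apply(data, sum)  # grows from the first element
rolling_sum = rolling_apply(data, 3, sum)   # always the last 3 elements
rolling_max = rolling_apply(data, 2, max)   # always the last 2 elements

print(expanding_sum)  # [1, 3, 6, 10, 15]
print(rolling_sum)    # [None, None, 6, 9, 12]
print(rolling_max)    # [None, 2, 3, 4, 5]
```

An expanding window keeps every earlier observation, while a rolling window drops the oldest one as it moves forward.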
Multi-index columns support
We continue improving multi-index columns support. We made the following APIs support multi-index columns:
Documentation
We added a "Best Practices" section to the documentation (#1041) for Koalas users to read and follow. Please see https://koalas.readthedocs.io/en/latest/user_guide/best_practices.html
Other new features and improvements
We added the following new features:
koalas.DataFrame:
koalas.Series:
koalas.MultiIndex:
Along with the following improvements:
- Introduce column_scols in InternalFrame as a substitute for data_columns. (#956)
- Fix different index level assignment when 'compute.ops_on_diff_frames' is enabled (#1045)
- Fix the DataFrame.melt function and add a doctest case for it (#987)
- Enable creating Index from list like 'Index([1, 2, 3])' (#986)
- Fix combine_frames to handle the case where the right-hand side arguments are modified Series (#1020)
- setup.py should support Python 2 to show a proper error message. (#1027)
- Remove Series.schema. (#993)
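For context on the setup.py item (#1027): a version check can only print a helpful message if the file itself parses under the old interpreter. The following is a minimal sketch of that idea, not the actual Koalas setup.py. The MIN_VERSION value, the function name, and the message text are assumptions made for illustration.

```python
import sys

# Assumed minimum version, for illustration only.
MIN_VERSION = (3, 5)


def unsupported_message(version_info, minimum=MIN_VERSION):
    """Return an error message if version_info is too old, else None.

    Uses only syntax that Python 2 can parse (no f-strings, no annotations),
    so the check itself can run on an unsupported interpreter.
    """
    if tuple(version_info[:2]) < minimum:
        return "Python %d.%d is not supported; version %d.%d or newer is required." % (
            version_info[0], version_info[1], minimum[0], minimum[1])
    return None


message = unsupported_message(sys.version_info)
if message is not None:
    # sys.stderr.write works identically on Python 2 and Python 3.
    sys.stderr.write(message + "\n")
    sys.exit(1)
```

Placing such a check before any Python 3-only syntax or imports lets an unsupported interpreter fail with a readable message instead of a SyntaxError.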