GroupBy.quantile
Return group values at the given quantile.
New in version 3.4.0.
Parameters
q: Value between 0 and 1 providing the quantile to compute.
accuracy: Default accuracy of approximation. A larger value means better accuracy. The relative error is 1.0 / accuracy. This is a pandas-on-Spark specific parameter.
Returns
Return type determined by caller of GroupBy object.
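As a rough illustration of the accuracy setting, the sketch below turns an accuracy value into the relative error 1.0 / accuracy and into an approximate rank tolerance for a group of a given size. The helper names are hypothetical. Reading the relative error as a tolerance on ranks follows the convention documented for pyspark.sql.DataFrame.approxQuantile and is an assumption here. Neither helper is part of the pandas-on-Spark API.

```python
def relative_error(accuracy: int) -> float:
    """Relative error implied by an accuracy setting (1.0 / accuracy)."""
    if accuracy <= 0:
        raise ValueError("accuracy must be a positive integer")
    return 1.0 / accuracy


def rank_tolerance(group_size: int, accuracy: int) -> float:
    """Approximate number of rank positions the result may be off by.

    Assumption: the relative error applies to ranks, so a group of
    `group_size` rows tolerates about group_size / accuracy positions.
    """
    if accuracy <= 0:
        raise ValueError("accuracy must be a positive integer")
    return group_size / accuracy


# Example values only: a group of 1,000,000 rows at two accuracy settings.
print(relative_error(10_000))
print(rank_tolerance(1_000_000, 10_000))
print(rank_tolerance(1_000_000, 100_000))
```

Under this reading, small groups such as the ones in the Examples section are unlikely to show any approximation error, while very large groups may return a value a few ranks away from the exact quantile.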
See also
pyspark.pandas.Series.quantile
pyspark.pandas.DataFrame.quantile
pyspark.sql.functions.percentile_approx
Notes
quantile in pandas-on-Spark uses a distributed percentile approximation algorithm, unlike pandas, so the result may differ from the pandas result. The interpolation parameter is not supported yet.
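To make the difference concrete, here is a minimal pure-Python sketch that contrasts a linearly interpolated quantile (the pandas default) with a rank-based quantile that always returns an element of the input. Both function names are hypothetical. The rank-based rule only stands in for an estimator that does not interpolate. It is not the algorithm Spark uses.

```python
import math


def quantile_linear(values, q):
    """Exact quantile with linear interpolation between neighbours."""
    if not values:
        raise ValueError("values must not be empty")
    data = sorted(values)
    pos = q * (len(data) - 1)          # fractional position in sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def quantile_element(values, q):
    """Rank-based quantile that returns an actual element of the input.

    Assumption: an illustrative stand-in for a non-interpolating
    estimator, not Spark's percentile approximation.
    """
    if not values:
        raise ValueError("values must not be empty")
    data = sorted(values)
    rank = max(math.ceil(q * len(data)), 1)  # 1-based rank
    return data[rank - 1]


group = [1, 2, 3, 4]
print(quantile_linear(group, 0.5))   # interpolates between 2 and 3
print(quantile_element(group, 0.5))  # picks an existing element
```

For groups with an odd number of rows, such as those in the Examples section, the two rules agree on the median. They diverge when the requested quantile falls between two observations.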
Examples
>>> df = ps.DataFrame([
...     ['a', 1], ['a', 2], ['a', 3],
...     ['b', 1], ['b', 3], ['b', 5]
... ], columns=['key', 'val'])
Groupby one column and return the quantile of the remaining columns in each group.
>>> df.groupby('key').quantile()
     val
key
a    2.0
b    3.0