Datasets

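Columns are grouped by prefix (for example base.x, base.y, base.z). The src argument selects which prefix supplies the x and y values to plot: src="base" reads base.x and base.y.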
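The prefix convention can be pictured with plain Python. This is only a sketch of the naming scheme visible in the dataset sample on this page, not the library's implementation: columns_for_prefix is a hypothetical helper, and the "smooth" prefix with its values is made up here purely to show what switching prefixes would mean.

```python
# Hypothetical helper, NOT part of the cumulative API: it only illustrates
# the "<prefix>.<name>" column naming seen in the dataset sample.
def columns_for_prefix(row, prefix):
    """Return the (x, y) sequences stored under `prefix` in a row dict."""
    return row[f"{prefix}.x"], row[f"{prefix}.y"]


row = {
    # Copied from the first row of the dataset sample on this page.
    "base.x": [0, 1, 2, 3],
    "base.y": [4.9, 3.0, 1.4, 0.2],
    "base.z": 0,
    # Assumed second prefix with made-up values, for illustration only.
    "smooth.x": [0, 1.5, 3],
    "smooth.y": [4.9, 2.2, 0.2],
}

# Selecting the "base" prefix picks base.x and base.y as the plotted values.
x, y = columns_for_prefix(row, "base")
print(x, y)
```

Passing a different prefix to the same helper returns a different pair of columns, which is the behaviour the src argument is described as providing.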

Example of plotting a sklearn dataset

from sklearn.datasets import load_iris

from cumulative.datasets.load_sklearn import load_sklearn
from cumulative.plotting import plot_ctx

# Load the Iris dataset, then sample 100 rows (the overview reports Count 100).
c = load_sklearn(load_iris)
c.sample(m=100)

print("Dataset sample:")
print(c.df.head(5))
print("--\n")

print("Dataset overview:")
c.describe()
print("--\n")

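# Store the value of base.z as score.value, then sort the rows by it.
c.score(src="base.z", dst="score", method="value")
c.sort(by="score.value")
# Interpolate each curve with the PCHIP method.
c.interpolate(method="pchip", num=100)

# Draw the curves under the "base" prefix, passing the score column computed above.
with plot_ctx() as ax:
    c.plot.draw(
        ax=ax, src="base", style="-", ms=1, alpha=0.5, score="score.value"
    )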
Output
Dataset sample:
         base.x                base.y  base.z
1  [0, 1, 2, 3]  [4.9, 3.0, 1.4, 0.2]       0
2  [0, 1, 2, 3]  [4.7, 3.2, 1.3, 0.2]       0
3  [0, 1, 2, 3]  [4.6, 3.1, 1.5, 0.2]       0
5  [0, 1, 2, 3]  [5.4, 3.9, 1.7, 0.4]       0
6  [0, 1, 2, 3]  [4.6, 3.4, 1.4, 0.3]       0
--

Dataset overview:
Count...: 100
Length..: min=4 max=4 diff=0
X min...: min=0 max=0 diff=0
X max...: min=3 max=3 diff=0
Y min...: min=0.1 max=2.5 diff=2.4
Y max...: min=4.4 max=7.7 diff=3.3
--
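The overview lines read as per-curve statistics summarised across the whole dataset: for instance, "Y max" would take the maximum y of each curve and then report the smallest and largest of those maxima. The sketch below follows that reading using the five sample rows printed above. It is an assumption about what describe() reports, not its actual code, and span and overview are hypothetical helpers.

```python
def span(values):
    """Return (min, max, max - min) for a list of numbers."""
    lo, hi = min(values), max(values)
    return lo, hi, hi - lo


def overview(curves):
    """Summarise a list of (x, y) curves; assumed reading of the overview."""
    return {
        "Count": len(curves),
        "Length": span([len(x) for x, _ in curves]),
        "X min": span([min(x) for x, _ in curves]),
        "X max": span([max(x) for x, _ in curves]),
        "Y min": span([min(y) for _, y in curves]),
        "Y max": span([max(y) for _, y in curves]),
    }


# The five rows shown in the dataset sample (base.x, base.y).
curves = [
    ([0, 1, 2, 3], [4.9, 3.0, 1.4, 0.2]),
    ([0, 1, 2, 3], [4.7, 3.2, 1.3, 0.2]),
    ([0, 1, 2, 3], [4.6, 3.1, 1.5, 0.2]),
    ([0, 1, 2, 3], [5.4, 3.9, 1.7, 0.4]),
    ([0, 1, 2, 3], [4.6, 3.4, 1.4, 0.3]),
]

ov = overview(curves)
print(f"{'Count':.<8}: {ov['Count']}")
for name in ("Length", "X min", "X max", "Y min", "Y max"):
    lo, hi, diff = ov[name]
    # :g trims floating-point noise such as 0.8000000000000007.
    print(f"{name:.<8}: min={lo:g} max={hi:g} diff={diff:g}")
```

Because only five of the 100 sampled rows are used here, the Y ranges are narrower than those in the overview above; the Length, X min and X max lines match because every row has the same four x positions.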