Datasets

The src prefix selects which columns supply the x and y values to plot. For example, src="base" reads them from the base.x and base.y columns.
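To make the prefix convention concrete, here is a minimal sketch in plain pandas. It is not the library's implementation: the `select_xy` helper and the two-row DataFrame are hypothetical, built only to mirror the `base.x` / `base.y` / `base.z` column layout shown in the sample output on this page.

```python
import pandas as pd

# Hypothetical stand-in for the dataset's DataFrame: each row holds one
# series, with x and y stored as lists under a shared prefix ("base").
df = pd.DataFrame(
    {
        "base.x": [[0, 1, 2, 3], [0, 1, 2, 3]],
        "base.y": [[6.3, 2.5, 4.9, 1.5], [4.9, 3.6, 1.4, 0.1]],
        "base.z": [1, 0],
    }
)


def select_xy(frame: pd.DataFrame, src: str) -> pd.DataFrame:
    """Return the x and y columns stored under the prefix `src`.

    Hypothetical helper for illustration only.
    """
    return frame[[f"{src}.x", f"{src}.y"]]


xy = select_xy(df, "base")
print(xy.columns.tolist())
```

Under this convention, a second set of columns under another prefix could sit alongside the original, and the choice of prefix would decide which pair is read.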

Example of plotting a scikit-learn dataset:

from sklearn.datasets import load_iris

from cumulative.datasets.load_sklearn import load_sklearn
from cumulative.plotting import plot_ctx

# Load the iris dataset and keep a sample of 100 rows.
c = load_sklearn(load_iris)
c.sample(m=100)

print("Dataset sample:")
print(c.df.head(5))
print("--\n")

print("Dataset overview:")
c.describe()
print("--\n")

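# Use the class label (base.z) as each row's score, sort by it,
# then interpolate every series with PCHIP (num=100).
c.score(src="base.z", dst="score", method="value")
c.sort(by="score.value")
c.interpolate(method="pchip", num=100)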

with plot_ctx() as ax:
    c.plot.draw(
        ax=ax, src="base", style="-", ms=1, alpha=0.5, score="score.value"
    )

Output:
Dataset sample:
           base.x                base.y  base.z
72   [0, 1, 2, 3]  [6.3, 2.5, 4.9, 1.5]       1
112  [0, 1, 2, 3]  [6.8, 3.0, 5.5, 2.1]       2
132  [0, 1, 2, 3]  [6.4, 2.8, 5.6, 2.2]       2
88   [0, 1, 2, 3]  [5.6, 3.0, 4.1, 1.3]       1
37   [0, 1, 2, 3]  [4.9, 3.6, 1.4, 0.1]       0
--

Dataset overview:
Count...: 100
Length..: min=4 max=4 diff=0
X min...: min=0 max=0 diff=0
X max...: min=3 max=3 diff=0
Y min...: min=0.1 max=2.5 diff=2.4
Y max...: min=4.3 max=7.9 diff=3.6000000000000005
--
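
The interpolation step in the example turns each 4-point series into a denser curve before plotting. As a rough illustration of what PCHIP resampling does to a single row, the sketch below uses SciPy's `PchipInterpolator` directly. Whether the library uses SciPy internally is an assumption, `num` is assumed here to mean the number of output points, and `resample_pchip` is a hypothetical helper, not part of the library.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# One row from the sample output above: four x positions and four y values.
x = np.array([0, 1, 2, 3], dtype=float)
y = np.array([6.3, 2.5, 4.9, 1.5])


def resample_pchip(x, y, num=100):
    """Resample (x, y) onto `num` evenly spaced x values using PCHIP.

    Hypothetical helper for illustration only.
    """
    interpolator = PchipInterpolator(x, y)
    x_new = np.linspace(x.min(), x.max(), num)
    return x_new, interpolator(x_new)


x_new, y_new = resample_pchip(x, y, num=100)
print(len(x_new), y_new.min(), y_new.max())
```

PCHIP is shape-preserving: the resampled curve passes through the original points and does not overshoot them, which suits short series like these where a standard cubic spline could introduce peaks or dips that are not in the data.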