Stratified Sampling in Pandas

import pandas as pd

df = pd.read_csv("path/to/data.csv")

# Draw 10% of the rows from each Category; random_state makes the draw reproducible.
stratified = df.groupby('Category', group_keys=False).apply(lambda x: x.sample(frac=0.1, random_state=42))
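
The same per-group draw can probably be written without `apply`, since pandas 1.1 and later provide a `sample` method on the groupby object itself. The sketch below is illustrative only: the toy DataFrame and its `value` column are made-up stand-ins for the CSV, and it assumes a pandas version that has `DataFrameGroupBy.sample`.

```python
import pandas as pd

# Hypothetical toy data standing in for the CSV: 100 rows of "A", 50 rows of "B".
df = pd.DataFrame({
    "Category": ["A"] * 100 + ["B"] * 50,
    "value": range(150),
})

# Sample 10% of the rows within each Category (requires pandas >= 1.1).
stratified = df.groupby("Category").sample(frac=0.1, random_state=42)

# Count how many rows each category contributed to the sample.
counts = stratified["Category"].value_counts().to_dict()
print(counts)
```

Because the fraction is applied within each group, the sample keeps roughly the same category proportions as the full data. Very small groups can round down to zero rows at a low `frac`, so it is worth checking the per-category counts after sampling.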

Revision #2
Created 10 November 2022 11:21:06 by James
Updated 11 February 2024 16:27:03 by James