Fashion-MNIST_seed_2_nrows_2000_nclasses_10_ncols_100_stratify_True


active, ARFF, Publicly available, Visibility: public, Uploaded 17-11-2022 by David Wilson


Subsampling of the dataset Fashion-MNIST (40996) with seed=2, args.nrows=2000, args.ncols=100, args.nclasses=10, args.no_stratify=True. Generated with the following source code:

```python
def subsample(
    self,
    seed: int,
    nrows_max: int = 2_000,
    ncols_max: int = 100,
    nclasses_max: int = 10,
    stratified: bool = True,
) -> Dataset:
    rng = np.random.default_rng(seed)

    x = self.x
    y = self.y

    # Uniformly sample
    classes = y.unique()
    if len(classes) > nclasses_max:
        vcs = y.value_counts()
        selected_classes = rng.choice(
            classes,
            size=nclasses_max,
            replace=False,
            p=vcs / sum(vcs),
        )

        # Select the indices where one of these classes is present
        idxs = y.index[y.isin(classes)]
        x = x.iloc[idxs]
        y = y.iloc[idxs]

    # Uniformly sample columns if required
    if len(x.columns) > ncols_max:
        columns_idxs = rng.choice(
            list(range(len(x.columns))), size=ncols_max, replace=False
        )
        sorted_column_idxs = sorted(columns_idxs)
        selected_columns = list(x.columns[sorted_column_idxs])
        x = x[selected_columns]
    else:
        sorted_column_idxs = list(range(len(x.columns)))

    if len(x) > nrows_max:
        # Stratify accordingly
        target_name = y.name
        data = pd.concat((x, y), axis="columns")
        _, subset = train_test_split(
            data,
            test_size=nrows_max,
            stratify=data[target_name],
            shuffle=True,
            random_state=seed,
        )
        x = subset.drop(target_name, axis="columns")
        y = subset[target_name]

    # We need to convert categorical columns to string for openml
    categorical_mask = [self.categorical_mask[i] for i in sorted_column_idxs]
    columns = list(x.columns)

    return Dataset(
        # Technically this is not the same but it's where it was derived from
        dataset=self.dataset,
        x=x,
        y=y,
        categorical_mask=categorical_mask,
        columns=columns,
    )
```
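The row-subsampling step in the source code above relies on passing an integer `test_size` to scikit-learn's `train_test_split` and keeping only the "test" part as the subsample. The following is a minimal sketch of that idea on hypothetical toy data; the array names, shapes, and class layout are assumptions made for illustration and are not taken from the Fashion-MNIST data itself.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data (not Fashion-MNIST): 5,000 rows,
# 10 perfectly balanced classes, 4 integer features.
rng = np.random.default_rng(2)
labels = np.repeat(np.arange(10), 500)
features = rng.integers(0, 256, size=(labels.size, 4))

# An integer test_size requests an absolute number of rows, so the
# "test" split acts as the stratified subsample; the "train" part
# is simply discarded.
_, x_sub, _, y_sub = train_test_split(
    features,
    labels,
    test_size=2_000,
    stratify=labels,
    shuffle=True,
    random_state=2,
)

# Per-class row counts in the subsample.
class_counts = np.bincount(y_sub, minlength=10)
print(x_sub.shape)
print(class_counts)
```

With balanced input classes, stratification should give every class the same share of the 2,000 sampled rows, which is consistent with the 200 instances per class reported in the properties of this dataset.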

101 features

class (target) (nominal): 10 unique values, 0 missing
pixel29 (numeric): 1 unique value, 0 missing
pixel31 (numeric): 8 unique values, 0 missing
pixel39 (numeric): 243 unique values, 0 missing
pixel45 (numeric): 242 unique values, 0 missing
pixel59 (numeric): 9 unique values, 0 missing
pixel64 (numeric): 225 unique values, 0 missing
pixel75 (numeric): 248 unique values, 0 missing
pixel76 (numeric): 249 unique values, 0 missing
pixel78 (numeric): 219 unique values, 0 missing
pixel79 (numeric): 185 unique values, 0 missing
pixel80 (numeric): 121 unique values, 0 missing
pixel107 (numeric): 228 unique values, 0 missing
pixel132 (numeric): 248 unique values, 0 missing
pixel136 (numeric): 203 unique values, 0 missing
pixel142 (numeric): 33 unique values, 0 missing
pixel151 (numeric): 244 unique values, 0 missing
pixel158 (numeric): 250 unique values, 0 missing
pixel163 (numeric): 245 unique values, 0 missing
pixel165 (numeric): 189 unique values, 0 missing
pixel169 (numeric): 21 unique values, 0 missing
pixel180 (numeric): 249 unique values, 0 missing
pixel184 (numeric): 248 unique values, 0 missing
pixel194 (numeric): 176 unique values, 0 missing
pixel203 (numeric): 233 unique values, 0 missing
pixel206 (numeric): 250 unique values, 0 missing
pixel216 (numeric): 251 unique values, 0 missing
pixel233 (numeric): 243 unique values, 0 missing
pixel234 (numeric): 249 unique values, 0 missing
pixel242 (numeric): 250 unique values, 0 missing
pixel252 (numeric): 86 unique values, 0 missing
pixel264 (numeric): 248 unique values, 0 missing
pixel275 (numeric): 253 unique values, 0 missing
pixel283 (numeric): 110 unique values, 0 missing
pixel286 (numeric): 234 unique values, 0 missing
pixel302 (numeric): 250 unique values, 0 missing
pixel308 (numeric): 121 unique values, 0 missing
pixel312 (numeric): 164 unique values, 0 missing
pixel335 (numeric): 226 unique values, 0 missing
pixel336 (numeric): 127 unique values, 0 missing
pixel338 (numeric): 87 unique values, 0 missing
pixel343 (numeric): 240 unique values, 0 missing
pixel348 (numeric): 253 unique values, 0 missing
pixel355 (numeric): 253 unique values, 0 missing
pixel359 (numeric): 246 unique values, 0 missing
pixel372 (numeric): 250 unique values, 0 missing
pixel373 (numeric): 249 unique values, 0 missing
pixel375 (numeric): 251 unique values, 0 missing
pixel385 (numeric): 245 unique values, 0 missing
pixel392 (numeric): 144 unique values, 0 missing
pixel393 (numeric): 71 unique values, 0 missing
pixel397 (numeric): 236 unique values, 0 missing
pixel407 (numeric): 251 unique values, 0 missing
pixel417 (numeric): 242 unique values, 0 missing
pixel421 (numeric): 118 unique values, 0 missing
pixel422 (numeric): 177 unique values, 0 missing
pixel448 (numeric): 168 unique values, 0 missing
pixel450 (numeric): 197 unique values, 0 missing
pixel452 (numeric): 229 unique values, 0 missing
pixel454 (numeric): 251 unique values, 0 missing
pixel463 (numeric): 252 unique values, 0 missing
pixel472 (numeric): 245 unique values, 0 missing
pixel477 (numeric): 133 unique values, 0 missing
pixel483 (numeric): 248 unique values, 0 missing
pixel486 (numeric): 254 unique values, 0 missing
pixel491 (numeric): 249 unique values, 0 missing
pixel492 (numeric): 252 unique values, 0 missing
pixel505 (numeric): 121 unique values, 0 missing
pixel508 (numeric): 232 unique values, 0 missing
pixel510 (numeric): 254 unique values, 0 missing
pixel514 (numeric): 254 unique values, 0 missing
pixel519 (numeric): 251 unique values, 0 missing
pixel532 (numeric): 156 unique values, 0 missing
pixel537 (numeric): 254 unique values, 0 missing
pixel562 (numeric): 191 unique values, 0 missing
pixel566 (numeric): 252 unique values, 0 missing
pixel568 (numeric): 254 unique values, 0 missing
pixel574 (numeric): 251 unique values, 0 missing
pixel596 (numeric): 253 unique values, 0 missing
pixel598 (numeric): 251 unique values, 0 missing
pixel616 (numeric): 129 unique values, 0 missing
pixel625 (numeric): 251 unique values, 0 missing
pixel639 (numeric): 242 unique values, 0 missing
pixel642 (numeric): 199 unique values, 0 missing
pixel651 (numeric): 241 unique values, 0 missing
pixel657 (numeric): 246 unique values, 0 missing
pixel662 (numeric): 247 unique values, 0 missing
pixel667 (numeric): 236 unique values, 0 missing
pixel671 (numeric): 146 unique values, 0 missing
pixel674 (numeric): 105 unique values, 0 missing
pixel681 (numeric): 246 unique values, 0 missing
pixel684 (numeric): 242 unique values, 0 missing
pixel693 (numeric): 244 unique values, 0 missing
pixel695 (numeric): 236 unique values, 0 missing
pixel714 (numeric): 240 unique values, 0 missing
pixel716 (numeric): 246 unique values, 0 missing
pixel749 (numeric): 240 unique values, 0 missing
pixel756 (numeric): 21 unique values, 0 missing
pixel758 (numeric): 9 unique values, 0 missing
pixel770 (numeric): 228 unique values, 0 missing
pixel771 (numeric): 224 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 2000
Number of attributes (columns) of the dataset: 101
Number of distinct values of the target attribute (if it is nominal): 10
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 100
Number of nominal attributes: 1
Percentage of nominal attributes: 0.99
Average class difference between consecutive instances: 0.1
Percentage of numeric attributes: 99.01
Percentage of missing values: 0
Percentage of instances having missing values: 0
Percentage of binary attributes: 0
Number of binary attributes: 0
Number of instances belonging to the least frequent class: 200
Percentage of instances belonging to the least frequent class: 10
Number of instances belonging to the most frequent class: 200
Percentage of instances belonging to the most frequent class: 10
Number of attributes divided by the number of instances: 0.05
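Several of the properties above are simple ratios of the listed counts. As a quick sanity check, the sketch below recomputes them from the reported counts; the variable names are hypothetical, and rounding to two decimals is an assumption about how the page displays the values.

```python
# Counts as reported in the properties of this dataset.
n_rows = 2000
n_numeric = 100
n_nominal = 1
n_least_frequent = 200

# Total attribute count includes the nominal target column.
n_attributes = n_numeric + n_nominal

# Derived properties, rounded to two decimals (assumed display precision).
pct_nominal = round(100 * n_nominal / n_attributes, 2)
pct_numeric = round(100 * n_numeric / n_attributes, 2)
pct_least_frequent = round(100 * n_least_frequent / n_rows, 2)
attrs_per_instance = round(n_attributes / n_rows, 2)

print(n_attributes, pct_nominal, pct_numeric, pct_least_frequent, attrs_per_instance)
```

The recomputed values agree with the listed ones: 1 of 101 attributes is nominal (0.99%), 100 of 101 are numeric (99.01%), each class holds 10% of the rows, and 101 / 2000 rounds to 0.05.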

0 tasks
