Fashion-MNIST_seed_4_nrows_2000_nclasses_10_ncols_100_stratify_True

Status: active. Format: ARFF. Visibility: public. Uploaded 17-11-2022 by David Wilson.
Subsampling of the dataset Fashion-MNIST (40996) with seed=4, args.nrows=2000, args.ncols=100, args.nclasses=10, args.no_stratify=True.

Generated with the following source code:

```python
def subsample(
    self,
    seed: int,
    nrows_max: int = 2_000,
    ncols_max: int = 100,
    nclasses_max: int = 10,
    stratified: bool = True,
) -> Dataset:
    rng = np.random.default_rng(seed)

    x = self.x
    y = self.y

    # Uniformly sample
    classes = y.unique()
    if len(classes) > nclasses_max:
        vcs = y.value_counts()
        selected_classes = rng.choice(
            classes,
            size=nclasses_max,
            replace=False,
            p=vcs / sum(vcs),
        )

        # Select the indices where one of these classes is present
        idxs = y.index[y.isin(classes)]
        x = x.iloc[idxs]
        y = y.iloc[idxs]

    # Uniformly sample columns if required
    if len(x.columns) > ncols_max:
        columns_idxs = rng.choice(
            list(range(len(x.columns))), size=ncols_max, replace=False
        )
        sorted_column_idxs = sorted(columns_idxs)
        selected_columns = list(x.columns[sorted_column_idxs])
        x = x[selected_columns]
    else:
        sorted_column_idxs = list(range(len(x.columns)))

    if len(x) > nrows_max:
        # Stratify accordingly
        target_name = y.name
        data = pd.concat((x, y), axis="columns")
        _, subset = train_test_split(
            data,
            test_size=nrows_max,
            stratify=data[target_name],
            shuffle=True,
            random_state=seed,
        )
        x = subset.drop(target_name, axis="columns")
        y = subset[target_name]

    # We need to convert categorical columns to string for openml
    categorical_mask = [self.categorical_mask[i] for i in sorted_column_idxs]
    columns = list(x.columns)

    return Dataset(
        # Technically this is not the same but it's where it was derived from
        dataset=self.dataset,
        x=x,
        y=y,
        categorical_mask=categorical_mask,
        columns=columns,
    )
```
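To illustrate what the stratified row step in the source code is meant to achieve, here is a minimal standard-library sketch of class-proportional sampling. It is not the scikit-learn `train_test_split` implementation that the source code calls. The function `stratified_sample` and the synthetic labels are hypothetical, introduced only for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(labels, n_rows, seed):
    """Return sorted row indices of a class-proportional sample of size n_rows.

    Hypothetical helper for illustration; uses largest-remainder allocation.
    """
    if n_rows > len(labels):
        raise ValueError("n_rows exceeds the number of available rows")
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    total = len(labels)
    # Ideal fractional share per class, then round down.
    quotas = {c: len(rows) * n_rows / total for c, rows in by_class.items()}
    alloc = {c: int(q) for c, q in quotas.items()}

    # Give the rows lost to rounding to the classes with the largest remainders.
    leftover = n_rows - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - alloc[c], reverse=True)
    for c in by_remainder[:leftover]:
        alloc[c] += 1

    # Draw without replacement inside each class.
    picked = []
    for c, rows in by_class.items():
        picked.extend(rng.sample(rows, alloc[c]))
    return sorted(picked)


# Synthetic, perfectly balanced labels: 10 classes with 700 rows each.
labels = [i % 10 for i in range(7000)]
idx = stratified_sample(labels, n_rows=2000, seed=4)
counts = Counter(labels[i] for i in idx)
print(len(idx), sorted(counts.values()))
```

With balanced input classes, each class contributes the same number of rows to the sample, which matches the 200 rows per class reported in the dataset properties.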

101 features

class (target), nominal: 10 unique values, 0 missing
pixel21, numeric: 91 unique values, 0 missing
pixel28, numeric: 3 unique values, 0 missing
pixel43, numeric: 238 unique values, 0 missing
pixel56, numeric: 6 unique values, 0 missing
pixel60, numeric: 21 unique values, 0 missing
pixel66, numeric: 246 unique values, 0 missing
pixel88, numeric: 32 unique values, 0 missing
pixel98, numeric: 243 unique values, 0 missing
pixel100, numeric: 245 unique values, 0 missing
pixel123, numeric: 246 unique values, 0 missing
pixel129, numeric: 248 unique values, 0 missing
pixel133, numeric: 245 unique values, 0 missing
pixel138, numeric: 114 unique values, 0 missing
pixel147, numeric: 237 unique values, 0 missing
pixel150, numeric: 246 unique values, 0 missing
pixel152, numeric: 242 unique values, 0 missing
pixel155, numeric: 248 unique values, 0 missing
pixel166, numeric: 150 unique values, 0 missing
pixel167, numeric: 115 unique values, 0 missing
pixel173, numeric: 197 unique values, 0 missing
pixel197, numeric: 19 unique values, 0 missing
pixel202, numeric: 238 unique values, 0 missing
pixel221, numeric: 223 unique values, 0 missing
pixel239, numeric: 252 unique values, 0 missing
pixel260, numeric: 233 unique values, 0 missing
pixel263, numeric: 249 unique values, 0 missing
pixel265, numeric: 251 unique values, 0 missing
pixel270, numeric: 255 unique values, 0 missing
pixel281, numeric: 41 unique values, 0 missing
pixel297, numeric: 248 unique values, 0 missing
pixel300, numeric: 254 unique values, 0 missing
pixel306, numeric: 234 unique values, 0 missing
pixel315, numeric: 227 unique values, 0 missing
pixel336, numeric: 118 unique values, 0 missing
pixel338, numeric: 76 unique values, 0 missing
pixel352, numeric: 251 unique values, 0 missing
pixel356, numeric: 255 unique values, 0 missing
pixel363, numeric: 230 unique values, 0 missing
pixel364, numeric: 133 unique values, 0 missing
pixel366, numeric: 87 unique values, 0 missing
pixel367, numeric: 125 unique values, 0 missing
pixel370, numeric: 242 unique values, 0 missing
pixel371, numeric: 245 unique values, 0 missing
pixel373, numeric: 254 unique values, 0 missing
pixel378, numeric: 249 unique values, 0 missing
pixel383, numeric: 251 unique values, 0 missing
pixel390, numeric: 220 unique values, 0 missing
pixel402, numeric: 255 unique values, 0 missing
pixel407, numeric: 251 unique values, 0 missing
pixel413, numeric: 250 unique values, 0 missing
pixel414, numeric: 242 unique values, 0 missing
pixel422, numeric: 158 unique values, 0 missing
pixel425, numeric: 241 unique values, 0 missing
pixel431, numeric: 254 unique values, 0 missing
pixel437, numeric: 245 unique values, 0 missing
pixel438, numeric: 248 unique values, 0 missing
pixel441, numeric: 251 unique values, 0 missing
pixel474, numeric: 233 unique values, 0 missing
pixel480, numeric: 232 unique values, 0 missing
pixel486, numeric: 253 unique values, 0 missing
pixel498, numeric: 251 unique values, 0 missing
pixel510, numeric: 250 unique values, 0 missing
pixel512, numeric: 254 unique values, 0 missing
pixel514, numeric: 252 unique values, 0 missing
pixel526, numeric: 251 unique values, 0 missing
pixel531, numeric: 236 unique values, 0 missing
pixel560, numeric: 132 unique values, 0 missing
pixel562, numeric: 185 unique values, 0 missing
pixel592, numeric: 225 unique values, 0 missing
pixel606, numeric: 249 unique values, 0 missing
pixel612, numeric: 242 unique values, 0 missing
pixel620, numeric: 221 unique values, 0 missing
pixel627, numeric: 251 unique values, 0 missing
pixel631, numeric: 249 unique values, 0 missing
pixel637, numeric: 251 unique values, 0 missing
pixel647, numeric: 166 unique values, 0 missing
pixel648, numeric: 212 unique values, 0 missing
pixel655, numeric: 248 unique values, 0 missing
pixel665, numeric: 251 unique values, 0 missing
pixel669, numeric: 236 unique values, 0 missing
pixel671, numeric: 171 unique values, 0 missing
pixel674, numeric: 95 unique values, 0 missing
pixel687, numeric: 243 unique values, 0 missing
pixel688, numeric: 248 unique values, 0 missing
pixel692, numeric: 246 unique values, 0 missing
pixel696, numeric: 230 unique values, 0 missing
pixel703, numeric: 111 unique values, 0 missing
pixel705, numeric: 217 unique values, 0 missing
pixel709, numeric: 240 unique values, 0 missing
pixel710, numeric: 240 unique values, 0 missing
pixel733, numeric: 215 unique values, 0 missing
pixel745, numeric: 232 unique values, 0 missing
pixel749, numeric: 239 unique values, 0 missing
pixel751, numeric: 225 unique values, 0 missing
pixel753, numeric: 198 unique values, 0 missing
pixel755, numeric: 66 unique values, 0 missing
pixel756, numeric: 16 unique values, 0 missing
pixel774, numeric: 222 unique values, 0 missing
pixel777, numeric: 171 unique values, 0 missing
pixel781, numeric: 127 unique values, 0 missing
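The feature names keep the pixel numbers of the parent dataset, so each retained column can be traced back to a position in the original image. The sketch below maps a name to a 0-based (row, column) pair. It assumes that Fashion-MNIST images are 28x28 and that the `pixelK` names are 1-based and row-major; the page itself does not state the numbering scheme, so treat that as an assumption. The helper `pixel_position` is hypothetical.

```python
IMAGE_WIDTH = 28  # Fashion-MNIST images are 28x28 = 784 pixels


def pixel_position(feature_name, width=IMAGE_WIDTH):
    """Map 'pixelK' to a 0-based (row, col) pair.

    ASSUMPTION: K is 1-based and pixels are numbered row by row.
    """
    prefix = "pixel"
    if not feature_name.startswith(prefix):
        raise ValueError(f"not a pixel feature: {feature_name!r}")
    k = int(feature_name[len(prefix):])
    if not 1 <= k <= width * width:
        raise ValueError(f"pixel number out of range: {k}")
    # Shift to 0-based, then split into row and column.
    return divmod(k - 1, width)


# First and last retained pixel features from the list above.
for name in ("pixel21", "pixel781"):
    print(name, pixel_position(name))
```

Under that assumption, low-cardinality features such as pixel28 and pixel56 fall on the right edge of the image, which would be consistent with border pixels that are almost always background.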

19 properties

Number of instances (rows) of the dataset: 2000
Number of attributes (columns) of the dataset: 101
Number of distinct values of the target attribute (if it is nominal): 10
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 100
Number of nominal attributes: 1
Percentage of nominal attributes: 0.99
Average class difference between consecutive instances: 0.11
Percentage of numeric attributes: 99.01
Percentage of missing values: 0
Percentage of instances having missing values: 0
Percentage of binary attributes: 0
Number of binary attributes: 0
Number of instances belonging to the least frequent class: 200
Percentage of instances belonging to the least frequent class: 10
Number of instances belonging to the most frequent class: 200
Percentage of instances belonging to the most frequent class: 10
Number of attributes divided by the number of instances: 0.05
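Several of these properties follow from the raw counts by simple arithmetic. The sketch below recomputes them as a consistency check. The variable names and the rounding to two decimals are assumptions made for illustration; they are not OpenML's own implementation.

```python
# Raw counts taken from the properties listed above.
n_instances = 2000
n_numeric = 100
n_nominal = 1
minority_class_size = 200
majority_class_size = 200

n_attributes = n_numeric + n_nominal  # 100 pixel columns plus the class column


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


derived = {
    "pct_nominal": pct(n_nominal, n_attributes),
    "pct_numeric": pct(n_numeric, n_attributes),
    "pct_minority_class": pct(minority_class_size, n_instances),
    "pct_majority_class": pct(majority_class_size, n_instances),
    "dimensionality": round(n_attributes / n_instances, 2),
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

The equal minority and majority class sizes show that the 2000 rows are perfectly balanced across the 10 classes.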
