DEVELOPMENT... OpenML

Data

california

Status: active. Format: ARFF. Visibility: public. Uploaded 21-06-2022 by Frank Wallace.

This dataset is used in the tabular data benchmark (https://github.com/LeoGrin/tabular-benchmark) and is transformed in the same way as in that benchmark. It belongs to the "classification on numerical features" benchmark.

Original source: https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html

Please give credit to the original source if you use this dataset.

9 features

Feature	Type	Unique values	Missing values
price (target)	nominal	2	0
MedInc	numeric	12925	0
HouseAge	numeric	52	0
AveRooms	numeric	19387	0
AveBedrms	numeric	14229	0
Population	numeric	3887	0
AveOccup	numeric	18837	0
Latitude	numeric	862	0
Longitude	numeric	844	0

19 properties

Property	Value	Description
NumberOfInstances	20634	Number of instances (rows) of the dataset.
NumberOfFeatures	9	Number of attributes (columns) of the dataset.
NumberOfClasses	2	Number of distinct values of the target attribute (if it is nominal).
NumberOfMissingValues	0	Number of missing values in the dataset.
NumberOfInstancesWithMissingValues	0	Number of instances with at least one value missing.
NumberOfNumericFeatures	8	Number of numeric attributes.
NumberOfSymbolicFeatures	1	Number of nominal attributes.
PercentageOfSymbolicFeatures	11.11	Percentage of nominal attributes.
AutoCorrelation	1	Average class difference between consecutive instances.
PercentageOfNumericFeatures	88.89	Percentage of numeric attributes.
PercentageOfMissingValues	0	Percentage of missing values.
PercentageOfInstancesWithMissingValues	0	Percentage of instances having missing values.
PercentageOfBinaryFeatures	11.11	Percentage of binary attributes.
NumberOfBinaryFeatures	1	Number of binary attributes.
MinorityClassSize	10317	Number of instances belonging to the least frequent class.
MinorityClassPercentage	50	Percentage of instances belonging to the least frequent class.
MajorityClassSize	10317	Number of instances belonging to the most frequent class.
MajorityClassPercentage	50	Percentage of instances belonging to the most frequent class.
Dimensionality	0	Number of attributes divided by the number of instances.
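
Several of the properties above are simple ratios of the listed counts, so they can be cross-checked by hand. The following is a minimal sketch of that arithmetic using only the numbers shown on this page. The variable names, the `pct` helper, and the `attributes_per_instance` key are illustrative choices, not OpenML identifiers, and rounding to two decimals is an assumption about how the page displays percentages.

```python
# Counts copied from the feature list and property list on this page.
n_instances = 20634
feature_types = {
    "price": "nominal",  # target attribute, 2 classes
    "MedInc": "numeric",
    "HouseAge": "numeric",
    "AveRooms": "numeric",
    "AveBedrms": "numeric",
    "Population": "numeric",
    "AveOccup": "numeric",
    "Latitude": "numeric",
    "Longitude": "numeric",
}
class_sizes = (10317, 10317)  # minority and majority class sizes

n_features = len(feature_types)
n_numeric = sum(t == "numeric" for t in feature_types.values())
n_symbolic = n_features - n_numeric


def pct(part, whole):
    """Percentage rounded to two decimals (assumed display precision)."""
    return round(100 * part / whole, 2)


derived = {
    "PercentageOfNumericFeatures": pct(n_numeric, n_features),
    "PercentageOfSymbolicFeatures": pct(n_symbolic, n_features),
    "MinorityClassPercentage": pct(min(class_sizes), n_instances),
    "MajorityClassPercentage": pct(max(class_sizes), n_instances),
    # Illustrative key for "number of attributes divided by number of instances".
    "attributes_per_instance": n_features / n_instances,
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

The attributes-per-instance ratio comes out at roughly 0.0004, which would explain why the page shows that property as 0 if the displayed value is rounded.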

1 task

Supervised Classification on california

0 runs. estimation_procedure: 10-fold cross-validation; evaluation_measure: predictive_accuracy; target_feature: price
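
To make the task settings concrete, here is a minimal standard-library sketch of 10-fold cross-validation scored with predictive accuracy. It rests on several assumptions: the folds are stratified by class (the page only names the procedure), the labels `0` and `1` are stand-ins because the page does not list the class names of `price`, and the "model" is a hypothetical constant baseline rather than anything run on OpenML. The function names are illustrative.

```python
import random


def stratified_kfold(labels, k=10, seed=0):
    """Return k lists of test indices, dealing each class round-robin over folds."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    offset = 0  # continue dealing where the previous class stopped
    for indices in by_class.values():
        rng.shuffle(indices)
        for j, idx in enumerate(indices):
            folds[(j + offset) % k].append(idx)
        offset += len(indices)
    return folds


def predictive_accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Stand-in labels with the class balance reported on this page (10317 per class).
labels = [0] * 10317 + [1] * 10317
folds = stratified_kfold(labels, k=10)

# Hypothetical baseline: always predict class 0, score each fold, then average.
scores = []
for test_idx in folds:
    y_true = [labels[i] for i in test_idx]
    y_pred = [0] * len(test_idx)
    scores.append(predictive_accuracy(y_true, y_pred))
mean_accuracy = sum(scores) / len(scores)

print(sorted(len(f) for f in folds))
print(round(mean_accuracy, 4))
```

Because 20634 is not divisible by 10, the folds cannot all be the same size. With perfectly balanced classes, a constant predictor scores about 0.5, which is the floor any real classifier on this task should beat.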