OpenML


auto93

active ARFF Publicly available Visibility: public Uploaded 03-10-2014 by unknown

Author:
Source: Unknown - Date unknown
Please cite:

Attributes 2, 4, and 6 deleted. Midrange price treated as the class attribute.

As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems. Singapore: Springer-Verlag.

NAME: 1993 New Car Data
TYPE: Sample
SIZE: 93 observations, 26 variables

DESCRIPTIVE ABSTRACT:
Specifications are given for 93 new car models for the 1993 year. Several measures are given to evaluate price, mpg ratings, engine size, body size, and features.

SOURCES:
_Consumer Reports: The 1993 Cars - Annual Auto Issue_ (April 1993), Yonkers, NY: Consumers Union.
_PACE New Car & Truck 1993 Buying Guide_ (1993), Milwaukee, WI: Pace Publications Inc.

VARIABLE DESCRIPTIONS:

Line 1
Columns
 1 - 14  Manufacturer
15 - 29  Model
30 - 36  Type
         Small, Sporty, Compact, Midsize, Large - as defined in the _Consumer Reports_ article
38 - 41  Minimum Price (in $1,000) - Price for basic version of this model
43 - 46  Midrange Price (in $1,000) - Average of Min and Max prices
48 - 51  Maximum Price (in $1,000) - Price for a premium version
53 - 54  City MPG (miles per gallon by EPA rating)
56 - 57  Highway MPG
59 - 59  Air Bags standard
         0 = none, 1 = driver only, 2 = driver & passenger
61 - 61  Drive train type
         0 = rear wheel drive, 1 = front wheel drive, 2 = all wheel drive
63 - 63  Number of cylinders
65 - 67  Engine size (liters)
69 - 71  Horsepower (maximum)
73 - 76  RPM (revs per minute at maximum horsepower)

Line 2
Columns
 1 -  4  Engine revolutions per mile (in highest gear)
 6 -  6  Manual transmission available
         0 = No, 1 = Yes
 8 - 11  Fuel tank capacity (gallons)
13 - 13  Passenger capacity (persons)
15 - 17  Length (inches)
19 - 21  Wheelbase (inches)
23 - 24  Width (inches)
26 - 27  U-turn space (feet)
29 - 32  Rear seat room (inches)
34 - 35  Luggage capacity (cu. ft.)
37 - 40  Weight (pounds)
42 - 42  Domestic?
         0 = non-U.S. manufacturer, 1 = U.S. manufacturer

Values are aligned and delimited by blanks. Missing values are denoted with *. There are two data lines for each case.

SPECIAL NOTES:
The only missing values are for CYLINDERS in the rotary engine Mazda RX-7, REAR SEAT room for the two-seaters (Corvette and RX-7), and LUGGAGE capacity for the vans and two-seaters. WEIGHT is taken from the _Consumer Reports_ data and includes a full fuel tank, automatic transmission (if available), and air conditioning.

STORY BEHIND THE DATA:
Cars were selected at random from among 1993 passenger car models that were listed in both the _Consumer Reports_ issue and the _PACE Buying Guide_. Pickup trucks and Sport/Utility vehicles were eliminated due to incomplete information in the _Consumer Reports_ source. Duplicate models (e.g., Dodge Shadow and Plymouth Sundance) were listed at most once. A similar dataset for 1989 model cars appeared as one of the sample datasets shipped with the _Student Edition of Execustat_ (PWS-KENT 1990).

Further description can be found in the "Datasets and Stories" article "1993 New Car Data" in the _Journal of Statistics Education_ (Lock 1993). Send the message
    send jse/v1n1/datasets.lock
to the address archive@jse.stat.ncsu.edu

PEDAGOGICAL NOTES:
This is a multi-purpose dataset that can be used at many points in an introductory course. It includes many good numeric variables and several options for dividing the cars up into groups. Students tend to be familiar with most of the variables (and specific car models). They can anticipate and pose explanations for many of the relationships to be found in the data, although some surprises may be encountered. One can easily find examples of pairs of variables that demonstrate strong or weak, positive or negative associations. PRICE and MPG variables tend to be popular choices as "dependent" variables. Basic graphs will often reveal unusual data values (like the price for a Mercedes-Benz).

REFERENCES:
Lock, R. H. (1993), "1993 New Car Data," _Journal of Statistics Education_, 1, No. 1.
_Student Edition of Execustat_ (1990), Boston, MA: PWS-KENT Publishing Co.

SUBMITTED BY:
Robin H. Lock
Mathematics Department
St. Lawrence University
Canton, NY 13617
(315) 379-5960
rlock@stlawu.bitnet
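
The column layout under VARIABLE DESCRIPTIONS applies to the original two-line, fixed-width file, not to the ARFF version offered on this page (which drops attributes 2, 4, and 6). As a rough sketch of how one record in that original layout could be read, the following Python slices each field by its 1-based column range and maps `*` to a missing value. The field names, the helper names `parse_line` and `parse_record`, and the choice to treat a blank field as missing are assumptions made for illustration; they do not come from the source.

```python
# Sketch only: reads one two-line record of the original fixed-width layout.
# (name, first column, last column), 1-based and inclusive, from the description.
# The short field names are illustrative, not taken from the source file.
LINE1_FIELDS = [
    ("Manufacturer", 1, 14), ("Model", 15, 29), ("Type", 30, 36),
    ("Min_price", 38, 41), ("Midrange_price", 43, 46), ("Max_price", 48, 51),
    ("City_MPG", 53, 54), ("Highway_MPG", 56, 57), ("Air_bags", 59, 59),
    ("Drive_train", 61, 61), ("Cylinders", 63, 63), ("Engine_size", 65, 67),
    ("Horsepower", 69, 71), ("RPM", 73, 76),
]
LINE2_FIELDS = [
    ("Engine_revs_per_mile", 1, 4), ("Manual_transmission", 6, 6),
    ("Fuel_tank_capacity", 8, 11), ("Passenger_capacity", 13, 13),
    ("Length", 15, 17), ("Wheelbase", 19, 21), ("Width", 23, 24),
    ("U_turn_space", 26, 27), ("Rear_seat_room", 29, 32),
    ("Luggage_capacity", 34, 35), ("Weight", 37, 40), ("Domestic", 42, 42),
]
TEXT_FIELDS = {"Manufacturer", "Model", "Type"}


def parse_line(line, fields):
    """Slice one data line into a dict using 1-based inclusive column ranges."""
    record = {}
    for name, first, last in fields:
        raw = line[first - 1:last].strip()
        if raw == "*" or raw == "":
            # "*" marks a missing value in the source; treating a blank
            # field the same way is an assumption of this sketch.
            record[name] = None
        elif name in TEXT_FIELDS:
            record[name] = raw
        else:
            record[name] = float(raw)
    return record


def parse_record(line1, line2):
    """Combine the two data lines that make up one case."""
    record = parse_line(line1, LINE1_FIELDS)
    record.update(parse_line(line2, LINE2_FIELDS))
    return record
```

Calling `parse_record(line1, line2)` on a pair of data lines should yield a dictionary with 26 entries, one per variable in the original file. Coded fields such as air bags or drive train come back as floats here; mapping them to labels is left out to keep the sketch short.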

23 features

class (target)	numeric	81 unique values 0 missing
Manual_transmission_available	nominal	2 unique values 0 missing
Domestic	nominal	2 unique values 0 missing
Weight	numeric	81 unique values 0 missing
Luggage_capacity	numeric	16 unique values 11 missing
Rear_seat_room	numeric	24 unique values 2 missing
U-turn_space	numeric	14 unique values 0 missing
Width	numeric	16 unique values 0 missing
Wheelbase	numeric	27 unique values 0 missing
Length	numeric	51 unique values 0 missing
Passenger_capacity	numeric	6 unique values 0 missing
Fuel_tank_capacity	numeric	38 unique values 0 missing
Manufacturer	nominal	31 unique values 0 missing
Engine_revolutions_per_mile	numeric	78 unique values 0 missing
RPM	numeric	24 unique values 0 missing
Horsepower	numeric	57 unique values 0 missing
Engine_size	numeric	26 unique values 0 missing
Number_of_cylinders	numeric	5 unique values 1 missing
Drive_train_type	nominal	3 unique values 0 missing
Air_Bags_standard	nominal	3 unique values 0 missing
Highway_MPG	numeric	22 unique values 0 missing
City_MPG	numeric	21 unique values 0 missing
Type	nominal	6 unique values 0 missing


107 properties

NumberOfInstances

Number of instances (rows) of the dataset.

NumberOfFeatures

Number of attributes (columns) of the dataset.

NumberOfClasses

Number of distinct values of the target attribute (if it is nominal).

NumberOfMissingValues

Number of missing values in the dataset.

NumberOfInstancesWithMissingValues

Number of instances with at least one value missing.

NumberOfNumericFeatures

Number of numeric attributes.

NumberOfSymbolicFeatures

Number of nominal attributes.

AutoCorrelation

-6.5

Average class difference between consecutive instances.

CfsSubsetEval_DecisionStumpAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_DecisionStumpErrRate

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_DecisionStumpKappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesErrRate

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesKappa

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NErrRate

Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NKappa

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

ClassEntropy

Entropy of the target attribute values.

DecisionStumpAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump

DecisionStumpErrRate

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump

DecisionStumpKappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump

Dimensionality

0.25

Number of attributes divided by the number of instances.
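
The reported value can be checked against the counts given elsewhere on this page (23 features, 93 observations). A minimal check, with variable names chosen for illustration:

```python
n_features = 23   # "23 features" in the feature list on this page
n_instances = 93  # "93 observations" in the dataset description

dimensionality = n_features / n_instances
print(round(dimensionality, 2))  # 0.25
```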

EquivalentNumberOfAtts

Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
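
The ratio described above can be sketched in a few lines of Python. Because the target of auto93 is numeric, the example uses a small made-up nominal target and two made-up nominal attributes; the data, the function names, and the use of base-2 logarithms are all assumptions for illustration (the ratio itself does not depend on the logarithm base).

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of nominal values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned nominal sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def equivalent_number_of_atts(attributes, target):
    """ClassEntropy divided by the mean mutual information with the target."""
    mean_mi = sum(mutual_information(a, target) for a in attributes) / len(attributes)
    return entropy(target) / mean_mi


# Made-up toy data, not taken from auto93.
target = ["a", "a", "b", "b"]
informative = ["x", "x", "y", "y"]    # determines the target: 1 bit of MI
uninformative = ["p", "q", "p", "q"]  # independent of the target: 0 bits of MI

print(equivalent_number_of_atts([informative, uninformative], target))  # 2.0
```

Here the class entropy is 1 bit and the mean mutual information is 0.5 bits, so two "average" attributes would be needed to describe the class under the independence assumption.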

J48.00001.AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.00001.ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.00001.Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.0001.AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.0001.ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.0001.Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.001.AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001

J48.001.ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001

J48.001.Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001

MajorityClassPercentage

Percentage of instances belonging to the most frequent class.

MajorityClassSize

Number of instances belonging to the most frequent class.

MaxAttributeEntropy

Maximum entropy among attributes.

MaxKurtosisOfNumericAtts

Maximum kurtosis among attributes of the numeric type.

MaxMeansOfNumericAtts

5280.65

Maximum of means among attributes of the numeric type.

MaxMutualInformation

Maximum mutual information between the nominal attributes and the target attribute.

MaxNominalAttDistinctValues

The maximum number of distinct values among attributes of the nominal type.

MaxSkewnessOfNumericAtts

1.7

Maximum skewness among attributes of the numeric type.

MaxStdDevOfNumericAtts

596.73

Maximum standard deviation of attributes of the numeric type.

MeanAttributeEntropy

Average entropy of the attributes.

MeanKurtosisOfNumericAtts

0.67

Mean kurtosis among attributes of the numeric type.

MeanMeansOfNumericAtts

668.65

Mean of means among attributes of the numeric type.

MeanMutualInformation

Average mutual information between the nominal attributes and the target attribute.

MeanNoiseToSignalRatio

An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.

MeanNominalAttDistinctValues

7.83

Average number of distinct values among the attributes of the nominal type.

MeanSkewnessOfNumericAtts

0.45

Mean skewness among attributes of the numeric type.

MeanStdDevOfNumericAtts

105.72

Mean standard deviation of attributes of the numeric type.

MinAttributeEntropy

Minimal entropy among attributes.

MinKurtosisOfNumericAtts

-0.86

Minimum kurtosis among attributes of the numeric type.

MinMeansOfNumericAtts

2.67

Minimum of means among attributes of the numeric type.

MinMutualInformation

Minimal mutual information between the nominal attributes and the target attribute.

MinNominalAttDistinctValues

The minimal number of distinct values among attributes of the nominal type.

MinSkewnessOfNumericAtts

-0.26

Minimum skewness among attributes of the numeric type.

MinStdDevOfNumericAtts

1.04

Minimum standard deviation of attributes of the numeric type.

MinorityClassPercentage

Percentage of instances belonging to the least frequent class.

MinorityClassSize

Number of instances belonging to the least frequent class.

NaiveBayesAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NaiveBayesErrRate

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NaiveBayesKappa

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NumberOfBinaryFeatures

Number of binary attributes.

PercentageOfBinaryFeatures

8.7

Percentage of binary attributes.

PercentageOfInstancesWithMissingValues

11.83

Percentage of instances having missing values.

PercentageOfMissingValues

0.65

Percentage of missing values.

PercentageOfNumericFeatures

73.91

Percentage of numeric attributes.

PercentageOfSymbolicFeatures

26.09

Percentage of nominal attributes.
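
The percentage properties above follow from the feature table on this page. The sketch below recomputes them; the tuple layout and variable names are illustrative. One step is an inference rather than something the table states: the count of instances with missing values is taken to be 11, because the special notes in the description indicate that the cars missing CYLINDERS or REAR SEAT room are two-seaters, which are also missing LUGGAGE capacity.

```python
# (name, type, unique values, missing values) copied from the feature table.
FEATURES = [
    ("class", "numeric", 81, 0),
    ("Manual_transmission_available", "nominal", 2, 0),
    ("Domestic", "nominal", 2, 0),
    ("Weight", "numeric", 81, 0),
    ("Luggage_capacity", "numeric", 16, 11),
    ("Rear_seat_room", "numeric", 24, 2),
    ("U-turn_space", "numeric", 14, 0),
    ("Width", "numeric", 16, 0),
    ("Wheelbase", "numeric", 27, 0),
    ("Length", "numeric", 51, 0),
    ("Passenger_capacity", "numeric", 6, 0),
    ("Fuel_tank_capacity", "numeric", 38, 0),
    ("Manufacturer", "nominal", 31, 0),
    ("Engine_revolutions_per_mile", "numeric", 78, 0),
    ("RPM", "numeric", 24, 0),
    ("Horsepower", "numeric", 57, 0),
    ("Engine_size", "numeric", 26, 0),
    ("Number_of_cylinders", "numeric", 5, 1),
    ("Drive_train_type", "nominal", 3, 0),
    ("Air_Bags_standard", "nominal", 3, 0),
    ("Highway_MPG", "numeric", 22, 0),
    ("City_MPG", "numeric", 21, 0),
    ("Type", "nominal", 6, 0),
]
N_INSTANCES = 93

n_features = len(FEATURES)
n_numeric = sum(1 for f in FEATURES if f[1] == "numeric")
n_nominal = sum(1 for f in FEATURES if f[1] == "nominal")
n_binary = sum(1 for f in FEATURES if f[1] == "nominal" and f[2] == 2)
n_missing = sum(f[3] for f in FEATURES)
# Inference from the special notes: every row with any missing value is also
# missing Luggage_capacity, so the row count equals the largest column count.
rows_with_missing = max(f[3] for f in FEATURES)


def pct(part, whole):
    """Percentage rounded to two decimals, as displayed on the page."""
    return round(100 * part / whole, 2)


stats = {
    "PercentageOfBinaryFeatures": pct(n_binary, n_features),
    "PercentageOfInstancesWithMissingValues": pct(rows_with_missing, N_INSTANCES),
    "PercentageOfMissingValues": pct(n_missing, n_features * N_INSTANCES),
    "PercentageOfNumericFeatures": pct(n_numeric, n_features),
    "PercentageOfSymbolicFeatures": pct(n_nominal, n_features),
}
for name, value in stats.items():
    print(name, value)
```

The results agree with the values listed above: 8.7, 11.83, 0.65, 73.91, and 26.09.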

Quartile1AttributeEntropy

First quartile of entropy among attributes.

Quartile1KurtosisOfNumericAtts

-0.33

First quartile of kurtosis among attributes of the numeric type.

Quartile1MeansOfNumericAtts

15.28

First quartile of means among attributes of the numeric type.

Quartile1MutualInformation

First quartile of mutual information between the nominal attributes and the target attribute.

Quartile1SkewnessOfNumericAtts

-0.01

First quartile of skewness among attributes of the numeric type.

Quartile1StdDevOfNumericAtts

2.99

First quartile of standard deviation of attributes of the numeric type.

Quartile2AttributeEntropy

Second quartile (Median) of entropy among attributes.

Quartile2KurtosisOfNumericAtts

0.38

Second quartile (Median) of kurtosis among attributes of the numeric type.

Quartile2MeansOfNumericAtts

29.09

Second quartile (Median) of means among attributes of the numeric type.

Quartile2MutualInformation

Second quartile (Median) of mutual information between the nominal attributes and the target attribute.

Quartile2SkewnessOfNumericAtts

0.23

Second quartile (Median) of skewness among attributes of the numeric type.

Quartile2StdDevOfNumericAtts

5.33

Second quartile (Median) of standard deviation of attributes of the numeric type.

Quartile3AttributeEntropy

Third quartile of entropy among attributes.

Quartile3KurtosisOfNumericAtts

1.02

Third quartile of kurtosis among attributes of the numeric type.

Quartile3MeansOfNumericAtts

163.52

Third quartile of means among attributes of the numeric type.

Quartile3MutualInformation

Third quartile of mutual information between the nominal attributes and the target attribute.

Quartile3SkewnessOfNumericAtts

0.91

Third quartile of skewness among attributes of the numeric type.

Quartile3StdDevOfNumericAtts

33.49

Third quartile of standard deviation of attributes of the numeric type.

REPTreeDepth1AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth1ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth1Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth2AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth2ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth2Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth3AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3

REPTreeDepth3ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3

REPTreeDepth3Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3

RandomTreeDepth1AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth1ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth1Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth2AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth2ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth2Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth3AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

RandomTreeDepth3ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

RandomTreeDepth3Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

StdvNominalAttDistinctValues

11.44

Standard deviation of the number of distinct values among attributes of the nominal type.
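
The reported 11.44, and the MeanNominalAttDistinctValues of 7.83 listed earlier, can be reproduced from the unique-value counts of the six nominal features in the feature table. The reported figure matches the sample standard deviation (n - 1 denominator); that this is the estimator OpenML uses is inferred from the agreement of the numbers, not stated on the page.

```python
from statistics import mean, stdev

# Unique-value counts of the nominal features, from the feature table:
# Manual_transmission_available, Domestic, Manufacturer,
# Drive_train_type, Air_Bags_standard, Type
distinct_counts = [2, 2, 31, 3, 3, 6]

mean_distinct = mean(distinct_counts)
stdev_distinct = stdev(distinct_counts)  # sample standard deviation (n - 1)

print(round(mean_distinct, 2))   # 7.83
print(round(stdev_distinct, 2))  # 11.44
```

The large spread relative to the mean comes almost entirely from Manufacturer, which has 31 distinct values while the other nominal features have at most 6.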

kNN1NAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk

kNN1NErrRate

Error rate achieved by the landmarker weka.classifiers.lazy.IBk

kNN1NKappa

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk


18 tasks

Supervised Regression on auto93

0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: mean_absolute_error - target_feature: class

Supervised Regression on auto93

0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: mean_absolute_error - target_feature: class

Supervised Regression on auto93

0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: class

Supervised Regression on auto93

0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: class

Supervised Regression on auto93

0 runs - estimation_procedure: 5 times 2-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: class

Supervised Regression on auto93

0 runs - estimation_procedure: Custom 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: class

Supervised Regression on auto93

0 runs - estimation_procedure: Test on Training Data - evaluation_measure: predictive_accuracy - target_feature: class

Clustering on auto93

0 runs

Clustering on auto93