vehicle_sensIT

Status: active
Format: Sparse_ARFF
Visibility: public (publicly available)
Uploaded: 29-08-2014 by David
Downloads: 30 total, by 23 people
Likes: 0, issues: 0, downvotes: 0
Tags: concept_drift, mythbusting_1, study_1, study_15, study_20, study_41


Author: M. Duarte, Y. H. Hu
Source: [original](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets) - 2013-11-14
Please cite: M. Duarte and Y. H. Hu. Vehicle classification in distributed sensor networks. Journal of Parallel and Distributed Computing, 64(7):826-838, July 2004.

This is the SensIT Vehicle (combined) dataset, retrieved 2013-11-14 from the LibSVM site. In addition to the preprocessing done there (see the LibSVM site for details), this dataset was created as follows:

- Join the test and train datasets (2 files, already pre-combined).
- Relabel the classes: 1 and 2 become the positive class, 3 becomes the negative class.
- Normalize each file column-wise according to the following rules:
  - If a column contains only one value (constant feature), it is set to zero and thus removed by sparsity.
  - If a column contains two values (binary feature), the value occurring more often is set to zero and the other to one.
  - If a column contains more than two values (multinary/real feature), the column is divided by its standard deviation.
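The relabeling and column rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the uploader's actual script: the function names `normalize_column` and `relabel`, the 1/0 encoding of the positive and negative class, the use of the population standard deviation, and the tie-breaking for binary columns are all assumptions that the description does not specify.

```python
from collections import Counter
from statistics import pstdev


def normalize_column(values):
    """Apply the three column rules from the description to one column."""
    counts = Counter(values)
    if len(counts) == 1:
        # Constant feature: set to zero so it disappears in a sparse format.
        return [0.0] * len(values)
    if len(counts) == 2:
        # Binary feature: the more frequent value becomes 0, the other 1.
        # Assumption: on a tie, the value seen first counts as the majority.
        majority = counts.most_common(1)[0][0]
        return [0.0 if v == majority else 1.0 for v in values]
    # Multinary/real feature: divide by the standard deviation.
    # Assumption: population standard deviation (the source does not say).
    sd = pstdev(values)
    return [v / sd for v in values]


def relabel(original_class):
    """Map original classes 1 and 2 to positive, 3 to negative."""
    if original_class in (1, 2):
        return 1  # hypothetical encoding of the positive class
    if original_class == 3:
        return 0  # hypothetical encoding of the negative class
    raise ValueError(f"unexpected class label: {original_class!r}")


print(normalize_column([5, 5, 5]))
print(normalize_column([2, 7, 2, 2]))
print(pstdev(normalize_column([1.0, 2.0, 3.0, 6.0])))
```

Dividing a column by its own standard deviation leaves it with a standard deviation of 1, which is consistent with the minimum, mean, and maximum standard deviation of 1 reported for the numeric attributes in the property list.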

101 features

Y (target): nominal, 2 unique values, 0 missing
X1: numeric, 90612 unique values, 0 missing
X2: numeric, 83667 unique values, 0 missing
X3: numeric, 93241 unique values, 0 missing
X4: numeric, 91467 unique values, 0 missing
X5: numeric, 78817 unique values, 0 missing
X6: numeric, 84543 unique values, 0 missing
X7: numeric, 84445 unique values, 0 missing
X8: numeric, 92565 unique values, 0 missing
X9: numeric, 94507 unique values, 0 missing
X10: numeric, 95144 unique values, 0 missing
X11: numeric, 95227 unique values, 0 missing
X12: numeric, 96721 unique values, 0 missing
X13: numeric, 97419 unique values, 0 missing
X14: numeric, 97866 unique values, 0 missing
X15: numeric, 97836 unique values, 0 missing
X16: numeric, 97807 unique values, 0 missing
X17: numeric, 97734 unique values, 0 missing
X18: numeric, 97799 unique values, 0 missing
X19: numeric, 97732 unique values, 0 missing
X20: numeric, 97777 unique values, 0 missing
X21: numeric, 97762 unique values, 0 missing
X22: numeric, 97541 unique values, 0 missing
X23: numeric, 97438 unique values, 0 missing
X24: numeric, 97227 unique values, 0 missing
X25: numeric, 97271 unique values, 0 missing
X26: numeric, 97280 unique values, 0 missing
X27: numeric, 97346 unique values, 0 missing
X28: numeric, 97276 unique values, 0 missing
X29: numeric, 97273 unique values, 0 missing
X30: numeric, 97328 unique values, 0 missing
X31: numeric, 97475 unique values, 0 missing
X32: numeric, 97429 unique values, 0 missing
X33: numeric, 97414 unique values, 0 missing
X34: numeric, 97452 unique values, 0 missing
X35: numeric, 97445 unique values, 0 missing
X36: numeric, 97266 unique values, 0 missing
X37: numeric, 97316 unique values, 0 missing
X38: numeric, 97198 unique values, 0 missing
X39: numeric, 97291 unique values, 0 missing
X40: numeric, 97359 unique values, 0 missing
X41: numeric, 97194 unique values, 0 missing
X42: numeric, 97219 unique values, 0 missing
X43: numeric, 97187 unique values, 0 missing
X44: numeric, 97212 unique values, 0 missing
X45: numeric, 97250 unique values, 0 missing
X46: numeric, 97178 unique values, 0 missing
X47: numeric, 97180 unique values, 0 missing
X48: numeric, 97239 unique values, 0 missing
X49: numeric, 97171 unique values, 0 missing
X50: numeric, 97155 unique values, 0 missing
X51: numeric, 97873 unique values, 0 missing
X52: numeric, 73164 unique values, 0 missing
X53: numeric, 72925 unique values, 0 missing
X54: numeric, 86238 unique values, 0 missing
X55: numeric, 92160 unique values, 0 missing
X56: numeric, 95063 unique values, 0 missing
X57: numeric, 94849 unique values, 0 missing
X58: numeric, 96214 unique values, 0 missing
X59: numeric, 96389 unique values, 0 missing
X60: numeric, 96484 unique values, 0 missing
X61: numeric, 96837 unique values, 0 missing
X62: numeric, 96845 unique values, 0 missing
X63: numeric, 97026 unique values, 0 missing
X64: numeric, 97039 unique values, 0 missing
X65: numeric, 97039 unique values, 0 missing
X66: numeric, 97126 unique values, 0 missing
X67: numeric, 97132 unique values, 0 missing
X68: numeric, 97107 unique values, 0 missing
X69: numeric, 97162 unique values, 0 missing
X70: numeric, 97157 unique values, 0 missing
X71: numeric, 96984 unique values, 0 missing
X72: numeric, 96817 unique values, 0 missing
X73: numeric, 96994 unique values, 0 missing
X74: numeric, 97019 unique values, 0 missing
X75: numeric, 97082 unique values, 0 missing
X76: numeric, 97111 unique values, 0 missing
X77: numeric, 97269 unique values, 0 missing
X78: numeric, 97320 unique values, 0 missing
X79: numeric, 97220 unique values, 0 missing
X80: numeric, 97392 unique values, 0 missing
X81: numeric, 97384 unique values, 0 missing
X82: numeric, 97386 unique values, 0 missing
X83: numeric, 97449 unique values, 0 missing
X84: numeric, 97365 unique values, 0 missing
X85: numeric, 97410 unique values, 0 missing
X86: numeric, 97316 unique values, 0 missing
X87: numeric, 97361 unique values, 0 missing
X88: numeric, 97371 unique values, 0 missing
X89: numeric, 97368 unique values, 0 missing
X90: numeric, 97311 unique values, 0 missing
X91: numeric, 97315 unique values, 0 missing
X92: numeric, 97370 unique values, 0 missing
X93: numeric, 97377 unique values, 0 missing
X94: numeric, 97318 unique values, 0 missing
X95: numeric, 97387 unique values, 0 missing
X96: numeric, 97361 unique values, 0 missing
X97: numeric, 97373 unique values, 0 missing
X98: numeric, 97344 unique values, 0 missing
X99: numeric, 97282 unique values, 0 missing
X100: numeric, 97340 unique values, 0 missing

107 properties

Number of instances (rows) of the dataset: 98528
Number of attributes (columns) of the dataset: 101
Number of distinct values of the target attribute (if it is nominal): 2
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 100
Number of nominal attributes: 1
Average class difference between consecutive instances: 0.5
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.84
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.16
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.68
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.84
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.16
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.68
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.84
Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.16
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.68
Entropy of the target attribute values: 1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump: 0.79
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump: 0.21
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump: 0.58
Number of attributes divided by the number of instances: 0
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation: (no value shown)
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001: 0.82
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001: 0.17
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001: 0.67
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001: 0.82
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001: 0.17
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001: 0.67
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001: 0.82
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001: 0.17
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001: 0.67
Percentage of instances belonging to the most frequent class: 50
Number of instances belonging to the most frequent class: 49264
Maximum entropy among attributes: (no value shown)
Maximum kurtosis among attributes of the numeric type: 119.15
Maximum of means among attributes of the numeric type: 0.91
Maximum mutual information between the nominal attributes and the target attribute: (no value shown)
The maximum number of distinct values among attributes of the nominal type: 2
Maximum skewness among attributes of the numeric type: 7.98
Maximum standard deviation of attributes of the numeric type: 1
Average entropy of the attributes: (no value shown)
Mean kurtosis among attributes of the numeric type: 34.02
Mean of means among attributes of the numeric type: -0.35
Average mutual information between the nominal attributes and the target attribute: (no value shown)
An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation: (no value shown)
Average number of distinct values among the attributes of the nominal type: 2
Mean skewness among attributes of the numeric type: 3.37
Mean standard deviation of attributes of the numeric type: 1
Minimal entropy among attributes: (no value shown)
Minimum kurtosis among attributes of the numeric type: -1.44
Minimum of means among attributes of the numeric type: -1.47
Minimal mutual information between the nominal attributes and the target attribute: (no value shown)
The minimal number of distinct values among attributes of the nominal type: 2
Minimum skewness among attributes of the numeric type: -1.43
Minimum standard deviation of attributes of the numeric type: 1
Percentage of instances belonging to the least frequent class: 50
Number of instances belonging to the least frequent class: 49264
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes: 0.85
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes: 0.19
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes: 0.61
Number of binary attributes: 1
Percentage of binary attributes: 0.99
Percentage of instances having missing values: 0
Percentage of missing values: 0
Percentage of numeric attributes: 99.01
Percentage of nominal attributes: 0.99
First quartile of entropy among attributes: (no value shown)
First quartile of kurtosis among attributes of the numeric type: 15.01
First quartile of means among attributes of the numeric type: -0.85
First quartile of mutual information between the nominal attributes and the target attribute: (no value shown)
First quartile of skewness among attributes of the numeric type: 2.41
First quartile of standard deviation of attributes of the numeric type: 1
Second quartile (Median) of entropy among attributes: (no value shown)
Second quartile (Median) of kurtosis among attributes of the numeric type: 19.02
Second quartile (Median) of means among attributes of the numeric type: -0.24
Second quartile (Median) of mutual information between the nominal attributes and the target attribute: (no value shown)
Second quartile (Median) of skewness among attributes of the numeric type: 3.31
Second quartile (Median) of standard deviation of attributes of the numeric type: 1
Third quartile of entropy among attributes: (no value shown)
Third quartile of kurtosis among attributes of the numeric type: 50.91
Third quartile of means among attributes of the numeric type: -0.09
Third quartile of mutual information between the nominal attributes and the target attribute: (no value shown)
Third quartile of skewness among attributes of the numeric type: 5
Third quartile of standard deviation of attributes of the numeric type: 1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1: 0.89
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1: 0.15
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1: 0.69
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2: 0.89
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2: 0.15
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2: 0.69
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3: 0.89
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3: 0.15
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3: 0.69
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: 0.78
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: 0.22
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: 0.57
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: 0.78
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: 0.22
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: 0.57
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: 0.78
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: 0.22
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: 0.57
Standard deviation of the number of distinct values among attributes of the nominal type: 0
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk: 0.74
Error rate achieved by the landmarker weka.classifiers.lazy.IBk: 0.26
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk: 0.48
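
Several of the properties above follow directly from the class counts, and a short sketch may make the relationships easier to check. The helper names `class_entropy` and `kappa_from_accuracy` are hypothetical. The kappa shortcut assumes the true classes are exactly balanced, so that the chance agreement is 0.5; the page's own kappa values were presumably computed from full confusion matrices and are rounded, so agreement to two decimals is approximate.

```python
import math


def class_entropy(class_counts):
    """Shannon entropy (in bits) of the class distribution."""
    total = sum(class_counts)
    return -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )


def kappa_from_accuracy(accuracy, expected_agreement=0.5):
    """Cohen's kappa = (observed - expected) / (1 - expected).

    Assumption: with two perfectly balanced true classes the expected
    (chance) agreement is 0.5, whatever the classifier predicts.
    """
    return (accuracy - expected_agreement) / (1 - expected_agreement)


counts = [49264, 49264]  # instances in each class, as reported above
print(class_entropy(counts))             # 1.0
print(100 * max(counts) / sum(counts))   # 50.0

# DecisionStump: reported error rate 0.21, reported kappa 0.58.
print(round(kappa_from_accuracy(1 - 0.21), 2))
```

Under the balanced-class assumption kappa reduces to 2 * accuracy - 1, which is why the reported kappa values track the error rates so closely.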

15 tasks

233 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: Y
129 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: Y
0 runs - estimation_procedure: 33% Holdout set - evaluation_measure: predictive_accuracy - target_feature: Y
41 runs - estimation_procedure: Interleaved Test then Train - target_feature: Y
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
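
For readers unfamiliar with the "Interleaved Test then Train" procedure listed among the tasks, here is a minimal sketch of the idea: each instance is first used to test the current model and only afterwards to train it. The `MajorityClass` learner, its `predict`/`learn` interface, and the function name `prequential_accuracy` are hypothetical and only serve to illustrate the loop; they do not reflect the OpenML implementation.

```python
class MajorityClass:
    """Toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self, default=0):
        self.counts = {}
        self.default = default  # prediction before any label has been seen

    def predict(self, features):
        if not self.counts:
            return self.default
        return max(self.counts, key=self.counts.get)

    def learn(self, features, label):
        self.counts[label] = self.counts.get(label, 0) + 1


def prequential_accuracy(stream, model):
    """Interleaved test-then-train accuracy over a stream of (x, y) pairs."""
    correct = 0
    total = 0
    for features, label in stream:
        # Test first: predict before the label is revealed to the model.
        if model.predict(features) == label:
            correct += 1
        total += 1
        # Then train on the same instance.
        model.learn(features, label)
    return correct / total if total else 0.0


# Tiny made-up stream; the feature tuples are ignored by the toy learner.
stream = [((0.1,), 1), ((0.2,), 1), ((0.3,), 0), ((0.4,), 1)]
print(prequential_accuracy(stream, MajorityClass()))  # 0.5
```

Because every instance is tested before it is learned from, this procedure needs no separate holdout set, which is one reason it is commonly used for data-stream and concept-drift experiments.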