OpenML

kdd_internet_usage

Status: active. Format: ARFF. Visibility: public. Uploaded 04-10-2014 by Felicia West.

Author: (not given)
Source: Unknown - Date unknown
Please cite: (not given)

Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by relabeling the majority class as positive ('P') and all other classes as negative ('N'). Originally converted by Quan Sun.
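The relabeling rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the conversion script that was actually used: the function name `binarize_target` and the sample labels are made up for the example, and ties between equally frequent classes are broken by first occurrence.

```python
from collections import Counter


def binarize_target(labels):
    """Relabel the most frequent class as 'P' and every other class as 'N'."""
    majority_label, _ = Counter(labels).most_common(1)[0]
    return ["P" if label == majority_label else "N" for label in labels]


# Hypothetical multi-class labels, invented for illustration only.
original = ["yes", "no", "yes", "unknown", "yes", "no"]
binary = binarize_target(original)
print(binary)  # ['P', 'N', 'P', 'N', 'P', 'N']
```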

69 features

Who_Pays_for_Access_Work (target)	nominal	2 unique values 0 missing
Not_Purchasing_Not_option	nominal	2 unique values 0 missing
Opinions_on_Censorship	nominal	4 unique values 0 missing
Not_Purchasing_Not_applicable	nominal	2 unique values 0 missing
Not_Purchasing_No_credit	nominal	2 unique values 0 missing
Not_Purchasing_Never_tried	nominal	2 unique values 0 missing
Not_Purchasing_Judge_quality	nominal	2 unique values 0 missing
Not_Purchasing_Enough_info	nominal	2 unique values 0 missing
Not_Purchasing_Easier_locally	nominal	2 unique values 0 missing
Not_Purchasing_Company_policy	nominal	2 unique values 0 missing
Not_Purchasing_Cant_find	nominal	2 unique values 0 missing
Not_Purchasing_Bad_press	nominal	2 unique values 0 missing
Not_Purchasing_Bad_experience	nominal	2 unique values 0 missing
Race	nominal	8 unique values 0 missing
Primary_Place_of_WWW_Access	nominal	9 unique values 0 missing
Primary_Language	nominal	119 unique values 0 missing
Primary_Computing_Platform	nominal	11 unique values 2699 missing
Most_Import_Issue_Facing_the_Internet	nominal	9 unique values 0 missing
Registered_to_Vote	nominal	4 unique values 0 missing
Who_Pays_for_Access_Self	nominal	2 unique values 0 missing
Who_Pays_for_Access_School	nominal	2 unique values 0 missing
Who_Pays_for_Access_Parents	nominal	2 unique values 0 missing
Who_Pays_for_Access_Other	nominal	2 unique values 0 missing
Who_Pays_for_Access_Dont_Know	nominal	2 unique values 0 missing
Web_Page_Creation	nominal	3 unique values 0 missing
Web_Ordering	nominal	3 unique values 0 missing
Sexual_Preference	nominal	6 unique values 0 missing
Not_Purchasing_Other	nominal	2 unique values 0 missing
Not_Purchasing_Unfamiliar_vendor	nominal	2 unique values 0 missing
Not_Purchasing_Uncomfortable	nominal	2 unique values 0 missing
Not_Purchasing_Too_complicated	nominal	2 unique values 0 missing
Not_Purchasing_Security	nominal	2 unique values 0 missing
Not_Purchasing_Receipt	nominal	2 unique values 0 missing
Not_Purchasing_Privacy	nominal	2 unique values 0 missing
Not_Purchasing_Prefer_people	nominal	2 unique values 0 missing
Community_Membership_Professional	nominal	2 unique values 0 missing
Disability_Not_Say	nominal	2 unique values 0 missing
Disability_Not_Impaired	nominal	2 unique values 0 missing
Disability_Motor	nominal	2 unique values 0 missing
Disability_Hearing	nominal	2 unique values 0 missing
Disability_Cognitive	nominal	2 unique values 0 missing
Country	nominal	129 unique values 0 missing
Community_Membership_Support	nominal	2 unique values 0 missing
Community_Membership_Religious	nominal	2 unique values 0 missing
Disability_Vision	nominal	2 unique values 0 missing
Community_Membership_Political	nominal	2 unique values 0 missing
Community_Membership_Other	nominal	2 unique values 0 missing
Community_Membership_None	nominal	2 unique values 0 missing
Community_Membership_Hobbies	nominal	2 unique values 0 missing
Community_Membership_Family	nominal	2 unique values 0 missing
Community_Building	nominal	4 unique values 0 missing
Age	nominal	77 unique values 0 missing
How_You_Heard_About_Survey_Others	nominal	2 unique values 0 missing
Marital_Status	nominal	7 unique values 0 missing
Major_Occupation	nominal	5 unique values 0 missing
Major_Geographical_Location	nominal	10 unique values 0 missing
How_You_Heard_About_Survey_WWW_Page	nominal	2 unique values 0 missing
How_You_Heard_About_Survey_Usenet_News	nominal	2 unique values 0 missing
How_You_Heard_About_Survey_Search_Engine	nominal	2 unique values 0 missing
How_You_Heard_About_Survey_Remebered	nominal	2 unique values 0 missing
How_You_Heard_About_Survey_Printed_Media	nominal	2 unique values 0 missing
Actual_Time	nominal	46 unique values 0 missing
How_You_Heard_About_Survey_Mailing_List	nominal	2 unique values 0 missing
How_You_Heard_About_Survey_Friend	nominal	2 unique values 0 missing
How_You_Heard_About_Survey_Banner	nominal	2 unique values 0 missing
Household_Income	nominal	9 unique values 0 missing
Gender	nominal	2 unique values 0 missing
Falsification_of_Information	nominal	7 unique values 0 missing
Education_Attainment	nominal	9 unique values 0 missing

107 properties

NumberOfInstances

10108

Number of instances (rows) of the dataset.

NumberOfFeatures

69

Number of attributes (columns) of the dataset.

NumberOfClasses

2

Number of distinct values of the target attribute (if it is nominal).

NumberOfMissingValues

2699

Number of missing values in the dataset.

NumberOfInstancesWithMissingValues

2699

Number of instances with at least one value missing.

NumberOfNumericFeatures

0

Number of numeric attributes.

NumberOfSymbolicFeatures

69

Number of nominal attributes.

AutoCorrelation

0.61

Average class difference between consecutive instances.

CfsSubsetEval_DecisionStumpAUC

0.91

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
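For readers unfamiliar with the AUC figures reported for the landmarkers, the sketch below computes the area under the ROC curve as the fraction of (positive, negative) pairs that a scorer ranks correctly, counting ties as half. It is a plain-Python illustration, not Weka's implementation; the function name `roc_auc`, the scores, and the labels are assumptions made up for the example.

```python
def roc_auc(scores, labels, positive="P"):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for the positive class 'P'.
scores = [0.9, 0.8, 0.7, 0.4, 0.3]
labels = ["P", "P", "N", "P", "N"]
auc = roc_auc(scores, labels)
print(round(auc, 4))  # 0.8333
```

The pairwise loop is quadratic in the number of instances, which is fine for a toy example but would be replaced by a rank-based computation on a dataset of this size.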

CfsSubsetEval_DecisionStumpErrRate

0.13

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_DecisionStumpKappa

0.66

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
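The Kappa values on this page measure agreement beyond chance. A minimal sketch of Cohen's kappa for a two-class confusion matrix follows; the function name and the first confusion matrix are hypothetical. The second call uses the class sizes listed on this page (7393 and 2715) to show that a classifier that always predicts the majority class scores a kappa of 0, even though its accuracy is about 73%.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical confusion matrix, invented for illustration.
kappa = cohen_kappa(tp=40, fp=10, fn=5, tn=45)
print(round(kappa, 2))  # 0.7

# Always predicting the majority class on this dataset's class sizes.
majority_only = cohen_kappa(tp=7393, fp=2715, fn=0, tn=0)
print(round(majority_only, 2))  # 0.0
```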

CfsSubsetEval_NaiveBayesAUC

0.91

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesErrRate

0.13

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesKappa

0.66

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NAUC

0.91

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NErrRate

0.13

Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NKappa

0.66

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

ClassEntropy

0.84

Entropy of the target attribute values.
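As a cross-check, the reported ClassEntropy can be reproduced from the two class sizes given under MajorityClassSize (7393) and MinorityClassSize (2715) on this page. The sketch assumes entropy is measured in bits (base-2 logarithm), which is consistent with the reported value; the helper name `entropy_bits` is made up for the example.

```python
import math


def entropy_bits(counts):
    """Shannon entropy in bits of a distribution given as class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


class_sizes = [7393, 2715]  # majority and minority class sizes from this page
class_entropy = entropy_bits(class_sizes)
print(round(class_entropy, 2))  # 0.84
```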

DecisionStumpAUC

0.77

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump

DecisionStumpErrRate

0.27

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump

DecisionStumpKappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump

Dimensionality

0.01

Number of attributes divided by the number of instances.

EquivalentNumberOfAtts

40.3

Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
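The ratio can be checked against the rounded values shown on this page. Dividing the displayed ClassEntropy (0.84) by the displayed MeanMutualInformation (0.02) gives 42 rather than 40.3; the gap is presumably rounding in the displayed figures, since the mean mutual information implied by 40.3 still rounds to 0.02. The variable names below are chosen for the example.

```python
class_entropy = 0.84               # as displayed on this page
mean_mutual_information = 0.02     # as displayed on this page (rounded)
reported_equivalent_atts = 40.3    # as displayed on this page

# Ratio computed from the rounded, displayed values.
from_rounded = class_entropy / mean_mutual_information
print(round(from_rounded, 1))  # 42.0

# Mean mutual information implied by the reported ratio.
implied_mean_mi = class_entropy / reported_equivalent_atts
print(round(implied_mean_mi, 4))  # 0.0208
```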

J48.00001.AUC

0.91

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.00001.ErrRate

0.11

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.00001.Kappa

0.7

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.0001.AUC

0.91

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.0001.ErrRate

0.11

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.0001.Kappa

0.7

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.001.AUC

0.91

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001

J48.001.ErrRate

0.11

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001

J48.001.Kappa

0.7

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001

MajorityClassPercentage

73.14

Percentage of instances belonging to the most frequent class.

MajorityClassSize

7393

Number of instances belonging to the most frequent class.

MaxAttributeEntropy

5.71

Maximum entropy among attributes.

MaxKurtosisOfNumericAtts

Maximum kurtosis among attributes of the numeric type.

MaxMeansOfNumericAtts

Maximum of means among attributes of the numeric type.

MaxMutualInformation

0.38

Maximum mutual information between the nominal attributes and the target attribute.

MaxNominalAttDistinctValues

129

The maximum number of distinct values among attributes of the nominal type.

MaxSkewnessOfNumericAtts

Maximum skewness among attributes of the numeric type.

MaxStdDevOfNumericAtts

Maximum standard deviation of attributes of the numeric type.

MeanAttributeEntropy

Average entropy of the attributes.

MeanKurtosisOfNumericAtts

Mean kurtosis among attributes of the numeric type.

MeanMeansOfNumericAtts

Mean of means among attributes of the numeric type.

MeanMutualInformation

0.02

Average mutual information between the nominal attributes and the target attribute.

MeanNoiseToSignalRatio

47.03

An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
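The formula can be illustrated on a toy table. The sketch below computes attribute entropy and mutual information with the target for nominal columns, then applies the ratio as defined above. The helper names and the four-row data are assumptions for illustration; entropies are in bits, and mutual information is computed as H(X) + H(Y) - H(X, Y).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of nominal values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def mutual_information(attribute, target):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two nominal sequences."""
    return entropy(attribute) + entropy(target) - entropy(list(zip(attribute, target)))


# Toy data: attribute "a" determines the target, attribute "b" is independent of it.
target = ["P", "P", "N", "N"]
attributes = {
    "a": ["x", "x", "y", "y"],
    "b": ["u", "v", "u", "v"],
}

mean_attribute_entropy = sum(entropy(col) for col in attributes.values()) / len(attributes)
mean_mi = sum(mutual_information(col, target) for col in attributes.values()) / len(attributes)
noise_to_signal = (mean_attribute_entropy - mean_mi) / mean_mi
print(mean_attribute_entropy, mean_mi, noise_to_signal)  # 1.0 0.5 1.0
```

A ratio of 1.0 here means the attributes carry as much irrelevant information as relevant information about the class; the value of 47.03 reported for this dataset indicates far more irrelevant than relevant information per attribute.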

MeanNominalAttDistinctValues

8.36

Average number of distinct values among the attributes of the nominal type.

MeanSkewnessOfNumericAtts

Mean skewness among attributes of the numeric type.

MeanStdDevOfNumericAtts

Mean standard deviation of attributes of the numeric type.

MinAttributeEntropy

0.04

Minimal entropy among attributes.

MinKurtosisOfNumericAtts

Minimum kurtosis among attributes of the numeric type.

MinMeansOfNumericAtts

Minimum of means among attributes of the numeric type.

MinMutualInformation

Minimal mutual information between the nominal attributes and the target attribute.

MinNominalAttDistinctValues

The minimal number of distinct values among attributes of the nominal type.

MinSkewnessOfNumericAtts

Minimum skewness among attributes of the numeric type.

MinStdDevOfNumericAtts

Minimum standard deviation of attributes of the numeric type.

MinorityClassPercentage

26.86

Percentage of instances belonging to the least frequent class.

MinorityClassSize

2715

Number of instances belonging to the least frequent class.

NaiveBayesAUC

0.92

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NaiveBayesErrRate

0.15

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NaiveBayesKappa

0.65

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NumberOfBinaryFeatures

49

Number of binary attributes.

PercentageOfBinaryFeatures

71.01

Percentage of binary attributes.

PercentageOfInstancesWithMissingValues

26.7

Percentage of instances having missing values.

PercentageOfMissingValues

0.39

Percentage of missing values.
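Both missing-value percentages follow from counts already given on this page: 2699 missing values, 2699 instances with missing values, 10108 instances, and 69 features. The short check below reproduces them; the variable names are chosen for the example.

```python
n_instances = 10108
n_features = 69
n_missing_values = 2699
n_instances_with_missing = 2699

# Share of rows that have at least one missing value.
pct_instances_with_missing = 100 * n_instances_with_missing / n_instances
# Share of all cells (rows x columns) that are missing.
pct_missing_values = 100 * n_missing_values / (n_instances * n_features)

print(round(pct_instances_with_missing, 2))  # 26.7
print(round(pct_missing_values, 2))          # 0.39
```

The two counts are equal because only Primary_Computing_Platform has missing values in the feature table, so each affected instance is missing exactly one value.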

PercentageOfNumericFeatures

0

Percentage of numeric attributes.

PercentageOfSymbolicFeatures

100

Percentage of nominal attributes.

Quartile1AttributeEntropy

0.37

First quartile of entropy among attributes.

Quartile1KurtosisOfNumericAtts

First quartile of kurtosis among attributes of the numeric type.

Quartile1MeansOfNumericAtts

First quartile of means among attributes of the numeric type.

Quartile1MutualInformation

First quartile of mutual information between the nominal attributes and the target attribute.

Quartile1SkewnessOfNumericAtts

First quartile of skewness among attributes of the numeric type.

Quartile1StdDevOfNumericAtts

First quartile of standard deviation of attributes of the numeric type.

Quartile2AttributeEntropy

0.64

Second quartile (Median) of entropy among attributes.

Quartile2KurtosisOfNumericAtts

Second quartile (Median) of kurtosis among attributes of the numeric type.

Quartile2MeansOfNumericAtts

Second quartile (Median) of means among attributes of the numeric type.

Quartile2MutualInformation

Second quartile (Median) of mutual information between the nominal attributes and the target attribute.

Quartile2SkewnessOfNumericAtts

Second quartile (Median) of skewness among attributes of the numeric type.

Quartile2StdDevOfNumericAtts

Second quartile (Median) of standard deviation of attributes of the numeric type.

Quartile3AttributeEntropy

0.98

Third quartile of entropy among attributes.

Quartile3KurtosisOfNumericAtts

Third quartile of kurtosis among attributes of the numeric type.

Quartile3MeansOfNumericAtts

Third quartile of means among attributes of the numeric type.

Quartile3MutualInformation

0.01

Third quartile of mutual information between the nominal attributes and the target attribute.

Quartile3SkewnessOfNumericAtts

Third quartile of skewness among attributes of the numeric type.

Quartile3StdDevOfNumericAtts

Third quartile of standard deviation of attributes of the numeric type.

REPTreeDepth1AUC

0.91

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth1ErrRate

0.12

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth1Kappa

0.69

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth2AUC

0.91

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth2ErrRate

0.12

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth2Kappa

0.69

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth3AUC

0.91

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3

REPTreeDepth3ErrRate

0.12

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3

REPTreeDepth3Kappa

0.69

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3

RandomTreeDepth1AUC

0.66

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth1ErrRate

0.26

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth1Kappa

0.29

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth2AUC

0.66

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth2ErrRate

0.26

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth2Kappa

0.29

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth3AUC

0.66

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

RandomTreeDepth3ErrRate

0.26

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

RandomTreeDepth3Kappa

0.29

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

StdvNominalAttDistinctValues

22.7

Standard deviation of the number of distinct values among attributes of the nominal type.
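The distinct-value statistics can be re-derived from the "unique values" column of the feature table on this page. The sketch below lists 49 binary features plus the 20 non-binary counts in page order. It assumes the target is included and that the standard deviation is the sample (n - 1) version; both assumptions are consistent with the reported 8.36 and 22.7.

```python
import statistics

binary_feature_count = 49  # features listed with "2 unique values"
# Distinct-value counts of the remaining 20 features, in page order.
other_distinct_counts = [4, 8, 9, 119, 11, 9, 4, 3, 3, 6,
                         129, 4, 77, 7, 5, 10, 46, 9, 7, 9]
distinct_counts = [2] * binary_feature_count + other_distinct_counts

mean_distinct = statistics.mean(distinct_counts)
stdev_distinct = statistics.stdev(distinct_counts)  # sample standard deviation
pct_binary = 100 * binary_feature_count / len(distinct_counts)

print(len(distinct_counts))      # 69
print(round(mean_distinct, 2))   # 8.36
print(round(stdev_distinct, 1))  # 22.7
print(max(distinct_counts))      # 129
print(round(pct_binary, 2))      # 71.01
```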

kNN1NAUC

0.81

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk

kNN1NErrRate

0.18

Error rate achieved by the landmarker weka.classifiers.lazy.IBk

kNN1NKappa

0.51

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk

17 tasks

Supervised Classification on kdd_internet_usage

343 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: Who_Pays_for_Access_Work
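To make the 10-fold cross-validation procedure concrete, here is a minimal sketch that splits the 10108 instance indices into ten folds. It is a plain shuffled split, not stratified by class, and it does not reproduce the exact splits that OpenML distributes with a task; the function name and seed are assumptions for the example.

```python
import random


def k_fold_indices(n_instances, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for test_fold in folds:
        held_out = set(test_fold)
        train = [i for i in indices if i not in held_out]
        yield train, test_fold


splits = list(k_fold_indices(10108, k=10))
fold_sizes = sorted(len(test) for _, test in splits)
print(fold_sizes[0], fold_sizes[-1])  # 1010 1011
```

Each instance appears in exactly one test fold, so predictive_accuracy can be computed over all 10108 out-of-fold predictions.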

Supervised Classification on kdd_internet_usage

209 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: Who_Pays_for_Access_Work

Supervised Classification on kdd_internet_usage

0 runs - estimation_procedure: 33% Holdout set - target_feature: Who_Pays_for_Access_Work

Supervised Classification on kdd_internet_usage

0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: mean_precision - target_feature: Country

Learning Curve on kdd_internet_usage

70 runs - estimation_procedure: 10-fold Learning Curve - target_feature: Who_Pays_for_Access_Work

Supervised Data Stream Classification on kdd_internet_usage

0 runs - estimation_procedure: Interleaved Test then Train - target_feature: Who_Pays_for_Access_Work
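Interleaved test-then-train (prequential) evaluation scores each instance before the learner is allowed to train on it. The sketch below shows the loop with a deliberately trivial majority-class learner that ignores the features; the function name and the label stream are hypothetical, and a real data stream task would plug in an incremental classifier instead.

```python
from collections import Counter


def prequential_accuracy(label_stream):
    """Test-then-train accuracy of a majority-class learner over a label stream."""
    seen = Counter()
    correct = 0
    total = 0
    for label in label_stream:
        # Test first: predict the most frequent label seen so far (None at the start).
        prediction = seen.most_common(1)[0][0] if seen else None
        correct += prediction == label
        total += 1
        # Then train: update the learner with the true label.
        seen[label] += 1
    return correct / total


# Hypothetical stream of target labels, invented for illustration.
stream = ["P", "P", "N", "P", "N", "P"]
accuracy = prequential_accuracy(stream)
print(accuracy)  # 0.5
```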

Clustering on kdd_internet_usage

0 runs

Clustering on kdd_internet_usage

0 runs - estimation_procedure: 50 times Clustering