Census-Income-KDD

active ARFF Publicly available Visibility: public Uploaded 07-12-2020 by Richard Davis

Author: Terran Lane and Ronny Kohavi, Data Mining and Visualization, Silicon Graphics
Source: [original](https://archive.ics.uci.edu/ml/datasets/Census-Income+(KDD)) - 2000
Please cite: Dua, D. and Graff, C. (2019). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

This version has feature names based on https://www2.1010data.com/documentationcenter/beta/Tutorials/MachineLearningExamples/CensusIncomeDataSet.html

Missing data is also properly encoded in this version. The feature 'unknown' in the dataset does not appear in that source's feature list. It possibly refers to the feature 'instance weight' in the original UCI description.

Feature descriptions, each followed by its column name:

* Age of the worker: age
* Class of worker: class_worker
* Industry code: det_ind_code
* Occupation code: det_occ_code
* Level of education: education
* Wage per hour: wage_per_hour
* Enrolled in educational institution last week: hs_college
* Marital status: marital_stat
* Major industry code: major_ind_code
* Major occupation code: major_occ_code
* Race: race
* Hispanic origin: hisp_origin
* Sex: sex
* Member of a labor union: union_member
* Reason for unemployment: unemp_reason
* Full- or part-time employment status: full_or_part_emp
* Capital gains: capital_gains
* Capital losses: capital_losses
* Dividends from stocks: stock_dividends
* Tax filer status: tax_filer_stat
* Region of previous residence: region_prev_res
* State of previous residence: state_prev_res
* Detailed household and family status: det_hh_fam_stat
* Detailed household summary in household: det_hh_summ
* Unknown: unknown
* Migration code - change in MSA: mig_chg_msa
* Migration code - change in region: mig_chg_reg
* Migration code - move within region: mig_move_reg
* Live in this house one year ago: mig_same
* Migration - previous residence in sunbelt: mig_prev_sunbelt
* Number of persons that worked for employer: num_emp
* Family members under 18: fam_under_18
* Country of birth father: country_father
* Country of birth mother: country_mother
* Country of birth: country_self
* Citizenship: citizenship
* Own business or self-employed?: own_or_self
* Fill included questionnaire for Veterans Admin.: vet_question
* Veterans benefits: vet_benefits
* Weeks worked in the year: weeks_worked
* Year of survey: year
* Income less than or greater than $50,000: income_50k
* Number of years of education: edu_year
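Because the description only says that 'unknown' possibly corresponds to the UCI instance weight, any code that uses it as a weight rests on that assumption. The following is a minimal sketch of how one record might be split into features, target, and weight under that assumption. The helper `split_record` and the toy record are hypothetical and are not part of the dataset or of any OpenML API.

```python
# Column names taken from the feature list in the description.
TARGET_COLUMN = "income_50k"
# Assumption: 'unknown' is the UCI "instance weight"; the description
# only says this is possible, not confirmed.
WEIGHT_COLUMN = "unknown"


def split_record(record):
    """Split one row (a dict) into (features, target, weight)."""
    features = {
        name: value
        for name, value in record.items()
        if name not in (TARGET_COLUMN, WEIGHT_COLUMN)
    }
    return features, record[TARGET_COLUMN], record[WEIGHT_COLUMN]


# Toy record with made-up values, for illustration only.
toy_record = {
    "age": 40,
    "weeks_worked": 52,
    "unknown": 1500.0,
    "income_50k": "placeholder_label",
}

features, target, weight = split_record(toy_record)
print(sorted(features))  # column names left after removing target and weight
```

Keeping the weight out of the feature set matters if the assumption holds, since a sampling weight describes the survey design rather than the person.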

42 features

income_50k (target): string, 2 unique values, 0 missing
det_hh_fam_stat: string, 38 unique values, 0 missing
state_prev_res: string, 50 unique values, 708 missing
det_hh_summ: string, 8 unique values, 0 missing
unknown: numeric, 99800 unique values, 0 missing
mig_chg_msa: string, 9 unique values, 99696 missing
mig_chg_reg: string, 8 unique values, 99696 missing
mig_move_reg: string, 9 unique values, 99696 missing
mig_same: string, 3 unique values, 0 missing
mig_prev_sunbelt: string, 3 unique values, 99696 missing
num_emp: numeric, 7 unique values, 0 missing
fam_under_18: string, 5 unique values, 0 missing
country_father: string, 42 unique values, 6713 missing
country_mother: string, 42 unique values, 6119 missing
country_self: string, 42 unique values, 3393 missing
citizenship: string, 5 unique values, 0 missing
own_or_self: numeric, 3 unique values, 0 missing
vet_question: string, 3 unique values, 0 missing
vet_benefits: numeric, 3 unique values, 0 missing
weeks_worked: numeric, 53 unique values, 0 missing
year: numeric, 2 unique values, 0 missing
hisp_origin: string, 10 unique values, 0 missing
class_worker: string, 9 unique values, 0 missing
det_ind_code: numeric, 52 unique values, 0 missing
det_occ_code: numeric, 47 unique values, 0 missing
education: string, 17 unique values, 0 missing
wage_per_hour: numeric, 1240 unique values, 0 missing
hs_college: string, 3 unique values, 0 missing
marital_stat: string, 7 unique values, 0 missing
major_ind_code: string, 24 unique values, 0 missing
major_occ_code: string, 15 unique values, 0 missing
race: string, 5 unique values, 0 missing
age: numeric, 91 unique values, 0 missing
sex: string, 2 unique values, 0 missing
union_member: string, 3 unique values, 0 missing
unemp_reason: string, 6 unique values, 0 missing
full_or_part_emp: string, 8 unique values, 0 missing
capital_gains: numeric, 132 unique values, 0 missing
capital_losses: numeric, 113 unique values, 0 missing
stock_dividends: numeric, 1478 unique values, 0 missing
tax_filer_stat: string, 6 unique values, 0 missing
region_prev_res: string, 6 unique values, 0 missing
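The per-feature missing counts listed above can be cross-checked against the dataset-level total of missing values reported on this page. The sketch below copies the nonzero counts from the listing and sums them. The variable names are illustrative choices, and grouping features by the `mig_` prefix is only a convenience for this check.

```python
# Nonzero missing-value counts, copied from the feature listing.
missing_by_feature = {
    "state_prev_res": 708,
    "mig_chg_msa": 99696,
    "mig_chg_reg": 99696,
    "mig_move_reg": 99696,
    "mig_prev_sunbelt": 99696,
    "country_father": 6713,
    "country_mother": 6119,
    "country_self": 3393,
}

# Total missing cells across all features.
total_missing = sum(missing_by_feature.values())

# Missing cells contributed by the migration features alone.
migration_missing = sum(
    count
    for name, count in missing_by_feature.items()
    if name.startswith("mig_")
)

print(total_missing)  # 415717, matching the reported number of missing values
print(f"{migration_missing / total_missing:.1%} of missing cells are in mig_* features")
```

The four migration features account for most of the missing cells, so how they are handled largely determines how much of the dataset counts as complete.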

19 properties

Number of instances (rows) of the dataset: 199523
Number of attributes (columns) of the dataset: 42
Number of distinct values of the target attribute (if it is nominal): 2
Number of missing values in the dataset: 415717
Number of instances with at least one value missing: 104393
Number of numeric attributes: 13
Number of nominal attributes: 0
Percentage of nominal attributes: 0
Average class difference between consecutive instances: 1
Percentage of numeric attributes: 30.95
Percentage of missing values: 4.96
Percentage of instances having missing values: 52.32
Percentage of binary attributes: 0
Number of binary attributes: 0
Number of instances belonging to the least frequent class: 12382
Percentage of instances belonging to the least frequent class: 6.21
Number of instances belonging to the most frequent class: 187141
Percentage of instances belonging to the most frequent class: 93.79
Number of attributes divided by the number of instances: 0
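The percentage properties follow from the raw counts, which a short calculation can confirm. The sketch below recomputes them from the counts on this page. The two-decimal rounding is an assumption about how the page displays values, not a documented OpenML rule.

```python
# Raw counts copied from the properties list.
n_instances = 199523
n_attributes = 42
n_numeric = 13
n_missing_values = 415717
n_instances_with_missing = 104393
minority_class_size = 12382
majority_class_size = 187141


def pct(part, whole):
    """Percentage rounded to two decimals (assumed display precision)."""
    return round(100 * part / whole, 2)


derived = {
    "pct_numeric_attributes": pct(n_numeric, n_attributes),
    "pct_missing_values": pct(n_missing_values, n_instances * n_attributes),
    "pct_instances_with_missing": pct(n_instances_with_missing, n_instances),
    "pct_minority_class": pct(minority_class_size, n_instances),
    "pct_majority_class": pct(majority_class_size, n_instances),
    # Attributes per instance; very small, so it shows as 0 after rounding.
    "dimensionality": round(n_attributes / n_instances, 2),
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

The two class sizes sum to the number of instances, and the minority share of about 6.21% shows how imbalanced the target is, which is worth keeping in mind when choosing an evaluation metric.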

0 tasks
