{ "data_id": "42351", "name": "UCI-student-performance-por", "exact_name": "UCI-student-performance-por", "version": 1, "version_label": "1", "description": "**Author**: P. Cortez and A. Silva \r\n**Source**: [original](http:\/\/archive.ics.uci.edu\/ml\/datasets\/Student+Performance) - 2008 \r\n**Please cite**: P. Cortez and A. Silva. Using Data Mining to Predict Secondary School Student Performance. In A. Brito and J. Teixeira Eds., Proceedings of 5th FUture BUsiness TEChnology Conference (FUBUTEC 2008) pp. 5-12, Porto, Portugal, April, 2008, EUROSIS, ISBN 978-9077381-39-7. \r\n\r\nThis dataset addresses student achievement in secondary education at two Portuguese schools. The attributes include student grades and demographic, social, and school-related features, and the data were collected using school reports and questionnaires.\r\n\r\nTwo datasets are provided, covering performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary\/five-level classification and regression tasks. This dataset covers performance in Portuguese.\r\n\r\nImportant note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd period grades. It is more difficult to predict G3 without G2 and G1, but such a prediction is much more useful (see the source paper for more details).\r\n\r\n**Attributes** \r\n1 school - student's school (binary: 'GP' - Gabriel Pereira or 'MS' - Mousinho da Silveira). 
\r\n2 sex - student's sex (binary: 'F' - female or 'M' - male) \r\n3 age - student's age (numeric: from 15 to 22) \r\n4 address - student's home address type (binary: 'U' - urban or 'R' - rural) \r\n5 famsize - family size (binary: 'LE3' - less or equal to 3 or 'GT3' - greater than 3) \r\n6 Pstatus - parent's cohabitation status (binary: 'T' - living together or 'A' - apart) \r\n7 Medu - mother's education (numeric: 0 - none, 1 - primary education (4th grade), 2 \u2013 5th to 9th grade, 3 \u2013 secondary education or 4 \u2013 higher education) \r\n8 Fedu - father's education (numeric: 0 - none, 1 - primary education (4th grade), 2 \u2013 5th to 9th grade, 3 \u2013 secondary education or 4 \u2013 higher education) \r\n9 Mjob - mother's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other') \r\n10 Fjob - father's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. 
administrative or police), 'at_home' or 'other') \r\n11 reason - reason to choose this school (nominal: close to 'home', school 'reputation', 'course' preference or 'other') \r\n12 guardian - student's guardian (nominal: 'mother', 'father' or 'other') \r\n13 traveltime - home to school travel time (numeric: 1 - <15 min., 2 - 15 to 30 min., 3 - 30 min. to 1 hour, or 4 - >1 hour) \r\n14 studytime - weekly study time (numeric: 1 - <2 hours, 2 - 2 to 5 hours, 3 - 5 to 10 hours, or 4 - >10 hours) \r\n15 failures - number of past class failures (numeric: n if 1<=n<3, else 4) \r\n16 schoolsup - extra educational support (binary: yes or no) \r\n17 famsup - family educational support (binary: yes or no) \r\n18 paid - extra paid classes within the course subject (Math or Portuguese) (binary: yes or no) \r\n19 activities - extra-curricular activities (binary: yes or no) \r\n20 nursery - attended nursery school (binary: yes or no) \r\n21 higher - wants to take higher education (binary: yes or no) \r\n22 internet - Internet access at home (binary: yes or no) \r\n23 romantic - in a romantic relationship (binary: yes or no) \r\n24 famrel - quality of family relationships (numeric: from 1 - very bad to 5 - excellent) \r\n25 freetime - free time after school (numeric: from 1 - very low to 5 - very high) \r\n26 goout - going out with friends (numeric: from 1 - very low to 5 - very high) \r\n27 Dalc - workday alcohol consumption (numeric: from 1 - very low to 5 - very high) \r\n28 Walc - weekend alcohol consumption (numeric: from 1 - very low to 5 - very high) \r\n29 health - current health status (numeric: from 1 - very bad to 5 - very good) \r\n30 absences - number of school absences (numeric: from 0 to 93) \r\n\r\nThese grades relate to the course subject, Math or Portuguese: \r\n31 G1 - first period grade (numeric: from 0 to 20) \r\n32 G2 - second period grade (numeric: from 0 to 20) \r\n33 G3 - final grade (numeric: from 0 to 20, output target)", "format": "ARFF", "uploader": "Morgan Bentley", "uploader_id": 1935, "visibility": "public", "creator": "\"P. Cortez and A. 
Silva\"", "contributor": null, "date": "2020-04-08 18:53:08", "update_comment": null, "last_update": "2020-04-08 18:53:08", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/21826962\/por.arff", "default_target_attribute": "G3", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "UCI-student-performance-por", "This dataset addresses student achievement in secondary education at two Portuguese schools. The attributes include student grades and demographic, social, and school-related features, and the data were collected using school reports and questionnaires. Two datasets are provided, covering performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary\/five-level classification and regression tasks. This d " ], "weight": 5 }, "qualities": { "NumberOfInstances": 649, "NumberOfFeatures": 33, "NumberOfClasses": 0, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 16, "NumberOfSymbolicFeatures": 0, "PercentageOfSymbolicFeatures": 0, "AutoCorrelation": -1.962962962962963, "PercentageOfNumericFeatures": 48.484848484848484, "PercentageOfMissingValues": 0, "PercentageOfInstancesWithMissingValues": 0, "PercentageOfBinaryFeatures": 0, "NumberOfBinaryFeatures": 0, "MinorityClassSize": null, "MinorityClassPercentage": null, "MajorityClassSize": null, "MajorityClassPercentage": null, "Dimensionality": 0.05084745762711865 }, "tags": [], "features": [ { "name": "G3", "index": "32", "type": "numeric", "distinct": "17", "missing": "0", "target": "1", "min": "0", "max": "19", "mean": "12", "stdev": "3" }, { "name": "paid", "index": "17", "type": "string", "distinct": "2", "missing": "0" }, { "name": "famsup", "index": "16", "type": "string", "distinct": "2", "missing": "0" }, { "name": "activities", "index": "18", "type": "string", "distinct": 
"2", "missing": "0" }, { "name": "nursery", "index": "19", "type": "string", "distinct": "2", "missing": "0" }, { "name": "higher", "index": "20", "type": "string", "distinct": "2", "missing": "0" }, { "name": "internet", "index": "21", "type": "string", "distinct": "2", "missing": "0" }, { "name": "romantic", "index": "22", "type": "string", "distinct": "2", "missing": "0" }, { "name": "famrel", "index": "23", "type": "numeric", "distinct": "5", "missing": "0", "min": "1", "max": "5", "mean": "4", "stdev": "1" }, { "name": "freetime", "index": "24", "type": "numeric", "distinct": "5", "missing": "0", "min": "1", "max": "5", "mean": "3", "stdev": "1" }, { "name": "goout", "index": "25", "type": "numeric", "distinct": "5", "missing": "0", "min": "1", "max": "5", "mean": "3", "stdev": "1" }, { "name": "Dalc", "index": "26", "type": "numeric", "distinct": "5", "missing": "0", "min": "1", "max": "5", "mean": "2", "stdev": "1" }, { "name": "Walc", "index": "27", "type": "numeric", "distinct": "5", "missing": "0", "min": "1", "max": "5", "mean": "2", "stdev": "1" }, { "name": "health", "index": "28", "type": "numeric", "distinct": "5", "missing": "0", "min": "1", "max": "5", "mean": "4", "stdev": "1" }, { "name": "absences", "index": "29", "type": "numeric", "distinct": "24", "missing": "0", "min": "0", "max": "32", "mean": "4", "stdev": "5" }, { "name": "G1", "index": "30", "type": "numeric", "distinct": "17", "missing": "0", "min": "0", "max": "19", "mean": "11", "stdev": "3" }, { "name": "G2", "index": "31", "type": "numeric", "distinct": "16", "missing": "0", "min": "0", "max": "19", "mean": "12", "stdev": "3" }, { "name": "school", "index": "0", "type": "string", "distinct": "2", "missing": "0" }, { "name": "schoolsup", "index": "15", "type": "string", "distinct": "2", "missing": "0" }, { "name": "failures", "index": "14", "type": "numeric", "distinct": "4", "missing": "0", "min": "0", "max": "3", "mean": "0", "stdev": "1" }, { "name": "studytime", "index": "13", 
"type": "numeric", "distinct": "4", "missing": "0", "min": "1", "max": "4", "mean": "2", "stdev": "1" }, { "name": "traveltime", "index": "12", "type": "numeric", "distinct": "4", "missing": "0", "min": "1", "max": "4", "mean": "2", "stdev": "1" }, { "name": "guardian", "index": "11", "type": "string", "distinct": "3", "missing": "0" }, { "name": "reason", "index": "10", "type": "string", "distinct": "4", "missing": "0" }, { "name": "Fjob", "index": "9", "type": "string", "distinct": "5", "missing": "0" }, { "name": "Mjob", "index": "8", "type": "string", "distinct": "5", "missing": "0" }, { "name": "Fedu", "index": "7", "type": "numeric", "distinct": "5", "missing": "0", "min": "0", "max": "4", "mean": "2", "stdev": "1" }, { "name": "Medu", "index": "6", "type": "numeric", "distinct": "5", "missing": "0", "min": "0", "max": "4", "mean": "3", "stdev": "1" }, { "name": "Pstatus", "index": "5", "type": "string", "distinct": "2", "missing": "0" }, { "name": "famsize", "index": "4", "type": "string", "distinct": "2", "missing": "0" }, { "name": "address", "index": "3", "type": "string", "distinct": "2", "missing": "0" }, { "name": "age", "index": "2", "type": "numeric", "distinct": "8", "missing": "0", "min": "15", "max": "22", "mean": "17", "stdev": "1" }, { "name": "sex", "index": "1", "type": "string", "distinct": "2", "missing": "0" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }