parkinsons-telemonitoring

Status: active. Format: ARFF. Visibility: public (publicly available). Uploaded 16-02-2016 by unknown.
Reported issue: missing target (downvotes for this reason: 1, by User 5824)


Author: Athanasios Tsanas (tsanasthanasis '@' gmail.com) and Max Little (littlem '@' physics.ox.ac.uk)
Source: UCI

Source: The dataset was created by Athanasios Tsanas and Max Little of the University of Oxford, in collaboration with 10 medical centers in the US and Intel Corporation, which developed the telemonitoring device used to record the speech signals. The original study used a range of linear and nonlinear regression methods to predict the clinician's Parkinson's disease symptom score on the UPDRS scale.

Data Set Information: This dataset is composed of a range of biomedical voice measurements from 42 people with early-stage Parkinson's disease recruited to a six-month trial of a telemonitoring device for remote symptom progression monitoring. The recordings were automatically captured in the patients' homes.

Columns in the table contain subject number, subject age, subject gender, time interval from baseline recruitment date, motor UPDRS, total UPDRS, and 16 biomedical voice measures. Each row corresponds to one of 5,875 voice recordings from these individuals. The main aim of the data is to predict the motor and total UPDRS scores ('motor_UPDRS' and 'total_UPDRS') from the 16 voice measures.

The data is in ASCII CSV format. Each row of the CSV file is an instance corresponding to one voice recording. There are around 200 recordings per patient according to the original description (5,875 recordings across 42 subjects works out to about 140 on average); the subject number of the patient is given in the first column.

For further information or to pass on comments, please contact Athanasios Tsanas (tsanasthanasis '@' gmail.com) or Max Little (littlem '@' physics.ox.ac.uk).

Further details are contained in the following reference; if you use this dataset, please cite:
Athanasios Tsanas, Max A. Little, Patrick E. McSharry, Lorraine O. Ramig (2009), 'Accurate telemonitoring of Parkinson's disease progression by non-invasive speech tests', IEEE Transactions on Biomedical Engineering (to appear).

Further details about the biomedical voice measures can be found in:
Max A. Little, Patrick E. McSharry, Eric J. Hunter, Lorraine O. Ramig (2009), 'Suitability of dysphonia measurements for telemonitoring of Parkinson's disease', IEEE Transactions on Biomedical Engineering, 56(4):1015-1022

Attribute Information:
subject# - Integer that uniquely identifies each subject
age - Subject age
sex - Subject gender: '0' - male, '1' - female
test_time - Time since recruitment into the trial. The integer part is the number of days since recruitment.
motor_UPDRS - Clinician's motor UPDRS score, linearly interpolated
total_UPDRS - Clinician's total UPDRS score, linearly interpolated
Jitter(%), Jitter(Abs), Jitter:RAP, Jitter:PPQ5, Jitter:DDP - Several measures of variation in fundamental frequency
Shimmer, Shimmer(dB), Shimmer:APQ3, Shimmer:APQ5, Shimmer:APQ11, Shimmer:DDA - Several measures of variation in amplitude
NHR, HNR - Two measures of ratio of noise to tonal components in the voice
RPDE - A nonlinear dynamical complexity measure
DFA - Signal fractal scaling exponent
PPE - A nonlinear measure of fundamental frequency variation

Relevant Papers:
Little MA, McSharry PE, Hunter EJ, Ramig LO (2009), 'Suitability of dysphonia measurements for telemonitoring of Parkinson's disease', IEEE Transactions on Biomedical Engineering, 56(4):1015-1022
Little MA, McSharry PE, Roberts SJ, Costello DAE, Moroz IM. 'Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection', BioMedical Engineering OnLine 2007, 6:23 (26 June 2007)

Citation Request: If you use this dataset, please cite the following paper:
A Tsanas, MA Little, PE McSharry, LO Ramig (2009) 'Accurate telemonitoring of Parkinson's disease progression by non-invasive speech tests', IEEE Transactions on Biomedical Engineering (to appear).
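The description states that the two UPDRS scores are to be predicted from the 16 voice measures, and that the integer part of test_time gives the number of days since recruitment. A minimal sketch of splitting one CSV record along those lines follows. It assumes the column names given in the attribute information; the function name split_record and the single sample row are hypothetical, and the row holds placeholder values rather than real data.

```python
import csv
import io

# The 16 voice measures, named as in the attribute information.
VOICE_MEASURES = [
    "Jitter(%)", "Jitter(Abs)", "Jitter:RAP", "Jitter:PPQ5", "Jitter:DDP",
    "Shimmer", "Shimmer(dB)", "Shimmer:APQ3", "Shimmer:APQ5",
    "Shimmer:APQ11", "Shimmer:DDA",
    "NHR", "HNR", "RPDE", "DFA", "PPE",
]
# The two regression targets named in the description.
TARGETS = ["motor_UPDRS", "total_UPDRS"]


def split_record(row):
    """Return (features, targets, days) for one recording given as a dict of strings."""
    features = [float(row[name]) for name in VOICE_MEASURES]
    targets = [float(row[name]) for name in TARGETS]
    # int() truncates toward zero, i.e. it keeps the integer part of test_time.
    days = int(float(row["test_time"]))
    return features, targets, days


# Placeholder row (made-up values, NOT taken from the dataset).
header = ["subject#", "age", "sex", "test_time"] + TARGETS + VOICE_MEASURES
placeholder = ["1", "60", "0", "12.5", "20.0", "30.0"] + ["0.0"] * len(VOICE_MEASURES)
sample_csv = ",".join(header) + "\n" + ",".join(placeholder) + "\n"

rows = list(csv.DictReader(io.StringIO(sample_csv)))
features, targets, days = split_record(rows[0])
print(len(features), targets, days)
```

Because each subject contributes many recordings, it may be worth keeping the subject# column alongside the features so that recordings from one subject can be kept together when evaluating a model; that is a suggestion, not something the description prescribes.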

22 features

Shimmer (numeric): 3581 unique values, 0 missing
PPE (numeric): 4777 unique values, 0 missing
DFA (numeric): 5282 unique values, 0 missing
RPDE (numeric): 5430 unique values, 0 missing
HNR (numeric): 4780 unique values, 0 missing
NHR (numeric): 5532 unique values, 0 missing
Shimmer.DDA (numeric): 4223 unique values, 0 missing
Shimmer.APQ11 (numeric): 3283 unique values, 0 missing
Shimmer.APQ5 (numeric): 2850 unique values, 0 missing
Shimmer.APQ3 (numeric): 2664 unique values, 0 missing
Shimmer.dB. (numeric): 852 unique values, 0 missing
subject. (numeric): 42 unique values, 0 missing
Jitter.DDP (numeric): 1703 unique values, 0 missing
Jitter.PPQ5 (numeric): 840 unique values, 0 missing
Jitter.RAP (numeric): 853 unique values, 0 missing
Jitter.Abs. (numeric): 4105 unique values, 0 missing
Jitter... (numeric): 1305 unique values, 0 missing
total_UPDRS (numeric): 1129 unique values, 0 missing
motor_UPDRS (numeric): 1080 unique values, 0 missing
test_time (numeric): 2442 unique values, 0 missing
sex (numeric): 2 unique values, 0 missing
age (numeric): 23 unique values, 0 missing
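The feature names in this list differ from the names in the attribute information: for example, Jitter(%) appears as Jitter... and subject# as subject. (with a trailing dot). The names look consistent with replacing every character other than a letter, digit, underscore, or dot with a dot. The sketch below encodes that inferred rule; it is an assumption drawn from the names on this page, not a documented conversion, and sanitize is a hypothetical helper.

```python
import re


def sanitize(name):
    """Replace each character that is not a letter, digit, underscore, or dot with a dot.

    Inferred from the names shown on this page; the actual conversion is not documented here.
    """
    return re.sub(r"[^A-Za-z0-9_.]", ".", name)


# Names as written in the attribute information.
original_names = [
    "subject#", "age", "sex", "test_time", "motor_UPDRS", "total_UPDRS",
    "Jitter(%)", "Jitter(Abs)", "Jitter:RAP", "Jitter:PPQ5", "Jitter:DDP",
    "Shimmer", "Shimmer(dB)", "Shimmer:APQ3", "Shimmer:APQ5",
    "Shimmer:APQ11", "Shimmer:DDA",
    "NHR", "HNR", "RPDE", "DFA", "PPE",
]

# Map each original name to the form that appears in the feature list.
name_map = {name: sanitize(name) for name in original_names}
for original, listed in name_map.items():
    print(f"{original} -> {listed}")
```

A mapping like this can help when code written against the original CSV column names has to read the ARFF version.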

107 properties

Number of instances (rows) of the dataset: 5875
Number of attributes (columns) of the dataset: 22
Number of distinct values of the target attribute (if it is nominal): not listed
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 22
Number of nominal attributes: 0
Average class difference between consecutive instances: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: not listed
Entropy of the target attribute values: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump: not listed
Number of attributes divided by the number of instances: 0
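The ratio of attributes to instances is shown as 0, which is most likely a rounded display of a small positive number. A quick check using the counts reported on this page (22 attributes, 5875 instances) follows; the two-decimal rounding is an assumption about how the page formats values.

```python
n_attributes = 22    # "Number of attributes (columns) of the dataset"
n_instances = 5875   # "Number of instances (rows) of the dataset"

# Attributes per instance: small because there are far more rows than columns.
dimensionality = n_attributes / n_instances

print(f"exact: {dimensionality:.6f}")
print(f"rounded to two decimals: {round(dimensionality, 2)}")
```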
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes), equal to ClassEntropy divided by MeanMutualInformation: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001: not listed
Percentage of instances belonging to the most frequent class: not listed
Number of instances belonging to the most frequent class: not listed
Maximum entropy among attributes: not listed
Maximum kurtosis among attributes of the numeric type: 81.57
Maximum of means among attributes of the numeric type: 92.86
Maximum mutual information between the nominal attributes and the target attribute: not listed
The maximum number of distinct values among attributes of the nominal type: not listed
Maximum skewness among attributes of the numeric type: 7.59
Maximum standard deviation of attributes of the numeric type: 53.45
Average entropy of the attributes: not listed
Mean kurtosis among attributes of the numeric type: 21.46
Mean of means among attributes of the numeric type: 11.52
Average mutual information between the nominal attributes and the target attribute: not listed
An estimate of the amount of irrelevant information in the attributes regarding the class, equal to (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation: not listed
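Two properties in this list are defined by formulas over entropy and mutual information: the number of attributes needed to describe the class (ClassEntropy divided by MeanMutualInformation) and the estimate of irrelevant information ((MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation). They apply to nominal attributes and a nominal target, neither of which this dataset reports, so the sketch below uses a made-up toy table purely to illustrate the arithmetic. The function names are hypothetical, and the log base (2 here) is an assumption; both ratios are unaffected by the choice of base.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of nominal values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned nominal sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def equivalent_number_of_attributes(class_entropy, mean_mi):
    """ClassEntropy divided by MeanMutualInformation."""
    return class_entropy / mean_mi


def noise_to_signal_ratio(mean_attr_entropy, mean_mi):
    """(MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation."""
    return (mean_attr_entropy - mean_mi) / mean_mi


# Toy data (made up): attr_a determines the target, attr_b is unrelated to it.
target = ["a", "a", "b", "b"]
attr_a = ["x", "x", "y", "y"]
attr_b = ["p", "q", "p", "q"]
attributes = [attr_a, attr_b]

class_entropy = entropy(target)
mean_mi = sum(mutual_information(a, target) for a in attributes) / len(attributes)
mean_attr_entropy = sum(entropy(a) for a in attributes) / len(attributes)

ena = equivalent_number_of_attributes(class_entropy, mean_mi)
nsr = noise_to_signal_ratio(mean_attr_entropy, mean_mi)
print(ena, nsr)
```

Both ratios are undefined when the mean mutual information is zero, which a real implementation would need to handle.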
Average number of distinct values among the attributes of the nominal type: not listed
Mean skewness among attributes of the numeric type: 2.67
Mean standard deviation of attributes of the numeric type: 4.5
Minimal entropy among attributes: not listed
Minimum kurtosis among attributes of the numeric type: -1.39
Minimum of means among attributes of the numeric type: 0
Minimal mutual information between the nominal attributes and the target attribute: not listed
The minimal number of distinct values among attributes of the nominal type: not listed
Minimum skewness among attributes of the numeric type: -0.81
Minimum standard deviation of attributes of the numeric type: 0
Percentage of instances belonging to the least frequent class: not listed
Number of instances belonging to the least frequent class: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes: not listed
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Percentage of missing values: 0
Percentage of numeric attributes: 100
Percentage of nominal attributes: 0
First quartile of entropy among attributes: not listed
First quartile of kurtosis among attributes of the numeric type: -0.49
First quartile of means among attributes of the numeric type: 0.02
First quartile of mutual information between the nominal attributes and the target attribute: not listed
First quartile of skewness among attributes of the numeric type: 0.08
First quartile of standard deviation of attributes of the numeric type: 0.01
Second quartile (Median) of entropy among attributes: not listed
Second quartile (Median) of kurtosis among attributes of the numeric type: 13.91
Second quartile (Median) of means among attributes of the numeric type: 0.14
Second quartile (Median) of mutual information between the nominal attributes and the target attribute: not listed
Second quartile (Median) of skewness among attributes of the numeric type: 3.1
Second quartile (Median) of standard deviation of attributes of the numeric type: 0.07
Third quartile of entropy among attributes: not listed
Third quartile of kurtosis among attributes of the numeric type: 27.58
Third quartile of means among attributes of the numeric type: 21.35
Third quartile of mutual information between the nominal attributes and the target attribute: not listed
Third quartile of skewness among attributes of the numeric type: 4.39
Third quartile of standard deviation of attributes of the numeric type: 5.25
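The kurtosis and skewness properties summarize one statistic per numeric attribute by its minimum, quartiles, mean, and maximum. Since the ratio of the fourth central moment to the squared variance is never below 1, the reported minimum kurtosis of -1.39 suggests the values are excess kurtosis (that ratio minus 3). The sketch below computes both statistics from population moments and summarizes them across columns. The exact estimator and quartile method behind the numbers on this page are not stated, so those choices are assumptions, and the columns are made-up toy data.

```python
import statistics


def central_moment(xs, k):
    """k-th central moment of xs, using the population (1/n) normalization."""
    mean = statistics.fmean(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)


def skewness(xs):
    """Third standardized moment; 0 for a symmetric sample."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Fourth standardized moment minus 3, so a normal distribution scores 0."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0


def summarize(values):
    """Min, quartiles, mean, and max of one statistic across attributes."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values), "q1": q1, "median": q2, "q3": q3,
        "mean": statistics.fmean(values), "max": max(values),
    }


# Toy columns (made up, not taken from the dataset).
columns = {
    "symmetric": [1.0, 2.0, 3.0, 4.0, 5.0],
    "right_tail": [1.0, 1.0, 1.0, 2.0, 10.0],
    "left_tail": [-10.0, -2.0, -1.0, -1.0, -1.0],
    "two_level": [0.0, 0.0, 1.0, 1.0, 0.0],
}

kurtosis_summary = summarize([excess_kurtosis(c) for c in columns.values()])
skewness_summary = summarize([skewness(c) for c in columns.values()])
print(kurtosis_summary)
print(skewness_summary)
```

Both statistics divide by a power of the variance, so a constant column would need to be skipped or handled separately.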
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: not listed
Standard deviation of the number of distinct values among attributes of the nominal type: not listed
Area Under the ROC Curve, error rate, and Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk: not listed
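The landmarker properties report an error rate and a Kappa coefficient for simple classifiers, although no values are shown for this dataset. To illustrate what the two metrics measure, here is a minimal sketch that assumes the Kappa coefficient is Cohen's kappa (agreement between predicted and true classes, corrected for chance agreement). The labels are made up and the function names are hypothetical.

```python
from collections import Counter


def error_rate(y_true, y_pred):
    """Fraction of predictions that differ from the true label."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """(observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that independent draws from the two
    # label distributions happen to pick the same class.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Made-up labels for illustration only.
y_true = [0, 0, 1, 1]
y_pred = [0, 0, 1, 0]

print(error_rate(y_true, y_pred))
print(cohen_kappa(y_true, y_pred))
```

The kappa formula is undefined when chance agreement equals 1, for example when both label sequences contain a single class.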

11 tasks

0 runs - estimation_procedure: 50 times Clustering (this entry is listed 10 times)