Breast-cancer-prediction

active ARFF CC0: Public Domain Visibility: public Uploaded 24-03-2022 by Stewart


Context

This dataset includes data from a random sample of 20,000 digital and 20,000 film-screen mammograms received by women age 60-89 years within the Breast Cancer Surveillance Consortium (BCSC) between January 2005 and December 2008. Some women contribute multiple examinations to the dataset. Data is useful in teaching about data analysis, epidemiological study designs, or statistical methods for binary outcomes or correlated data.

Content

The data set contains 39998 rows and 13 cols. Attributes are described as follows:

| Field Name | Type (Format) | Description |
| --- | --- | --- |
| AgeAtTheTimeOf_Mammography | number | Patient's age in years at time of mammogram |
| Radiologists_Assessment | string | Radiologist's assessment based on the BI-RADS scale |
| IsBinaryIndicatorOfCancer_Diagnosis | boolean | Binary indicator of cancer diagnosis within one year of screening mammogram (false = No cancer diagnosis, true = Cancer diagnosis) |
| ComparisonMammogramFrom_Mammography | string | Comparison mammogram from prior mammography examination available |
| PatientsBIRADSBreastDensity | string | Patient's BI-RADS breast density as recorded at time of mammogram |
| FamilyHistoryOfBreastCancer | string | Family history of breast cancer in a first degree relative |
| CurrentUseOfHormoneTherapy | string | Current use of hormone therapy at time of mammogram |
| Binary_Indicator | string | Binary indicator of whether the woman had ever received a prior mammogram |
| HistoryOfBreast_Biopsy | string | Prior history of breast biopsy |
| IsFilmOrDigitalMammogram | boolean | Film or digital mammogram (true = Digital mammogram, false = Film mammogram) |
| Cancer_Type | string | Type of cancer |

Acknowledgements

We acknowledge the Breast Cancer Surveillance Consortium (BCSC) for making this data set available for research purposes.
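Because some women contribute more than one examination, rows from the same patient are correlated, and a random row-level split could place one woman's exams in both the training and the test portion. The following is a minimal sketch of a patient-level split, not a method prescribed by the dataset authors. It assumes the rows have already been loaded as dictionaries keyed by the column names shown in the feature list on this page (`Patients_Study_ID`, `Is_Binary_Indicator_Of_Cancer_Diagnosis`). The function `split_by_patient` and the toy rows are hypothetical and only illustrate the idea.

```python
import random


def split_by_patient(rows, id_key, test_fraction=0.25, seed=0):
    """Split rows so that all exams of one patient land on the same side."""
    # Collect the distinct patient identifiers in a reproducible order.
    patient_ids = sorted({row[id_key] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)

    # Reserve a fraction of the patients (not of the rows) for testing.
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])

    train = [row for row in rows if row[id_key] not in test_ids]
    test = [row for row in rows if row[id_key] in test_ids]
    return train, test


# Made-up toy rows for illustration only; these are NOT values from the dataset.
toy_rows = [
    {"Patients_Study_ID": 1, "Is_Binary_Indicator_Of_Cancer_Diagnosis": False},
    {"Patients_Study_ID": 1, "Is_Binary_Indicator_Of_Cancer_Diagnosis": False},
    {"Patients_Study_ID": 2, "Is_Binary_Indicator_Of_Cancer_Diagnosis": True},
    {"Patients_Study_ID": 3, "Is_Binary_Indicator_Of_Cancer_Diagnosis": False},
    {"Patients_Study_ID": 4, "Is_Binary_Indicator_Of_Cancer_Diagnosis": False},
    {"Patients_Study_ID": 4, "Is_Binary_Indicator_Of_Cancer_Diagnosis": True},
]

train_rows, test_rows = split_by_patient(toy_rows, "Patients_Study_ID")
```

Splitting on the identifier rather than on rows keeps repeat examinations together, which is the usual precaution when evaluating models on correlated data.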

12 features

Age_At_The_Time_Of_Mammography (numeric): 30 unique values, 0 missing
Radiologists_Assessment (string): 6 unique values, 0 missing
Is_Binary_Indicator_Of_Cancer_Diagnosis (nominal): 2 unique values, 0 missing
Comparison_Mammogram_From_Mammography (string): 3 unique values, 0 missing
Patients_BI_RADS_Breast_Density (string): 4 unique values, 0 missing
Family_History_Of_Breast_Cancer (string): 3 unique values, 0 missing
Current_Use_Of_Hormone_Therapy (string): 3 unique values, 0 missing
Binary_Indicator (string): 3 unique values, 0 missing
History_Of_Breast_Biopsy (string): 3 unique values, 0 missing
Is_Film_Or_Digital_Mammogram (nominal): 2 unique values, 0 missing
Cancer_Type (string): 3 unique values, 0 missing
Body_Mass_Index (string): 1897 unique values, 0 missing
Patients_Study_ID (ignore) (numeric): 36714 unique values, 0 missing
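`Body_Mass_Index` is listed as a string feature with 1897 unique values, so it probably needs converting before any numeric analysis. The sketch below assumes the values are plain decimal strings; that is an assumption, since this page does not show how the values are encoded. The helper `parse_bmi` is hypothetical, and it maps anything it cannot read as a finite number to `None` rather than guessing.

```python
import math


def parse_bmi(raw):
    """Return the BMI as a float, or None if the text is not a finite number."""
    try:
        value = float(raw.strip())
    except (AttributeError, ValueError):
        # AttributeError covers non-string input such as None;
        # ValueError covers text that is not a number.
        return None
    # float() also accepts "nan" and "inf"; treat those as missing.
    if math.isnan(value) or math.isinf(value):
        return None
    return value


# Made-up example strings; these are NOT values taken from the dataset.
examples = ["23.4", " 31 ", "unknown", "nan"]
parsed = [parse_bmi(text) for text in examples]
```

Counting how many values come back as `None` is a quick way to check whether the decimal-string assumption holds for the real column.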

19 properties

Number of instances (rows) of the dataset: 39998
Number of attributes (columns) of the dataset: 12
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 1
Number of nominal attributes: 2
Percentage of nominal attributes: 16.67
Average class difference between consecutive instances: (no value listed)
Percentage of numeric attributes: 8.33
Percentage of missing values: 0
Percentage of instances having missing values: 0
Percentage of binary attributes: 16.67
Number of binary attributes: 2
Number of instances belonging to the least frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the most frequent class: (no value listed)
Number of attributes divided by the number of instances: 0
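The percentage properties follow directly from the counts on this page, which the short check below reproduces. All input numbers are copied from the page; the variable names and the `pct` helper are illustrative only. The last line uses the 36714 unique `Patients_Study_ID` values from the feature list to count how many rows are repeat examinations.

```python
# Counts copied from the properties and feature list on this page.
n_instances = 39998
n_attributes = 12
n_numeric = 1
n_nominal = 2
n_binary = 2
n_patient_ids = 36714


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


pct_nominal = pct(n_nominal, n_attributes)   # 2 of 12 attributes
pct_numeric = pct(n_numeric, n_attributes)   # 1 of 12 attributes
pct_binary = pct(n_binary, n_attributes)     # 2 of 12 attributes

# Attributes per instance; small enough that it displays as 0 when rounded.
dimensionality = n_attributes / n_instances

# Rows beyond the first examination of each patient.
repeat_exams = n_instances - n_patient_ids
```

The results match the listed 16.67 and 8.33, the dimensionality rounds to 0 at two decimals, and 3284 of the 39998 rows are second or later examinations of a woman already in the data.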

0 tasks
