Hottest-Kaggle-Datasets

Status: active. Format: ARFF. License: CC0 (Public Domain). Visibility: public. Uploaded 23-03-2022 by Mark Murphy.


Context

This data was collected as a course project for the immersive data science course run by General Assembly and Misk Academy.

Content

The dataset is in CSV format and consists of 5717 rows and 15 columns. Each row is a dataset on Kaggle and each column is a feature of that dataset.

title: dataset name
usability: usability rating assigned to the dataset by Kaggle
num_of_files: number of files associated with the dataset
types_of_files: types of files associated with the dataset
files_size: size of the dataset files
vote_counts: total number of votes from dataset viewers
medal: reward for popular datasets, measured by the number of upvotes (votes by novices are excluded from the medal calculation): Bronze = 5 votes, Silver = 20 votes, Gold = 50 votes
url_reference: reference to the dataset page on Kaggle, in the format www.kaggle.com/url_reference
keywords: topics tagged on the dataset
num_of_columns: number of features in the dataset
views: number of views
downloads: number of downloads
download_per_view: ratio of downloads to views
date_created: dataset creation date
last_updated: date of the last update

Acknowledgements

I would like to thank all my GA instructors for their continuous help and support. All data were taken from https://www.kaggle.com and collected on 30 Jan 2021.

Inspiration

Using this dataset, we could try to predict upcoming dataset uploads, the number of votes, the number of downloads, the medal type, and so on.
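The medal thresholds and the download-per-view ratio described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the thresholds are read as minimum vote counts, and the exclusion of novice votes cannot be reproduced because the dataset records only total votes. The label used in the medal column for datasets without a medal is not given in the description, so the sketch returns None in that case.

```python
from typing import Optional

# Thresholds as listed in the description, checked from highest to lowest.
# Assumption: each number is the minimum vote count for that tier.
MEDAL_THRESHOLDS = [("Gold", 50), ("Silver", 20), ("Bronze", 5)]


def medal_for_votes(votes: int) -> Optional[str]:
    """Return the medal tier implied by a vote count, or None below Bronze.

    Kaggle excludes novice votes from the calculation; this sketch cannot,
    so it may overestimate the tier compared with the `medal` column.
    """
    for name, minimum in MEDAL_THRESHOLDS:
        if votes >= minimum:
            return name
    return None


def download_per_view(downloads: int, views: int) -> Optional[float]:
    """Return downloads divided by views, or None when there are no views."""
    if views == 0:
        return None
    return downloads / views


if __name__ == "__main__":
    # Hypothetical values, not rows taken from the dataset.
    print(medal_for_votes(23))
    print(download_per_view(30, 200))
```

Comparing the output of medal_for_votes against the medal column would be one way to see how much the novice-vote exclusion matters in practice.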

15 features

title (string): 5658 unique values, 1 missing
usability (numeric): 39 unique values, 0 missing
num_of_files (numeric): 873 unique values, 333 missing
types_of_files (string): 26 unique values, 267 missing
files_size (string): 1631 unique values, 66 missing
vote_counts (numeric): 499 unique values, 0 missing
medal (string): 4 unique values, 0 missing
url_reference (string): 5710 unique values, 0 missing
keywords (string): 3565 unique values, 618 missing
num_of_columns (numeric): 201 unique values, 2668 missing
views (numeric): 4532 unique values, 1 missing
downloads (numeric): 2595 unique values, 1 missing
download_per_view (numeric): 112 unique values, 1 missing
date_created (string): 1447 unique values, 95 missing
last_updated (string): 1379 unique values, 95 missing
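To see which features are least complete, the per-feature missing counts above can be converted to percentages of the 5717 rows. The sketch below hard-codes the counts from the feature list; the names MISSING, N_ROWS and missing_rates are hypothetical and introduced only for illustration.

```python
# Missing-value counts copied from the feature list; 5717 rows in total.
N_ROWS = 5717
MISSING = {
    "title": 1, "usability": 0, "num_of_files": 333, "types_of_files": 267,
    "files_size": 66, "vote_counts": 0, "medal": 0, "url_reference": 0,
    "keywords": 618, "num_of_columns": 2668, "views": 1, "downloads": 1,
    "download_per_view": 1, "date_created": 95, "last_updated": 95,
}


def missing_rates(missing, n_rows):
    """Return {feature: percent missing}, sorted from most to least missing."""
    rates = {name: 100 * count / n_rows for name, count in missing.items()}
    return dict(sorted(rates.items(), key=lambda item: item[1], reverse=True))


rates = missing_rates(MISSING, N_ROWS)

if __name__ == "__main__":
    # Show the three least complete features.
    for name, pct in list(rates.items())[:3]:
        print(f"{name}: {pct:.2f}% missing")
```

num_of_columns stands out: it is missing in close to half of the rows, which is worth keeping in mind before using it as a model input.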

19 properties

Number of instances (rows) of the dataset: 5717
Number of attributes (columns) of the dataset: 15
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 4146
Number of instances with at least one value missing: 3006
Number of numeric attributes: 7
Number of nominal attributes: 0
Percentage of nominal attributes: 0
Average class difference between consecutive instances: (no value listed)
Percentage of numeric attributes: 46.67
Percentage of missing values: 4.83
Percentage of instances having missing values: 52.58
Percentage of binary attributes: 0
Number of binary attributes: 0
Number of instances belonging to the least frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the most frequent class: (no value listed)
Number of attributes divided by the number of instances: 0
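The percentage properties follow from the raw counts by simple arithmetic. The sketch below reproduces them; the formulas are an assumption about how the page derives these figures, chosen because they match the listed values after rounding to two decimals, and the variable names are hypothetical.

```python
# Counts taken from the properties listed on this page.
n_rows, n_cols = 5717, 15
n_missing_values = 4146
n_rows_with_missing = 3006
n_numeric = 7

# Share of numeric attributes among all attributes.
pct_numeric = 100 * n_numeric / n_cols
# Share of missing cells among all cells (rows x columns).
pct_missing_values = 100 * n_missing_values / (n_rows * n_cols)
# Share of rows that contain at least one missing value.
pct_rows_with_missing = 100 * n_rows_with_missing / n_rows
# Dimensionality: attributes per instance (rounds to 0 at two decimals).
attrs_per_instance = n_cols / n_rows

if __name__ == "__main__":
    print(round(pct_numeric, 2))
    print(round(pct_missing_values, 2))
    print(round(pct_rows_with_missing, 2))
    print(round(attrs_per_instance, 2))
```

The gap between the two missing-value percentages is the notable point: fewer than 5% of cells are missing, yet more than half of the rows have at least one gap, so dropping incomplete rows would discard most of the dataset.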
