PC-Games-2020

Status: active
Format: ARFF
License: CC0: Public Domain
Visibility: public
Uploaded: 24-03-2022 by Stewart


Context

The project's goal is to use this dataset to predict the level of success game developers should expect given their game design details. Features such as 'Indie' (developed by an indie studio), 'Soundtrack' (whether or not the game was noted for its soundtrack), and 'Genres' are expected to help predict the popularity of a game.

Content

The data was gathered in July 2020 in one long scrape of the Steam store, ordered from most popular to least popular. This ordering shows up as a correlation between the row index and the Presence value (the number of online posts related to the game). During the scrape, each game was supplemented with another dozen or so features by calling the RAWG API.

Inspiration

My main inspiration for this dataset was to learn, and share, how important each feature is to a game's success on the Steam store. This information could be valuable for game developers. I would also like to create a game using these insights, to evaluate the accuracy of the models.
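The link between scrape order and Presence described above can be checked with a plain correlation coefficient. The sketch below is only an illustration: the `pearson` helper and the toy values are hypothetical, not taken from the dataset, and missing Presence values are assumed to arrive as `None`. A rank correlation may suit the real data better; Pearson is used here only because it is short.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation over the pairs where neither value is missing (None)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Hypothetical toy rows: scrape position and Presence (None marks a missing value).
scrape_index = [0, 1, 2, 3, 4, 5]
presence = [9000, 7500, None, 4100, 3900, 1200]

r = pearson(scrape_index, presence)
print(f"r = {r:.3f}")  # negative here: Presence falls as the scrape index grows
```

On the real data, the same function could be applied to the id (or Unnamed:_0) column and the Presence column after loading the ARFF file.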

27 features

RatingsBreakdown (string): 4806 unique values, 15206 missing
Tags (string): 24866 unique values, 205 missing
Description (string): 27271 unique values, 219 missing
Publisher (numeric): 0 unique values, 30250 missing
Achievements (numeric): 448 unique values, 94 missing
ESRB (string): 6 unique values, 25503 missing
Languages (string): 3196 unique values, 223 missing
Controller (numeric): 2 unique values, 274 missing
Players (string): 29 unique values, 17916 missing
DiscountedCost (string): 121 unique values, 29523 missing
OriginalCost (string): 395 unique values, 747 missing
Franchise (string): 1843 unique values, 25193 missing
Soundtrack (numeric): 2 unique values, 205 missing
ReleaseDate (string): 4133 unique values, 3226 missing
Unnamed:_0 (numeric): 30250 unique values, 0 missing
Memory (string): 693 unique values, 1934 missing
Storage (string): 2065 unique values, 2759 missing
Graphics (string): 10443 unique values, 4323 missing
Platform (string): 2041 unique values, 127 missing
Presence (numeric): 7419 unique values, 94 missing
Indie (numeric): 2 unique values, 205 missing
Genres (string): 1007 unique values, 2968 missing
Metacritic (numeric): 71 unique values, 26894 missing
SteamURL (string): 29298 unique values, 55 missing
RawgID (numeric): 27407 unique values, 94 missing
Name (string): 27407 unique values, 94 missing
id (numeric): 30250 unique values, 0 missing
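The per-feature missing counts above are easier to compare as a share of the 30250 rows. The following sketch copies the counts from the feature list and flags features that are mostly empty. The 50% cutoff is an arbitrary assumption chosen for illustration, not something the dataset page prescribes.

```python
ROWS = 30250  # number of instances reported for the dataset

# Missing-value counts per feature, copied from the feature list.
MISSING = {
    "RatingsBreakdown": 15206, "Tags": 205, "Description": 219,
    "Publisher": 30250, "Achievements": 94, "ESRB": 25503,
    "Languages": 223, "Controller": 274, "Players": 17916,
    "DiscountedCost": 29523, "OriginalCost": 747, "Franchise": 25193,
    "Soundtrack": 205, "ReleaseDate": 3226, "Unnamed:_0": 0,
    "Memory": 1934, "Storage": 2759, "Graphics": 4323,
    "Platform": 127, "Presence": 94, "Indie": 205,
    "Genres": 2968, "Metacritic": 26894, "SteamURL": 55,
    "RawgID": 94, "Name": 94, "id": 0,
}

# Total missing cells and the percentage of rows missing for each feature.
total_missing = sum(MISSING.values())
share = {name: 100 * count / ROWS for name, count in MISSING.items()}

# Assumed cutoff: treat a feature as "mostly empty" above 50% missing.
CUTOFF = 50.0
mostly_empty = sorted(
    (name for name, pct in share.items() if pct > CUTOFF),
    key=lambda name: -share[name],
)

print("total missing cells:", total_missing)
for name in mostly_empty:
    print(f"{name}: {share[name]:.1f}% missing")
```

Publisher is missing in every row, which is consistent with the page reporting that all 30250 instances have at least one missing value.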

19 properties

Number of instances (rows) of the dataset: 30250
Number of attributes (columns) of the dataset: 27
Number of distinct values of the target attribute (if it is nominal): (no value)
Number of missing values in the dataset: 188331
Number of instances with at least one value missing: 30250
Number of numeric attributes: 10
Number of nominal attributes: 0
Percentage of nominal attributes: 0
Average class difference between consecutive instances: (no value)
Percentage of numeric attributes: 37.04
Percentage of missing values: 23.06
Percentage of instances having missing values: 100
Percentage of binary attributes: 0
Number of binary attributes: 0
Number of instances belonging to the least frequent class: (no value)
Percentage of instances belonging to the least frequent class: (no value)
Number of instances belonging to the most frequent class: (no value)
Percentage of instances belonging to the most frequent class: (no value)
Number of attributes divided by the number of instances: 0
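The percentage properties follow directly from the raw counts on this page. As a quick check, the sketch below recomputes them. The variable names are illustrative, and rounding to two decimals is an assumption about how the page displays its values.

```python
rows, cols = 30250, 27      # instances and attributes
numeric_cols = 10           # numeric attributes
missing_cells = 188331      # missing values in the whole dataset

# Share of numeric attributes among all attributes.
pct_numeric = round(100 * numeric_cols / cols, 2)

# Share of missing cells among all rows * cols cells.
pct_missing = round(100 * missing_cells / (rows * cols), 2)

# Attributes per instance; small enough to display as 0 at two decimals.
dimensionality = cols / rows

print(pct_numeric, pct_missing, round(dimensionality, 2))
```

Both percentages match the reported 37.04 and 23.06, and the attribute-to-instance ratio is below 0.001, which would explain the displayed 0.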

0 tasks
