DEVELOPMENT... OpenML
Study
Ensembles of classifiers are among the best performing classifiers available in many data mining applications. However, most ensembles developed specifically for the dynamic data stream setting rely…
0 datasets, 62 tasks, 13 flows, 805 runs
Benchmarking in machine learning is often much more difficult than it seems, and hard to reproduce. This study is a new approach to collaborative, in-depth benchmarking of algorithms, and allows…
100 datasets, 100 tasks, 0 flows, 0 runs
Feature selection can be of value to classification for a variety of reasons. Real-world data sets can be rife with irrelevant features, especially if the data was not gathered specifically for the…
394 datasets, 394 tasks, 24 flows, 9454 runs
Ensembles of classifiers are among the best performing classifiers available in many data mining applications. Rather than training one classifier, multiple classifiers are trained, and their…
60 datasets, 60 tasks, 8 flows, 4002 runs
A subgroup discovery study.
0 datasets, 3600 tasks, 4 flows, 0 runs
This study joins multiple data stream studies.
0 datasets, 0 tasks, 0 flows, 0 runs
All datasets, tasks, flows and setups used for Chapter 6 in the PhD Thesis "Massively Collaborative Machine Learning"
105 datasets, 105 tasks, 27 flows, 0 runs
Authors: Salisu Mamman Abdulrahman, Pavel Brazdil, Jan N. van Rijn, Joaquin Vanschoren. Abstract: Algorithm selection methods can be sped up substantially by incorporating multi-objective measures…
39 datasets, 39 tasks, 53 flows, 9627 runs
Contains all datasets, tasks, flows and runs used in the ASLib OpenML Scenario.
442 datasets, 441 tasks, 63 flows, 0 runs
With the advent of automated machine learning, automated hyperparameter optimization methods are by now routinely used. However, this progress is not yet matched by equal progress on automatic…
0 datasets, 0 tasks, 0 flows, 164911 runs
We advocate the use of curated, comprehensive benchmark suites of machine learning datasets, backed by standardized OpenML-based interfaces and complementary software toolkits written in Python, Java…
72 datasets, 72 tasks, 0 flows, 0 runs
Comparison of linear and non-linear models. [Jupyter Notebook](https://github.com/janvanrijn/linear-vs-non-linear/blob/master/notebook/Linear-vs-Non-Linear.ipynb)
299 datasets, 299 tasks, 5 flows, 1693 runs
Contains currency trading tasks for various currency pairs.
192 datasets, 192 tasks, 0 flows, 0 runs
Subset of the OpenML100, with datasets that are friendly towards scikit-learn algorithms (no imputation or one-hot encoding necessary).
54 datasets, 54 tasks, 0 flows, 0 runs