Dataset used in the tabular data benchmark https://github.com/LeoGrin/tabular-benchmark, transformed in the same way. This dataset belongs to the "regression on both numerical and categorical features" benchmark.
Original link: https://openml.org/d/42496
Original description:
Author: City of Seattle
Source: https://data.seattle.gov/Public-Safety/Crime-Data/4fs7-3vj5 - 24-06-2019
Please cite:
This data represents crime reported to the Seattle Police Department (SPD). Each row contains the record of a unique event where at least one criminal offense was reported by a member of the community or detected by an officer in the field. This data is the same data used in meetings such as SeaStat (https://www.seattle.gov/police/information-and-data/seastat) for strategic planning, accountability and performance management.
For more information see:
https://data.seattle.gov/Public-Safety/Crime-Data/4fs7-3vj5
For this version, the data was downsampled to 10 percent and a new target, Reported_Time, was computed. New date features were derived, some features were ignored, and features were encoded as factor variables.
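
The preprocessing steps listed above (downsampling to 10 percent, computing a target, deriving date features, ignoring features, factor encoding) could look roughly like the sketch below. It is not the code used to build this dataset. The input records, the column names (`report_number`, `reported_at`, `crime_category`, `precinct`), the choice of derived date features, and the definition of `Reported_Time` as minutes since midnight are all assumptions made for illustration.

```python
import random
from datetime import datetime

# Hypothetical raw records: column names and values are illustrative only
# and are not taken from the actual Seattle crime data.
raw_rows = [
    {
        "report_number": f"R{i:04d}",
        "reported_at": datetime(2019, 1 + i % 12, 1 + i % 28, i % 24, (i * 7) % 60),
        "crime_category": ["THEFT", "BURGLARY", "ASSAULT"][i % 3],
        "precinct": ["NORTH", "SOUTH", "EAST", "WEST"][i % 4],
    }
    for i in range(200)
]


def downsample(rows, fraction, seed=0):
    """Draw a reproducible random sample containing `fraction` of the rows."""
    rng = random.Random(seed)
    return rng.sample(rows, round(len(rows) * fraction))


def encode_factor(values):
    """Map each distinct value to an integer code, like a factor variable."""
    levels = sorted(set(values))
    code_of = {level: code for code, level in enumerate(levels)}
    return [code_of[v] for v in values], levels


def preprocess(rows, fraction=0.10, seed=0):
    processed = []
    for row in downsample(rows, fraction, seed):
        ts = row["reported_at"]
        processed.append({
            # Assumed target definition: minutes since midnight of the report.
            "Reported_Time": ts.hour * 60 + ts.minute,
            # Derived date features (assumed selection).
            "reported_year": ts.year,
            "reported_month": ts.month,
            "reported_weekday": ts.weekday(),
            "crime_category": row["crime_category"],
            "precinct": row["precinct"],
            # "report_number" is ignored: it is not copied into the output.
        })
    # Replace categorical strings with integer factor codes.
    for column in ("crime_category", "precinct"):
        codes, _levels = encode_factor([r[column] for r in processed])
        for record, code in zip(processed, codes):
            record[column] = code
    return processed


processed = preprocess(raw_rows)
print(len(processed), sorted(processed[0]))
```

Fixing the random seed keeps the 10 percent sample reproducible, which matters when a downsampled dataset is meant to serve as a shared benchmark.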