
nyc_taxi_green

Status: active; Format: ARFF; Publicly available; Visibility: public; Uploaded 22-12-2022 by Shirley


Data Description

The dataset contains green taxi trips recorded by the New York City Taxi and Limousine Commission (TLC) in December 2016. All trips were paid by credit card and left a tip. The variable 'tip_amount' was chosen as the target variable.

Attribute Description

1. *VendorID* - A code indicating the LPEP provider that provided the record. 1: Creative Mobile Technologies, LLC; 2: VeriFone Inc.
2. *store_and_fwd_flag*
3. *RatecodeID*
4. *PULocationID*
5. *DOLocationID* - TLC Taxi Zone in which the taximeter was disengaged.
6. *passenger_count* - The number of passengers in the vehicle. This is a driver-entered value.
7. *extra* - Miscellaneous extras and surcharges. Currently, this only includes the $0.50 and $1 rush hour and overnight charges.
8. *mta_tax* - $0.50 MTA tax that is automatically triggered based on the metered rate in use.
9. *tip_amount* - Target feature.
10. *tolls_amount*
11. *improvement_surcharge* - $0.30 improvement surcharge assessed on hailed trips at the flag drop.
12. *total_amount*
13. *trip_type* - 1: Street-hail, 2: Dispatch.
14. *lpep_pickup_datetime_day*
15. *lpep_pickup_datetime_hour*
16. *lpep_pickup_datetime_minute*
17. *lpep_dropoff_datetime_day*
18. *lpep_dropoff_datetime_hour*
19. *lpep_dropoff_datetime_minute*
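The pickup and drop-off timestamps are stored only as separate day, hour, and minute columns, so a trip duration has to be reassembled from them. The following is a minimal sketch of one way to do that, not part of the dataset itself. The helper names `minute_of_month` and `trip_duration_minutes` are hypothetical, and the wrap-around rule assumes that a drop-off earlier than its pickup means the trip ended on 1 January, since all pickups are in December 2016. Durations are only accurate to the minute because no seconds column is listed.

```python
MINUTES_PER_DAY = 24 * 60
DAYS_IN_DECEMBER = 31  # assumption: every pickup falls in December 2016


def minute_of_month(day: int, hour: int, minute: int) -> int:
    """Minutes elapsed since 00:00 on day 1 of the month."""
    return (day - 1) * MINUTES_PER_DAY + hour * 60 + minute


def trip_duration_minutes(pu_day: int, pu_hour: int, pu_minute: int,
                          do_day: int, do_hour: int, do_minute: int) -> int:
    """Approximate trip duration in whole minutes.

    Arguments mirror the lpep_pickup_datetime_* and
    lpep_dropoff_datetime_* columns.
    """
    start = minute_of_month(pu_day, pu_hour, pu_minute)
    end = minute_of_month(do_day, do_hour, do_minute)
    duration = end - start
    if duration < 0:
        # Assumption: the drop-off rolled over into the next month.
        duration += DAYS_IN_DECEMBER * MINUTES_PER_DAY
    return duration


# A trip picked up on the 5th at 23:50 and dropped off on the 6th at 00:10.
example = trip_duration_minutes(5, 23, 50, 6, 0, 10)
```

A derived duration column like this would sit alongside the 19 listed features; whether it helps predict 'tip_amount' is an open question for the modeller.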

19 features

tip_amount (target): numeric, 1811 unique values, 0 missing
tolls_amount: numeric, 105 unique values, 0 missing
lpep_dropoff_datetime_minute: numeric, 60 unique values, 0 missing
lpep_dropoff_datetime_hour: numeric, 24 unique values, 0 missing
lpep_dropoff_datetime_day: numeric, 31 unique values, 0 missing
lpep_pickup_datetime_minute: numeric, 60 unique values, 0 missing
lpep_pickup_datetime_hour: numeric, 24 unique values, 0 missing
lpep_pickup_datetime_day: numeric, 31 unique values, 0 missing
trip_type: nominal, 2 unique values, 0 missing
total_amount: numeric, 5377 unique values, 0 missing
improvement_surcharge: nominal, 3 unique values, 0 missing
VendorID: nominal, 2 unique values, 0 missing
mta_tax: nominal, 3 unique values, 0 missing
extra: nominal, 5 unique values, 0 missing
passenger_count: numeric, 10 unique values, 0 missing
DOLocationID: nominal, 259 unique values, 0 missing
PULocationID: nominal, 233 unique values, 0 missing
RatecodeID: nominal, 5 unique values, 0 missing
store_and_fwd_flag: nominal, 2 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 581835
Number of attributes (columns) of the dataset: 19
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 10
Number of nominal attributes: 9
Percentage of nominal attributes: 47.37
Average class difference between consecutive instances: -1.12
Percentage of numeric attributes: 52.63
Percentage of missing values: 0
Percentage of instances having missing values: 0
Percentage of binary attributes: 15.79
Number of binary attributes: 3
Number of instances belonging to the least frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the most frequent class: (no value listed)
Number of attributes divided by the number of instances: 0
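The percentage properties follow directly from the counts reported on this page. As a quick consistency check, the sketch below recomputes them from the raw counts. The variable names and the `pct` helper are illustrative, and rounding to two decimals is an assumption about how the page displays the values.

```python
# Counts as reported in the dataset properties.
n_instances = 581835
n_attributes = 19
n_numeric = 10
n_nominal = 9
n_binary = 3


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


pct_nominal = pct(n_nominal, n_attributes)
pct_numeric = pct(n_numeric, n_attributes)
pct_binary = pct(n_binary, n_attributes)

# Attributes per instance; tiny for a dataset this tall, so it
# displays as 0 once rounded to two decimals.
dimensionality = n_attributes / n_instances
```

The recomputed values agree with the listed 47.37, 52.63, and 15.79, and the numeric and nominal counts sum to the 19 attributes.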

1 task

0 runs - estimation_procedure: 33% Holdout set - target_feature: tip_amount
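To make the 33% holdout procedure concrete, here is a minimal sketch of a shuffled holdout split over row indices using only the standard library. It is not OpenML's own split logic: the function name `holdout_split`, the seed, and the rounding of the test-set size are assumptions for illustration, and a task's official split may assign rows differently.

```python
import random


def holdout_split(n_rows: int, test_fraction: float = 0.33, seed: int = 0):
    """Return (train_indices, test_indices) for a shuffled holdout split."""
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


# Split the 581835 rows reported for this dataset.
train_idx, test_idx = holdout_split(581835)
```

The model would then be fitted on the rows in `train_idx` and evaluated on 'tip_amount' predictions for the rows in `test_idx`.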