OpenML

QSAR-DATASET-FOR-DRUG-TARGET-CHEMBL3116

Status: deactivated. Format: ARFF. Publicly available. Visibility: public. Uploaded 15-07-2016 by unknown.

This dataset contains QSAR data from ChEMBL version 17: activity values (in pseudo-pCI50 units) of several compounds against the drug target with ChEMBL_ID CHEMBL3116 (TID: 11626). It has 507 rows and 68 descriptor features, not counting the molecule identifier (molecule_id) and the target (pXC50). The features are molecular descriptors generated from SMILES strings. Missing values were imputed with the per-feature median, and feature selection was also applied.
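
The median imputation mentioned above can be illustrated with a minimal sketch. The function name `impute_median` and the toy values are hypothetical and are not taken from the dataset or its original preprocessing pipeline; the sketch only assumes that each missing entry is replaced by the median of the observed values in the same column.

```python
from statistics import median

def impute_median(rows):
    """Replace None entries in each column with that column's median."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        # Median is computed from the observed (non-missing) values only.
        observed = [row[j] for row in rows if row[j] is not None]
        medians.append(median(observed))
    return [
        [medians[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]

# Toy descriptor matrix (made-up values, not from the dataset).
toy = [
    [1.0, 10.0],
    [None, 20.0],
    [3.0, None],
    [5.0, 40.0],
]
filled = impute_median(toy)
print(filled)
```

This is consistent with the feature list on this page, where every column reports 0 missing values after preprocessing.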

70 features

pXC50 (target)	numeric	107 unique values 0 missing
SRW04	numeric	124 unique values 0 missing
ZM1	numeric	87 unique values 0 missing
ATS8p	numeric	420 unique values 0 missing
Eta_L	numeric	466 unique values 0 missing
SM02_AEA.bo.	numeric	290 unique values 0 missing
Eig10_AEA.bo.	numeric	338 unique values 0 missing
Eig12_EA.bo.	numeric	331 unique values 0 missing
Eig10_AEA.ri.	numeric	382 unique values 0 missing
SRW02	numeric	41 unique values 0 missing
nBO	numeric	41 unique values 0 missing
MWC01	numeric	41 unique values 0 missing
MPC01	numeric	41 unique values 0 missing
SM14_AEA.bo.	numeric	352 unique values 0 missing
Eig06_EA	numeric	352 unique values 0 missing
SpMax3_Bh.m.	numeric	323 unique values 0 missing
X1Mad	numeric	459 unique values 0 missing
SpAD_EA	numeric	440 unique values 0 missing
ATS8i	numeric	428 unique values 0 missing
SM07_AEA.ri.	numeric	357 unique values 0 missing
Eig12_EA.ed.	numeric	357 unique values 0 missing
ATS3i	numeric	395 unique values 0 missing
ZM2Mad	numeric	474 unique values 0 missing
Eig14_AEA.ed.	numeric	325 unique values 0 missing
SpAD_EA.ri.	numeric	488 unique values 0 missing
Eig05_AEA.bo.	numeric	346 unique values 0 missing
Eig12_EA.ri.	numeric	374 unique values 0 missing
SpMax7_Bh.i.	numeric	324 unique values 0 missing
SpMin3_Bh.m.	numeric	307 unique values 0 missing
SpMax3_Bh.i.	numeric	312 unique values 0 missing
ATS6v	numeric	432 unique values 0 missing
SpMin3_Bh.v.	numeric	302 unique values 0 missing
ATS8e	numeric	432 unique values 0 missing
X3	numeric	427 unique values 0 missing
ATS8v	numeric	423 unique values 0 missing
SM14_AEA.dm.	numeric	416 unique values 0 missing
Eig10_EA.bo.	numeric	355 unique values 0 missing
Eig10_EA	numeric	332 unique values 0 missing
SpMax4_Bh.m.	numeric	373 unique values 0 missing
X1Per	numeric	470 unique values 0 missing
SpMin3_Bh.e.	numeric	305 unique values 0 missing
X1Kup	numeric	460 unique values 0 missing
ATS6m	numeric	418 unique values 0 missing
ATS7v	numeric	429 unique values 0 missing
SM04_AEA.dm.	numeric	332 unique values 0 missing
Eig05_EA.ed.	numeric	416 unique values 0 missing
ATS7m	numeric	430 unique values 0 missing
X2sol	numeric	433 unique values 0 missing
Psi_i_1	numeric	462 unique values 0 missing
SpMax3_Bh.v.	numeric	322 unique values 0 missing
Eig10_EA.ri.	numeric	373 unique values 0 missing
SpMax3_Bh.p.	numeric	339 unique values 0 missing
X1v	numeric	459 unique values 0 missing
Chi1_AEA.ri.	numeric	411 unique values 0 missing
SpAD_AEA.ri.	numeric	488 unique values 0 missing
MWC02	numeric	87 unique values 0 missing
X2v	numeric	468 unique values 0 missing
ATS2p	numeric	378 unique values 0 missing
SpMax3_Bh.e.	numeric	332 unique values 0 missing
SpAD_AEA.bo.	numeric	445 unique values 0 missing
ATS1p	numeric	381 unique values 0 missing
Chi1_EA	numeric	411 unique values 0 missing
molecule_id (row identifier)	nominal	507 unique values 0 missing
Chi1_AEA.ed.	numeric	411 unique values 0 missing
Chi1_AEA.dm.	numeric	411 unique values 0 missing
Chi1_AEA.bo.	numeric	411 unique values 0 missing
X3sol	numeric	430 unique values 0 missing
SpMax7_Bh.v.	numeric	346 unique values 0 missing
X1MulPer	numeric	463 unique values 0 missing
ATS2v	numeric	379 unique values 0 missing

62 properties

NumberOfInstances

507

Number of instances (rows) of the dataset.

NumberOfFeatures

Number of attributes (columns) of the dataset.

NumberOfClasses

Number of distinct values of the target attribute (if it is nominal).

NumberOfMissingValues

Number of missing values in the dataset.

NumberOfInstancesWithMissingValues

Number of instances with at least one value missing.

NumberOfNumericFeatures

Number of numeric attributes.

NumberOfSymbolicFeatures

Number of nominal attributes.

Quartile1MutualInformation

First quartile of mutual information between the nominal attributes and the target attribute.

PercentageOfBinaryFeatures

Percentage of binary attributes.

PercentageOfInstancesWithMissingValues

Percentage of instances having missing values.

PercentageOfMissingValues

Percentage of missing values.

PercentageOfNumericFeatures

98.57

Percentage of numeric attributes.

PercentageOfSymbolicFeatures

1.43

Percentage of nominal attributes.
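
As a quick check, the two percentages above follow from the feature list on this page: 70 columns in total, of which one (molecule_id) is nominal. The variable names below are illustrative only.

```python
n_features = 70                      # total columns listed on this page
n_nominal = 1                        # molecule_id is the only nominal column
n_numeric = n_features - n_nominal   # 69 numeric columns, including pXC50

pct_numeric = round(100 * n_numeric / n_features, 2)
pct_nominal = round(100 * n_nominal / n_features, 2)
print(pct_numeric, pct_nominal)      # 98.57 1.43
```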

Quartile1AttributeEntropy

First quartile of entropy among attributes.

Quartile1KurtosisOfNumericAtts

0.09

First quartile of kurtosis among attributes of the numeric type.

Quartile1MeansOfNumericAtts

2.84

First quartile of means among attributes of the numeric type.

StdvNominalAttDistinctValues

Standard deviation of the number of distinct values among attributes of the nominal type.

Quartile1SkewnessOfNumericAtts

-1.3

First quartile of skewness among attributes of the numeric type.

Quartile1StdDevOfNumericAtts

0.31

First quartile of standard deviation of attributes of the numeric type.

Quartile2AttributeEntropy

Second quartile (Median) of entropy among attributes.

Quartile2KurtosisOfNumericAtts

0.4

Second quartile (Median) of kurtosis among attributes of the numeric type.

Quartile2MeansOfNumericAtts

3.78

Second quartile (Median) of means among attributes of the numeric type.

Quartile2MutualInformation

Second quartile (Median) of mutual information between the nominal attributes and the target attribute.

Quartile2SkewnessOfNumericAtts

-0.58

Second quartile (Median) of skewness among attributes of the numeric type.

Quartile2StdDevOfNumericAtts

0.88

Second quartile (Median) of standard deviation of attributes of the numeric type.

Quartile3AttributeEntropy

Third quartile of entropy among attributes.

Quartile3KurtosisOfNumericAtts

2.13

Third quartile of kurtosis among attributes of the numeric type.

Quartile3MeansOfNumericAtts

9.17

Third quartile of means among attributes of the numeric type.
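
The "quartile of means" properties can be read as a two-step computation: take the mean of each numeric attribute, then take quartiles across those means. The sketch below uses made-up columns and Python's `statistics.quantiles` with the `inclusive` method; the quartile convention OpenML actually uses is not stated on this page, so the exact interpolation is an assumption.

```python
from statistics import mean, quantiles

# Hypothetical numeric columns (not values from the dataset).
columns = {
    "d1": [0.0, 2.0],
    "d2": [1.0, 3.0],
    "d3": [2.0, 4.0],
    "d4": [3.0, 5.0],
    "d5": [4.0, 6.0],
}

# Step 1: one mean per numeric attribute.
means = [mean(values) for values in columns.values()]

# Step 2: quartiles across the per-attribute means.
q1, q2, q3 = quantiles(means, n=4, method="inclusive")
print(q1, q2, q3)
```

The same pattern applies to the kurtosis, skewness, and standard deviation quartiles, with the per-attribute statistic swapped in step 1.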

Quartile3MutualInformation

Third quartile of mutual information between the nominal attributes and the target attribute.

Quartile3SkewnessOfNumericAtts

0.26

Third quartile of skewness among attributes of the numeric type.

Quartile3StdDevOfNumericAtts

2.66

Third quartile of standard deviation of attributes of the numeric type.

AutoCorrelation

-0.09

Average class difference between consecutive instances.

MeanMeansOfNumericAtts

12.16

Mean of means among attributes of the numeric type.

ClassEntropy

Entropy of the target attribute values.

Dimensionality

0.14

Number of attributes divided by the number of instances.
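
The reported value can be reproduced from the counts on this page, assuming all 70 listed columns are counted as attributes:

```python
n_attributes = 70   # "70 features" on this page, including molecule_id and pXC50
n_instances = 507   # NumberOfInstances

dimensionality = round(n_attributes / n_instances, 2)
print(dimensionality)  # 0.14
```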

EquivalentNumberOfAtts

Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
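
A minimal sketch of this ratio on toy nominal data follows. No value is listed for this property here, which is expected if the target (pXC50) is numeric and class entropy is undefined. The helper names and the toy data are hypothetical; entropy is computed in bits, although the ratio does not depend on the logarithm base.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a sequence of nominal values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two nominal sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Toy class labels and two nominal attributes (made-up data).
labels = ["a", "a", "b", "b"]
attrs = [
    ["u", "u", "v", "v"],  # determines the label exactly
    ["p", "q", "p", "q"],  # independent of the label
]

class_entropy = entropy(labels)
mean_mi = sum(mutual_information(a, labels) for a in attrs) / len(attrs)
equivalent_number_of_atts = class_entropy / mean_mi
print(equivalent_number_of_atts)
```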

MajorityClassPercentage

Percentage of instances belonging to the most frequent class.

MajorityClassSize

Number of instances belonging to the most frequent class.

MaxAttributeEntropy

Maximum entropy among attributes.

MaxKurtosisOfNumericAtts

46.67

Maximum kurtosis among attributes of the numeric type.

MaxMeansOfNumericAtts

190.28

Maximum of means among attributes of the numeric type.

MaxMutualInformation

Maximum mutual information between the nominal attributes and the target attribute.

MaxNominalAttDistinctValues

The maximum number of distinct values among attributes of the nominal type.

MaxSkewnessOfNumericAtts

3.16

Maximum skewness among attributes of the numeric type.

MaxStdDevOfNumericAtts

55.42

Maximum standard deviation of attributes of the numeric type.

MeanAttributeEntropy

Average entropy of the attributes.

MeanKurtosisOfNumericAtts

2.41

Mean kurtosis among attributes of the numeric type.

NumberOfBinaryFeatures

Number of binary attributes.

MeanMutualInformation

Average mutual information between the nominal attributes and the target attribute.

MeanNoiseToSignalRatio

An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
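
The formula above translates directly into code. The function name and the input values below are illustrative assumptions, not values from this dataset (none are listed for it on this page).

```python
def mean_noise_to_signal_ratio(mean_attribute_entropy, mean_mutual_information):
    """(MeanAttributeEntropy - MeanMutualInformation) / MeanMutualInformation."""
    if mean_mutual_information == 0:
        # The ratio is undefined when attributes carry no information on the class.
        raise ValueError("mean mutual information must be non-zero")
    return (mean_attribute_entropy - mean_mutual_information) / mean_mutual_information

# Illustrative inputs: 1.0 bit of attribute entropy, 0.5 bit shared with the class.
ratio = mean_noise_to_signal_ratio(1.0, 0.5)
print(ratio)  # 1.0
```

A ratio of 1.0 in this toy case means the attributes carry as much irrelevant information as relevant information about the class.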

MeanNominalAttDistinctValues

Average number of distinct values among the attributes of the nominal type.

MeanSkewnessOfNumericAtts

-0.56

Mean skewness among attributes of the numeric type.

MeanStdDevOfNumericAtts

3.35

Mean standard deviation of attributes of the numeric type.

MinAttributeEntropy

Minimal entropy among attributes.

MinKurtosisOfNumericAtts

-0.48