QSAR-DATASET-FOR-DRUG-TARGET-CHEMBL5102

Deactivated. Format: ARFF. Publicly available. Visibility: public. Uploaded 15-07-2016 by unknown.

This dataset contains QSAR data (from ChEMBL version 17) showing activity values (in pseudo-pCI50 units) of several compounds against the drug target with ChEMBL ID CHEMBL5102 (TID: 11978). It has 1102 rows and 205 features, not counting the molecule identifier (molecule_id) and the target feature (pXC50). The features are molecular descriptors generated from SMILES strings. Missing values were imputed with the median, and feature selection was also applied.
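The median imputation mentioned above can be illustrated with a minimal sketch. This is not the pipeline that produced the dataset: the function name `impute_median`, the use of `None` for a missing entry, and the toy descriptor values are all assumptions made for illustration.

```python
import statistics

def impute_median(rows):
    """Replace None entries in each column with that column's median.

    Hypothetical helper: the median is computed from the observed
    (non-None) values of the same column only.
    """
    n_cols = len(rows[0])
    medians = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        medians.append(statistics.median(observed))
    return [
        [medians[c] if row[c] is None else row[c] for c in range(n_cols)]
        for row in rows
    ]

# Toy descriptor matrix (made-up values, two descriptors, four molecules).
descriptors = [
    [1.0, 10.0],
    [None, 30.0],
    [3.0, None],
    [8.0, 20.0],
]
imputed = impute_median(descriptors)
```

After imputation every column is complete, which is consistent with the "0 missing" reported for each feature of this dataset.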

207 features

Feature | Type | Unique values | Missing
pXC50 (target) | numeric | 710 | 0
molecule_id (row identifier) | nominal | 1102 | 0
SIC3 | numeric | 222 | 0
SIC4 | numeric | 200 | 0
CIC4 | numeric | 478 | 0
nCsp2 | numeric | 19 | 0
CIC2 | numeric | 593 | 0
SIC2 | numeric | 213 | 0
Eig13_EA.bo. | numeric | 540 | 0
Eig13_AEA.bo. | numeric | 466 | 0
Eig09_AEA.bo. | numeric | 367 | 0
Eig12_AEA.bo. | numeric | 435 | 0
Eig11_AEA.bo. | numeric | 442 | 0
Eig12_EA.bo. | numeric | 490 | 0
Eig11_EA.bo. | numeric | 458 | 0
piPC05 | numeric | 526 | 0
Ui | numeric | 21 | 0
piPC06 | numeric | 582 | 0
Eig10_AEA.ed. | numeric | 416 | 0
Eig11_EA.ed. | numeric | 476 | 0
SM06_AEA.ri. | numeric | 476 | 0
DLS_04 | numeric | 9 | 0
IC1 | numeric | 552 | 0
piPC08 | numeric | 613 | 0
piPC07 | numeric | 599 | 0
Eig11_AEA.ed. | numeric | 442 | 0
X3 | numeric | 624 | 0
Eig11_AEA.dm. | numeric | 458 | 0
Eig12_AEA.dm. | numeric | 502 | 0
SM02_AEA.bo. | numeric | 261 | 0
DLS_cons | numeric | 45 | 0
ATS2m | numeric | 582 | 0
Eig12_AEA.ed. | numeric | 427 | 0
RDSQ | numeric | 653 | 0
IC2 | numeric | 610 | 0
piID | numeric | 577 | 0
SpAD_AEA.ed. | numeric | 649 | 0
ATS4m | numeric | 685 | 0
piPC09 | numeric | 626 | 0
Eig09_EA.ed. | numeric | 462 | 0
SM04_AEA.ri. | numeric | 462 | 0
Eig07_AEA.bo. | numeric | 431 | 0
X3sol | numeric | 686 | 0
NNRS | numeric | 12 | 0
Eig09_AEA.dm. | numeric | 529 | 0
SpAD_AEA.bo. | numeric | 734 | 0
IDDE | numeric | 341 | 0
SpAD_AEA.ri. | numeric | 1010 | 0
SpAD_EA | numeric | 651 | 0
ATS7m | numeric | 743 | 0
Eig11_AEA.ri. | numeric | 536 | 0
CATS2D_07_DA | numeric | 7 | 0
Eig08_AEA.dm. | numeric | 528 | 0
MWC02 | numeric | 96 | 0
ZM1 | numeric | 96 | 0
MW | numeric | 759 | 0
GMTIV | numeric | 945 | 0
SRW04 | numeric | 133 | 0
Eig11_EA.ri. | numeric | 538 | 0
Xindex | numeric | 199 | 0
IDDM | numeric | 246 | 0
Chi1_EA.ri. | numeric | 964 | 0
ATS8m | numeric | 739 | 0
Eig11_EA | numeric | 406 | 0
SM05_AEA.dm. | numeric | 406 | 0
SpMax6_Bh.m. | numeric | 484 | 0
BID | numeric | 90 | 0
ATSC4e | numeric | 742 | 0
Eig13_EA | numeric | 477 | 0
SM07_AEA.dm. | numeric | 477 | 0
Eig12_AEA.ri. | numeric | 639 | 0
SM02_EA.ri. | numeric | 523 | 0
X0 | numeric | 344 | 0
Eig08_AEA.ri. | numeric | 522 | 0
ATSC4s | numeric | 1060 | 0
SpMax2_Bh.m. | numeric | 267 | 0
X2 | numeric | 598 | 0
GGI8 | numeric | 321 | 0
LPRS | numeric | 654 | 0
nSK | numeric | 37 | 0
BIC3 | numeric | 201 | 0
Eig07_EA | numeric | 430 | 0
SM15_AEA.bo. | numeric | 430 | 0
MWC09 | numeric | 521 | 0
TWC | numeric | 523 | 0
ATS6m | numeric | 712 | 0
IDE | numeric | 412 | 0
MSD | numeric | 588 | 0
Eig12_EA.ri. | numeric | 590 | 0
Psi_e_0 | numeric | 954 | 0
Vindex | numeric | 142 | 0
IDET | numeric | 655 | 0
MWC08 | numeric | 506 | 0
SMTIV | numeric | 943 | 0
Eta_sh_y | numeric | 210 | 0
Eig07_AEA.ri. | numeric | 564 | 0
SIC5 | numeric | 198 | 0
MWC07 | numeric | 506 | 0
MATS2i | numeric | 430 | 0
DELS | numeric | 1054 | 0
JGI8 | numeric | 18 | 0
ZM1Mad | numeric | 891 | 0
XMOD | numeric | 900 | 0
MWC10 | numeric | 527 | 0
Chi1_EA.bo. | numeric | 683 | 0
Eig06_EA | numeric | 460 | 0
SM14_AEA.bo. | numeric | 460 | 0
Eig05_EA.dm. | numeric | 47 | 0
Xu | numeric | 640 | 0
Eig13_AEA.ri. | numeric | 676 | 0
Eig13_EA.ri. | numeric | 643 | 0
SM04_EA.bo. | numeric | 459 | 0
CENT | numeric | 573 | 0
Eig09_EA.bo. | numeric | 375 | 0
PCD | numeric | 559 | 0
Eig10_EA.bo. | numeric | 421 | 0
MWC06 | numeric | 491 | 0
Eig10_AEA.bo. | numeric | 406 | 0
GGI9 | numeric | 336 | 0
Psi_i_0 | numeric | 823 | 0
SM06_EA.bo. | numeric | 494 | 0
GATS8v | numeric | 447 | 0
JGI7 | numeric | 25 | 0
nR06 | numeric | 7 | 0
PW5 | numeric | 60 | 0
D.Dtr09 | numeric | 186 | 0
TRS | numeric | 28 | 0
SMTI | numeric | 645 | 0
HVcpx | numeric | 385 | 0
ICR | numeric | 345 | 0
SpMin2_Bh.p. | numeric | 144 | 0
Psi_i_s | numeric | 641 | 0
ON1 | numeric | 270 | 0
GATS2p | numeric | 411 | 0
SRW08 | numeric | 461 | 0
IDMT | numeric | 656 | 0
ZM2V | numeric | 396 | 0
D.Dtr10 | numeric | 102 | 0
ZM2MulPer | numeric | 939 | 0
C. | numeric | 128 | 0
C.016 | numeric | 5 | 0
Eig03_AEA.dm. | numeric | 394 | 0
IVDE | numeric | 255 | 0
JGT | numeric | 289 | 0
nR.Cs | numeric | 5 | 0
Eta_alpha | numeric | 550 | 0
Eig13_AEA.dm. | numeric | 533 | 0
ATS7v | numeric | 704 | 0
piPC04 | numeric | 444 | 0
Eta_epsi | numeric | 695 | 0
Psi_e_1 | numeric | 924 | 0
Eta_B_A | numeric | 43 | 0
NdsCH | numeric | 5 | 0
P_VSA_s_3 | numeric | 688 | 0
X5sol | numeric | 671 | 0
JGI9 | numeric | 17 | 0
AECC | numeric | 520 | 0
S2K | numeric | 727 | 0
PCR | numeric | 296 | 0
Eta_betaS | numeric | 88 | 0
Eig07_AEA.dm. | numeric | 530 | 0
BIC4 | numeric | 184 | 0
ATS3m | numeric | 622 | 0
SpMin1_Bh.p. | numeric | 140 | 0
SpMax7_Bh.m. | numeric | 472 | 0
ON0 | numeric | 151 | 0
P_VSA_v_2 | numeric | 376 | 0
Ram | numeric | 24 | 0
GATS1p | numeric | 439 | 0
Eig10_AEA.ri. | numeric | 491 | 0
Eig05_EA.bo. | numeric | 473 | 0
SM15_AEA.ri. | numeric | 473 | 0
Eig10_EA | numeric | 381 | 0
SM04_AEA.dm. | numeric | 381 | 0
JGI4 | numeric | 52 | 0
Eig06_EA.bo. | numeric | 479 | 0
nR10 | numeric | 3 | 0
ATSC7i | numeric | 790 | 0
MWC05 | numeric | 475 | 0
JGI5 | numeric | 38 | 0
C.002 | numeric | 14 | 0
nCs | numeric | 16 | 0
PHI | numeric | 752 | 0
D.Dtr06 | numeric | 643 | 0
ZM2Per | numeric | 936 | 0
SpMax1_Bh.v. | numeric | 228 | 0
X4sol | numeric | 689 | 0
ATS8v | numeric | 702 | 0
SpMax5_Bh.m. | numeric | 408 | 0
MATS2p | numeric | 321 | 0
nCar | numeric | 21 | 0
X3Av | numeric | 97 | 0
PW3 | numeric | 102 | 0
SpMax7_Bh.s. | numeric | 552 | 0
SpDiam_EA.bo. | numeric | 225 | 0
ATS7p | numeric | 729 | 0
Eig02_EA.ed. | numeric | 308 | 0
SM11_AEA.dm. | numeric | 308 | 0
nAB | numeric | 16 | 0
Eig09_EA | numeric | 359 | 0
SM03_AEA.dm. | numeric | 359 | 0
SM04_AEA.ed. | numeric | 460 | 0
Eig01_EA.bo. | numeric | 225 | 0
SM11_AEA.ri. | numeric | 225 | 0
SpMax_EA.bo. | numeric | 225 | 0
IC3 | numeric | 576 | 0
P_VSA_s_6 | numeric | 324 | 0

62 properties

Property | Value
Number of instances (rows) of the dataset. | 1102
Number of attributes (columns) of the dataset. | 207
Number of distinct values of the target attribute (if it is nominal). | 0
Number of missing values in the dataset. | 0
Number of instances with at least one value missing. | 0
Number of numeric attributes. | 206
Number of nominal attributes. | 1
First quartile of mutual information between the nominal attributes and the target attribute. | not reported
Percentage of binary attributes. | 0
Percentage of instances having missing values. | 0
Percentage of missing values. | 0
Percentage of numeric attributes. | 99.52
Percentage of nominal attributes. | 0.48
First quartile of entropy among attributes. | not reported
First quartile of kurtosis among attributes of the numeric type. | 0.26
First quartile of means among attributes of the numeric type. | 1.26
Standard deviation of the number of distinct values among attributes of the nominal type. | not reported
First quartile of skewness among attributes of the numeric type. | -0.32
First quartile of standard deviation of attributes of the numeric type. | 0.29
Second quartile (Median) of entropy among attributes. | not reported
Second quartile (Median) of kurtosis among attributes of the numeric type. | 1.36
Second quartile (Median) of means among attributes of the numeric type. | 3.7
Second quartile (Median) of mutual information between the nominal attributes and the target attribute. | not reported
Second quartile (Median) of skewness among attributes of the numeric type. | 0.19
Second quartile (Median) of standard deviation of attributes of the numeric type. | 0.45
Third quartile of entropy among attributes. | not reported
Third quartile of kurtosis among attributes of the numeric type. | 3.18
Third quartile of means among attributes of the numeric type. | 10.78
Third quartile of mutual information between the nominal attributes and the target attribute. | not reported
Third quartile of skewness among attributes of the numeric type. | 0.78
Third quartile of standard deviation of attributes of the numeric type. | 2.36
Average class difference between consecutive instances. | -0.04
Mean of means among attributes of the numeric type. | 618.62
Entropy of the target attribute values. | not reported
Number of attributes divided by the number of instances. | 0.19
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation. | not reported
Percentage of instances belonging to the most frequent class. | not reported
Number of instances belonging to the most frequent class. | not reported
Maximum entropy among attributes. | not reported
Maximum kurtosis among attributes of the numeric type. | 102.98
Maximum of means among attributes of the numeric type. | 47299.1
Maximum mutual information between the nominal attributes and the target attribute. | not reported
The maximum number of distinct values among attributes of the nominal type. | not reported
Maximum skewness among attributes of the numeric type. | 9.2
Maximum standard deviation of attributes of the numeric type. | 25731.42
Average entropy of the attributes. | not reported
Mean kurtosis among attributes of the numeric type. | 3.5
Number of binary attributes. | 0
Average mutual information between the nominal attributes and the target attribute. | not reported
An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation. | not reported
Average number of distinct values among the attributes of the nominal type. | not reported
Mean skewness among attributes of the numeric type. | 0.33
Mean standard deviation of attributes of the numeric type. | 324.07
Minimal entropy among attributes. | not reported
Minimum kurtosis among attributes of the numeric type. | -0.98
Minimum of means among attributes of the numeric type. | 0.01
Minimal mutual information between the nominal attributes and the target attribute. | not reported
The minimal number of distinct values among attributes of the nominal type. | not reported
Minimum skewness among attributes of the numeric type. | -2.15
Minimum standard deviation of attributes of the numeric type. | 0
Percentage of instances belonging to the least frequent class. | not reported
Number of instances belonging to the least frequent class. | not reported
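Several of the properties above follow from simple arithmetic on the counts, and others are summary statistics taken across the numeric attributes. The sketch below reproduces the count-based values and shows one way the skewness and kurtosis quartiles could be computed. The moment-based formulas, the use of excess kurtosis (suggested by the negative minimum of -0.98), the zero-variance convention, the quantile method, and the toy columns are all assumptions; the page does not state how OpenML computes these values.

```python
import statistics

def skewness(values):
    """Population skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    # Assumed convention: a constant column gets skewness 0.
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5

def excess_kurtosis(values):
    """Population excess kurtosis (fourth standardized moment minus 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    # Assumed convention: a constant column gets kurtosis 0.
    return 0.0 if m2 == 0 else m4 / m2 ** 2 - 3.0

# Counts taken from the property list.
n_instances, n_attributes, n_numeric, n_nominal = 1102, 207, 206, 1

dimensionality = round(n_attributes / n_instances, 2)   # attributes / instances
pct_numeric = round(100 * n_numeric / n_attributes, 2)  # percentage of numeric attributes
pct_nominal = round(100 * n_nominal / n_attributes, 2)  # percentage of nominal attributes

# Toy numeric columns (made-up values) standing in for the descriptors.
toy_columns = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [1.0, 1.0, 1.0, 2.0, 10.0],
    "c": [0.0, 5.0, 5.0, 5.0, 5.0],
}
skews = [skewness(col) for col in toy_columns.values()]
q1, q2, q3 = statistics.quantiles(skews, n=4)  # quartiles of skewness among attributes
```

The three count-based results match the listed values 0.19, 99.52 and 0.48.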

12 tasks

2 runs - estimation_procedure: Custom 10-fold Crossvalidation - target_feature: pXC50
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
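The first task evaluates models with 10-fold cross-validation on the target pXC50. The page calls the procedure "Custom", which suggests predefined splits that are not shown here, so the following is only a generic shuffled k-fold sketch. The helper names `k_fold_indices` and `cross_validate_mean_baseline`, the seed, and the mean-prediction baseline are hypothetical and chosen for illustration.

```python
import random

def k_fold_indices(n_rows, k=10, seed=0):
    """Shuffle row indices and deal them into k folds of near-equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate_mean_baseline(targets, k=10, seed=0):
    """Return the per-fold RMSE of a model that always predicts the training mean."""
    scores = []
    for test_idx in k_fold_indices(len(targets), k, seed):
        held_out = set(test_idx)
        train = [t for i, t in enumerate(targets) if i not in held_out]
        prediction = sum(train) / len(train)
        mse = sum((targets[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
        scores.append(mse ** 0.5)
    return scores

# With the 1102 rows of this dataset, each fold holds 110 or 111 rows.
folds = k_fold_indices(1102, k=10)
fold_sizes = sorted(len(fold) for fold in folds)

# Sanity check on constant toy targets: the baseline error is zero.
baseline_scores = cross_validate_mean_baseline([5.0] * 20, k=10)
```

A real regression model would replace the mean baseline; the fold construction stays the same.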