QSAR-DATASET-FOR-DRUG-TARGET-CHEMBL4015

Deactivated, ARFF format, publicly available (visibility: public), uploaded 15-07-2016 by unknown
This dataset contains QSAR data (from ChEMBL version 17) giving activity values (in pseudo-pCI50 units) of several compounds against the drug target with ChEMBL ID CHEMBL4015 (TID: 11575). It has 1490 rows and 208 features, not counting the molecule identifier (molecule_id) and the target (pXC50). The features are molecular descriptors generated from SMILES strings. Missing values were imputed with the median of each feature, and feature selection was also applied.
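The median imputation mentioned in the description can be illustrated with a minimal sketch. It assumes missing entries are marked with `None` and that the median is taken per column; the function name `impute_median` and the toy values are hypothetical and are not taken from the dataset or from the pipeline that produced it.

```python
from statistics import median

def impute_median(rows, missing=None):
    """Replace missing entries in each column with that column's median.

    Assumes every column has at least one observed value; otherwise
    statistics.median raises StatisticsError.
    """
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        # Median is computed only over the observed (non-missing) values.
        observed = [row[j] for row in rows if row[j] is not missing]
        medians.append(median(observed))
    return [
        [medians[j] if row[j] is missing else row[j] for j in range(n_cols)]
        for row in rows
    ]

# Toy descriptor matrix (values invented for illustration only).
toy = [
    [1.0, 10.0],
    [None, 20.0],
    [3.0, None],
    [5.0, 40.0],
]
imputed = impute_median(toy)
print(imputed)
```

Because imputation was already applied before upload, every feature in the listing reports 0 missing values.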

210 features

pXC50 (target, numeric): 748 unique values, 0 missing
molecule_id (row identifier, nominal): 1490 unique values, 0 missing
Mi (numeric): 79 unique values, 0 missing
ATSC1i (numeric): 653 unique values, 0 missing
GGI10 (numeric): 331 unique values, 0 missing
ATSC3e (numeric): 715 unique values, 0 missing
nF (numeric): 9 unique values, 0 missing
NsF (numeric): 9 unique values, 0 missing
P_VSA_e_6 (numeric): 9 unique values, 0 missing
Eig12_AEA.dm. (numeric): 509 unique values, 0 missing
Eig11_AEA.dm. (numeric): 512 unique values, 0 missing
SM03_EA.dm. (numeric): 106 unique values, 0 missing
SM05_EA.dm. (numeric): 163 unique values, 0 missing
SM07_EA.dm. (numeric): 148 unique values, 0 missing
Psi_e_A (numeric): 693 unique values, 0 missing
Psi_i_A (numeric): 693 unique values, 0 missing
GATS3e (numeric): 579 unique values, 0 missing
C.013 (numeric): 4 unique values, 0 missing
F.083 (numeric): 4 unique values, 0 missing
nCRX3 (numeric): 4 unique values, 0 missing
Eta_epsi (numeric): 844 unique values, 0 missing
GATS2s (numeric): 368 unique values, 0 missing
Dz (numeric): 288 unique values, 0 missing
nCp (numeric): 9 unique values, 0 missing
nCsp3 (numeric): 27 unique values, 0 missing
Eta_alpha_A (numeric): 88 unique values, 0 missing
Eig05_EA.ri. (numeric): 619 unique values, 0 missing
NsssCH (numeric): 8 unique values, 0 missing
GATS3i (numeric): 408 unique values, 0 missing
GATS3s (numeric): 551 unique values, 0 missing
GGI4 (numeric): 609 unique values, 0 missing
Me (numeric): 88 unique values, 0 missing
Eig11_EA (numeric): 451 unique values, 0 missing
SM05_AEA.dm. (numeric): 451 unique values, 0 missing
Ram (numeric): 19 unique values, 0 missing
IDM (numeric): 718 unique values, 0 missing
SpMAD_AEA.bo. (numeric): 145 unique values, 0 missing
HDcpx (numeric): 185 unique values, 0 missing
Eig04_AEA.ri. (numeric): 551 unique values, 0 missing
Eig13_AEA.dm. (numeric): 552 unique values, 0 missing
Eta_C_A (numeric): 593 unique values, 0 missing
Eig06_EA.bo. (numeric): 552 unique values, 0 missing
BID (numeric): 87 unique values, 0 missing
SpDiam_EA.dm. (numeric): 100 unique values, 0 missing
Eta_betaP_A (numeric): 304 unique values, 0 missing
GGI7 (numeric): 483 unique values, 0 missing
Eig11_AEA.ri. (numeric): 545 unique values, 0 missing
Eig03_AEA.dm. (numeric): 453 unique values, 0 missing
Eig11_AEA.ed. (numeric): 525 unique values, 0 missing
Eig05_EA.dm. (numeric): 65 unique values, 0 missing
nSK (numeric): 38 unique values, 0 missing
S0K (numeric): 393 unique values, 0 missing
SRW08 (numeric): 549 unique values, 0 missing
Eta_sh_x (numeric): 100 unique values, 0 missing
ATSC4e (numeric): 910 unique values, 0 missing
SpMax7_Bh.m. (numeric): 506 unique values, 0 missing
Eig11_EA.ri. (numeric): 532 unique values, 0 missing
Eig08_AEA.ed. (numeric): 568 unique values, 0 missing
DLS_02 (numeric): 5 unique values, 0 missing
ZM2MulPer (numeric): 1305 unique values, 0 missing
Eig06_EA.dm. (numeric): 63 unique values, 0 missing
Eig10_AEA.ed. (numeric): 571 unique values, 0 missing
Eig12_EA (numeric): 505 unique values, 0 missing
SM06_AEA.dm. (numeric): 505 unique values, 0 missing
GGI6 (numeric): 512 unique values, 0 missing
SpMaxA_EA.bo. (numeric): 117 unique values, 0 missing
MPC01 (numeric): 41 unique values, 0 missing
MWC01 (numeric): 41 unique values, 0 missing
nBO (numeric): 41 unique values, 0 missing
SRW02 (numeric): 41 unique values, 0 missing
ZM2Kup (numeric): 1261 unique values, 0 missing
SM08_AEA.bo. (numeric): 632 unique values, 0 missing
ATSC2s (numeric): 1341 unique values, 0 missing
SM07_AEA.bo. (numeric): 637 unique values, 0 missing
Eta_epsi_A (numeric): 236 unique values, 0 missing
MWC06 (numeric): 575 unique values, 0 missing
GGI3 (numeric): 256 unique values, 0 missing
ZM1V (numeric): 273 unique values, 0 missing
ZM2V (numeric): 370 unique values, 0 missing
Infective.80 (numeric): 2 unique values, 0 missing
Se (numeric): 937 unique values, 0 missing
SM05_AEA.bo. (numeric): 553 unique values, 0 missing
nX (numeric): 10 unique values, 0 missing
SM06_AEA.bo. (numeric): 583 unique values, 0 missing
ATSC2i (numeric): 900 unique values, 0 missing
ATS8v (numeric): 815 unique values, 0 missing
Eig03_EA.ri. (numeric): 477 unique values, 0 missing
Psi_i_0 (numeric): 1084 unique values, 0 missing
Eig04_AEA.bo. (numeric): 472 unique values, 0 missing
nHAcc (numeric): 15 unique values, 0 missing
Eig12_AEA.ri. (numeric): 605 unique values, 0 missing
SpMax6_Bh.e. (numeric): 479 unique values, 0 missing
ARR (numeric): 162 unique values, 0 missing
Eta_FL_A (numeric): 186 unique values, 0 missing
Eig04_EA.ri. (numeric): 526 unique values, 0 missing
PW2 (numeric): 70 unique values, 0 missing
SpMax4_Bh.s. (numeric): 272 unique values, 0 missing
Eig08_AEA.dm. (numeric): 649 unique values, 0 missing
Eig12_EA.ri. (numeric): 566 unique values, 0 missing
IVDE (numeric): 346 unique values, 0 missing
LLS_02 (numeric): 5 unique values, 0 missing
IC2 (numeric): 732 unique values, 0 missing
Eig10_EA.ed. (numeric): 700 unique values, 0 missing
SM05_AEA.ri. (numeric): 700 unique values, 0 missing
JGI10 (numeric): 15 unique values, 0 missing
ON1 (numeric): 317 unique values, 0 missing
Eig11_EA.ed. (numeric): 636 unique values, 0 missing
SM06_AEA.ri. (numeric): 636 unique values, 0 missing
Eig03_AEA.ed. (numeric): 408 unique values, 0 missing
GATS5v (numeric): 332 unique values, 0 missing
SpMax3_Bh.s. (numeric): 203 unique values, 0 missing
Sv (numeric): 967 unique values, 0 missing
SpMax2_Bh.i. (numeric): 177 unique values, 0 missing
SpMin8_Bh.e. (numeric): 405 unique values, 0 missing
TIC3 (numeric): 992 unique values, 0 missing
Eig14_AEA.bo. (numeric): 567 unique values, 0 missing
ATSC8i (numeric): 978 unique values, 0 missing
Eig08_EA.bo. (numeric): 537 unique values, 0 missing
Eig04_EA (numeric): 484 unique values, 0 missing
SM12_AEA.bo. (numeric): 484 unique values, 0 missing
JGT (numeric): 300 unique values, 0 missing
X2 (numeric): 923 unique values, 0 missing
Eta_sh_p (numeric): 243 unique values, 0 missing
Eig09_EA.ed. (numeric): 706 unique values, 0 missing
SM04_AEA.ri. (numeric): 706 unique values, 0 missing
Chi1_EA.bo. (numeric): 947 unique values, 0 missing
CATS2D_07_AL (numeric): 20 unique values, 0 missing
SM05_EA.ri. (numeric): 803 unique values, 0 missing
SpMAD_EA.ed. (numeric): 610 unique values, 0 missing
Eig12_EA.bo. (numeric): 555 unique values, 0 missing
Eig03_EA (numeric): 408 unique values, 0 missing
SM11_AEA.bo. (numeric): 408 unique values, 0 missing
IC3 (numeric): 623 unique values, 0 missing
Eig04_EA.ed. (numeric): 608 unique values, 0 missing
SM13_AEA.dm. (numeric): 608 unique values, 0 missing
Eig06_EA.ri. (numeric): 649 unique values, 0 missing
SsF (numeric): 814 unique values, 0 missing
ATSC7s (numeric): 1360 unique values, 0 missing
SM03_EA.bo. (numeric): 120 unique values, 0 missing
Eig07_AEA.ed. (numeric): 599 unique values, 0 missing
ATSC3s (numeric): 1359 unique values, 0 missing
VvdwZAZ (numeric): 991 unique values, 0 missing
Eig04_AEA.ed. (numeric): 465 unique values, 0 missing
ATS3s (numeric): 692 unique values, 0 missing
AAC (numeric): 491 unique values, 0 missing
IC0 (numeric): 491 unique values, 0 missing
ATS4s (numeric): 730 unique values, 0 missing
ATSC1s (numeric): 1251 unique values, 0 missing
Eig05_AEA.ri. (numeric): 624 unique values, 0 missing
Eig04_EA.dm. (numeric): 81 unique values, 0 missing
Chi0_EA.dm. (numeric): 904 unique values, 0 missing
ATS3i (numeric): 629 unique values, 0 missing
SpAD_EA.dm. (numeric): 431 unique values, 0 missing
SM06_AEA.ed. (numeric): 694 unique values, 0 missing
Eig06_EA (numeric): 517 unique values, 0 missing
SM14_AEA.bo. (numeric): 517 unique values, 0 missing
Eig05_AEA.dm. (numeric): 633 unique values, 0 missing
Eig13_EA.ri. (numeric): 626 unique values, 0 missing
ATS1v (numeric): 526 unique values, 0 missing
Eig13_AEA.ri. (numeric): 631 unique values, 0 missing
SM04_EA.dm. (numeric): 352 unique values, 0 missing
SM05_AEA.ed. (numeric): 663 unique values, 0 missing
VAR (numeric): 230 unique values, 0 missing
SM07_EA.ri. (numeric): 877 unique values, 0 missing
SM03_EA (numeric): 26 unique values, 0 missing
Eta_B (numeric): 454 unique values, 0 missing
SM12_EA.dm. (numeric): 162 unique values, 0 missing
SM14_EA.dm. (numeric): 149 unique values, 0 missing
SM10_EA.dm. (numeric): 184 unique values, 0 missing
Eig14_EA.bo. (numeric): 617 unique values, 0 missing
SM06_EA.dm. (numeric): 280 unique values, 0 missing
Qindex (numeric): 32 unique values, 0 missing
MDDD (numeric): 970 unique values, 0 missing
C.008 (numeric): 7 unique values, 0 missing
Eig13_AEA.ed. (numeric): 557 unique values, 0 missing
ATS1m (numeric): 510 unique values, 0 missing
SpMax8_Bh.e. (numeric): 476 unique values, 0 missing
CMC.80 (numeric): 2 unique values, 0 missing
SM03_EA.ed. (numeric): 446 unique values, 0 missing
IAC (numeric): 928 unique values, 0 missing
TIC0 (numeric): 928 unique values, 0 missing
SM06_EA (numeric): 671 unique values, 0 missing
Eig05_EA (numeric): 526 unique values, 0 missing
SM13_AEA.bo. (numeric): 526 unique values, 0 missing
SM05_EA (numeric): 167 unique values, 0 missing
Eig06_AEA.ri. (numeric): 627 unique values, 0 missing
Pol (numeric): 59 unique values, 0 missing
Eig03_AEA.ri. (numeric): 461 unique values, 0 missing
X5sol (numeric): 923 unique values, 0 missing
Neoplastic.80 (numeric): 2 unique values, 0 missing
SpMin6_Bh.p. (numeric): 433 unique values, 0 missing
Eig09_AEA.dm. (numeric): 621 unique values, 0 missing
SM07_EA (numeric): 645 unique values, 0 missing
TPSA.Tot. (numeric): 430 unique values, 0 missing
Eig01_EA.dm. (numeric): 61 unique values, 0 missing
SpMax_EA.dm. (numeric): 61 unique values, 0 missing
SM02_EA.ed. (numeric): 419 unique values, 0 missing
SpMax8_Bh.i. (numeric): 443 unique values, 0 missing
Eig03_EA.ed. (numeric): 556 unique values, 0 missing
SM12_AEA.dm. (numeric): 556 unique values, 0 missing
VvdwMG (numeric): 949 unique values, 0 missing
Vx (numeric): 949 unique values, 0 missing
SM07_AEA.ed. (numeric): 705 unique values, 0 missing
Eig10_EA.ri. (numeric): 561 unique values, 0 missing
SM04_AEA.ed. (numeric): 611 unique values, 0 missing
SM02_EA.ri. (numeric): 546 unique values, 0 missing
SM09_EA.dm. (numeric): 130 unique values, 0 missing
SM11_EA.dm. (numeric): 109 unique values, 0 missing
SM13_EA.dm. (numeric): 103 unique values, 0 missing
SM15_EA.dm. (numeric): 92 unique values, 0 missing

62 properties

Number of instances (rows) of the dataset: 1490
Number of attributes (columns) of the dataset: 210
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 209
Number of nominal attributes: 1
First quartile of mutual information between the nominal attributes and the target attribute: (no value listed)
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Percentage of missing values: 0
Percentage of numeric attributes: 99.52
Percentage of nominal attributes: 0.48
First quartile of entropy among attributes: (no value listed)
First quartile of kurtosis among attributes of the numeric type: -0.04
First quartile of means among attributes of the numeric type: 1.53
Standard deviation of the number of distinct values among attributes of the nominal type: (no value listed)
First quartile of skewness among attributes of the numeric type: -0.88
First quartile of standard deviation of attributes of the numeric type: 0.23
Second quartile (median) of entropy among attributes: (no value listed)
Second quartile (median) of kurtosis among attributes of the numeric type: 0.5
Second quartile (median) of means among attributes of the numeric type: 3.56
Second quartile (median) of mutual information between the nominal attributes and the target attribute: (no value listed)
Second quartile (median) of skewness among attributes of the numeric type: -0.51
Second quartile (median) of standard deviation of attributes of the numeric type: 0.41
Third quartile of entropy among attributes: (no value listed)
Third quartile of kurtosis among attributes of the numeric type: 1.64
Third quartile of means among attributes of the numeric type: 8.92
Third quartile of mutual information between the nominal attributes and the target attribute: (no value listed)
Third quartile of skewness among attributes of the numeric type: 0.05
Third quartile of standard deviation of attributes of the numeric type: 1.7
Average class difference between consecutive instances: 0.11
Mean of means among attributes of the numeric type: 28.52
Entropy of the target attribute values: (no value listed)
Number of attributes divided by the number of instances: 0.14
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes); equals ClassEntropy divided by MeanMutualInformation: (no value listed)
Percentage of instances belonging to the most frequent class: (no value listed)
Number of instances belonging to the most frequent class: (no value listed)
Maximum entropy among attributes: (no value listed)
Maximum kurtosis among attributes of the numeric type: 17.08
Maximum of means among attributes of the numeric type: 591.43
Maximum mutual information between the nominal attributes and the target attribute: (no value listed)
Maximum number of distinct values among attributes of the nominal type: (no value listed)
Maximum skewness among attributes of the numeric type: 2.06
Maximum standard deviation of attributes of the numeric type: 120.36
Average entropy of the attributes: (no value listed)
Mean kurtosis among attributes of the numeric type: 1.17
Number of binary attributes: 0
Average mutual information between the nominal attributes and the target attribute: (no value listed)
Estimate of the amount of irrelevant information in the attributes regarding the class; equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation: (no value listed)
Average number of distinct values among the attributes of the nominal type: (no value listed)
Mean skewness among attributes of the numeric type: -0.43
Mean standard deviation of attributes of the numeric type: 6.62
Minimum entropy among attributes: (no value listed)
Minimum kurtosis among attributes of the numeric type: -1.3
Minimum of means among attributes of the numeric type: 0.01
Minimum mutual information between the nominal attributes and the target attribute: (no value listed)
Minimum number of distinct values among attributes of the nominal type: (no value listed)
Minimum skewness among attributes of the numeric type: -2.15
Minimum standard deviation of attributes of the numeric type: 0
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
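Several of the ratio-type properties follow directly from the counts listed for this dataset. The short sketch below recomputes them as a consistency check. The variable names are illustrative, and rounding to two decimals is an assumption about how the page displays values.

```python
# Counts taken from the property listing for this dataset.
n_instances = 1490
n_attributes = 210
n_numeric = 209
n_nominal = 1

# Percentage of numeric and nominal attributes.
pct_numeric = round(100 * n_numeric / n_attributes, 2)
pct_nominal = round(100 * n_nominal / n_attributes, 2)

# Number of attributes divided by the number of instances.
dimensionality = round(n_attributes / n_instances, 2)

print(pct_numeric, pct_nominal, dimensionality)
```

The results agree with the listed values of 99.52, 0.48, and 0.14.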

12 tasks

2 runs - estimation_procedure: Custom 10-fold Crossvalidation - target_feature: pXC50
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
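The regression task on pXC50 uses a 10-fold cross-validation procedure. The sketch below shows how a generic 10-fold split over the 1490 rows could be generated. It is an assumption-laden illustration: the task's own "Custom" fold assignment is not given on this page and is not reproduced here, and the function name `k_fold_indices`, the shuffling, and the seed are all hypothetical choices.

```python
import random

def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 1490 rows split into 10 folds of 149 rows each.
splits = list(k_fold_indices(1490, k=10))
for train, test in splits:
    print(len(train), len(test))
```

Each row appears in exactly one test fold, so every compound is predicted once across the ten iterations.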