The Descriptor toolkit provides two modules, Descriptor List and
Descriptor Upload, which allow users to standardize nanodescriptor values with a descriptor preprocessing method
and to analyze the associated NM space using principal component analysis (PCA).
1. Nanodescriptor value standardization: Standardization is a technique often applied as part of descriptor preparation
for machine learning. The goal of standardization is to change the values of numeric columns in the dataset to a common scale,
without distorting differences in the ranges of values.
2. NM space analysis: PCA applied to the descriptor data identifies the combinations of attributes (principal components,
or directions in the nanodescriptor space) that account for the most variance in the descriptor data. The top two and the top three
principal components are used to represent the distribution of the corresponding NMs in 2D and 3D, respectively.
The standardized nanodescriptor dataset, along with the 2D and 3D NM space data, is downloadable after descriptor analysis.
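
The two steps above can be sketched in a few lines of Python. This is only an illustration, not the toolkit's own code: the text does not say which scaling the toolkit applies, so z-score scaling (zero mean, unit variance per descriptor) is assumed here, and the function names `standardize` and `pca_project` and the random descriptor matrix are hypothetical.

```python
import numpy as np


def standardize(X):
    """Z-score each descriptor column (assumed scaling; the toolkit's method is not stated)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns at zero instead of dividing by zero
    return (X - mean) / std


def pca_project(Z, n_components):
    """Project the rows of Z onto the top principal components using an SVD."""
    Zc = Z - Z.mean(axis=0)  # PCA requires centered data
    _, S, Vt = np.linalg.svd(Zc, full_matrices=False)
    scores = Zc @ Vt[:n_components].T  # coordinates of each NM in PC space
    explained = S**2 / np.sum(S**2)    # fraction of variance per component
    return scores, explained[:n_components]


# Hypothetical input: 6 NMs (rows) described by 4 nanodescriptors (columns)
# that live on very different numeric scales.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(6, 4)) * np.array([1.0, 10.0, 100.0, 0.1])

Z = standardize(descriptors)
space_2d, ratio_2d = pca_project(Z, 2)  # 2D NM space
space_3d, ratio_3d = pca_project(Z, 3)  # 3D NM space
```

In this sketch, `Z` plays the role of the standardized nanodescriptor dataset, and `space_2d` and `space_3d` play the role of the 2D and 3D NM space data. The first two columns of `space_3d` are identical to `space_2d`, because both projections come from the same decomposition.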