Online server under development.

Examples of calculations, graphs, technical information.

To select significant predictors (feature variables), we have developed an original software package that includes:

a genetic algorithm,

a stepwise regression method,

regression trees and model trees using the M5 method.

(The M5 model tree is a decision tree learner for regression tasks: a binary decision tree with linear regression functions at its terminal (leaf) nodes, used to predict the values of a continuous numerical response variable) as well as building

ensembles of

The built trees can also be linearized into decision rules, either directly or using the M5 method. The program accepts continuous, binary, and categorical input variables and handles missing values. Model trees combine a conventional regression tree with linear regression functions at the leaves. This representation usually provides higher accuracy than plain regression trees while preserving the advantage of a clear, easy-to-interpret structure.
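The idea of a model tree can be illustrated with a minimal sketch. This is not the M5 algorithm and not the package's code: it is a single split on one feature with a least-squares line in each leaf, and the names `fit_line`, `fit_model_stump`, and `predict` as well as the toy data are assumptions made for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # all x equal: fall back to a constant leaf
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def sse(xs, ys, line):
    """Sum of squared errors of a fitted line on the given points."""
    a, b = line
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def fit_model_stump(xs, ys, min_leaf=2):
    """Try every threshold and keep the split whose two leaf lines
    give the lowest total squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal x values
        lx, ly = zip(*pairs[:i])
        rx, ry = zip(*pairs[i:])
        l_line, r_line = fit_line(lx, ly), fit_line(rx, ry)
        err = sse(lx, ly, l_line) + sse(rx, ry, r_line)
        if best is None or err < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (err, threshold, l_line, r_line)
    if best is None:
        raise ValueError("not enough distinct x values to split")
    return best[1:]  # (threshold, left_line, right_line)

def predict(model, x):
    """Route x to a leaf, then apply that leaf's linear function."""
    threshold, l_line, r_line = model
    a, b = l_line if x <= threshold else r_line
    return a + b * x

# Toy data (made up): y = x up to x = 4, then y = 18 - 2x
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
ys = [x if x <= 4 else 10 - 2 * (x - 4) for x in xs]
model = fit_model_stump(xs, ys)
```

A plain regression tree would store one constant per leaf and would need many splits to follow these two slopes, while the two linear leaves here reproduce the toy data with a single split.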

Thus, by combining the methods GA-PLS, GA-OLS, GA-KNN, GA-RR, GA-PC, FS-PLS, FS-OLS, FS-KNN, FS-RR, and FS-PC with regression trees and model trees built using the M5 method, we obtained a set of significant predictors for the prediction model. The described methods are implemented in the original toolbox.

At the same time, we consistently used a genetic algorithm and a decision tree to select the optimal set of features.

**1. Genetic Algorithms (GA)** are adaptive heuristic search algorithms that belong to the larger class of evolutionary algorithms. They are based on the ideas of natural selection and genetics. They make intelligent use of random search, guided by historical data, to direct the search toward regions of better performance in the solution space.

They are commonly used to generate high-quality solutions for optimization problems and search problems.

To select significant predictors, we used the following combined approaches:

**GA-PLS** (partial least squares),

**GA-OLS** (ordinary least squares),

**GA-PC** (principal component),

**GA-RR** (ridge regression).

Also, to improve the classification, we use

**GA-KNN** (k-nearest neighbors, KNN). In this case, instead of considering all training samples when taking the k neighbors, we used the GA, which selects the k neighbors directly and then calculates the distances to classify the test samples.
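How a genetic algorithm can wrap a predictive model to select features is sketched below. This is a minimal illustration under stated assumptions, not the toolbox implementation: each candidate is a 0/1 mask over the features, fitness is the leave-one-out RMSE of a k-NN regressor (an assumed stand-in for any of the models listed above), and the function names, GA settings, and synthetic data are all hypothetical.

```python
import random

def knn_loo_rmse(X, y, mask, k=3):
    """Leave-one-out RMSE of k-NN regression using only features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty feature set cannot predict anything
    total = 0.0
    for i in range(len(X)):
        # squared Euclidean distance from sample i to every other sample
        dists = sorted(
            (sum((X[i][j] - X[m][j]) ** 2 for j in cols), m)
            for m in range(len(X)) if m != i
        )
        pred = sum(y[m] for _, m in dists[:k]) / k
        total += (y[i] - pred) ** 2
    return (total / len(X)) ** 0.5

def ga_select(X, y, pop_size=12, generations=15, p_mut=0.1, seed=0):
    """Evolve feature masks; lower RMSE means fitter."""
    rng = random.Random(seed)
    n = len(X[0])
    # start from the full feature set plus random masks
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda m: knn_loo_rmse(X, y, m))
        parents = pop[: pop_size // 2]      # truncation selection, keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda m: knn_loo_rmse(X, y, m))

# Synthetic data (made up): only features 0 and 2 carry signal
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1) for _ in range(6)] for _ in range(30)]
y = [3 * row[0] - 2 * row[2] for row in X]
best = ga_select(X, y)
```

Because the best masks are carried over unchanged to the next generation, the returned mask is never worse than the full feature set it started from. Swapping `knn_loo_rmse` for a cross-validated PLS, OLS, or ridge error would give the corresponding GA variant.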

An example program for the developed Machine Learning and Deep Learning Method for biological systems. The analysis is carried out for charged amino acid residues that are replaced (mutated) in the Spike-ACE2 dimer.

Correlation after clustering represents the dependence between calculated and experimental data

significant improvement of the fit, and repeating this process until none improves the model to a statistically significant extent.
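The add-and-repeat rule described here can be sketched as a forward selection loop. The sketch rests on several assumptions: a fixed relative drop in the residual sum of squares (`min_gain`) stands in for a proper statistical significance test such as an F-test, the least-squares fit uses plain normal equations, and the function names and the synthetic factorial data are hypothetical.

```python
from itertools import product

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    Z = [[1.0] + [row[j] for j in cols] for row in X]
    p = len(Z[0])
    A = [[sum(z[a] * z[b] for z in Z) for b in range(p)] for a in range(p)]
    rhs = [sum(z[a] * t for z, t in zip(Z, y)) for a in range(p)]
    beta = solve(A, rhs)
    return sum((t - sum(w * v for w, v in zip(beta, z))) ** 2 for z, t in zip(Z, y))

def forward_select(X, y, min_gain=0.05):
    """Add the variable that lowers RSS most; stop when the relative gain is small."""
    selected, remaining = [], list(range(len(X[0])))
    current = rss(X, y, selected)
    while remaining:
        best_rss, best_j = min((rss(X, y, selected + [j]), j) for j in remaining)
        if current - best_rss < min_gain * current:
            break  # no candidate improves the fit enough
        selected.append(best_j)
        remaining.remove(best_j)
        current = best_rss
    return selected

# Synthetic two-level factorial design (made up): y depends on features 0 and 2,
# plus a small interaction term that no single feature can explain.
X = [list(map(float, row)) for row in product([-1, 1], repeat=4)]
y = [2 * r[0] - 3 * r[2] + 0.1 * r[0] * r[1] * r[3] for r in X]
chosen = forward_select(X, y)
```

On this toy design the loop picks feature 2 first (largest effect), then feature 0, and stops because neither remaining feature reduces the residual further.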

To select significant predictors, we choose the approach with the smallest RMSE (root mean square error) among GA-PLS, GA-OLS, GA-KNN, GA-RR, GA-PC, FS-PLS, FS-OLS, FS-KNN, FS-RR, and FS-PC.
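The comparison amounts to computing the RMSE of each approach on the same data and keeping the minimum. A minimal sketch follows; the prediction values are made-up placeholders for illustration, not results from the package, and only three of the ten approaches are shown.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical observed values and per-method predictions (illustrative only)
y_true = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "GA-PLS": [1.1, 1.9, 3.2, 3.8],
    "GA-OLS": [1.5, 2.5, 2.5, 4.5],
    "FS-RR": [0.9, 2.1, 3.0, 4.1],
}

scores = {name: rmse(y_true, pred) for name, pred in predictions.items()}
best_method = min(scores, key=scores.get)  # method with the smallest RMSE
```

For the comparison to be fair, every RMSE should be computed on the same held-out or cross-validated samples.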

An example program for the developed Machine Learning and Deep Learning Method for biological systems. The analysis is carried out for hydrophobic amino acid residues that are replaced (mutated) in the Spike-ACE2 dimer.

Correlation after clustering represents the dependence between calculated and experimental data

The main buttons in the Machine Learning program.

[Deep mutational scanning of an antibody against epidermal growth factor receptor using mammalian cell display and massively parallel pyrosequencing] Details of the experimental studies are described here.

Correlation between the stability of P53 protein in denaturant and calculated data using different types of clustering

Module description for multivariate regression coupled with variable selection

How to do all this?

We will be very grateful for any possible assistance to the developer team!

Responsible editor-in-chief in the field of mathematical physics

and machine learning:
