Systematized Predictive Modeling
Preprocessing
- Zero mean (subtract the mean from each predictor) to center the data.
- Divide by standard deviation to scale the data.
- DateTime (parse into components: year, month, day of week, hour)
- One-Hot Encode categorical variables
- Look for skewness; apply a log, square-root, or Box-Cox transform if necessary
- Resolve outliers (and understand their meaning); apply the spatial sign transform if the model is sensitive to outliers
- Eliminate missing data (problematic if missingness is itself predictive; tree-based models can handle missing values directly)
- Imputation/Interpolation (KNN or an intermediate regression model); a sketch follows this list
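
A minimal sketch of these preprocessing steps with scikit-learn; the column names are hypothetical, not part of the checklist:

```python
# Sketch: center/scale numeric columns, one-hot encode categoricals,
# and KNN-impute missing values. Column names are hypothetical.
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import KNNImputer

num_cols = ["age", "income"]    # hypothetical numeric predictors
cat_cols = ["city", "device"]   # hypothetical categorical predictors

preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", KNNImputer(n_neighbors=5)),  # KNN imputation
        ("scale", StandardScaler()),            # zero mean, unit variance
    ]), num_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
])
# preprocess.fit_transform(df) would yield the model-ready matrix.
```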
Exploratory Data Analysis
- Maximal Information Coefficient Matrix / Correlation Matrix
- Box-Plot Everything
- Scatter-Plot Every Pair of Features
- Pivot Tables
- Group by particular features
- Histogram Everything
- Outlier Analysis
- Transform Variables (Square, Cube, Inverse, Log) and Plot
- Summary (Mean, Mode, Minimum, Maximum, Upper/Lower Quartiles, Identify Outliers)
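
A quick pass over most of these checks takes a few lines of pandas/matplotlib; `data.csv` is a hypothetical stand-in for the real dataset:

```python
# Sketch: summary statistics, correlations, histograms, box plots,
# and a scatter matrix for a hypothetical DataFrame.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("data.csv")     # hypothetical file
num = df.select_dtypes("number")

print(df.describe())             # mean, quartiles, min/max per column
print(num.corr())                # pairwise correlation matrix

num.hist(figsize=(12, 8))        # histogram everything
num.boxplot(figsize=(12, 6))     # box-plot everything
pd.plotting.scatter_matrix(num, figsize=(12, 12))
plt.show()
```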
Data Reduction
- Principal Component Analysis
- Linear Discriminant Analysis (For Classification)
- Feature Selection (use only the components that account for the majority of the variance when modeling)
- Remove Low/Zero Variance Predictors
- Remove multicollinear / heavily correlated features
- Isomap
- Lasso
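
A sketch of two of these steps, near-zero-variance filtering and PCA keeping enough components for ~95% of the variance (the random matrix stands in for real predictors):

```python
# Sketch: drop near-zero-variance predictors, then reduce with PCA.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold

X = np.random.RandomState(0).randn(200, 30)        # stand-in for real data

X_filtered = VarianceThreshold(threshold=1e-4).fit_transform(X)

pca = PCA(n_components=0.95)                       # keep ~95% of the variance
X_reduced = pca.fit_transform(X_filtered)
print(X_reduced.shape, pca.explained_variance_ratio_.sum())
```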
Algorithms for Regression
- Linear Regression
- Ridge Regression / Lasso / Elastic Net
- Best Subset Selection
- Forward and Backward Stepwise, Stagewise
- Partial Least Squares
- Principal Components Regression
- Neural Networks
- CNN
- RNN / LSTM
- Multivariate Adaptive Regression Splines
- Support Vector Regressor
- K-Nearest Neighbors
- Regression Decision Trees
- Bagged Trees
- Random Forests
- Extremely Randomized Trees (Extra-Trees)
- Gradient Boosted Trees
- Generalized Linear Model
- Generalized Additive Model
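
One way to screen several of the regressors above at once is cross-validated RMSE; a sketch on synthetic data, with illustrative hyperparameters only:

```python
# Sketch: compare a few regressors with 5-fold cross-validated RMSE.
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Ridge, Lasso, ElasticNet
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR

X, y = make_regression(n_samples=300, n_features=20, noise=10, random_state=0)

models = {
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
    "enet": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "rf": RandomForestRegressor(n_estimators=200, random_state=0),
    "gbm": GradientBoostingRegressor(random_state=0),
    "svr": SVR(kernel="rbf", C=1.0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_root_mean_squared_error")
    print(f"{name}: RMSE = {-scores.mean():.2f}")
```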
Evaluating Regression
- RMSE
- MAE
- Median Absolute Error
- R2
- Visualization
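
Each of these is a one-liner in scikit-learn; a sketch on hypothetical true/predicted values:

```python
# Sketch: RMSE, MAE, median absolute error, R^2, and a predicted-vs-observed plot.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import (mean_squared_error, mean_absolute_error,
                             median_absolute_error, r2_score)

y_true = np.array([3.0, 5.0, 2.5, 7.0])   # hypothetical observed values
y_pred = np.array([2.8, 5.4, 2.9, 6.5])   # hypothetical predictions

print("RMSE: ", np.sqrt(mean_squared_error(y_true, y_pred)))
print("MAE:  ", mean_absolute_error(y_true, y_pred))
print("MedAE:", median_absolute_error(y_true, y_pred))
print("R^2:  ", r2_score(y_true, y_pred))

plt.scatter(y_true, y_pred)               # predicted vs. observed
plt.xlabel("observed"); plt.ylabel("predicted")
plt.show()
```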
Algorithms for Classification
- Logistic Regression
- L1, L2, or Elastic Net regularization
- Discriminant Analysis
- Linear Discriminant Analysis
- Quadratic Discriminant Analysis
- Neural Networks
- CNN
- RNN
- LSTM
- Support Vector Classifier
- K-Nearest Neighbors
- Naive Bayes
- Classification Trees
- Bagged Trees
- Random Forests
- Extremely Randomized Trees (Extra-Trees)
- Gradient Boosted Trees
- Generalized Additive Model
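
A sketch screening a few of the classifiers above by cross-validated AUC on synthetic data:

```python
# Sketch: compare a few classifiers with 5-fold cross-validated AUC.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

for clf in [LogisticRegression(penalty="l2", C=1.0, max_iter=1000),
            LinearDiscriminantAnalysis(),
            GaussianNB(),
            RandomForestClassifier(n_estimators=200, random_state=0)]:
    scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(type(clf).__name__, round(scores.mean(), 3))
```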
Evaluating Classification
- ROC Curve
- Confusion Matrix
- F1 Score
- Heat Map (e.g., of the confusion matrix)
- Overall accuracy rate
- Kappa Statistic
- Sensitivity
- Specificity
- AUC
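
A sketch computing the metrics above from hypothetical labels and predicted scores:

```python
# Sketch: confusion matrix, accuracy, F1, kappa, sensitivity/specificity, AUC.
import numpy as np
from sklearn.metrics import (confusion_matrix, accuracy_score, f1_score,
                             cohen_kappa_score, recall_score,
                             roc_auc_score, roc_curve)

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])                   # hypothetical labels
y_score = np.array([0.1, 0.4, 0.8, 0.7, 0.3, 0.2, 0.9, 0.6])  # model scores
y_pred = (y_score >= 0.5).astype(int)                         # default 0.5 cutoff

print(confusion_matrix(y_true, y_pred))
print("accuracy:   ", accuracy_score(y_true, y_pred))
print("F1:         ", f1_score(y_true, y_pred))
print("kappa:      ", cohen_kappa_score(y_true, y_pred))
print("sensitivity:", recall_score(y_true, y_pred))               # TPR
print("specificity:", recall_score(y_true, y_pred, pos_label=0))  # TNR
print("AUC:        ", roc_auc_score(y_true, y_score))
fpr, tpr, thresholds = roc_curve(y_true, y_score)   # points for the ROC curve
```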
Unsupervised Learning
- K-Means
- K-Means++
- K-Medoids
- Hierarchical Agglomerative Clustering
- Single Linkage, Complete Linkage, Average Linkage, Centroid Criterion
- Principal Components Analysis
- Spectral Clustering
- Affinity Propagation
- Biclustering
- Gaussian Mixture Model
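
A sketch running a few of these on synthetic blobs:

```python
# Sketch: k-means (with k-means++ init), agglomerative clustering,
# and a Gaussian mixture model on synthetic blobs.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

km = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=0).fit(X)
hac = AgglomerativeClustering(n_clusters=4, linkage="average").fit(X)
gmm = GaussianMixture(n_components=4, random_state=0).fit(X)

print(km.labels_[:10])
print(hac.labels_[:10])
print(gmm.predict(X)[:10])   # soft assignments via gmm.predict_proba(X)
```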
Classification Class Imbalance
- Model Tuning (Tune Parameters For Sensitivity)
- Alternate Cutoffs (Using ROC Curve)
- Adjusting Prior Probability
- Unequal Case Weights
- Down-Sampling the majority class
- Up-Sampling the minority class
- Alter Cost Function
- Dynamic Structure (Cascade of classifiers)
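
A sketch of two of these remedies: unequal case weights via `class_weight`, and an alternate cutoff read off the ROC curve (maximizing TPR - FPR, i.e. Youden's J, is one common choice and an assumption here, not part of the checklist):

```python
# Sketch: class weights plus an alternate probability cutoff from the ROC curve.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve

# Imbalanced synthetic data: ~90% majority class.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)

probs = clf.predict_proba(X)[:, 1]
fpr, tpr, thresholds = roc_curve(y, probs)
cutoff = thresholds[np.argmax(tpr - fpr)]   # Youden's J statistic
y_pred = (probs >= cutoff).astype(int)
print("chosen cutoff:", round(float(cutoff), 3))
```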
Feature Evaluation
- Coefficients in Linear Models
- Random Forest Importances (variance reduction for regression, Gini/information gain for classification)
- Pearson Correlation with Outcome
- Maximal Information Coefficient (MIC)
- Distance Correlation
- Model with/without feature
- Randomly shuffle a single feature across data points and measure the drop in model quality (permutation importance)
- Lasso Automatic Selection
- Mean Decrease Accuracy
- Stability Selection
- Recursive Feature Elimination
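
A sketch of two of these: permutation importance (mean decrease in accuracy) and recursive feature elimination:

```python
# Sketch: permutation importance and recursive feature elimination.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

rf = RandomForestClassifier(random_state=0).fit(X, y)
perm = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
print(perm.importances_mean)      # score drop when each feature is shuffled

rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5).fit(X, y)
print(rfe.support_)               # boolean mask of the surviving features
```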
Parameter Tuning
- Cross Validation
- Bootstrap
- Grid Search
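
A grid search sketch; the estimator and parameter grid are illustrative only:

```python
# Sketch: exhaustive grid search with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

grid = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.001]},
    cv=5,
    scoring="roc_auc",
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```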
Text Features
- n-Grams
- Word Vector Representations (word2vec)
- Bag of words
- Word counts
- Lengths
- TF-IDF
- Term frequency weighted by inverse document frequency (a term's rarity across documents)
- Topic Modeling (LDA)
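
A sketch of bag-of-words counts (with n-grams) and tf-idf weighting on toy documents:

```python
# Sketch: n-gram counts and tf-idf features from toy documents.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat", "the dog ate my homework"]

counts = CountVectorizer(ngram_range=(1, 2)).fit_transform(docs)  # unigrams + bigrams
tfidf = TfidfVectorizer().fit_transform(docs)                     # tf-idf weights
print(counts.shape, tfidf.shape)
```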
Modeling Techniques
- Feature Engineering
- Basis Expansions
- Combine Features
- Averages, medians, variances, sums, differences, maximums/minimums, and counts
- Stacking (using output of one algorithm as input to the next)
- Internal Prediction
- Blending (especially with diverse, decorrelated models)
- Account For Missing Data (It can be information)
- External Data
- Acquire Domain Knowledge for Feature Engineering
- Random Forest / Gradient-Boosted Tree importances for feature exploration
- Clustering for feature creation
- Distance to Class Centroid
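
A stacking sketch with scikit-learn's `StackingRegressor`, where out-of-fold predictions from the base models become inputs to a final estimator (the particular base models are illustrative):

```python
# Sketch: stacking -- base-model predictions feed a final estimator.
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.svm import SVR

X, y = make_regression(n_samples=300, n_features=15, noise=10, random_state=0)

stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(random_state=0)), ("svr", SVR())],
    final_estimator=Ridge(),
    cv=5,   # out-of-fold predictions become the meta-features
)
stack.fit(X, y)
print(round(stack.score(X, y), 3))   # R^2 on the training data
```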