Contents

Regression and Other Methods

Books

A Few Useful Things to Know about Machine Learning (《机器学习那些事》) https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf
Statistical Learning Methods (《统计学习方法》) by Li Hang, Python implementation https://github.com/fengdu78/lihang-code
Transfer Learning Tutorial (《迁移学习简明手册》) https://github.com/jindongwang/transferlearning-tutorial
Neural Networks and Deep Learning (《神经网络与深度学习》) https://nndl.github.io/
Feature Engineering for Machine Learning (《面向机器学习的特征工程》) https://github.com/apachecn/fe4ml-zh
Explanatory Model Analysis https://pbiecek.github.io/ema/
AI Knowledge Tree (《人工智能知识树》) https://github.com/apachecn/ai-roadmap
ds-cheatsheets (data science cheat sheets) https://github.com/FavioVazquez/ds-cheatsheets

Articles

15 Types of Regression in Data Science https://www.listendata.com/2018/03/regression-analysis.html
Are categorical variables getting lost in your random forests? https://roamanalytics.com/2016/10/28/are-categorical-variables-getting-lost-in-your-random-forests/
Selecting good features http://blog.datadive.net/selecting-good-features-part-iii-random-forests/
GitHub topic: semi-supervised learning https://github.com/topics/semi-supervised-learning
GitHub topic: feature engineering https://github.com/topics/feature-engineering
A Data Science Framework: To Achieve 99% Accuracy https://www.kaggle.com/ldfreeman3/a-data-science-framework-to-achieve-99-accuracy#How-a-Data-Scientist-Beat-the-Odds
Machine Learning Mastery blog posts (Chinese translation) https://github.com/apachecn/ml-mastery-zh

Examples

The Super Duper NLP Repo https://notebooks.quantumstat.com/
Deep Learning Models https://github.com/rasbt/deeplearning-models
A curated collection of Jupyter/IPython notebooks https://www.jiqizhixin.com/articles/2019-04-23-3
Learn ML with clean code https://github.com/madewithml/lessons
Progressive Growing of GANs https://github.com/tkarras/progressive_growing_of_gans
Papers with code https://github.com/zziz/pwc
Good-Papers https://github.com/hoangcuong2011/Good-Papers
XGBoost with Label https://medium.com/@songxia.sophia/two-machine-learning-algorithms-to-predict-xgboost-neural-network-with-entity-embedding-caac68717dea
python-is-cool https://github.com/chiphuyen/python-is-cool

Points to Watch in Regression Problems

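Multicollinearity: detect with collinearity diagnostics; remedies include differencing, ridge regression, lasso, and Elastic-Net
Autocorrelation: detect with residual plots, the Durbin-Watson statistic, and Q-statistics
Heteroscedasticity: detect with graphical inspection, the Goldfeld-Quandt test, the White test, the Park test, and the Glejser test; remedies include model transformation and weighted least squares
Cross-validation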
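
The multicollinearity and autocorrelation checks listed above can be sketched with NumPy alone. This is a minimal illustration on synthetic data, not a reference implementation: the helper names `ols_residuals`, `vif`, and `durbin_watson` are made up for this example, and the data are generated so that two predictors are nearly identical.

```python
import numpy as np


def ols_residuals(X, y):
    """Residuals of an OLS fit of y on X, with an intercept column added."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ beta


def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R^2_j), where R^2_j
    comes from regressing column j on all the other columns."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        resid = ols_residuals(others, target)
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


def durbin_watson(resid):
    """Durbin-Watson statistic: near 2 for uncorrelated residuals, toward 0
    for positive autocorrelation, toward 4 for negative autocorrelation."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)


# Synthetic data (an assumption of this sketch): x2 is nearly a copy of x1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + 2.0 * x1 - x3 + rng.normal(size=n)

vifs = vif(X)
dw = durbin_watson(ols_residuals(X, y))
print("VIF:", np.round(vifs, 1))
print("Durbin-Watson:", round(dw, 2))
```

A large VIF on the first two columns flags the collinear pair (values above roughly 10 are a commonly quoted warning level), while the third column stays close to 1. The noise here is independent, so the Durbin-Watson statistic should land near 2.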

Basic Theory

Least squares, maximum likelihood
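
The two estimation principles named above coincide for a linear model with independent Gaussian noise: the coefficients that minimise the sum of squared residuals also maximise the likelihood. The sketch below checks this numerically; the synthetic data and the helper name `gaussian_log_likelihood` are assumptions made for illustration.

```python
import numpy as np


def gaussian_log_likelihood(beta, sigma2, A, y):
    """Log-likelihood of y = A @ beta + noise, with noise ~ N(0, sigma2) i.i.d."""
    resid = y - A @ beta
    n = len(y)
    return -0.5 * n * np.log(2.0 * np.pi * sigma2) - resid @ resid / (2.0 * sigma2)


# Synthetic data (an assumption of this sketch): y = 0.5 + 1.5 x + noise.
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(-3.0, 3.0, size=n)
A = np.column_stack([np.ones(n), x])  # design matrix with an intercept column
y = 0.5 + 1.5 * x + rng.normal(scale=0.7, size=n)

# Least squares: solve the normal equations (A^T A) beta = A^T y.
beta_ls = np.linalg.solve(A.T @ A, A.T @ y)

# Maximum likelihood under Gaussian noise: same beta, and sigma^2 = RSS / n.
resid = y - A @ beta_ls
sigma2_ml = resid @ resid / n

best = gaussian_log_likelihood(beta_ls, sigma2_ml, A, y)
nudged = gaussian_log_likelihood(beta_ls + np.array([0.05, -0.05]), sigma2_ml, A, y)
print("beta:", beta_ls, "sigma^2:", sigma2_ml)
print("log-likelihood at the LS solution:", best, "after nudging beta:", nudged)
```

Moving the coefficients away from the least-squares solution lowers the log-likelihood, which is the numerical counterpart of the equivalence. With other noise models (for example Laplace noise) the two estimators generally differ.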

Package Notes

Classification and Regression

| Method | Chinese name | Python | R |
| --- | --- | --- | --- |
| Linear Models | | Generalized Linear Models | |
| Logistic | | | |
| Ordinary Least Squares | | sklearn.linear_model.LinearRegression | |
| Stochastic Gradient Descent | 随机梯度下降 | sklearn.linear_model.SGDClassifier, sklearn.linear_model.SGDRegressor | |
| Cox | | | |
| Ridge Regression | 岭回归 | sklearn.linear_model.Ridge | |
| Lasso | 套索回归 | sklearn.linear_model.Lasso | |
| Elastic-Net | 弹性网回归 | sklearn.linear_model.ElasticNet | |
| Least-angle regression | | sklearn.linear_model.Lars | |
| Ensemble methods | | Ensemble methods | |
| Decision Tree | 决策树 | | |
| Boosted Decision Tree (GBDT) | | | |
| Gradient Boosted Regression Trees (GBRT) | | gradient-tree-boosting | |
| AdaBoost | | sklearn.ensemble.AdaBoostClassifier, sklearn.ensemble.AdaBoostRegressor | |
| XGBoost | | XGBoost GitHub | |
| Random Forest | 随机森林 | sklearn.ensemble.RandomForestClassifier, sklearn.ensemble.RandomForestRegressor | |
| SVM | | | |
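
Several of the scikit-learn classes named above share the same estimator interface, so they can be compared with cross-validation in a few lines. The following is a rough sketch, assuming scikit-learn and NumPy are installed; the synthetic data and the hyperparameter values (`alpha`, `l1_ratio`, `n_estimators`) are arbitrary choices for illustration, not recommendations.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic data (an assumption of this sketch): only the first two of five
# features carry signal, the rest are pure noise.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=300)

models = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "ElasticNet": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated R^2 for each estimator (higher is better).
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: mean R^2 = {score:.3f}")
```

Because the target here is linear in the features, the linear models should score at least as well as the forest; on real data the ranking can easily differ, which is the reason for cross-validating rather than assuming.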

Notes

Lasso (Least Absolute Shrinkage and Selection Operator)
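
The "shrinkage and selection" in the name can be seen in a small coordinate-descent sketch: each coefficient is updated with a soft-threshold, which pulls it toward zero and sets it exactly to zero when its signal is weaker than the penalty. This is an illustrative toy, not how any particular library implements the Lasso; the function names, the synthetic data, and the penalty value `alpha=0.2` are assumptions, and the intercept is omitted for brevity.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1 / (2n)) * ||y - X b||^2 + alpha * ||b||_1 (no intercept)."""
    n, p = X.shape
    b = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution added back.
            partial = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ partial / n
            b[j] = soft_threshold(rho, alpha) / col_scale[j]
    return b


# Synthetic data (an assumption of this sketch): two of six true
# coefficients are non-zero.
rng = np.random.default_rng(3)
X = rng.normal(size=(400, 6))
true_b = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 0.0])
y = X @ true_b + rng.normal(scale=0.3, size=400)

b_hat = lasso_coordinate_descent(X, y, alpha=0.2)
print(np.round(b_hat, 3))
```

The two informative coefficients are kept but shrunk in magnitude relative to their true values, while the uninformative ones are driven to exactly zero, which is the variable-selection effect.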

Stepwise Regression

Bayesian, Ecological, and Robust regression

Trees

https://github.com/IBM/xgboost-smote-detect-fraud/blob/master/notebook/Fraud_Detection.ipynb XGB+SMOTE
https://github.com/andosa/treeinterpreter Article Tree Interpreter

AI

https://github.com/ajbrock/BigGAN-PyTorch BigGAN PyTorch

Linear Regression

Simple linear regression
Multiple linear regression
Curvilinear regression

Other

Time series analysis
Meta-analysis
Principal component analysis
Factor analysis
Cluster analysis
Discriminant analysis