TensorFlowOnSpark on GitHub:
https://github.com/yahoo/TensorFlowOnSpark
Deep Learning with Apache Spark and TensorFlow:
https://databricks.com/blog/2016/01/25/deep-learning-with-apache-spark-and-tensorflow.html
When Spark Meets TensorFlow: Principles and Practice of a Distributed Deep Learning Framework
https://juejin.im/post/5ad4b620f265da23a04a0ad0
Spark Random Forest: Algorithm Principles, Source Code Analysis, and a Hands-On Case Study
https://www.ibm.com/developerworks/cn/opensource/os-cn-spark-random-forest/
Scalable Machine Learning
https://docs.huihoo.com/machine-learning/scalable-machine-learning/
Papers
Spark: Cluster Computing with Working Sets, Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, Ion Stoica. USENIX HotCloud (2010).
http://people.csail.mit.edu/matei/papers/2010/hotcloud_spark.pdf
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, Ion Stoica. NSDI (2012).
https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final138.pdf
MLlib: Machine Learning in Apache Spark, X. Meng, J. Bradley, B. Yavuz, E. Sparks, S. Venkataraman, D. Liu, J. Freeman, D. Tsai, M. Amde, S. Owen, D. Xin, R. Xin, M. Franklin, R. Zadeh, M. Zaharia, A. Talwalkar. Preprint (2015).
https://arxiv.org/pdf/1505.06807.pdf