Reading List 4
List of Selected Papers on Algorithms for Large-Scale Graph Processing.
1/ [ISAAC'11] Goodrich, M. T., Sitchinava, N., & Zhang, Q. (2011, December). Sorting, searching, and simulation in the mapreduce framework. In International Symposium on Algorithms and Computation (pp. 374-383). Springer, Berlin, Heidelberg.

@inproceedings{goodrich2011sorting,
  title={Sorting, searching, and simulation in the mapreduce framework},
  author={Goodrich, Michael T and Sitchinava, Nodari and Zhang, Qin},
  booktitle={International Symposium on Algorithms and Computation},
  pages={374--383},
  year={2011},
  organization={Springer}
}

2/ [STOC'14] Andoni, A., Nikolov, A., Onak, K., & Yaroslavtsev, G. (2014, May). Parallel algorithms for geometric graph problems. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing (pp. 574-583).
Reading list & mathematics resources.
Math Reading List
List of Ebooks for Data Science
The Law - The mathematical foundations
  Statistical Inference - Casella & Berger
  Foundations of Applied Mathematics

History - Foundational works that provide additional context for more advanced concepts
  Convex Optimization - Boyd & Vandenberghe
  Probability Theory: The Logic of Science - Jaynes
  Clean Code - Martin

Poetry - Prose type works
  The Art of Data Analysis
  Why Predictions Fail
  Weapons of Math Destruction

Major Prophets - Seminal works on major topics
  Applied Regression Analysis - Draper & Smith
List of GitHub Repositories for Data Science
The Data Engineering Cookbook, GitHub
A curated list of data engineering tools for software developers, GitHub
Data Engineering Zoomcamp, GitHub
Python Data Science Handbook: full text in Jupyter Notebooks, GitHub
Data Science for Beginners - A Curriculum, GitHub
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines, GitHub
Papers & tech blogs by companies sharing their work on data science & machine learning in production, GitHub
An awesome Data Science repository to learn and apply for real world problems, GitHub
List of Data Science Cheatsheets to rule the world, GitHub
Data science interview questions and answers, GitHub
A curated list of applied machine learning and data science notebooks and libraries across different industries, GitHub
A curated list of data science blogs, GitHub