Data Science Interviews
1.48MB. 0 audio & 27 images. Updated 2020-09-29.
Description
This deck helps you study the material behind common data science interview questions. It focuses on data science theory, mostly statistics and machine learning, rather than practice (it contains no code). It contains general knowledge needed for problem solving, rather than specific problems.
Updated 9/20: Revised to make notes simpler, more concise, and more accurate. Many more pictures. Cards now have a reference link for easy access to source material. Split large cards into several parts.
Included:
 Statistics (counting and probability, hypothesis testing, distributions)
 Machine Learning (bias/variance, over/underfitting, regularization, bagging, boosting, gradient descent)
 Supervised Learning (linear and logistic regression, naive Bayes, decision trees, random forests, SVM, KNN, neural networks)
 Clustering (k-means, DBSCAN, spectral, hierarchical, evaluation)
 Metrics (precision, recall, F1, sensitivity, specificity, ROC, AUC, log loss)
 Linear Algebra (linear independence, orthogonality, bases, eigenvectors, invertible matrix theorem, SVD)
 Feature Engineering (feature selection, PCA)
 Soft Skills / behavioral and experience questions
Questions are grouped into subdecks by topic, such as stats, supervised learning, and clustering. You can study just the subtopics you want to learn and skip those you already know.
Questions are tagged as high-level (broad, important, conceptual knowledge), medium-level, and low-level (less common, details, equations). A good study strategy is to start with all high-level questions, then move on to medium and low.
Material is sourced from around the web and from Data Science Interviews Exposed by You et al. As you study the material, one good list of actual questions (and answers) on which to test your knowledge is Data Science Interview Questions & Detailed Answers.
Does not cover:
Data wrangling; Programming, engineering; Databases, SQL; Natural Language Processing; Deep Learning; Recommender Systems; Bayesian Methods; Time Series Analysis; Anomaly Detection; Visualization; Calculus; or the very basics.
Requires Anki 2.1+ for MathJax equations.
Sample (from 117 notes)
Cards are customizable!
When this deck is imported into the desktop program, cards will appear as
the deck author has made them. If you'd like to customize what appears on
the front and back of a card, you can do so by clicking the Edit button, and
then clicking the Cards button.
Front 
Explain bagging 
Back 
Bootstrap aggregating.
Train multiple models on subsamples and average predictions to reduce variance.
Usually uses "strong," low-bias models.

Ref 
https://en.wikipedia.org/wiki/Bootstrap_aggregating 
Credit 
https://en.wikipedia.org/wiki/Bootstrap_aggregating 
Tags 
highlevel 
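The bagging card above can be illustrated with a short sketch. This is not part of the deck (which contains no code); it is a minimal example assuming a 1-nearest-neighbor regressor as the low-bias base model and sampling with replacement for the bootstrap. The names `fit_1nn` and `bagged_predictor` are hypothetical.

```python
import random


def fit_1nn(sample):
    """Return a predictor that answers with the y of the closest training x."""
    def predict(x):
        return min(sample, key=lambda p: abs(p[0] - x))[1]
    return predict


def bagged_predictor(data, n_models=25, seed=0):
    """Train n_models base models on bootstrap samples and average them."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn with replacement.
        boot = [rng.choice(data) for _ in data]
        models.append(fit_1nn(boot))

    def predict(x):
        # Averaging over models is what reduces variance.
        return sum(m(x) for m in models) / len(models)
    return predict


data = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 3.0)]  # (x, y) pairs
predict = bagged_predictor(data)
print(predict(1.5))
```

Each base model sees a slightly different sample, so individual predictions vary; the average is smoother than any single model.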
Front 
How does an artificial neuron (perceptron) work? 
Back 
Applies an activation function to the weighted sum of its inputs.
\[ y = f \left( \textstyle \sum w_i x_i \right) \]
Common activation functions are linear, step, sigmoid, hyperbolic tangent (tanh), rectified linear (ReLU), ...

Ref 
https://en.wikipedia.org/wiki/Perceptron 
Credit 
http://www.theprojectspot.com/tutorialpost/introductiontoartificialneuralnetworkspart1/7 
Tags 
highlevel 
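The neuron card's formula, y = f(sum of w_i x_i), can be sketched in a few lines. This is an illustration, not part of the deck; it follows the card's formula as written, so no bias term is included, and the function names are hypothetical.

```python
import math


def neuron(inputs, weights, activation):
    """Apply the activation function to the weighted sum of the inputs."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return activation(z)


def step(z):
    """Step activation: fires (1.0) when the weighted sum is non-negative."""
    return 1.0 if z >= 0 else 0.0


def sigmoid(z):
    """Sigmoid activation: squashes the weighted sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear activation: passes positive sums, zeroes the rest."""
    return max(0.0, z)


x = [1.0, 2.0]
w = [0.5, -0.25]  # weighted sum is 0.5 - 0.5 = 0.0
print(neuron(x, w, step), neuron(x, w, sigmoid), neuron(x, w, relu))
```

Only the activation changes between the three calls; the weighted sum is the same in each.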
Front 
How do decision trees work, at a high level? 
Back 
Recursively split the data into groups based on the most discriminating feature; each leaf gives a prediction.

Ref 
https://victorzhou.com/blog/introtorandomforests/ 
Credit 
https://victorzhou.com/blog/introtorandomforests/ 
Tags 
highlevel 
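The recursive splitting described on the decision tree card can be sketched as follows. This is an illustration, not part of the deck; it assumes numeric features, threshold splits of the form `feature <= t`, and Gini impurity as the measure of how discriminating a split is (the card does not name a criterion). Class labels are assumed not to be tuples, since tuples are used here for internal nodes.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when all labels agree."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that most reduces impurity, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively split; a leaf is the majority label of its group."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return (j, t,
            build_tree([r for r, _ in left], [y for _, y in left],
                       depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right],
                       depth + 1, max_depth))


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree


rows = [[1, 10], [2, 20], [3, 10], [4, 20]]
labels = ["a", "a", "b", "b"]
tree = build_tree(rows, labels)
print(tree, predict(tree, [1.5, 99]))
```

In this toy data only the first feature separates the classes, so the tree splits on it once and stops.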
After the file is downloaded, double-click on it to open it in the desktop program.
At this time, it is not possible to add shared decks directly to your AnkiWeb account; they need to be added from the desktop then synchronized to AnkiWeb.
Reviews
on 1617753628
Thanks
on 1613975167
Thanks
on 1608738031
Clear and concise questions and answers.
on 1600225780
Nice deck. Covers main concepts!
on 1586692853
Using it a lot! Thanks
on 1584546694
Good!
on 1581520587
Thank you
on 1562757741
Awesome
on 1550469199
Exactly what I was looking for!
on 1547303850
Thank you! Good job!
on 1535528578
Thanks a lot!
on 1524787200
Quite a comprehensive deck, with some subtle questions