Data Science Interviews
1.48MB. 0 audio & 27 images. Updated 2020-09-29.
Description
This deck helps you study the material behind common data science interview questions. It focuses on data science theory, mostly statistics and machine learning, rather than practice (it contains no code). It contains general knowledge needed for problem solving, rather than specific problems.
Updated 9/20: Revised to make notes simpler, more concise, and more accurate. Many more pictures. Cards now have a reference link for easy access to source material. Split large cards into several parts.
Included:
- Statistics (counting and probability, hypothesis testing, distributions)
- Machine Learning (bias/variance, over/underfitting, regularization, bagging, boosting, gradient descent)
- Supervised Learning (linear and logistic regression, naive bayes, decision trees, random forests, SVM, KNN, neural networks)
- Clustering (k-means, DBSCAN, spectral, hierarchical, evaluation)
- Metrics (precision, recall, F1, sensitivity, specificity, ROC, AUC, log loss)
- Linear Algebra (linear independence, orthogonality, bases, eigenvectors, invertible matrix theorem, SVD)
- Feature Engineering (feature selection, PCA)
- Soft Skills / behavioral and experience questions
Questions are grouped into sub-decks by topic, such as stats, supervised learning, and clustering. You can study just the sub-topics you want to learn and skip those you already know.
Questions are tagged as high-level (broad, important, conceptual knowledge), medium-level, and low-level (less common, details, equations). A good study strategy would be to start with all high-level questions, then move on to medium and low.
Material is sourced from around the web and from Data Science Interviews Exposed by You et al. As you study the material, one great list of actual questions (and answers) on which to test your knowledge is Data Science Interview Questions & Detailed Answers.
Does not cover:
Data wrangling; Programming, engineering; Databases, SQL; Natural Language Processing; Deep Learning; Recommender Systems; Bayesian Methods; Time Series Analysis; Anomaly Detection; Visualization; Calculus; or the very basics.
Requires Anki 2.1+ for MathJax equations.
Sample (from 117 notes)
Cards are customizable!
When this deck is imported into the desktop program, cards will appear as the deck author has made them. If you'd like to customize what appears on the front and back of a card, you can do so by clicking the Edit button, and then clicking the Cards button.
Front |
Explain bagging |
Back |
Bootstrap aggregating. Train multiple models on subsamples and average predictions to reduce variance. Usually uses "strong," low-bias models.
 |
Ref |
https://en.wikipedia.org/wiki/Bootstrap_aggregating |
Credit |
https://en.wikipedia.org/wiki/Bootstrap_aggregating |
Tags |
high-level |
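The bagging card can be illustrated with a minimal sketch. It assumes a 1-nearest-neighbour regressor as the "strong," low-bias base model and draws each subsample with replacement (a bootstrap sample); the names `fit_1nn` and `bagged_fit` and the toy data are hypothetical and not part of the deck.

```python
import random
from statistics import mean


def fit_1nn(points):
    """Return a 1-nearest-neighbour regressor (low bias, high variance)."""
    def predict(x):
        # Predict the y value of the training point whose x is closest.
        nearest = min(points, key=lambda p: abs(p[0] - x))
        return nearest[1]
    return predict


def bagged_fit(points, n_models=25, seed=0):
    """Train n_models base models on bootstrap samples and average them."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_1nn(sample))

    def predict(x):
        # Averaging the individual predictions is what reduces variance.
        return mean([m(x) for m in models])
    return predict


# Toy data (an assumption for illustration): noisy samples of y = 2x.
noise = random.Random(42)
data = [(i / 10, 2 * i / 10 + noise.gauss(0, 0.5)) for i in range(50)]

single = fit_1nn(data)     # one model: follows the noise of its nearest point
bagged = bagged_fit(data)  # ensemble: averages over many resampled models
print(single(2.5), bagged(2.5))
```

For classification, the averaging step would typically be replaced by a majority vote over the models.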
Front |
How does an artificial neuron (perceptron) work? |
Back |
Applies an activation function to the weighted sum of its inputs. \[ y = f \left( \textstyle \sum w_i x_i \right) \] Common activation functions are linear, step, sigmoid, hyperbolic tangent (tanh), rectified linear...
 |
Ref |
https://en.wikipedia.org/wiki/Perceptron |
Credit |
http://www.theprojectspot.com/tutorial-post/introduction-to-artificial-neural-networks-part-1/7 |
Tags |
high-level |
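The formula on the neuron card, y = f(Σ w_i x_i), can be written out directly. This is a minimal sketch: the function names are hypothetical, and the AND-gate weights are hand-picked for illustration rather than learned. The card's formula has no separate bias term, so the sketch models the bias as an extra input fixed at 1.

```python
import math


def step(z):
    """Step activation: fires (1.0) when the weighted sum is non-negative."""
    return 1.0 if z >= 0 else 0.0


def sigmoid(z):
    """Sigmoid activation: squashes the weighted sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear activation: passes positive sums, zeroes the rest."""
    return max(0.0, z)


def neuron(inputs, weights, activation):
    """Apply the activation function to the weighted sum of the inputs."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return activation(z)


# A logical AND gate: the third input is a constant 1 acting as the bias.
and_weights = [1.0, 1.0, -1.5]
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, neuron([x1, x2, 1], and_weights, step))
```

Swapping `step` for `sigmoid`, `math.tanh`, or `relu` changes only the output shape, not the weighted-sum step.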
Front |
How do decision trees work, high-level? |
Back |
Recursively split the data into groups based on the most discriminating feature; each leaf gives a prediction.
 |
Ref |
https://victorzhou.com/blog/intro-to-random-forests/ |
Credit |
https://victorzhou.com/blog/intro-to-random-forests/ |
Tags |
high-level |
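The recursive splitting described on the decision tree card can be sketched as follows. The card does not say how "most discriminating" is measured; this sketch assumes weighted Gini impurity on numeric features, and all names and the toy data are hypothetical. Class labels must not be tuples, because tuples are used here to represent internal nodes.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) that most lowers impurity, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build(rows, labels, depth=0, max_depth=3):
    """Recursively split; return a class label (leaf) or a node tuple."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))


def predict(tree, row):
    """Walk from the root to a leaf and return that leaf's prediction."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Toy data: feature 0 separates the classes, feature 1 is noise.
rows = [(1, 0), (2, 1), (3, 0), (8, 1), (9, 0), (10, 1)]
labels = ["a", "a", "a", "b", "b", "b"]
tree = build(rows, labels)
print(tree, predict(tree, (2.5, 1)), predict(tree, (7, 0)))
```

Because a split is accepted only when it strictly lowers impurity, this greedy sketch stops early on data where no single split helps (XOR-like patterns, for example).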
After the file is downloaded, double-click on it to open it in the desktop program.
At this time, it is not possible to add shared decks directly to your AnkiWeb account - they need to be added from the desktop then synchronized to AnkiWeb.
Reviews
on
1617753628
Thank you
on
1613975167
thanks
on
1608738031
Clear and concise questions and answers.
on
1600225780
Nice deck. Covers main concepts!
on
1586692853
Using it a lot! Thanks
on
1584546694
Good!
on
1581520587
Thank you
on
1562757741
Awesome
on
1550469199
Exactly what I was looking for!
on
1547303850
Thank you! Good job!
on
1535528578
Thanks a lot!
on
1524787200
quite a comprehensive deck, with some subtle questions