Data Science Questions Asked at Microsoft:
1. How best to select a representative sample of search queries from 5 million?
2. Three friends in Seattle told you it’s rainy. Each has a probability of 1/3 of lying. What’s the probability that Seattle is rainy?
3. Can you explain the fundamentals of Naive Bayes? How do you set the threshold?
4. Can you explain what MapReduce is and how it works?
5. Can you explain SVM?
6. How do you detect if a new observation is an outlier? What is the bias-variance trade-off?
7. Discuss how to randomly select a sample from a product user population.
8. How do you implement autocomplete?
9. Describe how gradient boosting works.
10. Find the maximum subsequence in an integer list.
11. What would you do to summarize a Twitter feed?
12. Explain the steps for data wrangling and cleaning before applying machine learning algorithms.
13. How to deal with unbalanced binary classification?
14. How to measure distance between data points?
15. Define variance.
16. What is the difference between a box plot and a histogram?
17. How do you solve the L2-regularized regression problem?
18. How to compute an inverse matrix faster by playing around with some computational tricks?
19. How would you perform a series of calculations without a calculator? Explain the logic behind the steps.
20. What is the difference between good and bad data visualization?
21. How do you find percentile? Write the code for it.
22. Find max sum subsequence from a sequence of values.
23. What is the difference between L1 and L2 regularization?
24. Create a function that checks if a word is a palindrome.
25. Merge k (in this case k=2) arrays and sort them.
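Questions 10/22, 24 and 25 ask for code directly. A minimal Python sketch follows, under assumptions the questions do not state: "subsequence" is read as a non-empty contiguous run (so Kadane's algorithm applies), the palindrome check ignores case, and the arrays to merge may arrive unsorted. The function names are made up for illustration.

```python
import heapq


def max_subarray_sum(values):
    """Largest sum of a non-empty contiguous run (Kadane's algorithm)."""
    best = current = values[0]
    for v in values[1:]:
        current = max(v, current + v)  # extend the current run or restart at v
        best = max(best, current)
    return best


def is_palindrome(word):
    """True if the word reads the same forwards and backwards, ignoring case."""
    w = word.lower()
    return w == w[::-1]


def merge_and_sort(*arrays):
    """Sort each array, then k-way merge them into one sorted list."""
    return list(heapq.merge(*(sorted(a) for a in arrays)))
```

For k=2, `merge_and_sort([3, 1], [2, 4])` sorts each input and merges them into a single sorted list.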
- My Guess:
Ans 1. Sort them alphabetically and then select items at a regular interval, e.g. 1 out of every 10,000.
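- Ans 2. 1 - (1/3)^3 = 1 - 1/27 ≈ 96%? That is the chance that not all three are lying; a full answer would also weigh the prior chance of rain using Bayes' rule.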
- Ans 4. Split the data into multiple parts of equal size, process each part separately, then merge the results in a final pass.
- Ans 6. An observation is an outlier if its deviation from the mean is greater than 3 standard deviations.
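The guesses above can be turned into a rough Python sketch. None of this is an official answer: the function names, the random start offset in the sampler, the default prior of 0.5 for rain, and the word-count task used to illustrate MapReduce are all assumptions added for illustration. For Ans 2 the sketch applies Bayes' rule, assuming all three friends said it was raining and that they lie independently.

```python
import random
import statistics
from collections import defaultdict


def systematic_sample(queries, step=10_000, seed=None):
    """Ans 1: sort, then take every `step`-th item from a random start."""
    ordered = sorted(queries)
    start = random.Random(seed).randrange(step) if len(ordered) >= step else 0
    return ordered[start::step]


def posterior_rain(prior=0.5, p_lie=1 / 3, n_friends=3):
    """Ans 2: P(rain | every friend says rain), via Bayes' rule."""
    say_rain_if_rain = (1 - p_lie) ** n_friends  # all are telling the truth
    say_rain_if_dry = p_lie ** n_friends         # all are lying
    evidence = prior * say_rain_if_rain + (1 - prior) * say_rain_if_dry
    return prior * say_rain_if_rain / evidence


def map_reduce_word_count(documents, n_parts=4):
    """Ans 4: split the input, count words per part (map), merge counts (reduce).

    Runs sequentially here; a real framework would process parts in parallel.
    """
    parts = [documents[i::n_parts] for i in range(n_parts)]
    partial_counts = []
    for part in parts:  # map phase: one independent pass per part
        counts = defaultdict(int)
        for doc in part:
            for word in doc.split():
                counts[word] += 1
        partial_counts.append(counts)
    total = defaultdict(int)
    for counts in partial_counts:  # reduce phase: merge partial counts by key
        for word, n in counts.items():
            total[word] += n
    return dict(total)


def is_outlier(x, data, n_sigma=3):
    """Ans 6: flag x if it is more than n_sigma standard deviations from the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation, needs >= 2 points
    return abs(x - mean) > n_sigma * sd
```

With the assumed 50% prior, `posterior_rain()` works out to 8/9, about 89%, rather than 96%; the result moves with whatever prior is plugged in.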
3-Apr-2018 11:53 pm