Data Mining Engineer interview questions

56 Data Mining Engineer interview questions shared by candidates

Top interview questions

Yelp
A Search and Data Mining Engineer was asked... June 6, 2016

List the strings that are anagrams of each other in a set of strings.

2 answers

Sorting the strings is not optimal because each sort is O(N log N), where N is the number of characters in the word. A better approach is to encode each word as a hash table of character frequencies, which takes O(N) per word, and then group the words whose tables match.

Sort the characters of each string and compare the results.
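The character-frequency idea from the first answer can be sketched as follows. This is only an illustration under assumptions: the function name `group_anagrams` is hypothetical, and the frequency table is turned into a `frozenset` of (character, count) pairs so that it can serve as a dictionary key.

```python
from collections import Counter, defaultdict


def group_anagrams(words):
    """Return the groups of words that are anagrams of each other."""
    groups = defaultdict(list)
    for word in words:
        # Counting characters is O(N) in the word length; the frozenset of
        # (char, count) pairs is hashable and independent of character order.
        key = frozenset(Counter(word).items())
        groups[key].append(word)
    # Keep only groups with at least two members, i.e. actual anagrams.
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    print(group_anagrams(["listen", "silent", "enlist", "google", "gogole", "cat"]))
```

Replacing the key with `"".join(sorted(word))` gives the sort-and-compare variant from the second answer, at O(N log N) per word.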

Adobe

How would you design a recommendation system (like Amazon's)?

2 answers

Use collaborative filtering to compare a user's preferences with those of other users. If A and B are similar, we can recommend B's preferred items to A.

Why the downvote on the other answer? He/she is right. Collaborative filtering is the most common strategy for recommendation systems. You see that user A bought these things and user B also bought those things, but user B bought this other thing too, so let's show that thing to user A.
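The "users who bought this also bought that" idea in these answers can be sketched on toy data. Everything here is an assumption for illustration: the `recommend` function, the dictionary-of-sets input, and the choice of Jaccard overlap as the similarity measure are not taken from the answers.

```python
def recommend(purchases, user, top_n=3):
    """Suggest items bought by users whose baskets overlap with `user`'s.

    purchases: dict mapping a user name to the set of items they bought
    (hypothetical toy format).
    """
    mine = purchases[user]
    scores = {}
    for other, items in purchases.items():
        if other == user:
            continue
        union = mine | items
        # Jaccard similarity: shared items divided by all items of both users.
        similarity = len(mine & items) / len(union) if union else 0.0
        if similarity == 0.0:
            continue
        # Items the similar user has and we lack become candidates,
        # weighted by how similar that user is.
        for item in items - mine:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


if __name__ == "__main__":
    toy = {
        "A": {"book", "lamp"},
        "B": {"book", "lamp", "desk"},
        "C": {"phone"},
    }
    print(recommend(toy, "A"))
```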

LinkedIn

Implement a sampling function with nominal distribution.

2 answers

I think you mean Normal distribution! If you are using R, call set.seed() for reproducibility and then rnorm() with the sample size, mean, and SD, e.g. set.seed(123); rnorm(100, 2, 5)

I'm the original poster; sorry for my typo. I actually meant multinomial distribution. The follow-up question was: if the probabilities are heavily skewed, how would you speed up your algorithm? You can find both answers on Wikipedia. :)
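A hedged sketch of two ways to draw one category from a multinomial (categorical) distribution. The thread does not say which speed-up the interviewers had in mind; the alias method shown here is one well-known option with O(1) draws after O(k) setup, and all function names are hypothetical.

```python
import bisect
import itertools
import random


def build_cdf(probs):
    """Cumulative sums of the probabilities, for inverse-CDF sampling."""
    return list(itertools.accumulate(probs))


def sample_cdf(cdf, rng=random):
    """Baseline: binary search in the cumulative sums, O(log k) per draw."""
    u = rng.random() * cdf[-1]
    # The min() guards against a floating-point edge case at the top end.
    return min(bisect.bisect_right(cdf, u), len(cdf) - 1)


def build_alias(probs):
    """Build alias tables (Vose's variant); probs must sum to 1."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        g = large.pop()
        prob[s] = scaled[s]          # chance of keeping column s
        alias[s] = g                 # otherwise fall through to g
        scaled[g] = scaled[g] + scaled[s] - 1.0
        (small if scaled[g] < 1.0 else large).append(g)
    for i in large + small:          # leftovers are full columns
        prob[i] = 1.0
    return prob, alias


def sample_alias(prob, alias, rng=random):
    """O(1) per draw: pick a column uniformly, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


if __name__ == "__main__":
    prob, alias = build_alias([0.7, 0.2, 0.1])
    print([sample_alias(prob, alias) for _ in range(10)])
```

For a heavily skewed distribution, a simpler alternative is to sort the categories by descending probability so that a linear scan over the cumulative sums usually stops after the first few entries.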

LinkedIn

Only one easy/medium leetcode question during the coding module.

1 answer

I got the optimal solution (with a couple of nudges but time to spare), yet apparently this was the only module where I did not "meet expectations." Shame that some presumably small mistake in my first hour was enough to discount the otherwise very strong 6-hour interview.

Yelp

HackerRank: remove 2 or more e's from a string.

1 answer

Just use the string's replace function.
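The question as posted is ambiguous. A minimal sketch, assuming it means "delete every run of two or more consecutive e's": a regular expression handles runs of any length, whereas a plain `"eee".replace("ee", "")` would leave one `e` behind. The function name is hypothetical.

```python
import re


def remove_repeated_e(text):
    """Delete every run of two or more consecutive 'e' characters."""
    # e{2,} matches "ee", "eee", ... greedily, so a whole run goes at once.
    return re.sub(r"e{2,}", "", text)


if __name__ == "__main__":
    print(remove_repeated_e("beekeeper"))
```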

Yelp

Find the longest palindrome in a string.

1 answer

There is an O(n) algorithm (Manacher's algorithm), which I think is what they wanted, because I correctly coded both the O(n^2) and O(n^3) solutions and was rejected later.
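The linear-time approach usually meant here is Manacher's algorithm. Below is a sketch, assuming the task is the longest palindromic substring and that `#` does not need special treatment (it is only used as a separator between characters, so it may safely occur in the input too). The function name is hypothetical.

```python
def longest_palindrome(s):
    """Return one longest palindromic substring of s in O(n) time."""
    if not s:
        return ""
    # Interleave separators so even-length palindromes get a center too.
    t = "#" + "#".join(s) + "#"
    n = len(t)
    radius = [0] * n          # radius[i]: palindrome half-width around t[i]
    center = right = 0        # rightmost palindrome found so far
    for i in range(n):
        if i < right:
            # Reuse the mirrored position's radius, capped at the boundary.
            radius[i] = min(right - i, radius[2 * center - i])
        lo = i - radius[i] - 1
        hi = i + radius[i] + 1
        # Expand past what is already known to match.
        while lo >= 0 and hi < n and t[lo] == t[hi]:
            radius[i] += 1
            lo -= 1
            hi += 1
        if i + radius[i] > right:
            center, right = i, i + radius[i]
    best = max(range(n), key=lambda i: radius[i])
    # A radius in t equals the palindrome length in s.
    start = (best - radius[best]) // 2
    return s[start:start + radius[best]]


if __name__ == "__main__":
    print(longest_palindrome("forgeeksskeegfor"))
```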

Ping An Insurance

Difference between L1 and L2 regularization.

1 answer

Mathematically speaking, regularization adds a penalty term to the loss to keep the coefficients from fitting the training data so closely that the model overfits. The difference is that the L2 penalty is the sum of the squared weights, while the L1 penalty is the sum of the absolute values of the weights.
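The two penalty terms can be written out directly. This is a small sketch with hypothetical function names, where `lam` stands for the regularization strength; note that the L1 term sums absolute values, since a plain sum would let positive and negative weights cancel.

```python
def l1_penalty(weights, lam):
    """L1 (lasso) term: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 (ridge) term: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


if __name__ == "__main__":
    w = [1.0, -2.0, 3.0]
    print(l1_penalty(w, 0.5), l2_penalty(w, 0.5))
```

In practice the L1 term tends to push some weights exactly to zero, giving sparse models, while the L2 term shrinks all weights smoothly without zeroing them.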

Adobe

Design a recommendation system.

1 answer

It depends on the volume of data that we have. Assuming there is a lot of data on hand, it is best to use collaborative filtering. This involves finding users or items similar to the ones we are recommending for, and taking a weighted average of their liking for the product to decide whether to recommend it. This could be implemented as user-user collaborative filtering, where we find similar users, or as item-item collaborative filtering. If we have less data to work with, it is a better idea to use a content-based filtering approach, where we create profiles for the users and recommend products based on the features in those profiles.
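The weighted-average step of user-user collaborative filtering described in this answer can be sketched as follows. The data layout (a dictionary of per-user rating dictionaries), the cosine similarity, and the function names are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when nobody comparable has rated the item.
    """
    numerator = denominator = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine(ratings[user], their)
        numerator += sim * their[item]
        denominator += abs(sim)
    return numerator / denominator if denominator else None


if __name__ == "__main__":
    toy = {
        "ann": {"x": 5, "y": 1},
        "bob": {"x": 5, "y": 1, "z": 4},
        "cat": {"x": 1, "y": 5, "z": 2},
    }
    print(predict(toy, "ann", "z"))
```

In the toy data, ann's tastes match bob's more closely than cat's, so the prediction for item z lands nearer to bob's rating.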

VK

What feature can you propose for the Odnoklassniki social network?

1 answer

I proposed the joint purchasing and joint fitness recommendation services for social network users. The interviewers were impressed by this idea.

LinkedIn

Find the point where the sum of distance to all other points is minimized.

1 answer

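The point closest to the mean of all the points. Strictly speaking, the mean minimizes the sum of squared distances; the sum of plain distances is minimized by the geometric median (the ordinary median in one dimension).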
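A brute-force check on toy data, assuming the answer must be one of the given points (the medoid) and that distance means Euclidean distance; the function names are hypothetical. It compares the true minimizer with the "closest to the mean" heuristic, which can differ because the mean minimizes squared distances rather than plain ones.

```python
import math


def total_distance(p, points):
    """Sum of Euclidean distances from p to every point."""
    return sum(math.dist(p, q) for q in points)


def medoid(points):
    """O(n^2) search for the given point with the smallest total distance."""
    return min(points, key=lambda p: total_distance(p, points))


def closest_to_mean(points):
    """The given point nearest to the coordinate-wise mean."""
    dim = len(points[0])
    mean = tuple(sum(p[d] for p in points) / len(points) for d in range(dim))
    return min(points, key=lambda p: math.dist(p, mean))


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (3, 0), (100, 0)]
    print(medoid(pts), closest_to_mean(pts))
```

With one far outlier, the mean is dragged toward it, so the two functions pick different points on this input.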
