Pregunta de entrevista
Entrevista de Data Scientist, Analytics
-
MetaWe have a table called ad_accounts(account_id, date, status). Status can be active/closed/fraud. A) what percent of active accounts are fraud? B) How many accounts became fraud today for the first time? C) What would be the financial impact of letting fraud accounts become active (how would you approach this question)?
Respuestas de entrevistas
6 respuestas
A) what percent of active accounts are fraud? Select sum(Case when status = ‘fraud’ then 1 else 0 end)/count(*) as Fraud_percentage from ad_accounts where status ‘closed’; B) How many accounts became fraud today for the first time? select count(*) from ( select account_id, min(date) as First_fraud from ad_accounts where status = 'fraud' group by account_id having First_fraud = current_date() );
Anónimo en
Yep, should be A) what percent of active accounts are fraud? SELECT COUNT(DISTINCT t2.account_id)/COUNT( DISTINCT t1.account_id) AS perc_fraud FROM ad_accounts AS t1 LEFT JOIN ad_accounts AS t2 ON t1.account_id = t2.account_id AND t2.status = 'fraud' AND t2.date > t1.date WHERE t1.status = 'active'
Interviewquery en
For question B, if I assume i have today's data ans yesterday's data in the table, would this work? Select Count (distinct a.Account_id) From ad_accounts A Inner join ad_accounts b On a.account_id=b.account_id Where a.date=current_data and b.date=date_add (‘day’, -1, current_date) And a.status=’fraud’ And b.status!=’fraud’
Yael en
A) what percent of active accounts are fraud? SELECT COUNT(DISTINCT t2.account_id)/COUNT( DISTINCT t1.account_id) AS perc_fraud FROM ad_accounts AS t1 INNER JOIN ad_accounts AS t2 ON t1.account_id = t2.account_id AND t2.status = 'fraud' AND t2.date > t1.date WHERE t1.account_id = 'active' B) How many accounts became fraud today for the first time? SELECT COUNT(DISTINCT t1.account_id) AS fraud_today FROM ad_accounts AS t1 INNER JOIN ad_accounts AS t2 ON t1.account_id = t2.account_id AND t2.status 'fraud' AND t2.date < t1.date WHERE t1.status = 'fraud' AND DATE_TRUNC('day, t1.date) = '2019-04-20'::timestamp
JP en
^ You need to left join
@JP en
^ For the first query, status DIFFERENT than 'closed'.
Anónimo en