Questions from leetcode
1. Customers placing the largest number of orders
=> 단순하게 MAX(order_number)라고 했는데 여기서 order_number는 주문번호이기에 크기 비교용이 아님. 따라서 각 customer마다 몇 건의 order record가 있는지를 count해서 그 빈도가 가장 큰 사람의 customer_id를 제시해야함.
SELECT customer_number FROM orders GROUP BY customer_number ORDER BY COUNT(*) DESC LIMIT 1; |
SELECT customer_number FROM orders GROUP BY customer_number HAVING COUNT(customer_number) >= all (SELECT COUNT(customer_number) FROM orders GROUP BY customer_number); |
COUNT(*): count(*) / count(1) 과 count(변수이름) 은 NULL 값을 포함하는 여부가 다르다.
count(*) 과 count(1) 은 NULL 값인 컬럼도 포함하지만, count(name) 은 NULL 값인 컬럼은 포함하지 않는다.
ALL/ANY: The operator must be a standard comparison operator (=, <>, !=, >, >=, <, or <=).
SELECT column_name(s) FROM table_name WHERE column_name operator ALL (SELECT column_name FROM table_name WHERE condition); |
HAVING: HAVING에서는 위의 SELECT파트에서 만든 새로운 변수 이름을 쓸 수 없다.
*Order of clause: SELECT - FROM - WHERE - GROUP BY - HAVING - ORDER BY
Follow-up: What if more than one customer has the largest number of orders, can you find all the customer_number in this case?
SELECT customer_number FROM Orders GROUP BY customer_number HAVING COUNT(order_number) = ( SELECT COUNT(order_number) AS order_cnt FROM Orders GROUP BY customer_number ORDER BY 1 DESC LIMIT 1) |
order_number의 빈도수가 데이터에서 나오는 최고 값과 같은 모든 customer_number를 리턴함
2. Several friends at a cinema ticket office would like to reserve consecutive available seats.
Can you help to query all the consecutive available seats order by the seat_id using the following cinema table?
=> Idea: Self-join을 활용하여 풀어보자
SELECT DISTINCT a.seat_id FROM cinema a JOIN cinema b ON ABS(a.seat_id - b.seat_id) = 1 AND a.free = true and b.free = true ORDER BY a.seat_id; SELECT DISTINCT seat_id FROM cinema WHERE free = 1 AND (seat_id - 1 IN (SELECT seat_id FROM cinema WHERE free = 1) OR seat_id + 1 IN (SELECT seat_id FROM cinema WHERE free = 1)) ORDER BY seat_id; |
To find the consecutive available seats, the value in the a.seat_id should be more(or less) than the value b.seat_id, and both of them should be free.
seat_id free seat_id free
1 | 1 | 1 | 1 |
2 | 0 | 1 | 1 |
3 | 1 | 1 | 1 |
4 | 1 | 1 | 1 |
5 | 1 | 1 | 1 |
*ABS() = 1 을 ON의 조건으로 걸 수 있다는 점!
*알리아스.변수명 = 변수값 이렇게 쓸 수 있다는 점.
*DISTINCT를 안쓸 경우 4 - 3, 4 - 5 모두가 잡혀서 좌석은 4 하나인데 레코드가 두 개 잡히게 됨.
*IN: IN is a logical operator in SQL that allows you to specify a list of values that you'd like to include in the results.
ex) SELECT *
FROM tutorial.billboard_top_100_year_end
WHERE "group" IN ('M.C. Hammer', 'Hammer', 'Elvis Presley')
3. #1511 Customer Order Frequency
Write an SQL query to report the customer_id and name of customers who have spent at least $100 in each month of June and July 2020.
* Table1: Customers (customer_id | name | country)
* Table2: Product (product_id | description | price)
* Table3: Orders (order_id | customer_id | product_id | order_date | quantity)
SELECT customer_id, name FROM Customers JOIN Orders USING (customer_id) JOIN Product USING (product_id) GROUP BY customer_id HAVING SUM(IF(LEFT(order_date, 7)='2020-06', quantity, 0) * price)>=100 AND SUM(IF(LEFT(order_date,7)='2020-07', quantity, 0)*price)>=100 |
JOIN하는 두 테이블의 Column 이름이 같으면 USING 사용 가능 (간단), 다르면 ON 사용해야함
여러개의 JOIN을 연달아 하기 가능
문제에서 >=$100라는 조건을 주었는데 이는 SUM값으로 구할 수 있다. 중요한 부분은 GROUP BY가 HAVING보다 먼저온다는 것! month함수를 쓰기보다 LEFT로 간단하게 제약을 주었고, quantity 또는 0으로 equation을 간결하게 한 것이 멋진 쿼리이다.
4. #1322. Ads Performance
A company is running Ads and wants to calculate the performance of each Ad.
Performance of the Ad is measured using Click-Through Rate (CTR) where:
Write an SQL query to find the CTR of each Ad. Round CTR to two decimal points.
Return the result table ordered by CTR in descending order and by ad_id in ascending order in case of a tie.
SELECT
for ad_id = 3, ctr = (1/(1+1)) * 100 = 50.00 for ad_id = 5, ctr = 0.00, Note that ad_id = 5 has no clicks or views. Note that we do not care about Ignored Ads.
SELECT ad_id, IFNULL(ROUND(SUM(CASE WHEN action='Clicked' THEN 1 ELSE 0 END)/ SUM(CASE WHEN action='Clicked' OR action='Viewed' THEN 1 ELSE 0 END)*100 ,2),0) AS ctr FROM Ads GROUP BY ad_id ORDER ctr DESC |
*IFNULL statement: SELECT IFNULL(Column명, "Null일 경우 대체 값") FROM 테이블명;
예제: "SELECT keyword, IFNULL( common_rep, IFNULL( fur_rep, makeup_rep ) ) as syn_2 FROM syn_color";
출처: https://solbel.tistory.com/1074 [개발자의 끄적끄적]
*COALESCE는 지정한 표현식들 중에 NULL이 아닌 첫 번째 값을 반환한다.
*SUM(action='Clicked') 같이 category 변수의 SUM? COUNT와의 차이는? ENUM type이라 index를 사용한다고 하는데, 안전하게 CASE WHEN 사용하는 것으로!
5. #1083. Sales Analysis II
Write an SQL query that reports the buyers who have bought S8 but not iPhone. Note that S8 and iPhone are products present in the Product table. Return the result table in any order.
The query result format is in the following example.
-- Product: product_id | product_name | unit_price
-- Sales: seller_id | product_id | buyer_id | sale_date | quantity | price
SELECT buyer_id FROM Sales LEFT JOIN Product USING(product_id) GROUP BY buyer_id HAVING SUM(CASE WHEN product_name = 'S8' THEN 1 ELSE 0 END) > 0 AND SUM(CASE WHEN product_name ='iPhone' THEN 1 ELSE 0 END) = 0; |
6. #610. Triangle Judgement
SELECT x, y, z, CASE WHEN x+y > z AND x+z > y AND z+y>x THEN 'Yes' ELSE 'No' END AS 'triangle' FROM triangle |
7. #607. Sales Person
Given three tables: salesperson, company, orders.
Output all the names in the table salesperson, who didn’t have sales to company 'RED'.
salesperson: sales_id | name | salary | commission_rate | hire_date
company: com_id | name | city
orders: order_id | order_date | com_id | sales_id | amount
SELECT name FROM salesperson WHERE sales_id NOT IN (SELECT sales_id FROM orders JOIN company USING(com_id) WHERE name = 'red') |
*NOT: NOT is a logical operator in SQL that you can put before any conditional statement to select rows for which that statement is false.
ex) SELECT * FROM tutorial.billboard_top_100_year_end WHERE year = 2013 AND year_rank NOT BETWEEN 2 AND 3
=> In the above case, you can see that results for which year_rank is equal to 2 or 3 are not included. (No 2, No 3)
https://mode.com/sql-tutorial/sql-not-operator/
SQL NOT | Basic SQL - Mode
Starting here? This lesson is part of a full-length tutorial in using SQL for Data Analysis. Check out the beginning. In this lesson we'll cover: The SQL NOT operator NOT is a logical operator in SQL that you can put before any conditional statement to sel
mode.com
8. #1303. Find the Team Size
Write an SQL query to find the team size of each of the employees.
Return result table in any order. The query result format is in the following example.
Employee: employee_id | team_id
SELECT employee_id, team_size FROM Employee JOIN (SELECT team_id, COUNT(*) AS team_size FROM Employee GROUP BY team_id) team USING(team_id) |
SELECT employee_id, COUNT(*) OVER (PARTITION BY team_id) AS team_size FROM Employee |
*AS 다음에 소문자하고 _ 로만 변수명을 하면 ' 안해도 됨
*team_id 별로 COUNT 한 것을 team_size라고 하자는 뜻 (window function)
9. #511. Game Play Analysis I
Write an SQL query to report the first login date for each player. Return the result table in any order.
Activity: player_id | device_id | event_date | games_played
SELECT player_id, MIN(event_date) AS first_login FROM Activity GROUP BY player_id |
10. #1068. Product Sales Analysis I
Write an SQL query that reports the product_name, year, and price for each sale_id in the Sales table.
Sales: sale_id | product_id | year | quantity | price
Product: product_id | product_name
SELECT product_name, year, price FROM Sales LEFT JOIN Product USING (product_id) |
11. #1251. Average Selling Price
Write an SQL query to find the average selling price for each product. average_price should be rounded to 2 decimal places.
Prices: product_id | start_date | end_date | price
UnitsSold: product_id | purchase_date | units
SELECT u.product_id, ROUND(SUM(units * price)/SUM(units),2) AS average_price FROM UnitsSold u LEFT JOIN Prices p ON u.product_id = p.product_id AND u.purchase_date >= p.start_date AND u.purchase_date <= p.end_date GROUP BY 1 |
SELECT u.product_id, ROUND(SUM(u.units*p.price)/SUM(u.units),2) AS average_price FROM UnitsSold u JOIN Prices p ON u.product_id=p.product_id AND (u.purchase_date BETWEEN p.start_date AND p.end_date) GROUP BY product_id; |
*GROUP BY에서 product_id는 공통된 이름이어서 하나의 칼럼 => no need to specify
*JOIN을 주로 사용하는데 LEFT JOIN을 사용안하는 이유는?
12. #1113. Reported Posts
Write an SQL query that reports the number of posts reported yesterday for each report reason. Assume today is 2019-07-05. Return the result table in any order.
Actions: user_id | post_id | action_date | action | extra
SELECT extra AS report_reason, COUNT(extra) AS report_count FROM Actions WHERE action_date = '2019-07-04' AND action = 'report' GROUP BY extra |
SELECT extra AS report_reason, COUNT(DISTINCT post_id) as report_count FROM Actions WHERE action_date = '2019-07-04' AND action = 'report' GROUP BY extra |
** 왼쪽 내 코드에서의 실수: 한 포스팅당 action을 여러개 할 수 있다는 점을 간과함 (ex. user_id=1이 작성한 post_id=1에서 3가지 action이 발생하여 row가 3개가 됨 (view/like/share). 데이터를 꼼꼼히 살펴봐야함.
13. #613. Shortest Distance in a Line
Write an SQL query to report the shortest distance between any two points from the Point table.
Point: x
SELECT MIN(ABS(a.distance)) AS shortest FROM (SELECT p1.x - p2.x AS distance FROM Point p1 JOIN Point p2 ON p1.x != p2.x) a |
*Alias is required! 이름붙이기 싫어도 붙이도록.
*처음에 p1와 p2가 다른 조건에서 붙이라는 것을 빼먹었었다.
14. #1084. Sales Analysis III
Write an SQL query that reports the products that were only sold in the spring of 2019. That is, between 2019-01-01 and 2019-03-31 inclusive.
Product: product_id | product_name | unit_price
Sales: seller_id | product_id | buyer_id | sale_date | quantity | price
SELECT product_id, product_name FROM Sales JOIN Product Using(product_id) GROUP BY product_id HAVING MIN(sale_date) >= '2019-01-01' AND MAX(sale_date) <= '2019-03-31' |
'Coding 공부기록' 카테고리의 다른 글
[SQL] Leetcode easy 문제 뿌시기 (0) | 2021.10.27 |
---|---|
[Product Sense] 채용공고를 기반으로 공부하기(1) - 토스(Toss) Data analyst (0) | 2021.10.27 |
[D-11] things to improve (0) | 2021.10.25 |
[Metric Change] increased click on a search result (0) | 2021.10.21 |
[LC] Data Structure Day1: contains duplicate (0) | 2021.10.07 |