A YouTube video commentary is also available.
P-006: From the receipt detail data frame "df_receipt", select the columns sales date (sales_ymd), customer ID (customer_id), product code (product_cd), sales quantity (quantity), and sales amount (amount), in that order, and extract the rows that meet all of the following conditions.
-Customer ID (customer_id) is "CS018205000001"
-Sales amount (amount) is 1,000 or more, or sales quantity (quantity) is 5 or more
code
df_receipt[['sales_ymd', 'customer_id', 'product_cd', 'quantity', 'amount']] \
.query('customer_id == "CS018205000001" & (amount >= 1000 | quantity >= 5)')
output
sales_ymd customer_id product_cd quantity amount
36 20180911 CS018205000001 P071401012 1 2200
9843 20180414 CS018205000001 P060104007 6 600
21110 20170614 CS018205000001 P050206001 5 990
68117 20190226 CS018205000001 P071401020 1 2200
72254 20180911 CS018205000001 P071401005 1 1100
-With a pandas DataFrame, this is a way to select specific columns and then extract only the rows that meet multiple conditions.
-Use it when you want to narrow down the columns and keep just the rows that satisfy a combination of conditions.
-Inside query(), the OR condition is written with "|" (the vertical bar, or pipe) and the AND condition with "&".
Note: replacing "|" with "or", as in the code below, gives the same result.
df_receipt[['sales_ymd', 'customer_id', 'product_cd', 'quantity', 'amount']] \
.query('customer_id == "CS018205000001" & (amount >= 1000 or quantity >=5)')
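To check the equivalence without the real data, here is a minimal sketch on a small made-up frame. The frame `df` and its values are assumptions for illustration only and are not taken from df_receipt. It runs the query once with "|" and once with "or", and also writes the same filter as a boolean mask, which is another common way to express it.

```python
import pandas as pd

# Toy stand-in for df_receipt; the rows below are invented for illustration.
df = pd.DataFrame({
    "sales_ymd": [20180911, 20180414, 20170614, 20190101, 20190226],
    "customer_id": ["CS018205000001", "CS018205000001", "CS018205000001",
                    "CS018205000001", "CS999999999999"],
    "product_cd": ["P071401012", "P060104007", "P050206001",
                   "P000000001", "P071401020"],
    "quantity": [1, 6, 5, 1, 10],
    "amount": [2200, 600, 990, 100, 5000],
})

cols = ["sales_ymd", "customer_id", "product_cd", "quantity", "amount"]

# OR written with "|" inside query()
with_pipe = df[cols].query(
    'customer_id == "CS018205000001" & (amount >= 1000 | quantity >= 5)'
)

# OR written with the keyword "or" inside query()
with_or = df[cols].query(
    'customer_id == "CS018205000001" & (amount >= 1000 or quantity >= 5)'
)

# The same filter as a boolean mask; here each comparison needs its own parentheses
mask = (df["customer_id"] == "CS018205000001") & (
    (df["amount"] >= 1000) | (df["quantity"] >= 5)
)
with_mask = df.loc[mask, cols]

print(with_pipe)
print(with_pipe.equals(with_or), with_pipe.equals(with_mask))
```

In the boolean mask version the keyword "or" cannot be used between Series, so "|" is required there; the interchangeability applies to the string passed to query().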