Methods I found useful after reading Kaggle kernels

I recently started working on Kaggle, and in the kernels I found methods that simplify column processing I had been struggling to write by hand, so I am summarizing them here as a memo. I only briefly cover the usage I needed in the competition I am working on, so please see the referenced articles for detailed usage.

When you want to extract only the values you need

In the competition I am working on, the data is provided as train_data and train_label, and the two CSV files contain duplicate items. The two must eventually be merged and fed to the model, so the duplicates have to be removed before merging.

unique()
Extracts the unique values contained in the target column.

isin(values to check for)
Checks whether each element of the DataFrame is contained in the given values. The return value is boolean: True where the element is included and False otherwise. Prefixing the result with ~ inverts it, so elements that are not included become True.

where(condition, value if True, value if False)
Applies a different result depending on whether each element satisfies the condition. This three-argument form is numpy.where; if the second and third arguments are omitted, it returns the indices that satisfy the condition. The pandas method DataFrame.where(condition, other) instead keeps the original value where the condition holds and replaces it with other elsewhere. With the option inplace=True, the change is reflected in the original DataFrame.
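The three methods above can be sketched roughly as follows. The frames, column names (`id`, `price`, `label`) and values are made up for illustration and are not the competition's actual data; only the variable names train_data and train_label are borrowed from the text.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the two CSV files.
train_data = pd.DataFrame({"id": [1, 2, 2, 3, 4],
                           "price": [100, 250, 250, 80, 120]})
train_label = pd.DataFrame({"id": [2, 3, 5], "label": [1, 0, 1]})

# unique(): distinct values of one column, in order of first appearance.
unique_ids = train_data["id"].unique()

# isin(): boolean mask, True where the value is in the given collection.
mask = train_label["id"].isin(unique_ids)
in_both = train_label[mask]          # ids that also appear in train_data
only_in_label = train_label[~mask]   # ~ inverts the mask

# Series/DataFrame.where(): keep the value where the condition is True,
# replace it with the second argument elsewhere (here: cap prices at 200).
capped = train_data["price"].where(train_data["price"] <= 200, 200)

# numpy.where(): three arguments pick between two values per element;
# with only the condition it returns the positions where it holds.
is_expensive = (train_data["price"] > 100).to_numpy()
flag = np.where(is_expensive, "high", "low")
rows = np.where(is_expensive)[0]
```

Here `mask` can be reused both to keep the overlapping rows and, with ~, to keep the rest, which is handy when removing duplicates before a merge.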

When you want to apply the same operation to multiple targets, such as grouping by column

groupby(['first column to group by', 'second column to group by']).operation to apply, such as mean()
Use it, for example, to calculate the average price of each group B within group A. Each combination of values in the specified columns appears only once in the result.
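A minimal sketch of the group A / group B average, assuming a made-up frame with hypothetical columns `group_a`, `group_b` and `price`:

```python
import pandas as pd

# Hypothetical data: two grouping columns and one numeric column.
df = pd.DataFrame({
    "group_a": ["x", "x", "x", "y", "y"],
    "group_b": ["p", "p", "q", "p", "q"],
    "price": [100, 200, 300, 400, 500],
})

# Average price of each group_b within each group_a.
# The result is a Series indexed by the (group_a, group_b) pairs,
# with each pair appearing exactly once.
mean_price = df.groupby(["group_a", "group_b"])["price"].mean()

# reset_index() turns the group keys back into ordinary columns.
mean_price_flat = mean_price.reset_index()
```

Selecting `["price"]` before calling mean() restricts the calculation to that column, so any other non-numeric columns do not get in the way.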

agg({'column to process': ['operation 1 (min, max, etc.)', 'operation 2']})
Convenient to use after groupby.
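As a rough example of agg after groupby, again on a made-up frame with hypothetical column names `group_a` and `price`:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "group_a": ["x", "x", "x", "y", "y"],
    "price": [100, 200, 300, 400, 500],
})

# Apply several aggregations to the price column at once.
# The resulting columns are a two-level index such as ("price", "min").
summary = df.groupby("group_a").agg({"price": ["min", "max", "mean"]})

# Optional: flatten the two-level column names into price_min, price_max, ...
summary.columns = ["_".join(col) for col in summary.columns]
```

Flattening the column names is a matter of taste, but it tends to make the result easier to merge back into the original data.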

Referenced articles

note.nkmk.me
CUBE SUGAR CONTAINER
