What happens when you read two datasets into pandas and merge them?
--Reading data
import pandas as pd
sample001 = pd.read_excel("sample_excel_001.xlsx")
sample001.head()
sample002 = pd.read_excel("sample_excel_002.xlsx")
sample002.head()
The two files are now loaded into data frames (sample001 and sample002). It looks like the two can be joined on the "data001" column.
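If you would rather check than eyeball it, you can list the columns the two frames have in common. This is only a minimal sketch: the Excel files are not available here, so the frames below are hypothetical stand-ins with the same column names and made-up values.

```python
import pandas as pd

# Hypothetical stand-ins for the two Excel sheets; the values are made up.
sample001 = pd.DataFrame({"data001": ["a", "b", "c"], "data002": [1, 2, 3]})
sample002 = pd.DataFrame(
    {"data001": ["a", "b", "d"], "data002": [10, 20, 40], "data003": ["x", "y", "z"]}
)

# Columns present in both frames: join-key candidates, plus any columns
# that will collide when the frames are merged.
shared_columns = sorted(sample001.columns.intersection(sample002.columns))
print(shared_columns)
```

Any shared column that is not used as the join key ends up duplicated in the merged result.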
--Data join
merge_data = pd.merge(sample001, sample002, on="data001", how="left")
merge_data.head()
It's easy to see that "data001" is the join key and that "data003" comes from sample002. The problem is "data002_x" and "data002_y". You want to ask them, "Where did you come from?" (Well, you do actually know: columns from the data frame passed as the first argument of merge get _x, and those from the second get _y.) I don't like how that looks, so at the very least I want to be able to tell at a glance where each column came from.
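The default naming can be reproduced without the Excel files. This is a minimal sketch with hypothetical stand-in frames (same column names, made-up values):

```python
import pandas as pd

# Hypothetical stand-ins for the two Excel sheets; the values are made up.
sample001 = pd.DataFrame({"data001": ["a", "b", "c"], "data002": [1, 2, 3]})
sample002 = pd.DataFrame(
    {"data001": ["a", "b", "d"], "data002": [10, 20, 40], "data003": ["x", "y", "z"]}
)

# No suffixes given, so pandas falls back to the defaults "_x" and "_y"
merge_data = pd.merge(sample001, sample002, on="data001", how="left")
default_columns = merge_data.columns.tolist()
print(default_columns)
# ['data001', 'data002_x', 'data002_y', 'data003']
```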
The suffixes option lets you specify the strings appended to column names that appear in both data frames (other than the join key). The first string is applied to columns from the first data frame and the second to columns from the second.
merge_data_new = pd.merge(sample001, sample002, on="data001", how="left", suffixes=[".sample001", ".sample002"])
merge_data_new.head()
Oh, now you can easily tell where the data came from!
That's all well and good, but I wish I could add the string at the beginning of the column name instead of the end. (That would look more SQL-like.)
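As far as I know, merge has no prefix counterpart to suffixes, so one workaround is to rename the overlapping columns yourself before merging. The sketch below is just one way to do it: merge_with_prefixes is a hypothetical helper name (not part of pandas), and the frames are made-up stand-ins for the Excel data.

```python
import pandas as pd


def merge_with_prefixes(left, right, on, left_prefix, right_prefix, how="left"):
    """Hypothetical helper (not a pandas API): prefix overlapping columns, then merge."""
    keys = [on] if isinstance(on, str) else list(on)
    # Non-key columns that exist in both frames and would otherwise get suffixes
    overlap = [c for c in left.columns if c in right.columns and c not in keys]
    left_renamed = left.rename(columns={c: f"{left_prefix}{c}" for c in overlap})
    right_renamed = right.rename(columns={c: f"{right_prefix}{c}" for c in overlap})
    return pd.merge(left_renamed, right_renamed, on=on, how=how)


# Hypothetical stand-ins for the two Excel sheets; the values are made up.
sample001 = pd.DataFrame({"data001": ["a", "b", "c"], "data002": [1, 2, 3]})
sample002 = pd.DataFrame(
    {"data001": ["a", "b", "d"], "data002": [10, 20, 40], "data003": ["x", "y", "z"]}
)

merge_data_prefixed = merge_with_prefixes(
    sample001, sample002, on="data001",
    left_prefix="sample001.", right_prefix="sample002.",
)
prefixed_columns = merge_data_prefixed.columns.tolist()
print(prefixed_columns)
# ['data001', 'sample001.data002', 'sample002.data002', 'data003']
```

Only the colliding columns are renamed, so "data001" and "data003" keep their original names, just as they do with suffixes.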