Taking on data processing with Python with zero years of programming experience

First, a brief self-introduction. I started studying data science in May 2020.

・ Until May 2020, I had never touched a programming language.
・ I often use Excel at work, so I can handle simple functions.

While studying data science, I kept thinking that there are few places to practice data processing, even though it seems to be the most burdensome part of real-world work!

Then, around June, the Data Scientist Association uploaded the perfect set of exercises to GitHub! Source: General Incorporated Association Data Scientist Association, "Data Science 100 Knocks (Structured Data Processing)" https://github.com/The-Japan-DataScientist-Society/100knocks-preprocess

As a first step, I would like to work through these 100 knocks in Python, SQL, and R without looking at the answer code. As mentioned above, I am a complete amateur at programming, so there may be a lot of terrible code, but please bear with me.

P-001: Display the first 10 rows, with all columns, of the receipt details data frame (df_receipt), and visually check what kind of data it holds.

`In`



df_receipt.head(10)

Output result: (image: Screenshot 2020-09-05 18.40.20.png)
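For anyone who does not have the 100 knocks environment at hand, here is a minimal sketch of the same check. The small `df_receipt` built here is made-up stand-in data: only the four column names come from the problem statements, and the values and the row count are assumptions for illustration, not the real receipt data.

```python
import pandas as pd

# Hypothetical stand-in for df_receipt: 12 invented rows, values are not real data.
df_receipt = pd.DataFrame({
    "sales_ymd": [20200101 + i for i in range(12)],
    "customer_id": [f"C{i:03d}" for i in range(12)],
    "product_cd": [f"P{i:03d}" for i in range(12)],
    "amount": [100 * (i + 1) for i in range(12)],
})

# head(10) returns the first 10 rows as a new DataFrame; df_receipt is unchanged.
first_10 = df_receipt.head(10)
print(first_10)

# shape and dtypes are a quick complement to eyeballing the rows.
print(df_receipt.shape)
print(df_receipt.dtypes)
```

Looking at `shape` and `dtypes` alongside `head(10)` is just one possible way to "visually check" the data; the problem itself only asks for the first 10 rows.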

P-002: From the receipt details data frame (df_receipt), select the columns sales date (sales_ymd), customer ID (customer_id), product code (product_cd), and sales amount (amount), in that order, and display 10 rows.

`In`



df_clms = df_receipt[["sales_ymd", "customer_id", "product_cd", "amount"]]
df_clms.head(10)

Output result: (image: Screenshot 2020-09-05 18.43.40.png)
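As a rough illustration of why the double brackets matter, the sketch below selects the same four columns from a made-up frame. The data, the deliberately shuffled column order, and the extra `store_note` column are all hypothetical; only the four column names come from the problem statement.

```python
import pandas as pd

# Hypothetical stand-in for df_receipt. The columns are stored in a different
# order on purpose, and "store_note" is an invented extra column.
df_receipt = pd.DataFrame({
    "amount": [158, 81, 170],
    "store_note": ["a", "b", "c"],
    "product_cd": ["P01", "P02", "P03"],
    "customer_id": ["C01", "C02", "C03"],
    "sales_ymd": [20200101, 20200102, 20200103],
})

# Passing a list of names returns a DataFrame whose columns follow the list order.
cols = ["sales_ymd", "customer_id", "product_cd", "amount"]
df_selected = df_receipt[cols]

# head(10) simply returns every row when the frame has fewer than 10 rows.
print(df_selected.head(10))

# A single name (no inner list) returns a Series instead of a DataFrame.
amount_only = df_receipt["amount"]
print(type(df_selected).__name__, type(amount_only).__name__)
```

The output column order comes from the list passed in, not from the order in the original frame, which is how the "in that order" requirement of P-002 is met.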

I will update it when I have time.
