The data files and related materials used in the book are available in the GitHub repository below.
https://github.com/wesm/pydata-book

This is a memorandum summarizing the main points.
- The book is written for readers who want to become good data analysts and to learn the Python programming needed for data analysis.
- Python is often used to build websites with web frameworks such as Django. It is also one of the most important languages in data science, machine learning, and general software development.
- With improved support for libraries such as pandas and scikit-learn, Python has become a strong choice for data analysis.
- Python is an interpreted language, so it generally runs slower than compiled languages. For applications that need very low latency or demanding resource use (for example, high-frequency trading systems), it can be a better use of your time to write the code in a low-level language such as C++ and maximize performance.
- Python is a difficult language for developing parallel and multithreaded applications, because of a mechanism called the GIL (global interpreter lock).
**NumPy**
Provides the data structures and algorithms that form the basis of numerical computing in Python. Typical examples are ndarray, a fast and efficient multidimensional array object, and the mathematical operations that work on it.
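As a rough illustration of what "ndarray plus mathematical operations" means in practice, here is a minimal sketch. The sample values are made up for this memo and do not come from the book's data files.

```python
import numpy as np

# Hypothetical sample data: 2 rows x 3 columns
data = np.array([[1.5, -0.1, 3.0],
                 [0.0, -3.0, 6.5]])

# Arithmetic is applied element by element, without an explicit Python loop
scaled = data * 10
doubled = data + data

print(data.shape)  # shape of the array: (rows, columns)
print(data.dtype)  # element type shared by every value in the array
print(scaled)
```

Every element of an ndarray has the same type, which is one reason these whole-array operations can be fast.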
**pandas**
Introduced in 2010. Its main object is the DataFrame, a tabular, column-oriented data structure. pandas combines the high-performance array computation of NumPy with the flexible data manipulation found in spreadsheets and relational databases (such as SQL). It is used to manipulate, prepare, and clean data, and it is one of the main topics of the book.
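The following is a minimal sketch of the kind of cleaning and SQL-like manipulation described above. The column names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical table: column names and values are made up for illustration
df = pd.DataFrame({
    "city": ["Tokyo", "Osaka", "Tokyo", "Osaka"],
    "year": [2020, 2020, 2021, 2021],
    "sales": [100.0, 80.0, None, 90.0],
})

# Cleaning: replace the missing sales value with 0
cleaned = df.fillna({"sales": 0.0})

# SQL-like operation: roughly SELECT city, SUM(sales) ... GROUP BY city
totals = cleaned.groupby("city")["sales"].sum()

# Spreadsheet-like operation: keep only the rows that match a condition
recent = cleaned[cleaned["year"] == 2021]

print(totals)
print(recent)
```

Filling a missing value with 0 is only one possible choice. Depending on the data, dropping the row or using another fill value may be more appropriate.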
**matplotlib**
The most widely used Python library for 2D visualizations such as graphs. It is a safe default choice as a visualization tool.
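A minimal sketch of drawing a 2D graph with matplotlib follows. The data and the output file name `example_plot.png` are assumptions made for this memo, and the non-interactive `Agg` backend is selected so that the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data for illustration
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple line plot")
ax.legend()

# Write the figure to an image file (the file name is an assumption)
fig.savefig("example_plot.png")
```

In a Jupyter Notebook the backend line is usually unnecessary, because the figure is displayed inline.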
**IPython and Jupyter**
IPython is recommended for workflows in which you edit code, run it, and explore by trial and error. In 2014, the IPython web notebook became the Jupyter Notebook, which now supports more than 40 programming languages. IPython serves as the kernel for running Python in Jupyter. Jupyter Notebook is a web-based "notebook" for writing code. Because content can also be written in Markdown and HTML, you can create rich documents that mix code and text.
**SciPy**
A collection of packages that deal with common problems in scientific computing. Used together, NumPy and SciPy form a reasonably complete and mature computational foundation for many traditional scientific computing applications.
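To give a feel for the "common problems" SciPy covers, here is a minimal sketch using two of its subpackages, `scipy.integrate` and `scipy.optimize`. The functions being integrated and minimized are simple examples chosen for this memo.

```python
from scipy import integrate, optimize

# Numerical integration of x^2 over [0, 1]; the exact answer is 1/3
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 1.0)

# Minimize (x - 2)^2; the true minimum is at x = 2
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

print(area)      # approximately 0.3333
print(result.x)  # approximately 2.0
```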
**scikit-learn**
The leading general-purpose machine learning toolkit for Python. It includes submodules for classification, regression, clustering, cross-validation, preprocessing, and more.
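The sketch below combines three of the areas listed above (preprocessing, classification, and cross-validation) on the iris dataset that ships with scikit-learn. The choice of dataset and of logistic regression is an assumption made for this memo, not something taken from the book.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset: 150 flowers, 4 features, 3 classes
X, y = load_iris(return_X_y=True)

# Preprocessing (scaling) and classification chained into one model
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation: one accuracy score per fold
scores = cross_val_score(model, X, y, cv=5)

print(scores)
print(scores.mean())
```

Putting the scaler inside the pipeline means it is fitted only on the training part of each fold, which avoids leaking information from the held-out part.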
**statsmodels**
Compared with scikit-learn, a package oriented toward classical statistical analysis.
- Python 2.x is called "legacy Python", and Python 3.x is simply called "Python".
- **Munging, wrangling**: The whole process of manipulating unstructured or messy data into a clean, structured form.
- **Pseudocode**: A description of an algorithm or process written in a form that resembles source code.
- **Syntactic sugar**: Programming language syntax that does not add new features but makes code more convenient to write.
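As one possible example of syntactic sugar, a Python list comprehension does the same work as an explicit loop but is shorter to write. The sketch below also doubles as a tiny munging example, turning messy strings into clean ones. The sample names are made up.

```python
# Hypothetical messy input: stray whitespace, mixed case, an empty entry
raw = ["  Alice ", "BOB", "", "  carol"]

# Without the shorthand: an explicit loop
cleaned_loop = []
for name in raw:
    stripped = name.strip()
    if stripped:  # skip entries that are empty after stripping
        cleaned_loop.append(stripped.title())

# With syntactic sugar: a list comprehension that does the same thing
cleaned_sugar = [name.strip().title() for name in raw if name.strip()]

print(cleaned_loop)
print(cleaned_sugar)
```

Both versions produce the same list. The comprehension adds no new capability, only a more convenient notation.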
- My first Qiita post is a memorandum on a technical book.
- I will edit it in response to feedback.
- I want to share technical knowledge on Qiita and post what I want people to see.
- I also want to build a machine learning / AI web application with Python.
- Write in my own words as much as possible, and don't aim too hard for perfection.
- Having people read the post is very good motivation, and it makes studying more efficient.
- I can't write down everything, so I will summarize what I want to remember, what I worked to understand, and what I found interesting.