Machine learning (ML), statistics, data science: we hear all of these terms often now that AI has spread so rapidly in recent years, but what is the difference between them? There is no definitive answer, but to put it simply, **data science is a field that combines specialized fields such as statistics, computer science, and business**. It will be easier to understand if you look at the figure below. Data scientists are currently active in many fields, and companies ranging from insurers to human resources companies and consulting firms are increasingly hiring them. What all of these areas have in common is that data and domain knowledge are used to carry out an analysis, and the results of that analysis are then explained. Machine learning, in turn, can be thought of as the intersection of statistics and computer science within data science. It shares with statistics the idea of **solving problems based on data**, and with computer science the idea of **solving problems with algorithms**. The emphasis differs, however: machine learning focuses on **how well the results apply** to new data, statistics focuses on **mathematical correctness**, and computer science focuses on **how fast and accurately the processing can be done**.
Machine learning methods are broadly classified into three types: "supervised learning," "unsupervised learning," and "reinforcement learning." Supervised and unsupervised learning have in common that the machine is trained on given data to produce an output. The difference is that in "supervised learning", **the given data comes with labels, assigned in advance by humans, that indicate the correct answer**, whereas in "unsupervised learning", **the given data contains no information about the correct answer**, so the grouping determined by the machine is the final output. In "reinforcement learning", an evaluation (reward) is given for the output produced from the input data, and learning proceeds based on that reward. Applications of "reinforcement learning" are relatively limited, such as walking robots and shogi-playing programs. Supervised learning is further divided into "classification problems" and "regression problems". In a "classification problem", the output is a discrete class label such as 1 or 0, while in a "regression problem", the output is a continuous numerical value. There are various models for computing output data from input data.
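To make the distinction between labeled and unlabeled data, and between classification and regression, a little more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration and does not come from the article: the toy data (hours studied, pass/fail labels, test scores), the function names, and the specific models (a nearest-mean classifier, a least-squares line, and a tiny two-cluster k-means). Real projects would normally use a library rather than hand-written code like this.

```python
from statistics import mean

# Supervised learning, classification: the labels (0 or 1) are given.
def fit_centroids(xs, labels):
    """Learn the mean input value of each class from labeled examples."""
    return {c: mean(x for x, y in zip(xs, labels) if y == c) for c in set(labels)}

def predict_class(centroids, x):
    """Output the class whose mean is closest to x (a discrete label)."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))

# Supervised learning, regression: the targets are continuous numbers.
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Unsupervised learning: no labels, the machine decides the grouping.
def two_means(xs, iterations=10):
    """Split 1-D data into two clusters with a tiny k-means (k = 2)."""
    lo, hi = min(xs), max(xs)  # initial cluster centers
    groups = [0] * len(xs)
    for _ in range(iterations):
        # Assign each point to the nearer center, then recompute the centers.
        groups = [0 if abs(x - lo) <= abs(x - hi) else 1 for x in xs]
        lo_pts = [x for x, g in zip(xs, groups) if g == 0]
        hi_pts = [x for x, g in zip(xs, groups) if g == 1]
        if lo_pts:
            lo = mean(lo_pts)
        if hi_pts:
            hi = mean(hi_pts)
    return groups

# Hypothetical toy data: hours studied, pass (1) / fail (0), and test score.
hours = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
passed = [0, 0, 0, 1, 1, 1]
scores = [12.0, 14.0, 16.0, 24.0, 26.0, 28.0]

centroids = fit_centroids(hours, passed)   # uses the human-provided labels
print(predict_class(centroids, 6.5))       # 1 (classification: a class label)

a, b = fit_line(hours, scores)
print(a * 5.0 + b)                         # 20.0 (regression: a continuous value)

print(two_means(hours))                    # [0, 0, 0, 1, 1, 1] (grouping found without labels)
```

Note that `two_means` never sees `passed`: the cluster numbers it returns are only group identifiers chosen by the algorithm, and they happen to line up with the pass/fail labels here because the toy data is cleanly separated.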