Hello, I'm MORISHIGE, and I'm active under the creator name "umi-mori". The target audience and overview of this article are as follows:
- Target audience
  - Researchers involved in machine learning
  - People working on software that uses machine learning (engineers / designers / project managers / marketers / sales / consultants / managers, etc.)
- Overview
  - An introduction to what you need to know when running and operating machine learning on the cloud service Azure.
Also, although I (the author of this article) hold the Azure Fundamentals certification, there is still a lot I have not learned well enough. So if you know of anything such as "There is also this service!" or "There are also these advantages and disadvantages!", I would appreciate it if you could share it in the comments.
In recent years the number of technology options has increased, so selecting technologies properly has become important for research and development. Therefore, I would first like to explain "Why use Python?" and "Why use Microsoft Azure?".
I think there are three main reasons to use Python.
The first reason is the large number of users. In 2019 Python had 8.2 million active developers worldwide, surpassing Java to take second place. It is probably the most popular language in the field of machine learning (ML).
Choosing a language that many people are familiar with is an important factor when developing as a team. Another benefit of the large user base is that a lot of technical documentation is available.
[Reference] Machine learning makes Python a programming language that surpasses Java and has the second largest number of developers https://ited.tech/artificial-intelligence/machine-learning-ai-made-python-second-most-used-language-over-java/
The second reason is the abundance of libraries. A library is a collection of useful functions that has already been packaged for reuse. In addition to the standard library, Python is very rich in libraries because machine learning researchers and developers publish external libraries as Open Source Software (OSS). This abundance reduces man-hours, which is a great advantage in both research and service development.
[Reference] 7 recommended Python libraries that are strong in machine learning and artificial intelligence development https://proengineer.internous.co.jp/content/columnfeature/13907
Finally, the amount of code you have to write is small. This comes with trade-offs (slower execution speed, for example), but it makes the language easier for beginners to learn and reduces man-hours.
[Reference] Compare the difference in description method with Python / C language https://algorithm.joho.info/programming/python/c-language-kijutsu-hikaku-chigai/
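As a small illustration of the point about code size, the following sketch counts word frequencies using only the standard library. The function name `top_words` and the sample sentence are made up for this example.

```python
from collections import Counter


def top_words(text: str, n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequent words in text, ignoring case."""
    words = text.lower().split()
    return Counter(words).most_common(n)


if __name__ == "__main__":
    # Sample sentence invented for illustration.
    sample = "the cat and the dog and the bird"
    print(top_words(sample, 2))  # [('the', 3), ('and', 2)]
```

The whole task fits in two lines of logic, which is the kind of brevity the paragraph above refers to.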
I think there are three main reasons to use Microsoft Azure.
The first reason is that no CAPEX is needed. CAPEX is an abbreviation for Capital Expenditure, that is, money spent on acquiring fixed assets such as equipment. When you set up a compute server in the laboratory, or build an on-premises server to operate a software service, this expenditure is usually large.
If you purchase an expensive machine at the moment your research needs it, various procedures such as obtaining competing quotes, internal review, and depreciation may be involved. With the cloud, however, no CAPEX is required and you can start running in a high-spec environment immediately.
[Reference] Definition of CAPEX and OPEX in corporate activities https://www.clouderp.jp/blog/what-is-capex-opex
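To make the trade-off between an up-front purchase and pay-as-you-go usage a little more concrete, here is a rough break-even sketch. All figures are hypothetical and are not actual Azure prices; the calculation also ignores electricity, maintenance, and depreciation, so please treat it only as a way of thinking about the comparison.

```python
def break_even_hours(purchase_cost: float, hourly_rate: float) -> float:
    """Hours of cloud usage at which pay-as-you-go spend equals the up-front purchase."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return purchase_cost / hourly_rate


def cloud_cost(hours_used: float, hourly_rate: float) -> float:
    """Total pay-as-you-go cost for the given number of hours."""
    return hours_used * hourly_rate


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    purchase_cost = 1_000_000  # up-front price of an on-premises GPU machine (yen)
    hourly_rate = 500          # pay-as-you-go price of a comparable VM (yen/hour)

    print(break_even_hours(purchase_cost, hourly_rate))  # 2000.0
    print(cloud_cost(300, hourly_rate))                  # 150000
```

With these made-up numbers, a project that needs only a few hundred GPU hours stays far below the purchase price, and the money is spent as an operating cost rather than as capital expenditure.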
The second reason is storage elasticity, which is a major advantage for software services that use machine learning. If the system is designed so that the amount of training data keeps growing, adding storage on-premises costs labor and money for maintenance and operation.
However, with the cloud, you can add storage at any time in just a few minutes.
[Reference] Basic knowledge of cloud storage https://2sq7d632aduy7flhh6iaxnby-wpengine.netdna-ssl.com/jp/wp-content/uploads/sites/2/2017/08/CDN-Broc-WP-vol01-r2-web.pdf
The two points above are advantages of using the cloud in general. So what are Azure's advantages over other cloud services such as AWS (Amazon Web Services) and GCP (Google Cloud Platform)?
One notable advantage is its strong cloud compliance. Cloud compliance means conforming to the regulatory standards for cloud use that are set out in industry guidelines and in domestic and international laws. One of Azure's features is that it meets more than 90 standards concerning policies for managing and protecting personal information and compliance with laws and regulations. Therefore, I recommend using Azure for research and services that handle sensitive information such as personal information, biometric information, and medical information.
Also, as an aside, Azure is highly convenient because it works together with other Microsoft services such as Microsoft 365 (Office 365), Dynamics 365, GitHub, and VS Code (the so-called vertical integration model).
[Reference] Azure compliance https://azure.microsoft.com/ja-jp/overview/trusted-cloud/compliance/
First, I classified the services as a whole into the following categories.
- A. Services for developers
- B. Data infrastructure services
- C. Distributed processing and analysis services
- D. Execution environment services
- E. Services utilizing AI / ML
- F. Deployment (publishing) services
The service names are summarized in the "definitive summary image" (figure). The more specific features and the points worth knowing about each service are summarized below.
Category A introduces services that are useful for the people (developers and data scientists) who actually develop machine learning models.
Service name | Overview |
---|---|
1. Azure Machine Learning | A service that is the basis for executing machine learning on the cloud. |
2. Azure Machine Learning Studio | A web portal service that lets data scientists and developers build models with a GUI. |
3. Azure Notebooks | A service that allows you to quickly execute code such as Python on the cloud (*scheduled to end on 2021/01/21). |
A-1. Azure Machine Learning
- A service that is the basis for executing machine learning on the cloud.
- The GUI still has limitations in fields such as deep learning (DNN) and computer vision, but basic models such as regression and clustering can be built with the GUI. It can also produce graphs that compare the prediction accuracy of each model.
- Of course, models can also be implemented in Python (for example, TensorFlow can be imported and coded in Jupyter Notebook).
- Can be integrated with Azure DevOps and Power BI.
Azure Machine Learning-Official Site https://azure.microsoft.com/ja-jp/services/machine-learning/
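To show what "comparing the prediction accuracy of each model" means in practice, here is a minimal local sketch. It does not use the Azure Machine Learning API at all; it fits a least-squares line by hand, compares it against a baseline that always predicts the mean, and reports the RMSE of each. The toy data and the helper names `fit_line` and `rmse` are assumptions made up for this example.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(predictions, targets):
    """Root mean squared error between predictions and targets."""
    squared = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return math.sqrt(squared / len(targets))


if __name__ == "__main__":
    # Toy data invented for this example (roughly y = 2x + 1).
    train_x, train_y = [1, 2, 3, 4, 5], [3.1, 4.9, 7.2, 9.0, 10.8]
    test_x, test_y = [6, 7], [13.1, 14.9]

    slope, intercept = fit_line(train_x, train_y)
    line_pred = [slope * x + intercept for x in test_x]
    # Baseline "model": always predict the mean of the training targets.
    baseline_pred = [sum(train_y) / len(train_y)] * len(test_x)

    print("linear regression RMSE:", rmse(line_pred, test_y))
    print("mean baseline RMSE:    ", rmse(baseline_pred, test_y))
```

On data with a clear linear trend, the regression line should show a much lower RMSE than the baseline, and this is the kind of comparison that the service presents as graphs.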
A-2. Azure Machine Learning Studio
- A web portal service that lets data scientists and developers build models with a GUI.
Azure Machine Learning Studio-Official Site https://docs.microsoft.com/ja-jp/azure/machine-learning/overview-what-is-machine-learning-studio
A-3. Azure Notebooks
- A service that allows you to quickly execute code such as Python on the cloud.
- The advantage is that there is no need to set up an environment.
- Free service.
- There is a limit of 4 GB of memory and 1 GB of data (as of December 2020).
- The public preview (something like a trial period) ends on January 21, 2021.
Azure Notebooks-Official Site https://notebooks.azure.com/
[For python beginners] I tried writing python on Microsoft Azure Notebooks. https://qiita.com/Catetin0310/items/2f07e10f056e88e25f5b
How to use GPU in Jupyter Notebook? https://azureai.devpost.com/forum_topics/33099-how-to-use-gpu-in-jupyter-notebook
Category B introduces data infrastructure services, which are indispensable for building machine learning services.
Service name | Overview |
---|---|
1. Azure Synapse Analytics (formerly Azure SQL Data Warehouse) | An integrated analytics service covering data integration, data warehousing, and data analysis. |
2. Azure Data Lake | A data lake service for storing large amounts of raw data. |
3. Azure Data Factory | Serverless data integration service. |
4. Azure Open Datasets | A service that gives you access to publicly available open data. |
B-1. Azure Synapse Analytics (formerly Azure SQL Data Warehouse)
- An integrated analytics service covering data integration, data warehousing, and data analysis.
- An evolved version of Azure SQL Data Warehouse.
- You can choose between serverless and dedicated resources.
- Multiple databases can be managed and analyzed at once.
- Can run against terabyte- and petabyte-scale data.
- Equivalent to Amazon Redshift on AWS and BigQuery on GCP.
- Automatic scaling according to the load is also possible.
- Data is stored in Premium Storage (SSD-backed), so it is expensive but fast.
Azure Synapse Analytics-Official Site https://azure.microsoft.com/ja-jp/services/synapse-analytics/
Azure SQL Data Warehouse Changed to Azure Synapse Analytics-Official Site https://azure.microsoft.com/ja-jp/blog/azure-sql-data-warehouse-is-now-azure-synapse-analytics/
Explanation of Azure SQL Data Warehouse [Introduction from the series Azure service Ichikara] https://azure-recipe.kc-cloud.jp/2017/12/sql_data_warehouse_2017adcal/
B-2. Azure Data Lake
- A data lake service for storing large amounts of raw data.
- Can store and analyze petabyte-scale files and billions of objects.
Azure Data Lake-Official Site https://azure.microsoft.com/ja-jp/solutions/data-lake/
B-3. Azure Data Factory
- Serverless data integration service.
- Can be linked with Azure Synapse Analytics.
Azure Data Factory-Official Site https://azure.microsoft.com/ja-jp/services/data-factory/
B-4. Azure Open Datasets
- A service that gives you access to publicly available open data.
- Provides access to data such as weather, satellite imagery, socioeconomic data, and public holidays.
Azure Open Datasets-Official Site https://azure.microsoft.com/ja-jp/services/open-datasets/
Category C introduces convenient services for analyzing the collected data.
Service name | Overview |
---|---|
1. Azure HDInsight | A service that provides a distributed processing framework for large-scale data. |
2. Azure Databricks | An Apache Spark-based analytics service that is simpler to use than Azure HDInsight. |
C-1. Azure HDInsight
- A service that provides a distributed processing framework for large-scale data.
- Supports frameworks such as Apache Hadoop, Spark, and Kafka.
Azure HDInsight-Official Site https://azure.microsoft.com/ja-jp/services/hdinsight/
C-2. Azure Databricks
- An Apache Spark-based analytics service.
- Simpler to use than Azure HDInsight.
- Collaboration features, automatic scaling, and so on are also available.
Azure Databricks-Official Site https://azure.microsoft.com/ja-jp/services/databricks/
Category D introduces execution environment services, since there are several options to choose from, such as virtual machines (VMs).
Service name | Overview |
---|---|
1. Azure Virtual Machine | The most common virtual machine service. |
2. Azure DevTest Labs | A service for quickly building virtual machines for developers. |
3. Azure Lab Services | A virtual machine service used for hackathons and education. |
D-1. Azure Virtual Machine
- The most common virtual machine service.
- Costs vary depending on CPU and GPU specifications.
Azure Virtual Machine Series-Official Site https://azure.microsoft.com/ja-jp/pricing/details/virtual-machines/series/
D-2. Azure DevTest Labs
- A service for quickly building virtual machines (VMs) and PaaS resources for developers, separate from the production environment.
- Because the specifications and number of VMs are capped, environments can be built without going through an approval process.
- Because instances are lent to developers only temporarily, they are automatically deleted when the lending period ends.
- Azure DevTest Labs corresponds to the "test environment", while the "production environment" uses a virtual machine scale set, which adapts flexibly to load.
Azure DevTest Labs-Official Site https://docs.microsoft.com/ja-jp/azure/devtest-labs/devtest-lab-overview
D-3. Azure Lab Services
- A virtual machine service used for hackathons and education.
- Students and participants can access the lab created by the instructor or organizer.
- Scaling according to the number of participants is also possible.
Azure Lab Services-Official Site https://azure.microsoft.com/ja-jp/services/lab-services/
Category E introduces services that utilize Artificial Intelligence (AI) / Machine Learning (ML).
Service name | Overview |
---|---|
1. Azure Cognitive Services | A service that recognizes text, images, speech, and the like. |
2. Azure Bot Service | A bot service that answers questions on behalf of humans. |
3. Azure Cognitive Search (Azure Search) | An AI-powered cloud search service for mobile and web developers. |
E-1. Azure Cognitive Services
- A service that recognizes text, images, and speech.
- Easy to adopt because pre-trained models are provided.
Azure Cognitive Services-Official Site https://azure.microsoft.com/ja-jp/services/cognitive-services/
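Because the models are already trained, using the service mostly comes down to sending an HTTP request. The sketch below builds (but does not send) an image analysis request for the Computer Vision part of Cognitive Services using only the standard library. The API version `v3.2`, the `Description` feature, the function name `build_analyze_request`, and the placeholder endpoint, key, and image URL are all assumptions for illustration, so please check the official reference for the version available on your resource.

```python
import json
import urllib.request


def build_analyze_request(endpoint: str, key: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) a Computer Vision image-analysis request."""
    # Assumed API version and feature; adjust to what your resource supports.
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?visualFeatures=Description"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # resource key from the Azure portal
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder values; replace them with your own endpoint, key, and image.
    req = build_analyze_request(
        "https://<your-resource>.cognitiveservices.azure.com",
        "<your-key>",
        "https://example.com/cat.jpg",
    )
    print(req.get_method(), req.full_url)
    # To actually call the service: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect what will be sent before any billable call is made.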
E-2. Azure Bot Service
- A bot service that answers questions on behalf of humans.
- Supports both text and voice.
- Can be introduced into services such as customer support.
Azure Bot Service-Official Site https://azure.microsoft.com/ja-jp/services/bot-service/
E-3. Azure Cognitive Search (Azure Search)
- An AI-powered cloud search service for mobile and web developers.
- Can make use of a natural language stack and of AI capabilities for vision, language, and speech.
Azure Cognitive Search-Official Site https://azure.microsoft.com/ja-jp/services/search/
Category F introduces the services you should know about when publishing a machine learning model built with Python or a GUI as a web service or API (Application Programming Interface). Here I also have in mind Python web frameworks such as Flask and Django, as well as web services built by combining Python with other server-side technologies such as Node.js.
You may want to refer to the article "Choose an Azure compute service for your application" (https://docs.microsoft.com/ja-jp/azure/architecture/guide/technology-choices/compute-decision-tree) and choose a service to deploy depending on how much you want to customize your machine.
Service name | Overview |
---|---|
1. Azure Virtual Machine | The most common virtual machine service. |
2. Azure App Services | A service that allows you to build and deploy web applications and APIs. |
3. Azure Functions | Event-driven serverless computing platform service. |
F-1. Azure Virtual Machine
- See [D-1. Azure Virtual Machine](#D-1. Azure Virtual Machine).

F-2. Azure App Services
- A service that allows you to build and deploy web applications and APIs.
- It offers less freedom than a VM, but the minimum needed to operate a service can be implemented with it.
Azure App Services-Official Site https://azure.microsoft.com/ja-jp/services/app-service/
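As a rough picture of the kind of prediction API you might publish this way, here is a minimal sketch of a WSGI application written with only the standard library. WSGI is the interface that Flask and Django applications also expose. The `predict` function is a stand-in with made-up weights rather than a real trained model, and the JSON format is an assumption for this example. How the app is wired up to a specific hosting service is not covered here.

```python
import io
import json


def predict(features):
    """Stand-in for a trained model: a hypothetical linear scorer."""
    weights = [0.5, -0.25]  # made-up weights for illustration
    return sum(w * x for w, x in zip(weights, features))


def app(environ, start_response):
    """WSGI app: send a JSON body like {"features": [4.0, 2.0]} to get a prediction."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        result = {"prediction": predict(payload["features"])}
        status = "200 OK"
    except (KeyError, ValueError, TypeError):
        result = {"error": "expected JSON with a 'features' list"}
        status = "400 Bad Request"
    body = json.dumps(result).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


if __name__ == "__main__":
    # In-process call with a hand-built request, so no server is started.
    raw = json.dumps({"features": [4.0, 2.0]}).encode("utf-8")
    environ = {"CONTENT_LENGTH": str(len(raw)), "wsgi.input": io.BytesIO(raw)}
    print(app(environ, lambda status, headers: None))  # [b'{"prediction": 1.5}']
```

For a quick local check, the same `app` object can be served with `wsgiref.simple_server.make_server` from the standard library. In a real service you would load a trained model in place of `predict`.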
F-3. Azure Functions
- Event-driven serverless computing platform service.
- You can easily deploy just the function you want to execute.
- Can be implemented in various languages (.NET, Node.js, Java, etc.).
Azure Functions-Official Site https://azure.microsoft.com/ja-jp/services/functions/
How was that?
I hope this article helps people, with or without experience in data science or software development on Azure, to expand their toolbox and gain a systematic understanding.
Microsoft is growing rapidly with Azure and Teams, and I'm looking forward to future developments. If you want to know more about Microsoft hardware and software, please refer to the following article.
[Reference] 2020 Microsoft product summary! https://note.com/umi_mori/n/n0abe3126e989
Thank you for reading to the end. If you found this helpful, I would be grateful for an LGTM, a Stock, or a follow on Twitter!
The next article will be by @m-naoki!
-Passing Measures Microsoft Certification AZ-900: Microsoft Azure Fundamentals Text & Questions