In the article below, I generated TPC-H data. I uploaded the resulting files from CentOS to Azure Blob Storage with azcopy, so this post describes how to do that. Related article: Create a test environment using TPC-H (Synapse SQL pool)
First, create the storage account and container that will receive the upload.
In this example, everything except the resource group, storage account name, and replication is left at its default value.
The network settings are also left at their defaults.
No other settings are changed this time.
Click the create button to create a storage account.
After the storage account is created, open it and create the container.
Select + Container.
I created a container called azcopytest.
IAM settings are required to access blobs, and azcopy authenticates with the IAM role assigned here.
Without this setting, azcopy fails with an error such as "403 This request is not authorized to perform this operation using this permission."
For the role, select one with the required permissions, such as Storage Blob Data Contributor, and specify the user to assign it to.
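The same role assignment can also be scripted rather than set in the portal. The following is a minimal sketch, assuming the Azure CLI (`az`) is installed and already logged in. The function names `container_scope` and `assign_blob_contributor` and every argument value are hypothetical placeholders, not part of the original steps.

```shell
#!/bin/sh
# Build the resource ID (scope) of a blob container.
# Arguments: subscription ID, resource group, storage account, container.
container_scope() {
    printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s/blobServices/default/containers/%s\n' \
        "$1" "$2" "$3" "$4"
}

# Assign the Storage Blob Data Contributor role on a scope.
# Arguments: assignee (sign-in name or object ID), scope.
assign_blob_contributor() {
    az role assignment create \
        --assignee "$1" \
        --role "Storage Blob Data Contributor" \
        --scope "$2"
}

# Example (not executed here; all values are placeholders):
#   scope=$(container_scope "<Subscription ID>" "<Resource group>" "<Storage account name>" "azcopytest")
#   assign_blob_contributor "user@example.com" "$scope"
```

Scoping the assignment to the single container keeps the permission as narrow as the upload needs. Assigning at the storage account level would also work if several containers are used.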
First, download azcopy with wget.
$ wget https://azcopyvnext.azureedge.net/release20200818/azcopy_linux_amd64_10.6.0.tar.gz
After downloading, extract the archive and move into the created directory.
$ tar xvzf azcopy_linux_amd64_10.6.0.tar.gz
$ cd azcopy_linux_amd64_10.6.0
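The download and extraction steps can be collected into one script so that the release URL is written only once. This is a rough sketch: the helper names `archive_name`, `extracted_dir`, and `install_azcopy` are hypothetical, and it assumes, as in the commands above, that the archive unpacks into a directory named after the archive.

```shell
#!/bin/sh
# Release URL used in this article; replace it to use another release.
AZCOPY_URL="https://azcopyvnext.azureedge.net/release20200818/azcopy_linux_amd64_10.6.0.tar.gz"

# Print the archive file name (last path component of the URL).
archive_name() {
    basename "$1"
}

# Print the directory the archive unpacks to (archive name without .tar.gz).
extracted_dir() {
    basename "$1" .tar.gz
}

# Download, extract, and print the azcopy version to confirm the binary runs.
install_azcopy() {
    archive=$(archive_name "$AZCOPY_URL")
    wget "$AZCOPY_URL" &&
        tar xzf "$archive" &&
        "./$(extracted_dir "$archive")/azcopy" --version
}

# Example (not executed here because it needs network access):
#   install_azcopy
```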
Before uploading a file, you need to log in with azcopy.
The login requires a tenant ID, so confirm it beforehand. You can find the tenant ID in Azure AD, under Tenant Information.
Log in from CentOS as follows.
$ ./azcopy login --tenant-id "<Tenant ID>"
When you run the command, it prints a URL and a code.
Open the URL in a browser and enter the code on the screen that appears.
If the login is successful, a message containing "succeeded" is output.
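To guard against copy-paste mistakes in the tenant ID, the login step can be wrapped in a small check. This is only a sketch: `is_guid` and `login_with_tenant` are hypothetical helper names, and the check assumes the tenant ID is given in its GUID form (8-4-4-4-12 hexadecimal digits).

```shell
#!/bin/sh
# Return success if the argument looks like a GUID (8-4-4-4-12 hex digits).
is_guid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Run azcopy login only when the tenant ID has the expected shape.
# Argument: tenant ID.
login_with_tenant() {
    if is_guid "$1"; then
        ./azcopy login --tenant-id "$1"
    else
        echo "Tenant ID does not look like a GUID: $1" >&2
        return 1
    fi
}

# Example (not executed here; the value is a placeholder):
#   login_with_tenant "<Tenant ID>"
```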
Upload to Blob using azcopy's copy command.
$ ./azcopy copy "Local file name" "https://<Storage account name>.blob.core.windows.net/<Container name>"
To upload multiple files, you can use a wildcard such as *.
$ ./azcopy copy "Local directory/*" "https://<Storage account name>.blob.core.windows.net/<Container name>"
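For repeated uploads, the destination URL can be assembled from the account and container names instead of typed by hand. The following is a minimal sketch under the same URL pattern as the commands above; `container_url` and `upload_dir` are hypothetical helper names, and `./tpch_data` is a placeholder directory.

```shell
#!/bin/sh
# Build the container URL used as the azcopy destination.
# Arguments: storage account name, container name.
container_url() {
    printf 'https://%s.blob.core.windows.net/%s\n' "$1" "$2"
}

# Upload the contents of a local directory to the container.
# The wildcard stays quoted so it is passed to azcopy as-is,
# the same way as in the command above.
# Arguments: local directory, storage account name, container name.
upload_dir() {
    ./azcopy copy "$1/*" "$(container_url "$2" "$3")"
}

# Example (not executed here; names are placeholders):
#   upload_dir "./tpch_data" "<Storage account name>" "azcopytest"
```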
Once the upload is complete, you can load the data into the Azure Synapse Analytics SQL pool using PolyBase or a similar method. That procedure is covered in another article, so please refer to it if you are interested: I tried to populate the Synapse SQL pool with PolyBase