Kinesis Data Streams from zero Java experience (3.1)

In this post I work through "Tutorial: Real-time analysis of stock data using Kinesis Data Streams".

The purpose is to get used to KCL (Kinesis Client Library). Since this is where we start touching Java in earnest, I meant to cover the preparation carefully... and this part ended up being nothing but preparation.

(I think it's a characteristic of Java that preparation is difficult ...)

Preparation

The execution environment is assumed to be macOS.

Editor installation

I installed the Community Edition of IntelliJ IDEA (because I was used to PyCharm). Double-click the downloaded file, then drag and drop it into the Applications folder in the window that opens.

- If you leave the defaults alone, the Maven plugin is installed.
- Featured plugins are your choice. I installed all of them.

Java, Maven installation

Install Homebrew in advance

brew cask install java  # You will be asked for your password; enter it.

#Verification
java -version
> openjdk version "11.0.2" 2019-01-15
> OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
> OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)

Check the Java version (it appears to be 11 at the time of writing).

PATH setting

Add the following to your bash_profile.

~/.bash_profile


export JAVA_HOME=`/usr/libexec/java_home -v "11"`
PATH=${JAVA_HOME}/bin:${PATH}

After saving, run `source ~/.bash_profile`. (It seems you can switch Java versions by changing the value passed to -v.)
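To confirm from inside the JVM that the setting took effect, a small check like the one below may help. This is only a minimal sketch: the class name `JavaVersionCheck` and the helper `majorVersion` are made up for illustration and are not part of the tutorial code.

```java
// Hypothetical helper (not part of the tutorial): reports which Java the JVM is running.
final class JavaVersionCheck {

    // Extracts the major version from a version string.
    // Handles both the old "1.8.0_201" scheme and the newer "11.0.2" scheme.
    static int majorVersion(String version) {
        String[] parts = version.split("[._+-]");
        int first = Integer.parseInt(parts[0]);
        // Before Java 9 the string started with "1.", so the major version is the second part.
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("java.version = " + version);
        System.out.println("java.home    = " + System.getProperty("java.home"));
        System.out.println("major        = " + majorVersion(version));
    }
}
```

If `java.home` points somewhere other than the JDK selected by `/usr/libexec/java_home`, the shell is probably still picking up a different `java` from the PATH.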

Maven installation

brew install maven

#Verification
mvn -version
> Apache Maven 3.6.0 (97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-25T03:41:47+09:00)
> Maven home: /usr/local/Cellar/maven/3.6.0/libexec
> Java version: 11.0.2, vendor: Oracle Corporation, runtime: /Library/Java/JavaVirtualMachines/openjdk-11.0.2.jdk/Contents/Home
> Default locale: en_JP, platform encoding: UTF-8
> OS name: "mac os x", version: "10.14.2", arch: "x86_64", family: "mac"

Tutorial source code

The code is in Learning Amazon Kinesis Development... but honestly it has not been updated much.

The branch structure is supposed to be:

- master: the complete answer with all code
- learning-module-1: for learning, to be filled in while following the tutorial

However, the updates that accompany API changes seem to be reflected only in the master branch, so I proceed based on master.

git clone https://github.com/aws-samples/amazon-kinesis-learning.git

We will copy files from it into the appropriate directory later.

Creating a project

Initial setting

Launch IntelliJ

- Select **[Create new project]**
- Select the **[Maven]** tab > check **[Create from archetype]** > select **[org.apache.maven.archetypes:maven-archetype-quickstart]** > **[Next]**
- **[group-id]**: normally something that looks globally unique, such as com.github.{id}, but to keep code changes simple leave it as com.amazonaws.services.kinesis.samples. **[artifact-id]**: the project name (e.g. kinesis-stocktrade-tutorial) > **[Next]**
- **[maven home directory]**: the one installed with brew (/usr/local/Cellar/maven/3.6.0/libexec) > **[Next]** > **[Finish]**

Then the initialization script will run automatically. When completed, you should have a directory like the one below, including the pom file.

image.png

A pop-up saying **[Maven projects need to be imported]** should also appear at the bottom right; select `Enable Auto Import`.

Edit pom.xml

In Maven, pom.xml describes the required libraries and the build procedure. Libraries are added in the following form:

<dependencies>
  <dependency>
    Library
  </dependency>
  <dependency>
    Library
  </dependency>
  ...
</dependencies>

(If Auto Import is turned on, IntelliJ downloads them for you.)

Libraries required for this tutorial

(This includes additions not described in the documentation; entries one level down are dependencies.) Search for each of these at https://mvnrepository.com and paste the XML shown there. The latest version of KCL is v2, but this tutorial seems to have stopped at v1, so I chose v1.9.3. Reference: Using the SDK with Apache Maven

It's just annoying. (I wonder if there is something like pip or yarn in Java ...)

Source code placement and modification

From the cloned code, copy src/main/.../stocktrades and everything under it into the same directory as the App.java that IntelliJ created. If the groupId is com.amazonaws.services.kinesis.samples, the layout should already match, so it is fine as long as you place the files so that the structure is preserved.

If you now select **[Build]** > **[Build Project]** from the menu bar, you get an error at StockTradeRecordProcessor:28. It is caused by an interface change, so correct the line that gives the error as follows.

StockTradeRecordProcessor.java


import com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShutdownReason;

Warnings such as "com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker has been deprecated" will probably keep appearing, but I suspect these point at what has to change when upgrading KCL to v2.

If you changed the groupId to something else, repeat Build and fix each error that appears. In IntelliJ, click where the red underline is drawn; a red light bulb appears, so click it and choose "Set package name to ..." to have it corrected automatically.

Execution confirmation

Do this after placing your credentials under ~/.aws.

Write side

Check the function that writes dummy stock trade data to the specified Kinesis stream. Select stocktrades/writer/StockTradesWriter.java in IntelliJ, choose **[Run]** > **[Run...]** from the menu bar, and select StockTradesWriter in the pop-up that appears. Then

Usage: StockTradesWriter <stream name> <region>

is displayed, because no arguments were passed. From the menu bar, open **[Run]** > **[Edit Configurations]** > **[StockTradesWriter]**, enter {name of the Kinesis stream you created} {region} in **[Program Arguments]**, and click **[OK]**.
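For reference, an argument check that produces a message of this shape could look roughly like the following. This is a minimal sketch under my own assumptions, not the tutorial's actual code: `UsageCheck` and `usageError` are hypothetical names.

```java
// Hypothetical sketch of an argument check; the names are illustrative only.
final class UsageCheck {

    // Returns a usage message when the argument count is wrong, or null when it is fine.
    static String usageError(String programName, String[] args, String... expectedNames) {
        if (args.length == expectedNames.length) {
            return null;
        }
        StringBuilder sb = new StringBuilder("Usage: ").append(programName);
        for (String name : expectedNames) {
            sb.append(" <").append(name).append('>');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String error = usageError("StockTradesWriter", args, "stream name", "region");
        if (error != null) {
            // Print the usage line and stop, as happens when no Program Arguments are set.
            System.err.println(error);
            System.exit(1);
        }
        System.out.println("stream=" + args[0] + ", region=" + args[1]);
    }
}
```

Whatever is typed into [Program Arguments] arrives as the `args` array of `main`, which is why an empty field leads straight to the usage message.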

image.png

**[Run]** StockTradesWriter again:

INFO: Putting trade: ID 1: BUY 2383 shares of DIS for $114.35
Feb 11, 2019 9:45:47 PM com.amazonaws.services.kinesis.samples.stocktrades.writer.StockTradesWriter sendStockTrade
INFO: Putting trade: ID 2: SELL 1022 shares of JNJ for $93.94
Feb 11, 2019 9:45:47 PM com.amazonaws.services.kinesis.samples.stocktrades.writer.StockTradesWriter sendStockTrade
INFO: Putting trade: ID 3: BUY 7069 shares of WMT for $97.06
Feb 11, 2019 9:45:48 PM com.amazonaws.services.kinesis.samples.stocktrades.writer.StockTradesWriter sendStockTrade
INFO: Putting trade: ID 4: BUY 5939 shares of GOOG for $437.30

If log output like this scrolls by, it worked.
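To get a feel for what each log line contains, here is a minimal sketch of a value class that renders a trade in the same shape. The class `TradeLine` and its field names are my own assumptions inferred from the log output, not the tutorial's model class.

```java
import java.util.Locale;

// Hypothetical value class; field names are assumptions inferred from the log output.
final class TradeLine {
    enum Type { BUY, SELL }

    final long id;
    final Type type;
    final long quantity;
    final String ticker;
    final double price;

    TradeLine(long id, Type type, long quantity, String ticker, double price) {
        this.id = id;
        this.type = type;
        this.quantity = quantity;
        this.ticker = ticker;
        this.price = price;
    }

    @Override
    public String toString() {
        // Locale.US keeps the decimal separator as "." regardless of the machine's locale.
        return String.format(Locale.US, "ID %d: %s %d shares of %s for $%.2f",
                id, type, quantity, ticker, price);
    }

    public static void main(String[] args) {
        System.out.println("Putting trade: " + new TradeLine(1, Type.BUY, 2383, "DIS", 114.35));
    }
}
```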

Reading side

Next, check the function that reads records from Kinesis and aggregates the trade information.

**[Run]** stocktrades/processor/StockTradesProcessor.java in the same way as on the write side, and

Usage: StockTradesProcessor <application name> <stream name> <region>

is displayed. For the stream name and region you can use the same values as on the write side, but take care with the application name: it becomes **the name of the DynamoDB table that KCL uses to manage its state**. If the table does not exist it is created automatically, so I think using the same name as the stream is fine. Set the arguments as on the write side and **[Run]** again, and

INFO: Initializing record processor for shard: shardId-000000000057

If a display like this appears, it is a success for the time being.
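As a rough picture of what "aggregating the trade information" can mean, the sketch below counts buys and sells per ticker and reports the most popular one. It is only an illustration under my own assumptions: `TradeCounter` and its methods are hypothetical, and the tutorial's processor may aggregate differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical aggregator; this is not the tutorial's implementation.
final class TradeCounter {
    private final Map<String, Long> buys = new HashMap<>();
    private final Map<String, Long> sells = new HashMap<>();

    // Records one trade; isBuy distinguishes BUY from SELL.
    void add(String ticker, boolean isBuy) {
        (isBuy ? buys : sells).merge(ticker, 1L, Long::sum);
    }

    // Returns how many trades of the given kind were seen for the ticker.
    long count(String ticker, boolean isBuy) {
        return (isBuy ? buys : sells).getOrDefault(ticker, 0L);
    }

    // Returns the ticker with the most trades of the given kind, or null if none were seen.
    String mostPopular(boolean isBuy) {
        String best = null;
        long bestCount = 0;
        for (Map.Entry<String, Long> e : (isBuy ? buys : sells).entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TradeCounter counter = new TradeCounter();
        counter.add("DIS", true);
        counter.add("JNJ", false);
        counter.add("WMT", true);
        counter.add("DIS", true);
        System.out.println("Most bought: " + counter.mostPopular(true));
        System.out.println("Most sold:   " + counter.mostPopular(false));
    }
}
```

In a real consumer, something like `add` would be called once per record handed to the record processor, and the summary printed at intervals.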

Add build settings

To run the Writer and the Processor independently, build them as separate jars. Add the following to pom.xml.

pom.xml


<project>
  ...
  <build>
    <pluginManagement><!-- Manage plugin versions here -->
      <plugins>
        ...
        <plugin>
          <artifactId>maven-assembly-plugin</artifactId>
          <version>3.1.1</version>
        </plugin>
      </plugins>
    </pluginManagement>
    <plugins>
      <plugin>
        <!-- Create a package (jar) that bundles the executable classes -->
        <!-- assembly-plugin creates a single executable jar that includes the dependency libraries -->
        <artifactId>maven-assembly-plugin</artifactId>
        ...
        <executions>
          <execution>
            <id>build-writer</id>
            <phase>package</phase><!-- Built together with the rest when you run mvn package -->
            <goals>
              <goal>single</goal><!-- The basic goal of the assembly plugin -->
            </goals>
            <configuration>
              <archive>
                <manifest>
                  <!-- Specify the class whose main() is this jar's entry point -->
                  <mainClass>com.amazonaws.services.kinesis.samples.stocktrades.writer.StockTradesWriter</mainClass>
                </manifest>
              </archive>
              <descriptorRefs>
                <!-- Bundle the project and its external dependency libraries into a single jar -->
                <descriptorRef>jar-with-dependencies</descriptorRef>
              </descriptorRefs>
              <finalName>StockTradesWriter</finalName>
            </configuration>
          </execution>
          <execution>
            <id>build-processor</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
            <configuration>
              <archive>
                <manifest>
                  <mainClass>com.amazonaws.services.kinesis.samples.stocktrades.processor.StockTradesProcessor</mainClass>
                </manifest>
              </archive>
              <descriptorRefs>
                <descriptorRef>jar-with-dependencies</descriptorRef>
              </descriptorRefs>
              <finalName>StockTradesProcessor</finalName>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

After adding this, open the **[Maven]** tab on the right side of IntelliJ, select **[Project name]** > **[Lifecycle]** > **[package]**, and click the ▶ button. The jar files are generated directly under the target directory.

The rest is the same as the operation check in IntelliJ earlier. From the target directory, run

java -jar StockTradesWriter-jar-with-dependencies.jar {streamname} {region}
# In another tab
java -jar StockTradesProcessor-jar-with-dependencies.jar {appname} {streamname} {region}

in a terminal and you should be able to confirm that it works.


That's it for this part. Next, I'm going to study each class and method that was supposed to be filled in.
