[JAVA] Introduction to Apache Beam (1) ~ Reading and writing text ~

Overall purpose

Create a simple Apache Beam program to understand how it works

Goal of this post

Create a program that reads a local text file and writes it as is

Main part



Maven : 3.5.2






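import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public class SimpleBeam {
    public static void main(String[] args) {
        // Create the pipeline with default options
        PipelineOptions options = PipelineOptionsFactory.create();

        Pipeline p = Pipeline.create(options);
        // Read the text file; each line becomes one element of the PCollection
        PCollection<String> textData = p.apply(TextIO.read().from("Sample.txt"));
        // Write the lines out as is. The "wordcounts" prefix is an assumption,
        // inferred from the output file name described below
        textData.apply(TextIO.write().to("wordcounts"));
        // Run the pipeline and block until it finishes
        p.run().waitUntilFinish();
    }
}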

If a library's dependency cannot be resolved, press command + and add it as shown in the image.









The output is as follows. If a `wordcounts-.*` file is created in the `~/beamSample` directory, the run succeeded.
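The check described above can also be done from code. The following is only a minimal sketch in plain Java (no Beam dependency): the class name `OutputCheck` and the helper `findShards` are hypothetical names introduced for illustration, while the directory `~/beamSample` and the `wordcounts` prefix are taken from the text above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OutputCheck {
    // Hypothetical helper: returns the sorted names of the entries in dir
    // whose names match "<prefix>-*".
    static List<String> findShards(Path dir, String prefix) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, prefix + "-*")) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumes the project lives in ~/beamSample, as described in the text.
        Path dir = Paths.get(System.getProperty("user.home"), "beamSample");
        if (!Files.isDirectory(dir)) {
            System.out.println("Directory not found: " + dir);
            return;
        }
        List<String> shards = findShards(dir, "wordcounts");
        if (shards.isEmpty()) {
            System.out.println("No wordcounts-* files found in " + dir);
        } else {
            System.out.println("Found output files: " + shards);
        }
    }
}
```

Running `main` after the pipeline finishes prints the matching file names, or a message if none exist yet.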

Where I got stuck

There was nothing in particular to stumble on this time, but since I don't know IntelliJ well, I was thrown by unfamiliar errors several times. Most of them were caused by dependencies that could not be resolved, so I got through them with `Add_Maven`.

Next time

This time I only got the program running, so from next time onward I would like to build a simple Pipeline that also serves as a review of the idea behind MapReduce.

