[JAVA] NLP4J [006-033] 100 language processing knocks with NLP4J # 33 Sahen noun

I'll try.

33. Sahen noun

Extract all sahen-connection nouns (サ変接続の名詞), that is, nouns that turn into a verb when followed by する ("to do").

Maven

Use the version currently under development.

<dependency>
	<groupId>org.nlp4j</groupId>
	<artifactId>nlp4j-core</artifactId>
	<version>1.1.1.0-SNAPSHOT</version>
</dependency>

Text Data

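The morphological analyzer used by default (the Yahoo! Japan Developer Network Japanese morphological analysis API) caps the request size at 900KB and also limits the number of requests, so a small text file is used here. It contains the opening of "I Am a Cat"; the actual file is in Japanese, and the excerpt below is shown in English translation.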
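If the input were larger, one way to stay under a request size limit like this is to group lines into batches by their encoded size before sending them. The following is only a minimal sketch and is not part of NLP4J: the class name `RequestBatcher`, the method `batchLines`, and the reading of "900KB" as 900 * 1024 bytes of UTF-8 are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of NLP4J): groups lines into batches whose
// UTF-8 size stays at or below a byte limit.
public class RequestBatcher {

	// Assumption: the "900KB" limit is interpreted as 900 * 1024 bytes.
	static final int MAX_BYTES = 900 * 1024;

	static List<String> batchLines(List<String> lines, int maxBytes) {
		List<String> batches = new ArrayList<>();
		StringBuilder current = new StringBuilder();
		int currentBytes = 0;
		for (String line : lines) {
			// Size of the line in UTF-8, plus one byte for the trailing newline
			int lineBytes = line.getBytes(StandardCharsets.UTF_8).length + 1;
			// Close the current batch if this line would push it over the limit
			if (currentBytes > 0 && currentBytes + lineBytes > maxBytes) {
				batches.add(current.toString());
				current.setLength(0);
				currentBytes = 0;
			}
			// Note: a single line longer than maxBytes still becomes its own
			// (oversized) batch; this sketch does not split inside a line.
			current.append(line).append('\n');
			currentBytes += lineBytes;
		}
		if (currentBytes > 0) {
			batches.add(current.toString());
		}
		return batches;
	}

	public static void main(String[] args) {
		List<String> lines = Arrays.asList("吾輩は猫である。", "名前はまだ無い。");
		for (String batch : batchLines(lines, MAX_BYTES)) {
			System.out.println(batch.getBytes(StandardCharsets.UTF_8).length + " bytes");
		}
	}
}
```

Batching on line boundaries keeps each sentence intact, which matters for morphological analysis. The limit on the number of requests would still have to be handled separately.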

one

I am a cat.
There is no name yet.

I have no idea where I was born.
I remember only crying in a dim and damp place.
I saw human beings for the first time here.
Moreover, I heard later that it was the most evil race of human beings called Shosei.
This student is a story that sometimes catches us and boiled and eats.
However, I didn't think anything at that time, so I didn't think it was particularly scary.
It just felt fluffy when it was placed on his palm and lifted up.
It is probably the beginning of what is called a human being that he calmed down a little on his palm and saw the student's face.
The feeling that I thought was strange at this time still remains.
The face, which should be decorated with the first hair, is slippery and looks like a kettle.
After that, I met a cat a lot, but I have never met such a single wheel.
Not only that, the center of the face is too protruding.
Then I sometimes blow smoke from the hole.
It was so throaty that I was really weak.
It was around this time that I finally learned that this is a cigarette that humans drink.


Java Code

package nlp4j.nokku.chap4;

import java.util.List;

import nlp4j.Document;
import nlp4j.DocumentAnnotator;
import nlp4j.DocumentAnnotatorPipeline;
import nlp4j.Keyword;
import nlp4j.crawler.Crawler;
import nlp4j.crawler.TextFileLineSeparatedCrawler;
import nlp4j.impl.DefaultDocumentAnnotatorPipeline;
import nlp4j.index.DocumentIndex;
import nlp4j.index.SimpleDocumentIndex;
import nlp4j.yhoo_jp.YJpMaAnnotator;

public class Nokku31 {
	public static void main(String[] args) throws Exception {
		// Use the text file crawler provided by NLP4J (one document per line)
		Crawler crawler = new TextFileLineSeparatedCrawler();
		crawler.setProperty("file", "src/test/resources/nlp4j.crawler/neko_short_utf8.txt");
		crawler.setProperty("encoding", "UTF-8");
		crawler.setProperty("target", "text");
		// Crawl the documents
		List<Document> docs = crawler.crawlDocuments();
		// Define the NLP pipeline (several processing steps chained together)
		DocumentAnnotatorPipeline pipeline = new DefaultDocumentAnnotatorPipeline();
		{
			// Annotator that uses the Yahoo! Japan morphological analysis API
			DocumentAnnotator annotator = new YJpMaAnnotator();
			pipeline.add(annotator);
		}
		// Run the annotation
		pipeline.annotate(docs);
		// Use a document index to collect the keywords
		SimpleDocumentIndex index = new SimpleDocumentIndex();
		// Add the documents to the index
		index.addDocuments(docs);
		// Get the keywords as a flat list, without aggregating counts
		List<Keyword> kwds = index.getKeywordsWithoutCount();

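		// Most recent noun seen (meishi = 名詞, "noun"); null if the previous keyword was not a noun
		String meishi = null;

		// Not an elegant approach, but this time simply look for a noun followed by する ("to do").
		for (Keyword kwd : kwds) {
			if (kwd.getFacet().equals("名詞")) {
				// Remember the noun and check the next keyword
				meishi = kwd.getLex();
			} else if (meishi != null && kwd.getLex().equals("する")) {
				// noun + する: print them joined together
				System.err.println(meishi + kwd.getLex());
				meishi = null;
			} else {
				meishi = null;
			}
		}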
	}
}
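The scan itself does not depend on the API call, so it can be tried in isolation. The following is a minimal stand-alone sketch of the same "noun followed by する" check over hand-written tokens. The class `SahenScanSketch`, the method `findSahenNouns`, the `{facet, lex}` array representation, and the facet labels used in the sample data are assumptions for illustration, not NLP4J's API. Unlike the code above, it returns the bare noun rather than the noun with する appended.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-alone sketch of the "noun followed by する" scan.
// It needs neither NLP4J nor a call to the morphological analysis API.
public class SahenScanSketch {

	// Each token is a {facet, lex} pair.
	// Assumption: nouns carry the facet label "名詞", as in the article's code.
	static List<String> findSahenNouns(List<String[]> tokens) {
		List<String> found = new ArrayList<>();
		String previousNoun = null;
		for (String[] token : tokens) {
			String facet = token[0];
			String lex = token[1];
			if ("名詞".equals(facet)) {
				// Remember the noun and look at the next token
				previousNoun = lex;
			} else {
				// A noun directly followed by する counts as a sahen noun
				if (previousNoun != null && "する".equals(lex)) {
					found.add(previousNoun);
				}
				previousNoun = null;
			}
		}
		return found;
	}

	public static void main(String[] args) {
		// Hand-written tokens for illustration; a real analyzer may segment
		// and label the text differently.
		List<String[]> tokens = Arrays.asList(
				new String[] { "名詞", "事" },
				new String[] { "助詞", "だけ" },
				new String[] { "助詞", "は" },
				new String[] { "名詞", "記憶" },
				new String[] { "動詞", "する" },
				new String[] { "助動詞", "いる" });
		System.out.println(findSahenNouns(tokens)); // prints [記憶]
	}
}
```

Keeping the scan in a method that takes plain tokens makes it easy to test without consuming any of the limited API requests.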

Result

記憶する (to remember)
装飾する (to decorate)
突起する (to protrude)

Impressions

Strictly speaking, what gets printed is the sahen verb (noun + する), not the sahen noun on its own.

Summary

With NLP4J, you can easily process natural language in Java!

Project URL

https://www.nlp4j.org/

