Let's give this one a try.
Task: extract all sa-hen connection nouns (サ変接続の名詞, nouns that become verbs when followed by する).
Maven
Use the version currently under development.
<dependency>
    <groupId>org.nlp4j</groupId>
    <artifactId>nlp4j-core</artifactId>
    <version>1.1.1.0-SNAPSHOT</version>
</dependency>
Text Data
The default morphological analyzer (the Yahoo! Japan Developer Network Japanese morphological analysis API) caps each request at 900KB and also limits the number of requests, so a small text file is used here.
one
I am a cat.
There is no name yet.
I have no idea where I was born.
I remember only crying in a dim and damp place.
I saw human beings for the first time here.
Moreover, I heard later that he was a shosei, the most ferocious species of human being.
It is said that these shosei sometimes catch us, boil us, and eat us.
However, I didn't think anything at that time, so I didn't think it was particularly scary.
I only felt a floating sensation when I was placed on his palm and lifted up.
Calming down a little on his palm, I looked at the shosei's face, and that was probably my first sight of what is called a human being.
The feeling that I thought was strange at this time still remains.
The face, which should be decorated with hair, is smooth and looks just like a kettle.
Since then I have met many cats, but I have never come across such a misshapen creature.
Not only that, the center of the face is too protruding.
What is more, smoke sometimes puffs out of the hole.
It made me choke, and I was truly at a loss.
Only recently did I finally learn that this is the tobacco that humans smoke.
Java Code
package nlp4j.nokku.chap4;

import java.util.List;

import nlp4j.Document;
import nlp4j.DocumentAnnotator;
import nlp4j.DocumentAnnotatorPipeline;
import nlp4j.Keyword;
import nlp4j.crawler.Crawler;
import nlp4j.crawler.TextFileLineSeparatedCrawler;
import nlp4j.impl.DefaultDocumentAnnotatorPipeline;
import nlp4j.index.DocumentIndex;
import nlp4j.index.SimpleDocumentIndex;
import nlp4j.yhoo_jp.YJpMaAnnotator;

public class Nokku31 {
    public static void main(String[] args) throws Exception {
        // Use the line-separated text file crawler provided by NLP4J
        Crawler crawler = new TextFileLineSeparatedCrawler();
        crawler.setProperty("file", "src/test/resources/nlp4j.crawler/neko_short_utf8.txt");
        crawler.setProperty("encoding", "UTF-8");
        crawler.setProperty("target", "text");
        // Crawl the documents
        List<Document> docs = crawler.crawlDocuments();
        // Define the NLP pipeline (several processing steps chained together)
        DocumentAnnotatorPipeline pipeline = new DefaultDocumentAnnotatorPipeline();
        {
            // Annotator that uses the Yahoo! Japan morphological analysis API
            DocumentAnnotator annotator = new YJpMaAnnotator();
            pipeline.add(annotator);
        }
        // Run the annotation
        pipeline.annotate(docs);
        // Use a document index to collect the keywords
        SimpleDocumentIndex index = new SimpleDocumentIndex();
        // Add the documents
        index.addDocuments(docs);
        List<Keyword> kwds = index.getKeywordsWithoutCount();
        // Base form of the previous keyword if it was a noun, otherwise null
        String meishi = null;
        // Not an elegant approach, but this time simply look for a noun
        // immediately followed by する.
        // The analyzer works on Japanese text, so the literals compared here are
        // assumed to be the Japanese values 名詞 (noun) and する (to do).
        for (Keyword kwd : kwds) {
            if ("名詞".equals(kwd.getFacet())) {
                meishi = kwd.getLex();
            } else if (meishi != null && "する".equals(kwd.getLex())) {
                System.err.println(meishi + kwd.getLex());
                meishi = null;
            } else {
                meishi = null;
            }
        }
    }
}
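The scan in the loop above can be tried without calling the morphological analysis API. The following is only a minimal sketch of the same "noun immediately followed by する" idea: the Token class, the NounSuruSketch name, and the hand-written token list are hypothetical stand-ins for the Keyword objects NLP4J returns, and the part-of-speech labels are assumed to be the Japanese strings the analyzer would report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NounSuruSketch {

    /** Hypothetical stand-in for an analyzed token: part of speech plus base form. */
    static final class Token {
        final String facet;
        final String lex;

        Token(String facet, String lex) {
            this.facet = facet;
            this.lex = lex;
        }
    }

    /** Returns noun + "する" for every noun that is immediately followed by "する". */
    static List<String> extract(List<Token> tokens) {
        List<String> found = new ArrayList<>();
        String noun = null; // base form of the previous token if it was a noun
        for (Token t : tokens) {
            if ("名詞".equals(t.facet)) {
                // Remember the noun and wait to see what follows it
                noun = t.lex;
            } else {
                if (noun != null && "する".equals(t.lex)) {
                    found.add(noun + t.lex);
                }
                // Any non-noun token ends the candidate pair
                noun = null;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hand-written tokens for illustration only, not real analyzer output
        List<Token> tokens = Arrays.asList(
                new Token("名詞", "事"),
                new Token("助詞", "は"),
                new Token("名詞", "記憶"),
                new Token("動詞", "する"),
                new Token("助詞", "て"),
                new Token("動詞", "いる"));
        System.out.println(extract(tokens)); // prints [記憶する]
    }
}
```

Keeping only the most recent noun in a single variable means a noun counts only when する is the very next token, which is the same simplification the code above makes.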
記憶する (remember)
装飾する (decorate)
突起する (protrude)
Strictly speaking, the output lists sa-hen verbs (noun + する) rather than the sa-hen connection nouns themselves.
With NLP4J, you can easily process natural language in Java!
https://www.nlp4j.org/