I want to create Parquet files in Ruby, too

Background

Python's pandas and its DataFrame.to_parquet are so convenient that there is a general feeling that "Parquet files are something you handle in Python". https://pandas.pydata.org/pandas-docs/version/0.22.0/generated/pandas.DataFrame.to_parquet.html#pandas.DataFrame.to_parquet

It turned out to be easy to create one in Ruby as well, so I'm sharing how.

Method

You can use the official Apache gem, red-parquet. (Note that it is a different gem from red-arrow.) https://github.com/apache/arrow/tree/master/ruby/red-parquet

Verification

File creation

Install the gem

$ gem install red-parquet

Create a test file (CSV)

$ echo colA,colB > test.csv
$ echo 1,2  >> test.csv

Convert in Ruby (CSV -> Parquet)

$ irb
irb(main):001:0> require "parquet"
=> true
irb(main):002:0> table = Arrow::Table.load("./test.csv")
=> #<Arrow::Table:0x7fbb0d3e6708 ptr=0x7fbb0e0a4010>
	 colA	 colB
0	    1	    2
irb(main):003:0> table.save("./test.parquet")
=> true
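
The same steps can also be written as a small script instead of an irb session. This is only a rough sketch: it assumes red-parquet is installed as above, and the helper names parquet_path_for and csv_to_parquet are made up for this example (they are not part of the gem). Only Arrow::Table.load and Arrow::Table#save, both used in the session above, come from the library.

```ruby
# Hypothetical helper (not part of red-parquet):
# derive the output path by swapping a trailing .csv for .parquet.
def parquet_path_for(csv_path)
  csv_path.sub(/\.csv\z/i, "") + ".parquet"
end

# Hypothetical helper (not part of red-parquet):
# load a CSV file into an Arrow::Table and save it as Parquet.
def csv_to_parquet(csv_path, parquet_path = parquet_path_for(csv_path))
  # Required lazily so that defining this method does not need the gem.
  require "parquet"

  table = Arrow::Table.load(csv_path) # same call as in the irb session
  table.save(parquet_path)            # same call as in the irb session
  parquet_path
end
```

Calling csv_to_parquet("./test.csv") should then write ./test.parquet next to the CSV, just like the interactive version.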

Checking the result

Upload test.parquet to S3 and check it with S3 Select

(Screenshot 2020-06-08 19.08.04.png: the S3 Select result)

It worked! (It even infers the column types!)
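
If you don't have S3 at hand, the inferred types can probably be checked locally by reading the file back with the same gem. The sketch below assumes red-parquet is installed; describe_columns and describe_parquet are hypothetical helper names for this example. It relies on Arrow::Table#schema and on each field exposing name and data_type, and the exact text printed for each type may depend on the red-arrow version, so no expected output is shown.

```ruby
# Hypothetical helper (not part of red-parquet):
# format each column of a table as "name: type".
def describe_columns(table)
  table.schema.fields.map do |field|
    "#{field.name}: #{field.data_type}"
  end
end

# Hypothetical helper (not part of red-parquet):
# load a Parquet file and describe its columns.
def describe_parquet(path)
  # Required lazily so that defining this method does not need the gem.
  require "parquet"

  describe_columns(Arrow::Table.load(path))
end
```

For example, puts describe_parquet("./test.parquet") should list colA and colB together with whatever types were inferred from the CSV.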

Remarks

Judging from the slides below, Ruby seems to be able to handle data files like this surprisingly well. https://www.slideshare.net/kou/datasciencerb
