I copy and reuse this template whenever I have to work with a huge file.
- You don't have to look up the commands for reading and writing CSV and other files.
- Progress is displayed.
  - Reading a huge file usually takes about 10 minutes, and without any output you cannot tell how far along it is, so the script prints its progress.
The following script reads a CSV file.
- If you want to work with a plain text file, see [here](https://github.com/setsumaru1992/portableScripts/blob/master/frequent_use_codes/ruby_file_handlers/read_and_write_file.rb).
- If you want to aggregate after reading all the lines instead of processing each line sequentially, see [here](https://github.com/setsumaru1992/portableScripts/blob/master/frequent_use_codes/ruby_file_handlers/read_csv_lines_and_write_result.rb).
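Before the main script, here is a rough illustration of the aggregation case from the second bullet. It is not the linked script, only a minimal sketch under assumptions: the helper name `totals_by_category` and the `category` and `amount` columns are hypothetical and chosen for the demo.

```ruby
require "csv"
require "tempfile"

# Hypothetical example: sum the "amount" column per "category".
# CSV.foreach still streams the file row by row; only the running
# totals are kept in memory, and the result is used after the last row.
def totals_by_category(csv_path)
  totals = Hash.new(0)
  CSV.foreach(csv_path, headers: true) do |row|
    totals[row["category"]] += row["amount"].to_i
  end
  totals
end

# Small temporary file so the sketch runs on its own.
sample = Tempfile.new(["sample", ".csv"])
sample.write("category,amount\nfood,300\nbooks,1200\nfood,450\n")
sample.close

totals = totals_by_category(sample.path)
totals.each { |category, amount| puts "#{category}: #{amount}" }
```

Keeping only the totals in a hash, rather than every row, is what lets this pattern stay usable on very large files.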
require "csv"

# Set to false to silence the start/end and progress messages.
$is_debug = true

def main(csv) # , output_file)
  # output_file_writer = CSV.open(output_file, "w")
  # output_cols = ["hoge"]
  # output_file_writer.puts(output_cols)
  FileHandler.csv_foreach(csv) do |row|
    # processing
    p row
    # output_row_values = []
    # output_file_writer.puts(output_row_values)
  end
  # puts "#{output_file} is created"
  # output_file_writer.close
end

module FileHandler
  class << self
    # Yields each data row (the first line is treated as the header) and
    # returns an array of the block's return values.
    def csv_foreach(csv)
      log "#{Time.now}: read start #{csv}"
      # Subtract the header line so that the progress can reach 100%.
      all_line_count = line_count(csv) - 1
      return_values = CSV.foreach(csv, headers: true).with_index(1).map do |row, row_no|
        log progress(row_no, all_line_count) if progress_timing?(all_line_count, row_no)
        yield(row)
      end
      log "#{Time.now}: read end #{csv}"
      return_values
    end

    # Counts physical lines without loading the whole file into memory.
    def line_count(file)
      File.open(file) do |f|
        while f.gets; end
        f.lineno
      end
    end

    private

    def log(message)
      puts(message) if $is_debug
    end

    # True roughly once per 1% of the rows; always false under 100 rows.
    def progress_timing?(all_line_count, line_no)
      return false if all_line_count < 100
      # NOTE: Change depending on processing time
      div_number = 100
      percent_unit = all_line_count / div_number
      line_no % percent_unit == 0
    end

    def progress(current_count, all_count)
      "#{Time.now}: #{CommonUtilities.percent(current_count, all_count)}% (#{CommonUtilities.number_with_commma(current_count)} / #{CommonUtilities.number_with_commma(all_count)})"
    end
  end
end

module CommonUtilities
  class << self
    def percent(num, all_count)
      (num.fdiv(all_count) * 100).round(2)
    end

    # Inserts a comma before every group of three trailing digits (1234567 -> "1,234,567").
    def number_with_commma(number)
      number.to_s.gsub(/(\d)(?=(?:\d{3})+$)/, '\\1,')
    end
  end
end

main(ARGV[0])
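
The commented-out lines in `main` hint at a variant that also writes a result file. The sketch below shows one way that could look, using the block form of `CSV.open` so the writer is closed automatically. The helper name `write_totals` and the `name`, `price`, and `quantity` columns are assumptions made for this example, not part of the original script, and the progress logging is left out to keep it short.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: reads input_path row by row and writes one output
# row per input row. The block form of CSV.open closes the output file
# when the block ends, even if an exception is raised part-way through.
def write_totals(input_path, output_path)
  CSV.open(output_path, "w") do |writer|
    writer << ["name", "total"] # header row of the output file
    CSV.foreach(input_path, headers: true) do |row|
      total = row["price"].to_i * row["quantity"].to_i
      writer << [row["name"], total]
    end
  end
end

# Small temporary files so the sketch runs on its own.
input = Tempfile.new(["input", ".csv"])
input.write("name,price,quantity\napple,100,3\norange,80,5\n")
input.close

output = Tempfile.new(["output", ".csv"])
output.close

write_totals(input.path, output.path)
result = CSV.read(output.path)
result.each { |line| p line }
```

In the template itself, the same idea would mean uncommenting the writer lines in `main` and passing a second command-line argument as the output path.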