Path processing with takewhile and dropwhile

For searching there are methods like find, and takewhile / dropwhile can't be used the way filter is, so I had always wondered when I would actually use them.

The other day, though, I felt I had finally put them to good use (?), so here it is as a small story.

Long introduction

Suppose there is a path like this, held as a string:

data/output/user/contract.csv

I want to pull out certain parts of it.

Specifically, the part up to output (let's call it base) and the user part (let's call it dir).

For the time being, this implementation did the job:

def base = path.split('/')[0..1].join('/') // data/output
def dir = path.split('/')[2]               // user

Then I wanted one more directory level!

data/ver1.0/output/user/contract.csv

Hmm... so I shifted the indices...

def base = path.split('/')[0..2].join('/') // data/ver1.0/output
def dir = path.split('/')[3]               // user

Let's write the test code for the tool!

This is the dummy data used in the test!

test/data/ver1.0/output/user/contract.csv

I... I guess I just use an if... haha...

// Declared outside the if so that both values are still visible afterwards
def base
def dir
if (isTest) {
  base = path.split('/')[0..3].join('/') // test/data/ver1.0/output
  dir = path.split('/')[4]               // user
} else {
  base = path.split('/')[0..2].join('/') // data/ver1.0/output
  dir = path.split('/')[3]               // user
}

No matter how you look at it, this implementation is too much.

Index access is hard to read, and it cannot cope with changes to the directory structure at all. On top of that, switching the indices with a flag is just ugly!

So takewhile / dropwhile!

If you want everything up to .../output, use takeWhile; if you want the element right after output, dropWhile does the job!

def base = path.split('/').takeWhile {it != 'output'}.join('/').concat('/output') // data/ver1.0/output
def dir = path.split('/').dropWhile {it != 'output'}[1]                           // user

With this, changes to the structure above output are absorbed automatically, at least to some extent (the structure below output does not change, for reasons of the tool's design). That said, this was actually the first time I had written takeWhile, and it does not include output in the result... I'm a little disappointed by that...
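To check that claim against the three layouts from the introduction, here is a minimal sketch in Python. The helper name split_at_marker and the decision to raise ValueError when nothing follows the marker are assumptions made for illustration; they are not part of the original tool.

```python
from itertools import dropwhile, takewhile


def split_at_marker(path, marker="output"):
    """Return (base, dir): base ends at marker, dir is the element after it.

    Hypothetical helper for illustration; assumes '/'-separated paths.
    """
    parts = path.split("/")
    # Everything before the marker; the marker itself is NOT included.
    head = list(takewhile(lambda p: p != marker, parts))
    # The marker and everything after it (empty if the marker is absent).
    tail = list(dropwhile(lambda p: p != marker, parts))
    if len(tail) < 2:
        raise ValueError(f"no directory after {marker!r} in {path!r}")
    base = "/".join(head + [marker])
    return base, tail[1]


for p in (
    "data/output/user/contract.csv",
    "data/ver1.0/output/user/contract.csv",
    "test/data/ver1.0/output/user/contract.csv",
):
    print(split_at_marker(p))
```

The same function handles all three layouts without a flag, because the only thing it relies on is the position of output.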

Bonus

Groovy

Easy to write, as in the example above. The method chain is easy to read, and anonymous functions are easy to write.

Python

import itertools

path = 'data/ver1.0/output/user/contract.csv'

# Python 3: print is a function
taken_iter = itertools.takewhile(lambda x: x != 'output', path.split('/'))
print('/'.join(list(taken_iter)) + '/output')  # data/ver1.0/output

dropped_iter = itertools.dropwhile(lambda x: x != 'output', path.split('/'))
print(list(dropped_iter)[1])                   # user

I like Python, but I'm a little disappointed... Because the xxxwhile functions have to be imported and called as functions rather than as methods, you end up wrapping the result in list() or join(). Written on one line, it ends in ....)))). Anonymous functions are also a bit harder to read than in Groovy or Scala, and overall it's hard to see what the code is doing. Too bad!
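The nesting can be reduced somewhat. As a rough sketch, splitting once and naming the predicate keeps each line short. The names parts, not_output, and dir_name are chosen here for illustration only.

```python
from itertools import dropwhile, islice, takewhile

path = "data/ver1.0/output/user/contract.csv"
parts = path.split("/")


def not_output(part):
    """Predicate shared by takewhile and dropwhile (illustrative name)."""
    return part != "output"


# join() accepts any iterable, so no list() is needed here.
base = "/".join(takewhile(not_output, parts)) + "/output"

# islice skips the 'output' element itself; next() then pulls the element
# right after it without building a list first.
dir_name = next(islice(dropwhile(not_output, parts), 1, None))

print(base)
print(dir_name)
```

It is still function calls rather than a method chain, so the reading order remains inside-out.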

Haskell

import Data.List
import Data.List.Split

main = do
    let path = "data/ver1.0/output/user/contract.csv"

    print $ (intercalate "/"  $ (takeWhile (/= "output") $ splitOn "/" path)) ++ "/output"
    print $ (dropWhile (/= "output") $ splitOn "/" path) !! 1

Because there are several steps, you end up with a lot of (...)... Also, this has nothing to do with the main subject, but why can't splitOn be used without installing a package? I use it quite often!

Haskell 2

import Data.List
import Data.List.Split

main = do
    let path = "data/ver1.0/output/user/contract.csv"

    let reversed = reverse $ splitOn "/" path
    print $ intercalate "/" $ reverse $ dropWhile (/= "output") reversed
    print $ last $ takeWhile (/= "output") $ reversed

With takeWhile, the all-important output is not included in the result, so I tried reversing the elements instead. A bit cleaner, maybe?
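For comparison, the same reversal idea can be sketched in Python. This is only an illustration of the trick, with reversed_parts and dir_name as made-up names.

```python
from itertools import dropwhile, takewhile

path = "data/ver1.0/output/user/contract.csv"
reversed_parts = path.split("/")[::-1]

# dropwhile on the reversed list stops AT 'output', so 'output' stays in
# the result; reversing again restores the original order.
base = "/".join(list(dropwhile(lambda p: p != "output", reversed_parts))[::-1])

# takewhile on the reversed list yields what came after 'output';
# the last of those is the element right after it.
dir_name = list(takewhile(lambda p: p != "output", reversed_parts))[-1]

print(base)
print(dir_name)
```

One difference to keep in mind: if output appeared more than once in the path, the reversed version would anchor on the last occurrence rather than the first.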

I also considered PHP, but there seemed to be no notable difference, so I left it out. Java looked like a hassle (prejudice), so I skipped it. Writing the same thing in several languages is a good way to get to know their functions!

Why do these always end up so long when it's just a small story... But it would be no good if the convenience didn't come across!

Also, I'm not taking the objection that the path, being a string, could simply be split on "output"!
