When automating from data collection to input, correcting half-width and full-width notation fluctuations is one of the issues. Among them, half-width to full-width conversion of katakana alphanumeric characters, which is considered difficult to correct with regular expressions, is performed.
I exchanged data with python using the "clipboard action" of Power Automate Desktop (PAD), and thought about how to correct half-width and full-width notation using the library. In the demo, the half-width and full-width notation of the test data below is converting the address-like data that is messed up to full-width.
Windows10pro 20H2 Power Automate Desktop 2.2.20339.22608 .py file can be executed
Python 3.8.5 pandas 1.1.4 mojimoji 0.0.11
As a Python library, we use a library "mojimoji" that can convert Japanese character strings into half-width and full-width characters at high speed and "Pandas".
Python uses the https://www.python.org/ installer instead of the distribution. The libraries are installed separately. Information for January 2021.
Data is exchanged using the clipboard action of PAD and the function that can input and output Pandas data frames with the clipboard.
Prepare the test data.
number | Full name | Street address |
---|---|---|
1 | Yamada Ai | Minato-ku, Tokyo Ah 1-2 X Building 3F |
2 | Eo Tanaka | 3 Ii-ku, Saitama City, Saitama Prefecture-4 Wybil 1st floor |
3 | Kakiku Nakamura | 102 Set House, 6-3, Uu-ku, Chiba-shi, Chiba |
4 | Keiko Sato | 306, 3-3-3, Ee-cho, Maebashi-shi, Gunma |
5 | Suzuki Sashisu | 2-3, Oomachi, Utsunomiya City, Tochigi Prefecture-4 AAA Building 204C |
Paste the above table data into empty Excel and save it as dummytest.xlsx on your desktop.
You need to install mojimoji and pandas in a python runtime environment that is not venv.
% py -m pip install pandas
% py -m pip install mojimoji
If the clipboard history is turned on, the PAD "Clear Clipboard Contents" action doesn't seem to work, so leave it off. Settings> System> Clipboard.
Get a special folder
Clear the contents of the clipboard
Excel startup path is `` `% SpecialFolderPath%/dummytest.xlsx``` The test data prepared in preparation is specified.
Get the first empty row in a column from an Excel worksheet
Read from Excel Worksheet
Set clipboard text
Write the text to a file I'm writing a Python script here, but the current PAD specification doesn't allow multi-line text to fill in "text to write". Article I wrote before, but I would appreciate it if you could refer to it.
henkan.py
import mojimoji
import pandas as pd
df = pd.read_clipboard()
def zenkaku(x):
return mojimoji.han_to_zen(x)
df["Full-width conversion"] = df.Street address.map(zenkaku)
df.Full-width conversion.to_clipboard(index = None)
Execution of DOS command Execute the python script and store the data frame (table) that has undergone half-width and full-width conversion processing in the clipboard.
Paste the cell into an Excel worksheet The action name is "Paste cell in Excel worksheet", but you can paste the contents of the clipboard into the cell.
Delete files Delete the python script.
Clear the contents of the clipboard
Only this (;'∀')
PowerAutomateDesktopRobin
Folder.GetSpecialFolder SpecialFolder: Folder.SpecialFolder.DesktopDirectory SpecialFolderPath=> SpecialFolderPath
Clipboard.Clear _
Excel.LaunchAndOpen Path: $'''%SpecialFolderPath%/dummytest.xlsx''' Visible: True ReadOnly: False LoadAddInsAndMacros: False Instance=> ExcelInstance
Excel.Advanced.GetFirstFreeRowOnColumn Instance: ExcelInstance Column: 3 FirstFreeRowOnColumn=> FirstFreeRowOnColumn
Excel.ReadCells Instance: ExcelInstance StartColumn: 3 StartRow: 1 EndColumn: 3 EndRow: FirstFreeRowOnColumn - 1 ReadAsText: False FirstLineIsHeader: False RangeValue=> ExcelData
Clipboard.SetText Text: ExcelData
File.WriteText File: $'''%SpecialFolderPath%\\henkan.py''' TextToWrite: $'''import mojimoji
import pandas as pd
df = pd.read_clipboard()
def zenkaku(x):
return mojimoji.han_to_zen(x)
df[\"Full-width conversion\"] = df.Street address.map(zenkaku)
df.Full-width conversion.to_clipboard(index = None)''' AppendNewLine: True IfFileExists: File.IfFileExists.Overwrite Encoding: File.FileEncoding.UTF8
System.RunDOSCommand DOSCommandOrApplication: $'''%SpecialFolderPath%\\henkan.py''' WorkingDirectory: $'''C:\\Users\\Aphrodite\\Desktop''' StandardOutput=> CommandOutput StandardError=> CommandErrorOutput ExitCode=> CommandExitCode
Excel.Advanced.PasteAt Instance: ExcelInstance Column: 4 Row: 1
File.Delete Files: $'''%SpecialFolderPath%\\henkan.py'''
Clipboard.Clear _
Data can be transmitted using the clipboard between PAD and Pandas. One of the challenges (for me) has been solved thanks to the great library of Python. There are many other great libraries, so it seems that python integration can be done in various ways. For that reason, I would like you to modify the text action so that it can use multiple lines in the usual way. (Winautomation is possible) A similar method would be possible for RPA tools that have clipboard-related actions.
If you want to convert manually, you can also use the JIS function of Excel. I chose this method because I don't want to use a format that uses a lot of functions as much as possible, and I thought about realizing it only within PAD using regular expressions, but it didn't work.
I think it's best to be forced at the input stage. (;'∀')
Perform half-width / full-width conversion at high speed with Python
I also tried unicodedata.normalize from Python2 implemented in PAD, but I gave up because the number of characters is strange. Reference Character code hell secret story Episode 3: Unicode normalization with no backtracking