Hello, R users.
R is convenient, isn't it? It's great for machine learning and all kinds of statistical methods, but sometimes I find myself thinking, "I want to write this in Python!!"
So today I'd like to write up how to write (and run) Python in RStudio, partly as a memo to myself.
reticulate is an R package. With it you can:
- Run Python in RStudio
- Install Python packages (modules)
- ***Call R objects from Python***
- ***Call Python objects from R***
Those are the four main things it does, and the last two are the powerful ones. With them you can do things like "crawl (collect data) with Python and turn it into a data frame" → "analyze and visualize in R".
R's "View" function is also a big advantage: pandas data frames are hard to read in Python, which is one of its weak points, and with R you can quickly inspect them visually.
And the most important thing is ***"You don't have to bother starting up Google Colab or Anaconda!!!!"***
Installing and loading the package works the same as for any other R package.
.r
> install.packages("reticulate")
> library(reticulate)
> # Start Python
> repl_python()
>>> # Python is now running
>>>
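You need to have Python installed to run reticulate. It seems you can specify which Python to use with reticulate::use_python(), but that didn't work for me... When I checked, a different version from the Python I had installed seemed to be in use, but I'm not sure why. (Sorry, I'm figuring this out as I write!!) One thing worth checking: use_python() has to be called before reticulate initializes Python in the session (for example, before repl_python()), and py_config() shows which interpreter is actually in use. Please let me know if you get an error at the step above.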
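To see which Python reticulate actually picked up, one option is to ask Python itself from the ">>>" prompt. The following is a minimal sketch using only the standard library; the helper name `describe_interpreter` is made up for illustration and is not part of reticulate.

```python
import sys
import platform


def describe_interpreter():
    """Return the path and version of the Python that is running this code."""
    return {
        "executable": sys.executable,          # path to the interpreter binary
        "version": platform.python_version(),  # version string of this interpreter
        "prefix": sys.prefix,                  # root of the environment in use
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the path or version printed here is not the Python you installed yourself, that would match the symptom described above.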
That's all the preparation you need. Now let's use Python.
***It can be hard to tell whether you are in R or Python, but the console prompt shows it: R is ">" (one) and Python is ">>>" (three).***
.r
> repl_python()
>>>
>>> 1 + 1
2
>>> print("python3")
python3
>>> [i for i in range(4)]
[0, 1, 2, 3]
>>> # Use quit to exit Python
>>> quit
>
> # Back in R
Perfect. Autocompletion (?) for object names and functions works without any trouble, just like in R. (((Quite unlike a certain other tool.
It is extremely difficult (or impossible) to get much done in Python with only the built-in functions, so let's install a package right away.
.r
>>> import pandas as pd
ModuleNotFoundError: No module named 'pandas'
Oops, an error. The problem is not the "as pd" part; the module itself cannot be found. To use packages and modules in reticulate's Python, you need to:
- Install them through reticulate
- Import them in Python
It's a bit complicated.
Let's exit Python first and then install.
.r
>>> quit
>
> # Use py_install()
> py_install("pandas")
If you see a completion message, the install succeeded. Now you can use the package in Python.
.r
> repl_python()
>>>
>>> import pandas as pd
>>>
>>> # The module is now loaded
This does not always work, though. One example is "MeCab", which is used for natural language processing.
.r
> # On the R side
> py_install("mecab")
error: one or more Python packages failed to install [error code 1]
If you get this error, it's a bit of a hassle, but here's what to do: borrow the installation method from conda. First, search for the package you want at https://anaconda.org. Then pick the platform that matches your environment, open it, and look for a command like this:
conda install -c temporary-recipes mecab-python3
This command is meant to be run from a terminal, but we can reuse its parts. It has the form **conda install -c (channel name) (package name)**, so the equivalent call in R is:
.r
> conda_install(channel = "temporary-recipes", packages = "mecab-python3")
>
# All requested packages already installed.
That completes the installation.
.r
> repl_python()
>>>
>>> import MeCab
>>>
It imports without any problem. I'm not deeply familiar with the details, so I can't say why, but note that the package name differs between conda and Python: here you install "mecab-python3" but import "MeCab".
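Since the install name and the import name can differ, it can help to check by import name before running `import`. Here is a small sketch using the standard library's `importlib.util.find_spec`; the helper name `is_importable` is hypothetical, and the list of names is only an example.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Check by the name used in `import`, not the name used to install.
for name in ["json", "pandas", "MeCab"]:
    status = "available" if is_importable(name) else "not installed"
    print(f"{name}: {status}")
```

Anything reported as "not installed" would need py_install() or conda_install() on the R side first.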
This is where reticulate comes into its own.
First, use an object created in R from Python.
.r
> a <- 1
> repl_python()
>>>
>>> # R objects can be called with "r."
>>> r.a
1.0
>>> r.a + 1
2.0
R stores numbers as numeric, while Python has separate int and float types, so some conversion seems to take place. Here the R value 1 arrives in Python as the float 1.0.
Of course, you can use data frames too.
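Because a value like `a <- 1` arrives as a Python float, code that needs an integer (for example `range()` or list indexing) will reject it. Below is a minimal sketch of one way to handle that. It uses a stand-in value in place of `r.a` so that it also runs outside reticulate, and `as_int_if_whole` is a hypothetical helper name.

```python
def as_int_if_whole(value):
    """Convert a float with no fractional part (such as 1.0) to int; leave anything else alone."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


a = 1.0  # stand-in for r.a, which arrived as a float in the session above

print(type(a).__name__)                      # float
print(type(as_int_if_whole(a)).__name__)     # int
print(list(range(as_int_if_whole(a) + 3)))   # [0, 1, 2, 3]
```

Alternatively, writing `a <- 1L` on the R side creates an R integer, which reticulate should hand to Python as an int.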
.r
>>> r.iris
     Sepal.Length  Sepal.Width  Petal.Length  Petal.Width    Species
0             5.1          3.5           1.4          0.2     setosa
1             4.9          3.0           1.4          0.2     setosa
2             4.7          3.2           1.3          0.2     setosa
3             4.6          3.1           1.5          0.2     setosa
4             5.0          3.6           1.4          0.2     setosa
..            ...          ...           ...          ...        ...
145           6.7          3.0           5.2          2.3  virginica
146           6.3          2.5           5.0          1.9  virginica
147           6.5          3.0           5.2          2.0  virginica
148           6.2          3.4           5.4          2.3  virginica
149           5.9          3.0           5.1          1.8  virginica
[150 rows x 5 columns]
>>>
Now let's go the other way. You will probably use this direction more often.
.r
>>> b = 1
>>> quit
>
> # Python objects can be called with "py$"
> py$b
[1] 1
When you bring a data frame from Python into R, the index information may apparently be lost, but with a little ingenuity you can use this however you like.
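As a rough illustration of the "collect in Python, analyze in R" flow, here is a sketch that parses a small HTML snippet with the standard library and gathers the result into one list per column. The sample HTML, the class name `LinkCollector`, and the variable name `links` are all made up for this example; a real crawl would download the page first.

```python
from html.parser import HTMLParser

# Hypothetical sample page; in real use this string would come from a download.
SAMPLE_HTML = """
<ul>
  <li><a href="/posts/1">First post</a></li>
  <li><a href="/posts/2">Second post</a></li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Collect the href and text of every <a> tag into two parallel lists."""

    def __init__(self):
        super().__init__()
        self.columns = {"url": [], "title": []}
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self.columns["url"].append(dict(attrs).get("href"))
            self.columns["title"].append("")

    def handle_data(self, data):
        # Only keep text that appears inside an <a> tag.
        if self._in_link:
            self.columns["title"][-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False


collector = LinkCollector()
collector.feed(SAMPLE_HTML)
links = collector.columns  # one list per column, all the same length
print(links)
```

After `quit`, `py$links` on the R side should hold the same data. I would expect a dict of lists to come across as a named list, so something like as.data.frame(py$links) may be one way to get a data frame, though I have not verified that here. For a pandas data frame, calling reset_index() on the Python side before handing it over turns the index into an ordinary column, which is one way to avoid losing it.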
reticulate is such a convenient package, but I couldn't find many sites that summarize it, so I put together this quick overview.
Depending on your environment, some of this may not work. If that happens, please leave a comment. I don't know much about it myself, so let's learn together lol
As I wrote at the beginning, you can play to the strengths of both R and Python, and **"crawl with Python, collect and process the data, then pass it to R for analysis and visualization"** becomes easy to do.
If you have RStudio, you don't have to install some unfamiliar snake and start it up every time, so this should be a very good tool for R users who want to get started with Python.