I couldn't find a Japanese article covering v0.11.0, so I'm leaving this as a memo.
This article is __for beginners__. In it, a Unity beginner works through one of the official ML-Agents tutorials, covering __reinforcement learning__, one branch of machine learning.
__We will make something like this. :arrow_up:__
It is aimed at people who can already do basic things in Unity but have not tried machine learning yet. Rather than focusing on theory, the goal is to let you experience it hands-on.
*This article is current as of November 13, 2019.* ML-Agents is being upgraded rapidly, so always check for the latest information. ~~[The book published last year](https://www.amazon.co.jp/Unity%E3%81%A7%E3%81%AF%E3%81%98%E3%82%81%E3%82%8B%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E3%83%BB%E5%BC%B7%E5%8C%96%E5%AD%A6%E7%BF%92-Unity-ML-Agents%E5%AE%9F%E8%B7%B5%E3%82%B2%E3%83%BC%E3%83%A0%E3%83%97%E3%83%AD%E3%82%B0%E3%83%A9%E3%83%9F%E3%83%B3%E3%82%B0-%E5%B8%83%E7%95%99%E5%B7%9D-%E8%8B%B1%E4%B8%80/dp/48624648181) was no help~~ (This year's transitions ⇒ January 2019: *v0.6* ➡ April: *v0.8* ➡ October: *v0.10* ➡ November: *v0.11*)
Here are the essential terms for doing machine learning in Unity: __"Academy", "Brain", and "Agent"__.
Basically, within the environment defined by the "Academy" in Unity, the "Brain" controls the actions taken by the "Agent". This time we will run reinforcement learning externally via TensorFlow (a Python framework), then load the generated neural network model into Unity and run it. (This is a simple tutorial, so we won't touch the Academy very much.)
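As a rough preview, the Agent we will write later in this article has the following shape: an *Agent* subclass that overrides a reset method, an observation method, and an action method (method names as used in the v0.11 API and shown in the later sections; the bodies here are placeholders).

```csharp
using MLAgents;

// Rough sketch only - the real implementation is built up step by step below.
public class PreviewAgent : Agent
{
    public override void AgentReset() { /* reset the episode */ }

    public override void CollectObservations() { /* pass observations with AddVectorObs(...) */ }

    public override void AgentAction(float[] vectorAction, string textAction) { /* act and assign rewards */ }
}
```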
__If you are new to ML-Agents, you can skip this part.__ I had used v0.8.x and v0.9.x and was initially lost because *Brain Parameters* no longer exists; if you only ever look at the current version, you will probably be fine.
- *Broadcast Hub* has been removed.
- *Brain Scriptable Objects* have been removed. ⇒ Replaced by the *Behavior Parameters* component.
- Major changes to the *Visual Observation* setup.
- The gRPC definitions have been reworked.
- Online BC (behavioral cloning) training has been removed.
Please install the following first.
- Unity (version 2017.4 or later should be fine)
Roller Ball
Copy the `ml-agents-master\UnitySDK\Assets\ML-Agents` folder into your project.
- Create a plane with *3D Object > Plane*.
- Name the created *Plane* `Floor`.
- Set the *Transform* of `Floor` to
  Position = (0, 0, 0)
  Rotation = (0, 0, 0)
  Scale = (1, 1, 1)
- Play with *Element* under *Inspector* > *Materials* to change its appearance to your liking.
- Create a cube with *3D Object > Cube*.
- Name the created *Cube* `Target`.
- Set the *Transform* of `Target` to
  Position = (3, 0.5, 3)
  Rotation = (0, 0, 0)
  Scale = (1, 1, 1)
- As with `Floor`, you can change its appearance to your liking.
- Create a sphere with *3D Object > Sphere*.
- Name the created *Sphere* `RollerAgent`.
- Set the *Transform* of `RollerAgent` to
  Position = (0, 0.5, 0)
  Rotation = (0, 0, 0)
  Scale = (1, 1, 1)
- As before, change its appearance to your liking. If you want it to look like a ball, the `CheckerSquare` material works well.
- Add a *Rigidbody* with *Add Component*.
- Create an empty *GameObject* with *Create Empty*.
- Name the created *GameObject* `Academy`.
Next, we will write the contents in C#.
- With `Academy` selected in the *Hierarchy* window, use *Add Component -> New Script* to create a script named `RollerAcademy.cs`.
- Rewrite the contents of `RollerAcademy.cs` as follows. You can erase the original contents.
```csharp:RollerAcademy.cs
using MLAgents;

public class RollerAcademy : Academy { }
```
With just this declaration, the basic functionality (the observe-decide-act cycle, omitted here) is inherited from the *Academy* class by the *RollerAcademy* class, so two lines are enough.
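For reference only: this tutorial never customizes the Academy, but if you wanted to, the base class exposes overridable hooks. The sketch below assumes the v0.11 `Academy` class provides `AcademyReset()` and `AcademyStep()` as virtual methods; treat it as an illustration, not something this tutorial requires.

```csharp
using MLAgents;

// Illustrative sketch only - not needed for this tutorial.
public class CustomAcademy : Academy
{
    public override void AcademyReset()
    {
        // Environment-wide reset logic (e.g. rebuilding obstacles) would go here.
    }

    public override void AcademyStep()
    {
        // Per-step, environment-wide logic would go here.
    }
}
```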
Select `RollerAgent` in the *Hierarchy* window and create a script named `RollerAgent.cs` with *Add Component -> New Script*.
Rewrite the contents of `RollerAgent.cs` as follows.
```csharp:RollerAgent.cs
using MLAgents;

public class RollerAgent : Agent { }
```
As with *Academy*, it imports the *MLAgents* namespace and specifies *Agent* as the base class to inherit from.
__This is the basic procedure for incorporating ML-Agents into Unity.__ Next, we will add the mechanism that lets the ball learn, through reinforcement learning, to head for the box.
Rewrite the contents of `RollerAgent.cs` as follows.
```csharp:RollerAgent.cs
using UnityEngine;
using MLAgents;

public class RollerAgent : Agent
{
    Rigidbody rBody;

    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public Transform Target;

    public override void AgentReset()
    {
        if (this.transform.position.y < 0)
        {
            // Reset angular velocity and velocity
            this.rBody.angularVelocity = Vector3.zero;
            this.rBody.velocity = Vector3.zero;
            // Return the agent to its initial position
            this.transform.position = new Vector3(0, 0.5f, 0);
        }
        // Relocate the target
        Target.position = new Vector3(Random.value * 8 - 4, 0.5f,
                                      Random.value * 8 - 4);
    }
}
```
Here, the code handles:
- __relocating the target and resetting for the next episode__ when `RollerAgent` reaches the box (`Target`)
- __returning the agent to its starting position__ when `RollerAgent` falls off the floor (`Floor`)
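One small note (my own reading of the code, not something stated in the tutorial): `Random.value` returns a float in [0, 1], so `Random.value * 8 - 4` is uniformly distributed over [-4, 4] on each axis, which keeps `Target` on the default 10 x 10 *Plane*.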
`Rigidbody` is the component used by Unity's physics simulation. Here it is used to drive the agent.
The *Position, Rotation, Scale* values are held in the `Transform`. By declaring the field as `public`, you can assign the `Transform` of *Target* from the *Inspector*.
Add the following inside the class in `RollerAgent.cs`.
```csharp
public override void CollectObservations()
{
    // Target and agent positions
    AddVectorObs(Target.position);
    AddVectorObs(this.transform.position);
    // Agent velocity
    AddVectorObs(rBody.velocity.x);
    AddVectorObs(rBody.velocity.z);
}
```
Here we are doing the __processing that collects the observed data into a feature vector__.
The 3D coordinates of *Target* and *Agent*, plus the *Agent*'s *x* and *z* velocities, are passed to the neural network as an 8-dimensional vector in total. ~~Saying "8 dimensions" sounds cool~~
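To make that count explicit, here is how the eight dimensions break down (just an annotated restatement of the `CollectObservations()` above):

```csharp
AddVectorObs(Target.position);          // 3 values: target x, y, z
AddVectorObs(this.transform.position);  // 3 values: agent x, y, z
AddVectorObs(rBody.velocity.x);         // 1 value
AddVectorObs(rBody.velocity.z);         // 1 value
                                        // total: 3 + 3 + 1 + 1 = 8
```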
Add the following `AgentAction()` processing to `RollerAgent.cs`.
```csharp
public float speed = 10;

public override void AgentAction(float[] vectorAction, string textAction)
{
    // Action
    Vector3 controlSignal = Vector3.zero;
    controlSignal.x = vectorAction[0];
    controlSignal.z = vectorAction[1];
    rBody.AddForce(controlSignal * speed);

    // Reward
    // Get the distance from the ball (agent) to the box (target)
    float distanceToTarget = Vector3.Distance(this.transform.position,
                                              Target.position);
    // When the box (target) is reached
    if (distanceToTarget < 1.42f)
    {
        // Give the reward and finish the episode
        SetReward(1.0f);
        Done();
    }
    // When the agent falls off the floor
    if (this.transform.position.y < 0)
    {
        Done();
    }
}
```
Here, the __"action"__ reads the two continuous values as forces applied in the X and Z directions and moves the agent, while the learning algorithm handles the __"reward"__: it is granted when the agent reaches the box safely and withheld when the agent falls.
The `AddForce` function applies a physical force to an object with a *Rigidbody* component in order to move it. The reward is given and the episode is reset only when the computed distance drops below the threshold used to judge that the target has been reached.
To learn well in more complicated situations, it is effective to hand out penalties (negative rewards) as well as rewards. ~~(In v0.5.x, a reward of `-1` was given when the agent fell off the floor, but it seems this was judged unnecessary in the latest version.)~~
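For illustration only, penalties could be added inside `AgentAction()` along these lines. This is not part of this tutorial's final code, the values are arbitrary, and `AddReward` is assumed to be the accumulating counterpart of the `SetReward` call used above in the same Agent API.

```csharp
// Illustrative sketch only: adding penalties inside AgentAction() (values are arbitrary).
AddReward(-0.001f);          // small per-step penalty, encourages reaching the target quickly
if (this.transform.position.y < 0)
{
    SetReward(-1.0f);        // penalize falling off the floor, as older versions did
    Done();
}
```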
```csharp:RollerAgent.cs
using UnityEngine;
using MLAgents;

public class RollerAgent : Agent
{
    Rigidbody rBody;

    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public Transform Target;
    public float speed = 10;

    public override void AgentReset()
    {
        if (this.transform.position.y < 0)
        {
            // Reset angular velocity and velocity
            this.rBody.angularVelocity = Vector3.zero;
            this.rBody.velocity = Vector3.zero;
            // Return the agent to its initial position
            this.transform.position = new Vector3(0, 0.5f, 0);
        }
        // Relocate the target
        Target.position = new Vector3(Random.value * 8 - 4, 0.5f,
                                      Random.value * 8 - 4);
    }

    public override void CollectObservations()
    {
        // Target and agent positions
        AddVectorObs(Target.position);
        AddVectorObs(this.transform.position);
        // Agent velocity
        AddVectorObs(rBody.velocity.x);
        AddVectorObs(rBody.velocity.z);
    }

    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // Action
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = vectorAction[0];
        controlSignal.z = vectorAction[1];
        rBody.AddForce(controlSignal * speed);

        // Reward
        // Get the distance from the ball (agent) to the box (target)
        float distanceToTarget = Vector3.Distance(this.transform.position,
                                                  Target.position);
        // When the box (target) is reached
        if (distanceToTarget < 1.42f)
        {
            // Give the reward and finish the episode
            SetReward(1.0f);
            Done();
        }
        // When the agent falls off the floor
        if (this.transform.position.y < 0)
        {
            Done();
        }
    }
}
```
- Select `RollerAgent` in the *Hierarchy* window and change two items in the `RollerAgent (Script)` component:
  Decision Interval = 10
  Target = Target (Transform)
- Add *Behavior Parameters* with *Add Component* and change the settings as follows:
  Behavior Name = RollerBallBrain
  Vector Observation Space Size = 8
  Vector Action Space Type = Continuous
  Vector Action Space Size = 2
Also, according to the [official documentation](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Create-New.md), training takes roughly 300,000 steps if you keep the default parameters. This environment is not that complicated, so let's rewrite a couple of parameters to bring it down to fewer than 20,000 steps.
- Open `trainer_config.yaml` in *ml-agents-master-0.11 > config* with an editor (VS Code or Notepad) and rewrite the values of the following items:

```yaml
batch_size: 10
buffer_size: 100
```
Now you are ready to train. We have finally made it this far.
Before reinforcement learning, let's manually check whether the environment created so far works properly.
Additionally, implement the following method inside the class in `RollerAgent.cs`.
```csharp
public override float[] Heuristic()
{
    var action = new float[2];
    action[0] = Input.GetAxis("Horizontal");
    action[1] = Input.GetAxis("Vertical");
    return action;
}
```
`Horizontal` accepts the horizontal input axis and `Vertical` accepts the vertical input axis, so you can now control the agent with the "W", "A", "S", "D" or arrow keys.
Finally, in the *Inspector* of `RollerAgent`, select the *Use Heuristic* checkbox under *Behavior Parameters*.
Press Play to run it. If you can confirm that it works by key input, it is successful.
Now, let's move on to the learning step.
First, launch Anaconda Prompt. You can find it immediately by searching from the start menu (Win key).
`conda create -n ml-agents python=3.6`
Enter this to build the virtual environment.[^1]
You will be asked `Proceed ([y]/n)?`, so enter `y`. Next, enter `activate ml-agents` to switch into the virtual environment.[^2] Make sure `(ml-agents)` appears at the beginning of the command line.
Move to the ML-Agents folder with `cd <ml-agents folder>`.[^3]
Run `pip install mlagents` to install the library that ML-Agents uses. (It takes a few minutes.) This installation also pulls in dependencies such as TensorFlow and Jupyter. After a while, it is fine once the installation completes successfully.
Next, go to `<ml-agents folder>\ml-agents-envs` with `cd` and run `pip install -e .` to install the package. Then go to `<ml-agents folder>\ml-agents` and run `pip install -e .` there as well.
This completes the preparation on the Python side.
__:collision: [Note]: The TensorFlowSharp plugin is not used in v0.6.x or later.__ If you have been following old books, we recommend that you create a fresh virtual environment.
Up to ML-Agents v0.5.0, TensorFlowSharp was used to communicate with Python, but do not use it with the latest version. If you do, the following error occurs.
```
No model was present for the Brain 3DBallLearning.
UnityEngine.Debug:LogError(Object)
MLAgents.LearningBrain:DecideAction() (at Assets/ML-Agents/Scripts/LearningBrain.cs:191)
MLAgents.Brain:BrainDecideAction() (at Assets/ML-Agents/Scripts/Brain.cs:80)
MLAgents.Academy:EnvironmentStep() (at Assets/ML-Agents/Scripts/Academy.cs:601)
MLAgents.Academy:FixedUpdate() (at Assets/ML-Agents/Scripts/Academy.cs:627)
```
Well, finally we will start training. The dream AI experience is just around the corner. Let's do our best.
Enter `cd <ml-agents folder>` to move to the downloaded folder.
Run `mlagents-learn config/trainer_config.yaml --run-id=firstRun --train`.[^4] At the bottom of the command line, __INFO:mlagents.envs: Start training by pressing the Play button in the Unity Editor.__ (go back to the Unity editor and press the Play button to start training) is displayed.
Go back to the Unity screen, __uncheck *Use Heuristic* in *Behavior Parameters*__, and press the :arrow_forward: button.
If the ball starts chasing the box, training has started normally.
__If you leave the Play button unpressed for a while, a timeout error occurs; in that case, run the same command again.__
A log is written to the console every 1,000 steps. If you want to stop partway through, you can interrupt with Ctrl + C. (If you deliberately stop early, you can create a "weak AI".)
__Step__ is the number of training steps (trials), __Mean Reward__ is the average reward earned, and __Std of Reward__ is the standard deviation of the reward (a measure of how much it varies).
After training, a `RollerBallBrain.nn` file is created under `<ml-agents folder>\models\<id name~>`.
Now let's try running the generated neural network model.
Copy the `RollerBallBrain.nn` file you just generated into the *Assets* folder in Unity's Project window. (It can be placed anywhere in the project.)
Then click the :radio_button: button at the far right of the *Model* item in the *Inspector* of `RollerAgent` and select the imported `.nn` file. (Be careful not to pick the wrong file if another `.nn` file with the same name exists.)
Also, if *Use Heuristic* in *Behavior Parameters* is left checked, it will not work properly. __Be sure to uncheck it after the manual test.__
Now let's press :arrow_forward: Play.
__If the ball starts chasing the box, you have succeeded.__
In Anaconda Prompt, run `tensorboard --logdir=summaries --port=6006`. If you open [localhost:6006](http://localhost:6006/) in your browser, you can see the progress of training as graphs.
- If you can read and write C# in more depth, you will be able to __fine-tune the algorithm yourself__.
- In reinforcement learning, __the AI's "intelligence" can be graded weak, medium, strong, and so on by the number of training steps__.
- The version is renewed frequently, so __information goes stale quickly__.
- ~~Learning is far faster than a human. The power of science is amazing!!~~
Thanks to ready-made assets, even beginners now live in a convenient world where a simple machine learning demo can be reproduced in a day. How did it feel to actually try it? I hope this gives you a chance to become interested in machine learning.
If you notice any awkward expressions or mistakes, I would appreciate it if you could point them out. Also, if you found this article helpful, a "like" would be __encouraging__.
Thank you for your cooperation.
Below are articles from those who came before me that were very helpful in learning. I would like to take this opportunity to express my __acknowledgments__.
- Unity-Technologies Official Documentation (GitHub)
- ml-agents Migration Guide (GitHub)
- [Unity: How to use ML-Agents in September 2019 (ver 0.9.0/0.9.1/0.9.2)](https://www.fast-system.jp/unity-ml-agents-version-0-9-0-howto/)
- [Unity] I tried the reinforcement learning tutorial (ML-Agents v0.8.1)
- [Create a new learning environment with Unity's ML-Agents (0.6.0a version)](http://am1tanaka.hatenablog.com/entry/2019/01/18/212915#%E5%AD%A6%E7%BF%92%E5%8A%B9%E6%9E%9C%E3%82%92%E9%AB%98%E3%82%81%E3%82%8B%E3%81%8A%E3%81%BE%E3%81%91)
[^1]: You can change the *"ml-agents"* part to any name you like.
[^2]: Activate it with whatever virtual environment name you set.
[^3]: The directory where *ml-agents-master* was downloaded in the Preparation step.
[^4]: You can change the *"firstRun"* part to any name you like.