Create an image recognition application that recognizes numbers written on the screen, using PyTorch Mobile and Kotlin. **We create everything from scratch: both the model and the Android side for image recognition.** The project is divided into two parts, **model creation (Python)** and **Android implementation (Kotlin)**.
This Android Studio project on GitHub: https://github.com/SY-BETA/NumberRecognitionApp/tree/master
If you have not built the model in Python yet, see the companion article "Create an image recognition application that discriminates the numbers written on the screen with Android (PyTorch Mobile) [Network creation]" (/ 077b5b8d3163fb7de800). Or, if you are an Android engineer without a Python environment, or you are simply tired of building models, a trained model is available: download it from GitHub: https://github.com/SY-BETA/CNN_PyTorch/blob/master/CNNModel.pt
What we make this time is steps 5 and 6 ↓
Now that the model has been created, we run inference on it on Android using PyTorch Mobile, and implement the ability to write numbers on the screen.
Add the following to the app-level gradle dependencies (as of January 25, 2020):
dependencies {
implementation 'org.pytorch:pytorch_android:1.4.0'
implementation 'org.pytorch:pytorch_android_torchvision:1.4.0'
}
Set up a SurfaceView for writing characters.
The layout xml file ↓
activity_main.xml
<androidx.constraintlayout.widget.ConstraintLayout
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <FrameLayout
        android:id="@+id/frameLayout"
        android:layout_width="230dp"
        android:layout_height="230dp"
        android:layout_marginStart="24dp"
        android:layout_marginTop="24dp"
        android:layout_marginEnd="24dp"
        android:layout_marginBottom="24dp"
        android:background="@android:color/darker_gray"
        app:layout_constraintBottom_toTopOf="@+id/sampleImg"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/text1">

        <SurfaceView
            android:id="@+id/surfaceView"
            android:layout_width="match_parent"
            android:layout_height="match_parent" />
    </FrameLayout>

    <Button
        android:id="@+id/resetBtn"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:text="reset"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toStartOf="@+id/inferBtn"
        app:layout_constraintHorizontal_bias="0.5"
        app:layout_constraintStart_toStartOf="parent" />

    <Button
        android:id="@+id/inferBtn"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:text="inference"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.5"
        app:layout_constraintStart_toEndOf="@+id/resetBtn" />

    <TextView
        android:id="@+id/text1"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginStart="16dp"
        android:layout_marginTop="24dp"
        android:text="The written numbers are"
        android:textSize="40sp"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent" />

    <TextView
        android:id="@+id/resultNum"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginStart="8dp"
        android:text="?"
        android:textAppearance="@style/TextAppearance.AppCompat.Body2"
        android:textColor="@color/colorAccent"
        android:textSize="55sp"
        app:layout_constraintBottom_toBottomOf="@+id/text1"
        app:layout_constraintStart_toEndOf="@+id/text1"
        app:layout_constraintTop_toTopOf="@+id/text1" />

    <ImageView
        android:id="@+id/sampleImg"
        android:layout_width="100dp"
        android:layout_height="100dp"
        app:layout_constraintBottom_toTopOf="@+id/resetBtn"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:srcCompat="@mipmap/ic_launcher_round" />

    <TextView
        android:id="@+id/textView"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:text="After 28 × 28 resizing ↓"
        app:layout_constraintBottom_toTopOf="@+id/sampleImg"
        app:layout_constraintEnd_toEndOf="@+id/sampleImg"
        app:layout_constraintStart_toStartOf="@+id/sampleImg" />
</androidx.constraintlayout.widget.ConstraintLayout>
We use the SurfaceView for drawing. For that, create a class that extends SurfaceView, implements SurfaceHolder.Callback, and controls the SurfaceView.
MNIST, the data the model was trained on, is white digits on a black background, so we draw in the same colors.
First, the variables that hold the various states (fine to copy and paste as-is):
DrawSurfaceView.kt
class DrawSurfaceView : SurfaceView, SurfaceHolder.Callback {
    private var surfaceHolder: SurfaceHolder? = null
    private var paint: Paint? = null
    private var path: Path? = null
    var color: Int? = null
    var prevBitmap: Bitmap? = null /** Bitmap that holds what has been drawn so far **/
    private var prevCanvas: Canvas? = null
    private var canvas: Canvas? = null
    var width: Int? = null
    var height: Int? = null

    constructor(context: Context, surfaceView: SurfaceView, surfaceWidth: Int, surfaceHeight: Int) : super(context) {
        // SurfaceHolder
        surfaceHolder = surfaceView.holder
        /// Size of the SurfaceView
        width = surfaceWidth
        height = surfaceHeight
        /// Callback
        surfaceHolder!!.addCallback(this)
        /// Paint settings
        paint = Paint()
        color = Color.WHITE // Draw with a white line
        paint!!.color = color as Int
        paint!!.style = Paint.Style.STROKE
        paint!!.strokeCap = Paint.Cap.ROUND
        paint!!.isAntiAlias = false
        paint!!.strokeWidth = 50F
    }
}
When creating an instance of this class from MainActivity, be sure to pass in the width and height of the SurfaceView from the layout file.
Create a data class that saves the path and color when drawing.
DrawSurfaceView.kt
//// Holds a Path and the color used to draw that path
data class pathInfo(
    var path: Path,
    var color: Int
)
Implement the SurfaceHolder.Callback methods, which initialize the canvas and bitmap.
DrawSurfaceView.kt
override fun surfaceCreated(holder: SurfaceHolder?) {
    /// Initialize bitmap and canvas
    initializeBitmap()
}

override fun surfaceChanged(holder: SurfaceHolder?, format: Int, width: Int, height: Int) {
}

override fun surfaceDestroyed(holder: SurfaceHolder?) {
    /// Recycle the bitmap (memory-leak prevention)
    prevBitmap!!.recycle()
}

/// Initialize the bitmap and canvas
private fun initializeBitmap() {
    if (prevBitmap == null) {
        prevBitmap = Bitmap.createBitmap(width!!, height!!, Bitmap.Config.ARGB_8888)
    }
    if (prevCanvas == null) {
        prevCanvas = Canvas(prevBitmap!!)
    }
    // Black background
    prevCanvas!!.drawColor(Color.BLACK)
}
Here the Bitmap is recycled when the SurfaceView is destroyed. Leaving the bitmap around risks a memory leak, so recycle it once it is no longer used.
Create a function that draws on the canvas.
DrawSurfaceView.kt
///// Drawing function
private fun draw(pathInfo: pathInfo) {
    /// Lock and get the canvas
    canvas = surfaceHolder!!.lockCanvas()
    //// Clear the canvas
    canvas!!.drawColor(0, PorterDuff.Mode.CLEAR)
    /// Draw the previous bitmap onto the canvas
    canvas!!.drawBitmap(prevBitmap!!, 0F, 0F, null)
    //// Draw the path
    paint!!.color = pathInfo.color
    canvas!!.drawPath(pathInfo.path, paint!!)
    /// Unlock
    surfaceHolder!!.unlockCanvasAndPost(canvas)
}
/// Dispatch to a handler for each action when the screen is touched
fun onTouch(event: MotionEvent): Boolean {
    when (event.action) {
        MotionEvent.ACTION_DOWN -> touchDown(event.x, event.y)
        MotionEvent.ACTION_MOVE -> touchMove(event.x, event.y)
        MotionEvent.ACTION_UP -> touchUp(event.x, event.y)
    }
    return true
}

///// Holds the points to draw in the Path class
/// Processing for ACTION_DOWN
private fun touchDown(x: Float, y: Float) {
    path = Path()
    path!!.moveTo(x, y)
}

/// Processing for ACTION_MOVE
private fun touchMove(x: Float, y: Float) {
    path!!.lineTo(x, y)
    draw(pathInfo(path!!, color!!))
}

/// Processing for ACTION_UP
private fun touchUp(x: Float, y: Float) {
    path!!.lineTo(x, y)
    draw(pathInfo(path!!, color!!))
    prevCanvas!!.drawPath(path!!, paint!!)
}
Method to initialize the drawn bitmap
DrawSurfaceView.kt
/// Reset method
fun reset() {
    /// Re-initialize the bitmap and clear the canvas
    initializeBitmap()
    canvas = surfaceHolder!!.lockCanvas()
    canvas?.drawColor(0, PorterDuff.Mode.CLEAR)
    surfaceHolder!!.unlockCanvasAndPost(canvas)
}
This completes DrawSurfaceView. Using it from MainActivity.kt gives us the drawing feature: get the size of the SurfaceView from the layout, create a DrawSurfaceView instance, and wire it up. The reset button handler is also set here.
MainActivity.kt
class MainActivity : AppCompatActivity() {
    var surfaceViewWidth: Int? = null
    var surfaceViewHeight: Int? = null
    var drawSurfaceView: DrawSurfaceView? = null

    /// Extension function:
    /// get the size of a View after it has been laid out, using ViewTreeObserver
    private inline fun <T : View> T.afterMeasure(crossinline f: T.() -> Unit) {
        viewTreeObserver.addOnGlobalLayoutListener(object :
            ViewTreeObserver.OnGlobalLayoutListener {
            override fun onGlobalLayout() {
                if (width > 0 && height > 0) {
                    viewTreeObserver.removeOnGlobalLayoutListener(this)
                    f()
                }
            }
        })
    }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        /// Use ViewTreeObserver:
        /// get the size of the surfaceView after it has been created
        surfaceView.afterMeasure {
            surfaceViewWidth = surfaceView.width
            surfaceViewHeight = surfaceView.height
            //// Create and set up the DrawSurfaceView instance
            drawSurfaceView = DrawSurfaceView(
                applicationContext,
                surfaceView,
                surfaceViewWidth!!,
                surfaceViewHeight!!
            )
            /// Set the touch listener
            surfaceView.setOnTouchListener { v, event -> drawSurfaceView!!.onTouch(event) }
        }

        /// Reset button
        resetBtn.setOnClickListener {
            drawSurfaceView!!.reset() /// Call the bitmap-initialization method
            sampleImg.setImageResource(R.color.colorPrimaryDark)
            resultNum.text = "?"
        }
    }
}
If everything has gone well so far, you should be able to draw on the screen.
If something goes wrong, copy and paste everything from GitHub: https://github.com/SY-BETA/NumberRecognitionApp/tree/master
From the next section we finally use PyTorch Mobile.
Create an assets folder in your project (right-click `app` in the project pane on the left → New → Folder → Assets Folder). Put the trained model into it: the one created in the [Network creation] companion article, or the one downloaded at the beginning.
Next, make it possible to get a path to files in that assets folder. Add the following to `onCreate` in `MainActivity.kt`.
MainActivity.kt
//// Function that returns a file path for an asset
fun assetFilePath(context: Context, assetName: String): String {
    val file = File(context.filesDir, assetName)
    if (file.exists() && file.length() > 0) {
        return file.absolutePath
    }
    context.assets.open(assetName).use { inputStream ->
        FileOutputStream(file).use { outputStream ->
            val buffer = ByteArray(4 * 1024)
            var read: Int
            while (inputStream.read(buffer).also { read = it } != -1) {
                outputStream.write(buffer, 0, read)
            }
            outputStream.flush()
        }
        return file.absolutePath
    }
}

/// Load the trained model
val module = Module.load(assetFilePath(this, "CNNModel.pt"))
Note that loading images and models from the assets folder is surprisingly cumbersome; the helper above copies the asset into internal storage and returns that file's path.
When the inference button is pressed, the loaded trained model runs forward propagation, and the result is retrieved and displayed. Add the following to `onCreate` in `MainActivity.kt`.
MainActivity.kt
// Inference button click
inferBtn.setOnClickListener {
    // Get the drawn image (bitmap)
    val bitmap = drawSurfaceView!!.prevBitmap!!
    // Resize to the input size of the trained model
    val bitmapResized = Bitmap.createScaledBitmap(bitmap, 28, 28, true)
    /// Convert to a tensor and normalize
    val inputTensor = TensorImageUtils.bitmapToFloat32Tensor(
        bitmapResized,
        TensorImageUtils.TORCHVISION_NORM_MEAN_RGB, TensorImageUtils.TORCHVISION_NORM_STD_RGB
    )
    /// Inference (forward propagation) and its result
    val outputTensor = module.forward(IValue.from(inputTensor)).toTensor()
    val scores = outputTensor.dataAsFloatArray
    // Show the resized image
    sampleImg.setImageBitmap(bitmapResized)
    /// Variables that hold the best score
    // The index of the max score = the digit predicted by the model (by how the model was built)
    var maxScore: Float = 0F
    var maxScoreIdx = -1
    for (i in scores.indices) {
        Log.d("scores", scores[i].toString()) // Log the score list (it is interesting to see which digits come close)
        if (scores[i] > maxScore) {
            maxScore = scores[i]
            maxScoreIdx = i
        }
    }
    // Display the inference result
    resultNum.text = "$maxScoreIdx"
}
The size of `inputTensor` is **(1, 3, 28, 28)**, so the model must be created with this as its input size.
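For reference, here is a minimal, hypothetical sketch of the Python-side export that produces a `CNNModel.pt` compatible with that input size. The network below is a stand-in, not the actual CNN from the [Network creation] article; what matters is the (1, 3, 28, 28) input contract and the `torch.jit.trace` + `save` steps:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in network with the required input size
# (the real architecture is in the companion article / CNN_PyTorch repo).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # 3 input channels (RGB), 28x28 preserved
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 28x28 -> 14x14
    nn.Flatten(),
    nn.Linear(16 * 14 * 14, 10),                 # 10 classes (digits 0-9)
)
model.eval()

# Trace with a dummy input of the exact shape the app will send.
dummy = torch.rand(1, 3, 28, 28)
traced = torch.jit.trace(model, dummy)
traced.save("CNNModel.pt")  # drop this file into app/src/main/assets/

# Sanity check: one forward pass yields 10 scores.
print(traced(dummy).shape)  # torch.Size([1, 10])
```

A model saved this way can be loaded on Android with `Module.load` as shown above.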
If you made it this far, your first app should be done!! Write numbers, run predictions, and play around with it.
Overall, the hard parts were changing the number of channels when building the network and adjusting the network's input size. The Android side is just forward propagation, so the outcome really depends on whether you can build the network. Also, PyTorch Mobile had only just come out, and I was surprised that it was upgraded after about two weeks.
It is fun to have the numbers you write on the screen recognized. This time it was MNIST handwritten digits, but trying transfer learning on other data would be interesting too.
This code is on Github. Github: https://github.com/SY-BETA/NumberRecognitionApp/tree/master
Trained CNN model Github: https://github.com/SY-BETA/CNN_PyTorch/blob/master/CNNModel.pt