Environment

python 3.7.7
CUDA 10.0.130
gcc 7.3.0

Environment setup

  git clone https://github.com/tensorflow/tpu.git
  sudo apt-get install -y python-tk
  pip install tensorflow-gpu==1.15
  pip install --user Cython matplotlib opencv-python-headless pyyaml Pillow
  pip install 'git+https://github.com/cocodataset/cocoapi#egg=pycocotools&subdirectory=PythonAPI'

Inference with a pretrained model

Download any model from the model zoo: https://github.com/tensorflow/tpu/blob/master/models/official/detection/MODEL_ZOO.md
Run inference

Example coco_label_map.csv

  1:person
  2:bicycle
  3:car

Each line has the form category_id:category. To change the classes, create a CSV file in the format above.
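
As a minimal sketch of producing such a file for a custom class list, the Python below writes a label map in the category_id:category form and reads it back. The helper names and the file name my_label_map.csv are illustrative assumptions, not part of the tpu repository.

```python
import os
import tempfile


def write_label_map(path, classes):
    """Write one `category_id:category` line per class, sorted by id."""
    with open(path, "w", encoding="utf-8") as f:
        for category_id, name in sorted(classes.items()):
            f.write(f"{category_id}:{name}\n")


def read_label_map(path):
    """Parse `category_id:category` lines back into a dict."""
    label_map = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            category_id, name = line.split(":", 1)
            label_map[int(category_id)] = name
    return label_map


# Hypothetical class list; replace with your own ids and names.
classes = {1: "person", 2: "bicycle", 3: "car"}
label_map_path = os.path.join(tempfile.mkdtemp(), "my_label_map.csv")
write_label_map(label_map_path, classes)
loaded = read_label_map(label_map_path)
```

The resulting path is what you would pass to --label_map_file.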

Command

  python ~/tpu/models/official/detection/inference.py \
    --model="retinanet" \
    --image_size=640 \
    --checkpoint_path="./detection_retinanet_50/model.ckpt" \
    --label_map_file="./retinanet/tpu/models/official/detection/datasets/coco_label_map.csv" \
    --image_file_pattern="path/to/input/file" \
    --output_html="path/to/output/file" \
    --max_boxes_to_draw=10 \
    --min_score_threshold=0.05

The output is written as HTML.
The inference script in the official repository only works on still images.

3. Train on your own data

Download a pretrained model: https://github.com/tensorflow/tpu/blob/master/models/official/detection/MODEL_ZOO.md
Create the input data

The input is expected to be in TFRecord format.
Prepare the dataset in COCO format (images + *.json).
Convert it to TFRecord with the following script:

  #!/bin/bash
  # Convert a COCO-format dataset (images + annotation JSON) into TFRecord shards.

  TRAIN_IMAGE_DIR="path/to/train/images/dir"
  TRAIN_OBJ_ANNOTATIONS_FILE="path/to/train/file"
  OUTPUT_DIR="path/to/output/dir"
  VAL_IMAGE_DIR="path/to/test/images/dir"
  VAL_OBJ_ANNOTATIONS_FILE="path/to/test/file"

  # Directory holding this script; create_coco_tf_record.py is expected next to it.
  SCRIPT_DIR=$(dirname "$(readlink -f "$0")")

  function create_train_dataset() {
    # PYTHONPATH is given as a per-command prefix so that python3 actually sees it.
    PYTHONPATH="tf-models:tf-models/research" \
    python3 "${SCRIPT_DIR}/create_coco_tf_record.py" \
      --logtostderr \
      --include_masks \
      --image_dir="${TRAIN_IMAGE_DIR}" \
      --object_annotations_file="${TRAIN_OBJ_ANNOTATIONS_FILE}" \
      --output_file_prefix="${OUTPUT_DIR}/train" \
      --num_shards=256
  }

  function create_val_dataset() {
    PYTHONPATH="tf-models:tf-models/research" \
    python3 "${SCRIPT_DIR}/create_coco_tf_record.py" \
      --logtostderr \
      --include_masks \
      --image_dir="${VAL_IMAGE_DIR}" \
      --object_annotations_file="${VAL_OBJ_ANNOTATIONS_FILE}" \
      --output_file_prefix="${OUTPUT_DIR}/val" \
      --num_shards=32
  }

  create_train_dataset
  create_val_dataset
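
Before running the conversion it can help to confirm that the annotation JSON really is COCO-shaped. The following is a rough sketch of such a check, assuming only the standard top-level COCO keys (images, annotations, categories); the function name and the tiny in-memory sample are made up for illustration.

```python
import json

REQUIRED_KEYS = ("images", "annotations", "categories")


def check_coco_annotations(coco):
    """Return a list of problems found in a COCO-style annotation dict."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in coco]
    if problems:
        return problems
    image_ids = {img["id"] for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    for ann in coco["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
    return problems


# Tiny hypothetical sample: annotation 2 points at a category that does not exist.
sample = json.loads("""
{
  "images": [{"id": 10, "file_name": "a.jpg", "width": 640, "height": 480}],
  "categories": [{"id": 1, "name": "person"}],
  "annotations": [
    {"id": 1, "image_id": 10, "category_id": 1, "bbox": [0, 0, 10, 10]},
    {"id": 2, "image_id": 10, "category_id": 7, "bbox": [5, 5, 20, 20]}
  ]
}
""")
sample_problems = check_coco_annotations(sample)
```

For a real dataset, load the file with json.load and pass the resulting dict to the same function; an empty list means no problems were found by this check.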

3. Run training

  MODEL_DIR="<path to the directory to store model files>"
  TRAIN_FILE_PATTERN="<path to the TFRecord training data>"
  EVAL_FILE_PATTERN="<path to the TFRecord validation data>"
  VAL_JSON_FILE="<path to the validation annotation JSON file>"
  RESNET_CHECKPOINT="<path to trained model>"
  python ~/tpu/models/official/detection/main.py \
    --model="retinanet" \
    --model_dir="${MODEL_DIR?}" \
    --mode=train \
    --eval_after_training=True \
    --use_tpu=False \
    --params_override="{train: { checkpoint: { path: ${RESNET_CHECKPOINT?}, prefix: resnet50/ }, train_file_pattern: ${TRAIN_FILE_PATTERN?} }, eval: { val_json_file: ${VAL_JSON_FILE?}, eval_file_pattern: ${EVAL_FILE_PATTERN?} }}"
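
If you launch training from a Python wrapper, the override string can be assembled from a nested dict rather than typed by hand. This is only a sketch: it assumes the flag accepts the same brace-delimited text as the command above, the to_override helper is hypothetical, and the paths are placeholders.

```python
def to_override(value):
    """Render a nested dict in the brace-delimited form used in the command above.

    Values are inserted unquoted, so paths containing ',' or ': ' would need
    extra care that this sketch does not attempt.
    """
    if isinstance(value, dict):
        items = ", ".join(f"{key}: {to_override(val)}" for key, val in value.items())
        return "{" + items + "}"
    return str(value)


# Placeholder paths; substitute your own checkpoint and TFRecord locations.
overrides = {
    "train": {
        "checkpoint": {"path": "/data/resnet50/model.ckpt", "prefix": "resnet50/"},
        "train_file_pattern": "/data/tfrecord/train*",
    },
    "eval": {
        "val_json_file": "/data/val.json",
        "eval_file_pattern": "/data/tfrecord/val*",
    },
}
params_override = to_override(overrides)
```

The string in params_override would then be passed as the value of --params_override.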

You can gauge the training speed from the log printed while training runs:

  INFO:tensorflow:examples/sec: 0.622754
  INFO:tensorflow:global_step/sec: 0.078258
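
These two rates are enough for a quick estimate. The sketch below parses the two log lines above and derives the number of examples processed per step and the wall-clock time for 100 steps; the parse_rates helper is an illustrative assumption, not something provided by TensorFlow.

```python
import re

LOG = """\
INFO:tensorflow:examples/sec: 0.622754
INFO:tensorflow:global_step/sec: 0.078258
"""


def parse_rates(log_text):
    """Return the last examples/sec and global_step/sec values seen in a log."""
    rates = {}
    for name, value in re.findall(r"(examples/sec|global_step/sec): ([0-9.]+)", log_text):
        rates[name] = float(value)
    return rates


rates = parse_rates(LOG)
# Examples consumed per training step (an estimate of the batch size).
examples_per_step = rates["examples/sec"] / rates["global_step/sec"]
# Wall-clock minutes needed for 100 steps at this rate.
minutes_per_100_steps = 100 / rates["global_step/sec"] / 60
print(f"examples per step: {examples_per_step:.2f}")
print(f"minutes per 100 steps: {minutes_per_100_steps:.1f}")
```

With the values logged here this works out to roughly 8 examples per step and a little over 21 minutes per 100 steps.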

4. Inference

Run:

  python ~/tpu/models/official/detection/inference.py \
      --model="retinanet" \
      --image_size=640 \
      --checkpoint_path="path/to/input" \
      --label_map_file="path/to/label" \
      --image_file_pattern="path/to/input/file" \
      --output_html="path/to/output/file" \
      --max_boxes_to_draw=10 \
      --min_score_threshold=0.05

- Model before training on the custom data (inference results to be uploaded)
- Model after training (inference results to be uploaded)

5. Evaluation

Run:

For config_file, use the params.yaml that was written out during training.

  python ${RETINA_ROOT}/evaluate_model.py \
    --model="retinanet" \
    --checkpoint_path="path/to/input/file" \
    --config_file="${CONFIG_PATH}" \
    --params_override="${PARAMS_PATH}" \
    --dump_predictions_only=True \
    --predictions_path="path/to/output/file"

Bonus: performance comparison

While I was at it, I compared RetinaNet with another model I had built previously. Both used a batch size of 8 and were trained and evaluated on the custom dataset.

Training speed

Comparison of the average time taken for 100 iterations.

model	time
retinanet	about 21 min
ttfnet	about 228 min

Accuracy

Comparison of AP and inference results after 2000 iterations.

model	mAP
retinanet	96.35
ttfnet	79.78

Implementing RetinaNet on CentOS