I do research using DNNs.
After a year, I feel I've finally settled on a good practice for project management, so I'll share it here.
GitHub is basically enough on its own; OneDrive serves as insurance in case a file suddenly disappears.
program/
├ dataset/
│ ├ dev/
│ └ test/
└ src/
├ common/
│ ├ hoge.py
│ ├ fuga.py
│ ├ ...
├ method_xxx/
│ ├ output/
│ │ ├ YYYYMMDD_ID/
│ │ │ ├ loss/
│ │ │ │ ├ training_loss.npy
│ │ │ │ └ validation_loss.npy
│ │ │ ├ prediction/
│ │ │ │ ├ img/
│ │ │ │ ├ wav/
│ │ │ │ ├ ...
│ │ │ ├ condition.json
│ │ │ ├ model.pth
│ │ │ └ network.txt
│ │ ├ YYYYMMDD_ID/
│ │ ├ ...
│ ├ generate_dataset.py
│ ├ dataset_loader.py
│ ├ dnn_model.py
│ ├ dnn_training.py
│ ├ dnn_evaluation.py
│ ├ training_config.json
│ └ evaluation_config.json
├ method_zzz/
├ ...
method_xxx / method_zzz
: DNN models and datasets are built in various ways, so a folder is created for each method.
common
: Contains modules shared across the methods.
method_xxx/output/
: Training and inference results are written out here.
YYYYMMDD_ID/network.txt
: Describes the network structure of the trained model. The PyTorch model instance is written out as-is.
Exporting the DNN model structure
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, in_units, hidden_units, out_units):
        super(Model, self).__init__()
        self.l1 = nn.Linear(in_units, hidden_units)
        self.a1 = nn.ReLU()
        self.l2 = nn.Linear(hidden_units, hidden_units)
        self.a2 = nn.ReLU()

    def forward(self, x):
        x = self.a1(self.l1(x))
        y = self.a2(self.l2(x))
        return y

# Export the network structure of the DNN model (.txt)
model = Model(in_size, hidden_size, out_size)
with open(OUT_DIR_NAME + '/network.txt', 'w') as f:
    f.write(str(model))
network.txt
Model(
  (l1): Linear(in_features=8546, out_features=682, bias=True)
  (a1): ReLU()
  (l2): Linear(in_features=682, out_features=682, bias=True)
  (a2): ReLU()
)
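Though not shown above, the `YYYYMMDD_ID` run folders can be created automatically at the start of each run. A minimal sketch, assuming a time-based ID; the helper name and the exact naming rule are my assumptions, not part of the original setup:

```python
import os
from datetime import datetime

def make_run_dir(output_root):
    """Create a fresh output/YYYYMMDD_ID/ folder for one training run.

    The ID part is taken from the current time so runs started on the
    same day do not collide (this naming rule is an assumption).
    """
    run_id = datetime.now().strftime('%Y%m%d_%H%M%S')
    run_dir = os.path.join(output_root, run_id)
    os.makedirs(os.path.join(run_dir, 'loss'))
    os.makedirs(os.path.join(run_dir, 'prediction'))
    return run_dir
```

`network.txt`, `model.pth`, and the loss arrays can then all be written under the returned `run_dir`.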
Training parameters, evaluation settings, and experimental results are managed in JSON files. The contents of each are as follows.
training_config.json
{
  "method": "A detailed explanation of the method is described here.",
  "parameters": {
    "max_epochs": 1000,
    "batch_size": 128,
    "optimizer": "adam",
    "learning_rate": 0.001,
    "patience": 50,
    "norm": true
  },
  "datasets": {
    "data1": "../../dataset/dev/<file1_name>",
    "data2": "../../dataset/dev/<file2_name>"
  }
}
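At the start of training, the script can simply read this file and unpack it. A small sketch; the function and variable names are mine:

```python
import json

def load_training_config(path):
    """Read training_config.json and return (parameters, dataset paths)."""
    with open(path, 'r') as f:
        config = json.load(f)
    return config['parameters'], config['datasets']
```

`dnn_training.py` can then pull `max_epochs`, `batch_size`, and so on out of the returned parameters, so all run settings live in one place.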
evaluation_config.json
{
  "target_dir": "YYYYMMDD_ID",
  "src_dir": {
    "file1": "../../dataset/test/<file1_name>",
    "file2": "../../dataset/test/<file2_name>"
  },
  "output_dir": "OUTPUT_DIR_NAME"
}
condition.json
{
  "method": "Explanation of the method",
  "parameters": {
    "max_epochs": 345,
    "batch_size": 128,
    "optimizer": "adam",
    "learning_rate": 0.001,
    "patience": 50,
    "norm": true
  },
  "datasets": {
    "data1": "../../dataset/dev/<file1_name>",
    "data2": "../../dataset/dev/<file2_name>"
  },
  "dnn": {
    "input_size": 8546,
    "output_size": 682
  },
  "loss": {
    "training_loss": 0.087654,
    "validation_loss": 0.152140
  }
}
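One way to produce `condition.json` is to copy the training config and append the results actually observed during the run. Judging from `max_epochs` being 345 here versus 1000 in `training_config.json`, it appears to record the epoch where training actually stopped; that reading, and the merging logic below, are my assumptions:

```python
import json
import os

def write_condition(run_dir, training_config, stopped_epoch,
                    input_size, output_size, train_loss, val_loss):
    """Write condition.json: the training config plus the actual results.

    max_epochs is overwritten with the epoch at which training stopped
    (e.g. via early stopping), and the network sizes and final losses
    are appended.
    """
    condition = dict(training_config)                       # shallow copy
    condition['parameters'] = dict(training_config['parameters'])
    condition['parameters']['max_epochs'] = stopped_epoch
    condition['dnn'] = {'input_size': input_size, 'output_size': output_size}
    condition['loss'] = {'training_loss': train_loss,
                         'validation_loss': val_loss}
    with open(os.path.join(run_dir, 'condition.json'), 'w') as f:
        json.dump(condition, f, indent=2)
```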
For evaluating DNN performance, `evaluation_config.json` is read first, the target run folder is identified from `target_dir`, and the parameters are then obtained from the `condition.json` inside it.
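The read-then-lookup flow just described can be sketched as follows. The function name, the `output_root` parameter, and the assumption that run folders live directly under `output/` are mine:

```python
import json
import os

def load_evaluation_target(eval_config_path, output_root='output'):
    """Read evaluation_config.json, locate the run folder via target_dir,
    then load the condition.json stored inside that folder."""
    with open(eval_config_path, 'r') as f:
        eval_config = json.load(f)
    run_dir = os.path.join(output_root, eval_config['target_dir'])
    with open(os.path.join(run_dir, 'condition.json'), 'r') as f:
        condition = json.load(f)
    return eval_config, condition, run_dir
```

`dnn_evaluation.py` can then rebuild the model from `condition["dnn"]` and `condition["parameters"]` and load `model.pth` from the same `run_dir`.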
After much trial and error, I settled on this management method, but please let me know if there is a better one.