Metaflow | https://metaflow.org/
train.py
import json

from metaflow import FlowSpec, Parameter, step


class TrainingPipeline(FlowSpec):
    # Training config passed on the command line as a JSON string.
    param_config_str = Parameter('config',
                                 help='Training config json str.',
                                 default='{}')

    @step
    def start(self):
        # Parse the JSON config; attributes set on self are saved as artifacts.
        self.config = json.loads(self.param_config_str)
        self.a = 0
        self.next(self.step1)

    @step
    def step1(self):
        self.a = 1
        self.next(self.step2)

    @step
    def step2(self):
        self.a = 2
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == '__main__':
    TrainingPipeline()
python train.py run
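The flow declares a parameter named 'config', which Metaflow exposes as a --config option when the flow is run. As a minimal sketch, the command line for passing a config could be built from a Python dict as shown below. The helper name build_run_command and the config keys (lr, epochs) are hypothetical and only serve as illustration; they are not part of the original flow.

```python
import json
import shlex


def build_run_command(config, script="train.py"):
    """Return a shell command that runs the flow with `config` passed as JSON."""
    # sort_keys makes the generated string deterministic.
    config_json = json.dumps(config, sort_keys=True)
    # shlex.quote protects the braces and double quotes in the JSON from the shell.
    return "python {} run --config {}".format(script, shlex.quote(config_json))


# Hypothetical config values, for illustration only.
command = build_run_command({"lr": 0.01, "epochs": 5})
print(command)
```

Inside the start step, json.loads(self.param_config_str) would then turn the quoted string back into a dict.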
After the run finishes, a ".metaflow" directory is created in the directory where you ran the flow. Save the following script in the same directory that contains ".metaflow", and replace [RUN_ID] with the run ID that Metaflow prints when the flow starts.
debug.py
from metaflow import Step, namespace

# Disable namespace filtering so that runs from any namespace are visible.
namespace(None)
data_start = Step('TrainingPipeline/[RUN_ID]/start').task.data
print('Step start : a -> {}'.format(data_start.a))
data1 = Step('TrainingPipeline/[RUN_ID]/step1').task.data
print('Step step1 : a -> {}'.format(data1.a))
data2 = Step('TrainingPipeline/[RUN_ID]/step2').task.data
print('Step step2 : a -> {}'.format(data2.a))
python debug.py
Step start : a -> 0
Step step1 : a -> 1
Step step2 : a -> 2
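Instead of writing the run ID into the script by hand, the most recent run can be looked up through Metaflow's client API. The sketch below assumes that the flow has been run at least once and that Metaflow is installed. The helper names step_pathspec and print_artifact_a are hypothetical, and the Metaflow import is deferred into the function only so that the pathspec helper can be used on its own.

```python
def step_pathspec(flow_name, run_id, step_name):
    """Build the 'Flow/run/step' pathspec string that metaflow.Step accepts."""
    return "{}/{}/{}".format(flow_name, run_id, step_name)


def print_artifact_a(flow_name, step_names):
    """Print artifact `a` for each named step of the flow's latest run."""
    # Imported lazily so step_pathspec works even without Metaflow installed.
    from metaflow import Flow, Step, namespace

    namespace(None)  # same global namespace setting as in debug.py
    run = Flow(flow_name).latest_run  # most recent run of this flow
    for name in step_names:
        data = Step(step_pathspec(flow_name, run.id, name)).task.data
        print("Step {} : a -> {}".format(name, data.a))
```

Calling print_artifact_a('TrainingPipeline', ['start', 'step1', 'step2']) from the directory that contains ".metaflow" should print the same kind of lines as debug.py, but for the latest run.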
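Incidentally, artifacts are not limited to simple values: objects such as pandas DataFrames can be assigned to self in a step and read back in the same way.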