Task Parameters
Luigi Parameter
We can set parameters for tasks. Also please refer to Task Settings section.
class Task(gokart.TaskOnKart):
param_a: luigi.Parameter = luigi.Parameter()
param_c: luigi.ListParameter = luigi.ListParameter()
param_d: luigi.IntParameter = luigi.IntParameter(default=1)
Please refer to luigi document for a list of parameter types.
Gokart Parameter
There are also parameters provided by gokart.
gokart.TaskInstanceParameter
gokart.ListTaskInstanceParameter
gokart.ExplicitBoolParameter
gokart.TaskInstanceParameter
The TaskInstanceParameter() executes a task using the results of a task as dynamic parameters.
class TaskA(gokart.TaskOnKart[str]):
def run(self):
self.dump('Hello')
class TaskB(gokart.TaskOnKart[str]):
require_task: gokart.TaskInstanceParameter = gokart.TaskInstanceParameter()
def requires(self):
return self.require_task
def run(self):
task_a = self.load()
self.dump(','.join([task_a, 'world']))
task = TaskB(require_task=TaskA())
print(gokart.build(task)) # Hello,world
Helps to create a pipeline.
gokart.ListTaskInstanceParameter
The ListTaskInstanceParameter() is list of TaskInstanceParameter.
gokart.ExplicitBoolParameter
The ExplicitBoolParameter() is parameter for explicitly specified value.
luigi.BoolParameter already has “explicit parsing” feature, but also still has implicit behavior like follows.
$ python main.py Task --param
# param will be set as True
$ python main.py Task
# param will be set as False
ExplicitBoolParameter solves these problems on parameters from command line.
gokart.SerializableParameter
The SerializableParameter() is a parameter for any object that can be serialized and deserialized.
This parameter is particularly useful when you want to pass a complex object or a set of parameters to a task.
The object must implement the following methods:
gokart_serialize: Serialize the object to a string. This serialized string must uniquely identify the object to enable task caching. Note that it is not required for deserialization.gokart_deserialize: Deserialize the object from a string, typically used for CLI arguments.
Example
import json
from dataclasses import dataclass
import gokart
@dataclass(frozen=True)
class Config:
foo: int
# The `bar` field does not affect the result of the task.
# Similar to `luigi.Parameter(significant=False)`.
bar: str
def gokart_serialize(self) -> str:
# Serialize only the `foo` field since `bar` is irrelevant for caching.
return json.dumps({'foo': self.foo})
@classmethod
def gokart_deserialize(cls, s: str) -> 'Config':
# Deserialize the object from the provided string.
return cls(**json.loads(s))
class DummyTask(gokart.TaskOnKart):
config: gokart.SerializableParameter[Config] = gokart.SerializableParameter(object_type=Config)
def run(self):
# Save the `config` object as part of the task result.
self.dump(self.config)