gokart package
Submodules
gokart.file_processor module
-
class
gokart.file_processor.
BinaryFileProcessor
[source]¶ Bases:
gokart.file_processor.FileProcessor
Pass bytes to this processor
    figure_binary = io.BytesIO()
    plt.savefig(figure_binary)
    figure_binary.seek(0)
    BinaryFileProcessor().dump(figure_binary.read())
gokart.info module
-
gokart.info.
make_tree_info
(task: gokart.task.TaskOnKart, indent: str = '', last: bool = True, details: bool = False, abbr: bool = True, visited_tasks: Optional[Set[str]] = None, ignore_task_names: Optional[List[str]] = None) → str[source]¶ Return a string representation of the tasks, their statuses/parameters in a dependency tree format
This function has moved to gokart.tree.task_info.make_task_info_as_tree_str. It remains here for backward compatibility.
Parameters:
- task: TaskOnKart – Root task.
- details: bool – Whether or not to output details.
- abbr: bool – Whether or not to abbreviate the information of tasks that have already appeared.
- ignore_task_names: Optional[List[str]] – List of task names to ignore.
Returns:
- tree_info: str – Formatted task dependency tree.
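To illustrate how parameters such as indent, last, abbr, visited_tasks and ignore_task_names can cooperate when a dependency tree is rendered as text, here is a minimal standalone sketch. It does not use gokart and is not gokart's implementation: the Node class, the render_tree function and the branch glyphs are hypothetical stand-ins chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """Hypothetical stand-in for a task; this is not a gokart class."""
    name: str
    status: str = "COMPLETE"
    children: List["Node"] = field(default_factory=list)


def render_tree(node: Node, indent: str = "", last: bool = True, abbr: bool = True,
                visited: Optional[Set[str]] = None,
                ignore_names: Optional[List[str]] = None) -> str:
    """Render a dependency tree as a string (illustrative only)."""
    visited = set() if visited is None else visited
    ignore_names = ignore_names or []
    # The branch glyph depends on whether this node is the last sibling.
    branch = "+-- " if last else "|-- "
    line = f"{indent}{branch}{node.name} ({node.status})"
    if abbr and node.name in visited:
        # Already printed once: emit a stub instead of the whole subtree.
        return line + " ..."
    visited.add(node.name)
    # Children of a non-last node keep a vertical bar in their indent.
    child_indent = indent + ("    " if last else "|   ")
    children = [c for c in node.children if c.name not in ignore_names]
    lines = [line]
    for i, child in enumerate(children):
        is_last = i == len(children) - 1
        lines.append(render_tree(child, child_indent, is_last, abbr, visited, ignore_names))
    return "\n".join(lines)


# A shared dependency (LoadData) appears twice in the tree.
load = Node("LoadData")
tree = Node("Train", "PENDING", [Node("Features", children=[load]), load])
print(render_tree(tree))
```

In this sketch the second occurrence of LoadData is abbreviated because its name is already in the visited set, and passing ignore_names=["Features"] drops that branch entirely.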
gokart.parameter module
-
class
gokart.parameter.
ExplicitBoolParameter
(*args, **kwargs)[source]¶ Bases:
luigi.parameter.BoolParameter
-
class
gokart.parameter.
ListTaskInstanceParameter
(default=<object object>, is_global=False, significant=True, description=None, config_path=None, positional=True, always_in_help=False, batch_method=None, visibility=<ParameterVisibility.PUBLIC: 0>)[source]¶ Bases:
luigi.parameter.Parameter
-
class
gokart.parameter.
TaskInstanceParameter
(default=<object object>, is_global=False, significant=True, description=None, config_path=None, positional=True, always_in_help=False, batch_method=None, visibility=<ParameterVisibility.PUBLIC: 0>)[source]¶ Bases:
luigi.parameter.Parameter
gokart.s3_config module
gokart.target module
-
class
gokart.target.
ModelTarget
(file_path: str, temporary_directory: str, load_function, save_function, redis_params: gokart.redis_lock.RedisParams)[source]¶ Bases:
gokart.target.TargetOnKart
-
class
gokart.target.
SingleFileTarget
(target: luigi.target.FileSystemTarget, processor: gokart.file_processor.FileProcessor, redis_params: gokart.redis_lock.RedisParams)[source]¶ Bases:
gokart.target.TargetOnKart
gokart.task module
-
class
gokart.task.
TaskOnKart
(*args, **kwargs)[source]¶ Bases:
luigi.task.Task
This is a wrapper class of luigi.Task.
The key methods of a TaskOnKart are:
- make_target(): makes an output target with a relative file path.
- make_model_target(): makes an output target for models that generate multiple files when saved.
- load(): loads the input files of this task.
- dump(): saves an object as the output of this task.
-
FIX_RANDOM_SEED_VALUE_NONE_MAGIC_NUMBER
= -42497368¶
-
cache_unique_id
= <gokart.parameter.ExplicitBoolParameter object>¶
-
clone
(cls=None, **kwargs)[source]¶ Creates a new instance from an existing instance where some of the args have changed.
There are at least two scenarios where this is useful (see test/clone_test.py):
- remove a lot of boilerplate when you have recursive dependencies and lots of args
- there’s task inheritance and some logic is on the base class
Parameters: - cls –
- kwargs –
Returns:
-
complete
() → bool[source]¶ If the task has any outputs, return True if all outputs exist. Otherwise, return False.
However, you may freely override this method with custom logic.
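The rule described for complete can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the library's code: FakeTarget and is_complete are hypothetical names, and the sketch covers only the documented rule, not any additional checks a TaskOnKart may perform.

```python
from typing import Iterable, Protocol


class HasExists(Protocol):
    """Anything exposing exists(), as output targets do."""
    def exists(self) -> bool: ...


class FakeTarget:
    """Hypothetical in-memory target used only for this illustration."""
    def __init__(self, present: bool) -> None:
        self._present = present

    def exists(self) -> bool:
        return self._present


def is_complete(outputs: Iterable[HasExists]) -> bool:
    outputs = list(outputs)
    if not outputs:
        # No outputs at all: nothing can show that the task has run.
        return False
    # Complete only when every single output exists.
    return all(target.exists() for target in outputs)


print(is_complete([FakeTarget(True), FakeTarget(True)]))
print(is_complete([FakeTarget(True), FakeTarget(False)]))
print(is_complete([]))
```

A subclass that overrides complete with custom logic would replace this rule entirely, so any override should still return a plain bool.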
-
delete_unnecessary_output_files
= <luigi.parameter.BoolParameter object>¶
-
fail_on_empty_dump
= <gokart.parameter.ExplicitBoolParameter object>¶
-
fix_random_seed_methods
= <luigi.parameter.ListParameter object>¶
-
fix_random_seed_value
= <luigi.parameter.IntParameter object>¶
-
load_data_frame
(target: Union[None, str, gokart.target.TargetOnKart] = None, required_columns: Optional[Set[str]] = None, drop_columns: bool = False) → pandas.core.frame.DataFrame[source]¶
-
local_temporary_directory
= <luigi.parameter.Parameter object>¶
-
make_large_data_frame_target
(relative_file_path: str = None, use_unique_id: bool = True, max_byte=67108864) → gokart.target.TargetOnKart[source]¶
-
make_model_target
(relative_file_path: str, save_function: Callable[[Any, str], None], load_function: Callable[[str], Any], use_unique_id: bool = True)[source]¶ Make a target for models that generate multiple files when saved, e.g. gensim.Word2Vec, TensorFlow, and so on.
Parameters: - relative_file_path – A file path to save.
- save_function – A function to save a model. This takes a model object and a file path.
- load_function – A function to load a model. This takes a file path and returns a model object.
- use_unique_id – If this is true, add a unique id to the file base name.
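The following is a minimal sketch of a save_function and load_function pair with the shapes given in the signature, Callable[[Any, str], None] and Callable[[str], Any]. The toy model (a plain dict), the function names and the file suffixes are assumptions made for illustration; the sketch exercises the pair on its own with a temporary directory and does not show how gokart stores the resulting files.

```python
import json
import os
import tempfile
from typing import Any, Dict


def save_toy_model(model: Dict[str, Any], path: str) -> None:
    """Takes a model object and a file path; writes two files derived from the path."""
    with open(path + ".weights.json", "w") as f:
        json.dump(model["weights"], f)
    with open(path + ".vocab.json", "w") as f:
        json.dump(model["vocab"], f)


def load_toy_model(path: str) -> Dict[str, Any]:
    """Takes a file path; returns the model object rebuilt from both files."""
    with open(path + ".weights.json") as f:
        weights = json.load(f)
    with open(path + ".vocab.json") as f:
        vocab = json.load(f)
    return {"weights": weights, "vocab": vocab}


# Round-trip check, independent of any task framework.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_model")
    original = {"weights": [0.1, 0.2], "vocab": ["a", "b"]}
    save_toy_model(original, path)
    restored = load_toy_model(path)

print(restored == original)
```

Within a task, such a pair would be passed as the save_function and load_function arguments of make_model_target, together with a relative_file_path.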
-
make_target
(relative_file_path: str = None, use_unique_id: bool = True, processor: Optional[gokart.file_processor.FileProcessor] = None) → gokart.target.TargetOnKart[source]¶
-
modification_time_check
= <luigi.parameter.BoolParameter object>¶
-
redis_fail_on_collision
= <luigi.parameter.BoolParameter object>¶
-
redis_host
= <luigi.parameter.OptionalParameter object>¶
-
redis_port
= <luigi.parameter.OptionalParameter object>¶
-
redis_timeout
= <luigi.parameter.IntParameter object>¶
-
rerun
= <luigi.parameter.BoolParameter object>¶
-
serialized_task_definition_check
= <luigi.parameter.BoolParameter object>¶
-
should_dump_supplementary_log_files
= <gokart.parameter.ExplicitBoolParameter object>¶
-
significant
= <luigi.parameter.BoolParameter object>¶
-
store_index_in_feather
= <gokart.parameter.ExplicitBoolParameter object>¶
-
strict_check
= <luigi.parameter.BoolParameter object>¶
-
workspace_directory
= <luigi.parameter.Parameter object>¶
gokart.workspace_management module
gokart.zip_client module
-
class
gokart.zip_client.
LocalZipClient
(file_path: str, temporary_directory: str)[source]¶ Bases:
gokart.zip_client.ZipClient
-
path
¶
-