execution Package¶
This package contains all modules related directly to the execution
basenoderun Module¶
-
class
fastr.execution.basenoderun.BaseNodeRun[source]¶ Bases:
fastr.abc.updateable.Updateable,fastr.abc.serializable.Serializable-
NODE_RUN_MAP= {'AdvancedFlowNode': <class 'fastr.execution.flownoderun.AdvancedFlowNodeRun'>, 'ConstantNode': <class 'fastr.execution.sourcenoderun.ConstantNodeRun'>, 'FlowNode': <class 'fastr.execution.flownoderun.FlowNodeRun'>, 'MacroNode': <class 'fastr.execution.macronoderun.MacroNodeRun'>, 'Node': <class 'fastr.execution.noderun.NodeRun'>, 'SinkNode': <class 'fastr.execution.sinknoderun.SinkNodeRun'>, 'SourceNode': <class 'fastr.execution.sourcenoderun.SourceNodeRun'>}¶
-
NODE_RUN_TYPES= {'AdvancedFlowNodeRun': <class 'fastr.execution.flownoderun.AdvancedFlowNodeRun'>, 'ConstantNodeRun': <class 'fastr.execution.sourcenoderun.ConstantNodeRun'>, 'FlowNodeRun': <class 'fastr.execution.flownoderun.FlowNodeRun'>, 'MacroNodeRun': <class 'fastr.execution.macronoderun.MacroNodeRun'>, 'NodeRun': <class 'fastr.execution.noderun.NodeRun'>, 'SinkNodeRun': <class 'fastr.execution.sinknoderun.SinkNodeRun'>, 'SourceNodeRun': <class 'fastr.execution.sourcenoderun.SourceNodeRun'>}¶
-
__abstractmethods__= frozenset({'_update'})¶
-
__module__= 'fastr.execution.basenoderun'¶
-
environmentmodules Module¶
This module contains a class to interact with EnvironmentModules
-
class
fastr.execution.environmentmodules.EnvironmentModules(protected=None)[source]¶ Bases:
object
This class can control the module environments in Python. It can list, load and unload environment modules. These modules are then used if subprocess is called from Python.
-
__dict__= mappingproxy({'__module__': 'fastr.execution.environmentmodules', '__doc__': '\n This class can control the module environments in python. It can list, load\n and unload environmentmodules. These modules are then used if subprocess is\n called from python.\n ', '_module_settings_loaded': False, '_module_settings_warning': 'Cannot find Environment Modules home directory (environment variables not setup properly?)', '__init__': <function EnvironmentModules.__init__>, '__repr__': <function EnvironmentModules.__repr__>, 'sync': <function EnvironmentModules.sync>, '_sync_loaded': <function EnvironmentModules._sync_loaded>, '_sync_avail': <function EnvironmentModules._sync_avail>, '_module': <function EnvironmentModules._module>, 'totuple_modvalue': <staticmethod object>, 'tostring_modvalue': <staticmethod object>, '_run_commands_string': <function EnvironmentModules._run_commands_string>, 'loaded_modules': <property object>, 'avail_modules': <property object>, 'avail': <function EnvironmentModules.avail>, 'isloaded': <function EnvironmentModules.isloaded>, 'load': <function EnvironmentModules.load>, 'unload': <function EnvironmentModules.unload>, 'reload': <function EnvironmentModules.reload>, 'swap': <function EnvironmentModules.swap>, 'clear': <function EnvironmentModules.clear>, '__dict__': <attribute '__dict__' of 'EnvironmentModules' objects>, '__weakref__': <attribute '__weakref__' of 'EnvironmentModules' objects>})¶
-
__init__(protected=None)[source]¶ Create the environmentmodules control object
- Parameters
protected (list) – list of modules that should never be unloaded
- Returns
newly created EnvironmentModules
-
__module__= 'fastr.execution.environmentmodules'¶
-
__weakref__¶ list of weak references to the object (if defined)
-
avail(namestart=None)[source]¶ Print available modules in same way as commandline version
- Parameters
namestart – filter on modules that start with namestart
-
property
avail_modules¶ List of available modules
-
clear()[source]¶ Unload all modules (except the protected modules as they cannot be unloaded). This should result in a clean environment.
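The interaction between `protected` and `clear()` can be illustrated with a small in-memory stand-in. This is only a sketch: `ModuleSetSketch` and the module names are hypothetical, it does not call Environment Modules at all, and silently skipping protected modules on unload is an assumption rather than documented behaviour.

```python
class ModuleSetSketch:
    """Stand-in for a module environment; it only tracks names in memory."""

    def __init__(self, protected=None):
        # Modules listed here are never unloaded, mirroring the `protected` argument.
        self.protected = set(protected or [])
        self.loaded = set(self.protected)

    def load(self, module):
        self.loaded.add(module)

    def unload(self, module):
        # Assumption: protected modules are skipped silently.
        if module not in self.protected:
            self.loaded.discard(module)

    def clear(self):
        # Unload everything except the protected modules.
        for module in list(self.loaded):
            self.unload(module)


env = ModuleSetSketch(protected=["python/3.9"])  # hypothetical module name
env.load("gcc/10.2")                             # hypothetical module name
env.clear()
print(sorted(env.loaded))
```

After `clear()` only the protected module remains, which is the "clean environment" the description refers to.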
-
isloaded(module)[source]¶ Check if a specific module is loaded
- Parameters
module – module to check
- Returns
flag indicating the module is loaded
-
property
loaded_modules¶ List of currently loaded modules
-
swap(module1, module2)[source]¶ Swap one module for another one
- Parameters
module1 – module to unload
module2 – module to load
-
sync()[source]¶ Sync the object with the underlying environment. Re-checks the available and loaded modules
-
static
tostring_modvalue(value)[source]¶ Turn a representation of a module into a string representation
- Parameters
value – module representation (either str or tuple)
- Returns
string representation
-
executionscript Module¶
The executionscript is the script that wraps around a tool executable. It takes a job, builds the command, executes the command (while profiling it) and collects the results.
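The build, execute, profile and collect cycle described above can be sketched with the standard library alone. The `job` dictionary and `run_and_profile` helper below are hypothetical and only time the wall clock; the real script works on fastr Job objects and may profile far more than this.

```python
import subprocess
import sys
import time


def run_and_profile(command):
    """Run a command, time it, and collect its outcome in a dict."""
    start = time.perf_counter()
    completed = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return {
        "command": command,
        "returncode": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "wall_time": elapsed,
    }


# A hypothetical "job": the executable plus its arguments.
job = {"executable": sys.executable, "arguments": ["-c", "print('hello')"]}

# Build the command from the job, then execute it and collect the results.
command = [job["executable"], *job["arguments"]]
result = run_and_profile(command)
print(result["returncode"], result["stdout"].strip())
```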
flownoderun Module¶
-
class
fastr.execution.flownoderun.FlowNodeRun(node, parent)[source]¶ Bases:
fastr.execution.noderun.NodeRun
A FlowNodeRun is a special subclass of NodeRun in which the number of samples can vary per Output. This allows non-default data flows.
-
__abstractmethods__= frozenset({})¶
-
__module__= 'fastr.execution.flownoderun'¶
-
property
blocking¶ A FlowNodeRun is (for the moment) always considered blocking.
- Returns
True
-
property
dimnames¶ Names of the dimensions in the NodeRun output. These will be reflected in the SampleIdList of this NodeRun.
-
property
outputsize¶ Size of the outputs in this NodeRun
-
-
class
fastr.execution.flownoderun.AdvancedFlowNodeRun(node, parent)[source]¶ Bases:
fastr.execution.flownoderun.FlowNodeRun-
__abstractmethods__= frozenset({})¶
-
__module__= 'fastr.execution.flownoderun'¶
-
inputoutputrun Module¶
Classes for arranging the input and output for nodes.
Exported classes:
Input – An input for a node (holding datatype).
Output – The output of a node (holding datatype and value).
ConstantOutput – The output of a node (holding datatype and value).
Warning
Don’t mess with the Link, Input and Output internals from other places. There is a large chance of breaking the network functionality!
-
class
fastr.execution.inputoutputrun.AdvancedFlowOutputRun(node_run, template)[source]¶ Bases:
fastr.execution.inputoutputrun.OutputRun-
__abstractmethods__= frozenset({})¶
-
__module__= 'fastr.execution.inputoutputrun'¶
-
-
class
fastr.execution.inputoutputrun.BaseInputRun(node_run, template)[source]¶ Bases:
fastr.core.samples.HasSamples, fastr.planning.inputoutput.BaseInput
Base class for all input runs.
-
__abstractmethods__= frozenset({'__getitem__', '_update', 'dimensions', 'fullid', 'itersubinputs'})¶
-
__init__(node_run, template)[source]¶ Instantiate a BaseInput
- Parameters
node – the parent node the input/output belongs to.
description – the ParameterDescription describing the input/output.
- Returns
the created BaseInput
- Raises
FastrTypeError – if description is not of class ParameterDescription
FastrDataTypeNotAvailableError – if the DataType requested cannot be found in fastr.types
-
__module__= 'fastr.execution.inputoutputrun'¶
-
-
class
fastr.execution.inputoutputrun.InputRun(node_run, template)[source]¶ Bases:
fastr.execution.inputoutputrun.BaseInputRun
Class representing an input of a node. Such an input will be connected to the output of another node or the output of a constant node to provide the input value.
-
__abstractmethods__= frozenset({})¶
-
__getitem__(key)[source]¶ Retrieve an item from this Input.
- Parameters
key (str, SampleId or tuple) – the key of the requested item, can be a key str, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int the corresponding SubInput will be returned. If the key was a SampleId or sample index tuple, the corresponding SampleItem will be returned.
- Return type
SampleItem or SubInput
- Raises
FastrTypeError – if key is not of a valid type
FastrKeyError – if the key is not found
-
__init__(node_run, template)[source]¶ Instantiate an input.
- Parameters
template – the Input that the InputRun is based on
-
__module__= 'fastr.execution.inputoutputrun'¶
-
__setstate__(state)[source]¶ Set the state of the Input by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
cardinality(key=None, job_data=None)[source]¶ Cardinality for an Input is the sum of the cardinalities of the SubInputs, unless defined otherwise.
-
property
datatype¶ The datatype of this Input
-
property
dimensions¶ The size of the sample collections that can be accessed via this Input.
-
property
fullid¶ The full defining ID for the Input
-
property
input_group¶ The id of the InputGroup this Input belongs to.
-
itersubinputs()[source]¶ Iterate over the SubInputs in this Input.
- Returns
iterator yielding SubInput
example:
>>> for subinput in input_a.itersubinputs():
...     print(subinput)
-
-
class
fastr.execution.inputoutputrun.MacroOutputRun(node_run, template)[source]¶ Bases:
fastr.execution.inputoutputrun.OutputRun-
__abstractmethods__= frozenset({})¶
-
__module__= 'fastr.execution.inputoutputrun'¶
-
-
class
fastr.execution.inputoutputrun.NamedSubinputRun(parent)[source]¶ Bases:
fastr.execution.inputoutputrun.InputRun
A named subinput for cases where the value of an input is a mapping.
-
__abstractmethods__= frozenset({})¶
-
__getitem__(key)[source]¶ Retrieve an item (a SubInput) from this NamedSubInput.
- Parameters
key (int) – the key of the requested item
- Return type
- Returns
The SubInput corresponding with the key will be returned.
- Raises
FastrTypeError – if key is not of a valid type
FastrKeyError – if the key is not found
-
__init__(parent)[source]¶ Instantiate an input.
- Parameters
template – the Input that the InputRun is based on
-
__module__= 'fastr.execution.inputoutputrun'¶
-
__str__()[source]¶ Get a string version for the NamedSubInput
- Returns
the string version
- Return type
-
property
fullid¶ The full defining ID for the NamedSubInputRun
-
property
item_index¶
-
-
class
fastr.execution.inputoutputrun.OutputRun(node_run, template)[source]¶ Bases:
fastr.planning.inputoutput.BaseOutput, fastr.core.samples.ContainsSamples
Class representing an output of a node. It holds the output values of the tool that was run. Output fields can be connected to inputs of other nodes.
-
__abstractmethods__= frozenset({})¶
-
__getitem__(key)[source]¶ Retrieve an item from this Output. The returned value depends on what type of key used:
Retrieving data using index tuple: [index_tuple]
Retrieving data sample_id str: [SampleId]
Retrieving a list of data using SampleId list: [sample_id1, …, sample_idN]
Retrieving a SubOutput using an int or slice: [n] or [n:m]
- Parameters
key (int, slice, SampleId or tuple) – the key of the requested item, can be a number, slice, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int or slice the corresponding SubOutput will be returned (and created if needed). If the key was a SampleId or sample index tuple, the corresponding SampleItem will be returned. If the key was a list of SampleId a tuple of SampleItem will be returned.
- Return type
SubInput or SampleItem or list of SampleItem
- Raises
FastrTypeError – if key is not of a valid type
FastrKeyError – if the parent Node has not been executed
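The key-type dispatch described for `__getitem__` can be sketched with a toy container. `OutputSketch` is hypothetical: it stores plain values instead of SampleItems, represents a SubOutput by a placeholder tuple, and raises the built-in `TypeError` and `KeyError` rather than the Fastr exception classes.

```python
class OutputSketch:
    """Toy container that dispatches __getitem__ on the key type."""

    def __init__(self, samples):
        # samples maps (sample_id, index_tuple) pairs to values
        self._by_id = {sid: value for (sid, _), value in samples.items()}
        self._by_index = {idx: value for (_, idx), value in samples.items()}

    def __getitem__(self, key):
        if isinstance(key, (int, slice)):
            # Placeholder standing in for a SubOutput-like view
            return ("suboutput", key)
        if isinstance(key, tuple):
            # Sample index tuple
            return self._by_index[key]
        if isinstance(key, str):
            # Sample id
            return self._by_id[key]
        if isinstance(key, list):
            # List of sample ids gives a tuple of items
            return tuple(self._by_id[sid] for sid in key)
        raise TypeError(f"unsupported key type: {type(key).__name__}")


out = OutputSketch({("s1", (0,)): "a.txt", ("s2", (1,)): "b.txt"})
print(out[(1,)], out["s1"], out[["s1", "s2"]], out[0])
```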
-
__init__(node_run, template)[source]¶ Instantiate an Output
- Parameters
node – the parent node the output belongs to.
description – the ParameterDescription describing the output.
- Returns
created Output
- Raises
FastrTypeError – if description is not of class ParameterDescription
FastrDataTypeNotAvailableError – if the DataType requested cannot be found in fastr.types
-
__module__= 'fastr.execution.inputoutputrun'¶
-
__setitem__(key, value)[source]¶ Store an item in the Output
- Parameters
key (tuple of int or SampleId) – key of the value to store
value – the value to store
- Returns
None
- Raises
FastrTypeError – if key is not of correct type
-
__setstate__(state)[source]¶ Set the state of the Output by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
property
automatic¶ Flag indicating that the Output is generated automatically without being specified on the command line
-
cardinality(key=None, job_data=None)[source]¶ Cardinality of this Output, may depend on the inputs of the parent Node.
- Parameters
key (tuple of int or SampleId) – key for a specific sample, can be sample index or id
- Returns
the cardinality
- Return type
- Raises
FastrCardinalityError – if cardinality references an invalid Input
FastrTypeError – if the type of the referenced cardinality value cannot be cast to int
FastrValueError – if the referenced cardinality value cannot be cast to int
-
property
datatype¶ The datatype of this Output
-
property
fullid¶ The full defining ID for the Output
-
iterconvergingindices(collapse_dims)[source]¶ Iterate over all data, but collapse certain dimension to create lists of data.
- Parameters
collapse_dims (iterable of int) – dimension to collapse
- Returns
iterator SampleIndex (possibly containing slices)
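One way to picture this iteration is a product over the non-collapsed dimensions with a full slice placed in each collapsed dimension. The sketch below uses plain tuples instead of SampleIndex objects, and the function name and `size` argument are assumptions made for illustration.

```python
from itertools import product


def iter_converging_indices(size, collapse_dims):
    """Yield index tuples over `size`, with a full slice in each collapsed dimension."""
    collapse = set(collapse_dims)
    ranges = [
        (slice(None),) if dim in collapse else range(length)
        for dim, length in enumerate(size)
    ]
    for index in product(*ranges):
        yield index


# Collapse dimension 1 of a 2 x 3 sample grid: one index per row,
# each selecting the whole row as a list of data.
indices = list(iter_converging_indices((2, 3), [1]))
print(indices)
```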
-
property
preferred_types¶ The list of preferred DataTypes for this Output.
-
property
resulting_datatype¶ The DataType that the results of this Output will have.
-
property
samples¶ The SampleCollection of the samples in this Output. None if the NodeRun has not yet been executed. Otherwise a SampleCollection.
-
property
valid¶ Check if the output is valid, i.e. has a valid cardinality
-
-
class
fastr.execution.inputoutputrun.SourceOutputRun(node_run, template)[source]¶ Bases:
fastr.execution.inputoutputrun.OutputRun
Output for a SourceNodeRun, this type of Output determines the cardinality in a different way than a normal NodeRun.
-
__abstractmethods__= frozenset({})¶
-
__getitem__(item)[source]¶ Retrieve an item from this Output. The returned value depends on what type of key used:
Retrieving data using index tuple: [index_tuple]
Retrieving data sample_id str: [SampleId]
Retrieving a list of data using SampleId list: [sample_id1, …, sample_idN]
Retrieving a SubOutput using an int or slice: [n] or [n:m]
- Parameters
key (int, slice, SampleId or tuple) – the key of the requested item, can be a number, slice, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int or slice the corresponding SubOutput will be returned (and created if needed). If the key was a SampleId or sample index tuple, the corresponding SampleItem will be returned. If the key was a list of SampleId a tuple of SampleItem will be returned.
- Return type
SubInput or SampleItem or list of SampleItem
- Raises
FastrTypeError – if key is not of a valid type
FastrKeyError – if the parent NodeRun has not been executed
-
__init__(node_run, template)[source]¶ Instantiate a FlowOutput
- Parameters
node – the parent node the output belongs to.
description – the ParameterDescription describing the output.
- Returns
created FlowOutput
- Raises
FastrTypeError – if description is not of class ParameterDescription
FastrDataTypeNotAvailableError – if the DataType requested cannot be found in fastr.types
-
__module__= 'fastr.execution.inputoutputrun'¶
-
__setitem__(key, value)[source]¶ Store an item in the Output
- Parameters
key (tuple of int or SampleId) – key of the value to store
value – the value to store
- Returns
None
- Raises
FastrTypeError – if key is not of correct type
-
cardinality(key=None, job_data=None)[source]¶ Cardinality of this SourceOutput, may depend on the inputs of the parent NodeRun.
-
property
dimensions¶ The dimensions of this SourceOutputRun
-
property
linearized¶ A linearized version of the sample data. This is a lazily cached, linearized version of the underlying SampleCollection.
-
property
ndims¶ The number of dimensions in this SourceOutput
-
property
size¶ The sample size of the SourceOutput
-
-
class
fastr.execution.inputoutputrun.SubInputRun(input_)[source]¶ Bases:
fastr.execution.inputoutputrun.BaseInputRun
This class is used by Input to allow for multiple links to an Input. The SubInput class can hold only a single Link to a (Sub)Output, but behaves very similarly to an Input otherwise.
-
__abstractmethods__= frozenset({})¶
-
__getitem__(key)[source]¶ Retrieve an item from this SubInput.
- Parameters
key (int, SampleId or SampleIndex) – the key of the requested item, can be a number, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int the corresponding SubInput will be returned. If the key was a SampleId or sample index tuple, the corresponding SampleItem will be returned.
- Return type
SampleItem or SubInput
- Raises
FastrTypeError – if key is not of a valid type
Note
As a SubInput has only one SubInput, only requesting int key 0 or -1 is allowed, and it will return self
-
__getstate__()[source]¶ Retrieve the state of the SubInput
- Returns
the state of the object
- Return type
dict
-
__init__(input_)[source]¶ Instantiate a SubInput.
- Parameters
input (Input) – the parent of this SubInput.
- Returns
the created SubInput
-
__module__= 'fastr.execution.inputoutputrun'¶
-
__setstate__(state)[source]¶ Set the state of the SubInput by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
cardinality(key=None, job_data=None)[source]¶ Get the cardinality for this SubInput. The cardinality for a SubInput is defined by the incoming link.
-
property
description¶ The description object of this input/output
-
property
dimensions¶ The sample size of the SubInput
-
property
fullid¶ The full defining ID for the SubInput
-
property
input_group¶ The id of the InputGroup this SubInput's parent belongs to.
-
property
item_index¶
-
iteritems()[source]¶ Iterate over the SampleItems that are in the SubInput.
- Returns
iterator yielding SampleItem objects
-
itersubinputs()[source]¶ Iterate over SubInputs (for a SubInput it will yield self and stop iterating after that)
- Returns
iterator yielding SubInput
example:
>>> for subinput in input_a.itersubinputs():
...     print(subinput)
-
property
node¶ The Node to which this SubInput's parent belongs
-
-
class
fastr.execution.inputoutputrun.SubOutputRun(output, index)[source]¶ Bases:
fastr.execution.inputoutputrun.OutputRun
The SubOutput is an Output that represents a slice of another Output.
-
__abstractmethods__= frozenset({})¶
-
__getitem__(key)[source]¶ Retrieve an item from this SubOutput. The returned value depends on what type of key used:
Retrieving data using index tuple: [index_tuple]
Retrieving data sample_id str: [SampleId]
Retrieving a list of data using SampleId list: [sample_id1, …, sample_idN]
Retrieving a SubOutput using an int or slice: [n] or [n:m]
- Parameters
key (int, slice, SampleId or tuple) – the key of the requested item, can be a number, slice, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int or slice the corresponding SubOutput will be returned (and created if needed). If the key was a SampleId or sample index tuple, the corresponding SampleItem will be returned. If the key was a list of SampleId a tuple of SampleItem will be returned.
- Return type
SubInput or SampleItem or list of SampleItem
- Raises
FastrTypeError – if key is not of a valid type
-
__getstate__()[source]¶ Retrieve the state of the SubOutput
- Returns
the state of the object
- Return type
dict
-
__init__(output, index)[source]¶ Instantiate a SubOutput
- Parameters
- Returns
created SubOutput
- Raises
FastrTypeError – if the output argument is not an instance of Output
FastrTypeError – if the index argument is not an int or slice
-
__module__= 'fastr.execution.inputoutputrun'¶
-
__setitem__(key, value)[source]¶ A function blocking the assignment operator. Values cannot be assigned to a SubOutput.
- Raises
FastrNotImplementedError – if called
-
__setstate__(state)[source]¶ Set the state of the SubOutput by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
cardinality(key=None, job_data=None)[source]¶ Cardinality of this SubOutput depends on the parent Output and self.index
- Parameters
key (tuple of int or SampleId) – key for a specific sample, can be sample index or id
- Returns
the cardinality
- Return type
- Raises
FastrCardinalityError – if cardinality references an invalid Input
FastrTypeError – if the type of the referenced cardinality value cannot be cast to int
FastrValueError – if the referenced cardinality value cannot be cast to int
-
property
datatype¶ The datatype of this SubOutput
-
property
fullid¶ The full defining ID for the SubOutput
-
property
indexrep¶ Simple representation of the index.
-
property
node¶ The NodeRun to which this SubOutput belongs
-
property
preferred_types¶ The list of preferred DataTypes for this SubOutput.
-
property
resulting_datatype¶ The DataType that the results of this SubOutput will have.
-
property
samples¶ The SampleCollection for this SubOutput
-
job Module¶
This module contains the Job class and some related classes.
-
class
fastr.execution.job.InlineJob(*args, **kwargs)[source]¶ Bases:
fastr.execution.job.Job
Job that does not actually need to run but is used for consistency in data processing and logging.
-
__init__(*args, **kwargs)[source]¶ Create a job
- Parameters
node (fastr.planning.node.Node) – the node the job is based on
sample_id – the id of the sample
sample_index – the index of the sample
input_arguments – the argument list
output_arguments – the argument list
hold_jobs – the jobs on which this job depends
preferred_types – The list of preferred types to use
- Returns
-
__module__= 'fastr.execution.job'¶
-
-
class
fastr.execution.job.Job(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, preferred_types=None)[source]¶ Bases:
fastr.abc.serializable.Serializable
Class describing a job.
Arguments:
tool_name - the name of the tool (str)
tool_version - the version of the tool (Version)
argument - the arguments used when calling the tool (list)
tmpdir - temporary directory to use to store output data
hold_jobs - list of jobs that need to be finished before this job can run (list)
-
COMMAND_DUMP= '__fastr_command__.pickle.gz'¶
-
INFO_DUMP= '__fastr_extra_job_info__.json'¶
-
PROV_DUMP= '__fastr_prov__.json'¶
-
RESULT_DUMP= '__fastr_result__.pickle.gz'¶
-
STDERR_DUMP= '__fastr_stderr__.txt'¶
-
STDOUT_DUMP= '__fastr_stdout__.txt'¶
-
__init__(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, preferred_types=None)[source]¶ Create a job
- Parameters
node (fastr.planning.node.Node) – the node the job is based on
sample_id (SampleId) – the id of the sample
sample_index (SampleIndex) – the index of the sample
input_arguments (Dict[str, SampleIndex]) – the argument list
hold_jobs (Optional[List[str]]) – the jobs on which this job depends
preferred_types (Optional[List]) – The list of preferred types to use
- Returns
-
__module__= 'fastr.execution.job'¶
-
static
cast_to_type(value, datatypes)[source]¶ Try to cast value to one of the given datatypes. Will try all the datatypes in order.
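The "try each datatype in order" behaviour can be sketched with ordinary Python callables standing in for fastr DataTypes. `cast_to_first_matching` is a hypothetical helper, and raising `ValueError` when nothing matches is an assumption; the source does not say what the real method does in that case.

```python
def cast_to_first_matching(value, casters):
    """Return the result of the first caster that accepts the value."""
    for caster in casters:
        try:
            return caster(value)
        except (TypeError, ValueError):
            # This caster rejected the value, try the next one in order.
            continue
    raise ValueError(f"could not cast {value!r} with any of the given casters")


as_int = cast_to_first_matching("3", [int, float, str])
as_float = cast_to_first_matching("3.5", [int, float, str])
as_str = cast_to_first_matching("abc", [int, float, str])
print(as_int, as_float, as_str)
```

Because the casters are tried in order, putting the most specific type first gives the most specific result that still accepts the value.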
-
property
commandfile¶ The path of the command pickle
-
property
commandurl¶ The url of the command pickle
-
create_payload()[source]¶ Create the payload for this object based on all the input/output arguments
- Returns
the payload
- Return type
-
property
extrainfofile¶ The path where the extra job info document is saved
-
property
extrainfourl¶ The url where the extra job info document is saved
-
classmethod
fill_output_argument(output_spec, cardinality, desired_type, requested, tmpurl)[source]¶ This is an abstract class method. The method should take the argument_dict generated from calling self.get_argument_dict() and turn it into a list of commandline arguments that represent this Input/Output.
-
property
fullid¶ The full id of the job
-
get_deferred(output_id, cardinality_nr, sample_id=None)[source]¶ Get a deferred pointing to a specific output value in the Job
-
get_result()[source]¶ Get the result of the job if it is available. Load the output file if found and check if the job matches the current object. If so, load and return the result.
- Returns
Job after execution or None if not available
- Return type
Job | None
-
classmethod
get_value(value)[source]¶ Get a value
- Parameters
value – the url of the value
datatype – datatype of the value
- Returns
the retrieved value
-
property
id¶ The id of this job
-
property
logfile¶ The path of the result pickle
-
property
logurl¶ The url of the result pickle
-
property
provfile¶ The path where the prov document is saved
-
property
provurl¶ The url where the prov document is saved
-
property
resources¶ The compute resources required for this job
-
property
status¶ The status of the job
-
property
stderrfile¶ The path where the stderr text is saved
-
property
stderrurl¶ The url where the stderr text is saved
-
property
stdoutfile¶ The path where the stdout text is saved
-
property
stdouturl¶ The url where the stdout text is saved
-
property
tmpurl¶ The URL of the tmpdir to use
-
property
tool¶
-
classmethod
translate_argument(value)[source]¶ Translate an argument from a URL to an actual path.
- Parameters
value – value to translate
datatype – the datatype of the value
- Returns
the translated value
-
static
translate_output_results(value, datatypes, mountpoint=None)[source]¶ Translate the results for an Output
- Parameters
value – the results value for the output
datatypes (tuple) – tuple of possible datatypes for the output
preferred_type – the preferred datatype of the output
- Returns
the update value for the result
-
translate_results(result)[source]¶ Translate the results of an interface (using paths etc) to the proper form using URI’s instead.
-
-
class
fastr.execution.job.JobCleanupLevel[source]¶ Bases:
enum.Enum
The cleanup level for Jobs that are finished.
-
__module__= 'fastr.execution.job'¶
-
all= 'all'¶
-
no_cleanup= 'no_cleanup'¶
-
non_failed= 'non_failed'¶
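A plausible reading of the three levels is sketched below. The member names and values are taken from the listing above, but `CleanupLevelSketch`, `should_clean` and the decision rule itself are assumptions inferred from the names, not confirmed behaviour.

```python
from enum import Enum


class CleanupLevelSketch(Enum):
    # Same member names and values as listed for JobCleanupLevel.
    all = 'all'
    no_cleanup = 'no_cleanup'
    non_failed = 'non_failed'


def should_clean(level, job_failed):
    """Assumed rule: 'all' always cleans, 'non_failed' keeps failed jobs, 'no_cleanup' never cleans."""
    if level is CleanupLevelSketch.all:
        return True
    if level is CleanupLevelSketch.non_failed:
        return not job_failed
    return False


print(should_clean(CleanupLevelSketch.non_failed, job_failed=True))
```

Under this reading, `non_failed` keeps the files of failed jobs around so they can be inspected afterwards.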
-
-
class
fastr.execution.job.JobState(_, stage, error)[source]¶ Bases:
enum.Enum
The possible states a Job can be in. An overview of the states and the advised transitions are depicted in the following figure:
-
__module__= 'fastr.execution.job'¶
-
cancelled= ('cancelled', 'done', True)¶
-
created= ('created', 'idle', False)¶
-
property
done¶
-
execution_done= ('execution_done', 'in_progress', False)¶
-
execution_failed= ('execution_failed', 'in_progress', True)¶
-
failed= ('failed', 'done', True)¶
-
finished= ('finished', 'done', False)¶
-
hold= ('hold', 'idle', False)¶
-
property
idle¶
-
property
in_progress¶
-
nonexistent= ('nonexistent', 'idle', False)¶
-
processing_callback= ('processing_callback', 'in_progress', False)¶
-
queued= ('queued', 'idle', False)¶
-
running= ('running', 'in_progress', False)¶
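The `(name, stage, error)` tuples and the `idle`, `in_progress` and `done` properties suggest an enum whose members carry a stage and an error flag. The sketch below copies the member values from the listing; the class name `JobStateSketch`, the attribute names and the property implementations are assumptions about how such an enum could be written.

```python
from enum import Enum


class JobStateSketch(Enum):
    # Member values copied from the listing: (name, stage, error)
    nonexistent = ('nonexistent', 'idle', False)
    created = ('created', 'idle', False)
    queued = ('queued', 'idle', False)
    hold = ('hold', 'idle', False)
    running = ('running', 'in_progress', False)
    execution_done = ('execution_done', 'in_progress', False)
    execution_failed = ('execution_failed', 'in_progress', True)
    processing_callback = ('processing_callback', 'in_progress', False)
    finished = ('finished', 'done', False)
    failed = ('failed', 'done', True)
    cancelled = ('cancelled', 'done', True)

    def __init__(self, _, stage, error):
        # Enum unpacks each tuple value into these arguments.
        self.stage = stage
        self.error = error

    @property
    def idle(self):
        return self.stage == 'idle'

    @property
    def in_progress(self):
        return self.stage == 'in_progress'

    @property
    def done(self):
        return self.stage == 'done'


state = JobStateSketch.execution_failed
print(state.in_progress, state.done, state.error)
```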
-
-
class
fastr.execution.job.SinkJob(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, substitutions=None, preferred_types=None)[source]¶ Bases:
fastr.execution.job.Job
Special SinkJob for the Sink
-
__init__(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, substitutions=None, preferred_types=None)[source]¶ Create a job
- Parameters
node (fastr.planning.node.Node) – the node the job is based on
sample_id – the id of the sample
sample_index – the index of the sample
input_arguments – the argument list
output_arguments – the argument list
hold_jobs – the jobs on which this job depends
preferred_types – The list of preferred types to use
- Returns
-
__module__= 'fastr.execution.job'¶
-
create_payload()[source]¶ Create the payload for this object based on all the input/output arguments
- Returns
the payload
- Return type
-
get_result()[source]¶ Get the result of the job if it is available. Load the output file if found and check if the job matches the current object. If so, load and return the result.
- Returns
Job after execution
-
property
id¶ The id of this job
-
substitute(value, datatype=None)[source]¶ Substitute the special fields that can be used in a SinkJob.
- Parameters
value (str) – the value to substitute fields in
datatype (BaseDataType) – the datatype for the value
- Returns
string with substitutions performed
- Return type
-
property
tmpurl¶ The URL of the tmpdir to use
-
-
class
fastr.execution.job.SourceJob(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, preferred_types=None)[source]¶ Bases:
fastr.execution.job.Job
Special SourceJob for the Source
-
__module__= 'fastr.execution.job'¶
-
linkrun Module¶
The link module contains the Link class. This class represents the links in a network. These links lead from an output (BaseOutput) to an input (BaseInput) and indicate the desired data flow. Links are smart objects, in the sense that when you set their start or end point, they register themselves with the Input and Output. They do all the bookkeeping, so as long as you only set the source and target of the Link, the link should be valid.
Warning
Don’t mess with the Link, Input and Output internals from other places. There is a large chance of breaking the network functionality!
-
class
fastr.execution.linkrun.LinkRun(link, parent=None)[source]¶ Bases:
fastr.abc.updateable.Updateable, fastr.abc.serializable.Serializable
Class for linking outputs (BaseOutput) to inputs (BaseInput)
Examples:
>>> import fastr
>>> network = fastr.Network()
>>> link1 = network.create_link(n1.outputs['out1'], n2.inputs['in2'])
>>> link2 = Link()
>>> link2.source = n1.outputs['out1']
>>> link2.target = n2.inputs['in2']
-
__abstractmethods__= frozenset({})¶
-
__dataschemafile__= 'Link.schema.json'¶
-
__getitem__(index)[source]¶ Get an item for this Link. The item will be retrieved from the connected output, but a diverging or converging flow can change the number of samples/cardinality.
- Parameters
index (SampleIndex) – index of the item to retrieve
- Returns
the requested item
- Return type
SampleItem
- Raises
FastrIndexError – if the index length does not match the number of dimensions in the source data (after collapsing/expanding)
-
__hash__= None¶
-
__init__(link, parent=None)[source]¶ Create a new Link in a Network.
- Parameters
- Returns
newly created LinkRun
- Raises
FastrValueError – if parent is not given and fastr.current_network is not set
FastrValueError – if the source output is not in the same network as the Link
FastrValueError – if the target input is not in the same network as the Link
-
__module__= 'fastr.execution.linkrun'¶
-
__repr__()[source]¶ Get a string representation for the Link
- Returns
the string representation
- Return type
-
__setstate__(state)[source]¶ Set the state of the Link by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
- Raises
FastrValueError – if the parent network and fastr.current_network are not set
-
cardinality(index=None)[source]¶ Cardinality for a Link is given by source Output and the collapse/expand settings
- Parameters
key (SampleIndex) – key for a specific sample (can be only a sample index!)
- Returns
the cardinality
- Return type
int, sympy.Symbol
- Raises
FastrIndexError – if the index length does not match the number of dimensions in the data
-
property
collapse¶ The converging dimensions of this link. Collapsing changes some dimensions of sample lists into cardinality, reshaping the data.
Collapse can be set to a tuple or an int/str, in which case it will be automatically wrapped in a tuple. The int will be seen as indices of the dimensions to collapse. The str will be seen as the name of the dimensions over which to collapse.
- Raises
FastrTypeError – if assigning a collapse value of a wrong type
-
property
collapse_indexes¶ The converging dimensions of this link as integers. Dimension names are replaced with the corresponding int.
Collapsing changes some dimensions of sample lists into cardinality, reshaping the data
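The two steps described for `collapse` and `collapse_indexes` (wrap a bare int/str in a tuple, then replace dimension names by their position) can be sketched as follows. The helper names, the `dimnames` argument and the use of the built-in `TypeError` in place of FastrTypeError are assumptions made for illustration.

```python
def normalise_collapse(value):
    """Wrap a bare int or str in a tuple; accept tuples of int/str as-is."""
    if isinstance(value, (int, str)):
        return (value,)
    if isinstance(value, tuple) and all(isinstance(item, (int, str)) for item in value):
        return value
    raise TypeError(f"collapse must be an int, str or tuple of those, got {value!r}")


def collapse_to_indexes(collapse, dimnames):
    """Replace dimension names with their integer position in dimnames."""
    return tuple(
        dimnames.index(item) if isinstance(item, str) else item
        for item in normalise_collapse(collapse)
    )


dimnames = ("subject", "scan")  # hypothetical dimension names
print(normalise_collapse(1))
print(collapse_to_indexes("scan", dimnames))
print(collapse_to_indexes((0, "scan"), dimnames))
```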
-
classmethod
createobj(state, network=None)[source]¶ Create object function for Link
- Parameters
cls – The class to create
state – The state to use to create the Link
network – the parent Network
- Returns
newly created Link
-
destroy()[source]¶ The destroy function of a link removes all default references to a link. This means the references in the network, input and output connected to this link. If there are no references in other places in the code, it will destroy the link (reference count dropping to zero).
This function is called when a source for an input is set to another value and the link becomes disconnected. This makes sure there are no dangling links.
-
property
dimensions¶ The dimensions of the data delivered by the link. This can be different from the source dimensions because the link can make data collapse or expand.
-
property
expand¶ Flag indicating that the link will expand the cardinality into a new sample dimension to be created.
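As a rough illustration of what expanding means, the sketch below turns the values of each sample into separate samples of cardinality 1. The data layout (a dict from index tuples to value tuples) and the choice to append the new dimension at the end are assumptions for this example only, not a description of fastr's internals.

```python
def expand_samples(samples):
    """Turn the cardinality of each sample into an extra sample dimension.

    `samples` maps an index tuple to a tuple of values (assumed layout).
    Every value becomes its own sample with cardinality 1.
    """
    expanded = {}
    for index, values in samples.items():
        for position, value in enumerate(values):
            expanded[index + (position,)] = (value,)
    return expanded


samples = {(0,): ("a", "b"), (1,): ("c",)}
print(expand_samples(samples))
```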
-
property
fullid¶ The full defining ID for the Link
-
property
parent¶ The Network to which this Link belongs.
-
property
size¶ The size of the data delivered by the link. This can be different from the source size because the link can make data collapse or expand.
-
property
source¶ The source BaseOutput of the Link. Setting the source will automatically register the Link with the source BaseOutput. Updating source will also make sure the Link is unregistered with the previous source.
- Raises
FastrTypeError – if assigning a non BaseOutput
-
property
status¶
-
property
target¶ The target BaseInput of the Link. Setting the target will automatically register the Link with the target BaseInput. Updating target will also make sure the Link is unregistered with the previous target.
- Raises
FastrTypeError – if assigning a non BaseInput
-
macronoderun Module¶
-
class
fastr.execution.macronoderun.MacroNodeRun(node, parent)[source]¶ Bases:
fastr.execution.noderun.NodeRun
MacroNodeRun encapsulates an entire network in a single node.
-
__abstractmethods__= frozenset({})¶
-
__getstate__()[source]¶ Retrieve the state of the MacroNodeRun
- Returns
the state of the object
- Return type
dict
-
__init__(node, parent)[source]¶ - Parameters
network (fastr.planning.network.Network) – network to create macronode for
-
__module__= 'fastr.execution.macronoderun'¶
-
__setstate__(state)[source]¶ Set the state of the NodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
execute()[source]¶ Execute the node and create the jobs that need to run
- Returns
list of jobs to run
- Return type
list of Jobs
-
property
network_run¶
-
networkanalyzer Module¶
Module that defines the NetworkAnalyzer and holds the reference implementation.
-
class
fastr.execution.networkanalyzer.DefaultNetworkAnalyzer[source]¶ Bases:
fastr.execution.networkanalyzer.NetworkAnalyzer
Default implementation of the NetworkAnalyzer.
-
__module__= 'fastr.execution.networkanalyzer'¶
-
-
class
fastr.execution.networkanalyzer.NetworkAnalyzer[source]¶ Bases:
object
Base class for NetworkAnalyzers
-
__dict__= mappingproxy({'__module__': 'fastr.execution.networkanalyzer', '__doc__': '\n Base class for NetworkAnalyzers\n ', 'analyze_network': <function NetworkAnalyzer.analyze_network>, '__dict__': <attribute '__dict__' of 'NetworkAnalyzer' objects>, '__weakref__': <attribute '__weakref__' of 'NetworkAnalyzer' objects>})¶
-
__module__= 'fastr.execution.networkanalyzer'¶
-
__weakref__¶ list of weak references to the object (if defined)
-
networkchunker Module¶
This module contains the NetworkChunker class and its default implementation the DefaultNetworkChunker
-
class
fastr.execution.networkchunker.DefaultNetworkChunker[source]¶ Bases:
fastr.execution.networkchunker.NetworkChunker
The default implementation of the NetworkChunker. It tries to create chunks that are as large as possible, so the execution blocks as little as possible.
-
__module__= 'fastr.execution.networkchunker'¶
-
chunck_network(network)[source]¶ Create a list of Network chunks that can be pre-analyzed completely. Each chunk needs to be executed before the next can be analyzed and executed.
The returned chunks are (at the moment) in the format of a tuple (start, nodes), both of which are tuples. They contain the nodes where execution starts (these should be ready once the previous chunks are done) and all nodes of the chunk, respectively.
- Parameters
network – Network to split into chunks
- Returns
tuple containing chunks
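To make the (start, nodes) format concrete, here is a toy chunker. It is a sketch under stated assumptions, not the DefaultNetworkChunker algorithm: the network is given as a dict mapping each node id to its upstream node ids, blocking nodes are given as a set, and a node lands in the chunk after the last blocking node upstream of it. The function name chunk_network and both argument names are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def chunk_network(upstream, blocking):
    """Split a DAG into chunks, closing a chunk after each blocking node.

    upstream: dict mapping node id -> iterable of upstream node ids (assumed).
    blocking: set of node ids whose output is only known after execution.
    Returns a tuple of (start, nodes) tuples.
    """
    order = list(TopologicalSorter(upstream).static_order())

    # A node goes one chunk further than any blocking node it depends on.
    level = {}
    for node in order:
        level[node] = max(
            (level[p] + (1 if p in blocking else 0) for p in upstream.get(node, ())),
            default=0,
        )

    chunks = []
    for k in range(max(level.values(), default=-1) + 1):
        nodes = tuple(n for n in order if level[n] == k)
        # Start nodes have no upstream nodes inside the same chunk.
        start = tuple(
            n for n in nodes if all(level[p] < k for p in upstream.get(n, ()))
        )
        chunks.append((start, nodes))
    return tuple(chunks)


network = {
    "source": [],
    "filter": ["source"],
    "split": ["filter"],   # blocking: output size unknown until it has run
    "stats": ["split"],
    "sink": ["stats"],
}
for start, nodes in chunk_network(network, blocking={"split"}):
    print(start, nodes)
```

In this example the first chunk ends at the blocking node, and the second chunk can only be analyzed after the first has been executed.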
-
-
class
fastr.execution.networkchunker.NetworkChunker[source]¶ Bases:
object
The base class for NetworkChunkers. A Network chunker is a class that takes a Network and produces a list of chunks that can each be analyzed and executed in one go.
-
__dict__= mappingproxy({'__module__': 'fastr.execution.networkchunker', '__doc__': '\n The base class for NetworkChunkers. A Network chunker is a class that takes\n a Network and produces a list of chunks that can each be analyzed and\n executed in one go.\n ', 'chunck_network': <function NetworkChunker.chunck_network>, '__dict__': <attribute '__dict__' of 'NetworkChunker' objects>, '__weakref__': <attribute '__weakref__' of 'NetworkChunker' objects>})¶
-
__module__= 'fastr.execution.networkchunker'¶
-
__weakref__¶ list of weak references to the object (if defined)
-
networkrun Module¶
Network module containing Network facilitators and analysers.
-
class
fastr.execution.networkrun.NetworkRun(network)[source]¶ Bases:
fastr.abc.serializable.Serializable
The Network class represents a workflow. This includes all Nodes (including ConstantNodes, SourceNodes and Sinks) and Links.
-
NETWORK_DUMP_FILE_NAME= '__fastr_network__.json'¶
-
SINK_DUMP_FILE_NAME= '__sink_data__.json'¶
-
SOURCE_DUMP_FILE_NAME= '__source_data__.pickle.gz'¶
-
__getitem__(item)[source]¶ Get an item by its fullid. The fullid can point to a link, node, input, output or even subinput/suboutput.
- Parameters
item (str,unicode) – fullid of the item to retrieve
- Returns
the requested item
-
__getstate__()[source]¶ Retrieve the state of the Network
- Returns
the state of the object
- Return type
dict
-
__hash__= None¶
-
__module__= 'fastr.execution.networkrun'¶
-
__setstate__(state)[source]¶ Set the state of the Network by the given state. This completely overwrites the old state!
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
check_id(id_)[source]¶ Check if an id for an object is valid and unused in the Network. The method always returns True if it does not raise an exception.
- Parameters
id_ (str) – the id to check
- Returns
True
- Raises
FastrValueError – if the id is not correctly formatted
FastrValueError – if the id is already in use
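The "return True or raise" contract can be sketched as follows. The accepted id format is not stated here, so the regular expression below is purely an assumption, as are the used_ids argument and the use of ValueError in place of FastrValueError.

```python
import re

# Assumed format rule for illustration only; the real rule is not documented here.
ID_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_id(id_, used_ids):
    """Return True for a well-formed, unused id; raise ValueError otherwise."""
    if not ID_PATTERN.match(id_):
        raise ValueError(f"id {id_!r} is not correctly formatted")
    if id_ in used_ids:
        raise ValueError(f"id {id_!r} is already in use")
    return True


print(check_id("source1", used_ids={"sink1"}))  # True
```

Because the function never returns False, callers can use it as a guard without inspecting the return value.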
-
property
constantlist¶
-
execute(sourcedata, sinkdata, execution_plugin=None, tmpdir=None, cluster_queue=None, timestamp=None)[source]¶ Execute the Network with the given data. This will analyze the Network, create jobs and send them to the execution backend of the system.
- Parameters
- Raises
FastrKeyError – if a source has no corresponding key in sourcedata
FastrKeyError – if a sink has no corresponding key in sinkdata
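The two key checks can be pictured with a small stand-alone function. This is a sketch of the documented behaviour only: check_run_data, the node ids and the vfs:// URLs are hypothetical, and KeyError stands in for FastrKeyError.

```python
def check_run_data(source_ids, sink_ids, sourcedata, sinkdata):
    """Raise KeyError if a source or sink has no entry in the supplied data."""
    missing_sources = [s for s in source_ids if s not in sourcedata]
    if missing_sources:
        raise KeyError(f"no sourcedata given for: {missing_sources}")
    missing_sinks = [s for s in sink_ids if s not in sinkdata]
    if missing_sinks:
        raise KeyError(f"no sinkdata given for: {missing_sinks}")
    return True


# Hypothetical ids and URLs, keyed by the id of each source and sink.
sourcedata = {"images": ["vfs://input_mnt/image_1.nii.gz"]}
sinkdata = {"results": "vfs://output_mnt/result_{sample_id}{ext}"}
print(check_run_data(["images"], ["results"], sourcedata, sinkdata))  # True
```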
-
property
fullid¶ The fullid of the Network
-
property
global_id¶ The global id of the Network. This is different for networks used in macronodes, as they still have parents.
-
property
id¶ The id of the Network. This is a read only property.
-
job_finished(job)[source]¶ Call-back handler for when a job is finished. Will collect the results and handle blocking jobs. This function is automatically called when the execution plugin finishes a job.
- Parameters
job (Job) – the job that finished
-
property
long_id¶
-
property
network¶
-
property
nodegroups¶ Give an overview of the nodegroups in the network
-
register_signals()[source]¶ Register handlers for SIGINT and SIGTERM to gracefully shut down the execution.
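Registering such handlers with the standard library signal module might look like the sketch below. It shows the general pattern only: the ShutdownFlag class and both helper functions are hypothetical and say nothing about what NetworkRun does inside its handlers. Note that signal.signal can only be called from the main thread.

```python
import signal


class ShutdownFlag:
    """Handler that records a shutdown request instead of exiting abruptly."""

    def __init__(self):
        self.requested = False

    def __call__(self, signum, frame):
        self.requested = True


def register_signals(handler):
    """Install `handler` for SIGINT and SIGTERM; return the previous handlers."""
    previous = {}
    for signum in (signal.SIGINT, signal.SIGTERM):
        previous[signum] = signal.signal(signum, handler)
    return previous


def restore_signals(previous):
    """Put back the handlers returned by register_signals."""
    for signum, handler in previous.items():
        if handler is not None:  # None means it was not installed from Python
            signal.signal(signum, handler)
```

A run loop would then poll flag.requested between jobs and stop submitting new work once it is set.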
-
property
sinklist¶
-
property
sourcelist¶
-
noderun Module¶
A module to maintain a run of a network node.
-
class
fastr.execution.noderun.NodeRun(node, parent)[source]¶ Bases:
fastr.execution.basenoderun.BaseNodeRun
The class encapsulating a node in the network. The node is responsible for setting and checking inputs and outputs based on the description provided by a tool instance.
-
__abstractmethods__= frozenset({})¶
-
__dataschemafile__= 'NodeRun.schema.json'¶
-
__eq__(other)[source]¶ Compare two Node instances with each other. This function ignores the parent and update status, but tests the rest of the dict for equality.
- Parameters
other (NodeRun) – the other instance to compare to
- Returns
True if equal, False otherwise
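An equality test that skips some attributes can be sketched as below. The class RunState and the attribute names in IGNORED_KEYS are assumptions for illustration; the sketch only mirrors the described behaviour of comparing the instance dict minus the parent and update status.

```python
class RunState:
    """Toy object whose equality ignores its parent and update status."""

    IGNORED_KEYS = ("parent", "updated")  # assumed attribute names

    def __init__(self, id_, parent=None, updated=False):
        self.id = id_
        self.parent = parent
        self.updated = updated

    def _comparable(self):
        # Copy of the instance dict without the ignored attributes.
        return {k: v for k, v in vars(self).items() if k not in self.IGNORED_KEYS}

    def __eq__(self, other):
        if not isinstance(other, type(self)):
            return NotImplemented
        return self._comparable() == other._comparable()

    # Equality depends on mutable state, so instances are left unhashable.
    __hash__ = None


print(RunState("a", parent="net1") == RunState("a", parent="net2", updated=True))  # True
print(RunState("a") == RunState("b"))  # False
```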
-
__getstate__()[source]¶ Retrieve the state of the NodeRun
- Returns
the state of the object
- Return type
dict
-
__hash__= None¶
-
__init__(node, parent)[source]¶ Instantiate a node.
- Parameters
node (Tool) – The node to base the noderun on
parent (Network) – the parent network of the node
- Returns
the newly created NodeRun
-
__module__= 'fastr.execution.noderun'¶
-
__repr__()[source]¶ Get a string representation for the NodeRun
- Returns
the string representation
- Return type
-
__setstate__(state)[source]¶ Set the state of the NodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
property
blocking¶ Indicate that the results of this NodeRun cannot be determined without first executing the NodeRun, causing a blockage in the creation of jobs. Blocking Nodes determine where the chunk borders fall.
-
create_job(sample_id, sample_index, job_data, job_dependencies, **kwargs)[source]¶ Create a job based on the sample id, job data and job dependencies.
- Parameters
sample_id (SampleId) – the id of the corresponding sample
sample_index (SampleIndex) – the index of the corresponding sample
job_data (dict) – dictionary containing all input data for the job
job_dependencies – other jobs that need to finish before this job can run
- Returns
the created job
- Return type
-
classmethod
createobj(state, network=None)[source]¶ Create object function for generic objects
- Parameters
cls – The class to create
state – The state to use to create the Link
network – the parent Network
- Returns
newly created Link
-
property
dimnames¶ Names of the dimensions in the NodeRun output. These will be reflected in the SampleIdList of this NodeRun.
-
execute()[source]¶ Execute the node and create the jobs that need to run
- Returns
list of jobs to run
- Return type
list of Jobs
-
property
fullid¶ The full defining ID for the NodeRun inside the network
-
get_sourced_nodes()[source]¶ A list of all Nodes connected as sources to this NodeRun
- Returns
list of all nodes that are connected to an input of this node
-
property
global_id¶ The global defining ID for the Node from the main network (goes out of macro nodes to root network)
-
property
id¶ The id of the NodeRun
-
property
input_groups¶ A list of input groups for this NodeRun. An input group is an InputGroup object filled according to the NodeRun.
-
property
listeners¶ All the listeners requesting output of this node, this means the listeners of all Outputs and SubOutputs
-
property
merge_dimensions¶
-
property
name¶ Name of the Tool the NodeRun was based on. In case a Toolless NodeRun was used the class name is given.
-
property
outputsize¶ Size of the outputs in this NodeRun
-
property
parent¶ The parent network of this node.
-
property
resources¶ Number of cores required for the execution of this NodeRun
-
set_result(job, failed_annotation)[source]¶ Incorporate result of a job into the NodeRun.
- Parameters
job (Type) – the job whose result should be stored
failed_annotation – A set of annotations, None if no errors else containing a tuple describing the errors
-
property
status¶
-
property
tool¶
-
sinknoderun Module¶
-
class
fastr.execution.sinknoderun.SinkNodeRun(node, parent)[source]¶ Bases:
fastr.execution.noderun.NodeRun
Class which handles where the output goes. This can be any kind of file, e.g. image files, text files, config files, etc.
-
__abstractmethods__= frozenset({})¶
-
__dataschemafile__= 'SinkNodeRun.schema.json'¶
-
__getstate__()[source]¶ Retrieve the state of the NodeRun
- Returns
the state of the object
- Return type
dict
-
__init__(node, parent)[source]¶ Instantiation of the SinkNodeRun.
- Parameters
node (fastr.planning.node.Node) – The Node that this Run is based on.
parent (NetworkRun) – The NetworkRun that this NodeRun belongs to
- Returns
newly created sink node run
-
__module__= 'fastr.execution.sinknoderun'¶
-
__setstate__(state)[source]¶ Set the state of the NodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
create_job(sample_id, sample_index, job_data, job_dependencies, **kwargs)[source]¶ Create a job for a sink based on the sample id, job data and job dependencies.
-
property
datatype¶ The datatype of the data this sink can store.
-
execute()[source]¶ Execute the sink node and create the jobs that need to run
- Returns
list of jobs to run
- Return type
list of Jobs
-
property
input¶ The default input of the sink NodeRun
-
set_data(data)[source]¶ Set the targets of this sink node.
- Parameters
data (dict or list of urls) – the targets rules for where to write the data
The target rules can include a few fields that can be filled out:
field – description
sample_id – the sample id of the sample written in string form
cardinality – the cardinality of the sample written
ext – the extension of the datatype of the written data, including the .
extension – the extension of the datatype of the written data, excluding the .
network – the id of the network the sink is part of
node – the id of the node of the sink
timestamp – the ISO formatted datetime the network execution started
uuid – the uuid of the network run (generated using uuid.uuid1)
An example of a valid target could be:
>>> target = 'vfs://output_mnt/some/path/image_{sample_id}_{cardinality}{ext}'
Note
The {ext} and {extension} fields are very similar, but both are offered. In many cases name.{extension} will feel like the correct way to do it. However, if DataTypes with and without an extension can both be exported by the same sink, this would generate either name.ext or name. (with a trailing dot). In this particular case name{ext} can help, as it will create either name.ext or name (without the trailing dot).
Note
If a datatype has multiple extensions (e.g. .tiff and .tif) the first extension defined in the extension tuple of the datatype will be used.
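The difference between {ext} and {extension} is easy to see with plain str.format. The sketch below assumes the fields are substituted with standard Python format syntax, which matches the example target above but is not confirmed for every case; the helper fill_target and the sample values are hypothetical.

```python
def fill_target(template, *, extension="", **fields):
    """Fill a sink target rule; `ext` includes the dot, `extension` does not."""
    ext = "." + extension if extension else ""
    return template.format(ext=ext, extension=extension, **fields)


template = "vfs://output_mnt/some/path/image_{sample_id}_{cardinality}{ext}"
print(fill_target(template, sample_id="subject1", cardinality=0, extension="nii.gz"))
# vfs://output_mnt/some/path/image_subject1_0.nii.gz

# A datatype without an extension: only {ext} avoids the trailing dot.
print(fill_target("name.{extension}"))  # name.
print(fill_target("name{ext}"))         # name
```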
-
sourcenoderun Module¶
-
class
fastr.execution.sourcenoderun.SourceNodeRun(node, parent)[source]¶ Bases:
fastr.execution.flownoderun.FlowNodeRun
Class providing a connection to data resources. This can be any kind of file, stream, database, etc. from which data can be received.
-
__abstractmethods__= frozenset({})¶
-
__dataschemafile__= 'SourceNodeRun.schema.json'¶
-
__eq__(other)[source]¶ Compare two Node instances with each other. This function ignores the parent and update status, but tests the rest of the dict for equality.
- Parameters
other (NodeRun) – the other instance to compare to
- Returns
True if equal, False otherwise
-
__getstate__()[source]¶ Retrieve the state of the SourceNodeRun
- Returns
the state of the object
- Return type
dict
-
__hash__= None¶
-
__init__(node, parent)[source]¶ Instantiation of the SourceNodeRun.
- Parameters
node (fastr.planning.node.Node) – The Node that this Run is based on.
parent (NetworkRun) – The NetworkRun that this NodeRun belongs to
- Returns
newly created source node run
-
__module__= 'fastr.execution.sourcenoderun'¶
-
__setstate__(state)[source]¶ Set the state of the SourceNodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
create_job(sample_id, sample_index, job_data, job_dependencies, **kwargs)[source]¶ Create a job based on the sample id, job data and job dependencies.
- Parameters
sample_id (SampleId) – the id of the corresponding sample
sample_index (SampleIndex) – the index of the corresponding sample
job_data (dict) – dictionary containing all input data for the job
job_dependencies – other jobs that need to finish before this job can run
- Returns
the created job
- Return type
-
property
datatype¶ The datatype of the data this source supplies.
-
property
dimnames¶ Names of the dimensions in the SourceNodeRun output. These will be reflected in the SampleIdLists.
-
execute()[source]¶ Execute the source node and create the jobs that need to run
- Returns
list of jobs to run
- Return type
list of Jobs
-
property
output¶ Shorthand for self.outputs['output']
-
property
outputsize¶ The size of output of this SourceNodeRun
-
set_data(data, ids=None)[source]¶ Set the data of this source node.
- Parameters
data (dict, OrderedDict or list of urls) – the data to use
ids – if data is a list, a list of accompanying ids
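The accepted input shapes can be illustrated by normalizing them into one ordered mapping from sample id to value. This is a sketch, not the actual set_data code: the function normalize_source_data, the generated default ids and the ValueError are assumptions, and the URLs are made up.

```python
from collections import OrderedDict


def normalize_source_data(data, ids=None):
    """Return an OrderedDict mapping sample id -> value.

    A dict (or OrderedDict) is taken as-is; a list is paired with `ids`.
    The default ids generated here are an assumption for illustration.
    """
    if isinstance(data, dict):
        return OrderedDict(data)
    if ids is None:
        ids = [f"id_{position}" for position in range(len(data))]
    if len(ids) != len(data):
        raise ValueError("ids and data must have the same length")
    return OrderedDict(zip(ids, data))


urls = ["vfs://input_mnt/a.nii.gz", "vfs://input_mnt/b.nii.gz"]
print(normalize_source_data(urls, ids=["subject_a", "subject_b"]))
```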
-
property
sourcegroup¶
-
property
valid¶ This does nothing. It only overloads the valid method of NodeRun(). The original is intended to check if the inputs are connected to some output. Since this class does not implement inputs, it is skipped.
-
-
class
fastr.execution.sourcenoderun.ConstantNodeRun(node, parent)[source]¶ Bases:
fastr.execution.sourcenoderun.SourceNodeRun
Class encapsulating one output for which a value can be set. For example, it can be used to set a scalar value on the input of a node.
-
__abstractmethods__= frozenset({})¶
-
__dataschemafile__= 'ConstantNodeRun.schema.json'¶
-
__getstate__()[source]¶ Retrieve the state of the ConstantNodeRun
- Returns
the state of the object
- Return type
dict
-
__init__(node, parent)[source]¶ Instantiation of the ConstantNodeRun.
- Parameters
datatype – The datatype of the output.
data – the prefilled data to use.
id – The url pattern.
This class should never be instantiated directly (unless you know what you are doing). Instead, create a constant using the network class as shown in the usage example below.
usage example:
>>> import fastr
>>> network = fastr.Network()
>>> source = network.create_source(datatype=types['ITKImageFile'], id_='sourceN')
or alternatively create a constant node by assigning data to an item in an InputDict:
>>> node_a.inputs['in'] = ['some', 'data']
which automatically creates and links a ConstantNodeRun to the specified Input
-
__module__= 'fastr.execution.sourcenoderun'¶
-
__setstate__(state)[source]¶ Set the state of the ConstantNodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
property
data¶ The data stored in this constant node
-