execution
Package¶
This package contains all modules related directly to the execution.
basenoderun
Module¶
-
class
fastr.execution.basenoderun.
BaseNodeRun
[source]¶ Bases:
fastr.abc.updateable.Updateable
,fastr.abc.serializable.Serializable
-
NODE_RUN_MAP
= {'AdvancedFlowNode': <class 'fastr.execution.flownoderun.AdvancedFlowNodeRun'>, 'ConstantNode': <class 'fastr.execution.sourcenoderun.ConstantNodeRun'>, 'FlowNode': <class 'fastr.execution.flownoderun.FlowNodeRun'>, 'MacroNode': <class 'fastr.execution.macronoderun.MacroNodeRun'>, 'Node': <class 'fastr.execution.noderun.NodeRun'>, 'SinkNode': <class 'fastr.execution.sinknoderun.SinkNodeRun'>, 'SourceNode': <class 'fastr.execution.sourcenoderun.SourceNodeRun'>}¶
-
NODE_RUN_TYPES
= {'AdvancedFlowNodeRun': <class 'fastr.execution.flownoderun.AdvancedFlowNodeRun'>, 'ConstantNodeRun': <class 'fastr.execution.sourcenoderun.ConstantNodeRun'>, 'FlowNodeRun': <class 'fastr.execution.flownoderun.FlowNodeRun'>, 'MacroNodeRun': <class 'fastr.execution.macronoderun.MacroNodeRun'>, 'NodeRun': <class 'fastr.execution.noderun.NodeRun'>, 'SinkNodeRun': <class 'fastr.execution.sinknoderun.SinkNodeRun'>, 'SourceNodeRun': <class 'fastr.execution.sourcenoderun.SourceNodeRun'>}¶
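The two mappings above are keyed differently: NODE_RUN_MAP uses the name of the planning node class, while NODE_RUN_TYPES uses the name of the run class itself. A minimal sketch of this kind of name-based lookup follows. All classes and the `make_run` helper are hypothetical stand-ins, and how fastr itself consults these maps is not described here.

```python
class Node:
    """Stand-in for a planning-time node (hypothetical)."""


class SourceNode(Node):
    """Stand-in for a planning-time source node (hypothetical)."""


class NodeRunSketch:
    """Stand-in for the run-time counterpart of Node."""

    def __init__(self, node):
        self.node = node


class SourceNodeRunSketch(NodeRunSketch):
    """Stand-in for the run-time counterpart of SourceNode."""


# Keyed by planning class name, in the spirit of NODE_RUN_MAP
RUN_MAP = {'Node': NodeRunSketch, 'SourceNode': SourceNodeRunSketch}


def make_run(node):
    """Pick the run class by the name of the node's class."""
    run_class = RUN_MAP[type(node).__name__]
    return run_class(node)


run = make_run(SourceNode())
print(type(run).__name__)
```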
-
__abstractmethods__
= frozenset({'_update'})¶
-
__module__
= 'fastr.execution.basenoderun'¶
-
environmentmodules
Module¶
This module contains a class to interact with EnvironmentModules
-
class
fastr.execution.environmentmodules.
EnvironmentModules
(protected=None)[source]¶ Bases:
object
This class can control the module environments in Python. It can list, load and unload environmentmodules. These modules are then used if subprocess is called from Python.
-
__dict__
= mappingproxy({'__module__': 'fastr.execution.environmentmodules', '__doc__': '\n This class can control the module environments in python. It can list, load\n and unload environmentmodules. These modules are then used if subprocess is\n called from python.\n ', '_module_settings_loaded': False, '_module_settings_warning': 'Cannot find Environment Modules home directory (environment variables not setup properly?)', '__init__': <function EnvironmentModules.__init__>, '__repr__': <function EnvironmentModules.__repr__>, 'sync': <function EnvironmentModules.sync>, '_sync_loaded': <function EnvironmentModules._sync_loaded>, '_sync_avail': <function EnvironmentModules._sync_avail>, '_module': <function EnvironmentModules._module>, 'totuple_modvalue': <staticmethod object>, 'tostring_modvalue': <staticmethod object>, '_run_commands_string': <function EnvironmentModules._run_commands_string>, 'loaded_modules': <property object>, 'avail_modules': <property object>, 'avail': <function EnvironmentModules.avail>, 'isloaded': <function EnvironmentModules.isloaded>, 'load': <function EnvironmentModules.load>, 'unload': <function EnvironmentModules.unload>, 'reload': <function EnvironmentModules.reload>, 'swap': <function EnvironmentModules.swap>, 'clear': <function EnvironmentModules.clear>, '__dict__': <attribute '__dict__' of 'EnvironmentModules' objects>, '__weakref__': <attribute '__weakref__' of 'EnvironmentModules' objects>})¶
-
__init__
(protected=None)[source]¶ Create the environmentmodules control object
- Parameters
protected (list) – list of modules that should never be unloaded
- Returns
newly created EnvironmentModules
-
__module__
= 'fastr.execution.environmentmodules'¶
-
__weakref__
¶ list of weak references to the object (if defined)
-
avail
(namestart=None)[source]¶ Print available modules in same way as commandline version
- Parameters
namestart – filter on modules that start with namestart
-
property
avail_modules
¶ List of available modules
-
clear
()[source]¶ Unload all modules (except the protected modules as they cannot be unloaded). This should result in a clean environment.
-
isloaded
(module)[source]¶ Check if a specific module is loaded
- Parameters
module – module to check
- Returns
flag indicating the module is loaded
-
property
loaded_modules
¶ List of currently loaded modules
-
swap
(module1, module2)[source]¶ Swap one module for another one
- Parameters
module1 – module to unload
module2 – module to load
-
sync
()[source]¶ Sync the object with the underlying environment. Re-checks the available and loaded modules
-
static
tostring_modvalue
(value)[source]¶ Turn a representation of a module into a string representation
- Parameters
value – module representation (either str or tuple)
- Returns
string representation
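The interplay of protected modules with load, unload, swap and clear can be sketched with a small in-memory stand-in. This is not the fastr class: `ModuleSetSketch` and the module names are hypothetical, and no real `module` command is invoked.

```python
class ModuleSetSketch:
    """In-memory stand-in for the load/unload bookkeeping (hypothetical)."""

    def __init__(self, protected=None):
        # Modules listed in `protected` are never unloaded
        self.protected = list(protected or [])
        self.loaded = []

    def isloaded(self, module):
        return module in self.loaded

    def load(self, module):
        if module not in self.loaded:
            self.loaded.append(module)

    def unload(self, module):
        if module in self.loaded and module not in self.protected:
            self.loaded.remove(module)

    def swap(self, module1, module2):
        # Unload the first module, then load the second
        self.unload(module1)
        self.load(module2)

    def clear(self):
        # Drop everything except the protected modules
        self.loaded = [m for m in self.loaded if m in self.protected]


env = ModuleSetSketch(protected=['python'])
env.load('python')
env.load('gcc/9')
env.swap('gcc/9', 'gcc/10')
env.clear()
print(env.loaded)
```

After `clear()` only the protected module remains loaded, which matches the "clean environment" described for `clear`.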
-
executionscript
Module¶
The executionscript is the script that wraps around a tool executable. It takes a job, builds the command, executes the command (while profiling it) and collects the results.
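The build, execute, profile and collect cycle described above can be illustrated with the standard library alone. This is a sketch under simple assumptions, not the actual executionscript: `run_and_profile` is a hypothetical helper and only wall-clock time is measured.

```python
import subprocess
import sys
import time


def run_and_profile(command):
    """Run a command, time it, and collect its outputs."""
    start = time.perf_counter()
    completed = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return {
        'command': command,
        'returncode': completed.returncode,
        'stdout': completed.stdout,
        'stderr': completed.stderr,
        'wall_time': elapsed,
    }


# Build the command (here a trivial Python one-liner), execute it, collect results
result = run_and_profile([sys.executable, '-c', 'print("hello")'])
print(result['returncode'], result['stdout'].strip())
```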
flownoderun
Module¶
-
class
fastr.execution.flownoderun.
FlowNodeRun
(node, parent)[source]¶ Bases:
fastr.execution.noderun.NodeRun
A Flow NodeRun is a special subclass of Nodes in which the number of samples can vary per Output. This allows non-default data flows.
-
__abstractmethods__
= frozenset({})¶
-
__module__
= 'fastr.execution.flownoderun'¶
-
property
blocking
¶ A FlowNodeRun is (for the moment) always considered blocking.
- Returns
True
-
property
dimnames
¶ Names of the dimensions in the NodeRun output. These will be reflected in the SampleIdList of this NodeRun.
-
property
outputsize
¶ Size of the outputs in this NodeRun
-
-
class
fastr.execution.flownoderun.
AdvancedFlowNodeRun
(node, parent)[source]¶ Bases:
fastr.execution.flownoderun.FlowNodeRun
-
__abstractmethods__
= frozenset({})¶
-
__module__
= 'fastr.execution.flownoderun'¶
-
inputoutputrun
Module¶
Classes for arranging the input and output for nodes.
Exported classes:
Input – An input for a node (holding datatype).
Output – The output of a node (holding datatype and value).
ConstantOutput – The output of a node (holding datatype and value).
Warning
Don’t mess with the Link, Input and Output internals from other places. There is a huge chance of breaking the network functionality!
-
class
fastr.execution.inputoutputrun.
AdvancedFlowOutputRun
(node_run, template)[source]¶ Bases:
fastr.execution.inputoutputrun.OutputRun
-
__abstractmethods__
= frozenset({})¶
-
__module__
= 'fastr.execution.inputoutputrun'¶
-
-
class
fastr.execution.inputoutputrun.
BaseInputRun
(node_run, template)[source]¶ Bases:
fastr.core.samples.HasSamples
,fastr.planning.inputoutput.BaseInput
Base class for all input runs.
-
__abstractmethods__
= frozenset({'__getitem__', '_update', 'dimensions', 'fullid', 'itersubinputs'})¶
-
__init__
(node_run, template)[source]¶ Instantiate a BaseInput
- Parameters
node – the parent node the input/output belongs to.
description – the
ParameterDescription
describing the input/output.
- Returns
the created BaseInput
- Raises
FastrTypeError – if description is not of class
ParameterDescription
FastrDataTypeNotAvailableError – if the DataType requested cannot be found in the
fastr.types
-
__module__
= 'fastr.execution.inputoutputrun'¶
-
-
class
fastr.execution.inputoutputrun.
InputRun
(node_run, template)[source]¶ Bases:
fastr.execution.inputoutputrun.BaseInputRun
Class representing an input of a node. Such an input will be connected to the output of another node or the output of a constant node to provide the input value.
-
__abstractmethods__
= frozenset({})¶
-
__getitem__
(key)[source]¶ Retrieve an item from this Input.
- Parameters
key (str,
SampleId
or tuple) – the key of the requested item, can be a key str, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int the corresponding
SubInput
will be returned. If the key was a SampleId
or sample index tuple, the corresponding SampleItem
will be returned.
- Return type
SampleItem
or SubInput
- Raises
FastrTypeError – if key is not of a valid type
FastrKeyError – if the key is not found
-
__init__
(node_run, template)[source]¶ Instantiate an input.
- Parameters
template – the Input that the InputRun is based on
-
__module__
= 'fastr.execution.inputoutputrun'¶
-
__setstate__
(state)[source]¶ Set the state of the Input by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
cardinality
(key=None, job_data=None)[source]¶ Cardinality for an Input is the sum of the cardinalities of the SubInputs, unless defined otherwise.
-
property
datatype
¶ The datatype of this Input
-
property
dimensions
¶ The size of the sample collections that can be accessed via this Input.
-
property
fullid
¶ The full defining ID for the Input
-
property
input_group
¶ The id of the
InputGroup
this Input belongs to.
-
itersubinputs
()[source]¶ Iterate over the
SubInputs
in this Input.
- Returns
iterator yielding
SubInput
example:
>>> for subinput in input_a.itersubinputs():
...     print(subinput)
-
-
class
fastr.execution.inputoutputrun.
MacroOutputRun
(node_run, template)[source]¶ Bases:
fastr.execution.inputoutputrun.OutputRun
-
__abstractmethods__
= frozenset({})¶
-
__module__
= 'fastr.execution.inputoutputrun'¶
-
-
class
fastr.execution.inputoutputrun.
NamedSubinputRun
(parent)[source]¶ Bases:
fastr.execution.inputoutputrun.InputRun
A named subinput for cases where the value of an input is a mapping.
-
__abstractmethods__
= frozenset({})¶
-
__getitem__
(key)[source]¶ Retrieve an item (a SubInput) from this NamedSubInput.
- Parameters
key (
int
) – the key of the requested item
- Return type
- Returns
The
SubInput
corresponding with the key will be returned.
- Raises
FastrTypeError – if key is not of a valid type
FastrKeyError – if the key is not found
-
__init__
(parent)[source]¶ Instantiate an input.
- Parameters
template – the Input that the InputRun is based on
-
__module__
= 'fastr.execution.inputoutputrun'¶
-
__str__
()[source]¶ Get a string version for the NamedSubInput
- Returns
the string version
- Return type
-
property
fullid
¶ The full defining ID for the NamedSubInputRun
-
property
item_index
¶
-
-
class
fastr.execution.inputoutputrun.
OutputRun
(node_run, template)[source]¶ Bases:
fastr.planning.inputoutput.BaseOutput
,fastr.core.samples.ContainsSamples
Class representing an output of a node. It holds the output values of the tool that was run. Output fields can be connected to inputs of other nodes.
-
__abstractmethods__
= frozenset({})¶
-
__getitem__
(key)[source]¶ Retrieve an item from this Output. The returned value depends on what type of key is used:
Retrieving data using index tuple: [index_tuple]
Retrieving data using sample_id str: [SampleId]
Retrieving a list of data using SampleId list: [sample_id1, …, sample_idN]
Retrieving a
SubOutput
using an int or slice: [n] or [n:m]
- Parameters
key (int, slice,
SampleId
or tuple) – the key of the requested item, can be a number, slice, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int or slice the corresponding
SubOutput
will be returned (and created if needed). If the key was a SampleId
or sample index tuple, the corresponding SampleItem
will be returned. If the key was a list of SampleId
a tuple of SampleItem
will be returned.
- Return type
SubInput
or SampleItem
or list of SampleItem
- Raises
FastrTypeError – if key is not of a valid type
FastrKeyError – if the parent Node has not been executed
-
__init__
(node_run, template)[source]¶ Instantiate an Output
- Parameters
node – the parent node the output belongs to.
description – the
ParameterDescription
describing the output.
- Returns
created Output
- Raises
FastrTypeError – if description is not of class
ParameterDescription
FastrDataTypeNotAvailableError – if the DataType requested cannot be found in the
fastr.types
-
__module__
= 'fastr.execution.inputoutputrun'¶
-
__setitem__
(key, value)[source]¶ Store an item in the Output
- Parameters
key (tuple of int or
SampleId
) – key of the value to store
value – the value to store
- Returns
None
- Raises
FastrTypeError – if key is not of correct type
-
__setstate__
(state)[source]¶ Set the state of the Output by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
property
automatic
¶ Flag indicating that the Output is generated automatically without being specified on the command line
-
cardinality
(key=None, job_data=None)[source]¶ Cardinality of this Output, may depend on the inputs of the parent Node.
- Parameters
key (tuple of int or
SampleId
) – key for a specific sample, can be sample index or id
- Returns
the cardinality
- Return type
- Raises
FastrCardinalityError – if cardinality references an invalid
Input
FastrTypeError – if the referenced cardinality values type cannot be cast to int
FastrValueError – if the referenced cardinality value cannot be cast to int
-
property
datatype
¶ The datatype of this Output
-
property
fullid
¶ The full defining ID for the Output
-
iterconvergingindices
(collapse_dims)[source]¶ Iterate over all data, but collapse certain dimension to create lists of data.
- Parameters
collapse_dims (iterable of int) – dimension to collapse
- Returns
iterator SampleIndex (possibly containing slices)
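One way to picture indices that "possibly contain slices" is to iterate over the non-collapsed dimensions element by element and put a full slice in every collapsed dimension. The sketch below does this with plain tuples rather than fastr's SampleIndex. `iter_converging_indices` and its `size` argument are hypothetical, and whether fastr uses `slice(None)` or explicit ranges is an assumption.

```python
import itertools


def iter_converging_indices(size, collapse_dims):
    """Yield index tuples over `size`, with slice(None) in collapsed dimensions."""
    collapse_dims = set(collapse_dims)
    # Collapsed dimensions contribute a single "take everything" slice,
    # all other dimensions are iterated element by element
    axes = [
        [slice(None)] if dim in collapse_dims else range(length)
        for dim, length in enumerate(size)
    ]
    for index in itertools.product(*axes):
        yield index


indices = list(iter_converging_indices((2, 3), collapse_dims=[1]))
print(indices)
```

For a 2 by 3 sample collection with dimension 1 collapsed, this yields two indices, each selecting a whole row as one list of data.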
-
property
preferred_types
¶ The list of preferred
DataTypes
for this Output.
-
property
resulting_datatype
¶ The
DataType
that the results of this Output will have.
-
property
samples
¶ The SampleCollection of the samples in this Output. None if the NodeRun has not yet been executed. Otherwise a SampleCollection.
-
property
valid
¶ Check if the output is valid, i.e. has a valid cardinality
-
-
class
fastr.execution.inputoutputrun.
SourceOutputRun
(node_run, template)[source]¶ Bases:
fastr.execution.inputoutputrun.OutputRun
Output for a SourceNodeRun, this type of Output determines the cardinality in a different way than a normal NodeRun.
-
__abstractmethods__
= frozenset({})¶
-
__getitem__
(item)[source]¶ Retrieve an item from this Output. The returned value depends on what type of key is used:
Retrieving data using index tuple: [index_tuple]
Retrieving data using sample_id str: [SampleId]
Retrieving a list of data using SampleId list: [sample_id1, …, sample_idN]
Retrieving a
SubOutput
using an int or slice: [n] or [n:m]
- Parameters
key (int, slice,
SampleId
or tuple) – the key of the requested item, can be a number, slice, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int or slice the corresponding
SubOutput
will be returned (and created if needed). If the key was a SampleId
or sample index tuple, the corresponding SampleItem
will be returned. If the key was a list of SampleId
a tuple of SampleItem
will be returned.
- Return type
SubInput
or SampleItem
or list of SampleItem
- Raises
FastrTypeError – if key is not of a valid type
FastrKeyError – if the parent NodeRun has not been executed
-
__init__
(node_run, template)[source]¶ Instantiate a FlowOutput
- Parameters
node – the parent node the output belongs to.
description – the
ParameterDescription
describing the output.
- Returns
created FlowOutput
- Raises
FastrTypeError – if description is not of class
ParameterDescription
FastrDataTypeNotAvailableError – if the DataType requested cannot be found in the
fastr.types
-
__module__
= 'fastr.execution.inputoutputrun'¶
-
__setitem__
(key, value)[source]¶ Store an item in the Output
- Parameters
key (tuple of int or
SampleId
) – key of the value to store
value – the value to store
- Returns
None
- Raises
FastrTypeError – if key is not of correct type
-
cardinality
(key=None, job_data=None)[source]¶ Cardinality of this SourceOutput, may depend on the inputs of the parent NodeRun.
-
property
dimensions
¶ The dimensions of this SourceOutputRun
-
property
linearized
¶ A linearized version of the sample data. This is a lazily cached, linearized version of the underlying SampleCollection.
-
property
ndims
¶ The number of dimensions in this SourceOutput
-
property
size
¶ The sample size of the SourceOutput
-
-
class
fastr.execution.inputoutputrun.
SubInputRun
(input_)[source]¶ Bases:
fastr.execution.inputoutputrun.BaseInputRun
This class is used by
Input
to allow for multiple links to an Input. The SubInput class can hold only a single Link to a (Sub)Output, but otherwise behaves very similarly to an Input.
-
__abstractmethods__
= frozenset({})¶
-
__getitem__
(key)[source]¶ Retrieve an item from this SubInput.
- Parameters
key (int,
SampleId
or SampleIndex
) – the key of the requested item, can be a number, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int the corresponding
SubInput
will be returned. If the key was a SampleId
or sample index tuple, the corresponding SampleItem
will be returned.
- Return type
SampleItem
or SubInput
- Raises
FastrTypeError – if key is not of a valid type
Note
As a SubInput has only one SubInput, only requesting int key 0 or -1 is allowed, and it will return self
-
__getstate__
()[source]¶ Retrieve the state of the SubInput
- Returns
the state of the object
- Return type
dict
-
__init__
(input_)[source]¶ Instantiate a SubInput.
- Parameters
input (
Input
) – the parent of this SubInput.
- Returns
the created SubInput
-
__module__
= 'fastr.execution.inputoutputrun'¶
-
__setstate__
(state)[source]¶ Set the state of the SubInput by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
cardinality
(key=None, job_data=None)[source]¶ Get the cardinality for this SubInput. The cardinality for a SubInput is defined by the incoming link.
-
property
description
¶ The description object of this input/output
-
property
dimensions
¶ The sample size of the SubInput
-
property
fullid
¶ The full defining ID for the SubInput
-
property
input_group
¶ The id of the
InputGroup
this SubInput's parent belongs to.
-
property
item_index
¶
-
iteritems
()[source]¶ Iterate over the
SampleItems
that are in the SubInput.
- Returns
iterator yielding
SampleItem
objects
-
itersubinputs
()[source]¶ Iterate over SubInputs (for a SubInput it will yield self and stop iterating after that)
- Returns
iterator yielding
SubInput
example:
>>> for subinput in input_a.itersubinputs():
...     print(subinput)
-
property
node
¶ The Node to which this SubInput's parent belongs
-
-
class
fastr.execution.inputoutputrun.
SubOutputRun
(output, index)[source]¶ Bases:
fastr.execution.inputoutputrun.OutputRun
The SubOutput is an Output that represents a slice of another Output.
-
__abstractmethods__
= frozenset({})¶
-
__getitem__
(key)[source]¶ Retrieve an item from this SubOutput. The returned value depends on what type of key is used:
Retrieving data using index tuple: [index_tuple]
Retrieving data using sample_id str: [SampleId]
Retrieving a list of data using SampleId list: [sample_id1, …, sample_idN]
Retrieving a
SubOutput
using an int or slice: [n] or [n:m]
- Parameters
key (int, slice,
SampleId
or tuple) – the key of the requested item, can be a number, slice, sample index tuple or a SampleId
- Returns
the return value depends on the requested key. If the key was an int or slice the corresponding
SubOutput
will be returned (and created if needed). If the key was a SampleId
or sample index tuple, the corresponding SampleItem
will be returned. If the key was a list of SampleId
a tuple of SampleItem
will be returned.
- Return type
SubInput
or SampleItem
or list of SampleItem
- Raises
FastrTypeError – if key is not of a valid type
-
__getstate__
()[source]¶ Retrieve the state of the SubOutput
- Returns
the state of the object
- Return type
dict
-
__init__
(output, index)[source]¶ Instantiate a SubOutput
- Parameters
- Returns
created SubOutput
- Raises
FastrTypeError – if the output argument is not an instance of
Output
FastrTypeError – if the index argument is not an
int
or slice
-
__module__
= 'fastr.execution.inputoutputrun'¶
-
__setitem__
(key, value)[source]¶ A function blocking the assignment operator. Values cannot be assigned to a SubOutput.
- Raises
FastrNotImplementedError – if called
-
__setstate__
(state)[source]¶ Set the state of the SubOutput by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
cardinality
(key=None, job_data=None)[source]¶ Cardinality of this SubOutput depends on the parent Output and
self.index
- Parameters
key (tuple of int or
SampleId
) – key for a specific sample, can be sample index or id
- Returns
the cardinality
- Return type
- Raises
FastrCardinalityError – if cardinality references an invalid
Input
FastrTypeError – if the referenced cardinality values type cannot be cast to int
FastrValueError – if the referenced cardinality value cannot be cast to int
-
property
datatype
¶ The datatype of this SubOutput
-
property
fullid
¶ The full defining ID for the SubOutput
-
property
indexrep
¶ Simple representation of the index.
-
property
node
¶ The NodeRun to which this SubOutput belongs
-
property
preferred_types
¶ The list of preferred
DataTypes
for this SubOutput.
-
property
resulting_datatype
¶ The
DataType
that the results of this SubOutput will have.
-
property
samples
¶ The
SampleCollection
for this SubOutput
-
job
Module¶
This module contains the Job class and some related classes.
-
class
fastr.execution.job.
InlineJob
(*args, **kwargs)[source]¶ Bases:
fastr.execution.job.Job
Job that does not actually need to run but is used for consistency in data processing and logging.
-
__init__
(*args, **kwargs)[source]¶ Create a job
- Parameters
node (fastr.planning.node.Node) – the node the job is based on
sample_id – the id of the sample
sample_index – the index of the sample
input_arguments – the argument list
output_arguments – the argument list
hold_jobs – the jobs on which this job depends
preferred_types – The list of preferred types to use
- Returns
-
__module__
= 'fastr.execution.job'¶
-
-
class
fastr.execution.job.
Job
(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, preferred_types=None)[source]¶ Bases:
fastr.abc.serializable.Serializable
Class describing a job.
Arguments:
tool_name - the name of the tool (str)
tool_version - the version of the tool (Version)
argument - the arguments used when calling the tool (list)
tmpdir - temporary directory to use to store output data
hold_jobs - list of jobs that need to be finished before this job can run (list)
-
COMMAND_DUMP
= '__fastr_command__.pickle.gz'¶
-
INFO_DUMP
= '__fastr_extra_job_info__.json'¶
-
PROV_DUMP
= '__fastr_prov__.json'¶
-
RESULT_DUMP
= '__fastr_result__.pickle.gz'¶
-
STDERR_DUMP
= '__fastr_stderr__.txt'¶
-
STDOUT_DUMP
= '__fastr_stdout__.txt'¶
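The `.pickle.gz` suffix of COMMAND_DUMP and RESULT_DUMP suggests gzip-compressed pickle files. Assuming that is the case, such a file can be written and read back with the standard library as sketched below. `write_dump` and `read_dump` are hypothetical helpers, not fastr functions, and the dictionary stands in for whatever object fastr actually stores.

```python
import gzip
import os
import pickle
import tempfile

RESULT_DUMP = '__fastr_result__.pickle.gz'  # file name taken from the constant above


def write_dump(directory, data):
    """Write data as a gzip-compressed pickle and return the file path."""
    path = os.path.join(directory, RESULT_DUMP)
    with gzip.open(path, 'wb') as handle:
        pickle.dump(data, handle)
    return path


def read_dump(path):
    """Read a gzip-compressed pickle back into a Python object."""
    with gzip.open(path, 'rb') as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as job_dir:
    dump_path = write_dump(job_dir, {'status': 'finished'})
    restored = read_dump(dump_path)

print(restored)
```

Unpickling executes code embedded in the file, so this should only be done with dumps from a trusted source.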
-
__init__
(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, preferred_types=None)[source]¶ Create a job
- Parameters
node (fastr.planning.node.Node) – the node the job is based on
sample_id (SampleId) – the id of the sample
sample_index (SampleIndex) – the index of the sample
input_arguments (Dict[str, SampleIndex]) – the argument list
hold_jobs (Optional[List[str]]) – the jobs on which this job depends
preferred_types (Optional[List]) – The list of preferred types to use
- Returns
-
__module__
= 'fastr.execution.job'¶
-
static
cast_to_type
(value, datatypes)[source]¶ Try to cast value to one of the given datatypes. Will try all the datatypes in order.
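The "try every datatype in order" strategy can be sketched with plain Python callables in place of fastr DataTypes. `cast_to_first_matching` is a hypothetical helper, and returning None when nothing matches is an assumption; the behaviour of `cast_to_type` in that case is not stated here.

```python
def cast_to_first_matching(value, casters):
    """Return value converted by the first caster that accepts it."""
    for caster in casters:
        try:
            return caster(value)
        except (TypeError, ValueError):
            # This caster rejected the value, try the next one
            continue
    # Assumption: signal failure with None; fastr's behaviour may differ
    return None


print(cast_to_first_matching('3.5', [int, float, str]))
```

The order of the candidates matters: with `[str, int, float]` every value would be accepted by `str` before the numeric types are tried.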
-
property
commandfile
¶ The path of the command pickle
-
property
commandurl
¶ The url of the command pickle
-
create_payload
()[source]¶ Create the payload for this object based on all the input/output arguments
- Returns
the payload
- Return type
-
property
extrainfofile
¶ The path where the extra job info document is saved
-
property
extrainfourl
¶ The url where the extra job info document is saved
-
classmethod
fill_output_argument
(output_spec, cardinality, desired_type, requested, tmpurl)[source]¶ This is an abstract class method. The method should take the argument_dict generated from calling self.get_argument_dict() and turn it into a list of commandline arguments that represent this Input/Output.
-
property
fullid
¶ The full id of the job
-
get_deferred
(output_id, cardinality_nr, sample_id=None)[source]¶ Get a deferred pointing to a specific output value in the Job
-
get_result
()[source]¶ Get the result of the job if it is available. Load the output file if found and check if the job matches the current object. If so, load and return the result.
- Returns
Job after execution or None if not available
- Return type
Job | None
-
classmethod
get_value
(value)[source]¶ Get a value
- Parameters
value – the url of the value
datatype – datatype of the value
- Returns
the retrieved value
-
property
id
¶ The id of this job
-
property
logfile
¶ The path of the result pickle
-
property
logurl
¶ The url of the result pickle
-
property
provfile
¶ The path where the prov document is saved
-
property
provurl
¶ The url where the prov document is saved
-
property
resources
¶ The compute resources required for this job
-
property
status
¶ The status of the job
-
property
stderrfile
¶ The path where the stderr text is saved
-
property
stderrurl
¶ The url where the stderr text is saved
-
property
stdoutfile
¶ The path where the stdout text is saved
-
property
stdouturl
¶ The url where the stdout text is saved
-
property
tmpurl
¶ The URL of the tmpdir to use
-
property
tool
¶
-
classmethod
translate_argument
(value)[source]¶ Translate an argument from a URL to an actual path.
- Parameters
value – value to translate
datatype – the datatype of the value
- Returns
the translated value
-
static
translate_output_results
(value, datatypes, mountpoint=None)[source]¶ Translate the results for an Output
- Parameters
value – the results value for the output
datatypes (tuple) – tuple of possible datatypes for the output
preferred_type – the preferred datatype of the output
- Returns
the updated value for the result
-
translate_results
(result)[source]¶ Translate the results of an interface (using paths etc) to the proper form using URIs instead.
-
-
class
fastr.execution.job.
JobCleanupLevel
[source]¶ Bases:
enum.Enum
The cleanup level for Jobs that are finished.
-
__module__
= 'fastr.execution.job'¶
-
all
= 'all'¶
-
no_cleanup
= 'no_cleanup'¶
-
non_failed
= 'non_failed'¶
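The three levels can be read as a policy for which finished jobs have their files removed. The sketch below encodes one plausible reading of the member names (`all` cleans every job, `no_cleanup` cleans none, `non_failed` keeps the files of failed jobs for inspection). This interpretation, the stand-in enum and the `should_clean` helper are assumptions, not fastr code.

```python
from enum import Enum


class CleanupLevelSketch(Enum):
    """Stand-in with the same member names and values as listed above."""
    all = 'all'
    no_cleanup = 'no_cleanup'
    non_failed = 'non_failed'


def should_clean(level, job_failed):
    """Decide whether a finished job's files would be removed (assumed semantics)."""
    if level is CleanupLevelSketch.all:
        return True
    if level is CleanupLevelSketch.no_cleanup:
        return False
    # non_failed: only clean up jobs that did not fail
    return not job_failed


print(should_clean(CleanupLevelSketch('non_failed'), job_failed=True))
```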
-
-
class
fastr.execution.job.
JobState
(_, stage, error)[source]¶ Bases:
enum.Enum
The possible states a Job can be in. An overview of the states and the advised transitions are depicted in the following figure:
-
__module__
= 'fastr.execution.job'¶
-
cancelled
= ('cancelled', 'done', True)¶
-
created
= ('created', 'idle', False)¶
-
property
done
¶
-
execution_done
= ('execution_done', 'in_progress', False)¶
-
execution_failed
= ('execution_failed', 'in_progress', True)¶
-
failed
= ('failed', 'done', True)¶
-
finished
= ('finished', 'done', False)¶
-
hold
= ('hold', 'idle', False)¶
-
property
idle
¶
-
property
in_progress
¶
-
nonexistent
= ('nonexistent', 'idle', False)¶
-
processing_callback
= ('processing_callback', 'in_progress', False)¶
-
queued
= ('queued', 'idle', False)¶
-
running
= ('running', 'in_progress', False)¶
-
-
class
fastr.execution.job.
SinkJob
(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, substitutions=None, preferred_types=None)[source]¶ Bases:
fastr.execution.job.Job
Special SinkJob for the Sink
-
__init__
(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, substitutions=None, preferred_types=None)[source]¶ Create a job
- Parameters
node (fastr.planning.node.Node) – the node the job is based on
sample_id – the id of the sample
sample_index – the index of the sample
input_arguments – the argument list
output_arguments – the argument list
hold_jobs – the jobs on which this job depends
preferred_types – The list of preferred types to use
- Returns
-
__module__
= 'fastr.execution.job'¶
-
create_payload
()[source]¶ Create the payload for this object based on all the input/output arguments
- Returns
the payload
- Return type
-
get_result
()[source]¶ Get the result of the job if it is available. Load the output file if found and check if the job matches the current object. If so, load and return the result.
- Returns
Job after execution
-
property
id
¶ The id of this job
-
substitute
(value, datatype=None)[source]¶ Substitute the special fields that can be used in a SinkJob.
- Parameters
value (str) – the value to substitute fields in
datatype (BaseDataType) – the datatype for the value
- Returns
string with substitutions performed
- Return type
-
property
tmpurl
¶ The URL of the tmpdir to use
-
-
class
fastr.execution.job.
SourceJob
(node, sample_id, sample_index, input_arguments, output_arguments, hold_jobs=None, preferred_types=None)[source]¶ Bases:
fastr.execution.job.Job
Special SourceJob for the Source
-
__module__
= 'fastr.execution.job'¶
-
linkrun
Module¶
The link module contains the Link class. This class represents the links in a network. These links lead from an output (BaseOutput) to an input (BaseInput) and indicate the desired data flow. Links are smart objects, in the sense that when you set their start or end point, they register themselves with the Input and Output. They do all the bookkeeping, so as long as you only set the source and target of the Link, the link should be valid.
Warning
Don’t mess with the Link, Input and Output internals from other places. There is a huge chance of breaking the network functionality!
-
class
fastr.execution.linkrun.
LinkRun
(link, parent=None)[source]¶ Bases:
fastr.abc.updateable.Updateable
,fastr.abc.serializable.Serializable
Class for linking outputs (BaseOutput) to inputs (BaseInput).
Examples:
>>> import fastr
>>> network = fastr.Network()
>>> link1 = network.create_link(n1.outputs['out1'], n2.inputs['in2'])
>>> link2 = Link()
>>> link2.source = n1.outputs['out1']
>>> link2.target = n2.inputs['in2']
-
__abstractmethods__
= frozenset({})¶
-
__dataschemafile__
= 'Link.schema.json'¶
-
__getitem__
(index)[source]¶ Get an item for this Link. The item will be retrieved from the connected output, but a diverging or converging flow can change the number of samples/cardinality.
- Parameters
index (SampleIndex) – index of the item to retrieve
- Returns
the requested item
- Return type
SampleItem
- Raises
FastrIndexError – if the index length does not match the number of dimensions in the source data (after collapsing/expanding)
-
__hash__
= None¶
-
__init__
(link, parent=None)[source]¶ Create a new Link in a Network.
- Parameters
- Returns
newly created LinkRun
- Raises
FastrValueError – if parent is not given and fastr.current_network is not set
FastrValueError – if the source output is not in the same network as the Link
FastrValueError – if the target input is not in the same network as the Link
-
__module__
= 'fastr.execution.linkrun'¶
-
__repr__
()[source]¶ Get a string representation for the Link
- Returns
the string representation
- Return type
-
__setstate__
(state)[source]¶ Set the state of the Link by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
- Raises
FastrValueError – if the parent network and fastr.current_network are not set
-
cardinality
(index=None)[source]¶ Cardinality for a Link is given by source Output and the collapse/expand settings
- Parameters
key (SampleIndex) – key for a specific sample (can be only a sample index!)
- Returns
the cardinality
- Return type
int, sympy.Symbol
- Raises
FastrIndexError – if the index length does not match the number of dimensions in the data
-
property
collapse
¶ The converging dimensions of this link. Collapsing changes some dimensions of sample lists into cardinality, reshaping the data.
Collapse can be set to a tuple or an int/str, in which case it will be automatically wrapped in a tuple. The int will be seen as indices of the dimensions to collapse. The str will be seen as the name of the dimensions over which to collapse.
- Raises
FastrTypeError – if assigning a collapse value of a wrong type
-
property
collapse_indexes
¶ The converging dimensions of this link as integers. Dimension names are replaced with the corresponding int.
Collapsing changes some dimensions of sample lists into cardinality, reshaping the data
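The two behaviours described for `collapse` and `collapse_indexes` (wrapping a lone int or str in a tuple, and replacing dimension names by their integer position) can be sketched as two small functions. Both helpers are hypothetical; the built-in TypeError stands in for FastrTypeError, and the dimension names are made up for illustration.

```python
def normalize_collapse(value):
    """Wrap a single int/str in a tuple and validate the element types."""
    if isinstance(value, (int, str)):
        value = (value,)
    if not isinstance(value, tuple) or not all(
        isinstance(item, (int, str)) for item in value
    ):
        raise TypeError('collapse must be an int, a str, or a tuple of those')
    return value


def collapse_to_indexes(collapse, dimnames):
    """Replace dimension names by their position in dimnames."""
    return tuple(
        dimnames.index(item) if isinstance(item, str) else item
        for item in collapse
    )


dims = ('subject', 'scan')  # illustrative dimension names
print(collapse_to_indexes(normalize_collapse('scan'), dims))
```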
-
classmethod
createobj
(state, network=None)[source]¶ Create object function for Link
- Parameters
cls – The class to create
state – The state to use to create the Link
network – the parent Network
- Returns
newly created Link
-
destroy
()[source]¶ The destroy function of a link removes all default references to a link. This means the references in the network, input and output connected to this link. If there are no references in other places in the code, it will destroy the link (reference count dropping to zero).
This function is called when a source for an input is set to another value and the link becomes disconnected. This makes sure there are no dangling links.
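The reference-dropping behaviour can be illustrated with a small stand-in class. This sketch assumes CPython-style reference counting and uses plain lists in place of the network, output and input objects; `LinkStub` and its attribute names are hypothetical.

```python
import weakref


class LinkStub:
    """Hypothetical stand-in for a link that registers itself on creation."""

    def __init__(self, network_links, output_listeners, input_sources):
        # Remember every container that holds a default reference to us.
        self._holders = (network_links, output_listeners, input_sources)
        for holder in self._holders:
            holder.append(self)

    def destroy(self):
        # Remove the default references; the object itself goes away only
        # once nothing else refers to it.
        for holder in self._holders:
            holder.remove(self)


network_links, output_listeners, input_sources = [], [], []
link = LinkStub(network_links, output_listeners, input_sources)
probe = weakref.ref(link)  # a weak reference does not keep the link alive

link.destroy()
del link  # drop the last strong reference held by this script
print(probe() is None)
```

Under reference counting the weak reference is cleared as soon as the last strong reference disappears. Any reference kept elsewhere in user code would keep the link alive.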
-
property
dimensions
¶ The dimensions of the data delivered by the link. This can be different from the source dimensions because the link can make data collapse or expand.
-
property
expand
¶ Flag indicating that the link will expand the cardinality into a new sample dimension to be created.
-
property
fullid
¶ The full defining ID for the Link
-
property
parent
¶ The Network to which this Link belongs.
-
property
size
¶ The size of the data delivered by the link. This can be different from the source size because the link can make data collapse or expand.
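How collapsing and expanding change the delivered size can be sketched with two small functions. The sketch assumes every sample has the same cardinality and that an expanded dimension is appended last; both are simplifying assumptions for illustration, and neither function is part of fastr.

```python
from math import prod


def collapse_size(size, cardinality, collapse_indexes):
    """Fold the dimensions listed in ``collapse_indexes`` into the cardinality.

    Hypothetical helper: returns the remaining size and the new cardinality.
    """
    kept = tuple(n for i, n in enumerate(size) if i not in collapse_indexes)
    folded = prod(n for i, n in enumerate(size) if i in collapse_indexes)
    return kept, cardinality * folded


def expand_size(size, cardinality):
    """Turn the cardinality into a new sample dimension (appended last here)."""
    return size + (cardinality,), 1


# A 3 x 4 sample grid with cardinality 2, collapsed over dimension 1.
print(collapse_size((3, 4), 2, (1,)))  # ((3,), 8)
# Expanding the result moves the cardinality back into a dimension.
print(expand_size((3,), 8))  # ((3, 8), 1)
```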
-
property
source
¶ The source
BaseOutput
of the Link. Setting the source will automatically register the Link with the source BaseOutput. Updating source will also make sure the Link is unregistered with the previous source.
- Raises
FastrTypeError – if assigning a non
BaseOutput
-
property
status
¶
-
property
target
¶ The target
BaseInput
of the Link. Setting the target will automatically register the Link with the target BaseInput. Updating target will also make sure the Link is unregistered with the previous target.
- Raises
FastrTypeError – if assigning a non
BaseInput
-
macronoderun
Module¶
-
class
fastr.execution.macronoderun.
MacroNodeRun
(node, parent)[source]¶ Bases:
fastr.execution.noderun.NodeRun
MacroNodeRun encapsulates an entire network in a single node.
-
__abstractmethods__
= frozenset({})¶
-
__getstate__
()[source]¶ Retrieve the state of the MacroNodeRun
- Returns
the state of the object
- Return type
dict
-
__init__
(node, parent)[source]¶ - Parameters
network (fastr.planning.network.Network) – network to create macronode for
-
__module__
= 'fastr.execution.macronoderun'¶
-
__setstate__
(state)[source]¶ Set the state of the NodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
execute
()[source]¶ Execute the node and create the jobs that need to run
- Returns
list of jobs to run
- Return type
list of
Jobs
-
property
network_run
¶
-
networkanalyzer
Module¶
Module that defines the NetworkAnalyzer and holds the reference implementation.
-
class
fastr.execution.networkanalyzer.
DefaultNetworkAnalyzer
[source]¶ Bases:
fastr.execution.networkanalyzer.NetworkAnalyzer
Default implementation of the NetworkAnalyzer.
-
__module__
= 'fastr.execution.networkanalyzer'¶
-
-
class
fastr.execution.networkanalyzer.
NetworkAnalyzer
[source]¶ Bases:
object
Base class for NetworkAnalyzers
-
__dict__
= mappingproxy({'__module__': 'fastr.execution.networkanalyzer', '__doc__': '\n Base class for NetworkAnalyzers\n ', 'analyze_network': <function NetworkAnalyzer.analyze_network>, '__dict__': <attribute '__dict__' of 'NetworkAnalyzer' objects>, '__weakref__': <attribute '__weakref__' of 'NetworkAnalyzer' objects>})¶
-
__module__
= 'fastr.execution.networkanalyzer'¶
-
__weakref__
¶ list of weak references to the object (if defined)
-
networkchunker
Module¶
This module contains the NetworkChunker class and its default implementation the DefaultNetworkChunker
-
class
fastr.execution.networkchunker.
DefaultNetworkChunker
[source]¶ Bases:
fastr.execution.networkchunker.NetworkChunker
The default implementation of the NetworkChunker. It tries to create chunks that are as large as possible, so that execution blocks as little as possible.
-
__module__
= 'fastr.execution.networkchunker'¶
-
chunck_network
(network)[source]¶ Create a list of Network chunks that can be pre-analyzed completely. Each chunk needs to be executed before the next can be analyzed and executed.
The returned chunks are (at the moment) in the format of a tuple (start, nodes), both of which are tuples. They contain, respectively, the nodes at which to start execution (these should be ready once the previous chunks are done) and all nodes of the chunk.
- Parameters
network – Network to split into chunks
- Returns
tuple containing chunks
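The idea of splitting a network at blocking nodes into (start, nodes) chunks can be sketched as follows. This is not the DefaultNetworkChunker algorithm: it is a simplified illustration that assumes the network is given as a plain dict mapping each node id to the ids of the nodes feeding it, plus a set of blocking node ids. All names are hypothetical.

```python
def chunk_network(upstream, blocking):
    """Group the nodes of a DAG into chunks separated by blocking nodes.

    ``upstream`` maps node id -> list of node ids feeding that node.
    ``blocking`` is the set of node ids whose output is only known after
    execution. Returns a tuple of (start, nodes) tuples.
    """
    level = {}

    def get_level(node):
        # A node moves one chunk further for every blocking node on the
        # longest path leading to it.
        if node not in level:
            level[node] = max(
                (get_level(u) + (1 if u in blocking else 0) for u in upstream[node]),
                default=0,
            )
        return level[node]

    for node in upstream:
        get_level(node)

    chunks = []
    for current in range(max(level.values(), default=-1) + 1):
        nodes = tuple(n for n in upstream if level[n] == current)
        # Start nodes have no inputs coming from within the same chunk.
        start = tuple(
            n for n in nodes if all(level[u] != current for u in upstream[n])
        )
        chunks.append((start, nodes))
    return tuple(chunks)


upstream = {
    "source": [],
    "preprocess": ["source"],
    "split": ["preprocess"],
    "analyse": ["split"],
    "sink": ["analyse"],
}
for start, nodes in chunk_network(upstream, blocking={"split"}):
    print(start, nodes)
```

In this example the blocking node `split` closes the first chunk, and everything downstream of it lands in the second chunk, which starts at `analyse`.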
-
-
class
fastr.execution.networkchunker.
NetworkChunker
[source]¶ Bases:
object
The base class for NetworkChunkers. A Network chunker is a class that takes a Network and produces a list of chunks that can each be analyzed and executed in one go.
-
__dict__
= mappingproxy({'__module__': 'fastr.execution.networkchunker', '__doc__': '\n The base class for NetworkChunkers. A Network chunker is a class that takes\n a Network and produces a list of chunks that can each be analyzed and\n executed in one go.\n ', 'chunck_network': <function NetworkChunker.chunck_network>, '__dict__': <attribute '__dict__' of 'NetworkChunker' objects>, '__weakref__': <attribute '__weakref__' of 'NetworkChunker' objects>})¶
-
__module__
= 'fastr.execution.networkchunker'¶
-
__weakref__
¶ list of weak references to the object (if defined)
-
networkrun
Module¶
Network module containing Network facilitators and analysers.
-
class
fastr.execution.networkrun.
NetworkRun
(network)[source]¶ Bases:
fastr.abc.serializable.Serializable
The Network class represents a workflow. This includes all Nodes (including ConstantNodes, SourceNodes and Sinks) and Links.
-
NETWORK_DUMP_FILE_NAME
= '__fastr_network__.json'¶
-
SINK_DUMP_FILE_NAME
= '__sink_data__.json'¶
-
SOURCE_DUMP_FILE_NAME
= '__source_data__.pickle.gz'¶
-
__getitem__
(item)[source]¶ Get an item by its fullid. The fullid can point to a link, node, input, output or even subinput/suboutput.
- Parameters
item (str,unicode) – fullid of the item to retrieve
- Returns
the requested item
-
__getstate__
()[source]¶ Retrieve the state of the Network
- Returns
the state of the object
- Return type
dict
-
__hash__
= None¶
-
__module__
= 'fastr.execution.networkrun'¶
-
__setstate__
(state)[source]¶ Set the state of the Network by the given state. This completely overwrites the old state!
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
check_id
(id_)[source]¶ Check if an id for an object is valid and unused in the Network. The method always returns True if it does not raise an exception.
- Parameters
id_ (str) – the id to check
- Returns
True
- Raises
FastrValueError – if the id is not correctly formatted
FastrValueError – if the id is already in use
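The return-True-or-raise contract can be sketched as below. The id pattern is an assumption made for the example (this reference does not state the format fastr accepts), the collection of used ids is passed in explicitly, and `ValueError` stands in for FastrValueError.

```python
import re

# Assumed format for illustration only; the pattern fastr accepts is not
# given in this reference.
ID_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_id(id_, used_ids):
    """Return True if ``id_`` is well formed and unused, raise otherwise."""
    if not ID_PATTERN.match(id_):
        raise ValueError(f"id {id_!r} is not correctly formatted")
    if id_ in used_ids:
        raise ValueError(f"id {id_!r} is already in use")
    return True


used = {"source1", "sink1"}
print(check_id("node_a", used))
```

Because the function never returns False, callers can use it as a guard statement and let the exception carry the reason for rejection.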
-
property
constantlist
¶
-
execute
(sourcedata, sinkdata, execution_plugin=None, tmpdir=None, cluster_queue=None, timestamp=None)[source]¶ Execute the Network with the given data. This will analyze the Network, create jobs and send them to the execution backend of the system.
- Parameters
- Raises
FastrKeyError – if a source has no corresponding key in sourcedata
FastrKeyError – if a sink has no corresponding key in sinkdata
-
property
fullid
¶ The fullid of the Network
-
property
global_id
¶ The global id of the Network, this is different for networks used in macronodes, as they still have parents.
-
property
id
¶ The id of the Network. This is a read only property.
-
job_finished
(job)[source]¶ Call-back handler for when a job is finished. Will collect the results and handle blocking jobs. This function is automatically called when the execution plugin finished a job.
- Parameters
job (
Job
) – the job that finished
-
property
long_id
¶
-
property
network
¶
-
property
nodegroups
¶ Give an overview of the nodegroups in the network
-
register_signals
()[source]¶ Register handlers for SIGINT and SIGTERM to gracefully shut down the execution.
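Registering one handler for both signals with the standard `signal` module could look like this. The sketch shows the general pattern only and makes no claim about what fastr's handler does; `RunStub`, `register_signals` and `restore_signals` are hypothetical names, and `signal.signal` must be called from the main thread.

```python
import signal


class RunStub:
    """Hypothetical stand-in for a running network."""

    def __init__(self):
        self.abort_requested = False

    def abort(self, signum, frame):
        # Only set a flag here; the main loop is expected to notice it
        # and shut down cleanly.
        self.abort_requested = True


def register_signals(handler):
    """Install ``handler`` for SIGINT and SIGTERM; return the old handlers."""
    previous = {}
    for signum in (signal.SIGINT, signal.SIGTERM):
        previous[signum] = signal.signal(signum, handler)
    return previous


def restore_signals(previous):
    """Put back the handlers returned by ``register_signals``."""
    for signum, handler in previous.items():
        if handler is not None:  # None means it was not installed from Python
            signal.signal(signum, handler)


run = RunStub()
previous = register_signals(run.abort)
print(signal.getsignal(signal.SIGTERM) == run.abort)
restore_signals(previous)
```

Keeping the previous handlers makes it possible to restore the default behaviour once the run has finished.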
-
property
sinklist
¶
-
property
sourcelist
¶
-
noderun
Module¶
A module to maintain a run of a network node.
-
class
fastr.execution.noderun.
NodeRun
(node, parent)[source]¶ Bases:
fastr.execution.basenoderun.BaseNodeRun
The class encapsulating a node in the network. The node is responsible for setting and checking inputs and outputs based on the description provided by a tool instance.
-
__abstractmethods__
= frozenset({})¶
-
__dataschemafile__
= 'NodeRun.schema.json'¶
-
__eq__
(other)[source]¶ Compare two Node instances with each other. This function ignores the parent and update status, but tests the rest of the dict for equality.
- Parameters
other (NodeRun) – the other instance to compare to
- Returns
True if equal, False otherwise
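An equality test that skips selected attributes can be sketched as follows. The attribute names `parent` and `status` are assumptions chosen for the example, and `RunStub` is not a fastr class. The sketch also shows why `__hash__` is `None` for such classes: instances compared by their contents are not usable as hashable keys.

```python
class RunStub:
    """Hypothetical class comparing instances by their non-ignored attributes."""

    # Attributes left out of the comparison (assumed names).
    IGNORED = ("parent", "status")

    # Python sets __hash__ to None automatically when __eq__ is defined
    # without __hash__; it is spelled out here for clarity.
    __hash__ = None

    def __init__(self, id_, parent, status="created"):
        self.id = id_
        self.parent = parent
        self.status = status

    def _comparable(self):
        return {k: v for k, v in self.__dict__.items() if k not in self.IGNORED}

    def __eq__(self, other):
        if not isinstance(other, RunStub):
            return NotImplemented
        return self._comparable() == other._comparable()


a = RunStub("node_a", parent="network_1", status="created")
b = RunStub("node_a", parent="network_2", status="done")
print(a == b)  # True: parent and status are ignored
```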
-
__getstate__
()[source]¶ Retrieve the state of the NodeRun
- Returns
the state of the object
- Return type
dict
-
__hash__
= None¶
-
__init__
(node, parent)[source]¶ Instantiate a node.
- Parameters
node (
Tool
) – The node to base the noderun on
parent (
Network
) – the parent network of the node
- Returns
the newly created NodeRun
-
__module__
= 'fastr.execution.noderun'¶
-
__repr__
()[source]¶ Get a string representation for the NodeRun
- Returns
the string representation
- Return type
-
__setstate__
(state)[source]¶ Set the state of the NodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
property
blocking
¶ Indicate that the results of this NodeRun cannot be determined without first executing the NodeRun, causing a blockage in the creation of jobs. Blocking Nodes determine the Chunk borders.
-
create_job
(sample_id, sample_index, job_data, job_dependencies, **kwargs)[source]¶ Create a job based on the sample id, job data and job dependencies.
- Parameters
sample_id (
SampleId
) – the id of the corresponding sample
sample_index (
SampleIndex
) – the index of the corresponding sample
job_data (dict) – dictionary containing all input data for the job
job_dependencies – other jobs that need to finish before this job can run
- Returns
the created job
- Return type
-
classmethod
createobj
(state, network=None)[source]¶ Create object function for generic objects
- Parameters
cls – The class to create
state – The state to use to create the Link
network – the parent Network
- Returns
newly created Link
-
property
dimnames
¶ Names of the dimensions in the NodeRun output. These will be reflected in the SampleIdList of this NodeRun.
-
execute
()[source]¶ Execute the node and create the jobs that need to run
- Returns
list of jobs to run
- Return type
list of
Jobs
-
property
fullid
¶ The full defining ID for the NodeRun inside the network
-
get_sourced_nodes
()[source]¶ A list of all Nodes connected as sources to this NodeRun
- Returns
list of all nodes that are connected to an input of this node
-
property
global_id
¶ The global defining ID for the Node from the main network (goes out of macro nodes to root network)
-
property
id
¶ The id of the NodeRun
-
property
input_groups
¶ A list of input groups for this NodeRun. An input group is an InputGroup object filled according to the NodeRun
-
property
listeners
¶ All the listeners requesting output of this node, this means the listeners of all Outputs and SubOutputs
-
property
merge_dimensions
¶
-
property
name
¶ Name of the Tool the NodeRun was based on. In case a Toolless NodeRun was used the class name is given.
-
property
outputsize
¶ Size of the outputs in this NodeRun
-
property
parent
¶ The parent network of this node.
-
property
resources
¶ Number of cores required for the execution of this NodeRun
-
set_result
(job, failed_annotation)[source]¶ Incorporate result of a job into the NodeRun.
- Parameters
job (Type) – job of which the result to store
failed_annotation – A set of annotations, None if no errors else containing a tuple describing the errors
-
property
status
¶
-
property
tool
¶
-
sinknoderun
Module¶
-
class
fastr.execution.sinknoderun.
SinkNodeRun
(node, parent)[source]¶ Bases:
fastr.execution.noderun.NodeRun
Class which handles where the output goes. This can be any kind of file, e.g. image files, textfiles, config files, etc.
-
__abstractmethods__
= frozenset({})¶
-
__dataschemafile__
= 'SinkNodeRun.schema.json'¶
-
__getstate__
()[source]¶ Retrieve the state of the NodeRun
- Returns
the state of the object
- Return type
dict
-
__init__
(node, parent)[source]¶ Instantiation of the SinkNodeRun.
- Parameters
node (fastr.planning.node.Node) – The Node that this Run is based on.
parent (NetworkRun) – The NetworkRun that this NodeRun belongs to
- Returns
newly created sink node run
-
__module__
= 'fastr.execution.sinknoderun'¶
-
__setstate__
(state)[source]¶ Set the state of the NodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
create_job
(sample_id, sample_index, job_data, job_dependencies, **kwargs)[source]¶ Create a job for a sink based on the sample id, job data and job dependencies.
-
property
datatype
¶ The datatype of the data this sink can store.
-
execute
()[source]¶ Execute the sink node and create the jobs that need to run
- Returns
list of jobs to run
- Return type
list of
Jobs
-
property
input
¶ The default input of the sink NodeRun
-
set_data
(data)[source]¶ Set the targets of this sink node.
- Parameters
data (dict or list of urls) – the target rules for where to write the data
The target rules can include a few fields that can be filled out:
field – description
sample_id – the sample id of the sample written in string form
cardinality – the cardinality of the sample written
ext – the extension of the datatype of the written data, including the .
extension – the extension of the datatype of the written data, excluding the .
network – the id of the network the sink is part of
node – the id of the node of the sink
timestamp – the iso formatted datetime the network execution started
uuid – the uuid of the network run (generated using uuid.uuid1)
An example of a valid target could be:
>>> target = 'vfs://output_mnt/some/path/image_{sample_id}_{cardinality}{ext}'
Note
The {ext} and {extension} fields are very similar, but both are offered. In many cases having a name.{extension} will feel like the correct way to do it. However, if you have DataTypes with and without an extension that can both be exported by the same sink, this would cause either name.ext or name. to be generated. In this particular case name{ext} can help, as it will create either name.ext or name.
Note
If a datatype has multiple extensions (e.g. .tiff and .tif) the first extension defined in the extension tuple of the datatype will be used.
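The difference between {ext} and {extension} can be reproduced with plain `str.format`. This is an illustration of the field semantics described above, not the substitution code used by the sink; `fill_target` is a hypothetical helper and only fills a subset of the documented fields.

```python
def fill_target(target, sample_id, cardinality, extension):
    """Fill a target rule using a subset of the documented fields.

    Hypothetical helper: ``extension`` is given without the leading dot,
    and an empty string means the datatype has no extension.
    """
    ext = f".{extension}" if extension else ""
    return target.format(
        sample_id=sample_id,
        cardinality=cardinality,
        ext=ext,
        extension=extension,
    )


target = "vfs://output_mnt/some/path/image_{sample_id}_{cardinality}{ext}"
print(fill_target(target, "subject01", 0, "nii.gz"))
# vfs://output_mnt/some/path/image_subject01_0.nii.gz

# With a datatype that has no extension, the two styles differ:
print(fill_target("name.{extension}", "s", 0, ""))  # name.
print(fill_target("name{ext}", "s", 0, ""))         # name
```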
-
sourcenoderun
Module¶
-
class
fastr.execution.sourcenoderun.
SourceNodeRun
(node, parent)[source]¶ Bases:
fastr.execution.flownoderun.FlowNodeRun
Class providing a connection to data resources. This can be any kind of file, stream, database, etc from which data can be received.
-
__abstractmethods__
= frozenset({})¶
-
__dataschemafile__
= 'SourceNodeRun.schema.json'¶
-
__eq__
(other)[source]¶ Compare two Node instances with each other. This function ignores the parent and update status, but tests the rest of the dict for equality.
- Parameters
other (NodeRun) – the other instance to compare to
- Returns
True if equal, False otherwise
-
__getstate__
()[source]¶ Retrieve the state of the SourceNodeRun
- Returns
the state of the object
- Return type
dict
-
__hash__
= None¶
-
__init__
(node, parent)[source]¶ Instantiation of the SourceNodeRun.
- Parameters
node (fastr.planning.node.Node) – The Node that this Run is based on.
parent (NetworkRun) – The NetworkRun that this NodeRun belongs to
- Returns
newly created source node run
-
__module__
= 'fastr.execution.sourcenoderun'¶
-
__setstate__
(state)[source]¶ Set the state of the SourceNodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
create_job
(sample_id, sample_index, job_data, job_dependencies, **kwargs)[source]¶ Create a job based on the sample id, job data and job dependencies.
- Parameters
sample_id (
SampleId
) – the id of the corresponding sample
sample_index (
SampleIndex
) – the index of the corresponding sample
job_data (dict) – dictionary containing all input data for the job
job_dependencies – other jobs that need to finish before this job can run
- Returns
the created job
- Return type
-
property
datatype
¶ The datatype of the data this source supplies.
-
property
dimnames
¶ Names of the dimensions in the SourceNodeRun output. These will be reflected in the SampleIdLists.
-
execute
()[source]¶ Execute the source node and create the jobs that need to run
- Returns
list of jobs to run
- Return type
list of
Jobs
-
property
output
¶ Shorthand for
self.outputs['output']
-
property
outputsize
¶ The size of output of this SourceNodeRun
-
set_data
(data, ids=None)[source]¶ Set the data of this source node.
- Parameters
data (dict, OrderedDict or list of urls) – the data to use
ids – if data is a list, a list of accompanying ids
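The two accepted input shapes (a mapping, or a list with optional accompanying ids) can be brought to one form as sketched below. This is an illustration only: the helper name, the generated default ids and the length check are assumptions, not behaviour documented for fastr.

```python
from collections import OrderedDict


def normalize_source_data(data, ids=None):
    """Return source data as an OrderedDict mapping sample id -> value.

    Hypothetical helper. A dict (or OrderedDict) is copied as is; a list
    is paired with ``ids``, or with generated ids when none are given.
    """
    if isinstance(data, dict):  # OrderedDict is a dict subclass
        return OrderedDict(data)
    if ids is None:
        # Assumed default naming, chosen for this example only.
        ids = [f"id_{i}" for i in range(len(data))]
    if len(ids) != len(data):
        raise ValueError("ids and data must have the same length")
    return OrderedDict(zip(ids, data))


urls = ["vfs://mnt/a.nii.gz", "vfs://mnt/b.nii.gz"]
print(normalize_source_data(urls, ids=["subject_a", "subject_b"]))
```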
-
property
sourcegroup
¶
-
property
valid
¶ This does nothing. It only overloads the valid method of NodeRun(). The original is intended to check if the inputs are connected to some output. Since this class does not implement inputs, it is skipped.
-
-
class
fastr.execution.sourcenoderun.
ConstantNodeRun
(node, parent)[source]¶ Bases:
fastr.execution.sourcenoderun.SourceNodeRun
Class encapsulating one output for which a value can be set. For example used to set a scalar value to the input of a node.
-
__abstractmethods__
= frozenset({})¶
-
__dataschemafile__
= 'ConstantNodeRun.schema.json'¶
-
__getstate__
()[source]¶ Retrieve the state of the ConstantNodeRun
- Returns
the state of the object
- Return type
dict
-
__init__
(node, parent)[source]¶ Instantiation of the ConstantNodeRun.
- Parameters
datatype – The datatype of the output.
data – the prefilled data to use.
id – The url pattern.
This class should never be instantiated directly (unless you know what you are doing). Instead create a constant using the network class like shown in the usage example below.
usage example:
>>> import fastr
>>> network = fastr.Network()
>>> source = network.create_source(datatype=types['ITKImageFile'], id_='sourceN')
or alternatively create a constant node by assigning data to an item in an InputDict:
>>> node_a.inputs['in'] = ['some', 'data']
which automatically creates and links a ConstantNodeRun to the specified Input
-
__module__
= 'fastr.execution.sourcenoderun'¶
-
__setstate__
(state)[source]¶ Set the state of the ConstantNodeRun by the given state.
- Parameters
state (dict) – The state to populate the object with
- Returns
None
-
property
data
¶ The data stored in this constant node
-