Development and Design Documentation¶
In this chapter we discuss the design of Fastr in more detail. We give pointers for development and include the design documents as we currently envision Fastr. This is both for people who are interested in Fastr development and for current developers, as an archive of the design decisions agreed upon.
Sample flow in Fastr¶
The current Sample flow is the following:
The idea is that we make a common interface for all classes that are related to the flow of Samples. For this we propose the following mixin classes that provide the interface and allow for better code sharing. The basic structure of the classes is given in the following diagram:
The abstract and mixin methods are as follows:
|ABC||Inherits from||Abstract Methods||Mixin methods|
Though the flow is currently working like this, the mixins are not yet created.
The network execution should contain a number of steps:
- Create a NetworkRun based on the current layout
- Transform the Network (possibly joining Nodes of a certain interface into a combined NodeRun, etc.)
- Start generation of the Job Directed Acyclic Graph (DAG)
- Prioritize Jobs based on some predefined rules
- Combine certain Jobs to improve efficiency (e.g. minimize i/o on a grid)
- Run a (list of) Jobs. If there is more than one job, run them sequentially on the same execution host, using a local temp for intermediate files.
- On finished callback: update the DAG with newly ready jobs, or remove cancelled jobs
This could be visualized as the following loop:
The callback of the ExecutionPlugin to the NetworkRun would trigger the execution of the relevant NodeRuns and the addition of more Jobs to the Job DAG.
The Job DAG should be thread-safe as it could be both read and extended at the same time.
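To illustrate the thread-safety requirement, here is a minimal sketch of a job DAG guarded by a single lock. The class `JobDag` and its methods are hypothetical names chosen for this example and are not part of Fastr. The finished callback and the code that extends the DAG can be called from different threads without corrupting the bookkeeping.

```python
import threading


class JobDag:
    """Hypothetical thread-safe job DAG (illustration only, not Fastr code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiting_on = {}   # job id -> unfinished job ids it depends on
        self._dependents = {}   # job id -> job ids that depend on it
        self._finished = set()

    def add_job(self, job_id, depends_on=()):
        """Extend the DAG; safe to call while other threads read it."""
        with self._lock:
            pending = {dep for dep in depends_on if dep not in self._finished}
            self._waiting_on[job_id] = pending
            self._dependents.setdefault(job_id, set())
            for dep in pending:
                self._dependents.setdefault(dep, set()).add(job_id)

    def ready_jobs(self):
        """Return a sorted snapshot of unfinished jobs with no pending dependencies."""
        with self._lock:
            return sorted(job for job, deps in self._waiting_on.items() if not deps)

    def job_finished(self, job_id):
        """Finished callback: mark a job done and return the newly ready jobs."""
        with self._lock:
            self._finished.add(job_id)
            self._waiting_on.pop(job_id, None)
            newly_ready = []
            for dependent in self._dependents.pop(job_id, set()):
                deps = self._waiting_on.get(dependent)
                if deps is None:
                    continue
                deps.discard(job_id)
                if not deps:
                    newly_ready.append(dependent)
            return sorted(newly_ready)


dag = JobDag()
dag.add_job("source")
dag.add_job("process", depends_on=["source"])
dag.add_job("sink", depends_on=["process"])
print(dag.ready_jobs())            # ['source']
print(dag.job_finished("source"))  # ['process']
```

A single coarse lock keeps the sketch simple. A real implementation might need finer-grained locking or a separate state for jobs that are already running.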
If a list of jobs is sent to the ExecutionPlugin to be run as one Job on an external execution platform, the resources should be combined as follows: memory=max, cores=max, runtime=sum.
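The combination rule can be sketched in a few lines. The `Resources` dataclass, its field names and units, and the `combine_resources` helper are assumptions made for this example and do not reflect an existing Fastr interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource request of a single Job."""
    memory_mb: int
    cores: int
    runtime_s: int


def combine_resources(requests):
    """Combine the requests of Jobs that will run sequentially as one Job."""
    requests = list(requests)
    if not requests:
        raise ValueError("cannot combine an empty list of resource requests")
    return Resources(
        # Jobs run one after another, so the peak requirement is the maximum
        memory_mb=max(r.memory_mb for r in requests),
        cores=max(r.cores for r in requests),
        # Run times add up because the jobs do not overlap
        runtime_s=sum(r.runtime_s for r in requests),
    )


combined = combine_resources([
    Resources(memory_mb=2048, cores=1, runtime_s=600),
    Resources(memory_mb=4096, cores=4, runtime_s=300),
    Resources(memory_mb=1024, cores=2, runtime_s=900),
])
print(combined)  # Resources(memory_mb=4096, cores=4, runtime_s=1800)
```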
If there are execution hosts that have multiple cores, the ExecutionPlugin should manage this (for example by using pilot jobs). The SchedulingPlugin creates units that should be run sequentially on the resources noted and will not attempt parallelization.
A NetworkRun would contain similar information to the Network, but would not have functionality for editing/changing it. It would contain the functionality to execute the Network and track the status and samples. This would allow Network.execute to create multiple concurrent runs that operate independently of each other. Also, editing a Network after the run started would have no effect on that run.
This is a plan, not yet implemented
For this to work, it would be important for Jobs to have forward and backward dependency links.
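As a rough illustration of why links in both directions are useful, the sketch below stores them on a hypothetical `Job` dataclass. The names `Job`, `depends_on`, `required_by`, `link` and `cancel` are assumptions for this example only. The forward links make it cheap to find everything downstream of a cancelled job, and the backward links make it cheap to tidy up the jobs that remain.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical job node with dependency links in both directions."""
    job_id: str
    depends_on: set = field(default_factory=set)   # backward links
    required_by: set = field(default_factory=set)  # forward links


def link(jobs, upstream, downstream):
    """Record that `downstream` depends on `upstream`, in both directions."""
    jobs[upstream].required_by.add(downstream)
    jobs[downstream].depends_on.add(upstream)


def cancel(jobs, job_id):
    """Remove a job and everything downstream of it; return the removed ids."""
    removed = set()
    stack = [job_id]
    while stack:
        current = stack.pop()
        if current in removed or current not in jobs:
            continue
        removed.add(current)
        stack.extend(jobs[current].required_by)  # follow forward links
    for removed_id in removed:
        job = jobs.pop(removed_id)
        for dep in job.depends_on:               # follow backward links
            if dep in jobs:
                jobs[dep].required_by.discard(removed_id)
    return sorted(removed)


jobs = {name: Job(name) for name in "abcd"}
link(jobs, "a", "b")
link(jobs, "b", "c")
link(jobs, "a", "d")
print(cancel(jobs, "b"))               # ['b', 'c']
print(sorted(jobs["a"].required_by))   # ['d']
```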
The idea of the plugin is that it would assign a priority to the Jobs created by a Network. This could be done based on different strategies:
- Based on (sorted) sample IDs, so that one sample is always prioritized over others. The idea is that samples are processed in order as much as possible, finishing the first sample first, and other samples are only processed if there is left-over capacity.
- Based on the distance to a (particular) Sink. This is to generate specific results as quickly as possible. It would not focus on specific samples, but give priority to whatever sample is closest to being finished.
- Based on the distance from a Source. Based on the sign of the weight it would either keep all samples at the same stage as much as possible, only progressing to a new NodeRun when all samples are done with the previous NodeRun, or it would push samples ahead at an accelerated rate.
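Since the plugin interface is not defined, the following is only a sketch of how the first two strategies could be expressed as sort keys. Jobs are represented as plain dictionaries with assumed `node` and `sample_id` fields, and the network is a list of `(source, target)` edges. All of these names are hypothetical.

```python
from collections import deque


def distance_to_sink(edges, sink):
    """Breadth-first search over reversed edges: hops from each node to `sink`."""
    reverse = {}
    for src, dst in edges:
        reverse.setdefault(dst, []).append(src)
    distance = {sink: 0}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for upstream in reverse.get(node, []):
            if upstream not in distance:
                distance[upstream] = distance[node] + 1
                queue.append(upstream)
    return distance


def by_sample(job):
    """Strategy 1: the lowest sample id goes first."""
    return job["sample_id"]


def by_sink_distance(distance):
    """Strategy 2: jobs closest to the sink go first, ties broken by sample id."""
    def key(job):
        return (distance[job["node"]], job["sample_id"])
    return key


edges = [("source", "filter"), ("filter", "register"), ("register", "sink")]
jobs = [
    {"node": "source", "sample_id": "s1"},
    {"node": "register", "sample_id": "s2"},
    {"node": "filter", "sample_id": "s1"},
]
distance = distance_to_sink(edges, "sink")

print([j["node"] for j in sorted(jobs, key=by_sample)])
# ['source', 'filter', 'register']
print([j["node"] for j in sorted(jobs, key=by_sink_distance(distance))])
# ['register', 'filter', 'source']
```

Expressing each strategy as a key function keeps the strategies interchangeable: the scheduler only needs to sort the ready jobs with whichever key is configured.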
Additionally, it will group Jobs to be executed on a single host. This could reduce i/o and limit the number of jobs an external scheduler has to track.
The interface for such a plugin has not yet been established.
“Something that is kept or meant to be kept unknown or unseen by others.”
Fastr IOPlugins that need authentication data should use the Fastr SecretService for retrieving such data. The SecretService can be used as follows.
    from fastr.utils.secrets import SecretService
    from fastr.utils.secrets.exceptions import CouldNotRetrieveCredentials

    secret_service = SecretService()

    try:
        password = secret_service.find_password_for_user('testserver.lan:9000', 'john-doe')
    except CouldNotRetrieveCredentials:
        # the password was not found
        pass
Implementing a SecretProvider¶
A SecretProvider is implemented as follows:
- Create a file in fastr/utils/secrets/providers/<yourprovidername>.py
- Use the template below to write your SecretProvider
- Add the secret provider to fastr/utils/secrets/providers/__init__.py
- Add the secret provider to fastr/utils/secrets/secretservice.py: import it and add it to the array in function _init_providers
    from fastr.utils.secrets.secretprovider import SecretProvider
    from fastr.utils.secrets.exceptions import CouldNotRetrieveCredentials, CouldNotSetCredentials, CouldNotDeleteCredentials, NotImplemented

    try:
        # this is where libraries can be imported
        # we don't want fastr to crash if a specific
        # library is unavailable
        # import my-library
        pass  # keeps the try block valid while the import is commented out
    except (ImportError, ValueError):
        pass


    class KeyringProvider(SecretProvider):
        def __init__(self):
            # if libraries are imported in the code above
            # we need to check if the import was successful
            # if it was not, raise a RuntimeError
            # so that Fastr ignores this SecretProvider
            # if 'my-library' not in globals():
            #     raise RuntimeError("my-library module required")
            pass

        def get_password_for_user(self, machine, username):
            # This function should return the password as a string
            # or raise a CouldNotRetrieveCredentials error if the password
            # is not found.
            # In the event that this function is unsupported a
            # NotImplemented exception should be thrown
            raise NotImplemented()

        def set_password_for_user(self, machine, username, password):
            # This function should set the password for a specified
            # machine + user. If anything goes wrong while setting
            # the password a CouldNotSetCredentials error should be raised.
            # In the event that this function is unsupported a
            # NotImplemented exception should be thrown
            raise NotImplemented()

        def del_password_for_user(self, machine, username):
            # This function should delete the password for a specified
            # machine + user. If anything goes wrong while deleting
            # the password a CouldNotDeleteCredentials error should be raised.
            # In the event that this function is unsupported a
            # NotImplemented exception should be thrown
            raise NotImplemented()
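To make the template more concrete, here is a sketch of a provider that keeps passwords in a dictionary, which could be handy in tests. `InMemoryProvider` is a hypothetical name. So that the example runs without Fastr installed, the base class and exceptions are local stand-ins. In a real provider they would be imported from `fastr.utils.secrets.secretprovider` and `fastr.utils.secrets.exceptions`, as in the template above.

```python
# Local stand-ins (assumption): replace with the Fastr imports in real code.
class SecretProvider:
    pass


class CouldNotRetrieveCredentials(Exception):
    pass


class CouldNotDeleteCredentials(Exception):
    pass


class InMemoryProvider(SecretProvider):
    """Hypothetical provider that stores passwords in a dict (testing only)."""

    def __init__(self):
        self._store = {}

    def get_password_for_user(self, machine, username):
        try:
            return self._store[(machine, username)]
        except KeyError:
            raise CouldNotRetrieveCredentials(
                "no password for {}@{}".format(username, machine)
            ) from None

    def set_password_for_user(self, machine, username, password):
        self._store[(machine, username)] = password

    def del_password_for_user(self, machine, username):
        try:
            del self._store[(machine, username)]
        except KeyError:
            raise CouldNotDeleteCredentials(
                "no password for {}@{}".format(username, machine)
            ) from None


provider = InMemoryProvider()
provider.set_password_for_user("testserver.lan:9000", "john-doe", "s3cret")
print(provider.get_password_for_user("testserver.lan:9000", "john-doe"))  # s3cret
```

Passwords held this way disappear when the process exits and are stored in plain text, so this is only suitable for experimentation.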