Node
The aggregation node is a node reachable from all other nodes in the network and the central node of the framework. All workbenches send their payload, called Artifacts, to the aggregation node; while all the clients query the same node for the next job to run. This allows the clients to have more control on the access and an additional layer of protection: a client node is not reachable from the internet and it is the client that contact the known reference node and initiate the execution process.
The node is composed by a web API written with FastAPI that runs and spawns Ray tasks. The node also uses a database to keep track of every stored object.
The easiest way to deploy a node is using Docker Compose.
The file docker-compose.integration.yaml contains a definition of all services required to create a stack that simulates a central server node and some client nodes.
Installation
The installation of a node is simple:
pip install ferdelance
Once installed it can be run by specifying a YAML configuration file:
python -m ferdelance -c ./config.yaml
Once one node is up and running, with default parameters the node will be reachable at http://server:1456/.
Node configuration
The minimal content of the configuration file is the definition of the server url to use and at least one datasource.
The datasource must have a name and be associated with one or more project thought the token
field.
workdir: ./storage # OPTIONAL: local path of the working directory
mode: node # one of: node, client, standalone
node:
name: FerdelanceNode
healthcheck: 3600.0 # wait in seconds for check self status
heartbeat: 10.0 # wait in seconds for clients to fetch updates
allow_resource_download: true # if false, nobody can download resources from this node
protocol: http # external protocol (http or https)
interface: 0.0.0.0 # interface to use (0.0.0.0 for node, "localhost" for clients)
url: "" # external url that the node will be reachable at
port: 1456 # external port to use to reach the APIs
token_projects_initial: # initial projects available at node start
- name: my_beautiful_project # name of the project
token: 58981bcbab... # unique token assigned to the project
join:
first: true # if true, this is the first node in the distributed network
url: "" # when a node is note the first, set the url for the join node
datasources: # list of available datasources
- name: iris # name of the source
kind: file # how the datasource is stored (only 'file')
type: csv # file format supported (only 'csv' or 'tsv')
path: /data/iris.csv # path to the file to use
token: # list of project token that can access this datasource
- 58981bcbab7...
database:
username: "" # username used to access the database
password: "" # password used to access the database
scheme: ferdelance # specify the name of the database schema to use
memory: false # when set to true, a SQLite in-memory database will be used
dialect: sqlite # current accepted dialects are: SQLite and Postgresql
host: ./sqlite.db # local path for local file (QSLite) or url for remote database
port: "" # port to use to connect to a remote database
Note
It is also possible to specify environment variables in the configuration file using the syntax ${ENVIRONMENT_VARIABLE_NAME}
inside the fields of parameters.
This is specially useful when setting parameters, such as domains or password, through a Docker compose file.
For the first node of the distributed network, the join.first
parameter must always be set to true
.
In the network it must always be a first node with this configuration.
In all the other cases, both for client
and node
mode, the configuration need to specify the join.url
parameter to a valid url of an existing node.
Only urls of nodes in node
mode can be used in this parameter.
Database configuration
Database configuration is completely optional.
Every node needs a database to work properly.
Minimal setup is to use an SQLite in-memory database by setting database.memory: true
.
If not database is configured, then the in-memory database will be used.
Other supported database are:
SQLite file database:
database: scheme: ferdelance dialect: sqlite host: ./sqlite.db memory: false
Postgresql remote database:
database: username: "${DATABASE_USER}" password: "${DATABASE_PASSWORD}" scheme: ferdelance dialect: postgresql host: remote_url port: 5432 memory: false