Configuration

After you have installed Weaver, you can customize its behaviour using multiple configuration settings.

Configuration Settings

All settings are configured using a weaver.ini configuration file. A weaver.ini.example file is provided with default values to help in the configuration process. Explanations of respective settings are also available in this example file.

The configuration file tells the application runner (e.g. Gunicorn, pserve or a similar WSGI HTTP server) how to execute Weaver, and provides all settings used to personalize the application. All settings specific to Weaver employ the format weaver.<setting>.

Most configuration parameters for the manager portion of Weaver (i.e.: WSGI HTTP server for API endpoints) are defined in the [app:main] section of weaver.ini.example, while parameters specific to the worker (task queue handler) are within the [celery] section. Note that multiple settings are shared between the two applications, such as the mongodb.[...] configuration or weaver.configuration options. When parameters are shared, they are usually expected to be placed in the [app:main] section.

The following is a partial list of the most prominent settings specific to Weaver. Many parameters provide alternative or extended functionality when employed in conjunction with other settings. Others do not need to be defined if the default behaviour is desired. Refer to the details of each setting, which describe under which conditions it is optional and which default value or operation applies in each situation.

Note

Refer to weaver.ini.example for the extended list of applicable settings. Some advanced configuration settings are also described in other sections of this page.
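As a rough illustration of the layout described above, the following sketch reads an INI document with Python's standard configparser and keeps only the weaver.<setting> entries of the [app:main] section. The sample values for mongodb.host and broker_url are assumptions made for this example only, and this is not how Weaver itself loads its configuration (that is handled by the WSGI application runner).

```python
import configparser

# Hypothetical, trimmed-down configuration used only for this illustration.
SAMPLE_INI = """
[app:main]
weaver.configuration = HYBRID
weaver.url = http://localhost:4001
mongodb.host = localhost

[celery]
broker_url = mongodb://localhost:27017/celery
"""


def load_weaver_settings(ini_text, section="app:main"):
    """Return the 'weaver.<setting>' entries found in the given section."""
    # Interpolation is disabled so that '%' or '$' characters are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {
        key: value
        for key, value in parser[section].items()
        if key.startswith("weaver.")
    }


settings = load_weaver_settings(SAMPLE_INI)
print(settings)
```

Shared settings such as mongodb.[...] are filtered out here simply because they do not use the weaver. prefix.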

  • weaver.configuration = ADES|EMS|HYBRID|DEFAULT
    (default: DEFAULT)

    Tells the application in which mode to run.

    Enabling ADES, for instance, will disable some EMS-specific operations such as dispatching Workflow process steps to known remote ADES servers. ADES should be used to only run processes locally (as the working unit). EMS will always dispatch execution of jobs to other ADES, except for Workflow processes that chain them.
    When HYBRID is specified, Weaver will assume both ADES and EMS roles simultaneously, meaning it will be able to execute local processes by itself and monitor dispatched execution of registered remote providers.
    Finally, DEFAULT configuration will provide very minimalistic operations as all other modes will be unavailable.
  • weaver.url = <url>
    (default: http://localhost:4001)

    Defines the full URL (including HTTP protocol/scheme, hostname and optionally additional path suffix) that will be used as base URL for all other URL settings of Weaver.

Note

This is the URL that you want displayed in responses (e.g.: processDescriptionURL or job location). For the effective URL employed by the WSGI HTTP server, refer to [server:main] section of weaver.ini.example.

  • weaver.schema_url = <url>
    (default: ${weaver.url}/json#/definitions)

    Defines the base URL of schemas to be reported in responses.

    When not provided, the running Web Application instance OpenAPI JSON path will be employed to refer to the schema definitions section. The configuration setting is available to override this endpoint by another static URL location where the corresponding schemas can be found if desired.

New in version 4.0.

  • weaver.cwl_euid = <int> [int, experimental]
    (default: None, auto-resolved by CWL with effective machine user)

    Define the effective machine user ID to be used for running the Application Package.

New in version 1.9.

  • weaver.cwl_egid = <int> [int, experimental]
    (default: None, auto-resolved by CWL with the group of the effective user)

    Define the effective machine group ID to be used for running the Application Package.

New in version 1.9.

  • weaver.wps = true|false [bool-like]
    (default: true)

    Enables the WPS-1/2 endpoint.

See also

WPS Endpoint

Warning

At the moment, this setting must be true to allow Job execution as the worker monitors this endpoint. This could change with future developments (see issue #21).

  • weaver.wps_path = <url-path>
    weaver.wps_url = <full-url>
    (default: path /ows/wps)

    Defines the URL to employ as WPS-1/2 endpoint.

    It can either be the explicit full URL to use or the path relative to weaver.url.
    Setting weaver.wps_path is ignored if its URL equivalent is defined.
    The path variant SHOULD start with / for appropriate concatenation with weaver.url, although this is not strictly enforced.
  • weaver.wps_output_s3_bucket = <s3-bucket-name>
    (default: None)

    AWS S3 bucket where to store WPS outputs. Used in conjunction with weaver.wps_output_s3_region.

    When this parameter is defined, any job result generated by a process execution will be stored (uploaded) to that location. If no bucket is specified, the outputs fall back to using the location specified by weaver.wps_output_dir.

New in version 1.13.

  • weaver.wps_output_s3_region = <s3-region-name>
    (default: None, any S3 Region amongst mypy_boto3_s3.literals.RegionName)

    AWS S3 region to employ for storing WPS outputs. Used in conjunction with weaver.wps_output_s3_bucket.

    When this parameter is defined as well as weaver.wps_output_s3_bucket, it is employed to define which S3 region to write output files to. If not defined but weaver.wps_output_s3_bucket is specified, Weaver attempts to retrieve the region from the profile defined in AWS configuration files or environment variables.

New in version 1.13.

  • weaver.wps_output_dir = <directory-path>
    (default: path /tmp)

    Location where WPS outputs (results from Job) will be stored for stage-out.

    When weaver.wps_output_s3_bucket is specified, only WPS XML status and log files are stored under this path. Otherwise, Job results are also located under this directory with a sub-directory named with the Job ID.
    This directory should be mapped to Weaver’s WPS output URL to serve them externally as needed.

Changed in version 4.3: The output directory could be nested under a contextual directory if requested during Job submission. See Outputs Location and below weaver.wps_output_context parameter for more details.

  • weaver.wps_output_context = <sub-directory-path>
    (default: None)

    Default sub-directory hierarchy location to nest WPS outputs (Job results) under.

    If defined, this parameter is used as substitute context when X-WPS-Output-Context header is omitted. When not defined, X-WPS-Output-Context header can still take effect, but omitting it will store results directly under weaver.wps_output_dir instead of default context location.

New in version 4.3.

Changed in version 4.27: Nesting of the context directory from X-WPS-Output-Context or weaver.wps_output_context will also take effect when storing Job results under S3 when weaver.wps_output_s3_bucket and weaver.wps_output_s3_region are also defined. Previous versions applied the context directory only for local storage using the other WPS output settings.

See also

See Outputs Location for more details about this feature and implications of this setting.
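The precedence between the X-WPS-Output-Context header, weaver.wps_output_context and weaver.wps_output_dir can be sketched as follows. This is a simplified illustration, not Weaver's implementation: the function name is hypothetical, the exact directory layout (context first, then Job ID) is an assumption, headers are looked up in a plain case-sensitive dictionary, and no validation of the context value is performed.

```python
import posixpath


def resolve_output_dir(settings, job_id, headers=None):
    """Pick the directory where results of a Job would be stored locally."""
    base_dir = settings.get("weaver.wps_output_dir", "/tmp")
    # The request header takes precedence over the configured default context.
    header_context = (headers or {}).get("X-WPS-Output-Context")
    context = header_context or settings.get("weaver.wps_output_context") or ""
    # An empty context stores results directly under the output directory.
    return posixpath.join(base_dir, context.strip("/"), job_id)


config = {
    "weaver.wps_output_dir": "/data/out",
    "weaver.wps_output_context": "public",
}
print(resolve_output_dir(config, "job-123"))
print(resolve_output_dir(config, "job-123", {"X-WPS-Output-Context": "project/a"}))
```

A real deployment would also need to reject unsafe context values (for example, ones containing ..), which this sketch leaves out.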

  • weaver.wps_output_path = <url-path>
    weaver.wps_output_url = <full-url>
    (default: path /wpsoutputs)

    Endpoint that will be employed as prefix to refer to WPS outputs (Job results).

    It can either be the explicit full URL to use or the path relative to weaver.url.
    Setting weaver.wps_output_path is ignored if its URL equivalent is defined.
    The path variant SHOULD start with / for appropriate concatenation with weaver.url, although this is not strictly enforced.

Note

The resulting weaver.wps_output_url endpoint, whether directly provided or indirectly resolved by weaver.url and weaver.wps_output_path, will not be served by Weaver itself. This location is returned for reference in API responses, but it is up to the infrastructure that hosts the Weaver service to make this location available online as deemed necessary.
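The path/URL pairs of settings (weaver.wps_[...], weaver.wps_output_[...] and weaver.wps_restapi_[...]) share the same resolution rule: the full URL wins when defined, otherwise the path is appended to weaver.url. A minimal sketch of that rule is given below. The helper name is hypothetical, and the way a missing leading / is tolerated is an assumption about the "not strictly enforced" behaviour rather than a description of Weaver's code.

```python
def resolve_endpoint(settings, name, default_path):
    """Resolve 'weaver.<name>_url', falling back to weaver.url + 'weaver.<name>_path'."""
    full_url = settings.get(f"weaver.{name}_url")
    if full_url:
        # The path variant is ignored when its URL equivalent is defined.
        return full_url
    base_url = settings.get("weaver.url", "http://localhost:4001").rstrip("/")
    path = settings.get(f"weaver.{name}_path", default_path).strip("/")
    return f"{base_url}/{path}" if path else base_url


config = {"weaver.url": "https://example.com/weaver"}
print(resolve_endpoint(config, "wps", "/ows/wps"))
print(resolve_endpoint(config, "wps_output", "/wpsoutputs"))
print(resolve_endpoint(config, "wps_restapi", "/"))
```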

  • weaver.wps_workdir = <directory-path>
    (default: uses automatically generated temporary directory if none specified)

    Prefix where process Job worker should execute the Process from.
  • weaver.wps_restapi = true|false [bool-like]
    (default: true)

    Enable the WPS-REST endpoint.

Warning

Weaver loses most, if not all, of its useful features without this, and there won’t be much point in using it without the REST endpoint, but it should technically be possible to run it as WPS-1/2 only if desired.

  • weaver.wps_restapi_path = <url-path>
    weaver.wps_restapi_url = <full-url>
    (default: path /)

    Endpoint that will be employed as prefix to refer to WPS-REST requests
    (including but not limited to OGC API - Processes schemas).

    It can either be the explicit full URL to use or the path relative to weaver.url.
    Setting weaver.wps_restapi_path is ignored if its URL equivalent is defined.
    The path variant SHOULD start with / for appropriate concatenation with weaver.url, although this is not strictly enforced.
  • weaver.wps_metadata_[...] (multiple settings)

    Metadata fields that will be rendered by either or both the WPS-1/2 and WPS-REST endpoints (GetCapabilities).
  • weaver.wps_email_[...] (multiple settings)

    Defines configuration of email notification functionality on job completion.

    Encryption settings as well as custom email templates are available. Default email template defined in email-template is employed if none is provided. Email notifications are sent only on job completion if an email was provided in the Execute request body (see also: Email Notification).

New in version 4.15.

New in version 4.15.

Changed in version 4.30: Renamed from weaver.exec_sync_max_wait to weaver.execute_sync_max_wait.

Configuration of Celery with MongoDB Backend

Since Weaver employs Celery as task queue manager and MongoDB as backend, relevant settings for the configuration of Celery and the configuration of MongoDB Backend should be employed. Processing of task jobs and results reporting is accomplished according to the specific implementation of these services. Therefore, all applicable settings and extensions should be available for custom server configuration and scaling as needed.

Warning

In order to support synchronous execution, the RESULT_BACKEND setting MUST be defined.

Configuration of AWS S3 Buckets

Any AWS S3 Bucket provided to Weaver needs to be accessible by the application, whether it is to fetch input files or to store output results. This can require the server administrator to specify credentials using one of the supported AWS Credentials methodologies to provide the necessary role and/or permissions. See also the AWS Configuration reference, which lists various options that will be considered when working with S3 buckets.

Note that Weaver expects the AWS Configuration to define a default profile from which the AWS client can infer which Region it needs to connect to. The S3 bucket to store files should be defined with weaver.wps_output_s3_bucket setting as presented in the previous section.

The S3 file and directory references for input and output in Weaver are expected to be formatted as one of the methods described in AWS S3 Bucket Access Methods (more details about supported formats in AWS S3 Bucket References). The easiest and most common approach is to use a reference using the s3:// scheme as follows:

s3://<bucket>/<file-key>

This implicitly tells Weaver to employ the specified S3 bucket it was configured with as well as the automatically retrieved location (using the region from the default profile) in the AWS Configuration of the application.

Alternatively, the reference can be provided as input more explicitly with any of the supported AWS S3 Bucket References. For example, the AWS S3 link could be specified as follows.

https://s3.{Region}.amazonaws.com/{Bucket}/{file-key}

In this situation, Weaver will parse it as equivalent to the prior shorthand s3:// reference format, by substituting any appropriate details retrieved from the AWS Configuration as needed to form the above HTTP URL variant. For example, an alternative Region from the default could be specified. After resolution, Weaver will still attempt to fetch the file as standard HTTP reference by following the relevant AWS S3 Bucket Access Methods. In each case, read access should be granted accordingly to the corresponding bucket, files and/or directories such that Weaver can stage them locally. For produced outputs, the write access must be granted.

In the above references, file-key is used as anything after the Bucket name. In other words, this value can contain any amount of / separators and path elements. For example, if weaver.wps_output_s3_bucket is defined in the configuration, Weaver will store process output results to S3 using file-key as a combination of {WPS-UUID}/{output-id.ext}, therefore forming the full Job result file references as:

https://s3.{Region}.amazonaws.com/{Bucket}/{WPS-UUID}/{output-id.ext}

Region ::= weaver.wps_output_s3_region
Bucket ::= weaver.wps_output_s3_bucket

Note

Value of WPS-UUID can be retrieved from Weaver internal Job storage from weaver.datatypes.Job.wps_id(). It refers to the Process Execution identifier that accomplished the WPS request to run the Application Package.

Note

The value of file-key also applies for Directory Type references.
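To make the two reference formats shown above more concrete, the following sketch parses an s3:// shorthand or an https://s3.{Region}.amazonaws.com/... link into its region, bucket and file-key parts, and builds the Job result reference from the configured values. Function names are hypothetical, only the two formats presented in this section are handled (not every supported AWS S3 Bucket Reference), and no AWS call is made.

```python
from urllib.parse import urlparse


def parse_s3_reference(reference, default_region=None):
    """Split a reference into (region, bucket, file_key)."""
    parsed = urlparse(reference)
    if parsed.scheme == "s3":
        # Shorthand form: the region comes from the default AWS profile.
        return default_region, parsed.netloc, parsed.path.lstrip("/")
    host_parts = parsed.netloc.split(".")
    if (
        parsed.scheme == "https"
        and len(host_parts) == 4
        and host_parts[0] == "s3"
        and host_parts[2:] == ["amazonaws", "com"]
    ):
        # Everything after the bucket name is the file-key, '/' included.
        bucket, _, file_key = parsed.path.lstrip("/").partition("/")
        return host_parts[1], bucket, file_key
    raise ValueError(f"Unsupported S3 reference in this sketch: {reference}")


def job_output_url(region, bucket, wps_uuid, output_name):
    """Build the Job result reference from the region and bucket settings."""
    return f"https://s3.{region}.amazonaws.com/{bucket}/{wps_uuid}/{output_name}"


print(parse_s3_reference("s3://my-bucket/dir/file.txt", default_region="us-east-1"))
print(job_output_url("ca-central-1", "my-bucket", "1234-abcd", "output.tif"))
```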

Configuration of Data Sources

A typical Data Source file is presented below. This sample is also provided in data_sources.yml.example.

# List Data-Source known locations such that Weaver configured in EMS mode can dispatch processes execution to
# corresponding ADES when input data references match the provided locations.
#
# For the expected Schema Definition, see module:
#   weaver.processes.sources
#
# NOTE:
#   This configuration can be formatted in YAML or JSON at your convenience.
#
example:
  # since this is not the default (see localhost),
  # only data matching that location will be forwarded to corresponding ADES
  netloc: "example-data.com"
  ades: "https://example.com/ADES"

localhost:
  # default is defined here, so any unmatched data-source location will fallback to this ADES
  # since that default is 'localhost', default in this case will indicate "run it locally"
  # another ADES location could be set as default to dispatch unknown data-source executions to that specific instance
  netloc: "localhost"
  ades: "https://localhost:4001"
  default: true

opensearchdefault:
  # data-sources that require OpenSearch capabilities require more configuration details
  # this applies to processes that employ OpenSearch query definitions to define process inputs
  # see details and examples:
  #   https://pavics-weaver.readthedocs.io/en/latest/processes.html#opensearch-data-source
  #   tests/json_examples/opensearch_process.json
  #   tests/json_examples/eoimage_inputs_example.json
  ades: "http://localhost:4001"
  collection_id: ""
  accept_schemes: ["http", "https"]
  rootdir: ""
  osdd_url: "http://example.com/opensearchdescription.xml"

Both JSON and YAML are equivalent and supported. The data_sources.yml file is generated by default in the configuration folder based on the default example (if missing). Custom configurations can be placed in the expected location or can also be provided with an alternative path using the weaver.data_sources configuration setting.

Note

As presented in the above example, the Data Source file can also refer to an OpenSearch Data Source, which implies additional pre-processing steps.

See also

More details about the implication of Data Source are provided in ADES dispatching using Data Sources.
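The matching described in the comments of the sample file can be sketched as below: the network location of an input reference is compared against each netloc entry, and the entry flagged as default is used when nothing matches. The dictionary mirrors the first two entries of the sample, the function name is hypothetical, and exact equality on the network location is an assumption about the matching rule.

```python
from urllib.parse import urlparse

# Python equivalent of the 'example' and 'localhost' entries of the sample file.
DATA_SOURCES = {
    "example": {"netloc": "example-data.com", "ades": "https://example.com/ADES"},
    "localhost": {"netloc": "localhost", "ades": "https://localhost:4001", "default": True},
}


def select_ades(data_url, sources):
    """Return the ADES to dispatch to for the given input data reference."""
    netloc = urlparse(data_url).netloc
    default_ades = None
    for source in sources.values():
        if source.get("netloc") == netloc:
            return source["ades"]
        if source.get("default"):
            # Remember the fallback, but keep looking for an explicit match.
            default_ades = source["ades"]
    return default_ades


print(select_ades("https://example-data.com/images/scene.tif", DATA_SOURCES))
print(select_ades("https://unknown.org/file.nc", DATA_SOURCES))
```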

Configuration of WPS Processes

Weaver allows the configuration of services or processes auto-deployment using definitions from a file formatted as wps_processes.yml.example. On application startup, the references provided in the processes list will be employed to attempt deployment of the corresponding processes locally. Provided that the resources can be correctly resolved, they will immediately be available from Weaver’s API without any further request needed.

For convenience, every reference URL in the configuration file can either refer to explicit process definition (i.e.: endpoint and query parameters that resolve to DescribeProcess response), or a group of processes under a common WPS server to iteratively register, using a GetCapabilities WPS endpoint. Please refer to wps_processes.yml.example for explicit format, keywords supported, and their resulting behaviour.

Note

Processes defined under the processes section and registered into Weaver will correspond to a local snapshot of the remote resource at that point in time, and will not update if the reference changes. On the other hand, their listing and description offering will not require the remote service to be available at all times until execution.

New in version 1.14: When references are specified using providers section instead of processes, the registration only saves the remote WPS provider endpoint to dynamically populate WPS processes on demand.

Using this registration method, the processes will always reflect the latest modification from the remote WPS provider.

  • weaver.wps_processes_file = <file-path>
    (default: WEAVER_DEFAULT_WPS_PROCESSES_CONFIG located in WEAVER_CONFIG_DIR)

    Defines a custom YAML file corresponding to wps_processes.yml.example schema to pre-load WPS processes and/or providers for registration at application startup.

    The value defined by this setting will look for the provided path as absolute location, then will attempt to resolve relative path (corresponding to where the application is started from), and will also look within the weaver/config directory. If none of the files can be found, the operation is skipped.

    To ensure that this feature is disabled and to avoid any unexpected auto-deployment provided by this functionality, simply set weaver.wps_processes_file as undefined (i.e.: nothing after = in weaver.ini). The default value is employed if the setting is not defined at all.
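The lookup order described for weaver.wps_processes_file (absolute location, then relative to where the application was started, then within the configuration directory) could look roughly like the sketch below. The function name and arguments are hypothetical, and the case where the setting is missing entirely (default value employed) is deliberately left out.

```python
import os


def resolve_processes_file(value, config_dir, start_dir=None):
    """Return the first existing candidate path, or None when the step is skipped."""
    if not value:
        # Nothing after '=' in weaver.ini: the feature is disabled.
        return None
    if os.path.isabs(value):
        candidates = [value]
    else:
        candidates = [
            os.path.join(start_dir or os.getcwd(), value),  # relative to start location
            os.path.join(config_dir, value),                # within the config directory
        ]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


print(resolve_processes_file("", "weaver/config"))
```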

Configuration of CWL Processes

New in version 4.19.

Although Weaver supports Deployment and dynamic management of Process definitions while the web application is running, it is sometimes more convenient for service providers to offer a set of predefined Application Package definitions. In order to automatically register such definitions (or update them if changed), without having to repeat any deployment requests after the application was started, it is possible to employ the configuration setting weaver.cwl_processes_dir. Registration of a Process using this approach will result in an identical definition as if it was Deployed using API requests or using the Weaver CLI and Client interfaces.

  • weaver.cwl_processes_dir = <dir-path>
    (default: WEAVER_CONFIG_DIR)

    Defines the root directory where to recursively and alphabetically load any CWL file to deploy the corresponding Process definitions. Files at higher levels are loaded first before moving down into lower directories of the structure.

    Any failed deployment from a seemingly valid CWL will be logged with the corresponding error message. Loading will proceed by ignoring failing cases according to weaver.cwl_processes_register_error setting. The number of successful Process deployments will also be reported if any should occur.

    The value defined by this setting will look for the provided path as absolute location, then will attempt to resolve relative path (corresponding to where the application is started from). If no CWL file could be found, the operation is skipped.

    To ensure that this feature is disabled and to avoid any unexpected auto-deployment provided by this functionality, simply set weaver.cwl_processes_dir as undefined (i.e.: nothing after = in weaver.ini). The default value is employed if the setting is not defined at all.

Note

When registering processes using CWL, it is mandatory for those definitions to provide an id within the file along other CWL details to let Weaver know which Process reference to use for deployment.

Warning

If a Process depends on another definition, such as in the case of a Workflow definition, all dependencies must be registered prior to this Process. Consider naming your CWL files to take advantage of loading order to resolve such situations.

  • weaver.cwl_processes_register_error = true|false [bool]
    (default: false, ignore failures)

    Indicate if Weaver should ignore failing Process deployments (when false), due to unsuccessful registration of CWL files found within any sub-directory of weaver.cwl_processes_dir path, or immediately fail (when true) when an issue is raised during Process deployment.
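The loading order and the effect of weaver.cwl_processes_register_error can be illustrated with the sketch below. It is not Weaver's implementation: the function names and the deploy callback are hypothetical, the .cwl extension filter is an assumption, and the traversal is depth-first (files of a directory are listed before its sub-directories), which is one possible reading of "higher levels first".

```python
import os


def list_cwl_files(root_dir):
    """List CWL files alphabetically, each directory's files before its sub-directories."""
    ordered = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        dirnames.sort()  # in-place sort makes os.walk descend alphabetically
        ordered.extend(
            os.path.join(dirpath, name)
            for name in sorted(filenames)
            if name.endswith(".cwl")
        )
    return ordered


def register_all(cwl_files, deploy, register_error=False):
    """Deploy each file; ignore failures unless register_error is true."""
    deployed = 0
    for path in cwl_files:
        try:
            deploy(path)
            deployed += 1
        except Exception as exc:
            if register_error:
                raise  # fail immediately, as with register_error = true
            print(f"Skipped {path}: {exc}")
    return deployed
```

With this ordering, prefixing file names (for example 01-step.cwl before 02-workflow.cwl) is one way to make sure dependencies are registered before the Workflow that uses them.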

Configuration of Request Options

New in version 1.8.

It is possible to define Request Options that consist of additional arguments passed down to weaver.utils.request_extra(), which essentially calls a traditional request using the requests module, but with extended handling capabilities such as caching, retrying, and file reference support. The specific parameters that are passed down for individual requests depend on whether a match based on URL (optionally with regex rules) and method definitions can be found in the Request Options file. This file should be provided using the weaver.request_options configuration setting. Using this definition, it is possible to provide specific request handling options, such as extended timeout, authentication arguments, SSL certificate verification setting, etc. on a per-request basis, leaving other requests unaffected and generally more secure.

See also

File request_options.yml.example provides more details and sample YAML format of the expected contents for Request Options feature.

See also

Please refer to weaver.utils.request_extra() documentation directly for supported parameters and capabilities.

  • weaver.request_options = <file-path>
    (default: None)

    Path of the Request Options definitions to employ.
  • weaver.ssl_verify = true|false [bool-like]
    (default: true)

    Toggle the SSL certificate verification across all requests.

Warning

It is NOT recommended to disable SSL verification across all requests for security reasons (avoid man-in-the-middle attacks). This is crucial for requests that involve any form of authentication, secured access or personal user data references. This should be employed only for quickly resolving issues during development. Consider fixing SSL certificates on problematic servers, or disable the verification on a per-request basis using Request Options for acceptable cases.
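The per-request matching idea can be sketched as follows. The rule structure used here is purely hypothetical and is not the schema of request_options.yml.example; first-match-wins is also an assumption. Only timeout and verify, which are standard keyword arguments of the requests module, are used as example options.

```python
import re

# Hypothetical in-memory rules; refer to request_options.yml.example for the real format.
RULES = [
    {"url": r"https://slow\.example\.com/.*", "method": "GET", "options": {"timeout": 30}},
    {"url": r"https://internal\.example\.com/.*", "method": None, "options": {"verify": False}},
]


def options_for(url, method, rules):
    """Return the extra request arguments of the first rule matching URL and method."""
    for rule in rules:
        # A rule without method applies to any HTTP method.
        method_ok = rule["method"] is None or rule["method"].upper() == method.upper()
        if method_ok and re.fullmatch(rule["url"], url):
            return dict(rule["options"])
    # Requests that match no rule are left unaffected.
    return {}


print(options_for("https://slow.example.com/data", "GET", RULES))
print(options_for("https://other.org/", "GET", RULES))
```

Scoping an option such as verify to a single known server in this manner keeps SSL verification enabled everywhere else.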

Configuration of Quotation Estimation

New in version 4.30.

The following parameters are relevant when using the OGC API - Processes - Quotation extension. If this feature is not desired, simply provide weaver.quotation = false in the weaver.ini configuration file, and all corresponding functionalities, including API endpoints, will be disabled.

New in version 4.30.

New in version 4.30.

  • weaver.quotation_docker_username = <username> [str]

    Username to employ for authentication when retrieving the Docker image used as Quote Estimator.

    Only required if the Docker image is not accessible publicly or already provided through some other means when requested by the Docker daemon. Should be combined with weaver.quotation_docker_password.

    See Currency Conversion for more details on the feature.

New in version 4.30.

  • weaver.quotation_docker_password = <password> [str]

    Password to employ for authentication when retrieving the Docker image used as Quote Estimator.

    Only required if the Docker image is not accessible publicly or already provided through some other means when requested by the Docker daemon. Should be combined with weaver.quotation_docker_username.

    See Currency Conversion for more details on the feature.

New in version 4.30.

  • weaver.quotation_currency_default = <CURRENCY> [str]
    (default: USD)

    Currency code in ISO-4217 format used by default.

    It is up to the specified Quote Estimator algorithm defined by weaver.quotation_docker_image and employed by the various Process to ensure that the returned Quote Estimation cost makes sense according to the specified default currency.

    See Currency Conversion for more details on the feature.

New in version 4.30.

  • weaver.quotation_currency_converter = <converter> [str]

    Reference currency converter to employ for retrieving conversion rates.

    Valid values are:
    - custom

    In each case, requests will be attempted using weaver.quotation_currency_token to authenticate with the API. Request caching of 1 hour will be used by default to limit chances of rate-limiting, but converter-specific plans could block request at any moment depending on the amount of Quotation and Billing requests accomplished. In such case, the conversion will not be performed and will remain in the default currency.

    If a custom URL is desired, the weaver.quotation_currency_custom_url parameter should also be provided.

    If none is provided, conversion rates will not be applied and currencies will always use weaver.quotation_currency_default.

    See Quotation and Billing for more details on the feature.

New in version 4.30.

  • weaver.quotation_currency_custom_url = <URL> [str]

    Reference custom currency converter URL pattern to employ for retrieving conversion rates.

    This applies only when using weaver.quotation_currency_converter = custom

    The specified URL will be used to perform a GET request. This URL should contain the relevant query or path parameters to perform the request. Parameters can be specified using templating ({<param>}), with parameter names token, from, to and amount to perform the conversion. The parameter token will be filled by weaver.quotation_currency_token, while the remaining values will be provided based on the source and target currency conversion requirements. The response body should be in JSON with minimally the conversion result field located at the root. The same caching policy will be applied as for the other API references.

    If none is provided, conversion rates will not be applied and currencies will always use weaver.quotation_currency_default.

    See Quotation and Billing for more details on the feature.

New in version 4.30.
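The URL templating of weaver.quotation_currency_custom_url can be illustrated with the following sketch. The converter host, its query parameter names and the result field name in the response are assumptions made for this example, and no request is actually sent.

```python
import json
from urllib.parse import quote

# Hypothetical URL pattern, using the documented {token}, {from}, {to}, {amount} names.
TEMPLATE = "https://converter.example.com/convert?key={token}&from={from}&to={to}&amount={amount}"


def build_conversion_url(template, token, source, target, amount):
    """Fill the templated parameters of the custom converter URL."""
    values = {"token": token, "from": source, "to": target, "amount": amount}
    # 'from' is a Python keyword, so values are passed by unpacking a dictionary.
    return template.format(**{key: quote(str(val), safe="") for key, val in values.items()})


def read_conversion_result(body):
    """Extract the converted amount, assumed to be a root-level 'result' field."""
    return float(json.loads(body)["result"])


print(build_conversion_url(TEMPLATE, "abc123", "USD", "CAD", 10))
print(read_conversion_result('{"result": 13.5, "success": true}'))
```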

  • weaver.quotation_currency_token = <API access token> [str]

    API access token to employ for authentication when requesting conversion rates from the currency converter.

    Used to authenticate requests performed with the converter selected by weaver.quotation_currency_converter. For a custom converter, this value fills the token parameter of weaver.quotation_currency_custom_url.
    See Quotation and Billing for more details on the feature.

New in version 4.30.

Changed in version 4.30: Renamed from weaver.quote_sync_max_wait to weaver.quotation_sync_max_wait.

Configuration of File Vault

New in version 4.9.

Configuration of the Vault is required in order to obtain access to its functionalities and to enable its API endpoints. This feature is notably employed to push local files to a remote Weaver instance when using the Weaver CLI and Client utilities, in order to use them for the Job execution. Please refer to the references below for more details.

  • weaver.vault = true|false [bool-like]
    (default: true)

    Toggles the Vault feature.
  • weaver.vault_dir = <dir-path>
    (default: /tmp/vault)

    Defines the default location where to write files uploaded to the Vault.

    If the directory does not exist, it is created on demand by the feature making use of it.

Starting the Application

Todo

complete docs

make start (or similar command)