weaver.formats
==============

.. py:module:: weaver.formats


Module Contents
---------------

.. py:data:: FileModeSteamType

.. py:data:: LOGGER

.. py:class:: AcceptLanguage

   Supported languages.

   .. py:attribute:: EN_CA
      :value: 'en-CA'

   .. py:attribute:: FR_CA
      :value: 'fr-CA'

   .. py:attribute:: EN_US
      :value: 'en-US'

   .. py:method:: offers() -> List[str]
      :classmethod:

      Languages offered by the application.

.. py:class:: ContentType

   Supported ``Content-Type`` values.

   Media-Type nomenclature::

       <type> "/" [x- | <tree> "."] <subtype> ["+" suffix] *[";" parameter=value]

   .. py:attribute:: APP_DIR
      :value: 'application/directory'

   .. py:attribute:: APP_CWL
      :value: 'application/cwl'

   .. py:attribute:: APP_CWL_JSON
      :value: 'application/cwl+json'

   .. py:attribute:: APP_CWL_YAML
      :value: 'application/cwl+yaml'

   .. py:attribute:: APP_CWL_X
      :value: 'application/x-cwl'

   .. py:attribute:: APP_FORM
      :value: 'application/x-www-form-urlencoded'

   .. py:attribute:: APP_GEOJSON
      :value: 'application/geo+json'

   .. py:attribute:: APP_GZIP
      :value: 'application/gzip'

   .. py:attribute:: APP_HDF5
      :value: 'application/x-hdf5'

   .. py:attribute:: APP_JSON
      :value: 'application/json'

   .. py:attribute:: APP_JSONLD
      :value: 'application/ld+json'

   .. py:attribute:: APP_RAW_JSON
      :value: 'application/raw+json'

   .. py:attribute:: APP_OAS_JSON
      :value: 'application/vnd.oai.openapi+json; version=3.0'

   .. py:attribute:: APP_OGC_PKG_JSON
      :value: 'application/ogcapppkg+json'

   .. py:attribute:: APP_OGC_PKG_YAML
      :value: 'application/ogcapppkg+yaml'

   .. py:attribute:: APP_NETCDF
      :value: 'application/x-netcdf'

   .. py:attribute:: APP_NT
      :value: 'application/n-triples'

   .. py:attribute:: APP_OCTET_STREAM
      :value: 'application/octet-stream'

   .. py:attribute:: APP_PDF
      :value: 'application/pdf'

   .. py:attribute:: APP_TAR
      :value: 'application/x-tar'

   .. py:attribute:: APP_TAR_GZ
      :value: 'application/tar+gzip'

   .. py:attribute:: APP_VDN_GEOJSON
      :value: 'application/vnd.geo+json'

   .. py:attribute:: APP_XML
      :value: 'application/xml'

   .. py:attribute:: APP_YAML
      :value: 'application/x-yaml'

   .. py:attribute:: APP_ZIP
      :value: 'application/zip'

   .. py:attribute:: IMAGE_GEOTIFF
      :value: 'image/tiff; subtype=geotiff'

   .. py:attribute:: IMAGE_OGC_GEOTIFF
      :value: 'image/tiff; application=geotiff'

   .. py:attribute:: IMAGE_COG
      :value: 'image/tiff; application=geotiff; profile=cloud-optimized'

   .. py:attribute:: IMAGE_JPEG
      :value: 'image/jpeg'

   .. py:attribute:: IMAGE_GIF
      :value: 'image/gif'

   .. py:attribute:: IMAGE_PNG
      :value: 'image/png'

   .. py:attribute:: IMAGE_TIFF
      :value: 'image/tiff'

   .. py:attribute:: MULTIPART_ANY
      :value: 'multipart/*'

   .. py:attribute:: MULTIPART_FORM
      :value: 'multipart/form-data'

   .. py:attribute:: MULTIPART_MIXED
      :value: 'multipart/mixed'

   .. py:attribute:: MULTIPART_RELATED
      :value: 'multipart/related'

   .. py:attribute:: TEXT_ENRICHED
      :value: 'text/enriched'

   .. py:attribute:: TEXT_HTML
      :value: 'text/html'

   .. py:attribute:: TEXT_PLAIN
      :value: 'text/plain'

   .. py:attribute:: TEXT_RICHTEXT
      :value: 'text/richtext'

   .. py:attribute:: TEXT_XML
      :value: 'text/xml'

   .. py:attribute:: TEXT_PROVN
      :value: 'text/provenance-notation'

   .. py:attribute:: TEXT_TURTLE
      :value: 'text/turtle'

   .. py:attribute:: VIDEO_MPEG
      :value: 'video/mpeg'

   .. py:attribute:: ANY_JSON

   .. py:attribute:: ANY_CWL

   .. py:attribute:: ANY_XML

   .. py:attribute:: ANY_MULTIPART

   .. py:attribute:: ANY
      :value: '*/*'

.. py:class:: ContentEncoding

   Supported ``Content-Encoding`` values.

   .. note::
       Value ``binary`` is kept for convenience and backward compatibility with older definitions.
       It will default to the same encoding strategy as if ``base64`` was specified explicitly.
       Value ``binary`` is not part of :rfc:`4648`, but remains a common occurrence that dates from
       when ``format: binary`` was the approach employed to represent binary
       (JSON-schema Draft-04 and prior) instead of what is now recommended using
       ``contentEncoding: base64`` (JSON-schema Draft-07).

   .. seealso::
       - https://github.com/json-schema-org/json-schema-spec/issues/803
       - https://github.com/json-schema-org/json-schema-spec/pull/862

   .. py:attribute:: UTF_8
      :type: Literal['UTF-8']
      :value: 'UTF-8'

   .. py:attribute:: BINARY
      :type: Literal['binary']
      :value: 'binary'

   .. py:attribute:: BASE16
      :type: Literal['base16']
      :value: 'base16'

   .. py:attribute:: BASE32
      :type: Literal['base32']
      :value: 'base32'

   .. py:attribute:: BASE64
      :type: Literal['base64']
      :value: 'base64'

   .. py:method:: is_text(encoding: Any) -> bool
      :staticmethod:

      Indicates if the ``Content-Encoding`` value can be categorized as textual data.

   .. py:method:: is_binary(encoding: Any) -> bool
      :staticmethod:

      Indicates if the ``Content-Encoding`` value can be categorized as binary data.

   .. py:method:: open_parameters(encoding: Any, mode: FileModeSteamType = 'r') -> Tuple[FileModeEncoding, Literal['UTF-8', None]]
      :staticmethod:

      Obtains relevant ``mode`` and ``encoding`` parameters for :func:`open` using the specified ``Content-Encoding``.

   .. py:method:: encode(data: AnyStr, encoding: AnyContentEncoding = BASE64, binary: Literal[True] = True) -> bytes
                  encode(data: AnyStr, encoding: AnyContentEncoding = BASE64, binary: Literal[False] = False) -> str
                  encode(data: DataStrT, encoding: AnyContentEncoding = BASE64, binary: Literal[None] = None) -> DataStrT
      :staticmethod:

      Encodes the data to the requested encoding and converts it to the string-like data type representation.

      :param data: Data to encode.
      :param encoding: Target encoding method.
      :param binary:
          If unspecified, the string-like type will be the same as the input data.
          Otherwise, convert the encoded data to :class:`str` or :class:`bytes` accordingly.
      :return: Encoded and converted data.

   .. py:method:: decode(data: AnyStr, encoding: AnyContentEncoding = BASE64, binary: Literal[True] = True) -> bytes
                  decode(data: AnyStr, encoding: AnyContentEncoding = BASE64, binary: Literal[False] = False) -> str
                  decode(data: DataStrT, encoding: AnyContentEncoding = BASE64, binary: Literal[None] = None) -> DataStrT
      :staticmethod:

      Decodes the data from the specified encoding and converts it to the string-like data type representation.

      :param data: Data to decode.
      :param encoding: Expected source encoding.
      :param binary:
          If unspecified, the string-like type will be the same as the input data.
          Otherwise, convert the decoded data to :class:`str` or :class:`bytes` accordingly.
      :return: Decoded and converted data.

.. py:class:: OutputFormat

   Renderer output formats for :term:`CLI`, `OpenAPI` and HTTP response content generation.

   .. py:attribute:: JSON
      :type: Literal['JSON', 'json']

   .. py:attribute:: JSON_STR
      :type: Literal['JSON+STR', 'json+str']

   .. py:attribute:: JSON_RAW
      :type: Literal['JSON+RAW', 'json+raw']

   .. py:attribute:: YAML
      :type: Literal['YAML', 'yaml']

   .. py:attribute:: YML
      :type: Literal['YML', 'yml']

   .. py:attribute:: XML
      :type: Literal['XML', 'xml']

   .. py:attribute:: XML_STR
      :type: Literal['XML+STR', 'xml+str']

   .. py:attribute:: XML_RAW
      :type: Literal['XML+RAW', 'xml+raw']

   .. py:attribute:: TXT
      :type: Literal['TXT', 'txt']

   .. py:attribute:: TEXT
      :type: Literal['TEXT', 'text']

   .. py:attribute:: HTML
      :type: Literal['HTML', 'html']

   .. py:attribute:: HTML_STR
      :type: Literal['HTML+STR', 'html+str']

   .. py:attribute:: HTML_RAW
      :type: Literal['HTML+RAW', 'html+raw']

   .. py:method:: get(format_or_version: Union[str, AnyOutputFormat, AnyContentType, weaver.base.PropertyDataTypeT], default: Optional[AnyOutputFormat] = None, allow_version: bool = True) -> Union[AnyOutputFormat, weaver.base.PropertyDataTypeT]
      :classmethod:

      Resolve the applicable output format.


      :param format_or_version:
          Either a :term:`WPS` version, a known value for a ``f``/``format`` query parameter, or an ``Accept``
          header that can be mapped to one of the supported output formats.
      :param default:
          Default output format if none could be resolved.
          If no explicit default is specified and the format cannot be resolved, ``JSON`` is used.
      :param allow_version:
          Enable :term:`WPS` version specifiers to infer the corresponding output representation.
      :return: Resolved output format.

   .. py:method:: convert(data: weaver.typedefs.JSON, to: Union[AnyOutputFormat, AnyContentType, None], item_root: str = 'item') -> Union[str, weaver.typedefs.JSON]
      :classmethod:

      Converts the input data from :term:`JSON` to another known format.

      :param data:
          Input data to convert. Must be a literal :term:`JSON` object, not a :term:`JSON`-like string.
      :param to:
          Target format representation.
          If the output format is not :term:`JSON`, it is **ALWAYS** converted to the formatted string of the
          requested format to ensure the contents are properly represented as intended.
          In the case of :term:`JSON` as target format or unknown format, the original object is returned directly.
      :param item_root:
          When using :term:`XML` or HTML representations, defines the top-most item name.
          Unused for other representations.
      :return: Formatted output.

.. py:class:: SchemaRole

   Constants container that provides similar functionalities to :class:`ExtendedEnum` without explicit Enum membership.

   .. py:attribute:: JSON_SCHEMA
      :value: 'https://www.w3.org/2019/wot/json-schema'

.. py:data:: _CONTENT_TYPE_EXTENSION_OVERRIDES

.. py:data:: _CONTENT_TYPE_EXCLUDE

.. py:data:: _EXTENSION_CONTENT_TYPES_OVERRIDES

.. py:data:: _CONTENT_TYPE_SCHEMA_OVERRIDES

.. py:data:: _CONTENT_TYPE_EXTENSION_MAPPING
   :type: Dict[str, str]

.. py:data:: _CONTENT_TYPE_FORMAT_MAPPING
   :type: Dict[str, pywps.inout.formats.Format]

.. py:data:: _CONTENT_TYPE_EXT_PATTERN

.. py:data:: _CONTENT_TYPE_LOCALS_MISSING

.. py:data:: _EXTENSION_CONTENT_TYPES_MAPPING

.. py:data:: _CONTENT_TYPE_CHAR_TYPES
   :value: ['application', 'multipart', 'text']

.. py:data:: _CONTENT_TYPE_SYNONYM_MAPPING

.. py:data:: IANA_NAMESPACE
   :value: 'iana'

.. py:data:: IANA_NAMESPACE_URL
   :value: 'https://www.iana.org/assignments/media-types/'

.. py:data:: IANA_NAMESPACE_DEFINITION

.. py:data:: IANA_KNOWN_MEDIA_TYPES

.. py:data:: IANA_MAPPING

.. py:data:: EDAM_NAMESPACE
   :value: 'edam'

.. py:data:: EDAM_NAMESPACE_URL
   :value: 'http://edamontology.org/'

.. py:data:: EDAM_NAMESPACE_DEFINITION

.. py:data:: EDAM_SCHEMA
   :value: 'http://edamontology.org/EDAM_1.24.owl'

.. py:data:: EDAM_MAPPING

.. py:data:: OPENGIS_NAMESPACE
   :value: 'opengis'

.. py:data:: OPENGIS_NAMESPACE_URL
   :value: 'http://www.opengis.net/'

.. py:data:: OPENGIS_NAMESPACE_DEFINITION

.. py:data:: OPENGIS_MAPPING

.. py:data:: OGC_NAMESPACE
   :value: 'ogc'

.. py:data:: OGC_NAMESPACE_URL
   :value: 'http://www.opengis.net/def/media-type/ogc/1.0/'

.. py:data:: OGC_NAMESPACE_DEFINITION

.. py:data:: OGC_MAPPING

.. py:data:: FORMAT_NAMESPACE_MAPPINGS

.. py:data:: FORMAT_NAMESPACE_DEFINITIONS

.. py:data:: FORMAT_NAMESPACE_PREFIXES

.. py:data:: FORMAT_NAMESPACES

.. py:data:: DEFAULT_FORMAT

.. py:data:: DEFAULT_FORMAT_MISSING
   :value: '__DEFAULT_FORMAT_MISSING__'

.. py:function:: get_allowed_extensions() -> List[str]

   Obtain the complete list of extensions that are permitted for processing by the application.

   .. note::
       This is employed for security reasons. Files can still be specified with another allowed extension,
       but they will not automatically inherit properties applicable to scripts and executables.
       If a specific file type is refused due to its extension, a PR can be submitted to add it explicitly.

.. py:function:: get_format(media_type: str, default: Optional[str] = None) -> Optional[pywps.inout.formats.Format]

   Obtains a :class:`Format` with predefined extension and encoding details from known media-types.

.. py:function:: get_extension(media_type: str, dot: bool = True) -> str

   Retrieves the extension corresponding to :paramref:`media_type` if explicitly defined, or by parsing it.

.. py:function:: get_content_type(extension: str, charset: Optional[str] = None, default: Optional[str] = None) -> Optional[str]

   Retrieves the Content-Type corresponding to the specified extension if it can be matched.

   :param extension: Extension for which to attempt finding a known Content-Type.
   :param charset: Charset to apply to the Content-Type as needed if extension was matched.
   :param default: Default Content-Type to return if no extension is matched.
   :return: Matched or default Content-Type.

.. py:function:: add_content_type_charset(content_type: Union[str, ContentType], charset: Optional[str]) -> str

   Apply the specific charset to the Content-Type with some validation in case of conflicting definitions.

   :param content_type: Desired Content-Type.
   :param charset: Desired charset parameter.
   :return: Updated Content-Type with charset.

.. py:function:: get_cwl_file_format(media_type: str) -> Tuple[Optional[weaver.typedefs.JSON], Optional[str]]
                 get_cwl_file_format(media_type: str, make_reference: Literal[False] = False, **__: bool) -> Tuple[Optional[weaver.typedefs.JSON], Optional[str]]
                 get_cwl_file_format(media_type: str, make_reference: Literal[True] = False, **__: bool) -> Optional[str]

   Obtains the extended schema reference from the media-type identifier.

   Obtains the corresponding `IANA`/`EDAM`/etc. ``format`` value to be applied under a :term:`CWL` :term:`I/O` ``File``
   from the :paramref:`media_type` (``Content-Type`` header) using the first matched one.

   Lookup procedure is as follows:

   - If ``make_reference=False``:

     - If there is a match, returns ``tuple({<namespace-name>: <namespace-url>}, <format>)`` with:

       1) corresponding namespace mapping to be applied under ``$namespaces`` in the `CWL`.
       2) value of ``format`` adjusted according to the namespace to be applied to ``File`` in the `CWL`.

     - If there is no match but ``must_exist=False``, returns a literal and non-existing definition as
       ``tuple({"iana": <iana-url>}, <format>)``.
     - If there is no match but ``must_exist=True`` **AND** ``allow_synonym=True``, retry the call with the
       synonym if available, or move to next step. Skip this step if ``allow_synonym=False``.
     - Otherwise, returns ``(None, None)``

   - If ``make_reference=True``:

     - If there is a match, returns the explicit format reference as ``<namespace-url>/<format>``.
     - If there is no match but ``must_exist=False``, returns the literal reference as ``<iana-url>/<format>``
       (N.B.: literal non-official media-type reference will be returned even if an official synonym exists).
     - If there is no match but ``must_exist=True`` **AND** ``allow_synonym=True``, retry the call with the
       synonym if available, or move to next step. Skip this step if ``allow_synonym=False``.
     - Returns a single ``None`` as there is no match (directly or synonym).

   Note:
       In situations where ``must_exist=False`` is used and the namespace and/or full format URL cannot be
       resolved to an existing reference, `CWL` will raise a validation error as it cannot confirm the ``format``.
       You must therefore make sure that the returned reference (or a synonym format) really exists when using
       ``must_exist=False`` before providing it to the `CWL` I/O definition. Setting ``must_exist=False`` should be
       used only for literal string comparison or pre-processing steps to evaluate formats.

   :param media_type: Some reference, namespaced or literal (possibly extended) media-type string.
   :param make_reference:
       Construct the full URL reference to the resolved media-type. Otherwise, return tuple details.
   :param must_exist:
       Return result only if it can be resolved to an official media-type (or synonym if enabled), otherwise ``None``.
       Non-official media-type can be enforced if disabled, in which case `IANA` namespace/URL is used as it preserves
       the original ``<type>/<subtype>`` format.
   :param allow_synonym:
       Allow resolution of non-official media-type to an official media-type synonym if available.
       Types defined as *synonym* have semantically the same format validation/resolution for :term:`CWL`.
       Requires ``must_exist=True``, otherwise the non-official media-type is employed directly as result.
   :returns:
       Resolved media-type format for `CWL` usage, according to the specified arguments (see description details).

.. py:function:: map_cwl_media_type(cwl_format: Optional[str]) -> Optional[str]

   Obtains the Media-Type that corresponds to the specified :term:`CWL` ``format``.

   :param cwl_format:
       Long form URL or namespaced variant of a :term:`CWL` format referring to an ontology Media-Type.
   :return: Resolved Media-Type.

.. py:function:: clean_media_type_format(media_type: str, suffix_subtype: bool = False, strip_parameters: bool = False) -> Optional[str]

   Obtains a generic media-type identifier by cleaning up any additional parameters.

   Removes any additional namespace key or URL from :paramref:`media_type` so that it corresponds to the generic
   representation (e.g.: ``application/json``) instead of the ``<namespace-name>:<format>`` mapping variant used
   in `CWL->inputs/outputs->File->format` or the complete URL reference.

   Removes any leading temporary local file prefix inserted by :term:`CWL` when resolving namespace mapping.
   This transforms ``file:///tmp/dir/path/package#application/json`` to plain ``application/json``.

   According to provided arguments, it also cleans up additional parameters or extracts sub-type suffixes.

   :param media_type:
       Media-Type, full URL to media-type or namespace-formatted string that must be cleaned up.
   :param suffix_subtype:
       Removes additional sub-type specialization details separated by the ``+`` symbol such that an explicit
       format like ``application/vnd.api+json`` returns only its most basic suffix format defined as
       ``application/json``.
   :param strip_parameters:
       Removes additional media-type parameters such that only the leading part defining the ``type/subtype``
       is returned. For example, this will get rid of ``; charset=UTF-8`` or ``; version=4.0`` parameters.

   .. note::
       Parameters :paramref:`suffix_subtype` and :paramref:`strip_parameters` are not necessarily exclusive.

   .. versionchanged:: 6.3.0
       If :paramref:`media_type` contains multiple entries (such as when many acceptable :term:`Media-Type`
       are specified), only the first one will be considered.

.. py:function:: default_format_handler(output_format: Union[str, AnyOutputFormat, AnyContentType]) -> Optional[AnyContentType]

.. py:function:: guess_target_format(request: weaver.typedefs.AnyRequestType, default: Optional[Union[ContentType, str]]) -> ContentType
                 guess_target_format(request: weaver.typedefs.AnyRequestType, return_source: Literal[True], override_user_agent: bool) -> Tuple[ContentType, FormatSource]
                 guess_target_format(request: weaver.typedefs.AnyRequestType, default: Optional[Union[ContentType, str]], return_source: Literal[True], override_user_agent: bool) -> Tuple[ContentType, FormatSource]
                 guess_target_format(request: weaver.typedefs.AnyRequestType, **kwargs: Any) -> ContentType

   Guess the best applicable response ``Content-Type`` header from the request.

   Considers the request ``Accept`` header, ``format`` query and alternatively ``f`` query to parse possible formats.

   Full Media-Types are expected in the header. Query parameters can use either the full Media-Type or only the
   sub-type (i.e.: :term:`JSON`, :term:`XML`, etc.), with case-insensitive names.

   Defaults to :py:data:`ContentType.APP_JSON` if no :paramref:`default` was specified explicitly and no ``Accept``
   header or ``format``/``f`` queries were provided. Otherwise, applies the specified :paramref:`default` when
   format specifiers were not provided in the request.


   Can apply ``User-Agent`` specific logic to override the ``Accept`` headers automatically added by many browsers,
   such that sending requests to the :term:`API` using them will not automatically default back to typical
   :term:`XML` or :term:`HTML` representations. If browsers are used to send requests, but ``format``/``f`` queries
   are used directly in the URL, those will be applied since this is a very intuitive (and easier) approach to
   request different formats when using browsers. Option :paramref:`override_user_agent` must be enabled to apply
   this behavior.

   When ``User-Agent`` clients are identified as another source, such as sending requests from a server or from
   code, both headers and query parameters are applied directly without question.

   :returns: Matched media-type or default, and optionally, the source of resolution.

.. py:function:: find_supported_media_types(io_definition: weaver.typedefs.ProcessInputOutputItem) -> Optional[List[str]]

   Finds all supported media-types indicated by an :term:`I/O`.

   .. note::
       Assumes that media-types are indicated under ``formats``, which should have been obtained either by direct
       submission when using :term:`WPS` deployment, generated from ``schema`` using :term:`OGC` deployment,
       or using the nested ``format`` of ``File`` types from :term:`CWL` deployment.

   :param io_definition:
   :return: Supported media-types.

.. py:function:: json_default_handler(obj: Any) -> Union[weaver.typedefs.JSON, str, None]

.. py:function:: repr_json(data: Any, force_string: bool = True, ensure_ascii: bool = False, indent: Optional[int] = 2, separators: Optional[Tuple[str, str]] = None, **kwargs: Any) -> Union[weaver.typedefs.JSON, str, None]

   Ensure that the input data can be serialized as JSON to return its formatted representation as such.

   If formatting as JSON fails, returns the data as string representation or ``None`` accordingly.
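
The fallback behavior described for ``repr_json`` can be illustrated with the standard library alone. The helper below, ``repr_json_sketch``, is a hypothetical stand-in written for this page and is not the Weaver implementation. It assumes that "fails" means ``json.dumps`` raising an error, that the fallback is ``str(data)``, and that ``None`` is returned only when the input itself is ``None``; the actual function may differ on all three points.

```python
import json
from typing import Any, Optional


def repr_json_sketch(data: Any, ensure_ascii: bool = False, indent: Optional[int] = 2) -> Optional[str]:
    """Hypothetical stand-in for the described behavior (not Weaver's code)."""
    if data is None:
        # Assumption: nothing to represent, so None is passed through.
        return None
    try:
        # First attempt: format the data as a JSON string.
        return json.dumps(data, ensure_ascii=ensure_ascii, indent=indent)
    except (TypeError, ValueError):
        # Serialization failed (unsupported object type, circular reference):
        # fall back to the plain string representation of the data.
        return str(data)


formatted = repr_json_sketch({"key": [1, 2]}, indent=None)
fallback = repr_json_sketch({1, 2, 3})  # a set is not JSON-serializable
```

Catching only ``TypeError`` and ``ValueError`` keeps the sketch narrow: those are the errors ``json.dumps`` raises for unsupported types and circular references, so unrelated failures are not silently hidden.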