weaver.xml_util
Define a default XML parser that avoids XXE injection.
The lxml package is used directly, even though some linters (e.g., bandit) recommend using defusedxml
instead, because defusedxml's lxml extension is marked as deprecated.
To use the module, import it as if importing lxml.etree:
from weaver.xml_util import XML # ElementTree
from weaver import xml_util
data = xml_util.fromstring("<xml>content</xml>")
Module Contents
- weaver.xml_util.parse(source: str | io.BufferedReader, parser: lxml.etree.XMLParser = XML_PARSER) → XMLTree
- weaver.xml_util._lxml_tree_parser_maker(**parser_kwargs: Any) → lxml.etree.HTMLParser
Generate the XML/HTML tree parser.
Uses parameters similar to those in
bs4.builder._lxml.LXMLTreeBuilderForXML.default_parser(), but overrides some options to make the parser more secure. Without this modification, the builder is usually created using:
etree.XMLParser(target=self, strip_cdata=False, recover=True, encoding=encoding)
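As a rough illustration of what "more secure" can mean here, the sketch below builds an lxml parser that keeps the strip_cdata and recover settings shown above but disables entity resolution and network access. The helper name make_hardened_parser and the exact set of overridden options are assumptions made for this example; they are not taken from the weaver source.

```python
from lxml import etree


def make_hardened_parser(**parser_kwargs):
    """Hypothetical helper: build an lxml XML parser with XXE-prone features disabled."""
    # Defaults mirroring the bs4 builder call shown above.
    options = {"strip_cdata": False, "recover": True}
    options.update(parser_kwargs)
    # Applied last so that callers cannot re-enable them by accident.
    options.update(
        resolve_entities=False,  # leave entity references unexpanded
        no_network=True,         # never fetch external resources over the network
    )
    return etree.XMLParser(**options)


# A document that declares an entity and references it in the root element.
payload = b'<!DOCTYPE foo [<!ENTITY xxe "secret">]><foo>&xxe;</foo>'
root = etree.fromstring(payload, parser=make_hardened_parser())

# The entity's replacement text is not substituted into the element text.
leaked = "secret" in (root.text or "")
```

With these options the entity reference is kept as a reference instead of being replaced by its declared value, so the same mechanism cannot be used to pull in local files or remote content.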