weaver.xml_util
Define a default XML parser that avoids XXE injection.
The lxml package is used directly, even though some linters (e.g., bandit) recommend using defusedxml instead, because defusedxml's lxml extension is marked as deprecated.
To use the module, import it as if importing lxml.etree:
from weaver.xml_util import XML # ElementTree
from weaver import xml_util
data = xml_util.fromstring("<xml>content</xml>")
Module Contents
- weaver.xml_util.parse(source: str | io.BufferedReader, parser: lxml.etree.XMLParser = XML_PARSER) XMLTree [source]
- weaver.xml_util._lxml_tree_parser_maker(**parser_kwargs: Any) lxml.etree.HTMLParser [source]
Generate the XML/HTML tree parser.
Uses similar parameters as in bs4.builder._lxml.LXMLTreeBuilderForXML.default_parser(), but overriding some other options to make it more secure.
Without this modification, the builder is usually created using:
etree.XMLParser(target=self, strip_cdata=False, recover=True, encoding=encoding)
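The idea of overriding parser options for security could look roughly like the sketch below. It is not weaver's actual implementation: the helper name make_secure_parser, the SECURE_OVERRIDES mapping, and the particular options chosen are assumptions made for illustration. Only the lxml.etree.XMLParser keyword arguments themselves are real lxml options. The target argument from the call above is omitted because it only applies inside the bs4 builder.

```python
from lxml import etree

# Assumption: these overrides are illustrative and are not copied from
# weaver's actual parser configuration.
SECURE_OVERRIDES = {
    "resolve_entities": False,  # leave entity references unexpanded
    "no_network": True,         # never fetch DTDs or entities over the network
    "load_dtd": False,          # do not load external DTD subsets
    "dtd_validation": False,    # validation would require loading the DTD
}


def make_secure_parser(**parser_kwargs):
    """Hypothetical helper: build an XMLParser whose secure options always win."""
    options = {"strip_cdata": False, "recover": True}  # defaults quoted above
    options.update(parser_kwargs)     # caller preferences (e.g. encoding)
    options.update(SECURE_OVERRIDES)  # applied last so they cannot be undone
    return etree.XMLParser(**options)


PAYLOAD = b"""<?xml version="1.0"?>
<!DOCTYPE data [<!ENTITY leak "expanded-entity-value">]>
<data>&leak;</data>"""

# Even if the caller asks for entity resolution, the override disables it.
secure_root = etree.fromstring(
    PAYLOAD, parser=make_secure_parser(resolve_entities=True)
)

# A parser that does resolve entities, shown for comparison only.
naive_root = etree.fromstring(
    PAYLOAD, parser=etree.XMLParser(resolve_entities=True)
)
```

Applying the secure options last means a caller cannot re-enable entity resolution through keyword arguments, which is one plausible reading of "overriding some other options" in the description above.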