Parser
- class headerparser.HeaderParser(normalizer=None, body=None, **kwargs)[source]
A parser for RFC 822-style header sections. Define the fields the parser should recognize with the add_field() method, configure handling of unrecognized fields with add_additional(), and then parse input with parse() or another parse_*() method.
- Parameters
normalizer (callable) – By default, the parser will consider two field names to be equal iff their lowercased forms are equal. This can be overridden by setting normalizer to a custom callable that takes a field name and returns a “normalized” name for use in equality testing. The normalizer will also be used when looking up keys in the NormalizedDict instances returned by the parser’s parse_*() methods.
body (bool) – whether the parser should allow or forbid a body after the header section; True means a body is required, False means a body is prohibited, and None (the default) means a body is optional
kwargs – scanner options
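As an illustration of the normalizer parameter, the following sketch defines a custom callable that, in addition to lowercasing, treats hyphens and underscores as equivalent. The function name normalize_name and the hyphen/underscore rule are assumptions made for this example; only the callable's shape (field name in, normalized name out) comes from the description above.

```python
def normalize_name(name: str) -> str:
    """Return a normalized form of a header field name.

    Hypothetical rule for this sketch: compare names case-insensitively
    and treat "-" and "_" as the same character.
    """
    return name.lower().replace("_", "-")


# Two spellings of the same field normalize to the same key ...
assert normalize_name("Content_Type") == normalize_name("content-type")
# ... while genuinely different names stay distinct.
assert normalize_name("Content-Type") != normalize_name("Content-Length")
```

A callable like this would be supplied as HeaderParser(normalizer=normalize_name), after which both spellings should refer to the same field during parsing and lookup.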
- add_additional(enable=True, **kwargs)[source]
Specify how the parser should handle fields in the input that were not previously registered with add_field. By default, unknown fields will cause the parse_* methods to raise an UnknownFieldError, but calling this method with enable=True (the default) will change the parser’s behavior so that all unregistered fields are processed according to the options in **kwargs. (If no options are specified, the additional values will just be stored in the result dictionary.)
If this method is called more than once, only the settings from the last call will be used.
Note that additional field values are always stored in the result dictionary using their field name as the key, and two fields are considered the same (for the purposes of multiple) iff their names are the same after normalization. Customization of the dictionary key and field name can only be done through add_field.
New in version 0.2.0: action argument added
- Parameters
enable (bool) – whether the parser should accept input fields that were not registered with add_field; setting this to False disables additional fields and restores the parser’s default behavior
multiple (bool) – If True, each additional header field will be allowed to occur more than once in the input, and each field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if an additional field occurs more than once in the input.
unfold (bool) – If True (default False), additional field values will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
type (callable) – a callable to apply to additional field values before storing them in the result dictionary
choices (iterable) – A sequence of values which additional fields are allowed to have. If choices is defined, all additional field values in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired.
- Returns
- Raises
ValueError – if enable is true and a previous call to add_field used a custom dest
ValueError – if choices is an empty sequence
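The order in which unfold, type, and choices are applied to a field value can be pictured with the short sketch below. It is a stand-alone illustration under stated assumptions, not the library's implementation: the helper names unfold and process_value, the keyword names do_unfold and convert, and the use of ValueError in place of InvalidChoiceError are all invented for this example.

```python
import re


def unfold(value: str) -> str:
    """Join a folded (multi-line) field value into a single line.

    Sketch of the behavior described above: each line break, together
    with the whitespace around it, collapses to one space.
    """
    return re.sub(r"[ \t]*(?:\r\n|\r|\n)[ \t]*", " ", value).strip()


def process_value(value, *, do_unfold=False, convert=None, choices=None):
    """Apply the documented steps in order: unfold, then type, then choices."""
    if do_unfold:
        value = unfold(value)
    if convert is not None:
        value = convert(value)
    if choices is not None and value not in choices:
        # Stand-in for the library's InvalidChoiceError.
        raise ValueError(f"{value!r} is not a valid choice")
    return value


folded = "doctest, examples,\n  whatever"
print(unfold(folded))  # doctest, examples, whatever
print(process_value("  3\n", do_unfold=True, convert=int, choices=[1, 2, 3]))  # 3
```

Because the conversion runs before the membership test, choices has to list converted values (here integers), not the raw strings found in the input.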
- add_field(name, *altnames, **kwargs)[source]
Define a header field for the parser to parse. During parsing, if a field is encountered whose name (modulo normalization) equals either name or one of the altnames, the field’s value will be processed according to the options in **kwargs. (If no options are specified, the value will just be stored in the result dictionary.)
New in version 0.2.0: action argument added
- Parameters
name (string) – the primary name for the field, used in error messages and as the default value of dest
altnames (strings) – field name synonyms
dest – The key in the result dictionary in which the field’s value(s) will be stored; defaults to name. When additional headers are enabled (see add_additional), dest must equal (after normalization) one of the field’s names.
required (bool) – If True (default False), the parse_* methods will raise a MissingFieldError if the field is not present in the input
default – The value to associate with the field if it is not present in the input. If no default value is specified, the field will be omitted from the result dictionary if it is not present in the input. default cannot be set when the field is required. type, unfold, and action will not be applied to the default value, and the default value need not belong to choices.
multiple (bool) – If True, the header field will be allowed to occur more than once in the input, and all of the field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if the field occurs more than once in the input.
unfold (bool) – If True (default False), the field value will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
type (callable) – a callable to apply to the field value before storing it in the result dictionary
choices (iterable) – A sequence of values which the field is allowed to have. If choices is defined, all occurrences of the field in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired. When action is defined for a field, dest cannot be.
- Returns
- Raises
ValueError – if another field with the same name or dest was already defined
ValueError – if dest is not one of the field’s names and add_additional is enabled
ValueError – if default is defined and required is true
ValueError – if choices is an empty sequence
ValueError – if both dest and action are defined
TypeError – if name or one of the altnames is not a string
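To make the action parameter more concrete, here is a minimal sketch of a callable with the three-argument shape described above (current dictionary, field name, processed value). The function add_keywords and its comma-splitting rule are hypothetical, and a plain dict stands in for the parser's result dictionary so that the example runs on its own.

```python
def add_keywords(fields, name, value):
    """Hypothetical action: merge comma-separated keywords into one sorted list.

    An action replaces the default storing behavior, so it has to write
    to the result dictionary itself.
    """
    keywords = {word.strip() for word in value.split(",") if word.strip()}
    merged = set(fields.get(name, [])) | keywords
    fields[name] = sorted(merged)


# Simulate the parser invoking the action for two occurrences of a field,
# using a plain dict in place of the parser's result dictionary.
result = {}
add_keywords(result, "Keywords", "parser, rfc822")
add_keywords(result, "Keywords", "headers, parser")
print(result)  # {'Keywords': ['headers', 'parser', 'rfc822']}
```

Such a callable would be passed as the action argument of add_field; since dest and action cannot both be defined, the action itself decides which key to write to.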
- parse(iterable)[source]
New in version 0.4.0.
Parse an RFC 822-style header field section (possibly followed by a message body) from the contents of the given filehandle or sequence of lines and return a dictionary of the header fields (possibly with body attached). If iterable is an iterable of str, newlines will be appended to lines in multiline header fields where not already present but will not be inserted where missing inside the body.
- Parameters
iterable – a text-file-like object or iterable of lines to parse
- Return type
NormalizedDict
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if the header section is malformed
- parse_file(fp)[source]
Parse an RFC 822-style header field section (possibly followed by a message body) from the contents of the given filehandle and return a dictionary of the header fields (possibly with body attached).
Deprecated since version 0.4.0: Use parse() instead.
- Parameters
fp (file-like object) – the file to parse
- Return type
NormalizedDict
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if the header section is malformed
- parse_lines(iterable)[source]
Parse an RFC 822-style header field section (possibly followed by a message body) from the given sequence of lines and return a dictionary of the header fields (possibly with body attached). Newlines will be inserted where not already present in multiline header fields but will not be inserted inside the body.
Deprecated since version 0.4.0: Use parse() instead.
- Parameters
iterable (iterable of strings) – a sequence of lines comprising the text to parse
- Return type
NormalizedDict
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if the header section is malformed
- parse_next_stanza(iterator)[source]
New in version 0.4.0.
Parse an RFC 822-style header field section from the contents of the given filehandle or iterator of lines and return a dictionary of the header fields. Input processing stops at the end of the header section, leaving the rest of the iterator unconsumed. As a message body is not consumed, calling this method when body is true will produce a MissingBodyError.
- Parameters
iterator – a text-file-like object or iterator of lines to parse
- Return type
NormalizedDict
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
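The point about leaving the rest of the iterator unconsumed is easiest to see with a small stand-alone sketch. The helper read_stanza_lines below is hypothetical and does no header parsing; it only mimics the consumption pattern, under the assumptions that a blank line ends a stanza and that leading blank lines are skipped.

```python
def read_stanza_lines(lines):
    """Collect lines from an iterator up to the first blank line.

    Simplified stand-in for the stanza-reading behavior described above;
    it does no header parsing at all. Leading blank lines are skipped.
    """
    stanza = []
    for line in lines:
        if line.strip() == "":
            if stanza:
                break  # blank line ends the current stanza
            continue   # ignore blank lines before the stanza starts
        stanza.append(line.rstrip("\r\n"))
    return stanza


source = iter(["Name: one\n", "Type: a\n", "\n", "Name: two\n"])
first = read_stanza_lines(source)
rest = list(source)
print(first)  # ['Name: one', 'Type: a']
print(rest)   # ['Name: two\n']
```

Note that the argument has to be a true iterator (here the result of iter(...)) for this to work: passing a list would restart from the beginning on every call, so nothing would be left "unconsumed" to pick up later.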
- parse_next_stanza_string(s)[source]
New in version 0.4.0.
Parse an RFC 822-style header field section from the given string and return a pair of a dictionary of the header fields and the rest of the string. As a message body is not consumed, calling this method when body is true will produce a MissingBodyError.
- Parameters
s (string) – the text to parse
- Return type
pair of NormalizedDict and a string
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stanzas(iterable)[source]
New in version 0.4.0.
Parse zero or more stanzas of RFC 822-style header fields from the given filehandle or sequence of lines and return a generator of dictionaries of header fields.
All of the input is treated as header sections, not message bodies; as a result, calling this method when body is true will produce a MissingBodyError.
- Parameters
iterable – a text-file-like object or iterable of lines to parse
- Return type
generator of NormalizedDict
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stanzas_stream(fields)[source]
New in version 0.4.0.
Parse an iterable of iterables of (name, value) pairs as returned by scan_stanzas() or scan_stanzas_string() and return a generator of dictionaries of header fields. This is a low-level method that you will usually not need to call.
- Parameters
fields – an iterable of iterables of pairs of strings
- Return type
generator of NormalizedDict
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stanzas_string(s)[source]
New in version 0.4.0.
Parse zero or more stanzas of RFC 822-style header fields from the given string and return a generator of dictionaries of header fields.
All of the input is treated as header sections, not message bodies; as a result, calling this method when body is true will produce a MissingBodyError.
- Parameters
s (string) – the text to parse
- Return type
generator of NormalizedDict
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stream(fields)[source]
Process a sequence of (name, value) pairs as returned by scan() or scan_string() and return a dictionary of header fields (possibly with body attached). This is a low-level method that you will usually not need to call.
- Parameters
fields (iterable of pairs of strings) – a sequence of (name, value) pairs representing the input fields
- Return type
NormalizedDict
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ValueError – if the input contains more than one body pair
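How a stream of (name, value) pairs turns into a dictionary can be sketched roughly as follows. This is a simplified illustration, not the library's code: the function collect_fields, its multiple argument, the use of normalized names as dictionary keys, and the ValueError standing in for DuplicateFieldError are all assumptions of the example, and body pairs are not modeled.

```python
def collect_fields(pairs, normalizer=str.lower, multiple=()):
    """Fold (name, value) pairs into a dict keyed by normalized field name.

    Illustration only: names listed in `multiple` accumulate their values
    in a list; any other name may appear at most once.
    """
    multiple = {normalizer(name) for name in multiple}
    result = {}
    for name, value in pairs:
        key = normalizer(name)
        if key in multiple:
            result.setdefault(key, []).append(value)
        elif key in result:
            # Stand-in for the library's DuplicateFieldError.
            raise ValueError(f"duplicate field: {name!r}")
        else:
            result[key] = value
    return result


pairs = [("Name", "demo"), ("Tag", "a"), ("TAG", "b")]
print(collect_fields(pairs, multiple=["Tag"]))
# {'name': 'demo', 'tag': ['a', 'b']}
```

The sketch shows why "Tag" and "TAG" end up in the same list: equality is decided on the normalized names, and only fields that allow multiple occurrences may repeat.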
- parse_string(s)[source]
Parse an RFC 822-style header field section (possibly followed by a message body) from the given string and return a dictionary of the header fields (possibly with body attached).
- Parameters
s (string) – the text to parse
- Return type
NormalizedDict
- Raises
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if the header section is malformed