PROXY Protocol Support
When put behind a “proxy” / load balancer, server programs can no longer “see” the original client’s actual IP Address and Port.
This also affects aiosmtpd.
The HAProxy Developers have created a protocol called “PROXY Protocol” designed to solve this issue. You can read the reasoning behind this in their blog.
This initiative has been accepted and supported by many important software and services such as Amazon Web Services, HAProxy, NGINX, stunnel, Varnish, and many others.
aiosmtpd implements the PROXY Protocol as defined in the documentation accompanying HAProxy v2.3.0;
both Version 1 and Version 2 are supported.
Activating
To activate aiosmtpd’s PROXY Protocol Support,
you have to set the proxy_protocol_timeout parameter of the SMTP Class
to a positive numeric value (int or float).
The PROXY Protocol documentation suggests that the timeout should not be less than 3.0 seconds.
Important
Once you activate PROXY Protocol support, the standard (E)SMTP handshake is no longer available.
Clients trying to connect to aiosmtpd will be REQUIRED
to send the PROXY Protocol Header
before they can continue with the (E)SMTP transaction.
This is as specified in the PROXY Protocol documentation.
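To make the requirement concrete, the sketch below builds the human-readable Version 1 header that a proxy would send before any (E)SMTP traffic. The function name `build_proxy_v1_header` and the addresses are hypothetical and only serve as illustration; the line layout follows the Version 1 format in the PROXY Protocol documentation.

```python
import ipaddress


def build_proxy_v1_header(src: str, dst: str, src_port: int, dst_port: int) -> bytes:
    """Build a PROXY protocol Version 1 header line for a proxied TCP connection."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if src_ip.version != dst_ip.version:
        raise ValueError("source and destination must share an address family")
    # Version 1 only knows TCP4, TCP6, and UNKNOWN
    family = "TCP4" if src_ip.version == 4 else "TCP6"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


# Example addresses are taken from the documentation-only ranges
header = build_proxy_v1_header("192.0.2.10", "198.51.100.7", 56324, 25)
```

A client that skips this line and starts with the usual SMTP greeting exchange would not be served once PROXY Protocol support is active.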
handle_PROXY Hook
In addition to activating the PROXY protocol support as described above,
you MUST implement the handle_PROXY hook.
If the handler object does not implement handle_PROXY,
then all connection attempts will be rejected.
The signature of handle_PROXY must be as follows:
- handle_PROXY(server, session, envelope, proxy_data)
- Parameters:
server (aiosmtpd.smtp.SMTP) – The SMTP instance invoking the hook.
session (Session) – The Session data so far (see Important note below)
envelope (Envelope) – The Envelope data so far (see Important note below)
proxy_data (ProxyData) – The result of parsing the PROXY Header
- Returns:
Truthy or Falsey, indicating if the connection may continue or not, respectively
Important
The session.peer attribute will contain the IP:port information of the directly adjacent client. In other words, it will contain the endpoint identifier of the proxying entity.
The endpoint identifier of the “original” client will be recorded only in the proxy_data parameter.
The envelope data will usually be empty(ish), because the PROXY handshake takes place before the client can send any transaction data.
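A minimal sketch of such a hook follows. It assumes that handle_PROXY, like other aiosmtpd hooks, is written as a coroutine, and it uses a hypothetical allow-list (`ALLOWED_NETS`) of networks. So that the sketch runs without aiosmtpd installed, the `check` helper feeds the hook a stand-in object that only imitates the `valid` and `src_addr` attributes of ProxyData.

```python
import asyncio
import ipaddress
from types import SimpleNamespace

# Hypothetical allow-list of networks the "original" client may come from
ALLOWED_NETS = [ipaddress.ip_network("192.0.2.0/24")]


class AllowListHandler:
    """Accept a connection only if the original client is allow-listed."""

    async def handle_PROXY(self, server, session, envelope, proxy_data):
        # Reject outright if the PROXY header is not valid
        if not proxy_data.valid:
            return False
        src = proxy_data.src_addr
        # src_addr is an IP address object only for the IPv4/IPv6 families
        if not isinstance(src, (ipaddress.IPv4Address, ipaddress.IPv6Address)):
            return False
        return any(src in net for net in ALLOWED_NETS)


def check(src_text: str, valid: bool = True) -> bool:
    """Run the hook against a stand-in object (NOT a real ProxyData)."""
    fake = SimpleNamespace(valid=valid, src_addr=ipaddress.ip_address(src_text))
    return asyncio.run(AllowListHandler().handle_PROXY(None, None, None, fake))
```

With aiosmtpd installed, a handler like this would presumably be handed to the server together with the timeout, for example `Controller(AllowListHandler(), proxy_protocol_timeout=5.0)`; treat that call as an assumption and check it against the Controller documentation.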
Parsing the Header
You do not have to concern yourself with parsing the PROXY Protocol header;
the aiosmtpd.proxy_protocol module contains the full parsing logic.
All you need to do is to validate the parsed result in the handle_PROXY hook.
Enums
- class aiosmtpd.proxy_protocol.AF
- UNSPEC = 0
- IP4 = 1
- IP6 = 2
- UNIX = 3
For Version 1, UNKNOWN is mapped to UNSPEC.
- class aiosmtpd.proxy_protocol.PROTO
- UNSPEC = 0
- STREAM = 1
- DGRAM = 2
For Version 1, UNKNOWN is mapped to UNSPEC, and TCP is mapped to STREAM.
- class aiosmtpd.proxy_protocol.V2_CMD
- LOCAL = 0
- PROXY = 1
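The Version 1 mapping described above can be pictured with local stand-in enums. These classes only mirror the values listed in this section and are not imported from aiosmtpd; `V1_TOKENS` and `classify_v1` are hypothetical names.

```python
from enum import IntEnum


class AF(IntEnum):  # stand-in mirroring the values listed above
    UNSPEC = 0
    IP4 = 1
    IP6 = 2
    UNIX = 3


class PROTO(IntEnum):  # stand-in mirroring the values listed above
    UNSPEC = 0
    STREAM = 1
    DGRAM = 2


# Version 1 protocol tokens and the (family, protocol) pair each one maps to
V1_TOKENS = {
    b"UNKNOWN": (AF.UNSPEC, PROTO.UNSPEC),
    b"TCP4": (AF.IP4, PROTO.STREAM),
    b"TCP6": (AF.IP6, PROTO.STREAM),
}


def classify_v1(token: bytes):
    """Return the (AF, PROTO) pair for a Version 1 protocol token."""
    return V1_TOKENS[token]
```

Because Version 1 has no token for Unix sockets or datagrams, AF.UNIX and PROTO.DGRAM never appear in this table.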
ProxyData API
- class aiosmtpd.proxy_protocol.ProxyData(version=None)
- Attributes & Properties
- version: int | None
Contains the version of the PROXY Protocol header.
If None, it indicates that parsing has failed and the header is malformed.
- family: AF
Contains the address family.
Valid values for Version 1 exclude AF.UNIX.
- protocol: PROTO
Contains an integer indicating the transport protocol being proxied.
Valid values for Version 1 exclude PROTO.DGRAM.
- src_addr: IPv4Address | IPv6Address | AnyStr
Contains the source address (i.e., address of the “original” client).
The type of this attribute depends on the address family.
- dst_addr: IPv4Address | IPv6Address | AnyStr
Contains the destination address (i.e., address of the proxying entity to which the “original” client connected).
The type of this attribute depends on the address family.
- src_port: int
Contains the source port (i.e., port of the “original” client).
Valid only for address family of AF.INET or AF.INET6.
- dst_port: int
Contains the destination port (i.e., port of the proxying entity to which the “original” client connected).
Valid only for address family of AF.INET or AF.INET6.
- rest: ByteString
The contents depend on the version of the PROXY header and (for version 2) the address family.
For PROXY Header version 1, it contains all the bytes following b"UNKNOWN" up until, but not including, the CRLF terminator.
For PROXY Header version 2:
For address family UNSPEC, it contains all the bytes following the 16-octet header preamble.
For address families AF.INET, AF.INET6, and AF.UNIX, it contains all the bytes following the address information.
- tlv: aiosmtpd.proxy_protocol.ProxyTLV
This property contains the result of the TLV Parsing attempt of the rest attribute.
If this property returns None, that means either (1) rest is empty, or (2) TLV Parsing was not successful.
- valid: bool
This property will indicate if PROXY Header is valid or not.
- whole_raw: bytearray
This attribute contains the whole, undecoded and unmodified, PROXY Header. For version 1, it contains everything up to and including the terminating \r\n. For version 2, it contains everything up to and including the last TLV Vector.
If you need to verify the CRC32C TLV Vector (PROXYv2), you should run the CRC32C calculation against the contents of this attribute. For more information, see the section Note on CRC32C Calculation below.
- tlv_start: int
This attribute points to the first TLV Vector, if one exists.
If you need to verify the CRC32C TLV Vector, use this offset to locate the TLV Vectors within whole_raw.
The value will be None if the PROXY version is 1.
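To visualize how whole_raw, rest, and tlv_start relate for Version 2, the sketch below assembles a PROXYv2 header for TCP over IPv4 by hand, following the layout in the PROXY Protocol documentation: a 16-octet preamble, a 12-octet IPv4 address block, then the TLV Vectors. The helper name `build_v2_ipv4_header` is hypothetical, and treating tlv_start as a byte offset into whole_raw is an assumption based on the Note on CRC32C Calculation.

```python
import ipaddress
import struct

V2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"  # fixed 12-octet signature


def build_v2_ipv4_header(src, dst, src_port, dst_port, tlvs=b""):
    """Assemble a PROXYv2 header for a proxied TCP-over-IPv4 connection."""
    addr_block = (
        ipaddress.IPv4Address(src).packed
        + ipaddress.IPv4Address(dst).packed
        + struct.pack("!HH", src_port, dst_port)
    )
    ver_cmd = 0x21    # high nibble: version 2; low nibble: command PROXY
    fam_proto = 0x11  # high nibble: IPv4 family; low nibble: STREAM
    length = len(addr_block) + len(tlvs)  # everything after the preamble
    preamble = V2_SIGNATURE + struct.pack("!BBH", ver_cmd, fam_proto, length)
    return preamble + addr_block + tlvs


# One AUTHORITY TLV: type 0x02, 2-octet length (11), then the value
tlvs = b"\x02\x00\x0bexample.org"
whole_raw = build_v2_ipv4_header("192.0.2.10", "198.51.100.7", 56324, 25, tlvs)

# Assumed meaning of tlv_start: preamble (16) + IPv4 address block (12)
tlv_start = 16 + 12
rest = whole_raw[tlv_start:]
```

For this address family, the bytes in `rest` are exactly the TLV Vectors, which is what the tlv property would then try to parse.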
Methods
- with_error(error_msg: str) ProxyData
- Parameters:
error_msg (str) – Error message
- Returns:
self
Sets the instance’s error attribute and returns itself.
- same_attribs(_raises=False, **kwargs) bool
- Parameters:
_raises (bool) – If True, raise an exception if an attribute does not match or is not found, instead of returning a bool. Defaults to False
- Raises:
ValueError – if _raises=True and attribute is found but value is wrong
KeyError – if _raises=True and attribute is not found
A helper method to quickly verify whether an attribute exists and contains the expected value.
Example usage:
proxy_data.same_attribs(
    version=1,
    protocol=b"TCP4",
    unknown_attrib=None
)
In the above example, same_attribs will check that the attributes version, protocol, and unknown_attrib all exist and contain the values 1, b"TCP4", and None, respectively.
Missing attributes and/or differing values will return False (unless _raises=True).
Note
For other examples, take a look inside the test_proxyprotocol.py file. That file uses same_attribs extensively.
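The documented behaviour of same_attribs can be pictured with a small free-standing function. This is a stand-in written for illustration, not aiosmtpd's implementation; it only follows the description above (False on mismatch, or ValueError/KeyError when _raises=True).

```python
from types import SimpleNamespace

_MISSING = object()  # sentinel so that a real None value is not mistaken for "absent"


def same_attribs(obj, _raises=False, **kwargs):
    """Check that every named attribute exists on obj and equals the given value."""
    for name, expected in kwargs.items():
        actual = getattr(obj, name, _MISSING)
        if actual is _MISSING:
            if _raises:
                raise KeyError(f"attribute {name} not found")
            return False
        if actual != expected:
            if _raises:
                raise ValueError(f"{name}: expected {expected!r}, got {actual!r}")
            return False
    return True


# Stand-in object carrying two attributes
sample = SimpleNamespace(version=1, src_port=56324)
matches = same_attribs(sample, version=1, src_port=56324)
missing = same_attribs(sample, version=1, unknown_attrib=None)
```

Here `matches` is True, while `missing` is False because `unknown_attrib` does not exist on the object.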
ProxyTLV API
- class aiosmtpd.proxy_protocol.ProxyTLV
This class parses the TLV portion of the PROXY Header and presents the value in an easy-to-use way: A “TLV Vector” whose “Type” is found in PP2_TYPENAME can be accessed through the .<NAME> attribute.
It is a subclass of dict, so all of dict’s methods are available. It is basically a Dict[str, Any] with additional methods and attributes. The list below only describes methods & attributes added to this class.
- PP2_TYPENAME: Dict[int, str]
A mapping of numeric Type to a human-friendly Name.
The names are identical to the ones listed in the documentation, but with the PP2_TYPE_/PP2_SUBTYPE_ prefixes removed.
Note
The SSL Name is special. Rather than containing the TLV Subvectors as described in the standard, it is a bool value that indicates whether the PP2_SUBTYPE_SSL
- tlv_loc: Dict[str, int]
A mapping to show the start location of certain TLV Vectors.
The keys are the TYPENAME (see PP2_TYPENAME above), and the value is the offset from the start of the TLV Vectors.
- same_attribs(_raises=False, **kwargs) bool
- Parameters:
_raises (bool) – If True, raise an exception if an attribute does not match or is not found, instead of returning a bool. Defaults to False
- Raises:
ValueError – if _raises=True and attribute is found but value is wrong
KeyError – if _raises=True and attribute is not found
A helper method to quickly verify whether an attribute exists and contains the expected value.
Example usage:
assert isinstance(proxy_tlv, ProxyTLV)
proxy_tlv.same_attribs(
    AUTHORITY=b"some_authority",
    SSL=True,
)
In the above example, same_attribs will check that the attributes AUTHORITY and SSL exist and contain the values b"some_authority" and True, respectively.
Missing attributes and/or differing values will return False (unless _raises=True).
Note
For other examples, take a look inside the test_proxyprotocol.py file. That file uses same_attribs extensively.
- classmethod from_raw(raw) ProxyTLV | None
- Parameters:
raw (ByteString) – The raw bytes containing the TLV Vectors
- Returns:
A new instance of ProxyTLV, or None if parsing failed
This triggers the parsing of raw bytes/bytearray into a ProxyTLV instance.
Internally it relies on the parse() classmethod to perform the parsing.
Unlike the default behavior of parse(), from_raw will NOT perform a partial parsing.
- classmethod parse(chunk, partial_ok=True) Dict[str, Any]
- Parameters:
chunk (ByteString) – The bytes to parse into TLV Vectors
partial_ok (bool) – If True, return partially-parsed TLV Vectors as is. If False, (re)raise MalformedTLV
- Returns:
A mapping of typenames and values
This performs a recursive parsing of the bytes. If it encounters a TYPE that ProxyTLV doesn’t recognize, the TLV Vector will be assigned a typename of “xNN”.
Partial parsing is possible when partial_ok=True; if an error happens during parsing, parse will abort and return the TLV Vectors it had successfully decoded.
- classmethod name_to_num(name) int | None
- Parameters:
name (str) – The name to back-map into TYPE numeric
- Returns:
The numeric value associated with the typename, or None if no such mapping is found
This is a helper method to perform back-mapping of typenames.
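The following stdlib-only sketch shows the kind of work the TLV parsing involves: each vector is a 1-octet Type, a 2-octet big-endian Length, and then the Value. It is not aiosmtpd's code. `parse_tlvs`, `TYPE_NAMES`, and the local `MalformedTLV` class are hypothetical stand-ins, the exact spelling of the “xNN” fallback name (two uppercase hex digits) is an assumption, and the SSL Type with its sub-vectors is left out, so nothing here is recursive.

```python
import struct

# Subset of Type numbers from the PROXY Protocol documentation
TYPE_NAMES = {
    0x01: "ALPN", 0x02: "AUTHORITY", 0x03: "CRC32C",
    0x04: "NOOP", 0x05: "UNIQUE_ID", 0x30: "NETNS",
}


class MalformedTLV(ValueError):
    """Local stand-in for the exception raised on broken TLV data."""


def parse_tlvs(chunk, partial_ok=True):
    """Return (values, locations) for the TLV Vectors found in chunk."""
    values = {}
    locations = {}  # typename -> offset from the start of the TLV Vectors
    offset = 0
    try:
        while offset < len(chunk):
            if len(chunk) - offset < 3:
                raise MalformedTLV("truncated TLV header")
            typ, length = struct.unpack_from("!BH", chunk, offset)
            value = bytes(chunk[offset + 3:offset + 3 + length])
            if len(value) != length:
                raise MalformedTLV("truncated TLV value")
            # Unknown Types get a fallback name (formatting is an assumption)
            name = TYPE_NAMES.get(typ, f"x{typ:02X}")
            values[name] = value
            locations[name] = offset
            offset += 3 + length
    except MalformedTLV:
        if not partial_ok:
            raise
        # With partial_ok, keep whatever was decoded before the error
    return values, locations


raw = b"\x02\x00\x03abc" + b"\x03\x00\x04\x01\x02\x03\x04" + b"\xee\x00\x01z"
values, locations = parse_tlvs(raw)
```

The `locations` mapping plays the role that tlv_loc plays in ProxyTLV: it records where each vector starts, which is what a CRC32C check needs later on.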
Note on CRC32C Calculation
Neither the ProxyData nor ProxyTLV classes implement PROXYv2 CRC32C validation;
the main reason being that Python has no built-in module for calculating CRC32C.
To perform CRC32C, third-party modules need to be installed,
but we are uncomfortable doing that for the following reasons:
There is more than one third-party module providing CRC32C, e.g., crcmod, crc32c, google-crc32c, etc. The problem is that there is no known clear comparison between them, so we cannot easily tell which one is ‘best’.
Some of these third-party modules seem to be no longer maintained.
Most of the available third-party modules are binary distributions. This potentially causes problems with existing binaries/libraries, not to mention a possible (albeit unlikely) vector for malware.
We really don’t like adding dependencies outside those that are really needed.
In short, we have strong reasons to NOT implement PROXYv2 CRC32C validation, and we have plans to NEVER implement it.
If you absolutely need PROXYv2 CRC32C validation,
you should perform it yourself in the handle_PROXY() hook.
To assist you, we have provided the whole_raw, tlv_start, and tlv_loc attributes.
You should do the following:
Choose a CRC32C module of your liking, install that, and import it.
Find the “CRC32C” TLV Vector in whole_raw; it starts at byte tlv_start + tlv_loc["CRC32C"]
Zero out the 4-octet Value part of the “CRC32C” TLV Vector
Perform CRC32C calculation over the modified whole_raw
Convert the result to big-endian bytes, and compare with the .CRC32C attribute of the ProxyTLV instance
Example:
# The int(3) at the end skips over the "T" and "L" parts (1 + 2 octets)
offset = proxy_data.tlv_start + proxy_data.tlv.tlv_loc["CRC32C"] + 3
# Since whole_raw is a bytearray, we can do slice replacement;
# the replacement must be bytes, not str
proxy_data.whole_raw[offset:offset + 4] = b"\x00\x00\x00\x00"
# Actual syntax will depend on the module you use
calculated: int = crc32c(proxy_data.whole_raw)
# Adjust first part as necessary if calculated is not int
validated = calculated.to_bytes(4, "big") == proxy_data.tlv.CRC32C
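For experimenting without installing anything, the steps can also be tried with a plain bitwise CRC32C (Castagnoli polynomial, reflected form 0x82F63B78). This is a slow, dependency-free sketch and not one of the modules named above; `crc32c` and `verify_crc32c` are hypothetical helper names, and the sample bytes are made up rather than a real PROXY header.

```python
def crc32c(data) -> int:
    """Bitwise CRC32C (Castagnoli); simple but slow."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def verify_crc32c(whole_raw, value_offset: int) -> bool:
    """Zero the 4-octet CRC32C Value in a copy, recompute, and compare."""
    buf = bytearray(whole_raw)  # work on a copy, leave the original intact
    expected = bytes(buf[value_offset:value_offset + 4])
    buf[value_offset:value_offset + 4] = b"\x00\x00\x00\x00"
    return crc32c(buf).to_bytes(4, "big") == expected


# Made-up bytes ending in a CRC32C TLV (type 0x03, length 4, zeroed value)
sample = bytearray(b"example-header-bytes" + b"\x03\x00\x04" + b"\x00" * 4)
value_offset = len(sample) - 4
# Sender side: compute over the zeroed field, then store big-endian
sample[value_offset:] = crc32c(sample).to_bytes(4, "big")
ok = verify_crc32c(sample, value_offset)
```

In a real handle_PROXY hook, `value_offset` would be the `offset` computed in the example above, and `whole_raw` would come from the ProxyData instance.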
Good luck!