"""RFC 2822 message manipulation.

Note: This is only a very rough sketch of a full RFC-822 parser; in particular
the tokenizing of addresses does not adhere to all the quoting rules.

Note: RFC 2822 is a long awaited update to RFC 822. This module should
conform to RFC 2822, and is thus mis-named (it's not worth renaming it). Some
effort at RFC 2822 updates has been made, but a thorough audit has not been
performed. Consider any RFC 2822 non-conformance to be a bug.

    RFC 2822: http://www.faqs.org/rfcs/rfc2822.html
    RFC 822 : http://www.faqs.org/rfcs/rfc822.html (obsolete)

Directions for use:

To create a Message object: first open a file, e.g.:

  fp = open(file, 'r')

You can use any other legal way of getting an open file object, e.g. use
sys.stdin or call os.popen(). Then pass the open file object to the Message()
constructor:

  m = Message(fp)

This class can work with any input object that supports a readline method. If
the input object has seek and tell capability, the rewindbody method will
work; also illegal lines will be pushed back onto the input stream. If the
input object lacks seek but has an `unread' method that can push back a line
of input, Message will use that to push back illegal lines. Thus this class
can be used to parse messages coming from a buffered stream.

The optional `seekable' argument is provided as a workaround for certain stdio
libraries in which tell() discards buffered data before discovering that the
lseek() system call doesn't work. For maximum portability, you should set the
seekable argument to zero to prevent that initial tell() call when passing in
an unseekable object such as a file object created from a socket object. If
it is 1 on entry -- which it is by default -- the tell() method of the open
file object is called once; if this raises an exception, seekable is reset to
0. For other nonzero values of seekable, this test is not made.

To get the text of a particular header there are several methods:

  str = m.getheader(name)
  str = m.getrawheader(name)

where name is the name of the header, e.g. 'Subject'. The difference is that
getheader() strips the leading and trailing whitespace, while getrawheader()
doesn't. Both functions retain embedded whitespace (including newlines)
exactly as they are specified in the header, and leave the case of the text
unchanged.

For addresses and address lists there are functions

  realname, mailaddress = m.getaddr(name)
  list = m.getaddrlist(name)

where the latter returns a list of (realname, mailaddr) tuples.

There is also a method

  time = m.getdate(name)

which parses a Date-like field and returns a time-compatible tuple,
i.e. a tuple such as returned by time.localtime() or accepted by
time.mktime().

See the class definition for lower level access methods.

There are also some utility functions here.
"""
# Cleanup and extensions by Eric S. Raymond <esr@thyrsus.com>

import time

from warnings import warnpy3k
warnpy3k("in 3.x, rfc822 has been removed in favor of the email package",
         stacklevel=2)

__all__ = ["Message","AddressList","parsedate","parsedate_tz","mktime_tz"]

_blanklines = ('\r\n', '\n')  # Optimization for islast()


class Message:
    """Represents a single RFC 2822-compliant message."""

    def __init__(self, fp, seekable = 1):
        """Initialize the class instance and read the headers."""
        if seekable == 1:
            # Exercise tell() to make sure it works
            # (and then assume seek() works, too)
            try:
                fp.tell()
            except (AttributeError, IOError):
                seekable = 0
        self.fp = fp
        self.seekable = seekable
        self.startofheaders = None
        self.startofbody = None
        #
        if self.seekable:
            try:
                self.startofheaders = self.fp.tell()
            except IOError:
                self.seekable = 0
        #
        self.readheaders()
        #
        if self.seekable:
            try:
                self.startofbody = self.fp.tell()
            except IOError:
                self.seekable = 0

    def rewindbody(self):
        """Rewind the file to the start of the body (if seekable)."""
        if not self.seekable:
            raise IOError("unseekable file")
        self.fp.seek(self.startofbody)

    def readheaders(self):
        """Read header lines.

        Read header lines up to the entirely blank line that terminates them.
        The (normally blank) line that ends the headers is skipped, but not
        included in the returned list. If a non-header line ends the headers
        (which is an error), an attempt is made to backspace over it; it is
        never included in the returned list.

        The variable self.status is set to the empty string if all went well,
        otherwise it is an error message. The variable self.headers is a
        completely uninterpreted list of lines contained in the header (so
        printing them will reproduce the header exactly as it appears in the
        file).
        """
        self.dict = {}
        self.unixfrom = ''
        self.headers = lst = []
        self.status = ''
        headerseen = ""
        firstline = 1
        startofline = unread = tell = None
        if hasattr(self.fp, 'unread'):
            unread = self.fp.unread
        elif self.seekable:
            tell = self.fp.tell
        while 1:
            if tell:
                try:
                    startofline = tell()
                except IOError:
                    startofline = tell = None
                    self.seekable = 0
            line = self.fp.readline()
            if not line:
                self.status = 'EOF in headers'
                break
            # Skip unix From name time lines
            if firstline and line.startswith('From '):
                self.unixfrom = self.unixfrom + line
                continue
            firstline = 0
            if headerseen and line[0] in ' \t':
                # It's a continuation line.
                lst.append(line)
                x = (self.dict[headerseen] + "\n " + line.strip())
                self.dict[headerseen] = x.strip()
                continue
            elif self.iscomment(line):
                # It's a comment. Ignore it.
                continue
            elif self.islast(line):
                # Note! No pushback here! The delimiter line gets eaten.
                break
            headerseen = self.isheader(line)
            if headerseen:
                # It's a legal header line, save it.
                lst.append(line)
                self.dict[headerseen] = line[len(headerseen)+1:].strip()
                continue
            else:
                # It's not a header line; throw it back and stop here.
                if not self.dict:
                    self.status = 'No headers'
                else:
                    self.status = 'Non-header line where header expected'
                # Try to undo the read.
                if unread:
                    unread(line)
                elif tell:
                    self.fp.seek(startofline)
                else:
                    self.status = self.status + '; bad seek'
                break

    def isheader(self, line):
        """Determine whether a given line is a legal header.

        This method should return the header name, suitably canonicalized.
        You may override this method in order to use Message parsing on tagged
        data in RFC 2822-like formats with special header formats.
        """
        i = line.find(':')
        if i > 0:
            return line[:i].lower()
        return None

    def islast(self, line):
        """Determine whether a line is a legal end of RFC 2822 headers.

        You may override this method if your application wants to bend the
        rules, e.g. to strip trailing whitespace, or to recognize MH template
        separators ('--------'). For convenience (e.g. for code reading from
        sockets) a line consisting of '\\r\\n' also matches.
        """
        return line in _blanklines

    def iscomment(self, line):
        """Determine whether a line should be skipped entirely.

        You may override this method in order to use Message parsing on tagged
        data in RFC 2822-like formats that support embedded comments or
        free-text data.
        """
        return False

    def getallmatchingheaders(self, name):
        """Find all header lines matching a given header name.

        Look through the list of headers and find all lines matching a given
        header name (and their continuation lines). A list of the lines is
        returned, without interpretation. If the header does not occur, an
        empty list is returned. If the header occurs multiple times, all
        occurrences are returned. Case is not important in the header name.
        """
        name = name.lower() + ':'
        n = len(name)
        lst = []
        hit = 0
        for line in self.headers:
            if line[:n].lower() == name:
                hit = 1
            elif not line[:1].isspace():
                hit = 0
            if hit:
                lst.append(line)
        return lst

    def getfirstmatchingheader(self, name):
        """Get the first header line matching name.

        This is similar to getallmatchingheaders, but it returns only the
        first matching header (and its continuation lines).
        """
        name = name.lower() + ':'
        n = len(name)
        lst = []
        hit = 0
        for line in self.headers:
            if hit:
                if not line[:1].isspace():
                    break
            elif line[:n].lower() == name:
                hit = 1
            if hit:
                lst.append(line)
        return lst

    def getrawheader(self, name):
        """A higher-level interface to getfirstmatchingheader().

        Return a string containing the literal text of the header but with the
        keyword stripped. All leading, trailing and embedded whitespace is
        kept in the string, however. Return None if the header does not
        occur.
        """

        lst = self.getfirstmatchingheader(name)
        if not lst:
            return None
        lst[0] = lst[0][len(name) + 1:]
        return ''.join(lst)

    def getheader(self, name, default=None):
        """Get the header value for a name.

        This is the normal interface: it returns a stripped version of the
        header value for a given header name, or None if it doesn't exist.
        This uses the dictionary version which finds the *last* such header.
        """
        return self.dict.get(name.lower(), default)
    get = getheader

    def getheaders(self, name):
        """Get all values for a header.

        This returns a list of values for headers given more than once; each
        value in the result list is stripped in the same way as the result of
        getheader(). If the header is not given, return an empty list.
        """
        result = []
        current = ''
        have_header = 0
        for s in self.getallmatchingheaders(name):
            if s[0].isspace():
                if current:
                    current = "%s\n %s" % (current, s.strip())
                else:
                    current = s.strip()
            else:
                if have_header:
                    result.append(current)
                current = s[s.find(":") + 1:].strip()
                have_header = 1
        if have_header:
            result.append(current)
        return result

    def getaddr(self, name):
        """Get a single address from a header, as a tuple.

        An example return value:
        ('Guido van Rossum', 'guido@cwi.nl')
        """
        # New, by Ben Escoto
        alist = self.getaddrlist(name)
        if alist:
            return alist[0]
        else:
            return (None, None)

    def getaddrlist(self, name):
        """Get a list of addresses from a header.

        Retrieves a list of addresses from a header, where each address is a
        tuple as returned by getaddr(). Scans all named headers, so it works
        properly with multiple To: or Cc: headers for example.
        """
        raw = []
        for h in self.getallmatchingheaders(name):
            if h[0] in ' \t':
                raw.append(h)
            else:
                if raw:
                    raw.append(', ')
                i = h.find(':')
                if i > 0:
                    addr = h[i+1:]
                raw.append(addr)
        alladdrs = ''.join(raw)
        a = AddressList(alladdrs)
        return a.addresslist

    def getdate(self, name):
        """Retrieve a date field from a header.

        Retrieves a date field from the named header, returning a tuple
        compatible with time.mktime().
        """
        try:
            data = self[name]
        except KeyError:
            return None
        return parsedate(data)

    def getdate_tz(self, name):
        """Retrieve a date field from a header as a 10-tuple.

        The first 9 elements make up a tuple compatible with time.mktime(),
        and the 10th is the offset of the poster's time zone from GMT/UTC.
        """
        try:
            data = self[name]
        except KeyError:
            return None
        return parsedate_tz(data)


    # Access as a dictionary (only finds *last* header of each type):

    def __len__(self):
        """Get the number of headers in a message."""
        return len(self.dict)

    def __getitem__(self, name):
        """Get a specific header, as from a dictionary."""
        return self.dict[name.lower()]

    def __setitem__(self, name, value):
        """Set the value of a header.

        Note: This is not a perfect inversion of __getitem__, because any
        changed headers get stuck at the end of the raw-headers list rather
        than where the altered header was.
        """
        del self[name] # Won't fail if it doesn't exist
        self.dict[name.lower()] = value
        text = name + ": " + value
        for line in text.split("\n"):
            self.headers.append(line + "\n")

    def __delitem__(self, name):
        """Delete all occurrences of a specific header, if it is present."""
        name = name.lower()
        if name not in self.dict:
            return
        del self.dict[name]
        name = name + ':'
        n = len(name)
        lst = []
        hit = 0
        for i in range(len(self.headers)):
            line = self.headers[i]
            if line[:n].lower() == name:
                hit = 1
            elif not line[:1].isspace():
                hit = 0
            if hit:
                lst.append(i)
        for i in reversed(lst):
            del self.headers[i]

    def setdefault(self, name, default=""):
        lowername = name.lower()
        if lowername in self.dict:
            return self.dict[lowername]
        else:
            text = name + ": " + default
            for line in text.split("\n"):
                self.headers.append(line + "\n")
            self.dict[lowername] = default
            return default

    def has_key(self, name):
        """Determine whether a message contains the named header."""
        return name.lower() in self.dict

    def __contains__(self, name):
        """Determine whether a message contains the named header."""
        return name.lower() in self.dict

    def __iter__(self):
        return iter(self.dict)

    def keys(self):
        """Get all of a message's header field names."""
        return self.dict.keys()

    def values(self):
        """Get all of a message's header field values."""
        return self.dict.values()

    def items(self):
        """Get all of a message's headers.

        Returns a list of name, value tuples.
        """
        return self.dict.items()

    def __str__(self):
        return ''.join(self.headers)


# Utility functions
# -----------------

# XXX Should fix unquote() and quote() to be really conformant.
# XXX The inverses of the parse functions may also be useful.


def unquote(s):
    """Remove quotes from a string."""
    if len(s) > 1:
        if s.startswith('"') and s.endswith('"'):
            return s[1:-1].replace('\\\\', '\\').replace('\\"', '"')
        if s.startswith('<') and s.endswith('>'):
            return s[1:-1]
    return s


def quote(s):
    """Escape backslashes and double quotes in a string.

    Note: no surrounding quotes are added.
    """
    return s.replace('\\', '\\\\').replace('"', '\\"')


def parseaddr(address):
    """Parse an address into a (realname, mailaddr) tuple."""
    a = AddressList(address)
    lst = a.addresslist
    if not lst:
        return (None, None)
    return lst[0]


class AddrlistClass:
    """Address parser class by Ben Escoto.

    To understand what this class does, it helps to have a copy of
    RFC 2822 in front of you.

    http://www.faqs.org/rfcs/rfc2822.html

    Note: this class interface is deprecated and may be removed in the future.
    Use rfc822.AddressList instead.
    """

    def __init__(self, field):
        """Initialize a new instance.

        `field' is an unparsed address header field, containing one or more
        addresses.
        """
        self.specials = '()<>@,:;.\"[]'
        self.pos = 0
        self.LWS = ' \t'
        self.CR = '\r\n'
        self.atomends = self.specials + self.LWS + self.CR
        # Note that RFC 2822 now specifies `.' as obs-phrase, meaning that it
        # is obsolete syntax. RFC 2822 requires that we recognize obsolete
        # syntax, so allow dots in phrases.
        self.phraseends = self.atomends.replace('.', '')
        self.field = field
        self.commentlist = []

    def gotonext(self):
        """Parse up to the start of the next address."""
        while self.pos < len(self.field):
            if self.field[self.pos] in self.LWS + '\n\r':
                self.pos = self.pos + 1
            elif self.field[self.pos] == '(':
                self.commentlist.append(self.getcomment())
            else: break

    def getaddrlist(self):
        """Parse all addresses.

        Returns a list containing all of the addresses.
        """
        result = []
        ad = self.getaddress()
        while ad:
            result += ad
            ad = self.getaddress()
        return result

    def getaddress(self):
        """Parse the next address."""
        self.commentlist = []
        self.gotonext()

        oldpos = self.pos
        oldcl = self.commentlist
        plist = self.getphraselist()

        self.gotonext()
        returnlist = []

        if self.pos >= len(self.field):
            # Bad email address technically, no domain.
            if plist:
                returnlist = [(' '.join(self.commentlist), plist[0])]

        elif self.field[self.pos] in '.@':
            # email address is just an addrspec
            # this isn't very efficient since we start over
            self.pos = oldpos
            self.commentlist = oldcl
            addrspec = self.getaddrspec()
            returnlist = [(' '.join(self.commentlist), addrspec)]

        elif self.field[self.pos] == ':':
            # address is a group
            returnlist = []

            fieldlen = len(self.field)
            self.pos += 1
            while self.pos < len(self.field):
                self.gotonext()
                if self.pos < fieldlen and self.field[self.pos] == ';':
                    self.pos += 1
                    break
                returnlist = returnlist + self.getaddress()

        elif self.field[self.pos] == '<':
            # Address is a phrase then a route addr
            routeaddr = self.getrouteaddr()

            if self.commentlist:
                returnlist = [(' '.join(plist) + ' (' + \
                         ' '.join(self.commentlist) + ')', routeaddr)]
            else: returnlist = [(' '.join(plist), routeaddr)]

        else:
            if plist:
                returnlist = [(' '.join(self.commentlist), plist[0])]
            elif self.field[self.pos] in self.specials:
                self.pos += 1

        self.gotonext()
        if self.pos < len(self.field) and self.field[self.pos] == ',':
            self.pos += 1
        return returnlist

    def getrouteaddr(self):
        """Parse a route address (Return-path value).

        This method just skips all the route stuff and returns the addrspec.
        """
        if self.field[self.pos] != '<':
            return

        expectroute = 0
        self.pos += 1
        self.gotonext()
        adlist = ""
        while self.pos < len(self.field):
            if expectroute:
                self.getdomain()
                expectroute = 0
            elif self.field[self.pos] == '>':
                self.pos += 1
                break
            elif self.field[self.pos] == '@':
                self.pos += 1
                expectroute = 1
            elif self.field[self.pos] == ':':
                self.pos += 1
            else:
                adlist = self.getaddrspec()
                self.pos += 1
                break
            self.gotonext()

        return adlist

    def getaddrspec(self):
        """Parse an RFC 2822 addr-spec."""
        aslist = []

        self.gotonext()
        while self.pos < len(self.field):
            if self.field[self.pos] == '.':
                aslist.append('.')
                self.pos += 1
            elif self.field[self.pos] == '"':
                aslist.append('"%s"' % self.getquote())
            elif self.field[self.pos] in self.atomends:
                break
            else: aslist.append(self.getatom())
            self.gotonext()

        if self.pos >= len(self.field) or self.field[self.pos] != '@':
            return ''.join(aslist)

        aslist.append('@')
        self.pos += 1
        self.gotonext()
        return ''.join(aslist) + self.getdomain()

    def getdomain(self):
        """Get the complete domain name from an address."""
        sdlist = []
        while self.pos < len(self.field):
            if self.field[self.pos] in self.LWS:
                self.pos += 1
            elif self.field[self.pos] == '(':
                self.commentlist.append(self.getcomment())
            elif self.field[self.pos] == '[':
                sdlist.append(self.getdomainliteral())
            elif self.field[self.pos] == '.':
                self.pos += 1
                sdlist.append('.')
            elif self.field[self.pos] in self.atomends:
                break
            else: sdlist.append(self.getatom())
        return ''.join(sdlist)

    def getdelimited(self, beginchar, endchars, allowcomments = 1):
        """Parse a header fragment delimited by special characters.

        `beginchar' is the start character for the fragment. If self is not
        looking at an instance of `beginchar' then getdelimited returns the
        empty string.

        `endchars' is a sequence of allowable end-delimiting characters.
        Parsing stops when one of these is encountered.

        If `allowcomments' is non-zero, embedded RFC 2822 comments are allowed
        within the parsed fragment.
        """
        if self.field[self.pos] != beginchar:
            return ''

        slist = ['']
        quote = 0
        self.pos += 1
        while self.pos < len(self.field):
            if quote == 1:
                slist.append(self.field[self.pos])
                quote = 0
            elif self.field[self.pos] in endchars:
                self.pos += 1
                break
            elif allowcomments and self.field[self.pos] == '(':
                slist.append(self.getcomment())
                continue  # have already advanced pos from getcomment
            elif self.field[self.pos] == '\\':
                quote = 1
            else:
                slist.append(self.field[self.pos])
            self.pos += 1

        return ''.join(slist)

    def getquote(self):
        """Get a quote-delimited fragment from self's field."""
        return self.getdelimited('"', '"\r', 0)

    def getcomment(self):
        """Get a parenthesis-delimited fragment from self's field."""
        return self.getdelimited('(', ')\r', 1)

    def getdomainliteral(self):
        """Parse an RFC 2822 domain-literal."""
        return '[%s]' % self.getdelimited('[', ']\r', 0)

    def getatom(self, atomends=None):
        """Parse an RFC 2822 atom.

        Optional atomends specifies a different set of end token delimiters
        (the default is to use self.atomends). This is used e.g. in
        getphraselist() since phrase endings must not include the `.' (which
        is legal in phrases)."""
        atomlist = ['']
        if atomends is None:
            atomends = self.atomends

        while self.pos < len(self.field):
            if self.field[self.pos] in atomends:
                break
            else: atomlist.append(self.field[self.pos])
            self.pos += 1

        return ''.join(atomlist)

    def getphraselist(self):
        """Parse a sequence of RFC 2822 phrases.

        A phrase is a sequence of words, which are in turn either RFC 2822
        atoms or quoted-strings. Phrases are canonicalized by squeezing all
        runs of continuous whitespace into one space.
        """
        plist = []

        while self.pos < len(self.field):
            if self.field[self.pos] in self.LWS:
                self.pos += 1
            elif self.field[self.pos] == '"':
                plist.append(self.getquote())
            elif self.field[self.pos] == '(':
                self.commentlist.append(self.getcomment())
            elif self.field[self.pos] in self.phraseends:
                break
            else:
                plist.append(self.getatom(self.phraseends))

        return plist

class AddressList(AddrlistClass):
    """An AddressList encapsulates a list of parsed RFC 2822 addresses."""
    def __init__(self, field):
        AddrlistClass.__init__(self, field)
        if field:
            self.addresslist = self.getaddrlist()
        else:
            self.addresslist = []

    def __len__(self):
        return len(self.addresslist)

    def __str__(self):
        return ", ".join(map(dump_address_pair, self.addresslist))

    def __add__(self, other):
        # Set union
        newaddr = AddressList(None)
        newaddr.addresslist = self.addresslist[:]
        for x in other.addresslist:
            if not x in self.addresslist:
                newaddr.addresslist.append(x)
        return newaddr

    def __iadd__(self, other):
        # Set union, in-place
        for x in other.addresslist:
            if not x in self.addresslist:
                self.addresslist.append(x)
        return self

    def __sub__(self, other):
        # Set difference
        newaddr = AddressList(None)
        for x in self.addresslist:
            if not x in other.addresslist:
                newaddr.addresslist.append(x)
        return newaddr

    def __isub__(self, other):
        # Set difference, in-place
        for x in other.addresslist:
            if x in self.addresslist:
                self.addresslist.remove(x)
        return self

    def __getitem__(self, index):
        # Make indexing, slices, and 'in' work
        return self.addresslist[index]

def dump_address_pair(pair):
    """Dump a (name, address) pair in a canonicalized form."""
    if pair[0]:
        return '"' + pair[0] + '" <' + pair[1] + '>'
    else:
        return pair[1]

# Parse a date field

_monthnames = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul',
               'aug', 'sep', 'oct', 'nov', 'dec',
               'january', 'february', 'march', 'april', 'may', 'june', 'july',
               'august', 'september', 'october', 'november', 'december']
_daynames = ['mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun']

# The timezone table does not include the military time zones defined
# in RFC822, other than Z. According to RFC1123, the description in
# RFC822 gets the signs wrong, so we can't rely on any such time
# zones. RFC1123 recommends that numeric timezone indicators be used
# instead of timezone names.

_timezones = {'UT':0, 'UTC':0, 'GMT':0, 'Z':0,
              'AST': -400, 'ADT': -300,  # Atlantic (used in Canada)
              'EST': -500, 'EDT': -400,  # Eastern
              'CST': -600, 'CDT': -500,  # Central
              'MST': -700, 'MDT': -600,  # Mountain
              'PST': -800, 'PDT': -700   # Pacific
              }


def parsedate_tz(data):
    """Convert a date string to a time tuple.

    Accounts for military timezones.
    """
    if not data:
        return None
    data = data.split()
    if data[0][-1] in (',', '.') or data[0].lower() in _daynames:
        # There's a dayname here. Skip it
        del data[0]
    else:
        # no space after the "weekday,"?
        i = data[0].rfind(',')
        if i >= 0:
            data[0] = data[0][i+1:]
    if len(data) == 3:  # RFC 850 date, deprecated
        stuff = data[0].split('-')
        if len(stuff) == 3:
            data = stuff + data[1:]
    if len(data) == 4:
        s = data[3]
        i = s.find('+')
        if i > 0:
            data[3:] = [s[:i], s[i+1:]]
        else:
            data.append('')  # Dummy tz
    if len(data) < 5:
        return None
    data = data[:5]
    [dd, mm, yy, tm, tz] = data
    mm = mm.lower()
    if mm not in _monthnames:
        dd, mm = mm, dd.lower()
        if mm not in _monthnames:
            return None
    mm = _monthnames.index(mm)+1
    if mm > 12: mm = mm - 12
    if dd[-1] == ',':
        dd = dd[:-1]
    i = yy.find(':')
    if i > 0:
        yy, tm = tm, yy
    if yy[-1] == ',':
        yy = yy[:-1]
    if not yy[0].isdigit():
        yy, tz = tz, yy
    if tm[-1] == ',':
        tm = tm[:-1]
    tm = tm.split(':')
    if len(tm) == 2:
        [thh, tmm] = tm
        tss = '0'
    elif len(tm) == 3:
        [thh, tmm, tss] = tm
    else:
        return None
    try:
        yy = int(yy)
        dd = int(dd)
        thh = int(thh)
        tmm = int(tmm)
        tss = int(tss)
    except ValueError:
        return None
    tzoffset = None
    tz = tz.upper()
    if tz in _timezones:
        tzoffset = _timezones[tz]
    else:
        try:
            tzoffset = int(tz)
        except ValueError:
            pass
    # Convert a timezone offset into seconds ; -0500 -> -18000
    if tzoffset:
        if tzoffset < 0:
            tzsign = -1
            tzoffset = -tzoffset
        else:
            tzsign = 1
        tzoffset = tzsign * ( (tzoffset//100)*3600 + (tzoffset % 100)*60)
    return (yy, mm, dd, thh, tmm, tss, 0, 1, 0, tzoffset)


def parsedate(data):
    """Convert a time string to a time tuple."""
    t = parsedate_tz(data)
    if t is None:
        return t
    return t[:9]


def mktime_tz(data):
    """Turn a 10-tuple as returned by parsedate_tz() into a UTC timestamp."""
    if data[9] is None:
        # No zone info, so localtime is better assumption than GMT
        return time.mktime(data[:8] + (-1,))
    else:
        t = time.mktime(data[:8] + (0,))
        return t - data[9] - time.timezone

def formatdate(timeval=None):
    """Returns time format preferred for Internet standards.

    Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123

    According to RFC 1123, day and month names must always be in
    English. If not for that, this code could use strftime(). It
    can't because strftime() honors the locale and could generate
    non-English names.
    """
    if timeval is None:
        timeval = time.time()
    timeval = time.gmtime(timeval)
    return "%s, %02d %s %04d %02d:%02d:%02d GMT" % (
            ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")[timeval[6]],
            timeval[2],
            ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
             "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")[timeval[1]-1],
            timeval[0], timeval[3], timeval[4], timeval[5])


# When used as script, run a small test program.
# The first command line argument must be a filename containing one
# message in RFC-822 format.

if __name__ == '__main__':
    import sys, os
    file = os.path.join(os.environ['HOME'], 'Mail/inbox/1')
    if sys.argv[1:]: file = sys.argv[1]
    f = open(file, 'r')
    m = Message(f)
    print 'From:', m.getaddr('from')
    print 'To:', m.getaddrlist('to')
    print 'Subject:', m.getheader('subject')
    print 'Date:', m.getheader('date')
    date = m.getdate_tz('date')
    # getdate_tz() returns None when the Date header is missing or
    # unparsable, so check before indexing into the tuple.
    if date:
        tz = date[-1]
        date = time.localtime(mktime_tz(date))
        print 'ParsedDate:', time.asctime(date),
        # The zone offset is None when the header carried no usable zone.
        if tz is not None:
            hhmmss = tz
            hhmm, ss = divmod(hhmmss, 60)
            hh, mm = divmod(hhmm, 60)
            print "%+03d%02d" % (hh, mm),
            if ss: print ".%02d" % ss,
        print
    else:
        print 'ParsedDate:', None
    m.rewindbody()
    n = 0
    while f.readline():
        n += 1
    print 'Lines:', n
    print '-'*70
    print 'len =', len(m)
    if 'Date' in m: print 'Date =', m['Date']
    if 'X-Nonsense' in m: pass
    print 'keys =', m.keys()
    print 'values =', m.values()
    print 'items =', m.items()