     =============================================================
     hmac -- Cryptographic signature and verification of messages.
     =============================================================
     
     .. module:: hmac
         :synopsis: Cryptographic signature and verification of messages.
     
     :Purpose: 
         The hmac module implements keyed-hashing for message authentication, as
         described in :rfc:`2104`.
     :Python Version: 2.2
     
     The HMAC algorithm can be used to verify the integrity of information
     passed between applications or stored in a potentially vulnerable
     location. The basic idea is to generate a cryptographic hash of the
     actual data combined with a shared secret key. The resulting hash can
     then be used to check the transmitted or stored message to determine a
     level of trust, without transmitting the secret key.
     
     Disclaimer: I'm not a security expert. For the full details on HMAC,
     check out :rfc:`2104`.
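     
     To make the idea of hashing the data together with a shared secret
     key more concrete, here is a minimal sketch that builds HMAC-SHA1 by
     hand, following the construction described in :rfc:`2104`, and
     compares the result with the :mod:`hmac` module. It is illustrative
     only and assumes Python 3. The function name ``manual_hmac_sha1`` and
     the sample key and message are made up for this sketch.
     
     ::
     
         import hashlib
         import hmac
     
     
         def manual_hmac_sha1(key, message):
             "Compute H((K ^ opad) + H((K ^ ipad) + message)) with SHA-1."
             block_size = 64  # SHA-1 processes input in 64-byte blocks
             if len(key) > block_size:
                 # Keys longer than one block are hashed first
                 key = hashlib.sha1(key).digest()
             # Pad the key with zero bytes up to the block size
             key = key.ljust(block_size, b'\x00')
             inner_pad = bytes(b ^ 0x36 for b in key)
             outer_pad = bytes(b ^ 0x5c for b in key)
             inner = hashlib.sha1(inner_pad + message).digest()
             return hashlib.sha1(outer_pad + inner).hexdigest()
     
     
         # Hypothetical key and message, for illustration only
         key = b'secret-shared-key-goes-here'
         message = b'The message to authenticate'
     
         by_hand = manual_hmac_sha1(key, message)
         from_module = hmac.new(key, message, hashlib.sha1).hexdigest()
         print(by_hand == from_module)
     
     In practice there is no reason to write this by hand. The point is
     only that the key is mixed into both an inner and an outer hash, so
     the digest cannot be reproduced without knowing the key.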
     
     Example
     =======
     
     Creating the hash is not complex. Here's a simple example which uses
     the default MD5 hash algorithm:
     
     .. include:: hmac_simple.py
         :literal:
         :start-after: #end_pymotw_header
     
     When run, the code reads its source file and computes an HMAC
     signature for it:
     
     .. {{{cog
     .. cog.out(run_script(cog.inFile, 'hmac_simple.py'))
     .. }}}
     
     ::
     
     	$ python hmac_simple.py
     
     	4bcb287e284f8c21e87e14ba2dc40b16
     
     .. {{{end}}}
     
     .. note::
     
        If I haven't changed the file by the time I release the example
        source for this week, the copy you download should produce the same
        hash.
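     
     The included script is not reproduced here, so the following is only
     a sketch of the same pattern: read a stream in blocks and feed each
     block to ``update()``. It assumes Python 3 and uses an in-memory
     ``io.BytesIO`` buffer in place of a real file. The helper name
     ``sign_stream``, the block size, and the sample data are hypothetical.
     ``hashlib.md5`` is passed explicitly because Python 3.8 and later
     require the ``digestmod`` argument.
     
     ::
     
         import hashlib
         import hmac
         import io
     
         key = b'secret-shared-key-goes-here'
         data = b'line of sample data\n' * 1000
     
     
         def sign_stream(stream, block_size=1024):
             "Return the hex HMAC-MD5 of everything readable from stream."
             digest_maker = hmac.new(key, digestmod=hashlib.md5)
             while True:
                 block = stream.read(block_size)
                 if not block:
                     break
                 digest_maker.update(block)
             return digest_maker.hexdigest()
     
     
         streamed = sign_stream(io.BytesIO(data))
         one_shot = hmac.new(key, data, hashlib.md5).hexdigest()
         # Feeding the data in blocks gives the same digest as one call
         print(streamed == one_shot)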
     
     SHA vs. MD5
     ===========
     
     Although the default cryptographic algorithm for :mod:`hmac` is MD5,
     that is not the most secure method to use. MD5 hashes have some
     weaknesses, such as collisions (where two different messages produce
     the same hash). The SHA-1 algorithm is considered to be stronger, and
     should be used instead.
     
     .. include:: hmac_sha.py
         :literal:
         :start-after: #end_pymotw_header
     
     ``hmac.new()`` takes three arguments. The first is the secret key, which
     must be shared between the two communicating endpoints so both ends
     can use the same value. The second is an initial message. If the
     message content that needs to be authenticated is small, such as a
     timestamp or HTTP POST, the entire body of the message can be passed
     to ``new()`` instead of using the ``update()`` method. The last
     argument is the digest module to be used. The default is
     ``hashlib.md5``. The previous example substitutes ``hashlib.sha1``.
     
     .. {{{cog
     .. cog.out(run_script(cog.inFile, 'hmac_sha.py'))
     .. }}}
     
     ::
     
     	$ python hmac_sha.py
     
     	69b26d1731a0a5f0fc7a92fc6c540823ec210759
     
     .. {{{end}}}
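     
     As a rough illustration of the three arguments to ``hmac.new()``
     described above, the sketch below signs the same small message with
     two digest modules and then with a different key. It assumes Python
     3, and the keys and message are placeholders invented for the
     example.
     
     ::
     
         import hashlib
         import hmac
     
         key = b'secret-shared-key-goes-here'
         message = b'timestamp=1234567890'
     
         # Small messages can be passed directly to new()
         md5_sig = hmac.new(key, message, hashlib.md5)
         sha1_sig = hmac.new(key, message, hashlib.sha1)
     
         # The digest module determines the size of the signature
         print(md5_sig.digest_size, len(md5_sig.hexdigest()))    # 16 32
         print(sha1_sig.digest_size, len(sha1_sig.hexdigest()))  # 20 40
     
         # A different key produces a different signature for the same message
         other_sig = hmac.new(b'a-different-key', message, hashlib.sha1)
         print(other_sig.hexdigest() != sha1_sig.hexdigest())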
     
     Binary Digests
     ==============
     
     The first few examples used the ``hexdigest()`` method to produce
     printable digests. The hexdigest is a different representation of
     the value calculated by the ``digest()`` method, which is a binary
     value that may include unprintable or non-ASCII characters, including
     NULs. Some web services (Google Checkout, Amazon S3) use the
     ``base64`` encoded version of the binary digest instead of the
     hexdigest.
     
     .. include:: hmac_base64.py
         :literal:
         :start-after: #end_pymotw_header
     
     The base64 encoded string ends in a newline, which frequently needs to be
     stripped off when embedding the string in HTTP headers or other
     formatting-sensitive contexts.
     
     .. {{{cog
     .. cog.out(run_script(cog.inFile, 'hmac_base64.py'))
     .. }}}
     
     ::
     
     	$ python hmac_base64.py
     
     	olW2DoXHGJEKGU0aE9fOwSVE/o4=
     	
     
     .. {{{end}}}
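     
     The trailing newline is easy to see in a small experiment. This
     sketch assumes Python 3, where ``base64.encodebytes()`` appends a
     newline and ``base64.b64encode()`` does not. The key and message are
     placeholders, and the included script above may use a different
     encoding call.
     
     ::
     
         import base64
         import hashlib
         import hmac
     
         key = b'secret-shared-key-goes-here'
         message = b'body of the message'
     
         # digest() returns raw bytes rather than a hex string
         digest = hmac.new(key, message, hashlib.sha1).digest()
     
         with_newline = base64.encodebytes(digest)
         no_newline = base64.b64encode(digest)
     
         print(with_newline.endswith(b'\n'))
         print(with_newline.strip() == no_newline)
     
         # A text value suitable for embedding in an HTTP header
         header_value = no_newline.decode('ascii')
         print(header_value)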
     
     
     Applications
     ============
     
     HMAC authentication should be used for any public network service, and
     any time data is stored where security is important. For example, when
     sending data through a pipe or socket, that data should be signed and
     then the signature should be tested before the data is used. The
     extended example below is available in the ``hmac_pickle.py`` file as
     part of the PyMOTW source package.
     
     First, let's establish a function to calculate a digest for a string,
     and a simple class to be instantiated and passed through a
     communication channel.
     
     ::
     
         import hashlib
         import hmac
         try:
             import cPickle as pickle
         except ImportError:
             import pickle
         import pprint
         from StringIO import StringIO
     
     
         def make_digest(message):
             "Return a digest for the message."
             return hmac.new('secret-shared-key-goes-here', message, hashlib.sha1).hexdigest()
     
     
         class SimpleObject(object):
             "A very simple class to demonstrate checking digests before unpickling."
             def __init__(self, name):
                 self.name = name
             def __str__(self):
                 return self.name
     
     Next, create a :mod:`StringIO` buffer to represent the socket or
     pipe. We will use a naive, but easy to parse, format for the data
     stream. The digest and length of the data are written, followed by a
     new line. The serialized representation of the object, generated by
     :mod:`pickle`, follows. In a real system, we would not want to depend
     on a length value, since if the digest is wrong the length is probably
     wrong as well. Some sort of terminator sequence not likely to appear
     in the real data would be more appropriate.
     
     For this example, we will write two objects to the stream. The first is
     written using the correct digest value. 
     
     ::
     
         # Simulate a writable socket or pipe with StringIO
         out_s = StringIO()
     
         # Write a valid object to the stream:
         #  digest length\npickle
         o = SimpleObject('digest matches')
         pickled_data = pickle.dumps(o)
         digest = make_digest(pickled_data)
         header = '%s %s' % (digest, len(pickled_data))
         print '\nWRITING:', header
         out_s.write(header + '\n')
         out_s.write(pickled_data)
     
     The second object is written to the stream with an invalid digest, produced by
     calculating the digest for some other data instead of the pickle.
     
     ::
     
         # Write an invalid object to the stream
         o = SimpleObject('digest does not match')
         pickled_data = pickle.dumps(o)
         digest = make_digest('not the pickled data at all')
         header = '%s %s' % (digest, len(pickled_data))
         print '\nWRITING:', header
         out_s.write(header + '\n')
         out_s.write(pickled_data)
     
         out_s.flush()
     
     
     Now that the data is in the :mod:`StringIO` buffer, we can read it
     back out again.  The first step is to read the line of data with the
     digest and data length.  Then the remaining data is read (using the
     length value). We could use ``pickle.load()`` to read directly from
     the stream, but that assumes a trusted data stream and we do not yet
     trust the data enough to unpickle it. Reading the pickle as a string
     collects the data from the stream, without actually unpickling the
     object.
     
     ::
     
         # Simulate a readable socket or pipe with StringIO
         in_s = StringIO(out_s.getvalue())
     
         # Read the data
         while True:
             first_line = in_s.readline()
             if not first_line:
                 break
             incoming_digest, incoming_length = first_line.split(' ')
             incoming_length = int(incoming_length)
             print '\nREAD:', incoming_digest, incoming_length
             incoming_pickled_data = in_s.read(incoming_length)
     
     Once we have the pickled data, we can recalculate the digest value and
     compare it against what we read. If the digests match, we know it is
     safe to trust the data and unpickle it.
     
     ::
     
             actual_digest = make_digest(incoming_pickled_data)
             print 'ACTUAL:', actual_digest
     
             if incoming_digest != actual_digest:
                 print 'WARNING: Data corruption'
             else:
                 obj = pickle.loads(incoming_pickled_data)
                 print 'OK:', obj
     
     The output shows that the first object is verified and the second is deemed
     "corrupted", as expected:
     
     .. {{{cog
     .. cog.out(run_script(cog.inFile, 'hmac_pickle.py'))
     .. }}}
     
     ::
     
     	$ python hmac_pickle.py
     
     	
     	WRITING: 387632cfa3d18cd19bdfe72b61ac395dfcdc87c9 124
     	
     	WRITING: b01b209e28d7e053408ebe23b90fe5c33bc6a0ec 131
     	
     	READ: 387632cfa3d18cd19bdfe72b61ac395dfcdc87c9 124
     	ACTUAL: 387632cfa3d18cd19bdfe72b61ac395dfcdc87c9
     	OK: digest matches
     	
     	READ: b01b209e28d7e053408ebe23b90fe5c33bc6a0ec 131
     	ACTUAL: dec53ca1ad3f4b657dd81d514f17f735628b6828
     	WARNING: Data corruption
     
     .. {{{end}}}
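     
     The example above compares the two digests with ``!=``. The
     :mod:`hmac` module also provides ``compare_digest()`` (Python 2.7.7
     and later, and Python 3.3 and later), which is designed so the
     comparison does not leak timing information. The following is a
     compact sketch of the same write, read, and verify cycle using it.
     It assumes Python 3, pickles plain strings instead of
     ``SimpleObject`` to stay short, and introduces the hypothetical
     helpers ``write_record`` and ``read_records``.
     
     ::
     
         import hashlib
         import hmac
         import io
         import pickle
     
         KEY = b'secret-shared-key-goes-here'
     
     
         def make_digest(message):
             "Return a hex digest for the message."
             return hmac.new(KEY, message, hashlib.sha1).hexdigest()
     
     
         def write_record(stream, payload, digest):
             "Write 'digest length' on one line, then the payload bytes."
             header = '%s %d\n' % (digest, len(payload))
             stream.write(header.encode('ascii'))
             stream.write(payload)
     
     
         def read_records(stream):
             "Yield (is_trusted, payload) for each record in the stream."
             while True:
                 first_line = stream.readline()
                 if not first_line:
                     break
                 incoming_digest, incoming_length = first_line.decode('ascii').split()
                 payload = stream.read(int(incoming_length))
                 actual_digest = make_digest(payload)
                 yield hmac.compare_digest(incoming_digest, actual_digest), payload
     
     
         out_s = io.BytesIO()
         good = pickle.dumps('digest matches')
         bad = pickle.dumps('digest does not match')
         write_record(out_s, good, make_digest(good))
         write_record(out_s, bad, make_digest(b'not the pickled data at all'))
     
         in_s = io.BytesIO(out_s.getvalue())
         for trusted, payload in read_records(in_s):
             if trusted:
                 # Only unpickle data whose digest has been verified
                 print('OK:', pickle.loads(payload))
             else:
                 print('WARNING: Data corruption')
     
     As in the original example, the payload is only unpickled after its
     digest has been checked.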
     
     
     .. seealso::
     
         `hmac <http://docs.python.org/library/hmac.html>`_
             The standard library documentation for this module.
         
         :rfc:`2104`
             HMAC: Keyed-Hashing for Message Authentication
     
         :mod:`hashlib`
             The :mod:`hashlib` module.
     
         :mod:`pickle`
             Serialization library.
     
         `Wikipedia: MD5 <http://en.wikipedia.org/wiki/MD5>`_
             Description of the MD5 hashing algorithm.
     
         `Authenticating to Amazon S3 Web Service <http://docs.amazonwebservices.com/AmazonS3/2006-03-01/index.html?S3_Authentication.html>`_
             Instructions for authenticating to S3 using HMAC-SHA1 signed credentials.
     
