<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Data Persistence and Exchange — Python Module of the Week</title> <link rel="stylesheet" href="../_static/sphinxdoc.css" type="text/css" /> <link rel="stylesheet" href="../_static/pygments.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: '../', VERSION: '1.132', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="../_static/jquery.js"></script> <script type="text/javascript" src="../_static/underscore.js"></script> <script type="text/javascript" src="../_static/doctools.js"></script> <link rel="author" title="About these documents" href="../about.html" /> <link rel="top" title="Python Module of the Week" href="../index.html" /> <link rel="up" title="Features of the Standard Library" href="index.html" /> <link rel="next" title="In-Memory Data Structures" href="data_structures.html" /> <link rel="prev" title="Features of the Standard Library" href="index.html" /> </head> <body> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="../genindex.html" title="General Index" accesskey="I">index</a></li> <li class="right" > <a href="../py-modindex.html" title="Python Module Index" >modules</a> |</li> <li class="right" > <a href="data_structures.html" title="In-Memory Data Structures" accesskey="N">next</a> |</li> <li class="right" > <a href="index.html" title="Features of the Standard Library" accesskey="P">previous</a> |</li> <li><a href="../contents.html">PyMOTW</a> »</li> <li><a href="index.html" accesskey="U">Features of the Standard Library</a> »</li> </ul> </div> <div class="sphinxsidebar"> <div class="sphinxsidebarwrapper"> <h3><a href="../contents.html">Table Of Contents</a></h3> <ul> <li><a class="reference internal" href="#">Data Persistence and Exchange</a><ul> <li><a class="reference internal" href="#serializing-objects">Serializing Objects</a></li> <li><a class="reference internal" href="#storing-serialized-objects">Storing Serialized Objects</a></li> <li><a class="reference internal" href="#relational-databases">Relational Databases</a></li> <li><a class="reference internal" href="#data-exchange-through-standard-formats">Data Exchange Through Standard Formats</a></li> </ul> </li> </ul> <h4>Previous topic</h4> <p class="topless"><a href="index.html" title="previous chapter">Features of the Standard Library</a></p> <h4>Next topic</h4> <p class="topless"><a href="data_structures.html" title="next chapter">In-Memory Data Structures</a></p> <h3>This Page</h3> <ul class="this-page-menu"> <li><a href="../_sources/articles/data_persistence.txt" rel="nofollow">Show Source</a></li> </ul> <div id="searchbox" style="display: none"> <h3>Quick search</h3> <form class="search" action="../search.html" method="get"> <input type="text" name="q" size="18" /> <input type="submit" value="Go" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> <p class="searchtip" style="font-size: 90%"> Enter search terms or a module, class or function name. </p> </div> <script type="text/javascript">$('#searchbox').show(0);</script> </div> </div> <div class="document"> <div class="documentwrapper"> <div class="bodywrapper"> <div class="body"> <div class="section" id="data-persistence-and-exchange"> <span id="article-data-persistence"></span><h1>Data Persistence and Exchange<a class="headerlink" href="#data-persistence-and-exchange" title="Permalink to this headline">¶</a></h1> <p>Python provides several modules for storing data. There are basically two aspects to persistence: converting the in-memory object back and forth into a format for saving it, and working with the storage of the converted data.</p> <div class="section" id="serializing-objects"> <h2>Serializing Objects<a class="headerlink" href="#serializing-objects" title="Permalink to this headline">¶</a></h2> <p>Python includes two modules capable of converting objects into a transmittable or storable format (<em>serializing</em>): <a class="reference internal" href="../pickle/index.html#module-pickle" title="pickle: Python object serialization"><tt class="xref py py-mod docutils literal"><span class="pre">pickle</span></tt></a> and <a class="reference internal" href="../json/index.html#module-json" title="json: JavaScript Object Notation Serializer"><tt class="xref py py-mod docutils literal"><span class="pre">json</span></tt></a>. It is most common to use <a class="reference internal" href="../pickle/index.html#module-pickle" title="pickle: Python object serialization"><tt class="xref py py-mod docutils literal"><span class="pre">pickle</span></tt></a>, since there is a fast C implementation and it is integrated with some of the other standard library modules that actually store the serialized data, such as <a class="reference internal" href="../shelve/index.html#module-shelve" title="shelve: Persistent storage of arbitrary Python objects"><tt class="xref py py-mod docutils literal"><span class="pre">shelve</span></tt></a>. Web-based applications may want to examine <a class="reference internal" href="../json/index.html#module-json" title="json: JavaScript Object Notation Serializer"><tt class="xref py py-mod docutils literal"><span class="pre">json</span></tt></a>, however, since it integrates better with some of the existing web service storage applications.</p> </div> <div class="section" id="storing-serialized-objects"> <h2>Storing Serialized Objects<a class="headerlink" href="#storing-serialized-objects" title="Permalink to this headline">¶</a></h2> <p>Once the in-memory object is converted to a storable format, the next step is to decide how to store the data. A simple flat-file with serialized objects written one after the other works for data that does not need to be indexed in any way. But Python includes a collection of modules for storing key-value pairs in a simple database using one of the DBM format variants.</p> <p>The simplest interface to take advantage of the DBM format is provided by <a class="reference internal" href="../shelve/index.html#module-shelve" title="shelve: Persistent storage of arbitrary Python objects"><tt class="xref py py-mod docutils literal"><span class="pre">shelve</span></tt></a>. Simply open the shelve file, and access it through a dictionary-like API. Objects saved to the shelve are automatically pickled and saved without any extra work on your part.</p> <p>One drawback of shelve is that with the default interface you can’t guarantee which DBM format will be used. That won’t matter if your application doesn’t need to share the database files between hosts with different libraries, but if that is needed you can use one of the classes in the module to ensure a specific format is selected (<a class="reference internal" href="../shelve/index.html#shelve-shelf-types"><em>Specific Shelf Types</em></a>).</p> <p>If you’re going to be passing a lot of data around via JSON anyway, using <a class="reference internal" href="../json/index.html#module-json" title="json: JavaScript Object Notation Serializer"><tt class="xref py py-mod docutils literal"><span class="pre">json</span></tt></a> and <a class="reference internal" href="../anydbm/index.html#module-anydbm" title="anydbm: anydbm provides a generic dictionary-like interface to DBM-style, string-keyed databases"><tt class="xref py py-mod docutils literal"><span class="pre">anydbm</span></tt></a> can provide another persistence mechanism. Since the DBM database keys and values must be strings, however, the objects won’t be automatically re-created when you access the value in the database.</p> </div> <div class="section" id="relational-databases"> <h2>Relational Databases<a class="headerlink" href="#relational-databases" title="Permalink to this headline">¶</a></h2> <p>The excellent <a class="reference internal" href="../sqlite3/index.html#module-sqlite3" title="sqlite3: Embedded relational database"><tt class="xref py py-mod docutils literal"><span class="pre">sqlite3</span></tt></a> in-process relational database is available with most Python distributions. It stores its database in memory or in a local file, and all access is from within the same process, so there is no network lag. The compact nature of <a class="reference internal" href="../sqlite3/index.html#module-sqlite3" title="sqlite3: Embedded relational database"><tt class="xref py py-mod docutils literal"><span class="pre">sqlite3</span></tt></a> makes it especially well suited for embedding in desktop applications or development versions of web apps.</p> <p>All access to the database is through the Python DBI 2.0 API, by default, as no object relational mapper (ORM) is included. The most popular general purpose ORM is <a class="reference external" href="http://www.sqlalchemy.org/">SQLAlchemy</a>, but others such as Django’s native ORM layer also support SQLite. SQLAlchemy is easy to install and set up, but if your objects aren’t very complicated and you are worried about overhead, you may want to use the DBI interface directly.</p> </div> <div class="section" id="data-exchange-through-standard-formats"> <h2>Data Exchange Through Standard Formats<a class="headerlink" href="#data-exchange-through-standard-formats" title="Permalink to this headline">¶</a></h2> <p>Although not usually considered a true persistence format <a class="reference internal" href="../csv/index.html#module-csv" title="csv: Read and write comma separated value files."><tt class="xref py py-mod docutils literal"><span class="pre">csv</span></tt></a>, or comma-separated-value, files can be an effective way to migrate data between applications. Most spreadsheet programs and databases support both export and import using CSV, so dumping data to a CSV file is frequently the simplest way to move data out of your application and into an analysis tool.</p> </div> </div> </div> </div> </div> <div class="clearer"></div> </div> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="../genindex.html" title="General Index" >index</a></li> <li class="right" > <a href="../py-modindex.html" title="Python Module Index" >modules</a> |</li> <li class="right" > <a href="data_structures.html" title="In-Memory Data Structures" >next</a> |</li> <li class="right" > <a href="index.html" title="Features of the Standard Library" >previous</a> |</li> <li><a href="../contents.html">PyMOTW</a> »</li> <li><a href="index.html" >Features of the Standard Library</a> »</li> </ul> </div> <div class="footer"> © Copyright Doug Hellmann. Last updated on Oct 24, 2010. Created using <a href="http://sphinx.pocoo.org/">Sphinx</a>. <br/><a href="http://creativecommons.org/licenses/by-nc-sa/3.0/us/" rel="license"><img alt="Creative Commons License" style="border-width:0" src="http://i.creativecommons.org/l/by-nc-sa/3.0/us/88x31.png"/></a> </div> </body> </html>