    Data Persistence and Exchange
<p>Python provides several modules for storing data.  Persistence has two aspects: converting an in-memory object to and from a format that can be saved, and managing the storage of the converted data.</p>
Serializing Objects
<p>Python includes two modules capable of converting objects into a transmittable or storable format (<em>serializing</em>): <a class="reference internal" href="../pickle/index.html#module-pickle" title="pickle: Python object serialization"><tt class="xref py py-mod docutils literal"><span class="pre">pickle</span></tt></a> and <a class="reference internal" href="../json/index.html#module-json" title="json: JavaScript Object Notation Serializer"><tt class="xref py py-mod docutils literal"><span class="pre">json</span></tt></a>.  It is most common to use <a class="reference internal" href="../pickle/index.html#module-pickle" title="pickle: Python object serialization"><tt class="xref py py-mod docutils literal"><span class="pre">pickle</span></tt></a>, since there is a fast C implementation and it is integrated with some of the other standard library modules that actually store the serialized data, such as <a class="reference internal" href="../shelve/index.html#module-shelve" title="shelve: Persistent storage of arbitrary Python objects"><tt class="xref py py-mod docutils literal"><span class="pre">shelve</span></tt></a>.  Web-based applications may want to examine <a class="reference internal" href="../json/index.html#module-json" title="json: JavaScript Object Notation Serializer"><tt class="xref py py-mod docutils literal"><span class="pre">json</span></tt></a>, however, since it integrates better with some of the existing web service storage applications.</p>
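<p>As a minimal sketch of the two serializers side by side, the following round-trips one object through each module.  The <tt class="docutils literal"><span class="pre">record</span></tt> dictionary is a hypothetical sample invented for illustration, and the code assumes Python 3.</p>

```python
import json
import pickle

# Hypothetical sample record, used only for illustration.
record = {"name": "example", "values": [1, 2, 3], "active": True}

# pickle produces a Python-specific byte string.
pickled = pickle.dumps(record)
restored_from_pickle = pickle.loads(pickled)

# json produces text that programs written in other languages can read.
encoded = json.dumps(record, sort_keys=True)
restored_from_json = json.loads(encoded)

print(type(pickled).__name__, type(encoded).__name__)
print(restored_from_pickle == record, restored_from_json == record)
```

<p>Both round trips return an equal dictionary here because the sample uses only types that JSON can represent.  Arbitrary class instances survive a pickle round trip but need custom handling with json.</p>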
Storing Serialized Objects
<p>Once the in-memory object is converted to a storable format, the next step is to decide how to store the data.  A simple flat file with serialized objects written one after the other works for data that does not need to be indexed in any way.  For data that does need to be looked up by key, Python includes a collection of modules for storing key-value pairs in a simple database using one of the DBM format variants.</p>
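<p>The flat-file approach can be sketched as follows, assuming Python 3.  The file name, the sample records, and the <tt class="docutils literal"><span class="pre">read_all()</span></tt> helper are hypothetical names chosen for this example.</p>

```python
import os
import pickle
import tempfile

# Hypothetical records, for illustration only.
records = [{"id": 1, "label": "first"}, {"id": 2, "label": "second"}]

# Write to a scratch directory so the example leaves the working tree alone.
path = os.path.join(tempfile.mkdtemp(), "records.pkl")

# Write the pickled objects one after the other into the same file.
with open(path, "wb") as f:
    for rec in records:
        pickle.dump(rec, f)


def read_all(filename):
    """Yield each pickled object in the file, in the order it was written."""
    with open(filename, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # No more pickles in the file.
                break


loaded = list(read_all(path))
print(loaded)
```

<p>Reading is strictly sequential: finding one record means unpickling everything written before it, which is why this layout only suits data that does not need an index.</p>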
<p>The simplest interface to take advantage of the DBM format is provided by <a class="reference internal" href="../shelve/index.html#module-shelve" title="shelve: Persistent storage of arbitrary Python objects"><tt class="xref py py-mod docutils literal"><span class="pre">shelve</span></tt></a>.  Simply open the shelve file, and access it through a dictionary-like API.  Objects saved to the shelve are automatically pickled and saved without any extra work on your part.</p>
<p>One drawback of shelve is that with the default interface you can&#8217;t guarantee which DBM format will be used.  That won&#8217;t matter if your application doesn&#8217;t need to share the database files between hosts with different libraries, but if that is needed you can use one of the classes in the module to ensure a specific format is selected (<a class="reference internal" href="../shelve/index.html#shelve-shelf-types"><em>Specific Shelf Types</em></a>).</p>
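<p>A minimal sketch of both approaches, assuming Python 3.4 or later (where shelves work as context managers).  The paths and stored values are hypothetical.  Pinning the format is shown here by wrapping an explicitly opened <tt class="docutils literal"><span class="pre">dbm.dumb</span></tt> database in <tt class="docutils literal"><span class="pre">shelve.Shelf</span></tt>, which is one way to do it rather than the only one.</p>

```python
import dbm.dumb
import os
import shelve
import tempfile

workdir = tempfile.mkdtemp()

# Default interface: shelve picks whichever DBM backend is available.
default_path = os.path.join(workdir, "default_shelf")
with shelve.open(default_path) as shelf:
    shelf["config"] = {"retries": 3, "hosts": ["a", "b"]}

with shelve.open(default_path) as shelf:
    default_value = shelf["config"]  # unpickled automatically

# Pinned format: open a specific DBM implementation and wrap it in a Shelf.
pinned_path = os.path.join(workdir, "pinned_shelf")
with shelve.Shelf(dbm.dumb.open(pinned_path, "c")) as shelf:
    shelf["config"] = {"retries": 5}

with shelve.Shelf(dbm.dumb.open(pinned_path, "r")) as shelf:
    pinned_value = shelf["config"]

print(default_value)
print(pinned_value)
```

<p>The pure-Python <tt class="docutils literal"><span class="pre">dbm.dumb</span></tt> backend is slow, but it is always present, so files written this way do not depend on which DBM libraries a host has installed.</p>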
<p>If you&#8217;re going to be passing a lot of data around via JSON anyway, using <a class="reference internal" href="../json/index.html#module-json" title="json: JavaScript Object Notation Serializer"><tt class="xref py py-mod docutils literal"><span class="pre">json</span></tt></a> and <a class="reference internal" href="../anydbm/index.html#module-anydbm" title="anydbm: anydbm provides a generic dictionary-like interface to DBM-style, string-keyed databases"><tt class="xref py py-mod docutils literal"><span class="pre">anydbm</span></tt></a> can provide another persistence mechanism.  Since the DBM database keys and values must be strings, however, the objects won&#8217;t be automatically re-created when you access the value in the database.</p>
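<p>A rough sketch of that combination follows.  It assumes Python 3, where <tt class="docutils literal"><span class="pre">anydbm</span></tt> was renamed to <tt class="docutils literal"><span class="pre">dbm</span></tt> and values come back as bytes.  The key naming scheme and the sample record are hypothetical.</p>

```python
import dbm
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "json_store")

# Hypothetical record, for illustration only.
user = {"name": "alice", "roles": ["admin", "dev"]}

# Encode explicitly: the database stores only byte strings.
with dbm.open(path, "c") as db:
    db[b"user:alice"] = json.dumps(user).encode("utf-8")

# Decode explicitly: the value comes back as raw bytes, not as a dict.
with dbm.open(path, "r") as db:
    raw = db[b"user:alice"]
    decoded = json.loads(raw.decode("utf-8"))

print(type(raw).__name__)
print(decoded)
```

<p>Unlike shelve, nothing here rebuilds the object for you; the explicit <tt class="docutils literal"><span class="pre">json.loads()</span></tt> call is the extra step the paragraph above refers to.</p>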
Relational Databases
<p>The excellent <a class="reference internal" href="../sqlite3/index.html#module-sqlite3" title="sqlite3: Embedded relational database"><tt class="xref py py-mod docutils literal"><span class="pre">sqlite3</span></tt></a> in-process relational database is available with most Python distributions.  It stores its database in memory or in a local file, and all access is from within the same process, so there is no network lag.  The compact nature of <a class="reference internal" href="../sqlite3/index.html#module-sqlite3" title="sqlite3: Embedded relational database"><tt class="xref py py-mod docutils literal"><span class="pre">sqlite3</span></tt></a> makes it especially well suited for embedding in desktop applications or development versions of web apps.</p>
<p>All access to the database is through the Python DB-API 2.0 interface by default, as no object relational mapper (ORM) is included.  The most popular general purpose ORM is <a class="reference external" href="http://www.sqlalchemy.org/">SQLAlchemy</a>, but others such as Django&#8217;s native ORM layer also support SQLite.  SQLAlchemy is easy to install and set up, but if your objects aren&#8217;t very complicated and you are worried about overhead, you may want to use the DB-API directly.</p>
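<p>Using the module directly might look like the sketch below.  It uses an in-memory database, and the <tt class="docutils literal"><span class="pre">task</span></tt> table, its columns, and the sample rows are hypothetical.</p>

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a file name to persist it.
conn = sqlite3.connect(":memory:")
try:
    conn.execute(
        "CREATE TABLE task (id INTEGER PRIMARY KEY, name TEXT, done INTEGER)"
    )

    # Use "?" placeholders rather than building SQL with string formatting.
    conn.executemany(
        "INSERT INTO task (name, done) VALUES (?, ?)",
        [("write docs", 0), ("run tests", 1)],
    )
    conn.commit()

    cursor = conn.cursor()
    cursor.execute("SELECT name FROM task WHERE done = ? ORDER BY id", (0,))
    pending = [row[0] for row in cursor.fetchall()]
finally:
    conn.close()

print(pending)
```

<p>Rows come back as plain tuples, so mapping them onto your own objects is left to the application.  That mapping is the work an ORM would otherwise do.</p>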
Data Exchange Through Standard Formats
<p>Although not usually considered a true persistence format, <a class="reference internal" href="../csv/index.html#module-csv" title="csv: Read and write comma separated value files."><tt class="xref py py-mod docutils literal"><span class="pre">csv</span></tt></a>, or comma-separated-value, files can be an effective way to migrate data between applications.  Most spreadsheet programs and databases support both export and import using CSV, so dumping data to a CSV file is frequently the simplest way to move data out of your application and into an analysis tool.</p>
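<p>A small sketch of dumping records to CSV and reading them back, assuming Python 3.  It writes to an in-memory buffer instead of a real file, and the column names and rows are hypothetical.</p>

```python
import csv
import io

# Hypothetical rows, for illustration only.
rows = [
    {"id": 1, "name": "widget", "price": 2.5},
    {"id": 2, "name": "gadget, large", "price": 10},
]

# newline="" lets the csv module control line endings itself.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["id", "name", "price"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Read the same text back; every value is returned as a string.
reader = csv.DictReader(io.StringIO(csv_text, newline=""))
round_tripped = list(reader)

print(csv_text)
print(round_tripped[1]["name"])
```

<p>The writer quotes the field containing a comma automatically.  Type information is lost on the way back, which is one reason CSV works better for exchange than for persistence.</p>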

