<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>mmap – Memory-map files — Python Module of the Week</title> <link rel="stylesheet" href="../_static/sphinxdoc.css" type="text/css" /> <link rel="stylesheet" href="../_static/pygments.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: '../', VERSION: '1.132', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="../_static/jquery.js"></script> <script type="text/javascript" src="../_static/underscore.js"></script> <script type="text/javascript" src="../_static/doctools.js"></script> <link rel="author" title="About these documents" href="../about.html" /> <link rel="top" title="Python Module of the Week" href="../index.html" /> <link rel="up" title="Optional Operating System Services" href="../optional_os.html" /> <link rel="next" title="multiprocessing – Manage processes like threads" href="../multiprocessing/index.html" /> <link rel="prev" title="threading – Manage concurrent threads" href="../threading/index.html" /> </head> <body> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="../genindex.html" title="General Index" accesskey="I">index</a></li> <li class="right" > <a href="../py-modindex.html" title="Python Module Index" >modules</a> |</li> <li class="right" > <a href="../multiprocessing/index.html" title="multiprocessing – Manage processes like threads" accesskey="N">next</a> |</li> <li class="right" > <a href="../threading/index.html" title="threading – Manage concurrent threads" accesskey="P">previous</a> |</li> <li><a href="../contents.html">PyMOTW</a> »</li> <li><a href="../optional_os.html" accesskey="U">Optional Operating System Services</a> »</li> </ul> </div> <div class="sphinxsidebar"> <div class="sphinxsidebarwrapper"> <h3><a href="../contents.html">Table Of Contents</a></h3> <ul> <li><a class="reference internal" href="#">mmap – Memory-map files</a><ul> <li><a class="reference internal" href="#reading">Reading</a></li> <li><a class="reference internal" href="#writing">Writing</a><ul> <li><a class="reference internal" href="#access-copy-mode">ACCESS_COPY Mode</a></li> </ul> </li> <li><a class="reference internal" href="#regular-expressions">Regular Expressions</a></li> </ul> </li> </ul> <h4>Previous topic</h4> <p class="topless"><a href="../threading/index.html" title="previous chapter">threading – Manage concurrent threads</a></p> <h4>Next topic</h4> <p class="topless"><a href="../multiprocessing/index.html" title="next chapter">multiprocessing – Manage processes like threads</a></p> <h3>This Page</h3> <ul class="this-page-menu"> <li><a href="../_sources/mmap/index.txt" rel="nofollow">Show Source</a></li> </ul> <div id="searchbox" style="display: none"> <h3>Quick search</h3> <form class="search" action="../search.html" method="get"> <input type="text" name="q" size="18" /> <input type="submit" value="Go" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> <p class="searchtip" style="font-size: 90%"> Enter search terms or a module, class or function name. </p> </div> <script type="text/javascript">$('#searchbox').show(0);</script> </div> </div> <div class="document"> <div class="documentwrapper"> <div class="bodywrapper"> <div class="body"> <div class="section" id="module-mmap"> <span id="mmap-memory-map-files"></span><h1>mmap – Memory-map files<a class="headerlink" href="#module-mmap" title="Permalink to this headline">¶</a></h1> <table class="docutils field-list" frame="void" rules="none"> <col class="field-name" /> <col class="field-body" /> <tbody valign="top"> <tr class="field"><th class="field-name">Purpose:</th><td class="field-body">Memory-map files instead of reading the contents directly.</td> </tr> <tr class="field"><th class="field-name">Python Version:</th><td class="field-body">2.1 and later</td> </tr> </tbody> </table> <p>Memory-mapping a file uses the operating system virtual memory system to access the data on the filesystem directly, instead of using normal I/O functions. Memory-mapping typically improves I/O performance because it does not involve a separate system call for each access and it does not require copying data between buffers – the memory is accessed directly.</p> <p>Memory-mapped files can be treated as mutable strings or file-like objects, depending on your need. A mapped file supports the expected file API methods, such as <tt class="xref py py-func docutils literal"><span class="pre">close()</span></tt>, <tt class="xref py py-func docutils literal"><span class="pre">flush()</span></tt>, <tt class="xref py py-func docutils literal"><span class="pre">read()</span></tt>, <a class="reference internal" href="../readline/index.html#module-readline" title="readline: Interface to the GNU readline library"><tt class="xref py py-func docutils literal"><span class="pre">readline()</span></tt></a>, <tt class="xref py py-func docutils literal"><span class="pre">seek()</span></tt>, <tt class="xref py py-func docutils literal"><span class="pre">tell()</span></tt>, and <tt class="xref py py-func docutils literal"><span class="pre">write()</span></tt>. It also supports the string API, with features such as slicing and methods like <tt class="xref py py-func docutils literal"><span class="pre">find()</span></tt>.</p> <p>All of the examples use the text file <tt class="docutils literal"><span class="pre">lorem.txt</span></tt>, containing a bit of Lorem Ipsum. For reference, the text of the file is:</p> <div class="highlight-python"><pre>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec egestas, enim et consectetuer ullamcorper, lectus ligula rutrum leo, a elementum elit tortor eu quam. Duis tincidunt nisi ut ante. Nulla facilisi. Sed tristique eros eu libero. Pellentesque vel arcu. Vivamus purus orci, iaculis ac, suscipit sit amet, pulvinar eu, lacus. Praesent placerat tortor sed nisl. Nunc blandit diam egestas dui. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Aliquam viverra fringilla leo. Nulla feugiat augue eleifend nulla. Vivamus mauris. Vivamus sed mauris in nibh placerat egestas. Suspendisse potenti. Mauris massa. Ut eget velit auctor tortor blandit sollicitudin. Suspendisse imperdiet justo. </pre> </div> <div class="admonition note"> <p class="first admonition-title">Note</p> <p class="last">There are differences in the arguments and behaviors for <a class="reference internal" href="#module-mmap" title="mmap: Memory-map files"><tt class="xref py py-func docutils literal"><span class="pre">mmap()</span></tt></a> between Unix and Windows, which are not discussed below. For more details, refer to the standard library documentation.</p> </div> <div class="section" id="reading"> <h2>Reading<a class="headerlink" href="#reading" title="Permalink to this headline">¶</a></h2> <p>Use the <a class="reference internal" href="#module-mmap" title="mmap: Memory-map files"><tt class="xref py py-func docutils literal"><span class="pre">mmap()</span></tt></a> function to create a memory-mapped file. The first argument is a file descriptor, either from the <tt class="xref py py-func docutils literal"><span class="pre">fileno()</span></tt> method of a <tt class="xref py py-class docutils literal"><span class="pre">file</span></tt> object or from <tt class="xref py py-func docutils literal"><span class="pre">os.open()</span></tt>. The caller is responsible for opening the file before invoking <a class="reference internal" href="#module-mmap" title="mmap: Memory-map files"><tt class="xref py py-func docutils literal"><span class="pre">mmap()</span></tt></a>, and closing it after it is no longer needed.</p> <p>The second argument to <a class="reference internal" href="#module-mmap" title="mmap: Memory-map files"><tt class="xref py py-func docutils literal"><span class="pre">mmap()</span></tt></a> is a size in bytes for the portion of the file to map. If the value is <tt class="docutils literal"><span class="pre">0</span></tt>, the entire file is mapped. If the size is larger than the current size of the file, the file is extended.</p> <div class="admonition note"> <p class="first admonition-title">Note</p> <p class="last">You cannot create a zero-length mapping under Windows.</p> </div> <p>An optional keyword argument, <em>access</em>, is supported by both platforms. Use <tt class="xref py py-const docutils literal"><span class="pre">ACCESS_READ</span></tt> for read-only access, <tt class="xref py py-const docutils literal"><span class="pre">ACCESS_WRITE</span></tt> for write-through (assignments to the memory go directly to the file), or <tt class="xref py py-const docutils literal"><span class="pre">ACCESS_COPY</span></tt> for copy-on-write (assignments to memory are not written to the file).</p> <div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">mmap</span> <span class="kn">import</span> <span class="nn">contextlib</span> <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'lorem.txt'</span><span class="p">,</span> <span class="s">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span> <span class="k">with</span> <span class="n">contextlib</span><span class="o">.</span><span class="n">closing</span><span class="p">(</span><span class="n">mmap</span><span class="o">.</span><span class="n">mmap</span><span class="p">(</span><span class="n">f</span><span class="o">.</span><span class="n">fileno</span><span class="p">(),</span> <span class="mi">0</span><span class="p">,</span> <span class="n">access</span><span class="o">=</span><span class="n">mmap</span><span class="o">.</span><span class="n">ACCESS_READ</span><span class="p">))</span> <span class="k">as</span> <span class="n">m</span><span class="p">:</span> <span class="k">print</span> <span class="s">'First 10 bytes via read :'</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span> <span class="k">print</span> <span class="s">'First 10 bytes via slice:'</span><span class="p">,</span> <span class="n">m</span><span class="p">[:</span><span class="mi">10</span><span class="p">]</span> <span class="k">print</span> <span class="s">'2nd 10 bytes via read :'</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span> </pre></div> </div> <p>The file pointer tracks the last byte accessed through a slice operation. In this example, the pointer moves ahead 10 bytes after the first read. It is then reset to the beginning of the file by the slice operation, and moved ahead 10 bytes again by the slice. After the slice operation, calling <tt class="xref py py-func docutils literal"><span class="pre">read()</span></tt> again gives the bytes 11-20 in the file.</p> <div class="highlight-python"><pre>$ python mmap_read.py First 10 bytes via read : Lorem ipsu First 10 bytes via slice: Lorem ipsu 2nd 10 bytes via read : m dolor si</pre> </div> </div> <div class="section" id="writing"> <h2>Writing<a class="headerlink" href="#writing" title="Permalink to this headline">¶</a></h2> <p>To set up the memory mapped file to receive updates, start by opening it for appending with mode <tt class="docutils literal"><span class="pre">'r+'</span></tt> (not <tt class="docutils literal"><span class="pre">'w'</span></tt>) before mapping it. Then use any of the API method that change the data (<tt class="xref py py-func docutils literal"><span class="pre">write()</span></tt>, assignment to a slice, etc.).</p> <p>Here’s an example using the default access mode of <tt class="xref py py-const docutils literal"><span class="pre">ACCESS_WRITE</span></tt> and assigning to a slice to modify part of a line in place:</p> <div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">mmap</span> <span class="kn">import</span> <span class="nn">shutil</span> <span class="kn">import</span> <span class="nn">contextlib</span> <span class="c"># Copy the example file</span> <span class="n">shutil</span><span class="o">.</span><span class="n">copyfile</span><span class="p">(</span><span class="s">'lorem.txt'</span><span class="p">,</span> <span class="s">'lorem_copy.txt'</span><span class="p">)</span> <span class="n">word</span> <span class="o">=</span> <span class="s">'consectetuer'</span> <span class="nb">reversed</span> <span class="o">=</span> <span class="n">word</span><span class="p">[::</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="k">print</span> <span class="s">'Looking for :'</span><span class="p">,</span> <span class="n">word</span> <span class="k">print</span> <span class="s">'Replacing with :'</span><span class="p">,</span> <span class="nb">reversed</span> <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'lorem_copy.txt'</span><span class="p">,</span> <span class="s">'r+'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span> <span class="k">with</span> <span class="n">contextlib</span><span class="o">.</span><span class="n">closing</span><span class="p">(</span><span class="n">mmap</span><span class="o">.</span><span class="n">mmap</span><span class="p">(</span><span class="n">f</span><span class="o">.</span><span class="n">fileno</span><span class="p">(),</span> <span class="mi">0</span><span class="p">))</span> <span class="k">as</span> <span class="n">m</span><span class="p">:</span> <span class="k">print</span> <span class="s">'Before:'</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span><span class="o">.</span><span class="n">rstrip</span><span class="p">()</span> <span class="n">m</span><span class="o">.</span><span class="n">seek</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="c"># rewind</span> <span class="n">loc</span> <span class="o">=</span> <span class="n">m</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="n">word</span><span class="p">)</span> <span class="n">m</span><span class="p">[</span><span class="n">loc</span><span class="p">:</span><span class="n">loc</span><span class="o">+</span><span class="nb">len</span><span class="p">(</span><span class="n">word</span><span class="p">)]</span> <span class="o">=</span> <span class="nb">reversed</span> <span class="n">m</span><span class="o">.</span><span class="n">flush</span><span class="p">()</span> <span class="n">m</span><span class="o">.</span><span class="n">seek</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="c"># rewind</span> <span class="k">print</span> <span class="s">'After :'</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span><span class="o">.</span><span class="n">rstrip</span><span class="p">()</span> </pre></div> </div> <p>The word “consectetuer” is replaced in the middle of the first line:</p> <div class="highlight-python"><pre>$ python mmap_write_slice.py Looking for : consectetuer Replacing with : reutetcesnoc Before: Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec After : Lorem ipsum dolor sit amet, reutetcesnoc adipiscing elit. Donec</pre> </div> <div class="section" id="access-copy-mode"> <h3>ACCESS_COPY Mode<a class="headerlink" href="#access-copy-mode" title="Permalink to this headline">¶</a></h3> <p>Using the access setting <tt class="xref py py-const docutils literal"><span class="pre">ACCESS_COPY</span></tt> does not write changes to the file on disk.</p> <div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">mmap</span> <span class="kn">import</span> <span class="nn">shutil</span> <span class="kn">import</span> <span class="nn">contextlib</span> <span class="c"># Copy the example file</span> <span class="n">shutil</span><span class="o">.</span><span class="n">copyfile</span><span class="p">(</span><span class="s">'lorem.txt'</span><span class="p">,</span> <span class="s">'lorem_copy.txt'</span><span class="p">)</span> <span class="n">word</span> <span class="o">=</span> <span class="s">'consectetuer'</span> <span class="nb">reversed</span> <span class="o">=</span> <span class="n">word</span><span class="p">[::</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'lorem_copy.txt'</span><span class="p">,</span> <span class="s">'r+'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span> <span class="k">with</span> <span class="n">contextlib</span><span class="o">.</span><span class="n">closing</span><span class="p">(</span><span class="n">mmap</span><span class="o">.</span><span class="n">mmap</span><span class="p">(</span><span class="n">f</span><span class="o">.</span><span class="n">fileno</span><span class="p">(),</span> <span class="mi">0</span><span class="p">,</span> <span class="n">access</span><span class="o">=</span><span class="n">mmap</span><span class="o">.</span><span class="n">ACCESS_COPY</span><span class="p">))</span> <span class="k">as</span> <span class="n">m</span><span class="p">:</span> <span class="k">print</span> <span class="s">'Memory Before:'</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span><span class="o">.</span><span class="n">rstrip</span><span class="p">()</span> <span class="k">print</span> <span class="s">'File Before :'</span><span class="p">,</span> <span class="n">f</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span><span class="o">.</span><span class="n">rstrip</span><span class="p">()</span> <span class="k">print</span> <span class="n">m</span><span class="o">.</span><span class="n">seek</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="c"># rewind</span> <span class="n">loc</span> <span class="o">=</span> <span class="n">m</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="n">word</span><span class="p">)</span> <span class="n">m</span><span class="p">[</span><span class="n">loc</span><span class="p">:</span><span class="n">loc</span><span class="o">+</span><span class="nb">len</span><span class="p">(</span><span class="n">word</span><span class="p">)]</span> <span class="o">=</span> <span class="nb">reversed</span> <span class="n">m</span><span class="o">.</span><span class="n">seek</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="c"># rewind</span> <span class="k">print</span> <span class="s">'Memory After :'</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span><span class="o">.</span><span class="n">rstrip</span><span class="p">()</span> <span class="n">f</span><span class="o">.</span><span class="n">seek</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="k">print</span> <span class="s">'File After :'</span><span class="p">,</span> <span class="n">f</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span><span class="o">.</span><span class="n">rstrip</span><span class="p">()</span> </pre></div> </div> <p>It is necessary to rewind the file handle in this example separately from the mmap handle because the internal state of the two objects is maintained separately.</p> <div class="highlight-python"><pre>$ python mmap_write_copy.py Memory Before: Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec File Before : Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec Memory After : Lorem ipsum dolor sit amet, reutetcesnoc adipiscing elit. Donec File After : Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec</pre> </div> </div> </div> <div class="section" id="regular-expressions"> <h2>Regular Expressions<a class="headerlink" href="#regular-expressions" title="Permalink to this headline">¶</a></h2> <p>Since a memory mapped file can act like a string, it can be used with other modules that operate on strings, such as regular expressions. This example finds all of the sentences with “nulla” in them.</p> <div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">mmap</span> <span class="kn">import</span> <span class="nn">re</span> <span class="kn">import</span> <span class="nn">contextlib</span> <span class="n">pattern</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">compile</span><span class="p">(</span><span class="s">r'(\.\W+)?([^.]?nulla[^.]*?\.)'</span><span class="p">,</span> <span class="n">re</span><span class="o">.</span><span class="n">DOTALL</span> <span class="o">|</span> <span class="n">re</span><span class="o">.</span><span class="n">IGNORECASE</span> <span class="o">|</span> <span class="n">re</span><span class="o">.</span><span class="n">MULTILINE</span><span class="p">)</span> <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'lorem.txt'</span><span class="p">,</span> <span class="s">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span> <span class="k">with</span> <span class="n">contextlib</span><span class="o">.</span><span class="n">closing</span><span class="p">(</span><span class="n">mmap</span><span class="o">.</span><span class="n">mmap</span><span class="p">(</span><span class="n">f</span><span class="o">.</span><span class="n">fileno</span><span class="p">(),</span> <span class="mi">0</span><span class="p">,</span> <span class="n">access</span><span class="o">=</span><span class="n">mmap</span><span class="o">.</span><span class="n">ACCESS_READ</span><span class="p">))</span> <span class="k">as</span> <span class="n">m</span><span class="p">:</span> <span class="k">for</span> <span class="n">match</span> <span class="ow">in</span> <span class="n">pattern</span><span class="o">.</span><span class="n">findall</span><span class="p">(</span><span class="n">m</span><span class="p">):</span> <span class="k">print</span> <span class="n">match</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">'</span><span class="se">\n</span><span class="s">'</span><span class="p">,</span> <span class="s">' '</span><span class="p">)</span> </pre></div> </div> <p>Because the pattern includes two groups, the return value from <tt class="xref py py-func docutils literal"><span class="pre">findall()</span></tt> is a sequence of tuples. The <strong class="command">print</strong> statement pulls out the sentence match and replaces newlines with spaces so the result prints on a single line.</p> <div class="highlight-python"><pre>$ python mmap_regex.py Nulla facilisi. Nulla feugiat augue eleifend nulla.</pre> </div> <div class="admonition-see-also admonition seealso"> <p class="first admonition-title">See also</p> <dl class="last docutils"> <dt><a class="reference external" href="http://docs.python.org/lib/module-mmap.html">mmap</a></dt> <dd>Standard library documentation for this module.</dd> <dt><a class="reference internal" href="../os/index.html#module-os" title="os: Portable access to operating system specific features."><tt class="xref py py-mod docutils literal"><span class="pre">os</span></tt></a></dt> <dd>The os module.</dd> <dt><a class="reference internal" href="../contextlib/index.html#module-contextlib" title="contextlib: Utilities for creating and working with context managers."><tt class="xref py py-mod docutils literal"><span class="pre">contextlib</span></tt></a></dt> <dd>Use the <tt class="xref py py-func docutils literal"><span class="pre">closing()</span></tt> function to create a context manager for a memory mapped file.</dd> <dt><a class="reference internal" href="../re/index.html#module-re" title="re: Searching within and changing text using formal patterns."><tt class="xref py py-mod docutils literal"><span class="pre">re</span></tt></a></dt> <dd>Regular expressions.</dd> </dl> </div> </div> </div> </div> </div> </div> <div class="clearer"></div> </div> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="../genindex.html" title="General Index" >index</a></li> <li class="right" > <a href="../py-modindex.html" title="Python Module Index" >modules</a> |</li> <li class="right" > <a href="../multiprocessing/index.html" title="multiprocessing – Manage processes like threads" >next</a> |</li> <li class="right" > <a href="../threading/index.html" title="threading – Manage concurrent threads" >previous</a> |</li> <li><a href="../contents.html">PyMOTW</a> »</li> <li><a href="../optional_os.html" >Optional Operating System Services</a> »</li> </ul> </div> <div class="footer"> © Copyright Doug Hellmann. Last updated on Oct 24, 2010. Created using <a href="http://sphinx.pocoo.org/">Sphinx</a>. <br/><a href="http://creativecommons.org/licenses/by-nc-sa/3.0/us/" rel="license"><img alt="Creative Commons License" style="border-width:0" src="http://i.creativecommons.org/l/by-nc-sa/3.0/us/88x31.png"/></a> </div> </body> </html>