[code.view]

[top] / python / PyMOTW / docs / shlex / index.html


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>shlex – Lexical analysis of shell-style syntaxes. &mdash; Python Module of the Week</title>
    <link rel="stylesheet" href="../_static/sphinxdoc.css" type="text/css" />
    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../',
        VERSION:     '1.132',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="../_static/jquery.js"></script>
    <script type="text/javascript" src="../_static/underscore.js"></script>
    <script type="text/javascript" src="../_static/doctools.js"></script>
    <link rel="author" title="About these documents" href="../about.html" />
    <link rel="top" title="Python Module of the Week" href="../index.html" />
    <link rel="up" title="Program Frameworks" href="../frameworks.html" />
    <link rel="next" title="Development Tools" href="../dev_tools.html" />
    <link rel="prev" title="cmd – Create line-oriented command processors" href="../cmd/index.html" /> 
  </head>
  <body>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="../py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="../dev_tools.html" title="Development Tools"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="../cmd/index.html" title="cmd – Create line-oriented command processors"
             accesskey="P">previous</a> |</li>
        <li><a href="../contents.html">PyMOTW</a> &raquo;</li>
          <li><a href="../frameworks.html" accesskey="U">Program Frameworks</a> &raquo;</li> 
      </ul>
    </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
  <h3><a href="../contents.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">shlex &#8211; Lexical analysis of shell-style syntaxes.</a><ul>
<li><a class="reference internal" href="#quoted-strings">Quoted Strings</a></li>
<li><a class="reference internal" href="#embedded-comments">Embedded Comments</a></li>
<li><a class="reference internal" href="#split">Split</a></li>
<li><a class="reference internal" href="#including-other-sources-of-tokens">Including Other Sources of Tokens</a></li>
<li><a class="reference internal" href="#controlling-the-parser">Controlling the Parser</a></li>
<li><a class="reference internal" href="#error-handling">Error Handling</a></li>
<li><a class="reference internal" href="#posix-vs-non-posix-parsing">POSIX vs. Non-POSIX Parsing</a></li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="../cmd/index.html"
                        title="previous chapter">cmd &#8211; Create line-oriented command processors</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="../dev_tools.html"
                        title="next chapter">Development Tools</a></p>
  <h3>This Page</h3>
  <ul class="this-page-menu">
    <li><a href="../_sources/shlex/index.txt"
           rel="nofollow">Show Source</a></li>
  </ul>
<div id="searchbox" style="display: none">
  <h3>Quick search</h3>
    <form class="search" action="../search.html" method="get">
      <input type="text" name="q" size="18" />
      <input type="submit" value="Go" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    <p class="searchtip" style="font-size: 90%">
    Enter search terms or a module, class or function name.
    </p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="module-shlex">
<span id="shlex-lexical-analysis-of-shell-style-syntaxes"></span><h1>shlex &#8211; Lexical analysis of shell-style syntaxes.<a class="headerlink" href="#module-shlex" title="Permalink to this headline">¶</a></h1>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field"><th class="field-name">Purpose:</th><td class="field-body">Lexical analysis of shell-style syntaxes.</td>
</tr>
<tr class="field"><th class="field-name">Python Version:</th><td class="field-body">1.5.2, with additions in later versions</td>
</tr>
</tbody>
</table>
<p>The shlex module implements a class for parsing simple shell-like syntaxes. It
can be used for writing your own domain specific language, or for parsing
quoted strings (a task that is more complex than it seems, at first).</p>
<div class="section" id="quoted-strings">
<h2>Quoted Strings<a class="headerlink" href="#quoted-strings" title="Permalink to this headline">¶</a></h2>
<p>A common problem when working with input text is to identify a sequence of
quoted words as a single entity. Splitting the text on quotes does not always
work as expected, especially if there are nested levels of quotes. Take the
following text:</p>
<div class="highlight-python"><pre>This string has embedded "double quotes" and 'single quotes' in it,
and even "a 'nested example'".
</pre>
</div>
<p>A naive approach might attempt to construct a regular expression to
find the parts of the text outside the quotes to separate them from
the text inside the quotes, or vice versa. Such an approach would be
unnecessarily complex and prone to errors resulting from edge cases
like apostrophes or even typos. A better solution is to use a true
parser, such as the one provided by the <a class="reference internal" href="#module-shlex" title="shlex: Lexical analysis of shell-style syntaxes."><tt class="xref py py-mod docutils literal"><span class="pre">shlex</span></tt></a> module. Here is a
simple example that prints the tokens identified in the input file:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">shlex</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">2</span><span class="p">:</span>
    <span class="k">print</span> <span class="s">&#39;Please specify one filename on the command line.&#39;</span>
    <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>

<span class="n">filename</span> <span class="o">=</span> <span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">body</span> <span class="o">=</span> <span class="nb">file</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">&#39;rt&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="k">print</span> <span class="s">&#39;ORIGINAL:&#39;</span><span class="p">,</span> <span class="nb">repr</span><span class="p">(</span><span class="n">body</span><span class="p">)</span>
<span class="k">print</span>

<span class="k">print</span> <span class="s">&#39;TOKENS:&#39;</span>
<span class="n">lexer</span> <span class="o">=</span> <span class="n">shlex</span><span class="o">.</span><span class="n">shlex</span><span class="p">(</span><span class="n">body</span><span class="p">)</span>
<span class="k">for</span> <span class="n">token</span> <span class="ow">in</span> <span class="n">lexer</span><span class="p">:</span>
    <span class="k">print</span> <span class="nb">repr</span><span class="p">(</span><span class="n">token</span><span class="p">)</span>
</pre></div>
</div>
<p>When run on data with embedded quotes, the parser produces the list of tokens
we expect:</p>
<div class="highlight-python"><pre>$ python shlex_example.py quotes.txt

ORIGINAL: 'This string has embedded "double quotes" and \'single quotes\' in it,\nand even "a \'nested example\'".\n'

TOKENS:
'This'
'string'
'has'
'embedded'
'"double quotes"'
'and'
"'single quotes'"
'in'
'it'
','
'and'
'even'
'"a \'nested example\'"'
'.'</pre>
</div>
<p>Isolated quotes such as apostrophes are also handled.  Given this
input file:</p>
<div class="highlight-python"><pre>This string has an embedded apostrophe, doesn't it?</pre>
</div>
<p>The token with the embedded apostrophe is no problem:</p>
<div class="highlight-python"><pre>$ python shlex_example.py apostrophe.txt

ORIGINAL: "This string has an embedded apostrophe, doesn't it?"

TOKENS:
'This'
'string'
'has'
'an'
'embedded'
'apostrophe'
','
"doesn't"
'it'
'?'</pre>
</div>
</div>
<div class="section" id="embedded-comments">
<h2>Embedded Comments<a class="headerlink" href="#embedded-comments" title="Permalink to this headline">¶</a></h2>
<p>Since the parser is intended to be used with command languages, it needs to
handle comments. By default, any text following a # is considered part of a
comment, and ignored. Due to the nature of the parser, only single character
comment prefixes are supported. The set of comment characters used can be
configured through the commenters property.</p>
<div class="highlight-python"><pre>$ python shlex_example.py comments.txt

ORIGINAL: 'This line is recognized.\n# But this line is ignored.\nAnd this line is processed.'

TOKENS:
'This'
'line'
'is'
'recognized'
'.'
'And'
'this'
'line'
'is'
'processed'
'.'</pre>
</div>
</div>
<div class="section" id="split">
<h2>Split<a class="headerlink" href="#split" title="Permalink to this headline">¶</a></h2>
<p>If you just need to split an existing string into component tokens,
the convenience function <tt class="xref py py-func docutils literal"><span class="pre">split()</span></tt> is a simple wrapper around
the parser.</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">shlex</span>

<span class="n">text</span> <span class="o">=</span> <span class="s">&quot;&quot;&quot;This text has &quot;quoted parts&quot; inside it.&quot;&quot;&quot;</span>
<span class="k">print</span> <span class="s">&#39;ORIGINAL:&#39;</span><span class="p">,</span> <span class="nb">repr</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
<span class="k">print</span>

<span class="k">print</span> <span class="s">&#39;TOKENS:&#39;</span>
<span class="k">print</span> <span class="n">shlex</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
</pre></div>
</div>
<p>The result is a list:</p>
<div class="highlight-python"><pre>$ python shlex_split.py

ORIGINAL: 'This text has "quoted parts" inside it.'

TOKENS:
['This', 'text', 'has', 'quoted parts', 'inside', 'it.']</pre>
</div>
</div>
<div class="section" id="including-other-sources-of-tokens">
<h2>Including Other Sources of Tokens<a class="headerlink" href="#including-other-sources-of-tokens" title="Permalink to this headline">¶</a></h2>
<p>The <a class="reference internal" href="#module-shlex" title="shlex: Lexical analysis of shell-style syntaxes."><tt class="xref py py-class docutils literal"><span class="pre">shlex</span></tt></a> class includes several configuration properties
which allow us to control its behavior. The <em>source</em> property
enables a feature for code (or configuration) re-use by allowing one
token stream to include another. This is similar to the Bourne shell
<tt class="docutils literal"><span class="pre">source</span></tt> operator, hence the name.</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">shlex</span>

<span class="n">text</span> <span class="o">=</span> <span class="s">&quot;&quot;&quot;This text says to source quotes.txt before continuing.&quot;&quot;&quot;</span>
<span class="k">print</span> <span class="s">&#39;ORIGINAL:&#39;</span><span class="p">,</span> <span class="nb">repr</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
<span class="k">print</span>

<span class="n">lexer</span> <span class="o">=</span> <span class="n">shlex</span><span class="o">.</span><span class="n">shlex</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
<span class="n">lexer</span><span class="o">.</span><span class="n">wordchars</span> <span class="o">+=</span> <span class="s">&#39;.&#39;</span>
<span class="n">lexer</span><span class="o">.</span><span class="n">source</span> <span class="o">=</span> <span class="s">&#39;source&#39;</span>

<span class="k">print</span> <span class="s">&#39;TOKENS:&#39;</span>
<span class="k">for</span> <span class="n">token</span> <span class="ow">in</span> <span class="n">lexer</span><span class="p">:</span>
    <span class="k">print</span> <span class="nb">repr</span><span class="p">(</span><span class="n">token</span><span class="p">)</span>
</pre></div>
</div>
<p>Notice the string <tt class="docutils literal"><span class="pre">source</span> <span class="pre">quotes.txt</span></tt> embedded in the original
text. Since the source property of the lexer is set to &#8220;source&#8221;, when
the keyword is encountered the filename appearing in the next title is
automatically included. In order to cause the filename to appear as a
single token, the <tt class="docutils literal"><span class="pre">.</span></tt> character needs to be added to the list of
characters that are included in words (otherwise &#8220;<tt class="docutils literal"><span class="pre">quotes.txt</span></tt>&#8221;
becomes three tokens, &#8220;<tt class="docutils literal"><span class="pre">quotes</span></tt>&#8221;, &#8220;<tt class="docutils literal"><span class="pre">.</span></tt>&#8221;, &#8220;<tt class="docutils literal"><span class="pre">txt</span></tt>&#8221;). The output
looks like:</p>
<div class="highlight-python"><pre>$ python shlex_source.py

ORIGINAL: 'This text says to source quotes.txt before continuing.'

TOKENS:
'This'
'text'
'says'
'to'
'This'
'string'
'has'
'embedded'
'"double quotes"'
'and'
"'single quotes'"
'in'
'it'
','
'and'
'even'
'"a \'nested example\'"'
'.'
'before'
'continuing.'</pre>
</div>
<p>The &#8220;source&#8221; feature uses a method called <tt class="xref py py-func docutils literal"><span class="pre">sourcehook()</span></tt> to load
the additional input source, so you can subclass <a class="reference internal" href="#module-shlex" title="shlex: Lexical analysis of shell-style syntaxes."><tt class="xref py py-class docutils literal"><span class="pre">shlex</span></tt></a> to
provide your own implementation to load data from anywhere.</p>
</div>
<div class="section" id="controlling-the-parser">
<h2>Controlling the Parser<a class="headerlink" href="#controlling-the-parser" title="Permalink to this headline">¶</a></h2>
<p>I have already given an example changing the <em>wordchars</em> value to
control which characters are included in words. It is also possible to
set the <em>quotes</em> character to use additional or alternative
quotes. Each quote must be a single character, so it is not possible
to have different open and close quotes (no parsing on parentheses,
for example).</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">shlex</span>

<span class="n">text</span> <span class="o">=</span> <span class="s">&quot;&quot;&quot;|Col 1||Col 2||Col 3|&quot;&quot;&quot;</span>
<span class="k">print</span> <span class="s">&#39;ORIGINAL:&#39;</span><span class="p">,</span> <span class="nb">repr</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
<span class="k">print</span>

<span class="n">lexer</span> <span class="o">=</span> <span class="n">shlex</span><span class="o">.</span><span class="n">shlex</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
<span class="n">lexer</span><span class="o">.</span><span class="n">quotes</span> <span class="o">=</span> <span class="s">&#39;|&#39;</span>

<span class="k">print</span> <span class="s">&#39;TOKENS:&#39;</span>
<span class="k">for</span> <span class="n">token</span> <span class="ow">in</span> <span class="n">lexer</span><span class="p">:</span>
    <span class="k">print</span> <span class="nb">repr</span><span class="p">(</span><span class="n">token</span><span class="p">)</span>
</pre></div>
</div>
<p>In this example, each table cell is wrapped in vertical bars:</p>
<div class="highlight-python"><pre>$ python shlex_table.py

ORIGINAL: '|Col 1||Col 2||Col 3|'

TOKENS:
'|Col 1|'
'|Col 2|'
'|Col 3|'</pre>
</div>
<p>It is also possible to control the whitespace characters used to split words.
If we modify the example in shlex_example.py to include period and comma, as
follows:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">shlex</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">2</span><span class="p">:</span>
    <span class="k">print</span> <span class="s">&#39;Please specify one filename on the command line.&#39;</span>
    <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>

<span class="n">filename</span> <span class="o">=</span> <span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">body</span> <span class="o">=</span> <span class="nb">file</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">&#39;rt&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="k">print</span> <span class="s">&#39;ORIGINAL:&#39;</span><span class="p">,</span> <span class="nb">repr</span><span class="p">(</span><span class="n">body</span><span class="p">)</span>
<span class="k">print</span>

<span class="k">print</span> <span class="s">&#39;TOKENS:&#39;</span>
<span class="n">lexer</span> <span class="o">=</span> <span class="n">shlex</span><span class="o">.</span><span class="n">shlex</span><span class="p">(</span><span class="n">body</span><span class="p">)</span>
<span class="n">lexer</span><span class="o">.</span><span class="n">whitespace</span> <span class="o">+=</span> <span class="s">&#39;.,&#39;</span>
<span class="k">for</span> <span class="n">token</span> <span class="ow">in</span> <span class="n">lexer</span><span class="p">:</span>
    <span class="k">print</span> <span class="nb">repr</span><span class="p">(</span><span class="n">token</span><span class="p">)</span>
</pre></div>
</div>
<p>The results change to:</p>
<div class="highlight-python"><pre>$ python shlex_whitespace.py quotes.txt

ORIGINAL: 'This string has embedded "double quotes" and \'single quotes\' in it,\nand even "a \'nested example\'".\n'

TOKENS:
'This'
'string'
'has'
'embedded'
'"double quotes"'
'and'
"'single quotes'"
'in'
'it'
'and'
'even'
'"a \'nested example\'"'</pre>
</div>
</div>
<div class="section" id="error-handling">
<h2>Error Handling<a class="headerlink" href="#error-handling" title="Permalink to this headline">¶</a></h2>
<p>When the parser encounters the end of its input before all quoted
strings are closed, it raises <a class="reference internal" href="../exceptions/index.html#exceptions-valueerror"><em>ValueError</em></a>. When that happens, it is useful to examine
some of the properties of the parser maintained as it processes the
input. For example, <em>infile</em> refers to the name of the file being
processed (which might be different from the original file, if one
file sources another). The <em>lineno</em> reports the line when the error is
discovered. The <em>lineno</em> is typically the end of the file, which may
be far away from the first quote. The <em>token</em> attribute contains the
buffer of text not already included in a valid token. The
<tt class="xref py py-func docutils literal"><span class="pre">error_leader()</span></tt> method produces a message prefix in a style
similar to Unix compilers, which enables editors such as emacs to
parse the error and take the user directly to the invalid line.</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">shlex</span>

<span class="n">text</span> <span class="o">=</span> <span class="s">&quot;&quot;&quot;This line is ok.</span>
<span class="s">This line has an &quot;unfinished quote.</span>
<span class="s">This line is ok, too.</span>
<span class="s">&quot;&quot;&quot;</span>

<span class="k">print</span> <span class="s">&#39;ORIGINAL:&#39;</span><span class="p">,</span> <span class="nb">repr</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
<span class="k">print</span>

<span class="n">lexer</span> <span class="o">=</span> <span class="n">shlex</span><span class="o">.</span><span class="n">shlex</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>

<span class="k">print</span> <span class="s">&#39;TOKENS:&#39;</span>
<span class="k">try</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">token</span> <span class="ow">in</span> <span class="n">lexer</span><span class="p">:</span>
        <span class="k">print</span> <span class="nb">repr</span><span class="p">(</span><span class="n">token</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">ValueError</span><span class="p">,</span> <span class="n">err</span><span class="p">:</span>
    <span class="n">first_line_of_error</span> <span class="o">=</span> <span class="n">lexer</span><span class="o">.</span><span class="n">token</span><span class="o">.</span><span class="n">splitlines</span><span class="p">()[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">print</span> <span class="s">&#39;ERROR:&#39;</span><span class="p">,</span> <span class="n">lexer</span><span class="o">.</span><span class="n">error_leader</span><span class="p">(),</span> <span class="nb">str</span><span class="p">(</span><span class="n">err</span><span class="p">),</span> <span class="s">&#39;following &quot;&#39;</span> <span class="o">+</span> <span class="n">first_line_of_error</span> <span class="o">+</span> <span class="s">&#39;&quot;&#39;</span>
</pre></div>
</div>
<p>The example above produces this output:</p>
<div class="highlight-python"><pre>$ python shlex_errors.py

ORIGINAL: 'This line is ok.\nThis line has an "unfinished quote.\nThis line is ok, too.\n'

TOKENS:
'This'
'line'
'is'
'ok'
'.'
'This'
'line'
'has'
'an'
ERROR: "None", line 4:  No closing quotation following ""unfinished quote."</pre>
</div>
</div>
<div class="section" id="posix-vs-non-posix-parsing">
<h2>POSIX vs. Non-POSIX Parsing<a class="headerlink" href="#posix-vs-non-posix-parsing" title="Permalink to this headline">¶</a></h2>
<p>The default behavior for the parser is to use a backwards-compatible style
which is not POSIX-compliant. For POSIX behavior, set the posix argument when
constructing the parser.</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">shlex</span>

<span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="p">[</span> <span class="s">&#39;Do&quot;Not&quot;Separate&#39;</span><span class="p">,</span>
           <span class="s">&#39;&quot;Do&quot;Separate&#39;</span><span class="p">,</span>
           <span class="s">&#39;Escaped \e Character not in quotes&#39;</span><span class="p">,</span>
           <span class="s">&#39;Escaped &quot;\e&quot; Character in double quotes&#39;</span><span class="p">,</span>
           <span class="s">&quot;Escaped &#39;\e&#39; Character in single quotes&quot;</span><span class="p">,</span>
           <span class="s">r&quot;Escaped &#39;\&#39;&#39; </span><span class="se">\&quot;</span><span class="s">\&#39;</span><span class="se">\&quot;</span><span class="s"> single quote&quot;</span><span class="p">,</span>
           <span class="s">r&#39;Escaped &quot;\&quot;&quot; </span><span class="se">\&#39;</span><span class="s">\&quot;</span><span class="se">\&#39;</span><span class="s"> double quote&#39;</span><span class="p">,</span>
           <span class="s">&quot;</span><span class="se">\&quot;</span><span class="s">&#39;Strip extra layer of quotes&#39;</span><span class="se">\&quot;</span><span class="s">&quot;</span><span class="p">,</span>
           <span class="p">]:</span>
    <span class="k">print</span> <span class="s">&#39;ORIGINAL :&#39;</span><span class="p">,</span> <span class="nb">repr</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
    <span class="k">print</span> <span class="s">&#39;non-POSIX:&#39;</span><span class="p">,</span>

    <span class="n">non_posix_lexer</span> <span class="o">=</span> <span class="n">shlex</span><span class="o">.</span><span class="n">shlex</span><span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">posix</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="k">print</span> <span class="nb">repr</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">non_posix_lexer</span><span class="p">))</span>
    <span class="k">except</span> <span class="ne">ValueError</span><span class="p">,</span> <span class="n">err</span><span class="p">:</span>
        <span class="k">print</span> <span class="s">&#39;error(</span><span class="si">%s</span><span class="s">)&#39;</span> <span class="o">%</span> <span class="n">err</span>

    
    <span class="k">print</span> <span class="s">&#39;POSIX    :&#39;</span><span class="p">,</span>
    <span class="n">posix_lexer</span> <span class="o">=</span> <span class="n">shlex</span><span class="o">.</span><span class="n">shlex</span><span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">posix</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="k">print</span> <span class="nb">repr</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">posix_lexer</span><span class="p">))</span>
    <span class="k">except</span> <span class="ne">ValueError</span><span class="p">,</span> <span class="n">err</span><span class="p">:</span>
        <span class="k">print</span> <span class="s">&#39;error(</span><span class="si">%s</span><span class="s">)&#39;</span> <span class="o">%</span> <span class="n">err</span>

    <span class="k">print</span>
</pre></div>
</div>
<p>Here are a few examples of the differences in parsing behavior:</p>
<div class="highlight-python"><pre>$ python shlex_posix.py

ORIGINAL : 'Do"Not"Separate'
non-POSIX: ['Do"Not"Separate']
POSIX    : ['DoNotSeparate']

ORIGINAL : '"Do"Separate'
non-POSIX: ['"Do"', 'Separate']
POSIX    : ['DoSeparate']

ORIGINAL : 'Escaped \\e Character not in quotes'
non-POSIX: ['Escaped', '\\', 'e', 'Character', 'not', 'in', 'quotes']
POSIX    : ['Escaped', 'e', 'Character', 'not', 'in', 'quotes']

ORIGINAL : 'Escaped "\\e" Character in double quotes'
non-POSIX: ['Escaped', '"\\e"', 'Character', 'in', 'double', 'quotes']
POSIX    : ['Escaped', '\\e', 'Character', 'in', 'double', 'quotes']

ORIGINAL : "Escaped '\\e' Character in single quotes"
non-POSIX: ['Escaped', "'\\e'", 'Character', 'in', 'single', 'quotes']
POSIX    : ['Escaped', '\\e', 'Character', 'in', 'single', 'quotes']

ORIGINAL : 'Escaped \'\\\'\' \\"\\\'\\" single quote'
non-POSIX: error(No closing quotation)
POSIX    : ['Escaped', '\\ \\"\\"', 'single', 'quote']

ORIGINAL : 'Escaped "\\"" \\\'\\"\\\' double quote'
non-POSIX: error(No closing quotation)
POSIX    : ['Escaped', '"', '\'"\'', 'double', 'quote']

ORIGINAL : '"\'Strip extra layer of quotes\'"'
non-POSIX: ['"\'Strip extra layer of quotes\'"']
POSIX    : ["'Strip extra layer of quotes'"]</pre>
</div>
<div class="admonition-see-also admonition seealso">
<p class="first admonition-title">See also</p>
<dl class="last docutils">
<dt><a class="reference external" href="http://docs.python.org/lib/module-shlex.html">shlex</a></dt>
<dd>Standard library documentation for this module.</dd>
<dt><a class="reference internal" href="../cmd/index.html#module-cmd" title="cmd: Create line-oriented command processors."><tt class="xref py py-mod docutils literal"><span class="pre">cmd</span></tt></a></dt>
<dd>Tools for building interactive command interpreters.</dd>
<dt><a class="reference internal" href="../optparse/index.html#module-optparse" title="optparse: Command line option parser to replace :mod:`getopt`."><tt class="xref py py-mod docutils literal"><span class="pre">optparse</span></tt></a></dt>
<dd>Command line option parsing.</dd>
<dt><a class="reference internal" href="../getopt/index.html#module-getopt" title="getopt: Command line option parsing"><tt class="xref py py-mod docutils literal"><span class="pre">getopt</span></tt></a></dt>
<dd>Command line option parsing.</dd>
</dl>
</div>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="../py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="../dev_tools.html" title="Development Tools"
             >next</a> |</li>
        <li class="right" >
          <a href="../cmd/index.html" title="cmd – Create line-oriented command processors"
             >previous</a> |</li>
        <li><a href="../contents.html">PyMOTW</a> &raquo;</li>
          <li><a href="../frameworks.html" >Program Frameworks</a> &raquo;</li> 
      </ul>
    </div>
    <div class="footer">
      &copy; Copyright Doug Hellmann.
      Last updated on Oct 24, 2010.
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a>.

    <br/><a href="http://creativecommons.org/licenses/by-nc-sa/3.0/us/" rel="license"><img alt="Creative Commons License" style="border-width:0" src="http://i.creativecommons.org/l/by-nc-sa/3.0/us/88x31.png"/></a>
    
    </div>
  </body>
</html>

[top] / python / PyMOTW / docs / shlex / index.html

contact | logmethods.com