<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <font size="-1"><font face="Verdana">Dear all, <br>
        I have a fairly simple question, and I hope the answer is
        equally simple.<br>
        <br>
        We are working with n-grams, which are stored as:<br>
        token1, lemma1, tagset1, token2, lemma2, tagset2, [and so on]<br>
        <br>
        I am wondering whether there is a standard way to convert these
        n-grams into a database.<br>
        Technically, of course, the conversion itself is no problem; my
        question is which indexes should be built and what should be
        stored as is, without any optimization.<br>
        And more specifically, does it make sense to keep the whole
        tagsets, or is it better to store each tag separately?<br>
        <br>
        Thank you!<br>
        Mikhail Kopotev<br>
        <br>
      </font></font>
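    <font size="-1"><font face="Verdana">P.S. To make the last
        question concrete, here is a minimal sketch of the two layouts
        I have in mind. It assumes SQLite through Python's sqlite3
        module and bigrams only; the table and column names (ngram,
        ngram_tag, position), the sample words, and the comma-separated
        tagset format are all hypothetical, chosen purely for
        illustration.<br>
      </font></font>
    <pre>
```python
import sqlite3

# Hypothetical bigram rows in the order described above:
# token1, lemma1, tagset1, token2, lemma2, tagset2.
# The sample words and the comma-separated tag format are assumptions.
ROWS = [
    ("the", "the", "DET", "cats", "cat", "NOUN,PL"),
    ("cats", "cat", "NOUN,PL", "sleep", "sleep", "VERB,PRES"),
]


def build_db(rows):
    """Load the rows into an in-memory database using both layouts."""
    con = sqlite3.connect(":memory:")
    # Option A: keep each tagset as one string, stored as is.
    con.execute(
        "CREATE TABLE ngram (id INTEGER PRIMARY KEY, "
        "token1 TEXT, lemma1 TEXT, tagset1 TEXT, "
        "token2 TEXT, lemma2 TEXT, tagset2 TEXT)"
    )
    # Option B: additionally store every tag in its own row.
    con.execute(
        "CREATE TABLE ngram_tag (ngram_id INTEGER, position INTEGER, tag TEXT)"
    )
    for row in rows:
        cur = con.execute(
            "INSERT INTO ngram "
            "(token1, lemma1, tagset1, token2, lemma2, tagset2) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            row,
        )
        ngram_id = cur.lastrowid
        # Split the tagsets of word 1 and word 2 into individual tags.
        for position, tagset in ((1, row[2]), (2, row[5])):
            for tag in tagset.split(","):
                con.execute(
                    "INSERT INTO ngram_tag VALUES (?, ?, ?)",
                    (ngram_id, position, tag),
                )
    # Candidate indexes; which ones are worth building is the open question.
    con.execute("CREATE INDEX idx_lemma1 ON ngram (lemma1)")
    con.execute("CREATE INDEX idx_tag ON ngram_tag (tag, position)")
    return con


def ngrams_with_tag(con, tag, position):
    """Return ids of n-grams whose word at `position` carries `tag`."""
    cur = con.execute(
        "SELECT ngram_id FROM ngram_tag WHERE tag = ? AND position = ? "
        "ORDER BY ngram_id",
        (tag, position),
    )
    return [r[0] for r in cur]
```
    </pre>
    <font size="-1"><font face="Verdana">With the per-tag table, a
        lookup such as ngrams_with_tag(con, "PL", 1) can be served by
        an ordinary index on the tag column, whereas with the whole
        tagset kept as one string the same lookup would presumably need
        a substring search. Whether that difference matters in practice
        is exactly what I am unsure about.<br>
      </font></font>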
    <pre class="moz-signature" cols="72">-- 
Mikhail Kopotev, PhD, Adj.Prof.
University Lecturer
Department of Modern Languages 
University of Helsinki
<a class="moz-txt-link-freetext" href="http://www.helsinki.fi/~kopotev">http://www.helsinki.fi/~kopotev</a> </pre>
  </body>
</html>