<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<font size="-1"><font face="Verdana">Dear all, <br>
I<font size="-1">'ve got quite a <font size="-1">simple</font>
<font size="-1">question<font size="-1">, <font size="-1">and
I hope the </font></font>answer might be <font
size="-1">equally simple. </font> </font></font><br>
<br>
<font size="-1"><font size="-1"><font size="-1">We are wor<font
size="-1">kin<font size="-1">g with </font></font></font></font></font></font></font><font
size="-1"><font face="Verdana"><font size="-1"><font size="-1"><font
size="-1"><font size="-1"><font size="-1"><font size="-1"><font
size="-1"><font size="-1"><font face="Verdana"><font
size="-1"><font size="-1"><font size="-1"><font
size="-1"><font size="-1"><font
size="-1"><font size="-1"><font
size="-1"><font size="-1"><font
size="-1"><font size="-1"><font
size="-1">n<font
size="-1">-g</font></font>rams<font
size="-1">, </font>which
are stored as<font
size="-1">:</font></font></font></font></font></font></font></font></font></font></font></font></font></font>
<br>
<font size="-1">token1, <font size="-1">lemma1,
tag<font size="-1">set1, </font></font></font><font
size="-1"><font face="Verdana"><font size="-1"><font
size="-1"><font size="-1"><font size="-1"><font
face="Verdana"><font size="-1">token<font
size="-1">2</font>, <font
size="-1">lemma<font size="-1">2</font>,
tag<font size="-1">set<font
size="-1">2, [<font
size="-1">and so<font
size="-1"> on</font></font>]</font></font></font></font></font></font></font></font></font></font></font><br>
</font></font></font></font></font></font></font><br>
I am <font size="-1">wondering if</font> ther<font size="-1">e
<font size="-1">is a <font size="-1">standard</font> way to <font
size="-1">convert<font size="-1"> these</font></font> <font
size="-1"><font size="-1">n</font>-grams<font size="-1"><font
size="-1"> <font size="-1"><font size="-1">into a
datab<font size="-1">ase<font size="-1">?</font></font></font></font></font></font></font></font></font></font></font><font
size="-1"><font face="Verdana"><font size="-1"><font size="-1"><font
size="-1"><font size="-1"><font face="Verdana"><font
size="-1"><font size="-1"><font size="-1"><font
size="-1"></font></font></font></font></font></font></font></font></font><font
size="-1"><font size="-1"><br>
Technically<font size="-1"><font size="-1">, ther<font
size="-1">e is, of course, no problem converting them,<font
size="-1"> but my questi<font size="-1">o<font
size="-1">n is which <font size="-1"><font
size="-1">in<font size="-1">dexes should be
buil<font size="-1">t</font> </font></font><font
size="-1"><font size="-1"><font size="-1">and
what <font size="-1">should</font> be
stored as i<font size="-1">s without</font>
any <font size="-1">optimization. <br>
And more <font size="-1">specifically</font>,
does i<font size="-1">t make <font
size="-1">any s<font size="-1">en<font
size="-1">s</font>e t</font></font>o
keep the whole ta<font size="-1">gset<font
size="-1">s</font>, or is it<font
size="-1"> better </font></font></font>to
store each tag<font size="-1"> </font>separately<font
size="-1">?</font><br>
<br>
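To make the question concrete, here is a minimal sketch of the two layouts I have in mind, using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration only: the table and index names, the sample bigrams, the tag labels, and the idea that the tags inside a tagset are separated by commas. Option A keeps the whole tagset as one string; option B stores each tag in its own row so that it can be indexed.<br>
<pre>
```python
import sqlite3

# Hypothetical sample data: each n-gram is a list of
# (token, lemma, tagset) triples; tags are assumed comma-separated.
NGRAMS = [
    [("dogs", "dog", "NOUN,Plur"), ("bark", "bark", "VERB,Pres")],
    [("dog", "dog", "NOUN,Sing"), ("barks", "bark", "VERB,Pres")],
]

SCHEMA = """
CREATE TABLE position (
    ngram_id INTEGER NOT NULL,
    pos      INTEGER NOT NULL,  -- 1-based slot inside the n-gram
    token    TEXT NOT NULL,
    lemma    TEXT NOT NULL,
    tagset   TEXT NOT NULL,     -- option A: whole tagset stored as is
    PRIMARY KEY (ngram_id, pos)
);
CREATE TABLE position_tag (     -- option B: one row per single tag
    ngram_id INTEGER NOT NULL,
    pos      INTEGER NOT NULL,
    tag      TEXT NOT NULL
);
CREATE INDEX idx_lemma ON position (lemma, pos);
CREATE INDEX idx_tag ON position_tag (tag, pos);
"""


def build(conn):
    """Create both layouts and load the sample n-grams into them."""
    conn.executescript(SCHEMA)
    for ngram_id, ngram in enumerate(NGRAMS, start=1):
        for pos, (token, lemma, tagset) in enumerate(ngram, start=1):
            conn.execute(
                "INSERT INTO position VALUES (?, ?, ?, ?, ?)",
                (ngram_id, pos, token, lemma, tagset),
            )
            # Split the tagset so each tag becomes its own indexed row.
            for tag in tagset.split(","):
                conn.execute(
                    "INSERT INTO position_tag VALUES (?, ?, ?)",
                    (ngram_id, pos, tag),
                )


def ngrams_with_tag(conn, tag, pos):
    """Return ids of n-grams that carry `tag` at slot `pos`."""
    rows = conn.execute(
        "SELECT DISTINCT ngram_id FROM position_tag "
        "WHERE tag = ? AND pos = ? ORDER BY ngram_id",
        (tag, pos),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build(conn)
```
</pre>
With the tags stored separately, a lookup such as ngrams_with_tag(conn, "Plur", 1) can use the index on single tags, whereas with whole tagsets only an exact match on the complete string could be indexed this way.<br>
<br>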
Thank you<font size="-1">!</font><br>
</font></font></font></font></font></font></font></font></font></font></font></font></font>M<font
size="-1">ik<font size="-1">hail Kopotev</font></font><br>
<br>
</font></font>
<pre class="moz-signature" cols="72">--
Mikhail Kopotev, PhD, Adj.Prof.
University Lecturer
Department of Modern Languages
University of Helsinki
<a class="moz-txt-link-freetext" href="http://www.helsinki.fi/~kopotev">http://www.helsinki.fi/~kopotev</a> </pre>
</body>
</html>