<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";
color:black;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0cm;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";
color:black;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Consolas","serif";
color:black;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Courier New";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body bgcolor=white lang=EN-GB link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'>Mike – an economical and pleasing solution!<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'>Thanks<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'>C:<o:p></o:p></span></p><div><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'>--<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'>Dr Christopher Tribble<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'>EMAIL || ctribble@clara.co.uk<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'>WEB || www.ctribble.co.uk <o:p></o:p></span></p></div><p class=MsoNormal><span style='font-size:9.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><div style='border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt'><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b><span lang=EN-US style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'>From:</span></b><span lang=EN-US style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'> corpora-bounces@uib.no [mailto:corpora-bounces@uib.no] <b>On Behalf Of </b>Mike Scott<br><b>Sent:</b> 21 October 2013 17:39<br><b>To:</b> corpora@uib.no<br><b>Subject:</b> Re: [Corpora-List] Wordsmith tag searches of CLAWS 7 Pseudo 
XML corpus<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The problem is WordSmith's handling of mark-up where there are multiple attributes. Hitherto it has only been possible to search on one attribute and, until today, you could only use a limited range of wildcards. As a result of Peter's query, I have found a way of making a single asterisk represent any attribute, just as it can represent a single word.<br>Thus <o:p></o:p></p><p class=MsoNormal><b>prevent* * from</b><br>will find (and previously found) <br><i>... preventing others from reaching ...</i><o:p></o:p></p><p class=MsoNormal>and now<o:p></o:p></p><p class=MsoNormal><b>&lt;w * pos="V*"&gt;giv*</b><br>finds (from today's version (6.0.161) onwards)<br><i>...&lt;w id="123" pos="VV0"&gt;give ...</i><br><i>...&lt;w id="1234" pos="VV0"&gt;gives ...</i><br>etc.<o:p></o:p></p><p class=MsoNormal>Georg's solution is to treat all mark-up as ordinary text, which will suit some uses but not others, as he says. Another solution I considered was to make it easy to remove unwanted mark-up (as opposed to all mark-up) using WordSmith's Text Converter, but in the end it seemed better to make the lone asterisk mean the same as it does outside the mark-up.<br><br>Cheers -- Mike<br><br> <o:p></o:p></p><div><p class=MsoNormal>On 20/10/2013 21:40, Marko, Georg (<a href="mailto:georg.marko@uni-graz.at">georg.marko@uni-graz.at</a>) wrote:<o:p></o:p></p></div><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><pre>Dear Peter,<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>I probably misunderstand the question, but what happens if you delete the "&lt;*&gt;" in "Mark-up to ignore"? It will probably make estimating distances difficult, with all the pieces included in the tags here, but if you look for the core bit - the "VV0", e.g. 
- this should be there (at least it was, when I did a little test with the line you've given as a µ-corpus).<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Simplistic solution, and probably not what you meant, but maybe...<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Best<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Georg<o:p></o:p></pre><pre>________________________________________<o:p></o:p></pre><pre>From: <a href="mailto:corpora-bounces@uib.no">corpora-bounces@uib.no</a> [<a href="mailto:corpora-bounces@uib.no">corpora-bounces@uib.no</a>] on behalf of Peter Saunders [<a href="mailto:peter.saunders@lang.ox.ac.uk">peter.saunders@lang.ox.ac.uk</a>]<o:p></o:p></pre><pre>Sent: Sunday, 20 October 2013 22:01<o:p></o:p></pre><pre>To: <a href="mailto:corpora@uib.no">corpora@uib.no</a><o:p></o:p></pre><pre>Subject: [Corpora-List] Wordsmith tag searches of CLAWS 7 Pseudo XML corpus<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Dear All<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Does anyone know how I can configure WordSmith settings so that it will do tag searches on a CLAWS 7 Pseudo XML tagged corpus? 
Here's a corpus line:<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>&lt;w id="2.5" pos="VV0"&gt;give&lt;/w&gt; &lt;w id="2.6" pos="AT1"&gt;an&lt;/w&gt;<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>I think the id="*" parameter causes problems and I don't know how to strip this part out of tag searches.<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Best<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Peter<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>_______________________________________________<o:p></o:p></pre><pre>UNSUBSCRIBE from this page: <a href="http://mailman.uib.no/options/corpora">http://mailman.uib.no/options/corpora</a><o:p></o:p></pre><pre>Corpora mailing list<o:p></o:p></pre><pre><a href="mailto:Corpora@uib.no">Corpora@uib.no</a><o:p></o:p></pre><pre><a href="http://mailman.uib.no/listinfo/corpora">http://mailman.uib.no/listinfo/corpora</a><o:p></o:p></pre></blockquote><p class=MsoNormal><br><br><o:p></o:p></p><pre>-- <o:p></o:p></pre><pre>Mike Scott<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>***<o:p></o:p></pre><pre>If you publish research which uses WordSmith, do let me know so I can include it at<o:p></o:p></pre><pre><a href="http://www.lexically.net/wordsmith/corpus_linguistics_links/papers_using_wordsmith.htm">http://www.lexically.net/wordsmith/corpus_linguistics_links/papers_using_wordsmith.htm</a><o:p></o:p></pre><pre>***<o:p></o:p></pre><pre>University of Aston and Lexical Analysis Software Ltd.<o:p></o:p></pre><pre><a href="mailto:mike.scott@aston.ac.uk">mike.scott@aston.ac.uk</a><o:p></o:p></pre><pre><a href="http://www.lexically.net">www.lexically.net</a><o:p></o:p></pre></div></div></body></html>