<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    The problem is WordSmith's handling of mark-up where there are

    multiple attributes. Hitherto it has only been possible to search on

    one attribute and, until today, you could only use a limited range

    of wildcards. As a result of Peter's query, I have found a way of

    making a single asterisk represent any attribute, just as it can

    represent a single word.<br>

    Thus <br>

    <blockquote><b>prevent* * from</b><br>

      will find (and previously found) <br>

      <i>... preventing others from reaching ...</i><br>

    </blockquote>

    and now<br>

    <blockquote><b>&lt;w * pos="V*"&gt;giv*</b><br>

      finds (from today's version, 6.0.161, onwards)<br>

      <i>...&lt;w id="123" pos="VV0"&gt;give ...</i><br>

      <i>...&lt;w id="1234" pos="VV0"&gt;gives ...</i><br>

      etc.<br>

    </blockquote>
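    To make the behaviour concrete, here is a rough Python sketch of how such a search string could be read. It is not WordSmith's own code: the function name tag_pattern_to_regex is hypothetical, and the sketch assumes that a lone asterisk stands for exactly one attribute written as name="value", that an asterisk attached to other characters matches any run of characters within that value or word, and that attribute values contain no spaces or angle brackets.<br>

```python
import re


def tag_pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a search such as '<w * pos="V*">giv*' into a regex.

    Illustrative only; the wildcard semantics are assumptions, not
    a description of WordSmith's internals.
    """
    head, sep, word = pattern.partition(">")
    if not sep or not head.startswith("<"):
        raise ValueError("expected a pattern of the form <tag attrs>word")
    tokens = head[1:].split()
    if not tokens:
        raise ValueError("missing tag name")
    name, attrs = tokens[0], tokens[1:]

    pieces = ["<" + re.escape(name)]
    for token in attrs:
        if token == "*":
            # Lone asterisk: one attribute with any name and any value.
            pieces.append(r'\s+[\w:-]+="[^"]*"')
        else:
            # Asterisk inside a token: any characters except a quote or '>'.
            parts = [re.escape(p) for p in token.split("*")]
            pieces.append(r"\s+" + r'[^">]*'.join(parts))
    pieces.append(r"\s*>")
    # Asterisk in the word part: any run of word characters.
    pieces.append(r"\w*".join(re.escape(p) for p in word.split("*")))
    pieces.append(r"(?!\w)")  # stop at the end of the word
    return re.compile("".join(pieces))


rx = tag_pattern_to_regex('<w * pos="V*">giv*')
line = '<w id="2.5" pos="VV0">give</w> <w id="2.6" pos="AT1">an</w>'
hits = [m.group(0) for m in rx.finditer(line)]
```

    Run against Peter's corpus line, the compiled pattern matches the first w element (the verb) and skips the second (the article).<br>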

    Georg's solution is to treat all mark-up as ordinary text, which

    will suit some uses but not others, as he says. Another solution I

    considered was to make it easy to remove unwanted mark-up (as

    opposed to all mark-up) using WordSmith's Text Converter, but in the

    end it seemed better to make the lone asterisk mean the same as it

    does outside the mark-up.<br>
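    For comparison, the alternative of removing unwanted mark-up might look something like the sketch below. This is plain Python, not the Text Converter; strip_attribute is a hypothetical helper, and it assumes double-quoted attribute values.<br>

```python
import re


def strip_attribute(text: str, attr: str) -> str:
    """Delete every attr="..." pair together with the whitespace before it.

    Note: this is a plain text substitution, so a matching string that
    happens to occur outside a tag would be removed as well.
    """
    return re.sub(r"\s+" + re.escape(attr) + r'="[^"]*"', "", text)


line = '<w id="2.5" pos="VV0">give</w> <w id="2.6" pos="AT1">an</w>'
cleaned = strip_attribute(line, "id")
```

    With the id attribute gone, each tag carries only pos, so a single-attribute search would suffice, at the cost of altering the corpus files.<br>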

    <br>

    Cheers -- Mike<br>

    <br>

     <br>

    <div class="moz-cite-prefix">On 20/10/2013 21:40, Marko, Georg

      (<a class="moz-txt-link-abbreviated" href="mailto:georg.marko@uni-graz.at">georg.marko@uni-graz.at</a>) wrote:<br>

    </div>

    <blockquote

cite="mid:F603C3481BBFBB448EF1400F3606509803FC84CEEB@ARTEMIS.pers.ad.uni-graz.at"

      type="cite">

      <pre wrap="">Dear Peter,

I probably misunderstand the question, but what happens if you delete the "&lt;*&gt;" in "Mark-up to ignore"? It will probably make estimating distances difficult, with all the pieces of the tags included, but if you look for the core bit (the "VV0", e.g.), this should be there (at least it was when I did a little test with the line you've given as a µ-corpus).

Simplistic solution, and probably not what you meant, but maybe...

Best

Georg

________________________________________

From: <a class="moz-txt-link-abbreviated" href="mailto:corpora-bounces@uib.no">corpora-bounces@uib.no</a> [<a class="moz-txt-link-abbreviated" href="mailto:corpora-bounces@uib.no">corpora-bounces@uib.no</a>] on behalf of Peter Saunders [<a class="moz-txt-link-abbreviated" href="mailto:peter.saunders@lang.ox.ac.uk">peter.saunders@lang.ox.ac.uk</a>]

Sent: Sunday, 20 October 2013 22:01

To: <a class="moz-txt-link-abbreviated" href="mailto:corpora@uib.no">corpora@uib.no</a>

Subject: [Corpora-List] Wordsmith tag searches of CLAWS 7 Pseudo XML corpus

Dear All

Does anyone know how I can configure Wordsmith settings so that it will do tag searches on a CLAWS 7 Pseudo XML tagged corpus? Here's a corpus line:

&lt;w id="2.5" pos="VV0"&gt;give&lt;/w&gt; &lt;w id="2.6" pos="AT1"&gt;an&lt;/w&gt;

I think the id="*"  parameter causes problems and I don't know how to strip this part out of tag searches.

Best

Peter


</pre>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Mike Scott

***

If you publish research which uses WordSmith, do let me know so I can include it at

<a class="moz-txt-link-freetext" href="http://www.lexically.net/wordsmith/corpus_linguistics_links/papers_using_wordsmith.htm">http://www.lexically.net/wordsmith/corpus_linguistics_links/papers_using_wordsmith.htm</a>

***

University of Aston and Lexical Analysis Software Ltd.

<a class="moz-txt-link-abbreviated" href="mailto:mike.scott@aston.ac.uk">mike.scott@aston.ac.uk</a>

<a class="moz-txt-link-abbreviated" href="http://www.lexically.net">www.lexically.net</a>

</pre>

  </body>

</html>