[Corpora-List] Corpora Digest, Vol 29, Issue 29

Davis, Boyd bdavis at uncc.edu
Sun Nov 29 13:41:44 UTC 2009


Yuri, you'll find a range of Southern English dialect features and phraseology in the online conversations, interviews and narratives at New South Voices, http://newsouthvoices.uncc.edu (a number of which were included in the American National Corpus), and additional examples in the 'narrative of the week' section in Project More's collection for teachers, http://education.uncc.edu/more

Vivien Leigh's dialect coach for GWTW, Dr Charles Hadley, has retired and lives in Charlotte; you may wish to write him c/o Queens University of Charlotte, if you want to ask him about which particular dialect features he chose.

Boyd H. Davis, PhD. | Bonnie E. Cone Professor of Teaching
Professor, Applied Linguistics/English | Professor, Gerontology
UNC Charlotte | 255A Fretwell
9201 University City Blvd | Charlotte NC 28223
Phone *704-687-4209 | Fax *704-687-3961 
http://english.uncc.edu/faculty/80-boyd-h-davis.html
bdavis at uncc.edu | http://webpages.uncc.edu/~bdavis/




-----Original Message-----
From: corpora-bounces at uib.no on behalf of corpora-request at uib.no
Sent: Sun 11/29/2009 6:00 AM
To: corpora at uib.no
Subject: Corpora Digest, Vol 29, Issue 29
 
Today's Topics:

   1. Are there any peculiar features in Southern US English?
      (Yuri Tambovtsev)
   2. Re: Are there any peculiar features in Southern US English?
      (Angus B. Grieve-Smith)
   3. Re:  Open source multilingual syntactic parser (pablo gamallo)


----------------------------------------------------------------------

Message: 1
Date: Sun, 29 Nov 2009 02:40:30 -0800
From: "Yuri Tambovtsev" <yutamb at mail.ru>
Subject: [Corpora-List] Are there any peculiar features in Southern US
	English?
To: <corpora at uib.no>

Dear Corpora colleagues, Are there any peculiar features in Southern US English, first of all in its pronunciation? Do you think that GONE WITH THE WIND can give some data on that? I look forward to hearing from you at yutamb at mail.ru. Yours sincerely, Yuri Tambovtsev, Novosibirsk, Russia

------------------------------

Message: 2
Date: Sat, 28 Nov 2009 17:29:53 -0500
From: "Angus B. Grieve-Smith" <grvsmth at panix.com>
Subject: Re: [Corpora-List] Are there any peculiar features in
	Southern US English?
To: corpora at uib.no

Yuri Tambovtsev wrote:
>
> Dear Corpora colleagues, Are there any peculiar features in Southern 
> US English? First of all in its pronunciation. Do you think that GONE 
> WITH THE WIND can give some data on that? Looking forward to hearing 
> from you to yutamb at mail.ru <mailto:yutamb at mail.ru>  Yours sincerely 
> Yuri Tambovtsev, Novosibirsk, Russia
>
    Yuri, while the book may be able to tell us something about Southern 
English, its perspective is relatively limited.  I suggested to Kate the 
work of Walt Wolfram and his associates at North Carolina State 
University; they've done hundreds of hours of interviews, and hopefully 
some of that is available for corpus study.

    As I found when living in North Carolina, the diversity of features 
in even a small area can be astounding.  You can get a sense of this 
from the videos that Wolfram's group has posted to YouTube.  They're all 
good, but I particularly recommend the "Ocracoke Brogue excerpt," 
"African American English," and the two "Lumbee English" videos.

http://www.youtube.com/user/NCLLP#g/u

    And that's just North Carolina!  If there were thirteen (or more, 
depending on how you count) linguists with the skills and resources that 
Wolfram has brought to North Carolina English, they could do the same 
for every other state in the South.  For the Civil War period, we have a 
corpus of letters to draw on:

http://etext.virginia.edu/civilwar/

    In comparison, Mitchell's book just tells us about the speech of 
upper class Atlantans and their slaves as she imagines it to have been 
forty years before her birth. It needs to be approached with the same 
skepticism required for analyzing any work of fiction.

-- 
				-Angus B. Grieve-Smith
				grvsmth at panix.com




------------------------------

Message: 3
Date: Sat, 28 Nov 2009 21:43:04 -0100
From: pablo gamallo <pablo.gamallo at gmail.com>
Subject: Re: [Corpora-List] Open source multilingual syntactic parser
To: corpora at uib.no, linasvepstas at gmail.com

Thanks, Linas, for your comments and suggestions. I'll try to reply to
some of your questions below:


> Quoting Linas Vepstas <linasvepstas at gmail.com>:

> 2009/11/27 pablo gamallo <pablo.gamallo at usc.es>:
>>
>> DepPattern is available with GPL license at:
>> http://gramatica.usc.es/pln/tools/deppattern.html
>
> Thanks!
>
> A quick glance suggests that this parser is generating
> dependencies that are similar to, but different from those
> of other dependency parsers.   Is there any effort anywhere
> to  standardize on the set of dependencies generated?

The DepPattern toolkit is provided not only with specific parsers
but also with a tutorial on writing formal grammars that are compiled into
parsers. Names of dependencies are declared in a configuration file,
"dependencies.conf". So in this file you can define whatever set of
dependency labels you want to use when writing the grammar rules.


> I maintain a rule-based dependency parser (RelEx) and
> recently added a "Stanford Parser compatibility mode"
> because the RelEx dependencies are slightly different,
> and, because from an engineering standpoint, compatibility
> is something that users like.  (And, yes, I actually learned a lot
> by looking at how these two systems differed.)
>
> I wrote up what I found here:
>
> http://opencog.org/wiki/Dependency_relationship
>
> which describes RelEx, and how it differs from the
> Stanford parser (and from MiniPar)

Thanks for the link!


> I would be vaguely interested in creating a "DepPattern"
> compatibility mode, if that was the right thing to do --
> is it?  But perhaps it would be better if all dependency
> parsers moved to a common set of dependencies and
> feature sets?

Thanks for your interest! A common set of labels and features would be
useful for building further applications (Information Extraction,
Question-Answering...) based on standardized dependencies.
Yet, as you say in your wiki, parser outputs differ in deeper
ways than just dependency labeling. For instance, your RelEx system
aims at grasping the semantic content of sentences, and not just a
literal syntactic structure. Using the DepPattern formalism, it is
possible to write either more syntax-oriented grammars or more
semantically motivated rules (as in Constraint Grammar and Link
Grammar, I guess). For instance, with the DepPattern formalism,
you have the choice of generating a prepositional object and a
prepositional complement, or of collapsing both of these into a single
prepositional relation, with the preposition linking the head and the
modifier/object. Following your example, the expression "go to the
store" can be analyzed either as:

pobj (go, store)
pcomp (go, to)

or:

to(go, store)
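
The collapsing step between these two analyses can be sketched in a few
lines of Python. This is only an illustration, not DepPattern's or RelEx's
actual implementation: the function name collapse_prepositions and the
(label, head, dependent) tuple representation are assumptions made for this
example, and it assumes each head carries at most one prepositional phrase.

```python
from typing import List, Tuple

# Hypothetical representation: (label, head, dependent)
Dep = Tuple[str, str, str]


def collapse_prepositions(deps: List[Dep]) -> List[Dep]:
    """Merge pcomp(head, prep) + pobj(head, obj) into prep(head, obj).

    Assumes at most one prepositional phrase per head; this is a
    simplification for illustration only.
    """
    # Preposition attached to each head through a pcomp relation
    preps = {head: dep for label, head, dep in deps if label == "pcomp"}
    # Heads that also have a prepositional object to merge with
    heads_with_pobj = {head for label, head, _ in deps if label == "pobj"}

    collapsed: List[Dep] = []
    for label, head, dep in deps:
        if label == "pcomp" and head in heads_with_pobj:
            continue  # absorbed into the collapsed relation
        if label == "pobj" and head in preps:
            # The preposition itself becomes the relation label
            collapsed.append((preps[head], head, dep))
        else:
            collapsed.append((label, head, dep))
    return collapsed


if __name__ == "__main__":
    print(collapse_prepositions([("pobj", "go", "store"), ("pcomp", "go", "to")]))
```

Relations other than pobj and pcomp pass through unchanged, and a pcomp
with no matching pobj is kept as it is rather than silently dropped.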

I think a tricky question the research community should address is the
following: given a particular NLP application (word similarity
extraction, question-answering...), what types of dependencies are
required to improve the application's results?



> Is there a more detailed description of DepPattern's
> dependency output? It is hinted at in section 1.8.1 of
> the user guide -- features, such as lemma, number,
> person, tense, genre, possessor, politeness, type -- the
> first 4 I can guess, the last 4 are ???


Up to now, the tutorial (http://gramatica.usc.es/pln/tools/tutorialGrammar.pdf),
not the user guide, is the most accurate description of the formalism
and the output of the parser. Morpho-syntactic features are described
in section 1.3.2. They are based on those used by FreeLing, which are
based in turn on the Eagles project.


All the best,
Pablo

Pablo Gamallo Otero
Departamento de Língua Espanhola
Faculdade de Filologia
Campus Universitário Norte
15782 Santiago de Compostela
Espanha / Spain

phone: (+34) 981 563100, ext. 11761
Fax:   (+34) 981 574646
pablo.gamallo at usc.es
http://gramatica.usc.es/~gamallo/



----------------------------------------------------------------------
Send Corpora mailing list submissions to
	corpora at uib.no

To subscribe or unsubscribe via the World Wide Web, visit
	http://mailman.uib.no/listinfo/corpora
or, via email, send a message with subject or body 'help' to
	corpora-request at uib.no

You can reach the person managing the list at
	corpora-owner at uib.no

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Corpora digest..."


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


End of Corpora Digest, Vol 29, Issue 29
***************************************
