<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div><b><font size="3" color="purple" face="Arial"><span style="font-size:12.0pt;font-family:Arial;color:purple;
font-weight:bold">Research Centre for <st1:personname w:st="on">Linguistic
 Typology</st1:personname> Seminar:</span></font></b></div><blockquote type="cite"></blockquote><b><font size="3" color="purple" face="Arial"><span style="font-size:12.0pt;font-family:Arial;color:purple;
font-weight:bold"><o:p> </o:p></span></font></b><br><blockquote type="cite"></blockquote><b><font size="2" color="purple" face="Arial"><span style="font-size:10.0pt;font-family:Arial;color:purple;
font-weight:bold"><o:p> </o:p></span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold">Who:   
 <font color="navy"><span style="color:navy">Andrew Margetts (<st1:placename w:st="on">Monash</st1:placename> <st1:placetype w:st="on">University</st1:placetype>)
and Dr <st1:personname w:st="on">Anna Margetts </st1:personname>(<st1:place w:st="on"><st1:placename w:st="on">Monash</st1:placename> <st1:placetype w:st="on">University</st1:placetype></st1:place>
and RCLT)</span></font></span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold">What:  <font color="navy"><span style="color:navy">  Enhancing a text collection with a
document-oriented database model: a Toolbox based example (Andrew Margetts)</span></font></span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" color="navy" face="Arial"><span style="font-size:12.0pt;font-family:Arial;color:navy;
font-weight:bold">               Filming with native speaker commentary: making the most of filming for the
community</span></font></b><b><font size="4" color="#333399" face="Calibri"><span style="font-size:14.0pt;font-family:Calibri;color:#333399;font-weight:bold"> (Dr
Anna Margetts)             </span></font></b><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="color: rgb(0, 0, 128); font-family: Calibri; font-size: 16px; ">              </span><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold">When:
  3:30 - 5:00pm, Thursday, <font color="navy"><span style="color:navy">27</span></font>
January, 2011</span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold"><o:p> </o:p></span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold">Where: Reading
Room, Research Centre for <st1:personname w:st="on">Linguistic Typology</st1:personname>,</span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold">                   Building NR6, La <st1:place w:st="on"><st1:placename w:st="on">Trobe</st1:placename>
 <st1:placetype w:st="on">University</st1:placetype></st1:place>, Bundoora</span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold">                   Map on Melways: 19 G5</span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold">                   Map on RCLT website:</span></font></b> <b><font face="Arial"><span style="font-family:Arial;font-weight:bold"><a href="http://www.latrobe.edu.au/rclt/location.htm">http://www.latrobe.edu.au/rclt/location.htm</a></span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold"><o:p> </o:p></span></font></b><br><blockquote type="cite"></blockquote><b><font size="3" face="Arial"><span style="font-size:12.0pt;font-family:Arial;font-weight:bold"><o:p> </o:p></span></font></b><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 21px; line-height: 31px; ">Abstract:</span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-size: 21px; font-weight: bold; "><b><font size="2" face="Arial"><span style="font-size:10.0pt">Enhancing a text
collection with a document-oriented database model: </span></font></b></span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-size: 21px; font-weight: bold; "><b><font size="2" face="Arial"><span style="font-size:10.0pt">a Toolbox based example</span></font></b></span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; ">As data-sets grow in complexity it is common to expand them
from a flat-file to a relational database. This approach offers many
advantages: new types of questions can be asked and integrity of the data can
be ensured. But there are also costs. The process of converting to this model
– i.e. 'normalizing' the data – can be very involved, and the
resulting data is difficult to interpret except through the database software.
Speed of use can also suffer since many processes are only performed when a
query is actually run.</span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; ">An alternative approach is to expand the utility of the
original database through simple scripts which augment the primary data-set by
feeding relevant information from related sets. The result is an enriched,
semi-structured document which remains readable to the human eye, yet is
capable of handling complex queries comparable to those achievable through
Structured Query Language (SQL) and a relational database. In fact some very
complicated data relationships become easier to model than in a relational
database.</span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; ">This paper looks at the document-oriented model through the
development of a typical Toolbox text collection. It draws on a sample Toolbox
project containing interrelated data-sets, plus a set of scripts for
manipulating the data. I explain the process of feeding supplementary data to
the main data-set, and demonstrate some typical queries. I also discuss one
situation where the model is superior to a relational database due to the
intricacy of the relationships. I conclude with a brief demonstration of
importing the data to MongoDB, a versatile document-oriented database.</span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; ">Far from being merely a way to avoid building a full-blown
relational database, this model is increasingly being used for certain
large-scale applications, particularly on the web. The reasons include speed of
read/write operations, a more intuitive data model (which implies quicker
set-up, revision and maintenance), and the ability to scale (i.e. become larger)
without problems. It is much easier to translate a Toolbox database to such a
system than to an equivalent relational database, and so this provides a
straightforward path to exposing the data on the web and adding functionality
not available in Toolbox.</span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: Arial; font-size: 13px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-size: 21px; font-weight: bold; "><b><font size="4" face="Arial"><span style="font-size:13.0pt">Filming with
native speaker commentary: </span></font></b></span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-size: 21px; font-weight: bold; "><b><font size="4" face="Arial"><span style="font-size:13.0pt">making the most of filming for the community</span></font></b></span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: 'Times New Roman'; font-size: 16px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: 'Times New Roman'; font-size: 16px; ">Linguistic fieldworkers often feel the tension of different
expectations placed on them by different parties. On one side are the demands
of writing a thesis or other academic work based on the fieldwork. Time and
funding for fieldwork and research are limited and there is pressure from
funding agencies and universities to deliver. On the other side there is a
justified expectation (not least from the researchers themselves) that
there should be a clear benefit to the community. Making materials produced for
the community valuable for linguistic research and vice versa can reduce the
tension between conflicting demands and make for more productive fieldwork.</span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: 'Times New Roman'; font-size: 16px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: 'Times New Roman'; font-size: 16px; ">In this paper I discuss
our experiences in enhancing materials created for the language community so
that they were valuable as linguistic recordings. The community we were working
with had a prime interest in the video documentation of certain events that
were basically of no interest from a linguistic perspective. In one instance we
were asked to film a soccer match and we invited a speaker to provide a running
commentary on the match. This strategy proved a success and provided a stream
of spontaneous spoken language for analysis and a new text type for our
database. It also enhanced the value of the recording for the community. We
applied this technique in another context with similar success. I discuss the
data collected, the limitations of this method, and the equipment and recording
set-up.</span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: 'Times New Roman'; font-size: 16px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="font-family: 'Times New Roman'; font-size: 16px; "> </span><br><blockquote type="cite"></blockquote><span class="Apple-style-span" style="color: rgb(0, 0, 128); font-family: Calibri; font-size: 16px; line-height: 24px; "> </span><br><o:smarttagtype namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="PlaceType"><o:smarttagtype namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="PlaceName"><o:smarttagtype namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="place"><o:smarttagtype namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="PersonName"><div lang="EN-AU" link="blue" vlink="purple"><div class="Section1">




</div>

</div>


</o:smarttagtype></o:smarttagtype></o:smarttagtype></o:smarttagtype></div><br></body></html>