28.3517, Review: Computational Linguistics; General Linguistics; Text/Corpus Linguistics: Prieto (2017)

The LINGUIST List linguist at listserv.linguistlist.org
Thu Aug 24 17:43:15 UTC 2017


LINGUIST List: Vol-28-3517. Thu Aug 24 2017. ISSN: 1069 - 4875.

Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Helen Aristar-Dry, Robert Coté,
                                   Michael Czerniakowski)
Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           http://funddrive.linguistlist.org/donate/

Editor for this issue: Clare Harshey <clare at linguistlist.org>
================================================================


Date: Thu, 24 Aug 2017 13:43:10
From: Adriana Picoral [adrianaps at email.arizona.edu]
Subject: Text linguistics for the contrastive study of online customer comments

 
Discuss this message:
http://linguistlist.org/pubs/reviews/get-review.cfm?subid=36292537


Book announced at http://linguistlist.org/issues/28/28-415.html

AUTHOR: Raul Sanchez Prieto
TITLE: Text linguistics for the contrastive study of online customer comments
SUBTITLE: Text-linguistic patterns in German, Dutch, Spanish and French hotel comments and reviews
SERIES TITLE: Studien zur kontrastiven deutsch-iberoromanischen Sprachwissenschaft
PUBLISHER: Narr Francke Attempto Verlag GmbH + Co. KG
YEAR: 2017

REVIEWER: Adriana Picoral, University of Arizona

REVIEWS EDITOR: Helen Aristar-Dry

SUMMARY

The explicitly stated target audience of this volume includes both linguists
and hotel businesses. The first group is to be catered to by a practical
demonstration of text linguistics tools through a contrastive analysis of the
linguistic patterns of hotel customer comments and reviews found online. The
second group is to take advantage of the analysis findings.

The book comprises four chapters. The first chapter is very brief, with its
main purpose being a quick introduction to the field of text-linguistics and a
description of what the study at hand encompasses.  The second chapter defines
the textual genre being analyzed, i.e., online comments written by hotel
customers divided into three subcategories according to where they were
retrieved from: 1) online travel and hotel booking websites (i.e.,
Booking.com, Expedia and TripAdvisor); 2) a social networking website (i.e.,
Facebook hotel pages); and 3) video-sharing (i.e., YouTube) and Wiki
discussion pages. For each one of these subcategories, the author describes
the text actions, situationality, external structure, and wording patterns.

Text actions are actions realized through the text, “according to or related
to the conventions of a speech community's members” (Sandig, 1990, p. 91).
Examples of text actions the author gleaned from the corpus are: describing
the room and the hotel premises; assessing the cleanliness; and commenting on
the performance of hotel staff. These actions can assume two types of text
functions: informative (i.e., the main purpose is to describe something or
provide information) and appellative (i.e., the main purpose is to influence
the reader in a given way through the expression of personal preferences). 

Somewhat similar to Corpus Linguistics’ situational characteristics, where
situation of use is described (Biber & Conrad, 2009), Prieto details the
situationality of each text type by specifying three aspects of the situation:
1) the channel and the communicative form (i.e., whether these are private,
half-private or universally accessible texts); 2) the superficial text
structure (e.g., font design, size and color; background color); and 3) the
visual text structure (e.g., presence of an avatar; use of country flags;
visual representation of the score awarded to the hotel by the reviewer). 

External structure, which according to Prieto differs from text structure, is
also addressed for each of the three text types. Titles, information about the
reviewer, information about the hotel stay, and score are examples of features
that define the external structure of the text.

Finally, wording patterns are grammatical, syntactic, morphological and lexical
features. Here, the author offers a broad overview of structures present in
each text type, such as pronominal forms used (e.g., the first person and
anaphoric references are more common in booking websites), lexical items
(e.g., the words “hotel” and “room” are most often repeated), deixis (e.g.,
absolute spatial deictic structures are common), and morphological structures,
among others. For each text type, a general wording pattern comparison among
six languages (i.e., Dutch, French, German, Italian, Portuguese and Spanish)
is also provided.

Chapter 2 closes with a detailed description of the two corpora described and
analyzed in this volume. A smaller corpus of the six languages already
mentioned, containing 1,800 comments total, is used in a more qualitative
approach, to illustrate examples and concepts throughout the book. A second
larger corpus in 4 of the 6 languages (i.e., Dutch, French, German and
Spanish), containing a total of 2,000 online comments, is used for the
quantitative analysis presented in the third chapter.

By far the longest chapter in the volume, Chapter 3 presents the mainly
quantitative analysis of the online comments from a contrastive point of view.
The analysis is divided into two main parts: 1) analysis of the communicative
macrostructure, focusing on the text actions and text functions discussed in
chapter two, offering some expansion on text functions; and 2) analysis of the
text-grammatical structures, which were also introduced in Chapter 2, but are
expanded and discussed in detail in Chapter 3. As was the case in the previous
chapter, each text action, text function, and text-grammatical structure is
appropriately illustrated with a comment retrieved from the corpus. Counts and
percentages of each feature are presented in summary tables, and comparisons
are drawn in prose based on similarities and differences in percentages among
the 4 languages in the corpus. 

The last chapter in the book, Chapter 4, presents a short conclusion that
summarizes the findings presented and discussed in Chapter 3. Although Prieto
reiterates that relevant differences among the four languages could not be
found for text actions and functions, the author claims the German and Dutch
corpora contain more instances of “describing breakfast choices” and
“commenting of quietness and privacy”, while the French and Dutch corpora
present more cases of “recommending or discouraging a stay at a given hotel”,
and the Spanish corpus consists of a larger number of occurrences of
“indicating parking availability or commenting on parking-related problems.”
Regarding text-grammatical structures, the conclusion is that the Spanish
corpus presents higher occurrence of referential impersonal pronouns and lower
incidence of first person pronouns. The author also claims that, when writing
comments on Facebook, French users opt to still use the formal manner of
address, while users in the other three languages choose other deictic devices
for person.  Finally, conjunctions are said to be more common as a connective
element in the Spanish corpus, and subordinating conjunctions are more
frequently used by both German and Spanish customers. Limitations are not
discussed.

EVALUATION

This manuscript is, in general, well organized and clearly written. Regarding
target audience, linguists familiar with the field would not have problems
following the examples and the analysis, if they also possess some reading
skills in all 6 languages that comprise the corpora analyzed. It would be a
more difficult read for hotel owners and managers. Many linguistic terms such
as anaphoric and cataphoric are not immediately defined. Working definitions
for some less well-established terms such as verbal style are also not
provided. In addition, the author uses terms that are not frequently deployed
in the field. For example, when describing the situationality of the text
types, the term perigraphemic is used with no definition provided. A citation
is given (i.e., Schütte, 2004, p. 94), but a quick search suggests that this is
the only other reference (besides the present volume) where this term is used.
Providing explicit clear definitions for terms used would not only help
readers, but would encourage future researchers to maintain consistency in the
field by making use of the same term-definition dyads. 

Comments retrieved from the corpora to illustrate concepts and terms are
always shown in the original language with no translation to English. While
the body of the text is in English, citations from sources in German pepper
the manuscript, with no translation or rephrasing provided. While this type of
multilingual approach does not affect global understanding, more nuanced
perspectives are lost for those who do not read German (or any of the other 5
languages used). 

Regarding the quantitative analysis itself, which takes up most of the book’s
content, only descriptive statistics are used to summarize count data into
percentages. Prieto justifies this approach by stating this is an “efficient
statistical method that can be applied to assess the differences in the
multilingual use of text actions” (p. 50). This claim is not completely
accurate, since without inferential statistical methods no assertions can be
made about whether the observed differences are statistically significant.
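
To make the distinction concrete, the following is a minimal sketch of one
inferential test that could accompany such percentage tables: a Pearson
chi-square test of homogeneity across the four languages. It is not the
method used in the book. The counts, the assumption of 500 comments per
language, and the function name chi_square_statistic are all hypothetical
and serve only as illustration.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a rows-by-columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the feature were equally frequent in all rows.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# HYPOTHETICAL counts, not taken from the book: for each language,
# (comments containing a given text action, comments without it).
observed_counts = {
    "Dutch": (60, 440),
    "French": (45, 455),
    "German": (75, 425),
    "Spanish": (40, 460),
}

table = [list(counts) for counts in observed_counts.values()]
stat = chi_square_statistic(table)
df = (len(table) - 1) * (len(table[0]) - 1)

# Standard critical value of the chi-square distribution for
# 3 degrees of freedom at the 0.05 significance level.
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815
differs = stat > CRITICAL_VALUE_DF3_ALPHA_05

print(f"chi-square = {stat:.3f}, df = {df}, significant at 0.05: {differs}")
```

With counts of this kind, the percentages alone (12%, 9%, 15% and 8%) look
different, but only the test statistic indicates whether such a spread is
larger than what sampling variation alone would be expected to produce.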

Some questions also remain regarding the methods used. It is not clear whether
the data were coded only by the author or by several coders, in which case
agreement rates and a discussion of any disagreements among coders would be
expected. Prieto also fails to address the representativeness of the corpus
compiled, or whether the sample collected is an adequate representation of the
text types and the language varieties being analyzed (McEnery, Xiao & Tono,
2006). In fact, the author justifies the suitability of the booking website
corpus by explaining that “the first ten positive comments from the five
highest-rated hotels […] and the first ten negative comments from the five
lowest-rated hotels” (p. 37) were collected. There is no discussion of whether
findings from this type of sample would hold for all online reviews on this
type of website, or whether simple or stratified random sampling (McEnery, Xiao &
Tono, 2006) would be more appropriate. 
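
The two methodological points raised above can be sketched briefly. The code
below is an illustration under stated assumptions, not a reconstruction of
the author's procedure: cohen_kappa computes chance-corrected agreement
between two coders, and stratified_sample draws the same number of comments
at random from each stratum (here, each star rating). The function names,
the labels, and the review data are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(labels_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


def stratified_sample(items, stratum_of, per_stratum, seed=0):
    """Draw up to per_stratum items at random from every stratum."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = []
    for key in sorted(strata):
        group = strata[key]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


# HYPOTHETICAL text-action codes assigned by two coders to eight comments.
coder_a = ["breakfast", "parking", "staff", "staff",
           "breakfast", "parking", "staff", "breakfast"]
coder_b = ["breakfast", "parking", "staff", "breakfast",
           "breakfast", "parking", "staff", "staff"]
kappa = cohen_kappa(coder_a, coder_b)

# HYPOTHETICAL reviews as (review_id, star_rating) pairs, ten per rating.
reviews = [(review_id, 1 + review_id % 5) for review_id in range(50)]
sample = stratified_sample(reviews, lambda review: review[1], per_stratum=3)

print(f"kappa = {kappa:.3f}, sampled {len(sample)} reviews")
```

Sampling within every rating level, rather than only from the highest-rated
and lowest-rated hotels, is one way to let mid-range reviews enter the corpus.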

All in all, I would agree that this volume is a good reference on text
linguistic research and practice. The text linguistic tools are clearly
explained and the analysis is well grounded in theory, but the description of
the methods, with respect to both corpus compilation and statistical analysis,
is lacking. Nevertheless, some of the findings are indeed compelling and
extremely relevant to hotel owners and managers. 

REFERENCES

Biber, D., & Conrad, S. (2009). Register, genre, and style. Cambridge
University Press.

McEnery, T., Xiao, R., & Tono, Y. (2006). Corpus-based language studies: An
advanced resource book. Taylor & Francis.

Sandig, B. (1990). Holistic linguistics as a perspective for the nineties.
Text-Interdisciplinary Journal for the Study of Discourse, 10(1-2), 91-96.


ABOUT THE REVIEWER

Adriana Picoral is a PhD student in the Second Language Acquisition and
Teaching program at the University of Arizona. Her research interests include
corpus linguistics, computational linguistics, and technology-enhanced
language teaching. Her work includes research on pedagogical practices based
on corpus analysis and learner analytics, such as open-learner models.




