[Corpora-List] Quotable Statistics on Unstructured Data on the WWW
Trevor Jenkins
trevor.jenkins at suneidesis.com
Fri Dec 6 13:17:55 UTC 2013
On 6 Dec 2013, at 12:12, Daniel Gerber <dgerber at informatik.uni-leipzig.de> wrote:
> Hello Adam,
>
> On 06.12.2013, at 12:45, Adam Kilgarriff <adam at lexmasterclass.com> wrote:
>
>> I always squirm when I hear text referred to as unstructured data. (Daniel - I see you do too, from the '(semi-)'.) It feels like a teenager declaring everyone over 25 as old.
>
> How do you see text, then? Yes, I typically refer to text as unstructured, tables and so on as semi-structured, and databases as structured.
I can't speak for Adam Kilgarriff, but I see text as structured: individual glyphs form words, words form sentences, sentences form paragraphs, paragraphs form chapters, and chapters form books. And there are a variety of similar structures.
I see databases and their internal tables as overly restrictive, built on a highly biased perception of data and information. Relational databases are not the only solution. I worked for many years as a "database" consultant who could just as easily recommend a text database as a relational, hierarchical or network solution. One of these database organisations may be a better "fit" in a particular situation, but relationalism is /not/ the panacea certain software suppliers sell it to be.
Regards, Trevor.
<>< Re: deemed!