22.812, Jobs: Computational Linguistics: Programmer, Tioga Lake Consulting

linguist at LINGUISTLIST.ORG
Fri Feb 18 10:54:03 UTC 2011


LINGUIST List: Vol-22-812. Fri Feb 18 2011. ISSN: 1068 - 4875.

Subject: 22.812, Jobs: Computational Linguistics: Programmer, Tioga Lake Consulting

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
 
Reviews: Veronika Drake, U of Wisconsin-Madison  
Monica Macaulay, U of Wisconsin-Madison  
Eric Raimy, U of Wisconsin-Madison  
Joseph Salmons, U of Wisconsin-Madison  
Anja Wanner, U of Wisconsin-Madison  
       <reviews at linguistlist.org> 

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Erin Smith <erin at linguistlist.org>
================================================================  

The LINGUIST List strongly encourages employers to engage in non-discriminatory 
hiring practices. We urge employers not to discriminate on the grounds of race, 
ethnicity, nationality, disability, age, religion, gender, or sexual orientation.
However, we have no means of enforcing these standards.

Job seekers should pay special attention to language in ads regarding
employment requirements and are encouraged to consult our international
employment page http://linguistlist.org/jobs/jobnet.html. This page has been set 
up so that people can report on the employment standards of various countries.

To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.cfm.

===========================Directory==============================  

1)
Date: 15-Feb-2011
From: Brian Buchanan [brian.buchanan at gmail.com]
Subject: Computational Linguistics: Programmer, Tioga Lake Consulting, LLC
 

	
-------------------------Message 1 ---------------------------------- 
Date: Fri, 18 Feb 2011 05:53:03
From: Brian Buchanan [brian.buchanan at gmail.com]
Subject: Computational Linguistics: Programmer, Tioga Lake Consulting, LLC

  


University or Organization: Tioga Lake Consulting, LLC 
Job Rank: Programmer  

Specialty Areas: Computational Linguistics 


Description:

Contract Developer - Web Content Crawler Project

Contract project -- 12 weeks minimum

Seeking a programmer or computational linguist to create heuristics for
extracting certain information from small/medium business websites, such as
business name, description, contact information, hours of operation,
restaurant menus, etc.

The web crawling / HTML parsing framework for this project is already in
place.  The objective of this contract is the development and refinement of
the actual code for extracting the required data from each website and
converting it to a structured format.
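
For illustration only, and not taken from the project's actual framework,
here is a minimal JavaScript sketch of what one such extraction heuristic
might look like. The function name extractContactInfo, the regular
expressions, and the output field names are all assumptions made for this
example. The sketch also assumes the page has already been reduced to plain
text by the existing parsing layer.

```javascript
// Hypothetical heuristic: find a US-style phone number and an "Hours:" line
// in plain page text and return them as a structured record.
function extractContactInfo(pageText) {
  // Matches forms such as (555) 123-4567, 555-123-4567, or 555.123.4567.
  const phoneMatch = pageText.match(/\(?\b\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b/);
  // Captures the rest of the line after "Hours:" (case-insensitive).
  const hoursMatch = pageText.match(/\bhours?\s*:\s*(.+)/i);
  return {
    phone: phoneMatch ? phoneMatch[0] : null,
    hours: hoursMatch ? hoursMatch[1].trim() : null,
  };
}

// Example input, invented for this sketch.
const samplePage = "Joe's Diner\nCall us at (555) 123-4567\nHours: Mon-Fri 9am-5pm\n";
console.log(extractContactInfo(samplePage));
```

Fields the heuristic cannot find come back as null rather than a guess, so
that later comparison against hand-labelled data stays unambiguous.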

To be considered for this contract, a candidate need not be an expert
programmer; however, basic programming ability and familiarity with
JavaScript or Ruby are required.  The ideal candidate has worked on a 
previous web crawling project involving data aggregation and already has an
intuitive sense about how to approach this problem.

Job description:
-Review websites and create training data set.
-Program heuristics to extract specified data items from web pages and
perform aggregate analysis of websites.
-Load heuristics into website analyzer and run analyzer on the training
data set.
-Compare output of website analyzer with expected results from the training
set.
-Identify problems with the heuristics and examine the affected websites to
determine the cause.
-Develop new possible heuristics for improving the accuracy of the website
analyzer.
-Program new heuristics and repeat the review process.
-Create algorithms for estimating the accuracy of each heuristic for a
given webpage.
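
As a rough illustration of the compare-and-review steps above, the following
JavaScript sketch runs a set of heuristics over a small hand-labelled
training set, reports the fraction of pages each one gets right, and keeps
the failing pages for inspection. The training data, the heuristic names,
and the scoreHeuristics function are hypothetical and invented for this
example; they are not part of the project's website analyzer. Note that this
measures accuracy over the training set, not a per-page confidence estimate.

```javascript
// Hypothetical training set: page text paired with hand-labelled values.
const trainingSet = [
  { text: "Call 555-123-4567 today", expected: { phone: "555-123-4567" } },
  { text: "Phone: 555.987.6543", expected: { phone: "555.987.6543" } },
  { text: "No phone listed; fax 555-000-1111", expected: { phone: null } },
];

// Hypothetical heuristics: each maps page text to a value, or null if none.
const heuristics = {
  // Takes the first phone-shaped number anywhere on the page.
  firstPhoneLikeNumber: (text) => {
    const m = text.match(/\b\d{3}[-.]\d{3}[-.]\d{4}\b/);
    return m ? m[0] : null;
  },
  // Only accepts a number that closely follows "call" or "phone".
  phoneAfterLabel: (text) => {
    const m = text.match(/(?:call|phone)\D{0,10}(\d{3}[-.]\d{3}[-.]\d{4})/i);
    return m ? m[1] : null;
  },
};

// Compare each heuristic's output with the expected value for one field.
function scoreHeuristics(candidates, examples, field) {
  const results = {};
  for (const [name, fn] of Object.entries(candidates)) {
    const failures = [];
    for (const ex of examples) {
      const got = fn(ex.text);
      if (got !== ex.expected[field]) {
        failures.push({ text: ex.text, got, expected: ex.expected[field] });
      }
    }
    const accuracy = examples.length
      ? (examples.length - failures.length) / examples.length
      : 0;
    results[name] = { accuracy, failures };
  }
  return results;
}

const report = scoreHeuristics(heuristics, trainingSet, "phone");
console.log(JSON.stringify(report, null, 2));
```

On this toy data the looser heuristic mistakes the fax number for a phone
number, which is the kind of failure the review step is meant to surface
before a refined heuristic is written and the cycle repeats.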

Required qualifications:
-Working competence with the JavaScript programming language
-Excellent working knowledge of web technologies (HTML, etc.)
-Experience working with structured data & aggregation

Desired qualifications:
-Background in statistics and/or machine learning (e.g. Bayesian filtering)
-Background in computational linguistics
-Working knowledge of Ruby, C++, or at least one other programming language
-Experience with UNIX command-line tools (e.g. using the command shell on
Linux or Mac OS X)


Application Deadline: 28-Feb-2011 
	  
Email Address for Applications: brian.buchanan at gmail.com 
Contact Information:
	Brian Buchanan 
	Email: brian.buchanan at gmail.com 





	


