Return-Path: 
X-Original-To: job-opps-relayxyz-outgoing
Delivered-To: job-opps-relayxyz-outgoing@cs.swarthmore.edu
Received: by allspice.cs.swarthmore.edu (Postfix, from userid 1442)
	id A49111FF5C; Wed, 25 Jan 2006 17:19:55 -0500 (EST)
X-Original-To: job-opps@cs.swarthmore.edu
Delivered-To: job-opps@cs.swarthmore.edu
From: "Charles Kelemen" 
Date: Wed, 25 Jan 2006 17:19:55 -0500
To: job-opps@cs.swarthmore.edu
Subject: [JOB OPP] [sporterfield@jhu.edu: CLSP Summer Workshop Opportunity]
Message-ID: <20060125221955.GA4839@cs.swarthmore.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.9i
Sender: owner-job-opps@cs.swarthmore.edu
Precedence: bulk
Reply-To: "Charles Kelemen" 

Rich knows all about this. 

--charles


----- Forwarded message from Sue Porterfield  -----

To: 'Sue Porterfield' 
From: Sue Porterfield 
Date: Wed, 25 Jan 2006 16:40:58 -0500
Subject: CLSP Summer Workshop Opportunity
X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on 
	allspice.cs.swarthmore.edu
X-Spam-Level: 
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham 
	version=3.1.0
X-Original-To: cfk@cs.swarthmore.edu
Delivered-To: cfk@cs.swarthmore.edu
X-BrightmailFiltered: true
X-IronPort-AV: i="4.01,218,1136178000"; 
   d="pdf'?scan'208"; a="109573052:sNHT1841678672"
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.2180
X-Mailer: Microsoft Office Outlook, Build 11.0.6353
Thread-Index: AcUCKN5AiH2pRwL6R+CZxe2pdAv110fyxGGQAACEw8AAAFYqoA==

Dear Colleague: 

The Center for Language and Speech Processing at the Johns Hopkins
University is offering a unique summer internship opportunity, which we
would like you to bring to the attention of your best students in the
current junior class. Only three weeks remain for students to apply for
these internships. 

This internship is unique in that the selected students will participate
in cutting-edge research as full members alongside leading scientists from
industry, academia, and the government. What makes the internship exciting
is the exposure of undergraduate students to emerging fields of language
engineering, such as automatic speech recognition (ASR), natural language
processing (NLP), and machine translation (MT).

We are specifically looking to attract new talent into the field and, as
such, do not require the students to have prior knowledge of language
engineering technology. Please take a few moments to nominate suitable
bright students who may be interested in this internship. On-line
applications for the program can be found at http://www.clsp.jhu.edu/ along
with additional information regarding plans for the 2006 Workshop and
information on past workshops. The application deadline is February 17,
2006. 

If you have questions, please contact us by phone (410-516-4237), by e-mail
(sporterfield@jhu.edu), or via the Internet (http://www.clsp.jhu.edu/).

Sincerely,
Frederick Jelinek
J.S. Smith Professor and Director


Project Descriptions for this Summer:

1.  Open Source Toolkit for Statistical Machine Translation
-----------------------------------------
Machine translation research has recently been energized by novel
statistical methods. Now, computers automatically learn how to translate,
say, from Chinese to English by analyzing human-translated text and deducing
translation rules. With millions of words of so-called parallel text, we are
able to build machine translation systems that are competitive with (or
better than?) commercial products. In this project, we will refine and
advance state-of-the-art methods in an open source toolkit for statistical
machine translation.
We will also develop and test promising new ideas to improve translation
quality and the integration of MT into larger applications. One idea we will
develop is factored translation models: by representing words as feature
vectors that include additional surface-level, syntactic, and semantic
annotation, we will be able to enrich our translation models on many levels
to improve lexical translation, reordering, and fluent output. We will also
work to integrate speech recognition and machine translation technology: we
want our machine translation system to process the ambiguous output of a
speech recognizer, a so-called word lattice. This will enable better speech
translation systems.

2.  Articulatory Feature-based Speech Recognition
---------------------------------------------
Mainstream approaches to automatic speech recognition (ASR) are based on
breaking up words into phone units, much like the pronunciation key in a
dictionary.  This project will explore an alternative approach, motivated by
recent linguistic theories as well as shortcomings of the phone-based model.
Our models will be based on articulatory features, such as the positions of
the lips and tongue.  We will explicitly represent the multiple streams of
articulatory features in a probabilistic graphical model, a flexible tool
for representing and performing efficient computations in complex
statistical systems.  The model will allow for both asynchrony between the
streams and substitution of canonical feature values with more reduced
values.
As part of this project, we will explore some of the many design issues
involved in building articulatory feature-based ASR systems, such as:
  - the type and amount of inter-feature asynchrony allowed,
  - the modeling of reduced articulations (e.g. the incomplete lip
    closure in fast renditions of "probably"),
  - the effect of context (e.g. phonemic, syllabic, or prosodic) on
    asynchrony and reductions, and
  - the use of different feature sets and different ways of classifying
    those features.

3.  Joint Modeling of Words and Actions
----------------------------------------------
The project will focus on discovering relations between the patterns of
human communications and activities that people undertake in relation to
these communications. In other words, we aim to connect what people say or
hear to the way they act afterwards. The research effort will focus on two
distinct domains: financial markets and multi-player games. In the financial
domain we will aim to predict investor reaction to events discussed in news,
such as the increased demand for Hewlett-Packard stock that followed an
announcement of CEO Carly Fiorina leaving the company. In the multi-player
domain our goal will be to monitor chat messages between the players and
predict when they are about to engage in a difficult task that requires
collaboration, such as collectively attacking a monster that is too
dangerous for any individual player to attack on their own.
In the course of the workshop we will
  - design filters to identify instances of unusual activity (e.g.
    detecting that many game players are converging to the same point on
    the map),
  - engineer feature functions that will extract interesting word
    patterns and serve as building blocks for predicting human
    activities, and
  - design and implement a statistical model for discovering
    relations between extracted word patterns and unusual activities.

4.  Creativity in Musical Expression
-------------------------------------------------
Musical interpretation is the link between the composer and listener that
breathes life into a musical score.  Composers encode the music they
conceive in a score; performers decode the score, create a mental model of
the musical ideas and structures, choose a musical affect, and render
the music expressively in a performance in such a way as to communicate the
simultaneous agendas.  The listener then receives the performer's
interpretation of the score.
The nature of expressive musical performance has been the subject of
considerable musicological study and opinion, but rarely that of scientific
measurement and analysis.  We propose to analyze and understand musical
expression in piano music using actual measurements of timing and velocity
derived from the performance.  We will relate these data to the musical
score in a way that automatically "explains" a performance agenda in
musically meaningful terms: for example, which timing and dynamic
stresses are apparent in the performance, and which perceptual groupings
they imply.  The results of our proposed study will provide the basis for
expressive synthesis of performances from new musical scores.



----- End forwarded message -----
Charles F. Kelemen, Edward Hicks Magill Professor
Chair, Computer Science Department
Swarthmore College 
500 College Avenue			
Swarthmore, PA  19081 
610-328-8515   
cfk@cs.swarthmore.edu
kelemen@swarthmore.edu
________________________________________________________________________