[Corpora-List] Wavelet for NLP

Pascale Fung pascale at cs.ust.hk
Sat Jun 10 17:18:12 UTC 2006


"Time-frequency transformation" is basically a wavelet transform.

I think you are talking about the discrete wavelet transform, which is
bijective and is used for source coding. I used the continuous wavelet
transform, which is injective and is used for recognition (or analysis).

The discrete wavelet transform is used for coding, where you are concerned
with recovering the original signal. In the application of bilingual word
translation, by contrast, I was interested in recognizing patterns. I would
say most NLP tasks are recognition tasks rather than coding tasks.

Nevertheless, in this particular recognition application (bilingual word
pair extraction) you can still recover the original "signal" from the
output of the transformation, because each output corresponds to exactly
one input.
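
To make the bijective (coding) case concrete, here is a minimal sketch of a
one-level discrete Haar transform and its inverse. It is only an
illustration of perfect reconstruction, not the method used in the paper
cited below: the choice of the Haar wavelet, the function names haar_dwt and
haar_idwt, and the sample values are all assumptions made for this example.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # orthonormal Haar normalisation


def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) * SCALE for a, b in pairs]  # scaled pairwise sums
    detail = [(a - b) * SCALE for a, b in pairs]  # scaled pairwise differences
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the signal from both coefficient lists."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * SCALE)
        signal.append((a - d) * SCALE)
    return signal


# Hypothetical sample values, standing in for any even-length signal.
sample = [4.0, 2.0, 5.0, 5.0, 1.0, 7.0, 0.0, 3.0]
approx, detail = haar_dwt(sample)
restored = haar_idwt(approx, detail)
```

The transform produces exactly as many coefficients as there are input
samples, and restored matches sample up to floating-point rounding. A
continuous wavelet transform, by contrast, is typically sampled at many more
scale and position values than there are input samples, which suits
analysis rather than compact coding.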

regards,
Pascale

>
> I'm not an expert on signal processing, but are you sure this is
> really a wavelet transformation? It looks more like a sort of time-
> frequency analysis (which does give a nice visualisation!), while
> wavelet transforms are supposed to be bijective.  Can you reconstruct
> the original signal from the time-frequency distribution?
>
> Best wishes,
> Stefan
>
>
> On 9 Jun 2006, at 06:49, Pascale Fung wrote:
>
>>
>> Wavelets were used to find bilingual word pairs from non-parallel
>> corpora in the following paper from 1996.
>>
>> regards,
>> Pascale Fung
>>
>>
>> @inproceedings{ fung96domain,
>>     author = "P. Fung",
>>     title = "Domain Word Translation by Space-Frequency Analysis of Context Length Histograms",
>>     booktitle = "Proc. {ICASSP} '96",
>>     address = "Atlanta, GA",
>>     pages = "184--187",
>>     year = "1996"
>> }
>
>
>


