Batchalign 2 library question: Running forced alignment on an existing transcript.
Jean Quigley
jeanquigley380 at gmail.com
Mon Sep 2 17:15:00 UTC 2024
Dear Houjun,
Thanks for your fast response. I can see now what I was doing wrong before. I've
tried running the code you gave me:
chat = ba.CHATFile(path = "input.cha")
doc = chat.doc
nlp = ba.BatchalignPipeline.new("fa", lang="eng", num_speakers=2)
doc = nlp(chat)
final = ba.CHATFile(doc=doc)
final.write("final.cha")
but I get a very lengthy error output. It is actually very similar to the
output I got when I tried to run the program with the Whisper engine before.
The main error seems to be the final line of the traceback:

RuntimeError: MPS backend out of memory (MPS allocated: 8.99 GB, other
allocations: 86.54 MB, max allowed: 9.07 GB). Tried to allocate 1.17 MB
on private pool.
But here's the full traceback in case you want to see it. (Apologies
if I shouldn't include it here.)
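The last line of the traceback suggests setting PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0
to disable the upper limit on MPS memory allocations. I haven't tried this yet,
and I'm only assuming the variable has to be set before anything imports torch,
but a rough sketch of what I had in mind is:

import os
# From the error message: disables the MPS memory cap
# (the message itself warns this may cause system failure).
os.environ["PYTORCH_MPS_HIGH_WATERMARK_RATIO"] = "0.0"

# Assumption: importing batchalign (and therefore torch) only after setting
# the variable, so the MPS allocator picks it up.
import batchalign as ba

chat = ba.CHATFile(path = "input.cha")
doc = chat.doc
nlp = ba.BatchalignPipeline.new("fa", lang="eng", num_speakers=2)
doc = nlp(chat)
final = ba.CHATFile(doc=doc)
final.write("final.cha")

though I'm a little hesitant to try it given the warning that it may cause a
system failure.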
Thank you once again for your time and availability.
Kind Regards,
Jean
RuntimeError                              Traceback (most recent call last)
Cell In[37], line 3
      1 chat = ba.CHATFile(path = "Desktop/Batchallign Capstone/P800FATSTR.cha")
      2 doc = chat.doc
----> 3 nlp = ba.BatchalignPipeline.new("fa", lang="eng", num_speakers=2)
      4 doc = nlp(chat)
      5 final = ba.CHATFile(doc=doc)

File /opt/anaconda3/lib/python3.11/site-packages/batchalign/pipelines/pipeline.py:60, in BatchalignPipeline.new(tasks, lang, num_speakers, **arg_overrides)
     37 """Create the pipeline.
     38
     39 Parameters
    (...)
     56     The pipeline to run.
     57 """
     59 from batchalign.pipelines.dispatch import dispatch_pipeline
---> 60 return dispatch_pipeline(tasks, lang=lang, num_speakers=num_speakers, **arg_overrides)

File /opt/anaconda3/lib/python3.11/site-packages/batchalign/pipelines/dispatch.py:115, in dispatch_pipeline(pkg_str, lang, num_speakers, **arg_overrides)
    113     engines.append(NgramRetraceEngine())
    114 elif engine == "whisper_fa":
--> 115     engines.append(WhisperFAEngine())
    116 elif engine == "whisper_utr":
    117     engines.append(WhisperUTREngine(lang=lang))

File /opt/anaconda3/lib/python3.11/site-packages/batchalign/pipelines/fa/whisper_fa.py:29, in WhisperFAEngine.__init__(self, model)
     26 if model == None:
     27     model = "openai/whisper-large-v2"
---> 29 self.__whisper = WhisperFAModel(model)

File /opt/anaconda3/lib/python3.11/site-packages/batchalign/models/whisper/infer_fa.py:44, in WhisperFAModel.__init__(self, model, target_sample_rate)
     42 def __init__(self, model="openai/whisper-large-v2", target_sample_rate=16000):
     43     L.debug("Initializing whisper FA model...")
---> 44     self.__model = WhisperForConditionalGeneration.from_pretrained(model, attn_implementation="eager").to(DEVICE)
     45     self.__model.eval()
     46     L.debug("Done, initalizing processor and config...")

File /opt/anaconda3/lib/python3.11/site-packages/transformers/modeling_utils.py:2576, in PreTrainedModel.to(self, *args, **kwargs)
   2571 if dtype_present_in_args:
   2572     raise ValueError(
   2573         "You cannot cast a GPTQ model in a new `dtype`. Make sure to load the model using `from_pretrained` using the desired"
   2574         " `dtype` by passing the correct `torch_dtype` argument."
   2575     )
-> 2576 return super().to(*args, **kwargs)

File /opt/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:1160, in Module.to(self, *args, **kwargs)
   1156         return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
   1157                     non_blocking, memory_format=convert_to_format)
   1158     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
-> 1160 return self._apply(convert)

File /opt/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:810, in Module._apply(self, fn, recurse)
    808 if recurse:
    809     for module in self.children():
--> 810         module._apply(fn)
    812 def compute_should_use_set_data(tensor, tensor_applied):
    813     if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
    814         # If the new tensor has compatible tensor type as the existing tensor,
    815         # the current behavior is to change the tensor in-place using `.data =`,
    (...)
    820         # global flag to let the user control whether they want the future
    821         # behavior of overwriting the existing tensor or not.

File /opt/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:810, in Module._apply(self, fn, recurse)
    808 if recurse:
    809     for module in self.children():
--> 810         module._apply(fn)
    812 def compute_should_use_set_data(tensor, tensor_applied):
    813     if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
    814         # If the new tensor has compatible tensor type as the existing tensor,
    815         # the current behavior is to change the tensor in-place using `.data =`,
    (...)
    820         # global flag to let the user control whether they want the future
    821         # behavior of overwriting the existing tensor or not.

File /opt/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:810, in Module._apply(self, fn, recurse)
    808 if recurse:
    809     for module in self.children():
--> 810         module._apply(fn)
    812 def compute_should_use_set_data(tensor, tensor_applied):
    813     if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
    814         # If the new tensor has compatible tensor type as the existing tensor,
    815         # the current behavior is to change the tensor in-place using `.data =`,
    (...)
    820         # global flag to let the user control whether they want the future
    821         # behavior of overwriting the existing tensor or not.

File /opt/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:833, in Module._apply(self, fn, recurse)
    829 # Tensors stored in modules are graph leaves, and we don't want to
    830 # track autograd history of `param_applied`, so we have to use
    831 # `with torch.no_grad():`
    832 with torch.no_grad():
--> 833     param_applied = fn(param)
    834 should_use_set_data = compute_should_use_set_data(param, param_applied)
    835 if should_use_set_data:

File /opt/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:1158, in Module.to.<locals>.convert(t)
   1155 if convert_to_format is not None and t.dim() in (4, 5):
   1156     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
   1157                 non_blocking, memory_format=convert_to_format)
-> 1158 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

RuntimeError: MPS backend out of memory (MPS allocated: 8.99 GB, other
allocations: 86.54 MB, max allowed: 9.07 GB). Tried to allocate 1.17
MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to
disable upper limit for memory allocations (may cause system failure).
On Mon, 2 Sept 2024 at 09:08, Jean Quigley <jeanquigley380 at gmail.com> wrote:
> Dear All,
>
> I've been trying to audio-link existing CHAT transcripts using the
> batchalign 2 library. I was able to successfully read the transcript into
> the program using the following code:
>
> chat = ba.CHATFile(path = "path.cha")
> doc = chat.doc
>
> However, I've been unable to run forced alignment without first running
> the ASR. The problem is that running the ASR again changes the
> transcript and introduces errors into it. Would you know what line of
> code I could use to audio-link the transcript without overwriting it?
>
> Many thanks,
> Jean Quigley