[Sw-l] Next Steps. Video to Mocap data for signing.

John Carlson yottzumm at gmail.com
Wed Aug 2 04:53:02 UTC 2023


I need a large collection of signing videos to run an experiment converting
video to geometry. I do not have drives large enough to hold them, so I
may rent space on a cloud service.

I plan to use the Python packages cv2 (OpenCV)
https://pypi.org/project/opencv-python/, cvzone
https://github.com/cvzone/cvzone, and MediaPipe
https://developers.google.com/mediapipe/solutions/guide to convert video
files into geometry and transformations, stored either as BVH (BioVision
Hierarchy) or some other mocap format (HAnim+BVH?). That is, we are
converting signs and body language to line segments and points, and
ultimately to sets of geometry and transformations, and then translating
those to something like English. I do not know whether facial expressions
are really recognizable. I may try my hand at lipreading video; I don't
know yet. If the video has sound, we'll transcribe that.
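
A minimal sketch of the extraction step, assuming the legacy
`mp.solutions.holistic` API in MediaPipe and `cv2.VideoCapture` for
decoding. The function names `landmarks_to_points` and `extract_frames`
are hypothetical, and the output is raw landmark points only, not yet BVH
or any other mocap format.

```python
from types import SimpleNamespace


def landmarks_to_points(landmark_list):
    """Turn a MediaPipe landmark list into plain (x, y, z) tuples.

    Returns an empty list when the detector found nothing in the frame.
    """
    if landmark_list is None:
        return []
    return [(lm.x, lm.y, lm.z) for lm in landmark_list.landmark]


def extract_frames(video_path):
    """Yield one dict of landmark points per video frame.

    Needs opencv-python and mediapipe; they are imported here so the
    helper above can be tried without either package installed.
    """
    import cv2
    import mediapipe as mp

    capture = cv2.VideoCapture(video_path)
    try:
        with mp.solutions.holistic.Holistic(static_image_mode=False) as holistic:
            while True:
                ok, frame = capture.read()
                if not ok:
                    break  # end of file or unreadable frame
                # OpenCV decodes to BGR; MediaPipe expects RGB.
                results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                yield {
                    "pose": landmarks_to_points(results.pose_landmarks),
                    "left_hand": landmarks_to_points(results.left_hand_landmarks),
                    "right_hand": landmarks_to_points(results.right_hand_landmarks),
                    "face": landmarks_to_points(results.face_landmarks),
                }
    finally:
        capture.release()


# Stand-in for one MediaPipe result, so the helper runs without a video.
fake = SimpleNamespace(landmark=[SimpleNamespace(x=0.1, y=0.2, z=0.0)])
print(landmarks_to_points(fake))
```

With a real file, `for frame in extract_frames("clip.mp4"): ...` would walk
the video one frame at a time ("clip.mp4" is a placeholder path). Line
segments then follow by joining pairs of points along the skeleton.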

Ideally, I'll be able to store geometry, transformations and translation
(possibly achieved by transcribing sound or lipreading) along with links to
a video URL.  The step after that is to find a translation from geometry
and transformations to English, and back.
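
One possible shape for such a record, as a sketch only: a JSON-serializable
structure that keeps the per-frame geometry, the English translation, how
the translation was obtained, and the video URL together. The class name
`SignClip` and all field names are assumptions for illustration. Joint
rotations (the "transformations") would need an extra field once a skeleton
format is chosen.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class SignClip:
    video_url: str                  # link back to the source video
    translation: str = ""           # English text; may be empty
    translation_source: str = ""    # e.g. "audio", "lipreading", "human"
    fps: float = 0.0                # frame rate of the source video
    # One dict per frame, mapping a body part to a list of [x, y, z] points.
    frames: list = field(default_factory=list)

    def to_json(self):
        """Serialize the whole record to a JSON string."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text):
        """Rebuild a record from the output of to_json()."""
        return cls(**json.loads(text))


clip = SignClip(
    video_url="https://example.org/clip.mp4",  # placeholder URL
    translation="hello",
    translation_source="audio",
    fps=30.0,
    frames=[{"right_hand": [[0.1, 0.2, 0.0]]}],
)
restored = SignClip.from_json(clip.to_json())
```

Points are stored as lists rather than tuples so that a record survives a
JSON round trip unchanged.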

An acquaintance suggested that depth is required but not available from
ordinary video. Elon Musk says depth is not required for autonomous
driving. I don't know, but I want to find out.
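
One small way to probe the depth question, sketched under assumptions:
compute the same joint angle once from (x, y, z) points and once from
(x, y) only, and see how far they differ. As far as I understand,
MediaPipe reports an estimated z for each landmark even from ordinary
video, so both versions can be computed from the same data. The function
names and sample points below are made up for illustration.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b between segments b->a and b->c."""
    u = [p - q for p, q in zip(a, b)]
    v = [p - q for p, q in zip(c, b)]
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("degenerate segment: two points coincide")
    cosine = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def drop_depth(point):
    """Keep only the image-plane coordinates (x, y)."""
    return point[:2]


# Made-up finger: the last segment bends along z, i.e. along the camera axis.
base, knuckle, tip = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 1.0)

angle_3d = joint_angle(base, knuckle, tip)
angle_2d = joint_angle(drop_depth(base), drop_depth(knuckle), drop_depth(tip))
print(angle_3d, angle_2d)
```

In this example the bend is visible with depth and disappears without it:
the finger looks straight in the image plane. Whether such cases matter for
real signs is exactly what the experiment would have to show.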

If anyone has already tried this, let me know.  It would be interesting to
convert geometry to SignWriting as well.

I am not sure if SignTube does this automatically, or if it uses human
transcribers.

Any knowledge of a media solution or publicly available database that
links all this data would be helpful, too.

If someone wants to provide assistance on this effort, let me know.

John