Generating Instrument Sounds Aligned with Video via Human Body Keypoints

A Deep Learning Approach to Multimodal Audio-Visual Synthesis

Research Paper (undergraduate), 2026, 34 Pages, Grade: Good

Authors: Haruka Okano, Yuichi Sei, Yasuyuki Tahara, Akihiko Ohsuga

Computer Science

Summary

Historical video archives often suffer from degraded or entirely missing audio tracks, whether through deterioration of the storage media, the recording limitations of their era, or loss during archiving. Silent films and performance documentation may likewise lack synchronized sound altogether. Generative artificial intelligence techniques have shown the potential to reconstruct missing audio from visual information alone, a capability that is particularly valuable for restoring cultural heritage materials and historical performance recordings. When applied to complex activities such as musical instrument performance, however, existing methods capture the nuances of sound production with only limited accuracy. Prior research has established that SpecVQGAN architectures combined with Transformer-based mechanisms can improve video-to-audio generation. This work introduces an enhanced model that augments SpecVQGAN with human skeletal pose features in order to improve the quality of generated musical instrument sounds. Through an evaluation that combines subjective user studies with objective quantitative metrics, we show that the proposed framework significantly outperforms existing approaches at reconstructing authentic instrumental audio from archival and silent performance videos.

Details

Title: Generating Instrument Sounds Aligned with Video via Human Body Keypoints
Subtitle: A Deep Learning Approach to Multimodal Audio-Visual Synthesis
Grade: Good
Authors: Haruka Okano, Yuichi Sei, Yasuyuki Tahara, Akihiko Ohsuga
Publication Year: 2026
Pages: 34
Catalog Number: V1696412
ISBN (eBook): 9783389179826
ISBN (Book): 9783389179833
Language: English
Tags: Deep Learning, Audio-Visual Learning, Audio Generation, Multi-modal
Product Safety: GRIN Publishing GmbH
Cite this paper
Haruka Okano (Author), Yuichi Sei (Author), Yasuyuki Tahara (Author), Akihiko Ohsuga (Author), 2026, Generating Instrument Sounds Aligned with Video via Human Body Keypoints, Munich, GRIN Verlag, https://www.hausarbeiten.de/document/1696412