Capturing and Reproducing Spatial Sound: Physics-based Approach to VR/AR audio

Keynote talk at MIMSVAI 2024 (Workshop in conjunction with Ubicomp/ISWC 2024)

NII S. Koyama's Lab

October 05, 2024

Transcript

  1. Capturing and Reproducing Spatial Sound: Physics-based Approach to VR/AR audio

    Shoichi Koyama, National Institute of Informatics, Tokyo, Japan
  2. Spatial Audio in VR/AR

    ➢ Why is spatial audio crucial in VR/AR?
      – Enhancing immersion and realism by creating a 3D sound environment
      – Improving interaction and situational awareness by reproducing the specific direction of sound
      – Reducing visual load by providing auditory cues of virtual objects or events

    Long history of research in the areas of psychoacoustics, physical acoustics, signal processing, and machine learning
  3. Spatial Hearing

    ➢ How do humans perceive spatial sound?
      – Interaural time difference (ITD): difference in arrival time of sound between the two ears
      – Interaural level difference (ILD): difference in sound intensity between the two ears
      – Spectral cue: spectrum of sound affected by reflections on the pinnae

    Perception of the horizontal/vertical direction of sound is made possible by the above cues
    A combination of auditory cues generates sound images in the psychological space of humans
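
The two interaural cues can be illustrated numerically. The following is a minimal sketch, not taken from the talk: it assumes the two ear signals are available as NumPy arrays, estimates the ITD from the peak of their cross-correlation, and estimates the ILD from their energy ratio. The function name `estimate_itd_ild` and the synthetic test signal are hypothetical.

```python
import numpy as np

def estimate_itd_ild(left, right, fs):
    """Estimate ITD (seconds) and ILD (dB) from a pair of ear signals.

    ITD > 0 means the sound reaches the left ear first;
    ILD > 0 means the left-ear signal is stronger.
    """
    # Cross-correlate the ear signals; the lag of the peak is the ITD in samples.
    xcorr = np.correlate(right, left, mode="full")
    lags = np.arange(-(len(left) - 1), len(right))
    itd = lags[np.argmax(xcorr)] / fs
    # ILD as the energy ratio between the ears, in decibels.
    ild = 10.0 * np.log10(np.sum(left**2) / np.sum(right**2))
    return itd, ild

# Synthetic example (assumed values): the right ear receives the same noise
# 24 samples (0.5 ms at 48 kHz) later and at half the amplitude (about 6 dB weaker).
fs = 48000
rng = np.random.default_rng(0)
left = rng.standard_normal(2000)
right = 0.5 * np.concatenate([np.zeros(24), left[:-24]])
itd, ild = estimate_itd_ild(left, right, fs)
```

Real ear signals also carry the spectral cue from the pinnae, which this broadband sketch ignores.
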
  4. Spatial Hearing

    ➢ Head-Related Transfer Functions (HRTFs) play an important role in characterizing auditory cues
      – HRTF: transfer characteristics of sound from a source to the two ears

    [Figure labels: HRTF from source to right ear; transfer characteristics to entrance of ear canal or eardrum; transfer characteristics to the head center without the listener; source]
  5. Spatial Audio Reproduction Techniques

    Classification of spatial audio reproduction techniques

    ➢ Perception-based techniques
      – Stereophonic/surround sound
        • Presenting a spatial sound image based on an auditory effect called summing localization
    ➢ Physics-based techniques
      – Binaural reproduction
        • Synthesizing binaural sounds by using an HRTF or its proxy
      – Sound field reproduction
        • Synthesizing a physical sound field by using multiple loudspeakers
  6. Stereophonic/Surround Sound

    ➢ Presenting spatial sound images by controlling the time and/or level difference of multiple loudspeakers
      – 2ch stereo, 5.1ch surround, and others

    A single sound image is perceived if two identical signals arrive with a delay of < 1 ms (summing localization)

    [Figure label: synthesized sound image]
  7. Binaural Reproduction

    ➢ Synthesizing binaural sound by convolving the source signal with the HRTF
      – Given the source position, the source signal, and the HRTF at that position, the binaural signal from that source can be accurately synthesized

    [Figure labels: source signal; HRTF; binaural sound]
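
The convolution step can be sketched as follows. This is an illustrative example only: the head-related impulse responses (HRIRs, the time-domain counterparts of HRTFs) are toy filters invented for the demonstration, not measured data, and the function name `binauralize` is hypothetical.

```python
import numpy as np

def binauralize(source, hrir_left, hrir_right):
    """Convolve a mono source with left/right HRIRs; returns an array of shape (2, n)."""
    left = np.convolve(source, hrir_left)
    right = np.convolve(source, hrir_right)
    return np.stack([left, right])

fs = 48000
rng = np.random.default_rng(0)
source = rng.standard_normal(fs // 10)  # 100 ms of noise as a stand-in source signal

# Toy HRIRs (NOT measured data): for a source on the left, the right ear
# receives the sound 24 samples (0.5 ms) later and at half the amplitude.
hrir_left = np.zeros(64)
hrir_left[0] = 1.0
hrir_right = np.zeros(64)
hrir_right[24] = 0.5

binaural = binauralize(source, hrir_left, hrir_right)
```

With measured HRIRs for the source position, the same two convolutions would also impose the spectral cues of the listener's pinnae.
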
  8. Sound Field Reproduction

    ➢ Synthesizing a physical sound field by using multiple loudspeakers
      – By achieving a large listening area, can be applied when there are multiple listeners or when the listener is moving
  9. Spatial Audio Reproduction Techniques

    ➢ Perception-based techniques
      – Pros:
        • Possible to achieve with small-scale systems
        • Easy to deform or modify
      – Cons:
        • “Sound design” is necessary (not physical reproduction)
        • Effective only if the listener is at the center of the loudspeakers (sweet spot)
    ➢ Physics-based techniques
      – Pros:
        • High fidelity through physical reproduction
        • Large listening area applicable to multiple moving listeners
      – Cons:
        • High cost for HRTF measurement or loudspeaker array
        • HRTF individualization is still an unsolved problem for binaural synthesis
  10. Physics-based Spatial Audio Processing

    [Figure labels: Microphone; Loudspeaker; Sound field capturing; Acoustic simulation; Sound field reproduction; Binaural reproduction; https://www.comsol.com; Representation; Reproduction]
  11. Sound Field Capturing Problem

    Mathematical formulation of sound field capturing/estimation

    Estimate the pressure distribution from observations at a discrete set of mics in the frequency domain

    [Figure labels: Microphone; Target region]
  12. Sound Field Capturing Problem

    Mathematical formulation of sound field capturing/estimation

    ➢ General interpolation techniques
      – Representing $u$ by model parameters $\theta$ as $u(\mathbf{r}; \theta)$
      – Solve the optimization problem
        $$\operatorname*{minimize}_{\theta} \; \mathcal{L}\left(\{u(\mathbf{r}_m; \theta)\}_{m=1}^{M}, \mathbf{s}\right) + \mathcal{R}(\theta)$$
        • $\mathcal{L}$: loss function for the observation $\mathbf{s} = [s_1, \ldots, s_M]^{\mathsf{T}}$
        • $\mathcal{R}$: regularization term for $\theta$

    [Figure labels: Microphone; Target region]
  13. Sound Field Capturing Problem

    What kind of physical properties can be embedded?

    ➢ Function to be interpolated should satisfy the Helmholtz equation
      – Homogeneous Helmholtz equation in a source-free target region
        $$(\nabla_{\mathbf{r}}^2 + k^2)\, u = 0$$
      – Conventional approach: expansion into element solutions of the Helmholtz equation
        • Plane wave expansion (or Herglotz wave function)
        • Spherical wave function expansion
        • Equivalent source distribution (or single layer potential)

    ▪ Simple array geometry (e.g., spherical array) is required
    ▪ Truncation/discretization of the expansion representation is necessary
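
As a quick numerical illustration of why plane waves qualify as element solutions, the sketch below checks that a single plane wave satisfies the homogeneous Helmholtz equation, using a central-difference Laplacian. The sign convention exp(-jk<eta, r>), the speed of sound, and all function names are assumptions made for this example.

```python
import numpy as np

def plane_wave(r, k, eta):
    """Plane wave exp(-j k <eta, r>) at position r (sign convention assumed)."""
    return np.exp(-1j * k * (r @ eta))

def helmholtz_residual(u, r, k, h=1e-3):
    """Evaluate (laplacian + k^2) u at point r with a central-difference Laplacian."""
    lap = 0.0
    for axis in range(3):
        step = np.zeros(3)
        step[axis] = h
        lap = lap + (u(r + step) - 2.0 * u(r) + u(r - step)) / h**2
    return lap + k**2 * u(r)

k = 2 * np.pi * 500 / 343.0              # wavenumber at 500 Hz, c = 343 m/s (assumed)
eta = np.array([1.0, 2.0, 2.0]) / 3.0    # unit direction vector
r0 = np.array([0.1, -0.2, 0.3])

# Residual for a plane wave with the matching wavenumber: close to zero.
res = helmholtz_residual(lambda r: plane_wave(r, k, eta), r0, k)
# Residual for a mismatched wavenumber: clearly nonzero.
res_bad = helmholtz_residual(lambda r: plane_wave(r, 1.5 * k, eta), r0, k)
```

The first residual is limited only by the finite-difference step, while the second is on the order of k^2, so only waves at the correct wavenumber are admissible building blocks.
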
  14. Kernel Interpolation of Spatial Sound

    ➢ Problem to be solved
      – Estimation with the constraint that the interpolated function satisfies the Helmholtz equation [Ueno+ IEEE SPL 2018, IEEE TSP 2021]
      – $\mathcal{H}$: solution space of the Helmholtz equation

    ▪ If $\mathcal{H}$ is properly defined, this problem has a closed-form solution
    ▪ Can be regarded as an infinite-dimensional basis expansion

    Arbitrary array geometry, no truncation/discretization necessary
  15. Kernel Interpolation of Spatial Sound

    ➢ Unique closed-form solution for a reproducing kernel Hilbert space (RKHS)
      – Based on the representer theorem, the solution is represented by a weighted sum of the reproducing kernel function $\kappa$
      – The vector of weights is obtained in closed form with the Gram matrix $\mathbf{K}$:
        $$\mathbf{K} = \begin{bmatrix} \kappa(\mathbf{r}_1, \mathbf{r}_1) & \cdots & \kappa(\mathbf{r}_1, \mathbf{r}_M) \\ \vdots & \ddots & \vdots \\ \kappa(\mathbf{r}_M, \mathbf{r}_1) & \cdots & \kappa(\mathbf{r}_M, \mathbf{r}_M) \end{bmatrix}$$

    Estimation is performed by convolving an FIR filter in the time domain
  16. Kernel Interpolation of Spatial Sound

    How to design the RKHS?

    ➢ RKHS based on plane wave expansion
      $$\mathcal{H} = \left\{ \frac{1}{4\pi} \int_{\mathbb{S}^2} \tilde{u}(\boldsymbol{\eta})\, e^{-\mathrm{j}k\langle \boldsymbol{\eta}, \mathbf{r} \rangle}\, \mathrm{d}\boldsymbol{\eta} \;:\; \tilde{u} \in L^2(\mathbb{S}^2, w) \right\}$$
    ➢ Inner product and norm over $\mathcal{H}$ using directional weighting $w$

    Prior information on directions of high amplitude (e.g., source directions) can be incorporated
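
For intuition about the plane-wave integral, the sketch below evaluates it numerically for a constant (uniform) weight and compares the result with the spherical Bessel function of order zero, j0(k||r1 - r2||). That closed form is a standard identity for the uniform spherical integral of plane waves; treating it as the kernel for the uniform-weight case is an assumption here, since the transcript does not show the inner product or the kernel formula. The function name `plane_wave_kernel` and the quadrature sizes are hypothetical.

```python
import numpy as np

def plane_wave_kernel(r1, r2, k, n_theta=32, n_phi=64):
    """(1/4pi) * integral over the unit sphere of exp(-j k <eta, r1 - r2>) d eta."""
    # Gauss-Legendre nodes in cos(theta), uniformly spaced nodes in phi.
    x, wx = np.polynomial.legendre.leggauss(n_theta)
    phi = 2 * np.pi * np.arange(n_phi) / n_phi
    sin_t = np.sqrt(1.0 - x**2)
    eta = np.stack([np.outer(sin_t, np.cos(phi)),
                    np.outer(sin_t, np.sin(phi)),
                    np.outer(x, np.ones(n_phi))], axis=-1)     # (n_theta, n_phi, 3)
    quad_w = np.outer(wx, np.full(n_phi, 2 * np.pi / n_phi))   # weights sum to 4*pi
    integrand = np.exp(-1j * k * (eta @ (r1 - r2)))
    return np.sum(quad_w * integrand) / (4 * np.pi)

k = 2 * np.pi * 500 / 343.0        # 500 Hz, c = 343 m/s (assumed)
r1 = np.array([0.2, -0.1, 0.4])
r2 = np.array([-0.1, 0.3, -0.1])

numeric = plane_wave_kernel(r1, r2, k)
# np.sinc(x) = sin(pi x) / (pi x), so this equals j0(k * ||r1 - r2||).
closed_form = np.sinc(k * np.linalg.norm(r1 - r2) / np.pi)
```

A directional weight would enter as an extra factor inside the integrand, which is where prior information on source directions could be injected.
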
  17. Kernel Interpolation of Spatial Sound

    How to design the RKHS?

    ➢ Kernel function for $\mathcal{H}$ based on the von Mises–Fisher distribution
    ➢ Kernel function when there is no prior information, i.e., uniform weight
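
The closed-form estimate can be sketched for the case without prior information. This is a minimal example under stated assumptions: the uniform-weight kernel is taken to be j0(k||r - r'||) (not shown in the transcript), the microphone layout, frequency, and regularization value are invented, and the function names are hypothetical.

```python
import numpy as np

def sinc_kernel(ra, rb, k):
    """Assumed uniform-weight kernel j0(k * ||ra - rb||); inputs of shape (A, 3), (B, 3)."""
    dist = np.linalg.norm(ra[:, None, :] - rb[None, :, :], axis=-1)
    return np.sinc(k * dist / np.pi)   # np.sinc(x) = sin(pi x) / (pi x)

def kernel_interpolate(r_mic, s, r_eval, k, reg=1e-3):
    """Kernel ridge regression estimate of the pressure at r_eval."""
    K = sinc_kernel(r_mic, r_mic, k)                            # Gram matrix (M, M)
    alpha = np.linalg.solve(K + reg * np.eye(len(r_mic)), s)    # weights of the kernel sum
    return sinc_kernel(r_eval, r_mic, k) @ alpha

rng = np.random.default_rng(0)
k = 2 * np.pi * 200 / 343.0                     # 200 Hz, c = 343 m/s (assumed)
r_mic = rng.uniform(-0.5, 0.5, size=(40, 3))    # 40 mics at arbitrary positions
r_eval = rng.uniform(-0.3, 0.3, size=(200, 3))  # evaluation points inside the array
eta = np.array([0.0, 0.6, 0.8])                 # plane-wave direction (unit vector)

def true_field(r):
    """Ground truth for the demo: a single plane wave."""
    return np.exp(-1j * k * (r @ eta))

u_hat = kernel_interpolate(r_mic, true_field(r_mic), r_eval, k)
nmse_db = 10 * np.log10(np.sum(np.abs(u_hat - true_field(r_eval))**2)
                        / np.sum(np.abs(true_field(r_eval))**2))
```

Note that the microphone positions are random: nothing in the estimator depends on a regular array geometry or on truncating an expansion.
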
  18. Kernel Interpolation of Spatial Sound

    ➢ Kernel ridge regression
      – Vector and matrix consisting of the kernel function
    ➢ Kernel function for the constraint of the Helmholtz equation
      – w/ prior information
      – w/o prior information

    Extension to an adaptive kernel with the aid of a neural network [Ribeiro+ IEEE/ACM TASLP 2024]
  19. Kernel Interpolation of Spatial Sound

    ➢ Experimental results using real data
      – Reconstructing a pulse signal from a single loudspeaker w/ 18 mics

    [Figure panels: True; Proposed; Gaussian kernel. Black dots indicate mic positions]

    Impulse response measurement system [Koyama+ IEEE WASPAA 2021]
  20. Physics-Informed Machine Learning (PIML)

    ➢ Recent advances in PIML-based sound field estimation
      – To appear in IEEE Signal Processing Magazine
      – https://arxiv.org/abs/2408.14731
  21. Binaural Reproduction From Mic Array Recordings

    [Figure labels: Recording; Conversion into binaural sounds; Reproduction] [Iijima+ JASA 2021]

    Reproducing binaural sound as if the listener were in the recording area

    A single spherical mic array is usually used, but a large array is necessary for a large listening area
  22. Binaural Reproduction From Mic Array Recordings

    [Figure labels: Recording; Conversion into binaural sounds; Reproduction] [Iijima+ JASA 2021]

    Reproducing binaural sound as if the listener were in the recording area

    Binaural reproduction from recordings of multiple small arrays for a broad listening area with a flexible and scalable system
  23. Binaural Reproduction From Mic Array Recordings

    Processing steps, implemented as a single MIMO FIR filter:
      – Sound field estimation: expansion coefficients $\boldsymbol{\alpha}(\mathbf{r}_{\mathrm{o}})$ at the selected listening position $\mathbf{r}_{\mathrm{o}}$
      – Rotation acc. listening direction: $\mathbf{R}\boldsymbol{\alpha}(\mathbf{r}_{\mathrm{o}})$ (rotation by multiplying the Wigner-D matrix)
      – Binaural rendering using the HRTF
  24. Binaural Reproduction From Mic Array Recordings

    ➢ Recording system using multiple ambisonics mics and 360-degree cameras
      • 8 small mic arrays (ambisonics mics)
      • 8 cardioid mics in each array
      • 360-degree cameras for images
  25. Binaural Reproduction From Mic Array Recordings

    ➢ Comparison of binaural reproduction error

    [Figure panels: 1st-order ambisonics mic; Proposed system]

    High reproduction accuracy in a large region
  26. Sound Field Reproduction

    ➢ Optimization problem to be solved

    Goal: synthesizing the desired sound field $u_{\mathrm{des}}(\mathbf{r}, \omega)$ inside $\Omega$ with $L$ secondary sources (loudspeakers)
      • $d_l$: driving signal of the $l$th secondary source
      • $g_l(\mathbf{r})$: transfer function of the $l$th secondary source
      • $\Omega$: target region

    Difficult to solve owing to regional integration

    [Figure labels: Target region; Secondary source; Synthesized sound field]
  27. Pressure Matching

    ➢ Discretize the target region $\Omega$ into $N$ control points
    ➢ Optimization problem for pressure matching becomes a simple least-squares problem
      $$\operatorname*{minimize}_{\mathbf{d} \in \mathbb{C}^L} \; \|\mathbf{G}\mathbf{d} - \mathbf{u}^{\mathrm{des}}\|^2 + \eta \|\mathbf{d}\|^2$$
      • $\mathbf{G}$: transfer function matrix
      • $\mathbf{d}$: driving signal vector
      • $\mathbf{u}^{\mathrm{des}}$: desired pressure vector
      • $\eta \|\mathbf{d}\|^2$: regularization term
    ➢ Closed-form solution is obtained as
      $$\mathbf{d} = (\mathbf{G}^{\mathsf{H}}\mathbf{G} + \eta \mathbf{I})^{-1} \mathbf{G}^{\mathsf{H}} \mathbf{u}^{\mathrm{des}}$$

    ☺ Simple implementation
    ☹ Fine discretization of $\Omega$ is necessary

    [Figure labels: $\Omega$: target region; secondary source; control points]
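
The closed-form pressure-matching solution can be sketched numerically as follows. Everything beyond the formula itself is an assumption for illustration: free-field monopole loudspeakers (with the exp(+jkr) sign convention), a circular array of 16 sources, a 9 x 9 grid of control points, a plane-wave desired field, and the function names.

```python
import numpy as np

def greens_function(r_ctrl, r_src, k):
    """Free-field monopole transfer functions, shape (N control points, L sources)."""
    dist = np.linalg.norm(r_ctrl[:, None, :] - r_src[None, :, :], axis=-1)
    return np.exp(1j * k * dist) / (4 * np.pi * dist)

def pressure_matching(G, u_des, eta):
    """Closed-form regularized least squares: (G^H G + eta I)^{-1} G^H u_des."""
    GH = G.conj().T
    return np.linalg.solve(GH @ G + eta * np.eye(G.shape[1]), GH @ u_des)

k = 2 * np.pi * 500 / 343.0        # 500 Hz, c = 343 m/s (assumed)
eta = 1e-4                         # regularization parameter (assumed value)

# L = 16 loudspeakers on a circle of radius 2 m (hypothetical layout).
ang = 2 * np.pi * np.arange(16) / 16
r_src = np.stack([2 * np.cos(ang), 2 * np.sin(ang), np.zeros(16)], axis=1)

# N = 81 control points on a 9 x 9 grid over a 0.8 m x 0.8 m target region.
g = np.linspace(-0.4, 0.4, 9)
r_ctrl = np.array([[x, y, 0.0] for x in g for y in g])

# Desired pressures at the control points: plane wave travelling along +x.
u_des = np.exp(1j * k * r_ctrl[:, 0])

G = greens_function(r_ctrl, r_src, k)
d = pressure_matching(G, u_des, eta)
err_db = 10 * np.log10(np.sum(np.abs(G @ d - u_des)**2) / np.sum(np.abs(u_des)**2))
```

The error is only controlled at the control points themselves, which is why a fine discretization of the target region is needed for this method.
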
  28. Weighted Pressure Matching

    Pressure matching for a continuous region based on kernel interpolation of the sound field [Koyama+ JAES 2023]

    ➢ Original cost function is approximated as
      $$J \approx \int_{\Omega} \left| \boldsymbol{\kappa}(\mathbf{r})^{\mathsf{T}} (\mathbf{K} + \xi \mathbf{I})^{-1} (\mathbf{G}\mathbf{d} - \mathbf{u}^{\mathrm{des}}) \right|^2 \mathrm{d}\mathbf{r} = (\mathbf{G}\mathbf{d} - \mathbf{u}^{\mathrm{des}})^{\mathsf{H}} \mathbf{W} (\mathbf{G}\mathbf{d} - \mathbf{u}^{\mathrm{des}})$$
      with
      $$\mathbf{W} = \int_{\Omega} \mathbf{z}(\mathbf{r})^{*} \mathbf{z}(\mathbf{r})^{\mathsf{T}} \mathrm{d}\mathbf{r}, \qquad \mathbf{z}(\mathbf{r})^{\mathsf{T}} := \boldsymbol{\kappa}(\mathbf{r})^{\mathsf{T}} (\mathbf{K} + \xi \mathbf{I})^{-1} \quad \text{(kernel interpolation)}$$
    ➢ Driving signals are obtained as the weighted least-squares solution
      $$\hat{\mathbf{d}} = \operatorname*{arg\,min}_{\mathbf{d} \in \mathbb{C}^L} \; (\mathbf{G}\mathbf{d} - \mathbf{u}^{\mathrm{des}})^{\mathsf{H}} \mathbf{W} (\mathbf{G}\mathbf{d} - \mathbf{u}^{\mathrm{des}}) + \eta \|\mathbf{d}\|^2 = (\mathbf{G}^{\mathsf{H}} \mathbf{W} \mathbf{G} + \eta \mathbf{I})^{-1} \mathbf{G}^{\mathsf{H}} \mathbf{W} \mathbf{u}^{\mathrm{des}}$$
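
A possible numerical realization of the weighting matrix and the weighted least-squares solution is sketched below. It rests on several assumptions that are not in the transcript: the kernel is the uniform-weight j0(k||r - r'||), the regional integral is replaced by an average over a dense grid of points, the loudspeakers are free-field monopoles, and all names and parameter values are hypothetical.

```python
import numpy as np

def sinc_kernel(ra, rb, k):
    """Assumed uniform-weight kernel j0(k * ||ra - rb||) for all point pairs."""
    dist = np.linalg.norm(ra[:, None, :] - rb[None, :, :], axis=-1)
    return np.sinc(k * dist / np.pi)

def wpm_weight(r_ctrl, r_quad, k, measure, xi=1e-3):
    """Approximate W = int_Omega z(r)^* z(r)^T dr by averaging over points r_quad."""
    K = sinc_kernel(r_ctrl, r_ctrl, k)
    # Column q holds z(r_q) = (K + xi I)^{-1} kappa(r_q); K is symmetric here.
    Z = np.linalg.solve(K + xi * np.eye(len(r_ctrl)), sinc_kernel(r_ctrl, r_quad, k))
    return (measure / r_quad.shape[0]) * (Z.conj() @ Z.T)

def weighted_pressure_matching(G, u_des, W, eta):
    """Weighted least squares: (G^H W G + eta I)^{-1} G^H W u_des."""
    GHW = G.conj().T @ W
    return np.linalg.solve(GHW @ G + eta * np.eye(G.shape[1]), GHW @ u_des)

def grid(half_width, n):
    """n x n grid of points in the z = 0 plane."""
    g = np.linspace(-half_width, half_width, n)
    return np.array([[x, y, 0.0] for x in g for y in g])

k = 2 * np.pi * 500 / 343.0        # 500 Hz, c = 343 m/s (assumed)
eta = 1e-4
ang = 2 * np.pi * np.arange(16) / 16
r_src = np.stack([2 * np.cos(ang), 2 * np.sin(ang), np.zeros(16)], axis=1)
r_ctrl = grid(0.4, 5)              # N = 25 sparse control points
r_quad = grid(0.4, 21)             # dense points standing in for the integral over Omega

dist = np.linalg.norm(r_ctrl[:, None, :] - r_src[None, :, :], axis=-1)
G = np.exp(1j * k * dist) / (4 * np.pi * dist)   # assumed free-field monopoles
u_des = np.exp(1j * k * r_ctrl[:, 0])            # plane wave along +x

W = wpm_weight(r_ctrl, r_quad, k, measure=0.8 * 0.8)
d_wpm = weighted_pressure_matching(G, u_des, W, eta)
```

Setting `W` to the identity matrix recovers ordinary pressure matching, so the weighting is the only difference between the two methods in this sketch.
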
  29. Weighted Pressure Matching

    ➢ Comparison between Pressure Matching (PM) and Weighted Pressure Matching (WPM)

    [Figure: pressure and error distributions for PM, WPM (uniform), and WPM (directional)]
  30. Spatial Aliasing Artifacts in Sound Field Reproduction

    ➢ Owing to the discrete placement of secondary sources, spatial aliasing artifacts are unavoidable
      – E.g., synthesizing a sound field by 12 loudspeakers at 800 Hz

    [Figure: pressure distributions for Desired and Pressure Matching]

    ▪ Degradation of sound localization
    ▪ Coloration of source signals
  31. Idea for perceptual quality enhancement

    ➢ ILD is the dominant cue for horizontal sound localization above 1500 Hz, compared with ITD
    ➢ Amplitude response should be synthesized as accurately as possible, rather than phase response, to alleviate coloration effects

    Synthesizing the amplitude (or magnitude) distribution, leaving the phase distribution arbitrary at high frequencies

    Applying amplitude matching for high frequencies

    [Figure labels: Pressure; Magnitude]
  32. Amplitude Matching

    Ø Synthesizing the desired magnitude at control points
      – By leaving the phase arbitrary, the number of parameters to be controlled can be reduced
      – First proposed for multizone sound field control for personal audio
    Ø Optimization problem of amplitude matching
      [Equation, annotated with: target region, secondary sources, desired magnitude, element-wise absolute value]
      – No closed-form solution, but the majorization-minimization (MM) algorithm or the alternating direction method of multipliers (ADMM) can be applied
    [Abe+ IEEE/ACM TASLP 2023]
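The MM iteration mentioned above can be sketched in a few lines. This is a minimal illustration, not the implementation of [Abe+ IEEE/ACM TASLP 2023]. It assumes the cost is the squared error between |Gd| and a desired magnitude vector `a`, plus a regularization term `lam * ||d||^2`. The names `G`, `a`, `lam`, and the random transfer matrix are all hypothetical. Each step fixes the phase of the current pressures, which gives a quadratic surrogate that upper-bounds the cost and has a closed-form minimizer.

```python
import cmath
import random


def matvec(G, d):
    """Pressures at the control points: p = G d."""
    return [sum(g * x for g, x in zip(row, d)) for row in G]


def solve(A, b):
    """Solve a small complex system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - rest) / aug[i][i]
    return x


def am_cost(G, d, a, lam):
    """Assumed amplitude-matching cost: || |G d| - a ||^2 + lam * ||d||^2."""
    fit = sum((abs(p) - ai) ** 2 for p, ai in zip(matvec(G, d), a))
    return fit + lam * sum(abs(x) ** 2 for x in d)


def amplitude_matching_mm(G, a, lam=1e-3, n_iter=30):
    """MM iterations; returns the driving signals and the cost history."""
    n_pts, n_src = len(G), len(G[0])
    # G^H G + lam I does not change between iterations.
    A = [[sum(G[m][i].conjugate() * G[m][j] for m in range(n_pts))
          + (lam if i == j else 0.0) for j in range(n_src)]
         for i in range(n_src)]
    d = [1.0 + 0j] * n_src  # arbitrary nonzero starting point
    costs = [am_cost(G, d, a, lam)]
    for _ in range(n_iter):
        p = matvec(G, d)
        # Surrogate target: desired magnitude combined with the current phase.
        t = [ai * cmath.exp(1j * cmath.phase(pi)) for ai, pi in zip(a, p)]
        rhs = [sum(G[m][i].conjugate() * t[m] for m in range(n_pts))
               for i in range(n_src)]
        d = solve(A, rhs)  # minimizer of ||G d - t||^2 + lam ||d||^2
        costs.append(am_cost(G, d, a, lam))
    return d, costs


# Hypothetical toy problem: 8 control points, 4 secondary sources, unit magnitude.
random.seed(0)
G = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
     for _ in range(8)]
a = [1.0] * 8
d_hat, costs = amplitude_matching_mm(G, a)
```

Because the surrogate touches the true cost at the current iterate and lies above it elsewhere, the cost history `costs` never increases. The iteration can still stop at a local minimum, which is one reason ADMM is listed as an alternative.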
  33. Amplitude Matching for Multizone Sound Field Control

    Ø Synthesizing different sound fields in multiple zones
      – E.g., application to personal audio
    [Figure: two sound zones]
    Amplitude matching is suitable when the direction of sound does not need to be synthesized (e.g., audible and non-audible regions)
  34. Amplitude Matching for Multizone Sound Field Control

    Full version: https://youtu.be/oYw7kmpZcY4
  35. Proposed Method for Perceptual Quality Enhancement

    Ø Combination of pressure and amplitude matching
      $$\mathop{\mathrm{minimize}}_{\mathbf{d} \in \mathbb{C}^L} \; J(\mathbf{d}) := (1-\gamma) \|\mathbf{G}\mathbf{d} - \mathbf{u}^{\mathrm{des}}\|_2^2 + \gamma \| |\mathbf{G}\mathbf{d}| - |\mathbf{u}^{\mathrm{des}}| \|_2^2 + \lambda \|\mathbf{d}\|_2^2$$
      – The first term corresponds to pressure matching and the second to amplitude matching
      – $\gamma$ is determined so that $\gamma = 0$ for low frequencies and $\gamma = 1$ for high frequencies
      – For example, $\gamma$ can be defined as a sigmoid function of $\omega$, where $\omega_{\mathrm{T}}$ is the transition frequency:
        $$\gamma(\omega) = \frac{1}{1 + e^{-\frac{\sigma}{2\pi}(\omega - \omega_{\mathrm{T}})}}$$
      – Can still be solved by the MM algorithm or ADMM
    [Kimura+ IEEE WASPAA 2023]
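As a rough illustration of how the frequency-dependent weight blends the two criteria, the sketch below evaluates a sigmoid weight and the combined cost for a toy problem. It is a sketch under stated assumptions, not the code of [Kimura+ IEEE WASPAA 2023]. The steepness parameter `sigma`, its scaling by 2π, the 2 kHz transition frequency, and the regularization weight `lam` are all assumptions chosen for the example.

```python
import math


def gamma_weight(omega, omega_t, sigma):
    """Sigmoid weight: near 0 well below omega_t, near 1 well above it.

    sigma (steepness) and the division by 2*pi are assumed for illustration.
    """
    x = sigma * (omega - omega_t) / (2.0 * math.pi)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # numerically safe branch for large negative x
    return e / (1.0 + e)


def combined_cost(G, d, u_des, gam, lam):
    """(1 - gam) * pressure-matching error + gam * amplitude-matching error + lam * ||d||^2."""
    p = [sum(g * x for g, x in zip(row, d)) for row in G]
    pm = sum(abs(pi - ui) ** 2 for pi, ui in zip(p, u_des))
    am = sum((abs(pi) - abs(ui)) ** 2 for pi, ui in zip(p, u_des))
    reg = sum(abs(x) ** 2 for x in d)
    return (1.0 - gam) * pm + gam * am + lam * reg


# Hypothetical transition at 2 kHz with steepness 0.01.
omega_t = 2.0 * math.pi * 2000.0
g_low = gamma_weight(2.0 * math.pi * 100.0, omega_t, 0.01)
g_mid = gamma_weight(omega_t, omega_t, 0.01)
g_high = gamma_weight(2.0 * math.pi * 8000.0, omega_t, 0.01)

# Toy field with the right magnitudes but the wrong phases.
G = [[1.0, 0.0], [0.0, 1.0]]
d = [-1.0 + 0j, 1j]
u_des = [1.0 + 0j, 1.0 + 0j]
cost_pm_only = combined_cost(G, d, u_des, gam=0.0, lam=0.0)
cost_am_only = combined_cost(G, d, u_des, gam=1.0, lam=0.0)
```

In the toy field the phase error is penalized fully when the weight is 0 and ignored when it is 1. This is the intended behavior: phase accuracy is enforced at low frequencies and relaxed at high frequencies.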
  36. Numerical Experiments

    Ø Setting
      – 3D free field
      – Target region $\Omega$: cuboid of 1.0 m × 1.0 m × 0.04 m
      – 32 loudspeakers on the borders of squares of 2.0 m × 2.0 m at z = ±0.1 m
      – 1152 control points regularly placed at intervals of 0.04 m
      – Desired sound field: point source at (2.0 m, 0.0 m, 0.0 m)
      – The proposed method and pressure matching (PM) are compared
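The setting above can be approximated with a short geometry script. The slide gives only the counts and dimensions, so the exact layout here is assumed: a 24 × 24 grid in two layers at z = ±0.02 m (which gives 1152 points at 0.04 m spacing), and 16 loudspeakers per square spaced 0.5 m apart starting at the corners. The sound speed, the 1 kHz test frequency, and the sign convention of the free-field Green's function are also assumptions.

```python
import cmath
import math

SOUND_SPEED = 343.0  # m/s, assumed


def control_points(n_xy=24, spacing=0.04, z_levels=(-0.02, 0.02)):
    """Regular grid centered at the origin (assumed layout: 24 x 24 x 2 = 1152 points)."""
    offset = (n_xy - 1) * spacing / 2.0
    return [(i * spacing - offset, j * spacing - offset, z)
            for z in z_levels
            for i in range(n_xy)
            for j in range(n_xy)]


def square_border(n=16, side=2.0, z=0.0):
    """n equally spaced points on the border of a square, corners included (assumed)."""
    per_side = n // 4
    step = side / per_side
    h = side / 2.0
    pts = []
    for k in range(per_side):
        t = -h + k * step
        pts.append((t, -h, z))   # bottom edge
        pts.append((h, t, z))    # right edge
        pts.append((-t, h, z))   # top edge
        pts.append((-h, -t, z))  # left edge
    return pts


def point_source_pressure(r, r_src, freq_hz):
    """Free-field Green's function exp(-j k R) / (4 pi R); sign convention assumed."""
    k = 2.0 * math.pi * freq_hz / SOUND_SPEED
    dist = math.dist(r, r_src)
    return cmath.exp(-1j * k * dist) / (4.0 * math.pi * dist)


ctrl = control_points()
spk = square_border(z=0.1) + square_border(z=-0.1)
# Desired field of the point source at (2.0, 0.0, 0.0) m, at a hypothetical 1 kHz.
u_des = [point_source_pressure(r, (2.0, 0.0, 0.0), 1000.0) for r in ctrl]
```

The same `point_source_pressure` function could also fill a transfer matrix between loudspeakers and control points if each loudspeaker is modeled as a point source, which is a common simplification in free-field simulations.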
  37. Numerical Experiments

    Ø Evaluation of ILD
      – Binaural signals in the synthesized sound field were calculated by using transfer functions from the loudspeakers to a listener, obtained by Mesh2HRTF [Ziegelwanger+ 2015]
      – The evaluation measure was the normalized error (NE) of ILD, where $\mathbf{r}_{\mathrm{H}}$ and $\phi_{\mathrm{H}}$ denote the position and direction of the listener's head:
        $$\mathrm{NE}(\mathbf{r}_{\mathrm{H}}) = \frac{\sum_{\phi_{\mathrm{H}}} |\mathrm{ILD}_{\mathrm{syn}}(\mathbf{r}_{\mathrm{H}}, \phi_{\mathrm{H}}) - \mathrm{ILD}_{\mathrm{true}}(\mathbf{r}_{\mathrm{H}}, \phi_{\mathrm{H}})|}{\sum_{\phi_{\mathrm{H}}} |\mathrm{ILD}_{\mathrm{true}}(\mathbf{r}_{\mathrm{H}}, \phi_{\mathrm{H}})|}$$
      – Distribution of NE
        [Figure: NE distributions for PM and Proposed]
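The normalized ILD error can be computed directly once the ILDs for each head direction are available. The sketch below assumes that ILD is the left-to-right energy ratio in decibels, a common definition that the slide does not spell out. The function names and sample values are hypothetical.

```python
import math


def ild_db(left, right):
    """ILD in dB as the left/right energy ratio (assumed definition)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)


def normalized_ild_error(ild_syn, ild_true):
    """NE at one head position: sums run over the head directions."""
    num = sum(abs(s - t) for s, t in zip(ild_syn, ild_true))
    den = sum(abs(t) for t in ild_true)
    return num / den


# Hypothetical ILDs (dB) at one head position for two head directions.
ild_true = [2.0, -4.0]
ild_syn = [1.0, -2.0]
ne = normalized_ild_error(ild_syn, ild_true)

# A left-ear signal with twice the amplitude gives an ILD of about 6 dB.
ild_example = ild_db([2.0, 2.0], [1.0, 1.0])
```

An NE of 0 means the synthesized ILDs match the true ones for every head direction. The normalization makes the values comparable across head positions where the true ILDs differ in size.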
  38. Listening Experiments

    Ø Evaluation by MUSHRA
      – Desired sound field: point source at (2.0 m, 0.5 m, 0.0 m)
      – Reverberation time ($T_{60}$): 0.19 s
      – 14 male subjects in their 20s and 30s
      – Listening at the center of the target region
      – Test signals:
        • Reference: source signal from the reference loudspeaker
        • C1/Hidden anchor: source signal lowpass-filtered up to 3.5 kHz
        • C2/PM: sound synthesized by PM
        • C3/Proposed: sound synthesized by the proposed method
        • C4/Hidden reference: same as the reference
  39. Listening Experiments

    Ø Results for two source signals (Vocals/Instrumental)
    [Figure: scores for Vocals and Instrumental under C1/Hidden anchor, C2/PM, C3/Proposed, and C4/Hidden reference]
    The sound synthesized by the proposed method is perceptually closer to the reference sound than the sound synthesized by PM
  40. Future Challenges

    Ø Further reducing the number of mics and loudspeakers
      – Further development of PIML-based techniques is necessary
    Ø HRTF individualization for binaural reproduction
      – No established technique so far, but recent ML techniques might be applied
    Ø Further integration of perception-based and physics-based techniques
      – Technologies that integrate perception-based and physics-based methods at a higher level are required
  41. Conclusion

    Ø Physics-based sound field capturing and reproduction
    [Diagram: basic technologies of sound field estimation and control, with applications: VR/AR audio, active noise control, local-field recording and reproduction, signal enhancement, visualization/auralization, room acoustic analysis]