Robust Sound Field Reproduction against Listener’s Movement Utilizing Image Sensor
1. Robust Sound Field Reproduction against Listener’s Movement Utilizing Image Sensor
Toshihide Aketo, Hiroshi Saruwatari, Satoshi Nakamura
(Nara Institute of Science and Technology, Japan)
2. Outline
Research background
Conventional method
  Spectral Division Method
  Local sound field synthesis
Proposed method
  Equiangular filter
  Sound field reproduction system utilizing image sensor
Simulation experiment
Subjective assessment
  on directional perception
  on sound quality
3. Research background (1/3)
Objective of a sound field reproduction (SFR) system
To reproduce the primary sound field in another space over a wide range and with high accuracy.
However, such a system is difficult to realize because it becomes large and its configuration becomes complex.
Therefore, recent research focuses on reproducing the sound field over a wide range and with high accuracy using a small and simple system.
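[Figure: types of SFR system, ordered from complex to simple]
Loudspeaker arrangements: surrounded (large and complex); circular or spherical (a little complex); linear or planar (simple)
Methods shown: boundary surface control (BoSC), Ambisonics, stereo or surround system, wave field synthesis (WFS)
Figure labels: Focused, Complex, Simple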
4. Research background (2/3)
Spectral Division Method (SDM) [J. Ahrens, S. Spors, 2008]
An SFR method that reproduces the sound field by synthesizing a number of wavefronts.
This method can be realized with a simple system such as a linear loudspeaker array.
However, SDM has two problems.
Problem 1: A sound pressure error occurs when the reference listening line does not match the listener's position.
Problem 2: The wavefront is disturbed by spatial aliasing.
[Figure: SDM currently gives low reproduction accuracy over a wide reproduction region; the target is high reproduction accuracy.]
We aim to reproduce the sound field with high accuracy by solving these problems in SDM.
5. Research background (3/3)
To cope with these problems, we propose a novel SFR system with a linear loudspeaker array that combines listener position estimation by Kinect with SDM and local sound field synthesis.
[Figure: adding the image sensor (Kinect) and local sound field synthesis changes the system from low reproduction accuracy over a wide reproduction region to high reproduction accuracy in a reproduction region localized around the listener.]
6. Spectral Division Method (SDM) [J. Ahrens, S. Spors, 2008]
[Figure: the primary source, the nth secondary source, and the reference listening line, shown in the spatial domain and in the wavenumber domain. The two domains are linked by the Fourier transform and the IDFT.]
Equations on the slide: the driving function in the wavenumber domain, and the driving function in the spatial domain.
$\omega$: angular frequency
$k_x$: wavenumber in $x$-direction
$c$: speed of sound
$j$: imaginary unit
$y_{\mathrm{ref}}$: reference listening distance
$K_0$: zero-th order modified Bessel function of the second kind
$H_0^{(2)}$: zero-th order Hankel function of the second kind
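As a hedged reference sketch, the general SDM relation behind these two driving functions can be written as below. It follows the standard formulation of the method rather than the slide's own equations, which are not preserved in the text. The symbols $\tilde{D}$, $\tilde{S}$, $\tilde{G}$ and $D$, the use of $j$ for the imaginary unit, and the sign convention of the inverse transform are assumptions introduced here.

```latex
% Driving function in the wavenumber domain: desired field spectrum on the
% reference listening line divided by the secondary-source spectrum there.
\tilde{D}(k_x, \omega) =
  \frac{\tilde{S}(k_x, y_{\mathrm{ref}}, \omega)}
       {\tilde{G}(k_x, y_{\mathrm{ref}}, \omega)}

% Spectrum of a monopole secondary source (assumed e^{j \omega t} convention).
\tilde{G}(k_x, y, \omega) =
\begin{cases}
  -\dfrac{j}{4}\, H_0^{(2)}\!\left(\sqrt{(\omega/c)^2 - k_x^2}\; y\right),
    & |k_x| < |\omega/c| \\[2ex]
  \dfrac{1}{2\pi}\, K_0\!\left(\sqrt{k_x^2 - (\omega/c)^2}\; y\right),
    & |k_x| > |\omega/c|
\end{cases}

% Driving function in the spatial domain: inverse Fourier transform over k_x.
D(x, \omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty}
  \tilde{D}(k_x, \omega)\, e^{-j k_x x}\, \mathrm{d}k_x
```

Because the reference listening distance appears explicitly in $\tilde{D}$, the driving function is matched to one listening distance only.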
7. Spectral Division Method (SDM) [J. Ahrens, S. Spors, 2008]
Problems in SDM
A sound pressure error occurs when the reference listening line does not match the listener's position.
The wavefront is disturbed by spatial aliasing.
8. Problem 1 : sound pressure error
Under the 2.5-dimensional synthesis condition, the sound pressure is correctly reproduced only on the reference listening line.
[Figure: primary sound field and reproduced sound field. The sound pressure is correctly reproduced on the reference listening line; a sound pressure error occurs outside the reference listening line.]
Therefore, to reproduce the sound field correctly at the listener's position, we must set the reference listening distance equal to the listener's distance.
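To illustrate why the reference listening distance matters, the sketch below estimates the level error away from the reference listening line. It assumes the simpler case of a virtual plane wave, for which 2.5-dimensional synthesis with a linear array is commonly approximated as having an amplitude proportional to the square root of (reference distance / listener distance). The monopole primary source used in this presentation differs in detail. The function name `level_error_db` and the example distances are hypothetical.

```python
import math


def level_error_db(listener_distance_m, reference_distance_m):
    """Approximate level error in dB at a given distance from the array.

    Assumption: virtual plane wave, synthesized amplitude proportional to
    sqrt(reference_distance / listener_distance). The error is 0 dB on the
    reference listening line.
    """
    if listener_distance_m <= 0 or reference_distance_m <= 0:
        raise ValueError("distances must be positive")
    return 10.0 * math.log10(reference_distance_m / listener_distance_m)


# Hypothetical example: reference line at 1.0 m, listener at several distances.
for distance in (0.5, 1.0, 2.0):
    print(distance, round(level_error_db(distance, 1.0), 2))
```

Under this approximation the level changes by about 3 dB for each doubling or halving of distance, so the error vanishes when the reference distance tracks the listener.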
9. Problem 2: spatial aliasing (1/2)
[Figure: magnitude (dB) of the driving function spectrum before and after discretization of the secondary source. After discretization, spectral overlap occurs.]
In SDM, discretization of the secondary source causes spectral overlap of the driving function, and the filter power at high frequencies becomes larger, as in the right figure.
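The link between loudspeaker spacing and spectral overlap can be sketched numerically. Sampling the secondary source at a fixed spacing repeats the driving function spectrum at multiples of 2π/spacing along the wavenumber axis, and a commonly used estimate of the aliasing frequency is c/(2 × spacing). The spacing of 0.085 m and the speed of sound of 343 m/s are assumptions, chosen only because they give a value near the approximately 2019 Hz quoted in this presentation; the actual spacing is not stated.

```python
import math


def aliasing_frequency(spacing_m, speed_of_sound=343.0):
    """Common estimate of the spatial aliasing frequency in Hz."""
    return speed_of_sound / (2.0 * spacing_m)


def spectral_repetition_centers(spacing_m, count=2):
    """Centers (rad/m) of the spectral repetitions caused by spatial sampling."""
    step = 2.0 * math.pi / spacing_m
    return [n * step for n in range(-count, count + 1)]


spacing = 0.085  # hypothetical loudspeaker spacing in metres
print(round(aliasing_frequency(spacing), 1))
print([round(center, 2) for center in spectral_repetition_centers(spacing)])
```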
10. Problem 2: spatial aliasing (2/2)
The effect of spectral overlap in the wavenumber domain appears as spatial aliasing in the spatial domain.
[Figure: synthesized wavefront (amplitude) for a continuous array and for a discrete array. Discretization of the secondary source disturbs the wavefront.]
11. Local sound field synthesis (1/2) [J. Ahrens, S. Spors, 2011]
[Figure: rectangular window for the spectrum of the driving function, magnitude in dB. Left: spectral overlap occurs. Right: spectral overlap is suppressed.]
By applying a rectangular window to the spectrum in the left figure, we can suppress the spectral overlap, as in the right figure.
Local sound field synthesis: a method that suppresses spatial aliasing by limiting the spatial bandwidth in the wavenumber domain.
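The band-limiting step can be sketched as a rectangular window on the wavenumber axis. This is a minimal illustration, not the authors' implementation: the function names, the window center and bandwidth, and the sample spectrum values are all hypothetical.

```python
def rectangular_window(kx_values, center, bandwidth):
    """Return 1.0 inside the pass band |kx - center| <= bandwidth / 2, else 0.0."""
    half_width = bandwidth / 2.0
    return [1.0 if abs(kx - center) <= half_width else 0.0 for kx in kx_values]


def apply_window(spectrum, window):
    """Multiply a (possibly complex) spectrum by the window, bin by bin."""
    return [value * weight for value, weight in zip(spectrum, window)]


# Hypothetical wavenumber grid (rad/m) and a flat placeholder spectrum.
kx_grid = [-30.0, -20.0, -10.0, 0.0, 10.0, 20.0, 30.0]
spectrum = [1.0 + 0.0j] * len(kx_grid)

window = rectangular_window(kx_grid, center=10.0, bandwidth=20.0)
limited = apply_window(spectrum, window)
print(window)
```

Keeping the pass band narrower than the spacing between spectral repetitions is what prevents the repetitions from overlapping; the price is that the field is reproduced correctly only in a limited region.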
12. Local sound field synthesis (2/2) [J. Ahrens, S. Spors, 2011]
By applying a rectangular window, we can suppress the disturbance of the wavefront and increase the maximum frequency at which the sound field can be correctly reproduced.
[Figure: synthesized wavefront (amplitude), unfiltered and filtered. Unfiltered: spatial aliasing occurs. Filtered: the disturbance of the wavefront is suppressed, but the reproduction area is localized.]
Therefore, to take advantage of this method, it is necessary to design a filter that precisely controls the reproduced direction.
13. Equiangular filter
In order to design a filter that accurately controls the reproduced direction, we derive the relation between the reproduced direction $\theta$, the wavenumber in the $x$-direction $k_x$, and the frequency $f$.
[Equation: relation between $k_x$, $\theta$, $f$, and $c$, with the annotations "constant" and "proportional".]
$k_x$: wavenumber in $x$-direction
$c$: speed of sound
$\theta$: reproduced direction
$f$: frequency
If the reproduced direction $\theta$ is constant, $k_x$ is proportional to $f$, so we design a new filter as follows.
[Equation: definition of the equiangular filter in terms of the angular frequency $\omega$, the wavenumber $k$, and the angular width.]
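The idea of a pass band that keeps a constant angle across frequency can be sketched as follows. The slide's own equations are not preserved, so this is an assumed form: the reproduced direction is measured from the array normal, the wavenumber along the array is taken as (2πf/c)·sin(direction), and the pass band spans the direction plus or minus half the angular width. All function and parameter names are hypothetical.

```python
import math


def equiangular_band(freq_hz, direction_deg, width_deg, speed_of_sound=343.0):
    """Return (kx_low, kx_high) in rad/m for the assumed equiangular pass band.

    Assumes |direction| + width / 2 <= 90 degrees so that sin() is monotonic.
    """
    k = 2.0 * math.pi * freq_hz / speed_of_sound
    kx_low = k * math.sin(math.radians(direction_deg - width_deg / 2.0))
    kx_high = k * math.sin(math.radians(direction_deg + width_deg / 2.0))
    return kx_low, kx_high


def equiangular_filter(kx, freq_hz, direction_deg, width_deg, speed_of_sound=343.0):
    """Return 1.0 if kx lies inside the pass band at this frequency, else 0.0."""
    kx_low, kx_high = equiangular_band(freq_hz, direction_deg, width_deg, speed_of_sound)
    return 1.0 if kx_low <= kx <= kx_high else 0.0


# Hypothetical example: direction 0 degrees, angular width 20 degrees.
low_1k, high_1k = equiangular_band(1000.0, 0.0, 20.0)
low_2k, high_2k = equiangular_band(2000.0, 0.0, 20.0)
print(round(high_1k, 3), round(high_2k, 3))
```

The band edges scale linearly with frequency, which is the proportionality the filter design relies on; a rectangular window of fixed width would instead cover a different range of angles at each frequency.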
14. Result of applying the equiangular filter (1/2)
An example of applying the designed filter to the spectrum of the driving function.
This is the case where the reproduced direction is (value not preserved) and the angular width is (value not preserved).
[Figure: equiangular filter for the spectrum of the driving function, magnitude in dB. Before filtering: spectral overlap occurs. After filtering: spectral overlap is suppressed.]
The equiangular filter used in this presentation is combined with a low-pass filter: components above the maximum frequency are cut, and the sound field is not reproduced at those frequencies.
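How the maximum frequency might depend on the reproduced direction can be sketched under one simple criterion. The authors' criterion is not preserved in the slide text, so the following is an assumption: the outer edge of the pass band must stay within the baseband limit π/spacing of the sampled array, with the direction measured from the array normal. The spacing of 0.085 m and all names are hypothetical.

```python
import math


def max_frequency(direction_deg, width_deg, spacing_m, speed_of_sound=343.0):
    """Highest frequency (Hz) whose pass band edge stays below pi / spacing.

    Assumed criterion: (2*pi*f/c) * sin(|direction| + width/2) <= pi / spacing.
    """
    edge_deg = min(abs(direction_deg) + width_deg / 2.0, 90.0)
    sine = math.sin(math.radians(edge_deg))
    if sine == 0.0:
        return math.inf
    return speed_of_sound / (2.0 * spacing_m * sine)


# Hypothetical example: angular width 20 degrees, spacing 0.085 m.
for direction in (0.0, 30.0, 60.0):
    print(direction, round(max_frequency(direction, 20.0, 0.085)))
```

Under this assumed criterion the maximum frequency falls as the direction moves away from the array normal and approaches the ordinary aliasing frequency at grazing angles, which agrees with the later remark that the maximum frequency becomes lower as the angle of the reproduced direction becomes larger.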
15. Result of applying the equiangular filter (2/2)
By applying the equiangular filter, we can suppress the disturbance of the wavefront and reproduce the sound field in a specific direction.
[Figure: synthesized wavefront (amplitude), unfiltered and filtered. Unfiltered: spatial aliasing occurs. Filtered: the disturbance of the wavefront is suppressed.]
However, the sweet spot cannot be matched to the listener’s position if the listener’s direction is unknown in advance.
16. Summary of problems
Problems in SDM
A sound pressure error occurs when the reference listening distance does not match the listener's distance.
Spatial aliasing is caused by discretization of the secondary sources.
The second problem can be solved by applying an equiangular filter.
Problems in equiangular filter
The sweet spot cannot be matched to the listener’s position if the listener’s direction is unknown in advance.
These problems can be solved if we know the listener’s position; therefore, introducing the image sensor makes it possible to solve them.
17. Condition of simulation experiment
[Figure: simulation setup with the primary source (monopole source), the 34 ch linear secondary source array (monopole sources), and the reference listening line.]
Parameter name: parameter value
measurement plane: W4.0 D4.0
aliasing frequency: approximately 2019 Hz
angular width: (value not preserved)
reproduced direction: (value not preserved)
synthesis frequency: 3, 5 kHz
Evaluation score: [equation] computed from the radiation characteristic of the primary sound field and the radiation characteristic of the secondary sound field.
Assuming that the listener’s position is obtained by the image sensor, we calculate the reproduced direction from the sound source position and the listener's position.
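The geometric step can be sketched as follows. The coordinate system is an assumption: the array lies along the x-axis at y = 0, the primary source is behind it (y < 0), the listener is in front (y > 0), and the direction is measured from the array normal. The function names and example positions are hypothetical, and obtaining the position from the image sensor is outside this sketch.

```python
import math


def reproduced_direction_deg(source_xy, listener_xy):
    """Angle (degrees) of the source-to-listener line, measured from the +y axis."""
    dx = listener_xy[0] - source_xy[0]
    dy = listener_xy[1] - source_xy[1]
    return math.degrees(math.atan2(dx, dy))


def reference_listening_distance(listener_xy, array_y=0.0):
    """Distance (m) from the array line to the listener, used as the reference distance."""
    return abs(listener_xy[1] - array_y)


# Hypothetical positions in metres.
source = (0.0, -1.0)
listener = (1.0, 2.0)
print(round(reproduced_direction_deg(source, listener), 1))
print(reference_listening_distance(listener))
```

Updating both values whenever the estimated listener position changes is what lets the reference listening line and the pass band of the filter follow the listener.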
19. Results of simulation experiment
[Figure: synthesized wavefront (amplitude) and evaluated value at 3 kHz and at 5 kHz, with markers for the listener and the primary source.]
The sound field is correctly reproduced in the listener’s direction regardless of the frequency.
20. Condition of subjective assessment on directional perception
[Figure: listening setup. Labels: acoustically transparent curtain, primary source, answer number card, 34 ch linear loudspeaker array, loudspeaker distance, reference listening line, Pos 1, Pos 2, Pos 3.]
parameter name: parameter value
sampling frequency: 48 kHz
quantization bit rate: 16 bit
test sound: white Gaussian noise, 3 seconds
aliasing frequency: approximately 2019 Hz
angular width: (value not preserved)
sound source direction: (value not preserved)
number of evaluators: 7
type of sound source:
・sound source without bandwidth limitation (Conventional1)
・sound source with bandwidth limitation in frequencies under 2 kHz (Conventional2)
・sound source to which we applied the equiangular filter (Proposed)
Evaluation score: [equation] computed from the number of evaluators, the answered direction, and the true source direction.
As the evaluation procedure, we asked evaluators to answer at which card position they perceived the sound source.
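One way such a score could be computed is sketched below. The slide's exact formula is not preserved, so the mean absolute error between the answered directions and the true source direction is used here purely as an assumed stand-in. The function name and the seven answers are hypothetical illustration values, not measured data.

```python
def mean_absolute_direction_error(answered_deg, true_deg):
    """Mean absolute difference (degrees) between answers and the true direction."""
    if not answered_deg:
        raise ValueError("at least one answer is required")
    total = sum(abs(answer - true_deg) for answer in answered_deg)
    return total / len(answered_deg)


# Hypothetical answers from seven evaluators for a source at 0 degrees.
answers = [0.0, 10.0, -10.0, 0.0, 20.0, 0.0, 10.0]
score = mean_absolute_direction_error(answers, 0.0)
print(round(score, 2))
```

A lower value means that the perceived direction is closer to the true source direction.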
21. Results of subjective assessment on directional perception
[Figure: directional perception results at (a) Pos1, (b) Pos2, and (c) Pos3, on an axis from Good to Bad, for Conventional1 (without bandwidth limitation), Conventional2 (with bandwidth limitation in frequencies under 2 kHz), and Proposed (with the equiangular filter applied).]
Proposed is superior to Conventional1 and Conventional2 in Pos1 and Pos2.
However, Proposed is almost the same as Conventional2 in Pos3.
This is because, with the equiangular filter, the maximum frequency becomes lower as the angle of the reproduced direction becomes larger.
As the listener moves to the right (from Pos1 to Pos3), the directional perception error of Conventional1 becomes larger owing to spatial aliasing.
These results show the superiority of the proposed method in directional perception.
22. Condition of subjective assessment on sound quality
[Figure: listening setup. Labels: acoustically transparent curtain, primary source, reference loudspeaker, 34 ch linear loudspeaker array, loudspeaker distance, reference listening line, Pos 1, Pos 2, Pos 3.]
parameter name: parameter value
sampling frequency: 48 kHz
quantization bit rate: 16 bit
test sound: white Gaussian noise, 3 seconds
aliasing frequency: approximately 2019 Hz
angular width: (value not preserved)
sound source direction: (value not preserved)
number of evaluators: 7
type of sound source:
・sound source without bandwidth limitation (Conventional1)
・sound source with bandwidth limitation in frequencies under 2 kHz (Conventional2)
・sound source to which we applied the equiangular filter (Proposed)
As the evaluation procedure, we played two synthesized sounds after a reference sound radiated by the reference loudspeaker, and asked evaluators to answer which synthesized sound they perceived as closer to the reference sound.
23. Results of subjective assessment on sound quality
[Figure: sound quality results at (a) Pos1, (b) Pos2, and (c) Pos3, on an axis from Good to Bad, for Conventional1 (without bandwidth limitation), Conventional2 (with bandwidth limitation in frequencies under 2 kHz), and Proposed (with the equiangular filter applied).]
In all results, evaluators chose Conventional1 or Proposed, and did not choose Conventional2.
At all listener positions, more evaluators chose Conventional1 than Proposed.
This suggests that cutting the high-frequency region of the sound affects sound quality more than spatial aliasing does.
24. Conclusion
The objective of an SFR system is to reproduce the primary sound field in another space over as wide a range and with as high accuracy as possible.
Since a complex system is difficult to realize, an SFR method that uses a simple system is desired.
SDM can be realized with a simple system such as a linear loudspeaker array.
However, this method by itself cannot reproduce the sound field with high accuracy.
We proposed an SFR system that reproduces the sound field with high accuracy at the listener's position by estimating the listener's direction.
The results of the subjective assessment show the superiority of the proposed method in directional perception.
However, since no superiority was shown in sound quality, it is necessary to improve the equiangular filter so that the low-pass filter does not have to be applied.
Thank you for your attention!