First, we have to understand the properties of human ears. Ears, like
eyes, are among the most important input devices to the brain. We also
sense sound pressure through our skin. However, here we discuss ears only
as an input device to the brain.
Sound is transmitted from a sound source to our ears through the air. The ear
guides the incoming sound wave into the external auditory canal, which provides
some gain. Then the tympanic membrane (eardrum) is vibrated by the sound
pressure. The mechanical vibration is transmitted to three tiny bones. The
vibration of the bones is then transferred to the cochlea, and from there to
the cochlear nerves. The signals are then transmitted to the brain.
||This is the famous Fletcher-Munson curve. It shows how much SPL is needed
in each frequency band to hear the same loudness (in phons) as at 1 kHz.
To obtain the same ear output in the low-frequency range, we need a higher
SPL than at 1 kHz. This means that the speakers must deliver more energy
at lower frequencies, especially at lower phon levels.
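The link between "higher SPL" and "more speaker energy" can be made concrete with the standard decibel relations. The following is only a minimal sketch: the function names are made up for illustration, and the 20 dB figure is a hypothetical example, not a value read from the chart.

```python
import math

P_REF = 20e-6  # reference sound pressure in pascals (20 micropascals)


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB for an RMS sound pressure in pascals."""
    return 20.0 * math.log10(pressure_pa / P_REF)


def power_ratio(extra_db: float) -> float:
    """How many times more acoustic power an extra level of extra_db needs."""
    return 10.0 ** (extra_db / 10.0)


if __name__ == "__main__":
    # 1 Pa RMS is roughly 94 dB SPL.
    print(round(spl_db(1.0), 1))
    # Hypothetical case: a bass tone that needs 20 dB more SPL than 1 kHz
    # to sound equally loud requires 100 times the acoustic power.
    print(power_ratio(20.0))
```

Every extra 10 dB of SPL costs ten times the acoustic power, which is why low-frequency reproduction at the same loudness is so demanding on the speaker.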
|Data from Wikipedia
||Data from Wikipedia
||This curve comes from more recent research. The data is from ISO 226,
updated in 2003. The difference between higher and lower phon levels in the
low-frequency range is much larger than in the Fletcher-Munson curve. ISO
defines these curves as the “Equal Loudness Curve”. So, we refer to these
curves when we compensate for “Equal Loudness” in a listening room.
||The compensation curves are shown. High-frequency compensation is not needed
because the spacing between the curves is almost constant there. Therefore, in
practice only low-frequency compensation is applied to make the sound more realistic.
|Data from Wikipedia (ISO226-2003)
||Data from Yuichi's article in MJ
This is the horizontal directivity of human ears, measured at 8 kHz. The curve
for one side (right) is mirrored to generate the curve for the other side
(left), and the two are combined for better visibility on this chart. The
width of the face is neglected. The horizontal directivity is fairly uniform
over all angles.
||This is the vertical directivity of human ears, also measured at 8 kHz.
The curve for one side (right) is mirrored to generate the curve for the
other side (left), and the two are combined for better visibility on this
chart. The width of the face is neglected. Sensitivity is almost flat in the
upward direction, but poor in the downward direction because of the body.
|Data: from Hirotake Yoshizawa and Makoto Namekata,
Gunma Sangyou Gijutu Center 2004
||Data: from Hirotake Yoshizawa and Makoto Namekata,
Gunma Sangyou Gijutu Center 2004
||I am not so happy to see this data. Unfortunately, it was measured by
researchers and is now an ISO standard. Anyway, people get old, and their
high-frequency sensitivity declines. This is the data for males. Surprisingly
enough, the degradation starts from 2 kHz, and from age 40.
||This is the same data for females. The high-frequency sensitivity of women's
ears also declines, but not as much as in the male case. Women's ears keep
better sensitivity than men's. My God!
|Edited from Data: ISO-7029
||Edited from Data: ISO-7029
This is the STI (Speech Transmission Index) and its classification of how
easily speech is understood. The X axis shows the reverberation time (sec)
and the Y axis shows the STI grade. ISO defines five STI classes:
“Excellent”, “Good”, “Fair”, “Poor” and “Bad”.
||This data shows how the STI shifts with age. My God! Again, the aging
of human ears degrades speech recognition capability. Therefore, older
people have to listen to music or vocals in a noise-free environment.
Otherwise, they get frustrated.
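The five STI classes mentioned above can be sketched as a simple lookup. The band edges used here (0.30, 0.45, 0.60, 0.75) are the commonly quoted ones and are an assumption on my part; they are not read from the charts, and the function name is only illustrative.

```python
def sti_class(sti: float) -> str:
    """Map an STI value (0.0 to 1.0) to one of the five quality classes.

    The band edges are the commonly quoted ones (assumption, see text).
    """
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must be between 0.0 and 1.0")
    if sti < 0.30:
        return "Bad"
    if sti < 0.45:
        return "Poor"
    if sti < 0.60:
        return "Fair"
    if sti < 0.75:
        return "Good"
    return "Excellent"


if __name__ == "__main__":
    for value in (0.2, 0.5, 0.8):
        print(value, sti_class(value))
```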
|Data from Yuichi's article in MJ
Magazine, ISO and AIST No.26
||Data: Edited from AIST No.26 presentation
The data above comes from statistics on human ears. Needless to say, the
performance and properties depend on the individual person: on their nature,
past training, the environment where they have lived, and so on. It is not
difficult to check our own frequency sensitivity by a conventional method.
We can use a sine wave generator from a free download site and compare what
we hear with the microphone output. Please start from ZERO volume on the
amplifier and use a high-quality speaker. Then check your sensitivity at
5 kHz, 10 kHz and 15 kHz.
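If no sine wave generator is at hand, a test tone can also be written to a WAV file with a few lines of Python. This is a minimal sketch using only the standard library; the function name, the file names and the default amplitude of 10% of full scale are my own assumptions, chosen to keep the level low.

```python
import math
import struct
import wave


def write_sine_wav(path, freq_hz, seconds=2.0, amplitude=0.1, rate=48000):
    """Write a mono 16-bit WAV file containing a sine tone.

    amplitude is a fraction of full scale (0.0 to 1.0), kept low by default.
    Returns the number of samples written.
    """
    if not 0.0 <= amplitude <= 1.0:
        raise ValueError("amplitude must be between 0.0 and 1.0")
    if not 0.0 < freq_hz < rate / 2:
        raise ValueError("frequency must be below half the sample rate")
    n_samples = int(round(seconds * rate))
    peak = int(amplitude * 32767)
    frames = bytearray()
    for i in range(n_samples):
        value = peak * math.sin(2.0 * math.pi * freq_hz * i / rate)
        frames += struct.pack("<h", int(round(value)))  # little-endian int16
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16 bits per sample
        wav_file.setframerate(rate)
        wav_file.writeframes(bytes(frames))
    return n_samples
```

For example, `write_sine_wav("tone_5k.wav", 5000)`, `write_sine_wav("tone_10k.wav", 10000)` and `write_sine_wav("tone_15k.wav", 15000)` produce the three test tones. The warning about starting from zero volume still applies, because a tone that we cannot hear can still drive the tweeter hard.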
|Next, we try to understand the human brain, the cerebrum in this case, which
is the most mysterious space in our body. Here I only try to look at its
properties from the point of view of music reproduction. I try to explain in very simple
The picture of the cerebrum shows a basic flow diagram from the sound signal
input, which is the output of the ears, to the output of the brain. In
parallel, knowledge, visual images and other information related to the
music are also input to the brain.
The output signal from the ears is transmitted to the cochlear nerves. These
nerves are parallel lines and feed the signals, separated by frequency band,
to the auditory cortex in the cerebral cortex. At this point, the transmitted
data is still “Sound data”, not yet “Music data”.
|(2) The signal is now passed through filters. This function eliminates noise.
On top of that, some kinds of sound information that are not on preferred topics are
Then, the selected sound information is transmitted to the hippocampus. The
hippocampus is a temporary memory, or working memory, that keeps a sound or
a series of sounds for a certain period of time. The contents of the temporary
memory disappear soon, maybe within an hour or so.
However, we can recognize the scenario of the music here, so it may also have
some function for assembling music from a series of sounds.
|| (4) Information that leaves a strong impression or is repeatedly memorized in
the hippocampus is now transmitted to the long-term memory, or non-volatile
memory area. A lot of information is already stored in this memory,
including the person's knowledge, past experiences, happy and sad feelings,
visual and sound images, music images, personal values, etc.
New ideas, new findings or new values are generated from new combinations
of those pieces of information.
The output from the memory triggers the left cortex and the right cortex.
Information from the hippocampus may also reach the cortices directly.
The right cortex handles feelings of excitement, both positive and negative.
When positive stimuli are added, people feel hope, happiness and
vivid feelings. On the other hand, people feel fear or frustration when
negative stimuli are given.
|(7) The left cortex handles logical functions. If the frequency response of a
speaker system is flat, then it decides that the reproduced sound must be good, for
|(8) Memorized information does not stay the same. The brain constantly fetches
a memory, processes it and then returns it. The information is changed
a little bit each time. Most memories of the old days become nicer. We forget some unwanted
There is a common database shared by all humans. Most people do not like
certain noises or sounds, such as glass being rubbed. Almost all people have
some favorite sounds or tempos. These influence the whole decision-making
process in the brain. Therefore, individuals can share the same feelings
through this common database.
|Finally, the cerebrum generates control and/or activation signals to the body.
|The rules of brain operation related to the music reproduction process
are as follows.
(A) : In general, the brain tries to reach a conclusion in the most economical
way, whether or not the conclusion is true. This is because the brain
has to finish the current job as soon as possible and make itself ready
for the next urgent job(s), which may come soon.
(B) : The brain always likes to stay in the most stable and comfortable
situation. An unsolved problem or an unhappy situation is an unstable
condition, so the brain tries to solve the problem as soon as possible.
Sometimes the brain reaches a temporary solution and is happy even
if the solution does not eliminate the real cause.
ex. When someone has bought very expensive audio cables, he/she tends to
believe that the sound must be very good and feels satisfied even if the
physical sound stays exactly the same as before. This shortcut to the
conclusion comes from behaviors (A) and (B) above. “OK, no doubt! I am
satisfied. My investment is fully justified.”
(C) : Visual information dominates auditory information in most cases.
When we listen to music, we also see the speaker system, the amplifiers
and the interior of the room with our eyes. We have already memorized
beautiful photos, good catalogue data on amplifiers or speakers, and maybe
gold-plated terminals, cables, etc. This visual information greatly
influences the sound or the music in the brain, not through the ears.
It always happens. Good restaurants serve not only good food
but also a good atmosphere, including plates and decoration with candles.
(D) : Re-generation of memory
Even if there is no input from input devices such as the ears, eyes, nose
or skin, the brain generates output independently of the input. An older
person does not physically sense higher-frequency sound. However, he/she
has music memories stored long ago, when he/she was young. So, although
older people now hear only 70% of all the information in the listening
room, they can hear 100% in the brain.
(E) : 24H365D operation
The brain works 24 hours a day, 365 days a year. It does not sleep. When we
are in bed, the brain fetches data from the memory and returns it. It may
modify the data, add something, or combine it with other data in the memory.
Therefore, what we remember is not always the same as before.
||The left chart describes the process from “Music creation” to “Music
Reproduction”. First, some impression exists. Then, the musician(s) try
to express their impression through sound or music. Many people are
involved in the process, but in the end these activities generate a series
of sounds. The sound is captured by microphone(s) and recorded
on the master tape or HDD. After some editing, the sound is pressed onto
a recording medium. The sound is now on the medium or, in recent years,
on a file server for distribution through the internet.
The sound is now reproduced in the listening room using sophisticated equipment.
The sound reaches the listeners' ears and is assembled into music in their
brains. Finally, they try to reproduce the original impression.
However, it is impossible to do this 100%, because they are not the
same person as the musician.
So, they try to create their own impression or music based on their
interpretation. This series of processes is realized by a layered system, as
illustrated in this picture. The left-side layers are for the music creation
process. The bottom layer is for the generation of media. This process
yields a CD, LP or digital data for music listeners. The right-side layers
are for the reproduction or re-creation of music.
|There are some discussions related to the quality of the source sound. If we
mix up the quality of the source and the quality of reproduction, we lose
our way. Therefore, we first set aside the quality of the source. Many
factors contribute to the sound quality between the source data and the
sound that reaches our ears. It is mostly influenced by equipment quality,
including the CD player, amplifiers and speakers, by room acoustics, and by
the performance of the ears. The performance of the ears includes not
only equal loudness but also our own ear properties such as frequency
response, dynamic range, S/N ratio, etc.
Then, the sound captured by the ears is brought to the brain, and the brain
reproduces the original music. However, the problem is that most listeners,
except those who attended the live performance, do not know the original
sound or music. So, the reference may be the listener's feelings:
“I prefer this” or “I don’t like it”, in most cases.
||This is the physical process from the recording medium to the human ears.
In the world of electronic equipment interfaces, the output impedance of a
stage is usually sufficiently lower than the input impedance of the next
stage. Therefore, the next stage does not influence the previous stage.
The loudspeaker and the power amplifier behave differently when the DF
(damping factor) is low, as with a tube amp. As of today, the largest
contributors to the sound quality are the loudspeakers and the listening
room. The technology of the other electronic components is well established.
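The impedance argument above can be checked with simple voltage-divider arithmetic. This is a rough sketch that treats all impedances as pure resistances; the function names and the example values (100 ohm line output, 10 kilohm input, 2 ohm tube amp output, 8 ohm speaker) are hypothetical and only for illustration.

```python
import math


def interface_loss_db(z_out_ohm: float, z_in_ohm: float) -> float:
    """Level change in dB caused by the divider formed by z_out and z_in."""
    return 20.0 * math.log10(z_in_ohm / (z_in_ohm + z_out_ohm))


def damping_factor(z_speaker_ohm: float, z_amp_out_ohm: float) -> float:
    """Damping factor: speaker impedance over amplifier output impedance."""
    return z_speaker_ohm / z_amp_out_ohm


if __name__ == "__main__":
    # Hypothetical line-level interface: the loss is under 0.1 dB,
    # so the next stage hardly loads the previous one.
    print(round(interface_loss_db(100.0, 10_000.0), 3))
    # Hypothetical tube amp with 2 ohm output impedance into 8 ohm: DF = 4,
    # and almost 2 dB is lost across the output impedance.
    print(damping_factor(8.0, 2.0))
    print(round(interface_loss_db(2.0, 8.0), 3))
```

Because a real loudspeaker's impedance changes with frequency, this loss also changes with frequency when the DF is low, which is one reason the speaker and the amplifier can no longer be treated separately.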
Let's start from the bottom. One of the most important factors in obtaining
a high-quality reproduction system is good physical properties. This belongs
to the bottom layer. The quality of everything from the CD player to the
loudspeaker system, including the listening room, can be physically measured.
The properties include the frequency response, distortion, signal-to-noise
ratio, dynamic range, reflections from the walls, and acoustic properties
such as RT60 at the listening position. This layer is for “Sound”
reproduction. Because the reference is relatively clear, many people share
these parameters and properties and improve them.
In the middle layer, it gets a little complex. Every individual has different
ears and different feelings. In this layer, we can adjust, tune or
compensate the sound for equal loudness, for the ear properties, for
the acoustics of the listening room and for the loudspeaker properties. There
are no physical references. Therefore, the reference may be “I prefer this
sound or music” or “I don’t like it”. Compensation can also be applied to
the original source. A good example is that some individuals boost the
low frequencies for a certain CD because they like it.
The highest layer is much more difficult. This is a kind of creation of
the listener's own music, or a reproduction that refers to their old memories
and imagination. Sometimes the signal-to-noise ratio can be bad. In this case,
noise is a part of the music, which recalls 50-60 years ago. The cable matter
exists in this layer. Some expensive cables are not effective for physical
sound quality. Only a blind test can tell whether there is a difference. But
some people like to spend money on expensive cables because they hear better
sound in their brain. I do not feel the difference between expensive cables
and normal cables with my ears, but sometimes I do with my brain. Feelings
fluctuate day by day and as time goes by.
The most important thing is that we always start from the bottom layer. Then we go up. If we are satisfied, that is fine. If not, we can go back to the bottom layer at any time. Then we do not get lost.
When we discuss music reproduction quality, we had better clarify whether it
is sound quality, music quality or image quality. These are located in separate
layers. Then we can focus the discussion on a single layer. Now, we can
||Another headache is the reverberation of the listening room. If there
is no reverberation in the room, the reproduced sound is generally not
realistic. Adequate reverberation must be added even in the listening room
with current two-channel stereo technology. The reverberation time is
expressed as the so-called RT60, which is the time for the sound level to
decay from 0 dB to -60 dB. So, how many seconds? Is it for classical music?
Solo? Jazz? Instrumental? Vocal? It is widely said that the time should be
less than 1 sec, depending on the listener's preference. This is the most
expensive matter in music reproduction, but it is very important.
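Before spending money on room treatment, the RT60 can be roughly estimated. The sketch below uses Sabine's well-known approximation, RT60 = 0.161 x V / A, which is not mentioned in the text above and is brought in here only as an illustration. The room size and the absorption coefficients are hypothetical, not measured values.

```python
def sabine_rt60(volume_m3, surfaces):
    """Estimate RT60 in seconds with Sabine's approximation.

    surfaces is a list of (area_m2, absorption_coefficient) pairs.
    """
    absorption = sum(area * alpha for area, alpha in surfaces)
    if absorption <= 0.0:
        raise ValueError("total absorption must be positive")
    return 0.161 * volume_m3 / absorption


if __name__ == "__main__":
    # Hypothetical room of 5 m x 4 m x 2.5 m (50 cubic meters).
    room = [
        (20.0, 0.3),  # carpeted floor (illustrative coefficient)
        (20.0, 0.1),  # ceiling (illustrative coefficient)
        (45.0, 0.2),  # four walls (illustrative coefficient)
    ]
    print(round(sabine_rt60(50.0, room), 2))  # seconds
```

With these illustrative numbers the estimate comes out a little under 0.5 sec. Real absorption coefficients vary with frequency, so an estimate like this is only a starting point before measuring at the listening position.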