This is a simple audio spectrogram, primarily intended as a visualization tool for transgender vocal training, though it can of course be used for other purposes.
The spectrogram uses OpenGL and GLUT for display and ALSA for sound input. The large main window shows a scrolling real-time spectrogram with time on the horizontal axis and frequency on the vertical axis; the color intensity shows the amplitude (i.e. energy) of that particular frequency at that particular time. Higher-pitched sounds produce curves higher up in the display, and lower-pitched ones appear lower. Optionally the spectrogram draws the centroid, i.e. a line showing the frequency around which the energy is centered at the current moment (the energy-weighted average frequency). In other words, the brighter the sound (i.e. the more energy in the high frequencies), the higher the centroid. In addition there are two smaller graphs at the bottom of the screen. The left one shows the real-time spectrum of the sound, in other words a vertical slice of the main plot. The right one shows the instantaneous signal, i.e. the time-varying sound pressure in the current time window used to calculate the spectrum.
The spectrogram is started from the command line and the various arguments are as follows:
    sspect [-f] [-v] [-d <device_number>] [-sf <scroll_factor>] [-w <windowtype>] [-t <twowinsize>] [-fl <freqline>] [-c]

    Command line arguments:
    -f              fullscreen
    -v              verbose mode
    device_number = 0,1,...  the ALSA input device number (default 0)
    windowtype    = 0 (no window) (will be crappy)
                    1 (Hann)
                    2 (Gaussian truncated at +-4 sigma) (default, recommended)
    scroll_factor = 1,2,...  how many vSyncs (@ 60 Hz) to wait per scroll pixel (default 1)
    twowinsize    = 11,12,...,16  the power of 2 giving the FFT window size N (default 13)
                    (Note: this controls the vertical frequency resolution and range)
    freqline        draws a line at frequency freqline; several -fl options can be used to draw several lines
    -c              draws the centroid line

    Keys & mouse:
    arrows or middle button drag - brightness/contrast
    i - step through colormaps (B/W, inverse B/W, color)
    q or Esc - quit
    [ and ] - control horizontal scroll factor (rate)
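To make the `-t` and `-w` options above more concrete, the following sketch shows how a window size N = 2^twowinsize and the three window types could be computed. The function names are hypothetical, the sample rate is whatever the sound device delivers (the README does not state it), and the exact window formulas used in `sspect.cpp` may differ from this textbook version.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// FFT length for a given -t value: N = 2^twowinsize (default 13 gives 8192).
std::size_t fft_size(unsigned twowinsize) {
    return std::size_t{1} << twowinsize;
}

// Spacing between adjacent frequency bins in Hz for a given sample rate.
double bin_spacing_hz(double sample_rate_hz, std::size_t n) {
    return sample_rate_hz / static_cast<double>(n);
}

// Window coefficients for the three -w choices (hypothetical helper):
// 0 = no window (all ones), 1 = Hann, 2 = Gaussian truncated at +-4 sigma.
std::vector<double> make_window(int windowtype, std::size_t n) {
    std::vector<double> w(n, 1.0);
    if (n < 2) return w;  // nothing to shape for a single sample
    const double pi = std::acos(-1.0);
    for (std::size_t i = 0; i < n; ++i) {
        // x runs from -1 at the first sample to +1 at the last one.
        const double x = 2.0 * static_cast<double>(i) / static_cast<double>(n - 1) - 1.0;
        if (windowtype == 1) {
            w[i] = 0.5 * (1.0 + std::cos(pi * x));  // symmetric Hann
        } else if (windowtype == 2) {
            // The half-width of the window equals 4 sigma, so x / sigma = 4 * x.
            const double z = 4.0 * x;
            w[i] = std::exp(-0.5 * z * z);
        }
    }
    return w;
}
```

A larger `twowinsize` gives finer frequency bins but a longer time window per spectrum, which is the trade-off behind the note on vertical frequency resolution.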
In addition to the command line arguments, various settings can be
changed in real time as the spectrogram runs; see Keys & mouse
above. Left-clicking on a point in the main graph draws a horizontal
line, above which the current frequency and musical note are shown.
Right-clicking does the same, but also displays a harmonic series of
notes. This is interesting for understanding vocal formants.
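The frequency-to-note readout and the harmonic series mentioned above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and it assumes twelve-tone equal temperament with A4 = 440 Hz, which may not match the tuning or naming used in `sspect.cpp`.

```cpp
#include <cmath>
#include <string>
#include <vector>

// Nearest equal-tempered note name for a frequency, e.g. 440 Hz -> "A4".
// Assumes A4 = 440 Hz and MIDI-style numbering (A4 = 69, C4 = 60).
std::string note_name(double freq_hz) {
    static const char* const names[12] = {"C", "C#", "D", "D#", "E", "F",
                                          "F#", "G", "G#", "A", "A#", "B"};
    if (freq_hz <= 0.0) return "";
    const int midi =
        static_cast<int>(std::lround(69.0 + 12.0 * std::log2(freq_hz / 440.0)));
    const int pitch_class = ((midi % 12) + 12) % 12;  // stays in 0..11 for negative midi
    const int octave = (midi - pitch_class) / 12 - 1;
    return std::string(names[pitch_class]) + std::to_string(octave);
}

// First `count` harmonics of a fundamental: f, 2f, 3f, ...
std::vector<double> harmonic_series(double fundamental_hz, int count) {
    std::vector<double> harmonics;
    for (int k = 1; k <= count; ++k) {
        harmonics.push_back(k * fundamental_hz);
    }
    return harmonics;
}
```

For example, calling `harmonic_series(110.0, 4)` and passing each value to `note_name` labels the fundamental and its first overtones, which is the kind of readout that helps when comparing harmonics against formant regions.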
This depends on OpenGL, GLUT, FFTW and ALSA, so it is intended to run on a GNU/Linux system. On Debian-based systems it requires the packages `freeglut3`, `freeglut3-dev`, `libfftw3-dev`, `libasound2` and `libasound2-dev`. Similar packages are provided on other GNU/Linux distributions, but their names might be slightly different.
In addition to the above you should have a C++ compiler and `make`. Under Debian-based systems these are provided by the package `build-essential`. Then build it with a simple `make`. Run it with `./sspect`.
The large effort of writing the core of the spectrogram is due to Alex
Barnett (author of glSpect), who based his code somewhat on glScope by
Luke Campagnola. The current code changes some default settings, adds
a few frequency lines for voice practice, and adds the option to
choose the sound input device on the command line. In addition, a lot
of the code and documentation has been cleaned up.
Anthony Agnone forked glSpect in August 2016, and his version is
called audio_visualization.
Alex and Luke licensed their work with the following sentence:
"Distributed under a completely free license; this means you can do
absolutely anything you want with this code."
So the part of this code that's their work is under the above
extremely permissive license. The rest is licensed under the terms of
the GNU Affero General Public License as published by the Free
Software Foundation, either version 3 of the License, or (at your
option) any later version. See `agpl.md` for the full terms.