ARver/arver/audio
arcctgx c140510342
calculate CRC32 in an auxiliary thread
The code calculates two different checksums of the same data. This is an
embarrassingly parallel problem: the data is only read, and the checksums
are independent of one another. Offloading one checksum calculation to an
auxiliary thread reduces the total calculation time from the sum of the
two calculation times to the duration of whichever calculation takes
longer.
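
The idea can be sketched in a few lines of Python. This is only an
illustration, not the code in _audio.c: the function name
checksums_parallel() is hypothetical, and zlib.adler32() is a stand-in
for the second checksum. It assumes CPython, where the zlib checksum
functions are understood to release the GIL for large buffers, so the
two calculations can overlap.

```python
import threading
import zlib
from typing import Tuple


def checksums_parallel(data: bytes) -> Tuple[int, int]:
    """Return (crc32, other) for the same read-only buffer.

    Hypothetical helper: CRC32 runs in an auxiliary thread while the
    main thread computes the second checksum (adler32 as a stand-in).
    """
    result = {}

    def crc_worker() -> None:
        # The buffer is only read, so no locking is needed.
        result["crc32"] = zlib.crc32(data)

    worker = threading.Thread(target=crc_worker)
    worker.start()

    # Main thread: the other, independent checksum.
    other = zlib.adler32(data)

    # Wait for the auxiliary thread before reading its result.
    worker.join()
    return result["crc32"], other
```

The total time is then roughly that of the slower of the two
calculations, because neither thread waits for the other until join().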

The performance was measured by changing the extension code to perform
the CPU-bound checksum calculation 100 times in a row, loading the data
from the disk just once. CRC and AR columns show the times it takes to
calculate respective checksums, seq and par columns show the times it
takes to calculate both checksum types sequentially and in parallel. The
measured times are accurate to about 0.1 second:

 length   CRC [s]   AR [s]   AR/CRC   seq [s]   par [s]   par/seq

 5:47.27      3.9      1.5     0.38       5.4       3.9      0.72
15:52.70     10.6      4.1     0.39      14.7      10.7      0.73
30:45.36     20.6      8.1     0.39      28.7      20.7      0.72
45:45.05     30.7     12.0     0.39      42.7      30.8      0.72
62:00.37     41.5     16.3     0.39      57.5      41.8      0.73
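
As a sanity check, the table can be compared against the simple model
from the first paragraph: seq should be close to CRC + AR, and par
should be close to max(CRC, AR). The sketch below uses only the numbers
from the table; the helper names are hypothetical.

```python
# (CRC, AR, seq, par) in seconds, copied from the table above.
ROWS = [
    (3.9, 1.5, 5.4, 3.9),
    (10.6, 4.1, 14.7, 10.7),
    (20.6, 8.1, 28.7, 20.7),
    (30.7, 12.0, 42.7, 30.8),
    (41.5, 16.3, 57.5, 41.8),
]


def model(crc, ar):
    """Predicted (seq, par) times: sum of both vs. the longer one."""
    return crc + ar, max(crc, ar)


def max_model_error(rows):
    """Largest absolute difference between model and measurement."""
    worst = 0.0
    for crc, ar, seq, par in rows:
        predicted_seq, predicted_par = model(crc, ar)
        worst = max(worst, abs(predicted_seq - seq), abs(predicted_par - par))
    return worst


def mean_par_over_seq(rows):
    """Average of the measured par/seq ratios."""
    return sum(par / seq for _, _, seq, par in rows) / len(rows)
```

With these numbers the model and the measurements differ by at most
roughly 0.3 s, and the average par/seq ratio comes out between 0.72 and
0.73, consistent with the last column.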

In this benchmark the parallel version takes about 28% less time than
the sequential one, but the benchmark is artificial. Real-life
performance will be limited mainly by I/O: the CPU-bound part
constitutes a small fraction of the total runtime. In any case, the
threading code is not as complex as I was afraid it would be, so in the
end I think it's worth keeping.
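
To make the I/O caveat concrete, here is a rough estimate of the
end-to-end effect. It assumes that I/O and checksum calculation do not
overlap, and the CPU-bound fraction is a made-up parameter: it was not
measured here. The function name overall_ratio() is hypothetical.

```python
def overall_ratio(cpu_fraction, par_over_seq=0.72):
    """Estimated new/old total runtime ratio.

    cpu_fraction: assumed share of the runtime spent on checksums
                  (hypothetical, not measured).
    par_over_seq: ratio from the benchmark table above.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    # Only the CPU-bound share gets faster; the rest is unchanged.
    return (1.0 - cpu_fraction) + cpu_fraction * par_over_seq
```

For example, if the checksums took 10% of the total runtime, the
estimate would be 0.9 + 0.1 * 0.72 = 0.972, that is, a gain of under 3%.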
2025-06-15 23:44:54 +02:00
__init__.py rename checksum package to audio 2024-03-03 14:08:21 +01:00
_audio.c calculate CRC32 in an auxiliary thread 2025-06-15 23:44:54 +02:00
checksums.py improve FLAC processing performance 2024-03-30 18:23:21 +01:00
properties.py rename nframes() and get_nframes() 2025-02-10 21:53:13 +01:00