PhantomSDR Support Forum

General Category => PhantomSDR Bugs => Topic started by: Bas ON5HB on Sep 20, 2024, 03:40 PM

Title: Spectrumserver stops running.....
Post by: Bas ON5HB on Sep 20, 2024, 03:40 PM
I was installing AMD OpenCL today and it gave me the following error.
It might explain what is wrong and why acceleration stops.

Using OpenCL device: gfx902:xnack+
thread 'main' panicked at src/main.rs:380:52:
Transfer failed: Disconnected
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'main' panicked at library/std/src/io/stdio.rs:1118:9:
failed printing to stdout: Broken pipe (os error 32)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Drijvendekomma-berekeningsfout (geheugendump gemaakt)

Translated, the last line says: floating-point exception (core dumped).

Dmesg:

[ 2645.316402] traps: spectrumserver[60248] trap divide error ip:7ffbaefe21c1 sp:7ffbad471420 error:0 in libhsa-runtime64.so.1.13.60103[7ffbaefb8000+10b000]

I hope this helps to find why it stops...

Title: Re: Spectrumserver stops running.....
Post by: Bas ON5HB on Sep 20, 2024, 05:03 PM
Running without OpenCL, on the Ryzen, this happens after a while:

Quote:
|__)|_  _  _ |_ _  _ (_ |  \|__) _|_
|   | )(_|| )|_(_)|||__)|__/| \   | 
                                     
Thank you for using PhantomSDR+, you are supporting the Development of an Open-Source WebSDR Project ♥
No FFTW wisdom file found. Planning from scratch. This may take long on the first time but will then be fast.
Markers updated.
Waterfall is sent every 10 FFTs
thread 'main' panicked at src/main.rs:380:52:
Transfer failed: PollTimeout
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

It looks to me as if it dies and stops working whenever it misses data on stdin.

Without acceleration it produces weird noises; with OpenCL it dies.

I believe the problem is in recovering from a few missed stdin reads, after which it goes wrong in all directions.

I have tried everything. It can be reproduced quickly without OpenCL when running 30 MHz: it happens after a few minutes.
Title: Re: Spectrumserver stops running.....
Post by: Bas ON5HB on Sep 20, 2024, 05:30 PM
Also tested this today:

Thank you for using PhantomSDR+, you are supporting the Development of an Open-Source WebSDR Project ♥
Using MKL
terminate called after throwing an instance of 'char const*'
thread 'main' panicked at library/std/src/io/stdio.rs:1118:9:
failed printing to stdout: Broken pipe (os error 32)

Maybe this helps. It seems to me that it fails on stdin/stdout from time to time.
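
The "failed printing to stdout: Broken pipe (os error 32)" lines come from library/std/src/io/stdio.rs, which is Rust's standard print machinery: it panics when a write to stdout fails. Assuming the Rust program in these logs pipes its output into another process (the logs do not confirm how the two are connected), a broken pipe could be handled instead of panicking. A minimal sketch; `forward_samples`, `Forward` and `ClosedPipe` are hypothetical names for illustration, not taken from the PhantomSDR source:

```rust
use std::io::{self, ErrorKind, Write};

/// Outcome of trying to forward one buffer downstream.
#[derive(Debug, PartialEq)]
pub enum Forward {
    /// The whole buffer was written.
    Sent,
    /// The reader closed its end of the pipe; stop streaming quietly.
    ReaderGone,
}

/// Write `buf` to `out`, mapping a broken pipe to `Forward::ReaderGone`
/// instead of panicking. Any other I/O error is returned to the caller.
pub fn forward_samples<W: Write>(out: &mut W, buf: &[u8]) -> io::Result<Forward> {
    match out.write_all(buf) {
        Ok(()) => Ok(Forward::Sent),
        Err(e) if e.kind() == ErrorKind::BrokenPipe => Ok(Forward::ReaderGone),
        Err(e) => Err(e),
    }
}

/// Hypothetical stand-in for a pipe whose reader has exited,
/// used only to exercise the broken-pipe branch.
pub struct ClosedPipe;

impl Write for ClosedPipe {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(ErrorKind::BrokenPipe, "reader closed"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

In a real program `out` would be something like a locked `std::io::stdout()` handle. On `Forward::ReaderGone` the sender can log one line to stderr and exit cleanly, which makes it easier to see that the downstream process died first.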
Title: Re: Spectrumserver stops running.....
Post by: magicint1337 on Sep 20, 2024, 08:29 PM
The last error you posted occurs because you don't have the MKL libraries installed. It has nothing to do with the pipe.
Title: Re: Spectrumserver stops running.....
Post by: magicint1337 on Sep 20, 2024, 08:29 PM
"Using MKL
terminate called after throwing an instance of 'char const*'"

This means it throws because something is missing for MKL.
Title: Re: Spectrumserver stops running.....
Post by: magicint1337 on Sep 20, 2024, 09:31 PM
Also, I can't really replicate your stdin issues; I have been running for weeks and never had any. I don't think the FFT size should be increased, since it is not a buffer. I know this first-hand: my system is limited and stutters at high FFT sizes.
Title: Re: Spectrumserver stops running.....
Post by: magicint1337 on Sep 20, 2024, 11:43 PM
I think it may be your Intel chip, as it's newer than ours, I guess? I think rigi and I use 6th generation and we don't have crashes, so maybe it is an issue with 7th gen? We will have to look further, then.
Title: Re: Spectrumserver stops running.....
Post by: magicint1337 on Sep 20, 2024, 11:43 PM
I know that it runs fine on an RX 580. I did that in the past and it was awesome. 0-64 MHz :)

I ran it at 64 MHz for about 2-4 weeks before going back to a smaller setup.
Title: Re: Spectrumserver stops running.....
Post by: Bas ON5HB on Sep 21, 2024, 11:37 AM
Quote from: magicint1337 on Sep 20, 2024, 09:31 PM
Also, I can't really replicate your stdin issues; I have been running for weeks and never had any. I don't think the FFT size should be increased, since it is not a buffer. I know this first-hand: my system is limited and stutters at high FFT sizes.

It happens on any system I use.
I have 2x RX888MK2; my public one runs on an Intel machine.
The other runs on a Ryzen 5.

Both crash after a few days.

When I reduce the fft-size it only crashes faster.

I have tried with OpenCL and without, it really doesn't matter.

The RX888 driver keeps running but Spectrumserver is gone, often with the dmesg message but sometimes without.

Somewhere it does a divide by zero, and it seems to hit that very rarely, but it does happen every now and then.

For the moment I am running Intel's OpenCL from GitHub, but I doubt it's an OpenCL problem.
Title: Re: Spectrumserver stops running.....
Post by: Bas ON5HB on Sep 21, 2024, 03:50 PM
I have made you a backtrace, with a not too large FFT_size. This is what happens with no acceleration, just the Ryzen 5 CPU:

thread 'main' panicked at src/main.rs:380:52:
Transfer failed: Disconnected
stack backtrace:
   0:     0x649253ef7945 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h1b9dad2a88e955ff
   1:     0x649253f17d8b - core::fmt::write::h4b5a1270214bc4a7
   2:     0x649253ef594f - std::io::Write::write_fmt::hd04af345a50c312d
   3:     0x649253ef8a91 - std::panicking::default_hook::{{closure}}::h96ab15e9936be7ed
   4:     0x649253ef876c - std::panicking::default_hook::h3cacb9c27561ad33
   5:     0x649253ef9061 - std::panicking::rust_panic_with_hook::hfe205f6954b2c97b
   6:     0x649253ef8f57 - std::panicking::begin_panic_handler::{{closure}}::h6cb44b3a50f28c44
   7:     0x649253ef7e09 - std::sys::backtrace::__rust_end_short_backtrace::hf1c1f2a92799bb0e
   8:     0x649253ef8be4 - rust_begin_unwind
   9:     0x649253e49963 - core::panicking::panic_fmt::h3d8fc78294164da7
  10:     0x649253e49d96 - core::result::unwrap_failed::hfa79a499befff387
  11:     0x649253e4dedd - rx888_stream::main::hce7a39d68423d198
  12:     0x649253e6ba43 - std::sys::backtrace::__rust_begin_short_backtrace::h94b3920039075e08
  13:     0x649253e69c59 - std::rt::lang_start::{{closure}}::hb4ff04bf5633c082
  14:     0x649253ef1190 - std::rt::lang_start_internal::h5e7c81cecd7f0954
  15:     0x649253e58c15 - main
  16:     0x7465fc229d90 - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  17:     0x7465fc229e40 - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:392:3
  18:     0x649253e4a095 - _start
  19:                0x0 - <unknown>


As you can see, stdin/stdout seems to run out of data.

But this is a Ryzen, also tested several Intel machines, it happens on all of them.

Something sometimes causes a slowdown, which triggers this.

And the smaller I set the fft-size the faster it happens.
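
Frame 10 of the backtrace is core::result::unwrap_failed, called from rx888_stream::main, so the panic comes from unwrapping (or expecting) an Err result rather than handling it. One possible alternative is to retry transient errors and stop only on fatal ones. A minimal sketch under stated assumptions: the `TransferError` type below is hypothetical and only mirrors the two messages seen in the logs (PollTimeout and Disconnected); the real error type and transfer call in rx888_stream are not shown in this thread.

```rust
/// Hypothetical error type; the variant names mirror the log messages only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TransferError {
    PollTimeout,
    Disconnected,
}

/// Call `transfer` until it succeeds, tolerating up to `max_timeouts`
/// `PollTimeout` results along the way. `Disconnected` is treated as
/// fatal in this sketch and returned immediately.
pub fn transfer_with_retry<T, F>(mut transfer: F, max_timeouts: u32) -> Result<T, TransferError>
where
    F: FnMut() -> Result<T, TransferError>,
{
    let mut timeouts = 0;
    loop {
        match transfer() {
            Ok(data) => return Ok(data),
            Err(TransferError::PollTimeout) if timeouts < max_timeouts => {
                timeouts += 1; // transient: try the transfer again
            }
            Err(e) => return Err(e), // fatal error, or too many timeouts
        }
    }
}
```

The caller would still have to decide what to do with a returned error, for example log it to stderr and exit with a non-zero status so a supervisor can restart the process. Whether retrying is safe for the RX888 stream depends on details not covered here.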