I installed AMD OpenCL today and it gave me the following error.
It might explain what is going wrong and why acceleration stops.
Using OpenCL device: gfx902:xnack+
thread 'main' panicked at src/main.rs:380:52:
Transfer failed: Disconnected
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'main' panicked at library/std/src/io/stdio.rs:1118:9:
failed printing to stdout: Broken pipe (os error 32)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Drijvendekomma-berekeningsfout (geheugendump gemaakt)
Translated from Dutch, that last line is the shell's "Floating point exception (core dumped)" message: the process was killed by SIGFPE and a core dump was written.
Dmesg:
[ 2645.316402] traps: spectrumserver[60248] trap divide error ip:7ffbaefe21c1 sp:7ffbad471420 error:0 in libhsa-runtime64.so.1.13.60103[7ffbaefb8000+10b000]
I hope this helps to find why it stops...
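For anyone who wants to map that dmesg line to a function: "trap divide error" is the x86 divide fault, which is raised by integer division and is reported by the shell as a floating point exception. The offset of the faulting instruction inside the reported mapping is simply ip minus the mapping start. Below is a minimal sketch of that arithmetic. `fault_offset` is a hypothetical helper name, and treating the result as an offset into the library file assumes the mapping starts at file offset 0, which may not hold for every build.

```rust
/// Offset of the faulting instruction inside the mapping reported by the
/// kernel "traps:" line, or None if the ip lies outside that mapping.
///
/// Hypothetical helper for illustration; the three inputs are the `ip`
/// value and the `[start+length]` pair printed after the library name.
fn fault_offset(ip: u64, map_start: u64, map_len: u64) -> Option<u64> {
    // checked_sub returns None when ip is below the mapping start.
    let offset = ip.checked_sub(map_start)?;
    if offset < map_len {
        Some(offset)
    } else {
        None
    }
}
```

With the values from the dmesg line above, `fault_offset(0x7ffbaefe21c1, 0x7ffbaefb8000, 0x10b000)` gives `Some(0x2a1c1)`. If debug symbols for libhsa-runtime64 are available, that offset could be passed to a tool such as `addr2line -e` to look up the function.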
Running without OpenCL, on the Ryzen, this happens after a while:
(ASCII-art startup banner)
Thank you for using PhantomSDR+, you are supporting the Development of an Open-Source WebSDR Project ♥
No FFTW wisdom file found. Planning from scratch. This may take long on the first time but will then be fast.
Markers updated.
Waterfall is sent every 10 FFTs
thread 'main' panicked at src/main.rs:380:52:
Transfer failed: PollTimeout
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
It looks to me as if it dies and stops working whenever it misses data on stdin.
Without acceleration it produces weird noises; with OpenCL it dies.
I believe the problem lies in recovering from a few missed stdin/stdout transfers, after which everything goes wrong.
I have tried everything. It can be reproduced quickly: run without OpenCL at 30 MHz and it happens after a few minutes.
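To illustrate what "recovering from a few missed transfers" could look like, here is a minimal sketch that tolerates a bounded number of timeouts instead of panicking on the first one. It is not the actual rx888_stream code: `TransferError` and `transfer_with_retry` are hypothetical names, with the variants named after the two messages in the logs (PollTimeout and Disconnected). Whether retrying is actually safe for the real USB transfers is an assumption that would need checking.

```rust
/// Hypothetical error type; the variant names mirror the log messages only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TransferError {
    PollTimeout,
    Disconnected,
}

/// Call `transfer` until it succeeds, allowing up to `max_timeouts`
/// timeouts in total. Any other error (for example Disconnected) is
/// returned to the caller immediately instead of being unwrapped.
fn transfer_with_retry<T, F>(mut transfer: F, max_timeouts: u32) -> Result<T, TransferError>
where
    F: FnMut() -> Result<T, TransferError>,
{
    let mut timeouts = 0;
    loop {
        match transfer() {
            Ok(value) => return Ok(value),
            Err(TransferError::PollTimeout) if timeouts < max_timeouts => {
                timeouts += 1;
                // Log to stderr so the sample stream on stdout stays clean.
                eprintln!("transfer timed out ({timeouts}/{max_timeouts}), retrying");
            }
            Err(e) => return Err(e),
        }
    }
}
```

The point of the design is that the caller decides what to do with a final error (log it, reopen the device, exit cleanly) rather than the process dying inside an `unwrap`.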
Also tested this today:
Thank you for using PhantomSDR+, you are supporting the Development of an Open-Source WebSDR Project ♥
Using MKL
terminate called after throwing an instance of 'char const*'
thread 'main' panicked at library/std/src/io/stdio.rs:1118:9:
failed printing to stdout: Broken pipe (os error 32)
Maybe this helps. It seems to me that it is failing on stdin/stdout from time to time.
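As background on the "Broken pipe (os error 32)" line: a writer gets that error when the process reading the other end of the pipe has already exited, and Rust's `println!` panics when the write fails. So the broken pipe is usually a symptom of the reader dying, not the cause. A minimal sketch of a writer that treats it as a clean shutdown follows; `write_block` is a hypothetical helper, not a function from the project.

```rust
use std::io::{self, ErrorKind, Write};

/// Write one block of samples to `out`.
/// Returns Ok(true) on success and Ok(false) if the reader has gone away
/// (broken pipe), so the caller can stop cleanly instead of panicking.
/// Any other I/O error is passed through unchanged.
fn write_block<W: Write>(out: &mut W, block: &[u8]) -> io::Result<bool> {
    match out.write_all(block) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == ErrorKind::BrokenPipe => Ok(false),
        Err(e) => Err(e),
    }
}
```

In a streaming loop this would be called with a locked `io::stdout()` handle, breaking out of the loop when it returns `Ok(false)`.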
The last error you posted occurs because you don't have the MKL libraries installed; it has nothing to do with the pipe.
"Using MKL
terminate called after throwing an instance of 'char const*'"
means it throws because something MKL needs is missing.
As for stdin, I can't really replicate your issues; I have been running for weeks and never had any. I don't think the FFT size should be increased, since it is not a buffer. I know this from experience: my system is limited and it stutters at high FFT sizes.
I think it may be related to your Intel chip, which I guess is newer than ours. I think rigi and I both use 6th generation and we don't get crashes, so it may be specific to 7th gen. We will have to look into it further.
I know that it runs fine on an RX 580. I did that in the past and it was awesome. 0-64 MHz :)
I ran it at 64 MHz for about 2-4 weeks before going back to a smaller setup.
Quote from: magicint1337 on Sep 20, 2024, 09:31 PM
As for stdin, I can't really replicate your issues; I have been running for weeks and never had any. I don't think the FFT size should be increased, since it is not a buffer. I know this from experience: my system is limited and it stutters at high FFT sizes.
It happens on any system I use.
I have 2x RX888MK2; my public one runs on an Intel machine.
The other runs on a Ryzen 5.
Both crash after a few days.
When I reduce the fft-size it only crashes sooner.
I have tried with OpenCL and without, it really doesn't matter.
The RX888 driver keeps running but spectrumserver is gone, often with the dmesg entry but sometimes without.
Somewhere it divides by zero. It seems to hit that case very rarely, but it does happen every now and then.
For the moment I am running Intel's OpenCL from GitHub, but I doubt it's an OpenCL problem.
I have made a backtrace for you, with a not too large FFT_size. This is what happens with no acceleration, just the Ryzen 5 CPU:
thread 'main' panicked at src/main.rs:380:52:
Transfer failed: Disconnected
stack backtrace:
0: 0x649253ef7945 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h1b9dad2a88e955ff
1: 0x649253f17d8b - core::fmt::write::h4b5a1270214bc4a7
2: 0x649253ef594f - std::io::Write::write_fmt::hd04af345a50c312d
3: 0x649253ef8a91 - std::panicking::default_hook::{{closure}}::h96ab15e9936be7ed
4: 0x649253ef876c - std::panicking::default_hook::h3cacb9c27561ad33
5: 0x649253ef9061 - std::panicking::rust_panic_with_hook::hfe205f6954b2c97b
6: 0x649253ef8f57 - std::panicking::begin_panic_handler::{{closure}}::h6cb44b3a50f28c44
7: 0x649253ef7e09 - std::sys::backtrace::__rust_end_short_backtrace::hf1c1f2a92799bb0e
8: 0x649253ef8be4 - rust_begin_unwind
9: 0x649253e49963 - core::panicking::panic_fmt::h3d8fc78294164da7
10: 0x649253e49d96 - core::result::unwrap_failed::hfa79a499befff387
11: 0x649253e4dedd - rx888_stream::main::hce7a39d68423d198
12: 0x649253e6ba43 - std::sys::backtrace::__rust_begin_short_backtrace::h94b3920039075e08
13: 0x649253e69c59 - std::rt::lang_start::{{closure}}::hb4ff04bf5633c082
14: 0x649253ef1190 - std::rt::lang_start_internal::h5e7c81cecd7f0954
15: 0x649253e58c15 - main
16: 0x7465fc229d90 - __libc_start_call_main
        at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
17: 0x7465fc229e40 - __libc_start_main_impl
        at ./csu/../csu/libc-start.c:392:3
18: 0x649253e4a095 - _start
19: 0x0 - <unknown>
As you can see, stdin/stdout seems to run out of data.
This is a Ryzen, but I have also tested several Intel machines and it happens on all of them.
Something occasionally causes a slowdown that triggers this.
And the smaller I set the fft-size, the sooner it happens.