raptorq

mirror of https://github.com/cberner/raptorq.git synced 2024-06-20 13:59:00 +00:00

Author	SHA1	Message	Date
Christopher Berner	82eb5dee14	Resolve 2.0 TODOs	2024-03-14 20:55:54 -07:00
Markus Legner	da79ac2ba5	Fix symbol IDs assigned to encoding packets Previously, we incorrectly assigned internal symbol IDs (ISIs) to the `PayloadId` of `EncodingPacket`s. According to RFC 6330, it should be the encoding symbol IDs (ESIs). This commit fixes this inconsistency. BREAKING CHANGE: As the assignment of symbol IDs changes, encoding packets generated before this change cannot be decoded by the new version and vice-versa.	2024-03-11 20:43:29 -07:00
Christopher Berner	1490c5a61f	Fix alignment error on ARM	2024-03-04 19:15:53 -08:00
Christopher Berner	a3a0204585	Run clippy --fix	2023-11-26 11:32:44 -08:00
Christopher Berner	36d8fe89d2	Remove wasm support	2023-11-26 07:57:22 -08:00
Christopher Berner	cd1df04d92	Only set no_std when built without the std feature	2023-11-26 07:32:07 -08:00
Christopher Berner	7939144ef8	Fix division by zero when packet size is less than 32	2023-11-25 09:43:09 -08:00
Christopher Berner	eafdc58d0a	Run cargo clippy --fix	2023-07-03 11:09:19 -07:00
Slesarew	5a720829fa	feat: support no_std (#143) * feat: support no_std `metal` feature supports `no_std` in configuration `default-features = false, features = ["metal"]`. Float calculation is done via `micromath` crate. All previously available functionality remains under default `std` feature. Some tweaking of `python` and `wasm` features was done to compile tests. * feat: get rid of floats (#2) * feat: remove conversion to f64, fix features * chore: uncomment symbols_required checker, fmt * revert: add cdylib target for python support * fix: generalize crate type --------- Co-authored-by: varovainen <99664267+varovainen@users.noreply.github.com>	2023-02-02 18:07:41 -08:00
Pavel	02c80b595a	Added wasm build configuration (#136) Co-authored-by: Christopher Berner <christopherberner@gmail.com>	2022-10-08 21:08:55 -07:00
Christopher Berner	a1e451349c	Fix clippy warnings	2022-05-16 08:31:45 -07:00
Christopher Berner	9a47489160	Enable NEON optimized code path on aarch64	2022-05-16 08:31:45 -07:00
Christopher Berner	95286b9d0b	Make extended_source_block_symbols unconditionally public	2022-02-07 19:10:01 -08:00
Christopher Berner	98a9806801	Update to 2021 edition	2021-10-21 18:47:08 -07:00
Christopher Berner	88959e05e6	Use vshrq_n_u8 in neon optimizations Now that https://github.com/rust-lang/rust/issues/82072 is fixed this intrinsic works and improves mulassign & FMA performance by ~30% on Raspberry Pi 3 B+. End to end speedup is ~5%	2021-10-16 18:11:46 -07:00
Christopher Berner	2e0befc8df	Fix remaining Clippy warnings	2021-07-27 22:51:14 -07:00
Christopher Berner	5a851083ed	Fix some Clippy warnings	2021-07-27 22:00:38 -07:00
Christopher Berner	8b669faefd	Fix panic in graph traversal There was an off-by-one error in the initialization of storage for connected components, such that if there were the same number of connected components as nodes it would cause an index out of bounds error	2021-07-27 21:46:46 -07:00
Christopher Berner	83101d6a7c	Fix cargo test compilation error	2021-03-18 20:16:19 -07:00
Christopher Berner	893e1c7c79	Optimize fma with GF2 with NEON Improves performance by ~2x on very large symbol counts	2021-02-14 17:15:40 -08:00
Christopher Berner	e06af58ce2	Workaround bad compiler optimization with vshrq NEON instruction The vshrq intrinsic incorrectly compiles to 16 single byte shift instructions. This improves throughput by ~50%	2021-02-14 17:15:40 -08:00
Christopher Berner	a4db356932	Add NEON optimized mul_assign() function Speeds up this op by ~2x	2021-02-14 17:15:40 -08:00
Christopher Berner	d0322d3ca4	Add NEON optimized FMA Speeds up FMA by ~3x, and encoding throughput by ~50%	2021-02-14 17:15:40 -08:00
Christopher Berner	e3e9d6dcc2	Add NEON optimized implementation for octets::add_assign()	2021-02-14 17:15:40 -08:00
Christopher Berner	63b2aec337	Fix 1:255 chance of test failure. The fused FMA function doesn't allow a scalar of 1	2021-02-09 17:55:03 -08:00
Christopher Berner	c134e5b93e	Fix panic on non-x86 platforms This panic'ed because fused_addassign_mul_scalar does not support scalar=1, and it was used as the fallback	2021-02-09 17:55:03 -08:00
Christopher Berner	562e64d438	Optimize column swapping substep for r > 1 Improves performance by ~4%	2021-02-05 19:51:08 -08:00
Christopher Berner	1241928a84	Optimize column swapping substep for r=1 Improves performance by ~1%	2021-02-05 19:51:08 -08:00
Christopher Berner	67a90ede4e	Replace retain() with position() + swap_remove() This improves performance by 1-2%	2021-02-05 19:51:08 -08:00
Christopher Berner	5e506b5b78	Merge .map().filter() into .filter_map()	2021-02-05 19:51:08 -08:00
Christopher Berner	7cfef09bc6	Don't eliminate sparse values from HDPC These are never read, except for debugging, and this improves perf by ~1%	2021-01-17 21:24:08 -08:00
Christopher Berner	42c08b85c8	Remove unnecessary condition This is always true, since we're in the r = 1 case	2021-01-17 21:24:08 -08:00
Christopher Berner	fa2064796c	Fix some Clippy warnings	2021-01-17 15:53:44 -08:00
Christopher Berner	eb07e23208	Optimize HDPC generation with recursive calculation Improves performance on large symbol counts by > 10%	2021-01-17 15:31:47 -08:00
Christopher Berner	f36bf73ca9	Reduce calls to rand() during HDPC generation Small improvement to performance. Perhaps 1%	2021-01-17 13:09:52 -08:00
Christopher Berner	6546b714ad	Skip elimination in V section of A during first phase This is safe due to Errata 11, and speeds up performance by a couple percent	2021-01-17 10:30:34 -08:00
Christopher Berner	24235dd213	Optimize first phase to call ones_in_column() only once for r = 1 case	2020-12-26 20:46:47 -08:00
Christopher Berner	602fc8711d	Reduce length of merge chains in union-find data structure	2020-12-26 20:46:47 -08:00
Christopher Berner	3e88b065dd	Optimize graph substep Use a union-find data structure which is incrementally updated, instead of always recomputing the entire graph This improves performance by 5-10%	2020-12-26 10:05:53 -08:00
Christopher Berner	26c9c2f6a0	Remove eliminate_leading_value() Also fix usage and semantics of selection helper .resize() method. Previously, it said all values in first column had to be zero, but it was called before those were eliminated	2020-12-26 10:05:53 -08:00
Christopher Berner	11d2de97f2	Update to rand 0.8	2020-12-19 13:14:12 -08:00
Christopher Berner	102c6a5a86	Optimize DenseBinaryMatrix Switch to a single contiguous vector instead of vec of vecs This improves performance by ~5%, especially for smaller symbol counts	2020-12-07 22:16:38 -08:00
Christopher Berner	7b0d1c5cff	Remove X matrix from release builds This improves performance on small symbol counts by ~5%	2020-12-06 15:36:43 -08:00
Christopher Berner	5c13e8de6e	Further optimize fused_addassign_mul_scalar_binary_avx2() Move calculation of control flags out of loop to avoid one OR instruction inside loop. Also statically enable BMI1 and detect its presence to ensure that BEXTR2 intrinsic is inlined Improves performance by ~5% on small symbol counts	2020-12-06 15:36:43 -08:00
Christopher Berner	d87e46c625	Optimize DenseBinaryMatrix.swap_columns() Improves performance by 10-15% for symbol count = 100	2020-12-06 08:15:21 -08:00
Christopher Berner	3a05d7be3e	Add BinaryOctetVec Improves encoding speed of large symbol counts by ~5%	2020-12-06 08:15:21 -08:00
Christopher Berner	50301e1b5b	Optimize query_non_zero_columns() This reduces the time spent in the fourth phase from ~6% of encoding time to ~1%, according to perf, and improves overall throughput by 3-4% on large symbol counts.	2020-11-29 09:51:56 -08:00
Christopher Berner	c4d227fba1	Optimize memory layout of dense U matrix Previously we used column major ordering. Switch to row major to optimize sequential access of rows which is much more common in the first phase, and can also be used in the fourth phase This improves performance by ~10% on large symbol counts	2020-11-28 21:08:35 -08:00
Christopher Berner	6245ab1c9a	Fix over-allocation of memory for dense U section of matrix The previous code had an off by one error leading to an extra word being allocated for each row	2020-11-28 17:22:50 -08:00
Christopher Berner	3a4068a726	Fix typo in spelling of "access"	2020-11-28 17:22:50 -08:00


343 Commits
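
Commit 67a90ede4e in the log above replaces `retain()` with `position()` + `swap_remove()`. The log does not show the changed code, so the following is only a minimal Rust sketch of that general pattern, not the crate's actual implementation. The function names and the `u32` element type are hypothetical, and the swap is only equivalent under the assumptions that the target occurs at most once and that element order does not matter.

```rust
/// Hypothetical baseline: remove every element equal to `target`.
/// `retain()` always walks the whole vector and preserves order.
fn remove_with_retain(values: &mut Vec<u32>, target: u32) {
    values.retain(|&v| v != target);
}

/// Hypothetical replacement: find the first match with `position()`,
/// then remove it in O(1) with `swap_remove()`.
/// Assumptions: `target` occurs at most once, and the caller does not
/// depend on element order (the last element is moved into the gap).
/// Returns true if an element was removed.
fn remove_with_swap(values: &mut Vec<u32>, target: u32) -> bool {
    match values.iter().position(|&v| v == target) {
        Some(index) => {
            values.swap_remove(index);
            true
        }
        None => false,
    }
}
```

The scan stops at the first match and no later elements are shifted, which is one plausible source of a small speedup like the 1-2% the commit message reports. The cost is that the resulting order differs from what `retain()` would produce.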