Commit Graph

305 Commits

Author SHA1 Message Date
Christopher Berner
3e88b065dd Optimize graph substep
Use a union-find data structure which is incrementally updated, instead
of always recomputing the entire graph

This improves performance by 5-10%
2020-12-26 10:05:53 -08:00
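The incremental union-find idea named in commit 3e88b065dd can be sketched roughly as follows. This is a minimal illustration, not the crate's code: the `UnionFind` type and its methods are hypothetical names, and it assumes the graph substep only needs to know which nodes are connected and how large each component is. Each new edge is folded in with one `union` call, so nothing is recomputed from scratch.

```rust
/// Hypothetical disjoint-set structure (not the crate's actual type).
/// Uses union by size and path compression.
pub struct UnionFind {
    parent: Vec<usize>,
    size: Vec<usize>,
}

impl UnionFind {
    /// Creates `n` singleton components, one per node.
    pub fn new(n: usize) -> Self {
        UnionFind {
            parent: (0..n).collect(),
            size: vec![1; n],
        }
    }

    /// Returns the representative of the component containing `x`.
    pub fn find(&mut self, x: usize) -> usize {
        // First pass: walk up to the root.
        let mut root = x;
        while self.parent[root] != root {
            root = self.parent[root];
        }
        // Second pass: point every node on the path directly at the root.
        let mut cur = x;
        while self.parent[cur] != root {
            let next = self.parent[cur];
            self.parent[cur] = root;
            cur = next;
        }
        root
    }

    /// Incremental update: call once for each edge added to the graph.
    pub fn union(&mut self, a: usize, b: usize) {
        let (mut ra, mut rb) = (self.find(a), self.find(b));
        if ra == rb {
            return;
        }
        // Attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb] {
            std::mem::swap(&mut ra, &mut rb);
        }
        self.parent[rb] = ra;
        self.size[ra] += self.size[rb];
    }

    /// Number of nodes in the component containing `x`.
    pub fn component_size(&mut self, x: usize) -> usize {
        let root = self.find(x);
        self.size[root]
    }
}
```

A caller would create the structure once, call `union(a, b)` whenever an edge appears, and query `component_size` as needed.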
Christopher Berner
26c9c2f6a0 Remove eliminate_leading_value()
Also fix usage and semantics of selection helper .resize() method.
Previously, it said all values in first column had to be zero, but it
was called before those were eliminated
2020-12-26 10:05:53 -08:00
Christopher Berner
11d2de97f2 Update to rand 0.8 2020-12-19 13:14:12 -08:00
Christopher Berner
102c6a5a86 Optimize DenseBinaryMatrix
Switch to a single contiguous vector instead of vec of vecs

This improves performance by ~5%, especially for smaller symbol counts
2020-12-07 22:16:38 -08:00
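To illustrate the layout change described in commit 102c6a5a86, here is a hedged sketch of a bit matrix backed by one contiguous vector instead of a vector of vectors. `FlatBitMatrix` and its methods are hypothetical names chosen for this example; the crate's real `DenseBinaryMatrix` may differ in word size, layout, and API.

```rust
/// Hypothetical dense binary matrix stored in a single contiguous Vec<u64>.
/// Row `r` occupies words `r * words_per_row .. (r + 1) * words_per_row`.
pub struct FlatBitMatrix {
    words_per_row: usize,
    data: Vec<u64>,
}

impl FlatBitMatrix {
    /// Allocates a zeroed matrix; each row is padded up to a whole word.
    pub fn new(rows: usize, cols: usize) -> Self {
        let words_per_row = (cols + 63) / 64;
        FlatBitMatrix {
            words_per_row,
            data: vec![0; rows * words_per_row],
        }
    }

    /// Maps (row, col) to a word index and a bit offset within that word.
    fn locate(&self, row: usize, col: usize) -> (usize, u32) {
        (row * self.words_per_row + col / 64, (col % 64) as u32)
    }

    pub fn set(&mut self, row: usize, col: usize, value: bool) {
        let (word, bit) = self.locate(row, col);
        if value {
            self.data[word] |= 1u64 << bit;
        } else {
            self.data[word] &= !(1u64 << bit);
        }
    }

    pub fn get(&self, row: usize, col: usize) -> bool {
        let (word, bit) = self.locate(row, col);
        (self.data[word] >> bit) & 1 == 1
    }

    /// XORs row `src` into row `dest`, i.e. row addition over GF(2).
    pub fn add_assign_rows(&mut self, dest: usize, src: usize) {
        assert_ne!(dest, src);
        let w = self.words_per_row;
        for i in 0..w {
            let s = self.data[src * w + i];
            self.data[dest * w + i] ^= s;
        }
    }
}
```

With a single allocation, a row lookup is one multiplication and one index rather than two pointer dereferences, which is a plausible reason the change helps most when matrices are small.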
Christopher Berner
7b0d1c5cff Remove X matrix from release builds
This improves performance on small symbol counts by ~5%
2020-12-06 15:36:43 -08:00
Christopher Berner
5c13e8de6e Further optimize fused_addassign_mul_scalar_binary_avx2()
Move calculation of control flags out of loop to avoid one OR
instruction inside loop. Also statically enable BMI1 and detect its
presence to ensure that BEXTR2 intrinsic is inlined

Improves performance by ~5% on small symbol counts
2020-12-06 15:36:43 -08:00
Christopher Berner
d87e46c625 Optimize DenseBinaryMatrix.swap_columns()
Improves performance by 10-15% for symbol count = 100
2020-12-06 08:15:21 -08:00
Christopher Berner
3a05d7be3e Add BinaryOctetVec
Improves encoding speed of large symbol counts by ~5%
2020-12-06 08:15:21 -08:00
Christopher Berner
50301e1b5b Optimize query_non_zero_columns()
This reduces the time spent in the fourth phase from ~6% of encoding
time to ~1%, according to perf, and improves overall throughput by 3-4%
on large symbol counts.
2020-11-29 09:51:56 -08:00
Christopher Berner
c4d227fba1 Optimize memory layout of dense U matrix
Previously we used column major ordering. Switch to row major to
optimize sequential access of rows which is much more common in the
first phase, and can also be used in the fourth phase

This improves performance by ~10% on large symbol counts
2020-11-28 21:08:35 -08:00
Christopher Berner
6245ab1c9a Fix over-allocation of memory for dense U section of matrix
The previous code had an off-by-one error that caused an extra word to be
allocated for each row
2020-11-28 17:22:50 -08:00
Christopher Berner
3a4068a726 Fix typo in spelling of "access" 2020-11-28 17:22:50 -08:00
Christopher Berner
9a849add9b Optimize vector creation in get_sub_row_as_octets()
Improves performance by ~5%
2020-11-28 17:22:50 -08:00
Christopher Berner
8b462f5c83 Optimize processing of U matrix
Optimize Phases 2-5 to avoid writes to the first i columns.
Additionally, use pre-computed ops from first phase to implement third &
fifth phases

This improves encoding performance by ~15%, especially on large symbol
counts
2020-11-27 21:36:17 -08:00
Christopher Berner
a96272a0c7 Remove useless cfg guard 2020-11-23 20:00:42 -08:00
AnthonyMikh
4a6ddf1c26 Avoid bounds checking in loop
Slicing `src` checks bounds only once instead of on every iteration of loop.
2020-10-26 19:33:34 -07:00
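The technique in commit 4a6ddf1c26 can be shown with a small stand-alone function. This is an illustrative sketch only: `xor_into` is a hypothetical name, not the function touched by the commit. Re-slicing `src` to the length of `dest` performs a single bounds check up front and tells the compiler both slices have the same length, which usually lets it drop the per-iteration check on `src[i]`.

```rust
/// XORs the first `dest.len()` bytes of `src` into `dest`.
/// Panics if `src` is shorter than `dest`.
pub fn xor_into(dest: &mut [u8], src: &[u8]) {
    // Single bounds check: after this line, src.len() == dest.len().
    let src = &src[..dest.len()];
    for i in 0..dest.len() {
        // The index is provably in range for both slices, so the optimizer
        // can typically elide the checks inside the loop.
        dest[i] ^= src[i];
    }
}
```

Whether the checks are actually removed depends on the optimizer, so a change like this is worth confirming with a benchmark.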
Christopher Berner
4bf46ec16b Upgrade primal and pyo3 dependencies 2020-10-24 11:26:33 -07:00
Jonathan Nilsson
ab75fc1b6d OTI is Copy 2020-10-22 22:49:59 -07:00
Jonathan Nilsson
a81ca51f41 I need access to the partition function for my decoder, and I want to create an encoder from an ObjectTransmissionInformation 2020-10-22 22:49:59 -07:00
Jonathan Nilsson
a788b14bac Remove some clones and some allocations 2020-10-16 22:03:36 -07:00
Christopher Berner
95b6b5ae91 Make serde support optional 2020-08-30 09:39:39 -07:00
Christopher Berner
6330f94c4c Simplify multiplication table initialization
Replace unrolled loops with const fn while which is new in Rust 1.46
2020-08-29 21:23:40 -07:00
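As a rough illustration of what commit 6330f94c4c describes, the sketch below fills a lookup table at compile time with a `while` loop inside a `const fn`, which Rust has allowed since 1.46. It builds a table of powers of 2 in GF(256) rather than the crate's multiplication table, and `build_exp_table` and `EXP_TABLE` are hypothetical names for this example. The reduction polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D) is the one RFC 6330 specifies for its octet arithmetic.

```rust
/// Computes 2^i in GF(256) for i in 0..255, reducing by 0x11D on overflow.
const fn build_exp_table() -> [u8; 255] {
    let mut table = [0u8; 255];
    let mut value: u16 = 1;
    let mut i = 0;
    // `for` loops are not allowed in const fn, so a `while` loop is used.
    while i < 255 {
        table[i] = value as u8;
        value <<= 1;
        if value & 0x100 != 0 {
            // Fold the overflow bit back in using the field polynomial.
            value ^= 0x11D;
        }
        i += 1;
    }
    table
}

/// Evaluated entirely at compile time; no runtime initialization needed.
pub const EXP_TABLE: [u8; 255] = build_exp_table();
```

Compared with a manually unrolled initializer, the loop keeps the generating rule visible in the source while still producing a constant.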
Christopher Berner
f8240da5b5 Fix incorrect symbol calculation assertion 2020-06-23 21:07:16 -07:00
Christopher Berner
dca2ad8b7c Fix Clippy warnings 2020-06-23 20:50:03 -07:00
Christopher Berner
e08c78a800 Avoid allocating excess memory 2020-05-07 10:16:43 -07:00
Christopher Berner
48a9dcc2c0 Remove dead code 2020-05-07 10:16:43 -07:00
Christopher Berner
97aa0b5003 Fix crash in Decoder when decoding large numbers of blocks 2020-03-28 14:53:34 -07:00
Christopher Berner
88a9d6d582 Fix Clippy style warning 2020-03-20 23:32:45 -07:00
Christopher Berner
69246f50b1 Add public function to calculate object to block splits 2020-03-20 22:58:09 -07:00
Christopher Berner
7847099cd7 Fix source block numbering with uneven blocks
This fixes a critical bug where blocks with ids after ZL, see section
4.4.1.2. in RFC, were incorrectly numbered during encoding
2020-03-14 08:38:52 -07:00
Christopher Berner
e12085c195 Add assertion from RFC to parameter calculation 2020-03-14 08:38:52 -07:00
Christopher Berner
329598c48b Implement sub block support 2020-02-25 22:50:29 -08:00
Christopher Berner
26e8b0e509 Add EncoderBuilder to allow more configuration of encoding 2020-02-24 19:09:23 -08:00
Christopher Berner
04149b42ff Fix max length assertion
The maximum value in RFC6330 of 946270874880 is incorrect as
documented in errata id 5548
2020-02-22 14:36:06 -08:00
Christopher Berner
6eeeb67f70 Merge Python wrapper crate into main crate 2020-02-22 13:31:16 -08:00
Christopher Berner
29131bb4d2 Parallelize repair tests 2020-02-02 10:50:28 -08:00
Christopher Berner
f9edd667dc Remove unused dbg! invocation 2020-01-26 21:41:21 -08:00
Christopher Berner
f796c55332 Remove unnecessary Vec allocation 2020-01-26 21:41:21 -08:00
Christopher Berner
7d29fd95ef Remove outdated TODOs 2020-01-26 21:41:21 -08:00
Christopher Berner
c244828d71 Various minor refactorings 2020-01-26 10:15:56 -08:00
Christopher Berner
ec54f3c838 Replace SourceBlockEncoderCache with SourceBlockEncodingPlan 2020-01-26 10:15:56 -08:00
Anders Martinsson
04786d26fd Add operation vectors for better encoding performance
Using stored operation vectors when generating intermediate symbols makes
encoding around three times faster (depending on block size).

Signed-off-by: Anders Martinsson <anders.martinsson@intinor.se>
2020-01-26 10:15:56 -08:00
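The idea behind commit 04786d26fd, recording the row operations once and replaying them on symbol data, might look like the sketch below. It is deliberately simplified and makes several assumptions: `RowOp` and `replay` are hypothetical names, only XOR and swap operations are modelled, and symbols are plain byte vectors. The actual crate also has to handle multiplication by GF(256) scalars.

```rust
/// Hypothetical recorded operation on symbol rows.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RowOp {
    /// symbols[dest] ^= symbols[src]
    AddAssign { dest: usize, src: usize },
    /// Exchange symbols[a] and symbols[b].
    Swap { a: usize, b: usize },
}

/// Applies a previously recorded operation sequence to a block of symbols.
/// The expensive matrix elimination that produced `ops` is not repeated.
pub fn replay(ops: &[RowOp], symbols: &mut [Vec<u8>]) {
    for op in ops {
        match *op {
            RowOp::AddAssign { dest, src } => {
                assert_ne!(dest, src);
                // Clone the source row to avoid borrowing `symbols` twice.
                let src_row = symbols[src].clone();
                for (d, s) in symbols[dest].iter_mut().zip(&src_row) {
                    *d ^= *s;
                }
            }
            RowOp::Swap { a, b } => symbols.swap(a, b),
        }
    }
}
```

Because the operation list depends only on the matrix structure and not on the data, the same `ops` slice could be reused for every source block of the same size.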
Anders Martinsson
bdf5627e4c Add SSSE3 SIMD implementation
Add SSSE3 support to be used if AVX2 support is missing and SSSE3 is available.

Signed-off-by: Anders Martinsson <anders.martinsson@intinor.se>
2020-01-21 13:42:48 +01:00
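The fallback order described in commit bdf5627e4c (AVX2 first, then SSSE3) can be sketched with the standard `is_x86_feature_detected!` macro. The `Backend` enum and `select_backend` function are hypothetical names for this example; the scalar fallback on other CPUs is an assumption, and the crate's own dispatch code may be organized differently.

```rust
/// Hypothetical label for the implementation chosen at runtime.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Backend {
    Avx2,
    Ssse3,
    Scalar,
}

/// Picks the widest SIMD implementation the running CPU supports.
pub fn select_backend() -> Backend {
    // The detection macro only exists on x86 targets, so guard it with cfg.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            return Backend::Avx2;
        }
        if is_x86_feature_detected!("ssse3") {
            return Backend::Ssse3;
        }
    }
    // Portable fallback for CPUs without either feature and for other targets.
    Backend::Scalar
}
```

A caller would typically run the check once and then branch to the matching implementation.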
Christopher Berner
f124b6f2be Add more extended test coverage 2020-01-19 11:08:40 -08:00
Christopher Berner
b9dde8e167 Optimize columnar storage in sparse matrix
This reduces memory usage for large symbol counts by 10%+
2020-01-19 11:08:40 -08:00
Christopher Berner
f0177f9311 Convert column iterator to return only 1-valued rows 2020-01-19 11:08:40 -08:00
Christopher Berner
e2ede5e61a Add assertions to check that column index is always accurate 2020-01-19 11:08:40 -08:00
Christopher Berner
87f8e7ae81 Remove logic to update column index 2020-01-19 11:08:40 -08:00
Christopher Berner
b39bce022c Optimize binary sparse vector storage
Reduces memory usage by 10%+ for large symbol counts
2020-01-19 11:08:40 -08:00
Christopher Berner
c07da3e667 Index adjacent nodes in graph
Speeds up graph selection step by ~3x
2020-01-19 11:08:40 -08:00