parent 3ee51fa485
commit db222ed60b
@ -0,0 +1,37 @@
# iosifovitch

Iosifovitch is a blazingly fast implementation of the Levenshtein distance
metric.

This version has the same complexity as an earlier version that has since
been lost. It is not as fast as the original, because it currently
overallocates the memory it uses as a buffer for storing results.

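For orientation, here is a minimal sketch of the textbook single-row formulation of the metric. It is not the implementation in this repository; the function name `levenshtein` and the choice to work on `char`s are assumptions made for illustration. It shows how small the result buffer can get: one row, sized by the shorter string.

```rust
/// Textbook Levenshtein distance with a single row of scratch space.
/// Illustrative only; not the code in this repository.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // Index the buffer by the shorter string so it stays as small as possible.
    let (long, short) = if a.len() >= b.len() { (a, b) } else { (b, a) };

    // row[j] is the distance between the processed prefix of `long`
    // and the first j characters of `short`.
    let mut row: Vec<usize> = (0..=short.len()).collect();

    for (i, &lc) in long.iter().enumerate() {
        // `diag` carries the top-left neighbour, which the update overwrites.
        let mut diag = row[0];
        row[0] = i + 1;
        for (j, &sc) in short.iter().enumerate() {
            let cost = if lc == sc { 0 } else { 1 };
            let next = (diag + cost) // substitution or match
                .min(row[j] + 1) // insertion
                .min(row[j + 1] + 1); // deletion
            diag = row[j + 1];
            row[j + 1] = next;
        }
    }
    row[short.len()]
}
```

The buffer holds `short.len() + 1` entries, compared with `(a.len() + 1) * (b.len() + 1)` for a full matrix.
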
## Plans for further development

1. Make benchmarks.
2. Introduce more tests.
   2.1. Tests with prefixes, suffixes and infixes shared between the strings.
   2.2. Tests where the lengths of the strings are combinations of odd and
        even.
3. Reduce the size of the buffer. When this was done with the old version,
   performance increased by 100%.
4. Look into SIMD instructions.
5. Look into parallelism.

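The test ideas in items 2.1 and 2.2 could be sketched roughly as follows. Everything here is hypothetical: `reference_distance` is a slow full-matrix oracle written for this example, and `test_cases` and `failing_cases` are made-up helper names, not part of this repository. The sketch relies on two known properties: a shared prefix or suffix does not change the distance, and two strings with no characters in common have a distance equal to the longer length.

```rust
/// Slow full-matrix oracle, used only to produce expected values.
pub fn reference_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut d = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in 0..=a.len() {
        d[i][0] = i;
    }
    for j in 0..=b.len() {
        d[0][j] = j;
    }
    for i in 1..=a.len() {
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            d[i][j] = (d[i - 1][j - 1] + cost)
                .min(d[i - 1][j] + 1)
                .min(d[i][j - 1] + 1);
        }
    }
    d[a.len()][b.len()]
}

/// Builds (a, b, expected distance) triples.
pub fn test_cases() -> Vec<(String, String, usize)> {
    let mut cases = Vec::new();
    let cores = [("kitten", "sitting"), ("flaw", "lawn"), ("", "abc")];
    let affixes = ["", "x", "xy", "xyz"];
    for (a, b) in cores {
        // 2.1: a shared prefix and suffix must leave the distance unchanged.
        let expected = reference_distance(a, b);
        for pre in affixes {
            for post in affixes {
                cases.push((
                    format!("{}{}{}", pre, a, post),
                    format!("{}{}{}", pre, b, post),
                    expected,
                ));
            }
        }
        // 2.1: a shared infix ("--"); the expected value comes from the oracle.
        let (x, y) = (format!("{}--{}", a, b), format!("{}--{}", b, a));
        let expected_infix = reference_distance(&x, &y);
        cases.push((x, y, expected_infix));
    }
    // 2.2: every combination of odd and even lengths from 0 to 5.
    // The strings share no characters, so the distance is the longer length.
    for len_a in 0..=5usize {
        for len_b in 0..=5usize {
            cases.push(("a".repeat(len_a), "b".repeat(len_b), len_a.max(len_b)));
        }
    }
    cases
}

/// Returns the input pairs for which `candidate` disagrees with the expectation.
pub fn failing_cases(candidate: fn(&str, &str) -> usize) -> Vec<(String, String)> {
    test_cases()
        .into_iter()
        .filter(|(a, b, expected)| candidate(a, b) != *expected)
        .map(|(a, b, _)| (a, b))
        .collect()
}
```

Passing the function under test to `failing_cases` should yield an empty list if it agrees with the oracle on all generated cases.
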
## SIMD

I have some ideas for how SIMD instructions could be used to improve
performance, but I doubt they will have much effect on small strings, and they
might even be detrimental if the strings are too short.

The most straightforward approach would be to do more than one calculation at
a time, shifting the results down the SIMD registers.

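One common way to expose several independent calculations is to walk the table along its anti-diagonals, because every cell on an anti-diagonal depends only on the two previous anti-diagonals. Whether this matches the idea described above is an assumption. The sketch below is plain scalar Rust with no SIMD intrinsics, and the function name is hypothetical. It only shows the data layout in which the inner loop has no dependencies between its iterations.

```rust
/// Levenshtein distance computed anti-diagonal by anti-diagonal.
/// Scalar sketch only; the inner loop is the part that could map onto SIMD lanes.
pub fn levenshtein_antidiagonal(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (m, n) = (a.len(), b.len());

    // Each buffer is indexed by i; on diagonal k, entry i is cell (i, k - i).
    let mut prev2 = vec![0usize; m + 1]; // diagonal k - 2
    let mut prev1 = vec![0usize; m + 1]; // diagonal k - 1
    let mut cur = vec![0usize; m + 1]; // diagonal k

    for k in 0..=(m + n) {
        let lo = k.saturating_sub(n); // smallest i with k - i <= n
        let hi = k.min(m); // largest i on this diagonal
        // No iteration of this loop reads a value written by another iteration.
        for i in lo..=hi {
            let j = k - i;
            cur[i] = if i == 0 {
                j
            } else if j == 0 {
                i
            } else {
                let cost = usize::from(a[i - 1] != b[j - 1]);
                (prev2[i - 1] + cost) // cell (i - 1, j - 1)
                    .min(prev1[i - 1] + 1) // cell (i - 1, j)
                    .min(prev1[i] + 1) // cell (i, j - 1)
            };
        }
        // Rotate: diagonal k - 1 becomes k - 2, k becomes k - 1.
        std::mem::swap(&mut prev2, &mut prev1);
        std::mem::swap(&mut prev1, &mut cur);
    }
    // After the last rotation, prev1 holds diagonal m + n, whose only cell is (m, n).
    prev1[m]
}
```

For short strings most anti-diagonals contain only a few cells, which is consistent with the expectation above that small inputs would gain little.
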
## Parallelism

It should be possible to do the calculations recursively, by splitting the
longer string in the middle and then calculating the two parts separately.

If that can be done, it should be easy to spin up threads and run this on all
the cores.

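A single level of such a split can be sketched with the identity used in Hirschberg's algorithm: if the longer string is cut into `left` and `right`, the distance is the minimum over all cut points `k` of the shorter string of `dist(left, short[..k]) + dist(right, short[k..])`. Whether this is the split intended above is an assumption, and the names `last_row` and `levenshtein_split` are hypothetical. The two halves run on two scoped threads from the standard library.

```rust
use std::thread;

/// Last row of the table: distances between all of `a` and every prefix of `b`.
fn last_row(a: &[char], b: &[char]) -> Vec<usize> {
    let mut row: Vec<usize> = (0..=b.len()).collect();
    for (i, &ac) in a.iter().enumerate() {
        let mut diag = row[0];
        row[0] = i + 1;
        for (j, &bc) in b.iter().enumerate() {
            let next = (diag + usize::from(ac != bc))
                .min(row[j] + 1)
                .min(row[j + 1] + 1);
            diag = row[j + 1];
            row[j + 1] = next;
        }
    }
    row
}

/// Splits the longer string in the middle and computes both halves in parallel.
pub fn levenshtein_split(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (long, short) = if a.len() >= b.len() { (a, b) } else { (b, a) };
    let (left, right) = long.split_at(long.len() / 2);

    // Reversing both inputs turns prefix distances into suffix distances.
    let right_rev: Vec<char> = right.iter().rev().copied().collect();
    let short_rev: Vec<char> = short.iter().rev().copied().collect();

    let (fwd, bwd) = thread::scope(|s| {
        let f = s.spawn(|| last_row(left, &short));
        let g = s.spawn(|| last_row(&right_rev, &short_rev));
        (f.join().unwrap(), g.join().unwrap())
    });

    // fwd[k] = dist(left, short[..k]); bwd[n - k] = dist(right, short[k..]).
    let n = short.len();
    (0..=n).map(|k| fwd[k] + bwd[n - k]).min().unwrap()
}
```

Applying the same split again inside each half would give the recursive version. Spawning threads has a fixed cost, so this is only likely to pay off for long inputs.
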