This is big. These algorithms are useful for a ton of things (and can get very expensive on large data sets).
The Osmand/Organic Maps devs know about this?
It’s not that useful for them. This is for calculating the shortest path from a single source to all other vertices in the graph, but for route finding you only need to know a single path.
Very cool! Looks like the paper is here: https://arxiv.org/abs/2504.17033
"We care about your data, and we'd like to use cookies to give you a smooth browsing experience. Please agree and read more about our privacy policy."
At least they’re not hiding the fact that they couldn’t care less about your privacy, and only care about your data. At least that…
This is awesome! I’m reading through their paper right now, and might try to implement it to see how it works in practice.
In case anyone’s curious, still working on it. It’s not as simple as something like Dijkstra’s algorithm.
What’s really interesting is the requirement it seems to place on the graph itself. From what I can tell, it wants a graph where each node has an in-degree of at most 2 and an out-degree of at most 2, with a total degree of no more than 3. A traditional digraph can be converted to this format by splitting each node into a strongly connected cycle of nodes, with each node in the cycle carrying the in-edge and out-edge needed to maintain that cycle (with weights of 0) plus one of the original edges.
Theoretically, this invariant can be maintained by the graph data structure itself by adding nodes as needed when edges are added. That’s what my implementation does, to avoid the cost of converting the graph each time you run the algorithm. In this case, one of these node cycles represents your higher-level concept of a node (which I’m calling a node group).
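For anyone curious, here’s a rough sketch of the batch version of that conversion (Python just for illustration; the function name and the edge-list representation are my own choices, and my actual code maintains the invariant incrementally instead of converting in one pass):

```python
from collections import defaultdict

def to_constant_degree(edges):
    """Rebuild a weighted digraph, given as (u, v, w) triples, so that every
    node has in-degree <= 2, out-degree <= 2, and total degree <= 3.

    Each original node u becomes a cycle of "slot" nodes (u, 0), (u, 1), ...
    joined by zero-weight edges, and each slot carries exactly one of u's
    original in- or out-edges.
    """
    next_slot = defaultdict(int)   # original node -> next unused slot index

    def take_slot(u):
        i = next_slot[u]
        next_slot[u] += 1
        return (u, i)

    new_edges = []

    # Give every original edge its own pair of slot nodes.
    for u, v, w in edges:
        new_edges.append((take_slot(u), take_slot(v), w))

    # Close each node group into a zero-weight cycle so every slot of u can
    # still reach every other slot of u for free.
    for u, k in next_slot.items():
        if k > 1:
            for i in range(k):
                new_edges.append(((u, i), (u, (i + 1) % k), 0))

    return new_edges
```

For example, to_constant_degree([("a", "b", 5), ("a", "c", 2), ("b", "c", 1)]) turns the 3-node graph into 6 slot nodes, and each slot ends up with at most one real edge plus one cycle edge in and one cycle edge out, which is where the 2/2/3 degree bounds come from.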
The block-based list is also interesting, and I’ve been having trouble converting it to a data structure in code. I’m still working through the paper though, so hopefully that isn’t too bad to get done.
Guess I’ll post another update. The block-based data structure makes no sense to me. At some point it claims that looking up a pair in the data structure is O(1):

"To delete the key/value pair ⟨a,b⟩, we remove it directly from the linked list, which can be done in O(1) time."

This has me very confused. First, it doesn’t explain how to find which linked list to remove it from (every block is a linked list, and there are many blocks). You can binary search for the blocks that can contain the value and search them in order based on their upper bounds, but that’d be O(M * |D_0|) just to search the non-bulk-prepended values.

Second, it feels like the data structure is described primarily from a theoretical perspective. Linked lists here are only solid in theory, but from a practical standpoint it’s better to initialize each block as a preallocated array (vector) of size M. Also, it’s not clear if each block’s elements should be sorted by key within the block itself, but it would make the most sense to do that in my opinion, cutting the split operation from O(M) to O(1), and it’d answer how PULL() returns “the smallest M values”.

Anyway, it’s also possible that the language of the paper is just beyond me.
I like the divide-and-conquer approach, but the paper itself is difficult to implement in my opinion.
Super interesting; I wonder whether the fraction of nodes that need to be represented by cycles eats into the performance benefit vs. other approaches.
This might be a stupid question, but I only have limited experience with sorting algorithms and was wondering: could an algorithm like this be improved specifically for multi-threaded processing? I mean, could an algorithm that is generally NOT the best option with a single thread become better than other algorithms when it can delegate part of the work to different threads?
Perhaps something that runs the same algorithm on several threads, each starting at a different point of the map, but sharing their findings with the other threads?
Algorithms can be designed for multithreading, yes. Divide-and-conquer algorithms, like this one, break the problem into independent chunks, and a map-reduce over those chunks can spread the work across multiple threads.
The real question is whether you gain anything from it. Creating a thread and sending data back and forth has a cost as well, and it’s usually a pretty big one relative to the work being done.
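A minimal sketch of that map-reduce pattern (Python; the sum-of-squares workload and worker count are just stand-ins, and I’m using processes rather than threads because Python’s GIL keeps CPU-bound threads from running in parallel):

```python
from concurrent.futures import ProcessPoolExecutor

def chunk_sum(chunk):
    # "Map" step: each worker handles one independent chunk of the input.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    chunks = [data[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(chunk_sum, chunks)   # fan the chunks out
    return sum(partials)                         # "reduce" back on the main process

if __name__ == "__main__":
    data = list(range(1_000_000))
    # Spinning up workers and shipping the chunks to them has a fixed cost, so
    # for small inputs this loses to a plain single-threaded sum.
    print(parallel_sum_of_squares(data))
```

That fixed setup and data-transfer cost is exactly what has to be amortized before the parallel version wins.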