I do not think that solving parallelizing problems would be ground breaking. For example taking an exponential problem, like the Traveling Sales Person problem and making an O^2 version would be Nobel prize material, however making the same TSP take advantage of N cores is not. It still would be exponential, but just run simultaneously on more cores.
I do not know what the bottleneck algorithm is, but whatever it is, should be convertible to handle N smaller slices in N threads rather than a single thread handling it all. For example, remember our old friend, the Bubble Sort? How would you write that to take advantage of N cores? The answer is the Odd-even sort. Same sort, but one is for multi-processing systems. That is the kind of thing I am talking about.
Ill grant you this might be difficult, but it is not impossible and certainly not like cracking the big NP Complete problem that would make you a billionaire. Anyway, you just might not have the time or even given the solution it might not integrate into your code base easily. I know it can be very problematic to take a classic single threaded rendering pipeline and parallelize that, even if you have the answer right in front of you. That can be a major structural change, which might not be in the cards.