Quickly comparing a string against a collection of strings in Java

Shared by user 良人唤不回

I am trying to calculate the edit distance from a string to every item in a collection in order to find the closest match. My current problem is that the collection is very large (about 25,000 items), so I narrowed the set down to strings of similar length, but that still leaves a few thousand strings, which is still very slow. Is there a data structure that allows quick lookup of similar strings, or is there another way I could address this problem?

Recommended answer

Sounds like a BK-tree might be what you want. Here's an article discussing them: http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK-Trees. A quick Google search turns up some Java implementations.
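To make the suggestion concrete, here is a minimal sketch of a BK-tree in Java, assuming plain Levenshtein distance as the metric. The class name `BKTree` and the methods `add`, `search`, and `distance` are made up for illustration and are not taken from the linked article or from any particular library. Each child is stored under its distance to the parent, and a lookup uses the triangle inequality to skip whole subtrees.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative BK-tree keyed on Levenshtein distance (hypothetical names). */
class BKTree {
    private static final class Node {
        final String word;
        // Child subtrees, keyed by their edit distance to this node's word.
        final Map<Integer, Node> children = new HashMap<>();
        Node(String word) { this.word = word; }
    }

    private Node root;

    /** Two-row dynamic-programming Levenshtein distance. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    /** Inserts a word by walking down the edges labeled with its distance. */
    void add(String word) {
        if (root == null) { root = new Node(word); return; }
        Node node = root;
        while (true) {
            int d = distance(word, node.word);
            if (d == 0) return; // word is already in the tree
            Node child = node.children.get(d);
            if (child == null) {
                node.children.put(d, new Node(word));
                return;
            }
            node = child;
        }
    }

    /** Returns every stored word within maxDistance edits of the query. */
    List<String> search(String query, int maxDistance) {
        List<String> results = new ArrayList<>();
        if (root == null) return results;
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            int d = distance(query, node.word);
            if (d <= maxDistance) results.add(node.word);
            // Triangle inequality: a match can only sit under an edge whose
            // label lies in [d - maxDistance, d + maxDistance].
            for (Map.Entry<Integer, Node> e : node.children.entrySet()) {
                int k = e.getKey();
                if (k >= d - maxDistance && k <= d + maxDistance) {
                    stack.push(e.getValue());
                }
            }
        }
        return results;
    }
}
```

To find the closest match, one option is to build the tree once from the collection and then call `search` with a small `maxDistance`, raising it only if nothing comes back. How much of the tree gets pruned depends on the data and on the threshold, so it is worth measuring against the real 25,000 strings before relying on it.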
