Algorithm: find all common substrings between two strings where order is preserved

I would like to discuss the algorithm here, not code.

Problem: Let S and T be two sequences of elements. Find the common subsequences between them where the order of the elements is preserved.

It should have O(n + m) running time where n is the length of S, and m is the length of T. I would also like to make the assumption that for the most part the two sequences will be similar.

An optimal solution? After some research, one solution that appears to be optimal is to first build a generalised suffix tree for the two sequences. Then find the longest common substring and treat it as part of the solution. Then either remove this substring from the tree, or remove it from the two original sequences to form S' and T' and build a new suffix tree for those. Then find the longest common substring between S' and T', and so on.
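The repeated longest-common-substring idea above can be sketched as follows. This is only an illustration under stated assumptions, not the suffix-tree method itself: it swaps in a simple O(n*m) dynamic programme for the longest common substring, and instead of deleting the match and concatenating what remains, it recurses on the parts to the left and to the right of each match, which is one way to guarantee that the reported blocks keep their relative order in both sequences. The function names `longest_common_substring` and `common_blocks` are hypothetical.

```python
def longest_common_substring(s, t, s_lo, s_hi, t_lo, t_hi):
    """Return (i, j, k): the longest common substring of s[s_lo:s_hi] and
    t[t_lo:t_hi] starts at s[i] and t[j] and has length k.
    Assumption: a plain O(n*m) dynamic programme stands in for the suffix tree."""
    best = (s_lo, t_lo, 0)
    width = t_hi - t_lo
    prev = [0] * (width + 1)  # prev[x + 1]: match length ending at s[i - 1], t[t_lo + x]
    for i in range(s_lo, s_hi):
        cur = [0] * (width + 1)
        for j in range(t_lo, t_hi):
            if s[i] == t[j]:
                k = prev[j - t_lo] + 1
                cur[j - t_lo + 1] = k
                if k > best[2]:
                    best = (i - k + 1, j - k + 1, k)
        prev = cur
    return best


def common_blocks(s, t):
    """Greedily collect common blocks as (start_in_s, start_in_t, length),
    sorted by position, with order preserved in both sequences."""
    blocks = []
    stack = [(0, len(s), 0, len(t))]
    while stack:
        s_lo, s_hi, t_lo, t_hi = stack.pop()
        i, j, k = longest_common_substring(s, t, s_lo, s_hi, t_lo, t_hi)
        if k == 0:
            continue  # nothing in common inside this window
        blocks.append((i, j, k))
        stack.append((s_lo, i, t_lo, j))          # region left of the match
        stack.append((i + k, s_hi, j + k, t_hi))  # region right of the match
    return sorted(blocks)


print(common_blocks("abcXdef", "abcYdef"))
```

Python's standard `difflib.SequenceMatcher.get_matching_blocks()` follows a similar find-the-longest-match-then-recurse strategy, so it may be a practical starting point when the two sequences are mostly similar. Note that a greedy choice of the longest block is not guaranteed to maximise the total matched length.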

To analyze the running time: building the generalised suffix tree for S and T takes O(n + m), and the lengths and starting positions of the longest common substrings of S and T can then be found in O(n + m).

Are there other (more) practical solutions that someone knows of or can link to? Any published papers considering the same or related problem you all know about? Input and constructive criticism about the above solution? Thanks for all your time!

Recommended answer

My first thought was the use of a suffix tree, and relating it to the LCS problem. But I am not sure what a better solution would be off the top of my head. I did a quick search and came across a few papers and projects that might be useful, but no guarantees.
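For reference, the LCS (longest common subsequence) problem mentioned here has a textbook dynamic-programming solution. The sketch below is that standard O(n*m) version, not the O(n + m) method the question asks for, and the function name `lcs` is hypothetical. It returns a single longest subsequence whose elements appear in the same order in both inputs.

```python
def lcs(s, t):
    """Return one longest common subsequence of s and t as a list.
    Textbook O(n*m) dynamic programme, shown only to illustrate the LCS problem."""
    n, m = len(s), len(t)
    # dp[i][j] = length of an LCS of s[i:] and t[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if s[i] == t[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    # Walk the table from the front to recover one optimal subsequence.
    out = []
    i = j = 0
    while i < n and j < m:
        if s[i] == t[j]:
            out.append(s[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


print("".join(lcs("ABCBDAB", "BDCABA")))
```

When the two inputs are known to be similar, diff-style algorithms whose cost depends on the number of differences rather than on n*m may be more practical than this full table.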

http://dl.acm.org/citation.cfm?id=1625377 (direct link here I believe: http://www.aaai.org/Papers/IJCAI/2007/IJCAI07-101.pdf)

http://code.google.com/p/all-common-subsequences/

Sorry, it has been a long day and I am not quite awake enough to attempt a better solution myself.
