代码之家  ›  专栏  ›  技术社区  ›  sbi

有没有框架函数帮助查找多个字符串最长的公共起始子字符串?

c#
  •  2
  • sbi  · 技术社区  · 14 年前

    我有一个字符串列表(代表路径和),它们都应该有一个共同的开始(根路径)。我需要有一个共同的开始。

    这只是几行代码,但我有一种烦躁的感觉,那就是每年必须将这些代码组合成100万次,并且框架中可能有一个算法可以用于此目的,但找不到任何东西。
    另外,我想这是以前问过的,但我觉得很枯燥。

    有什么暗示吗?

    7 回复  |  直到 12 年前
        1
  •  2
  •   sbi    14 年前

    如果有人感兴趣,我想说的是:

        public static string GetCommonStartingSubString(IList<string> strings)
        {
            if (strings.Count == 0)
                return "";
            if (strings.Count == 1)
                return strings[0];
            int charIdx = 0;
            while (IsCommonChar(strings, charIdx))
                ++charIdx;
            return strings[0].Substring(0, charIdx);
        }
        private static bool IsCommonChar(IList<string> strings, int charIdx)
        {
            if(strings[0].Length <= charIdx)
                return false;
            for (int strIdx = 1; strIdx < strings.Count; ++strIdx)
                if (strings[strIdx].Length <= charIdx 
                 || strings[strIdx][charIdx] != strings[0][charIdx])
                    return false;
            return true;
        }
    
        2
  •  1
  •   Thomas Levesque    14 年前

    这种方法应该有效:

    string GetLongestCommonPrefix(IEnumerable<string> items)
    {
        return items.Aggregate(default(string), GetLongestCommonPrefix);
    }
    
    string GetLongestCommonPrefix(string s1, string s2)
    {
        if (s1 == null || s2 == null)
            return s1 ?? s2;
    
        int n = Math.Min(s1.Length, s2.Length);
        int i;
        for (i = 0; i < n; i++)
        {
            if (s1[i] != s2[i])
                break;
        }
        return s1.Substring(0, i);
    }
    
        3
  •  1
  •   Kirk Broadhurst    14 年前

    请原谅我的普通变量命名,它不是很快,但应该这样做:

    // your list of strings...
    List<string> strings;    
    
    string shortestString = strings.First(x => x.Length == 
        strings.Select(y => y.Length).Min());
    while (!strings.All(s => s.StartsWith(shortestString)))
    {
        shortestString = shortestString.Substring(0, shortestString.Length - 1);
    }
    
        4
  •  0
  •   Tomas Petricek    14 年前

    简化实现的一个想法是只编写一个方法来获取两个字符串中最长的子字符串,然后使用 Aggregate 来自Linq的方法。类似:

    strings.Skip(1).Aggregate(strings.First(), GetCommonSubString);
    

    我认为没有任何优雅的方法来实现 GetCommonSubstring 使用标准方法处理字符串。如果您关心性能,那么您可能必须以“直接”的方式实现它。使用LINQ的较慢但较短的替代方案可能如下所示:

    var chars = 
      str1.Zip(str2, (c1, c2) => new { Match = c1 == c2, Char = c1 })
          .TakeWhile(c => c.Match).Select(c => c.Char).ToArray();
    return new string(chars);
    

    这首先“压缩”两个字符串,然后在字符相同的地方使用 TakeWhile . 其余部分生成一个字符数组,可用于创建包含结果的字符串。

        5
  •  0
  •   Jonas Elfström    14 年前

    也许我把你的问题简化得太多了,但是呢

    var rootPath = paths.Select(s => new {path = s, depth = s.Split('\\').Length}). Aggregate((memo, curr) => curr.depth < memo.depth ? curr : memo).path;

    绝望,很可能很慢,到处都是愚蠢的尝试。

    var paths = new List<string> { @"C:\Ruby19\lib\ruby\gems",
                                   @"C:\Ruby19\lib\ruby\gems\1.9.2",
                                   @"C:\Ruby19\lib\ruby\gems",
                                   @"C:\Ruby19\lib\test\fest\hest"};
    
    var rootPath = paths.Select(s => new { p = s.Split('\\') })
                        .Aggregate((memo, curr) => new { p = curr.p.TakeWhile((stp, ind) => stp == memo.p.ElementAtOrDefault(ind)).ToArray() })
                        .p.Join("\\");
    

    =>rootpath=“c:\ruby19\lib”

        6
  •  0
  •   Oliver    14 年前

    我以前也遇到过同样的问题(像其他许多人一样)。这是我想出的解决办法。我没有做任何性能测量,但是我没有对100个元素的列表有任何问题。

    using System;
    using System.Collections.Generic;
    using System.Linq;
    
    namespace FEV.TOPexpert.Common.Extensions
    {
        public static class IEnumerableOfStringExtension
        {
            /// <summary>
            /// Finds the most common left string in a sequence of strings.
            /// </summary>
            /// <param name="source">The sequence to search in.</param>
            /// <returns>The most common left string in the sequence.</returns>
            public static string MostCommonLeftString(this IEnumerable<string> source)
            {
                return source.MostCommonLeftString(StringComparison.InvariantCulture);
            }
    
            /// <summary>
            /// Finds the most common left string in a sequence of strings.
            /// </summary>
            /// <param name="source">The sequence to search in.</param>
            /// <param name="comparisonType">Type of the comparison.</param>
            /// <returns>The most common left string in the sequence.</returns>
            public static string MostCommonLeftString(this IEnumerable<string> source, StringComparison comparisonType)
            {
                if (source == null)
                    throw new ArgumentNullException("source");
    
                string mcs = String.Empty;
    
                using (var e = source.GetEnumerator())
                {
                    if (!e.MoveNext())
                        return mcs;
    
                    mcs = e.Current;
                    while (e.MoveNext())
                        mcs = mcs.MostCommonLeftString(e.Current, comparisonType);
                }
                return mcs;
            }
    
            /// <summary>
            /// Returns a sequence with the most common left strings from a sequence of strings.
            /// </summary>
            /// <param name="source">A sequence of string to search through.</param>
            /// <returns>A sequence of the most common left strings ordered in descending order.</returns>
            public static IEnumerable<string> MostCommonLeftStrings(this IEnumerable<string> source)
            {
                return MostCommonLeftStrings(source, StringComparison.InvariantCulture);
            }
    
            /// <summary>
            /// Returns a sequence with the most common left strings from a sequence of strings.
            /// </summary>
            /// <param name="source">A sequence of string to search through.</param>
            /// <param name="comparisonType">Type of comparison.</param>
            /// <returns>A sequence of the most common left strings ordered in descending order.</returns>
            public static IEnumerable<string> MostCommonLeftStrings(this IEnumerable<string> source, StringComparison comparisonType)
            {
                if (source == null)
                    throw new ArgumentNullException("source");
    
                var listOfMcs = new List<string>();
    
                using (var e = source.GetEnumerator())
                {
                    while (e.MoveNext())
                    {
                        if (e.Current == null)
                            continue;
    
                        string removeFromList = String.Empty;
                        string addToList = String.Empty;
    
                        foreach (var element in listOfMcs)
                        {
                            addToList = e.Current.MostCommonLeftString(element, comparisonType);
    
                            if (addToList.Length > 0)
                            {
                                removeFromList = element;
                                break;
                            }
                        }
    
                        if (removeFromList.Length <= 0)
                        {
                            listOfMcs.Add(e.Current);
                            continue;
                        }
    
                        if (addToList != removeFromList)
                        {
                            listOfMcs.Remove(removeFromList);
                            listOfMcs.Add(addToList);
                        }
                    }
                }
    
                return listOfMcs.OrderByDescending(item => item.Length);
            }
    
            /// <summary>
            /// Returns a string that both strings have in common started from the left.
            /// </summary>
            /// <param name="first">The first string.</param>
            /// <param name="second">The second string.</param>
            /// <returns>Returns a string that both strings have in common started from the left.</returns>
            public static string MostCommonLeftString(this string first, string second)
            {
                return first.MostCommonLeftString(second, StringComparison.InvariantCulture);
            }
    
            /// <summary>
            /// Returns a string that both strings have in common started from the left.
            /// </summary>
            /// <param name="first">The first string.</param>
            /// <param name="second">The second string.</param>
            /// <param name="comparisonType">Type of comparison.</param>
            /// <returns>Returns a string that both strings have in common started from the left.</returns>
            public static string MostCommonLeftString(this string first, string second, StringComparison comparisonType)
            {
                if (first == null
                    || second == null)
                    return null;
    
                int length = Math.Min(first.Length, second.Length);
                first = first.Substring(0, length);
                second = second.Substring(0, length);
    
                while (!first.Equals(second, comparisonType))
                {
                    first = first.Substring(0, first.Length - 1);
                    second = second.Substring(0, second.Length - 1);
                }
    
                return first;
            }
    
            private static bool MatchesWithList(string match, IList<string> elements, StringComparison comparisonType)
            {
                string removeFromList = String.Empty;
                string addToList = String.Empty;
    
                foreach (var element in elements)
                {
                    addToList = match.MostCommonLeftString(element, comparisonType);
    
                    if (addToList.Length > 0)
                    {
                        removeFromList = element;
                    }
                }
    
                if (removeFromList.Length > 0)
                {
                    if (addToList != removeFromList)
                    {
                        elements.Remove(removeFromList);
                        elements.Add(addToList);
                    }
                    return true;
                }
    
                return false;
            }
        }
    }
    
        7
  •  0
  •   cdiggins    12 年前

    下面返回任意一组的最长公共前缀 IEnumerable<T> 不仅仅是字符串。

    public static bool Same<T>(this IEnumerable<T> xs) {
      return !xs.Any() || !xs.Skip(!xs.Skip(1).All(x => x.Equals(xs.First()));
    }
    
    public static IEnumerable<T> CommonPrefix<T>(this IEnumerable<IEnumerable<T>> xss) {
      var r = new List<T>();
      var es = xss.Select(x => x.GetEnumerator()).ToList();
      while (es.Select(x => x.MoveNext()).All(x => x))
         if (!es.Select(x => x.Current).Same())
            return r;
         return r;
      }
    }