
Regex performance issue when using IsMatch

c#
  •  1
  • Hishaam Namooya  · Tech Community  · 6 years ago

    string requestedPath = HttpUtility.UrlDecode(this.StripLanguage(currentContext.InputUrl.AbsolutePath));
    string requestedPathAndQuery = HttpUtility.UrlDecode(currentContext.InputUrl.PathAndQuery);
    string requestedRawUrl = HttpUtility.UrlDecode(currentContext.InputUrl.PathAndQuery);
    string requestedUrl =
        HttpUtility.UrlDecode(
            string.Concat(
                currentContext.InputUrl.Scheme,
                "://",
                currentContext.InputUrl.Host,
                requestedRawUrl));
    
    string requestedRawUrlDomainAppended = HttpUtility.UrlDecode(currentContext.InputUrl.AbsoluteUri);
    string requestedPathWithCulture = HttpUtility.UrlDecode(currentContext.InputUrl.AbsolutePath);
    
    var finalRequestedURL = string.Empty;
    finalRequestedURL = Regex.IsMatch(requestedPathAndQuery,matchPattern.Trim(),RegexOptions.IgnoreCase)
                        ? requestedPathAndQuery
                        : Regex.IsMatch(requestedPath,matchPattern.Trim(),RegexOptions.IgnoreCase)
                            ? requestedPath
                            : Regex.IsMatch(requestedPathWithCulture,matchPattern.Trim(),RegexOptions.IgnoreCase)
                                ? requestedPathWithCulture
                                : Regex.IsMatch(requestedRawUrl,matchPattern.Trim(),RegexOptions.IgnoreCase)
                                    ? requestedRawUrl
                                    : Regex.IsMatch(requestedUrl,matchPattern.Trim(),RegexOptions.IgnoreCase)
                                        ? requestedRawUrlDomainAppended
                                        : string.Empty;
    

    The matchPattern is (.*)/articles/my-article(.*), given together with the URL http://www.google.com

    The regex works fine, but under a large number of requests our CPU hits 100%.

    2 answers  |  as of 6 years ago
        1
  •  2
  •   pstrjds    6 years ago

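    I would try creating an actual Regex variable and reusing it. That should help speed things up. I would also suggest changing the ternary chain to regular if/else if/else statements. I think that is more readable (just a personal opinion).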

    string requestedPath = HttpUtility.UrlDecode(this.StripLanguage(currentContext.InputUrl.AbsolutePath));
    string requestedPathAndQuery = HttpUtility.UrlDecode(currentContext.InputUrl.PathAndQuery);
    string requestedRawUrl = HttpUtility.UrlDecode(currentContext.InputUrl.PathAndQuery);
    string requestedUrl =
        HttpUtility.UrlDecode(
            string.Concat(
                currentContext.InputUrl.Scheme,
                "://",
                currentContext.InputUrl.Host,
                requestedRawUrl));
    
    string requestedRawUrlDomainAppended = HttpUtility.UrlDecode(currentContext.InputUrl.AbsoluteUri);
    string requestedPathWithCulture = HttpUtility.UrlDecode(currentContext.InputUrl.AbsolutePath);
    
    var regex = new Regex(matchPattern.Trim(), RegexOptions.IgnoreCase);
    var finalRequestedURL = regex.IsMatch(requestedPathAndQuery)
                        ? requestedPathAndQuery
                        : regex.IsMatch(requestedPath)
                            ? requestedPath
                            : regex.IsMatch(requestedPathWithCulture)
                                ? requestedPathWithCulture
                                : regex.IsMatch(requestedRawUrl)
                                    ? requestedRawUrl
                                    : regex.IsMatch(requestedUrl)
                                        ? requestedRawUrlDomainAppended
                                        : string.Empty;
    

    As I pointed out in the comments above, two of the strings are identical, so removing one of them saves a comparison.

    string requestedPath = HttpUtility.UrlDecode(this.StripLanguage(currentContext.InputUrl.AbsolutePath));
    string requestedPathAndQuery = HttpUtility.UrlDecode(currentContext.InputUrl.PathAndQuery);
    
    // This string is identical to requestedPathAndQuery, so I am removing it
    // string requestedRawUrl = HttpUtility.UrlDecode(currentContext.InputUrl.PathAndQuery);
    
    string requestedUrl =
        HttpUtility.UrlDecode(
            string.Concat(
                currentContext.InputUrl.Scheme,
                "://",
                currentContext.InputUrl.Host,
                // requestedRawUrl is commented out above; requestedPathAndQuery holds the same value
                requestedPathAndQuery));
    
    string requestedRawUrlDomainAppended = HttpUtility.UrlDecode(currentContext.InputUrl.AbsoluteUri);
    string requestedPathWithCulture = HttpUtility.UrlDecode(currentContext.InputUrl.AbsolutePath);
    
    var regex = new Regex(matchPattern.Trim(), RegexOptions.IgnoreCase);
    var finalRequestedURL = string.Empty;
    
    // You could even add braces here to aid readability, but this
    // removes the indentation/nesting that makes the code harder
    // to read and follow
    if (regex.IsMatch(requestedPathAndQuery)) finalRequestedURL = requestedPathAndQuery;
    else if (regex.IsMatch(requestedPath)) finalRequestedURL = requestedPath;
    else if (regex.IsMatch(requestedPathWithCulture)) finalRequestedURL = requestedPathWithCulture;
    else if (regex.IsMatch(requestedUrl)) finalRequestedURL = requestedRawUrlDomainAppended;
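
    If the chain of checks keeps growing, the "first candidate that matches wins" logic could also be pulled into a small helper. The following is only a sketch: UrlMatcher and FirstMatch are hypothetical names that do not appear in the original code, and the helper assumes the string to return is the same string that was tested.

```csharp
using System.Text.RegularExpressions;

public static class UrlMatcher
{
    // Hypothetical helper (not part of the original code): returns the first
    // candidate that the regex matches, or string.Empty if none of them match.
    public static string FirstMatch(Regex regex, params string[] candidates)
    {
        foreach (var candidate in candidates)
        {
            // Skip null entries so IsMatch does not throw ArgumentNullException.
            if (candidate != null && regex.IsMatch(candidate))
            {
                return candidate;
            }
        }

        return string.Empty;
    }
}
```

    It would be called as UrlMatcher.FirstMatch(regex, requestedPathAndQuery, requestedPath, requestedPathWithCulture). The last branch of the original chain tests requestedUrl but returns requestedRawUrlDomainAppended, so that case does not fit the helper and would still need its own check.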
    
        2
  •  0
  •   Guru Stron    6 years ago

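    As I said in the comments, if you expect only a limited number of distinct patterns that can be reused throughout the lifetime of the application, you can create a static Dictionary (probably a concurrent one) to cache these regexes and reuse them.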

    Example code:

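    using System.Collections.Concurrent;
    using System.Text.RegularExpressions;

    public class MyHandler
    {
        // Shared by all requests; ConcurrentDictionary is safe for concurrent reads and writes.
        private static readonly ConcurrentDictionary<string, Regex> dict = new ConcurrentDictionary<string, Regex>();
    
        public void Handle(string urlPattern)
        {
            urlPattern = urlPattern.Trim();
            // Construct the Regex when a pattern is first seen, then reuse the cached instance.
            var regex = dict.GetOrAdd(urlPattern, s => new Regex(s, RegexOptions.Compiled | RegexOptions.IgnoreCase));
            // use regex
        }
    }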
    

    Also test whether RegexOptions.Compiled makes it faster or slower.
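
    One rough way to run that test is to time both variants with a Stopwatch. This is a minimal sketch, not a rigorous benchmark: RegexTiming, TimeIsMatch and PrintComparison are made-up names, and the sample input and iteration count are assumptions you would replace with real request paths and load.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexTiming
{
    // Runs IsMatch `iterations` times on one input and returns the elapsed milliseconds.
    public static long TimeIsMatch(Regex regex, string input, int iterations)
    {
        regex.IsMatch(input); // warm-up call so one-time setup is not part of the timing

        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            regex.IsMatch(input);
        }
        stopwatch.Stop();

        return stopwatch.ElapsedMilliseconds;
    }

    // Builds the same pattern with and without RegexOptions.Compiled and prints both timings.
    public static void PrintComparison(string pattern, string input, int iterations)
    {
        var interpreted = new Regex(pattern, RegexOptions.IgnoreCase);
        var compiled = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Compiled);

        Console.WriteLine("Interpreted: {0} ms", TimeIsMatch(interpreted, input, iterations));
        Console.WriteLine("Compiled:    {0} ms", TimeIsMatch(compiled, input, iterations));
    }
}
```

    For example, RegexTiming.PrintComparison("(.*)/articles/my-article(.*)", "/en/articles/my-article?page=2", 100000) uses the pattern from the question against an invented path. The numbers depend on the runtime and on the inputs, and the sketch leaves out the cost of constructing the compiled regex, so it only tells you something if the regex is cached and reused as shown above.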