
Simple regular expression question

  •  2
  • Sam 山  · Tech Community  · 14 years ago

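    I have a title like this on my blog: Main Idea, key term, key term, key term

    I want the main idea and the key terms to have different font sizes. My first thought was to match everything from the first comma to the end of the string and replace that chunk with the same text wrapped in a span tag whose class makes the font smaller.

    Here is the plan:

    HTML (before)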

      <a href="stupidreqexquestion">Main Idea, key term, key term, key term</a>
    

    HTML (after)

      <a href="stupidreqexquestion">Main Idea <span class="smaller_font">, key term, key term, key term</span></a>
    

    I'm using Rails, so I plan to add this as a helper function, for example:

    Helper

      def make_key_words_in_title_smaller(title)
          #replace the keywords in the title with key words surrounded by span tags
      end 
    

    View

      <% @posts.each do |post |%>
          <%= make_key_words_in_title_smaller(post.title)%>
      <% end -%>
    
    2 answers  |  latest 14 years ago
        1
  •  3
  •   nonopolarity    14 years ago

    If you don't mind that a Main Idea such as "Welcome home, Roxy Carmichael" (a comma inside double quotes) would be split at its own comma:

    >> t = "Main Idea, key term, key term, key term"
    => "Main Idea, key term, key term, key term"
    
    >> t.gsub(/(.*?)(,.*)/, '\1 <span class="smaller_font">\2</span>')
    => "Main Idea <span class=\"smaller_font\">, key term, key term, key term</span>"
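
As a rough sketch, the substitution above could sit inside the helper from the question. The helper name comes from the question; escaping with `ERB::Util.html_escape` and anchoring the pattern with `[^,]*` are my own assumptions, not part of the answer. In a Rails view the returned string would probably also need to be marked `html_safe`, otherwise the span tag itself gets escaped.

```ruby
require 'erb'

# Sketch of the question's helper in plain Ruby (no Rails required).
def make_key_words_in_title_smaller(title)
  # Escape first so a title containing < or & cannot inject markup (assumption).
  escaped = ERB::Util.html_escape(title).to_s
  # [^,]* stops at the first comma; (,.*) captures the comma and everything after it.
  # A title without a comma does not match and is returned unchanged.
  escaped.sub(/\A([^,]*)(,.*)\z/m, '\1 <span class="smaller_font">\2</span>')
end

puts make_key_words_in_title_smaller('Main Idea, key term, key term, key term')
puts make_key_words_in_title_smaller('Main Idea')
```

Using `sub` rather than `gsub` makes it explicit that only one replacement is intended.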
    
        2
  •  2
  •   the Tin Man    14 years ago

    If the string is unadorned (i.e., it contains no markup), either of these will work:

    data = 'Main Idea, key term, key term, key term'
    
    # example #1
    /^(.+?, )(.+)/.match(data).captures.each_slice(2).map { |a,b| a << %Q{<span class="smaller_font">#{ b }</span>}}.first 
    # => "Main Idea, <span class=\"smaller_font\">key term, key term, key term</span>"
    
    # example #2
    data =~ /^(.+?, )(.+)/
    $1 << %Q{<span class="smaller_font">#{ $2 }</span>} 
    # => "Main Idea, <span class=\"smaller_font\">key term, key term, key term</span>"
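
One caveat with both examples: when a title has no ", " in it, the pattern does not match, so `match` returns nil in example #1 and `$1` is nil in example #2, and either one raises NoMethodError. A minimal sketch that tolerates such titles, using `String#partition` instead of a regex (the method name `wrap_key_terms` is hypothetical):

```ruby
# Wrap everything after the first ", " in a span; leave other titles untouched.
def wrap_key_terms(title)
  # partition splits at the first occurrence only and returns three parts.
  main_idea, separator, key_terms = title.partition(', ')
  # An empty separator means there was no ", " in the title.
  return title.dup if separator.empty?

  %Q{#{main_idea}#{separator}<span class="smaller_font">#{key_terms}</span>}
end

puts wrap_key_terms('Main Idea, key term, key term, key term')
puts wrap_key_terms('Main Idea')
```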
    

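    If the string does contain markup, processing HTML or XML with a regex is discouraged because it breaks easily. Very simple uses on HTML you control are reasonably safe, but if the content or its format changes, the regex may break your code.

    An HTML parser is the usually recommended solution because it keeps working when the content or its format changes. This is how I'd do it with Nokogiri. I'm being deliberately verbose to show what is happening: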
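
To illustrate how easily this goes wrong, here is a small sketch that runs a naive first-comma regex over a whole tag. The `title` attribute on the link is a made-up addition for the demonstration, and `wrap_after_first_comma` is a hypothetical name.

```ruby
# Naive approach: everything from the first comma onward goes into the span.
def wrap_after_first_comma(text)
  text.sub(/(.*?)(,.*)/, '\1 <span class="smaller_font">\2</span>')
end

plain  = 'Main Idea, key term, key term'
# Hypothetical link whose title attribute happens to contain a comma.
markup = '<a href="stupidreqexquestion" title="Hello, world">Main Idea, key term</a>'

puts wrap_after_first_comma(plain)   # fine on plain text
puts wrap_after_first_comma(markup)  # span opens inside the attribute
```

On the second string the first comma is the one inside the attribute, so the span opens in the middle of `title="..."` and closes after `</a>`, which leaves the markup broken.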

    require 'nokogiri'
    
    # build a sample document
    html = '<a href="stupidreqexquestion">Main Idea, key term, key term, key term</a>'
    doc = Nokogiri::HTML(html) 
    
    puts doc.to_s, ''
    
    # find the link
    a_tag = doc.at_css('a[href=stupidreqexquestion]')
    
    # break down the tag content
    a_text = a_tag.content
    main_idea, key_terms = a_text.split(/,\s+/, 2) # => ["Main Idea", "key term, key term, key term"]
    a_tag.content = main_idea
    
    # create a new node
    span = Nokogiri::XML::Node.new('span', doc)
    span['class'] = 'smaller_font'
    span.content = key_terms
    
    puts span.to_s, ''
    
    # add it to the old node
    a_tag.add_child(span)
    
    puts doc.to_s
    # >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    # >> <html><body><a href="stupidreqexquestion">Main Idea, key term, key term, key term</a></body></html>
    # >> 
    # >> <span class="smaller_font">key term, key term, key term</span>
    # >> 
    # >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    # >> <html><body><a href="stupidreqexquestion">Main Idea<span class="smaller_font">key term, key term, key term</span></a></body></html>
    

    In the output above you can see the sample document as Nokogiri built it, the span that was added, and the resulting document.

    It can be simplified to:

    require 'nokogiri'
    
    doc = Nokogiri::HTML('<a href="stupidreqexquestion">Main Idea, key term, key term, key term</a>')
    
    a_tag = doc.at_css('a[href=stupidreqexquestion]')
    main_idea, key_terms = a_tag.content.split(/,\s+/, 2)
    a_tag.content = main_idea
    
    a_tag.add_child("<span class='smaller_font'>#{ key_terms }</span>")
    
    puts doc.to_s
    # >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    # >> <html><body><a href="stupidreqexquestion">Main Idea<span class="smaller_font">key term, key term, key term</span></a></body></html>