
Scraping the sector from Morningstar

  •  -1
  • junkone  · Tech Community  · 6 years ago

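    I want to scrape the sector from Morningstar's web page. I can see the data in the browser, and Watir sees it too. But when I try to get the div that holds it, nothing is returned.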

    irb(main):001:0> require 'watir'
    => true
    
    irb(main):008:0> browser= Watir::Browser.new
    
    DevTools listening on ws://127.0.0.1:49780/devtools/browser/4e473d9e-4818-45ad-8238-587bc931099a
    => #<Watir::Browser:0x..f0e9773de url="data:," title="">
    irb(main):006:0> path="http://quote.morningstar.ca/Quicktakes/stock/stock_beta.aspx?t=GOOG&region=USA&culture=en-CA"
    => "http://quote.morningstar.ca/Quicktakes/stock/stock_beta.aspx?t=GOOG&region=USA&culture=en-CA"
    irb(main):007:0> goto(path)
    irb(main):009:0> browser.goto(path)
    [41088:42292:1007/225520.743:ERROR:platform_sensor_reader_win.cc(242)] NOT IMPLEMENTED
    => "http://quote.morningstar.ca/Quicktakes/stock/stock_beta.aspx?t=GOOG&region=USA&culture=en-CA"
    irb(main):010:0> browser.text.include?"Sector"  #### CAN FIND THE word sector.
    => true
    irb(main):011:0> browser.div(:class=>"sal-dp-panel")  ##### it cannot find the class at all.
    => #<Watir::Div: located: false; {:class=>"sal-dp-panel", :tag_name=>"div"}>
    irb(main):015:0> divs=browser.divs(:class=>"sal-dp-panel")
    => #<Watir::DivCollection:0x000000079722d0 @query_scope=#<Watir::Browser:0xdbd2266a url="http://quote.morningstar.ca/Quicktakes/stock/stock_beta.aspx?t=GOOG&region=USA&culture=en-CA" title="GOOG 1157.35 -0.93 (Alphabet Inc Class C)">, @selector={:class=>"sal-dp-panel", :tag_name=>"div"}>
    irb(main):018:0> divs.count
    => 0
    irb(main):019:0> divs.each{|div| puts div.text}
    => []
    irb(main):020:0> divs.each{|div| puts "got one"}
    => []
    


    2 Answers  |  as of 6 years ago
        1
  •  0
  •   Justin Ko    6 years ago

    The problem is that the page has no element with the class "sal-dp-panel". Perhaps you wanted "sal-dp-pair", the div that contains the name/value pair?

    <div class="sal-dp-pair">
      <div class="sal-dp-name ng-binding">Sector</div>
      <div class="sal-dp-value ng-binding">Technology</div>
    </div>
    

    browser.div(class: 'sal-dp-name', text: 'Sector').following_sibling.text
    #=> "Technology"
    
    browser.div(class: 'sal-dp-name', text: 'Industry').following_sibling.text
    #=> "Internet Content & Information"
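
    One way to check which "sal-dp" classes the rendered page actually contains is to scan its HTML for class attributes. The sketch below is only an illustration: `classes_with_prefix` is a hypothetical helper (not part of Watir), it handles double-quoted class attributes only, and it runs here against the snippet quoted above rather than the live page.

```ruby
# Hypothetical helper: list the distinct CSS class names in an HTML string
# that start with the given prefix. Only double-quoted attributes are matched.
def classes_with_prefix(html, prefix)
  html.scan(/class="([^"]*)"/)   # one capture per class attribute
      .flatten
      .flat_map(&:split)         # an attribute can hold several class names
      .select { |name| name.start_with?(prefix) }
      .uniq
      .sort
end

# The markup quoted in this answer, used as sample input.
sample = <<~HTML
  <div class="sal-dp-pair">
    <div class="sal-dp-name ng-binding">Sector</div>
    <div class="sal-dp-value ng-binding">Technology</div>
  </div>
HTML

found = classes_with_prefix(sample, 'sal-dp')
puts found.inspect
puts found.include?('sal-dp-panel')
```

    With a live session, passing `browser.html` instead of `sample` should list the classes present after the page has rendered.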
    
        2
  •  1
  •   Rajagopalan    6 years ago

    I think you are using the wrong locator.

    Try the following:

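    require 'watir'

    b = Watir::Browser.new
    b.goto 'http://quote.morningstar.ca/Quicktakes/stock/stock_beta.aspx?t=GOOG&region=USA&culture=en-CA'

    # Locate the label by position (depends on the page layout staying the same)
    p b.divs(class: 'sal-dp-name')[7].text

    # Locate the label through the div that holds its value
    p b.div(text: 'Technology').preceding_sibling.text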
    

    Output

    "Sector"
    "Sector"
    

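    This prints Sector in two different ways. The second is more reliable than the first, because it uses the text Technology to find Sector instead of relying on a fixed index.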
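
    Locating by text rather than by index can be wrapped in a small helper that goes from label to value. This is a minimal sketch, not part of Watir: `sal_dp_values` is a hypothetical name, and `browser` is assumed to be a `Watir::Browser` that has already loaded the quote page.

```ruby
# Hypothetical helper: map each label (e.g. "Sector") to the text of the
# value div that follows it inside a "sal-dp-pair" block.
def sal_dp_values(browser, labels)
  labels.each_with_object({}) do |label, values|
    name_div = browser.div(class: 'sal-dp-name', text: label)
    # A label that is not found maps to nil instead of raising.
    values[label] = name_div.exists? ? name_div.following_sibling.text.strip : nil
  end
end

# Usage (needs the watir gem and a browser driver, so it is left commented out):
#   require 'watir'
#   browser = Watir::Browser.new
#   browser.goto 'http://quote.morningstar.ca/Quicktakes/stock/stock_beta.aspx?t=GOOG&region=USA&culture=en-CA'
#   p sal_dp_values(browser, %w[Sector Industry])
```

    Because the lookup is keyed on the label text, it does not depend on how many other "sal-dp-name" divs appear before it on the page.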