
VBA .getElementsByTagName() does not return elements

  •  0
  • Steve  · Tech Community  · 5 years ago

    I am trying to read betting data for the EPL from betfair.

    Sub PullBetfair()
    
        ' SOCCER
        Const soccerEPL  As String = "https://www.betfair.com.au/exchange/plus/football/competition/10932509"   ' EPL
    
        ' DECLARE INTERNET EXPLORER
        Dim ie As New InternetExplorer
        ie.Visible = False
    
        ' NAVIGATE TO URL
        ie.navigate soccerEPL
    
        ' LOOP UNTIL NAVIGATION COMPLETE
        Do
            DoEvents
        Loop Until ie.readyState = READYSTATE_COMPLETE
    
        ' COLLECT HTML DOCUMENT
        Dim html As HTMLDocument
        Set html = ie.document
    
        ' CREATE COLLECTION OF ELEMENTS
        Dim elements As IHTMLElementCollection
    
        Set elements = html.getElementsByTagName("section")
        Debug.Print elements.Length
    
        ie.Quit
        Set ie = Nothing
    End Sub
    

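    I have successfully collected data from other sites, such as ladbrokes, using this method, but it does not work for this site.

    I have also tried collecting the elements with .getElementsByClassName, without success.

    An ideal answer would explain the hierarchy so that I can understand how to drill down to the table rows I am trying to read.

    Many thanks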

    3 answers  |  as of 5 years ago
        1
  •  2
  •   QHarr    5 years ago

    The following uses a proper wait for the page load, followed by a timed loop that tests the length of the result.

    Option Explicit

    Public Sub TestForTags()
        ' Timer returns a Single (seconds since midnight), so t is declared to match
        Dim ie As New InternetExplorer, sections As Object, t As Single
        Const MAX_WAIT_SEC As Long = 10

        With ie
            .Visible = True
            .Navigate2 "https://www.betfair.com.au/exchange/plus/football/competition/10932509"
            ' Wait for the initial page load to finish
            While .Busy Or .readyState < 4: DoEvents: Wend

            ' Keep polling until section elements appear or the timeout expires
            t = Timer
            Do
                Set sections = .document.querySelectorAll("section")
                If Timer - t > MAX_WAIT_SEC Then Exit Do
            Loop While sections.Length = 0

            Debug.Print sections.Length
            Stop '<== Delete me later
            '.Quit
        End With
    End Sub
    

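    sections is a nodeList, so loop over it with For i = 0 To sections.Length - 1 and read each node with .item(i).innerText. Alternatively, you can swap in Set sections = .document.getElementsByTagName("section") and then iterate with For Each.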
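
    The two ways of walking the results mentioned above can be sketched as follows. This is only an illustration: the procedure names PrintNodeListText and PrintCollectionText are hypothetical, and both take late-bound Object arguments so that the sketch does not depend on any particular library reference.

```vba
Option Explicit

' Hypothetical helper: index-based loop over the nodeList
' returned by querySelectorAll.
Public Sub PrintNodeListText(ByVal nodes As Object)
    Dim i As Long
    For i = 0 To nodes.Length - 1
        Debug.Print nodes.item(i).innerText
    Next i
End Sub

' Hypothetical helper: For Each loop over the collection
' returned by getElementsByTagName.
Public Sub PrintCollectionText(ByVal doc As Object, ByVal tagName As String)
    Dim el As Object
    For Each el In doc.getElementsByTagName(tagName)
        Debug.Print el.innerText
    Next el
End Sub
```

    In the answer's code, PrintNodeListText sections could be called just before the Stop line, or PrintCollectionText .document, "section" inside the With block.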

        2
  •  1
  •   Solus161    5 years ago

    While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
        mHour = Hour(Now())
        mMinute = Minute(Now())
        mSec = Second(Now()) + 1 'Wait one more second
        waitTime = TimeSerial(mHour, mMinute, mSec)
        Application.Wait waitTime
    Wend    
    
    ...
    
    Set elements = html.getElementsByTagName("tr")
    
    For i = 1 To elements.Length - 1 ' starts at 1, so the element at index 0 is skipped
        Debug.Print elements(i).textContent
    Next i
    
        3
  •  0
  •   Lewis Morris    5 years ago

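    I wrote this function for loading pages. I sometimes found that a page would not refresh properly, got stuck, and never actually finished loading.

    It sleeps for 100 ms and then checks whether the page has loaded. If loading has not finished after 3 seconds, it refreshes the page and tries again.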

    ie.navigate "google.com"
    waitforietoload ie 
    

    This needs to go at the top of the module

    Option Compare Text

    #If VBA7 Then
        ' VBA7 (Office 2010 and later, 32- or 64-bit); the Sleep argument is a 32-bit DWORD, so Long
        Public Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
    #Else
        ' Earlier versions of VBA
        Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
    #End If
    

    Then this can go anywhere in the module

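    Function waitForIEToLoad(ie As InternetExplorer)

        ' Declare both counters explicitly; "Dim times, times2 As Integer" leaves times as a Variant
        Dim times As Integer, times2 As Integer

        Do While ie.readyState <> READYSTATE_COMPLETE Or ie.Busy
            DoEvents
            Sleep 100
            times = times + 1
            If times = 30 Then          ' 30 x 100 ms = about 3 seconds
                ie.Refresh
                times = 0               ' restart the 3-second count after each refresh
                times2 = times2 + 1
                If times2 = 3 Then
                    Exit Do             ' give up after the third refresh
                End If
            End If
        Loop

    End Function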
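
    The function above gives up silently, so the caller cannot tell whether the page actually loaded. A minimal sketch of a variant that reports the outcome is shown below. The name TryWaitForLoad and the maxWaitSec parameter are hypothetical, it does not refresh the page, and it uses Timer rather than Sleep so that it needs no API declaration.

```vba
Option Explicit

' Hypothetical helper: waits until the browser reports the page as loaded
' or maxWaitSec seconds have passed. Returns True if loaded, False on timeout.
Public Function TryWaitForLoad(ByVal ie As Object, ByVal maxWaitSec As Double) As Boolean
    Dim startTime As Double
    Dim elapsed As Double

    startTime = Timer
    Do While ie.Busy Or ie.readyState <> 4      ' 4 = READYSTATE_COMPLETE
        DoEvents
        elapsed = Timer - startTime
        If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer resets to 0 at midnight
        If elapsed > maxWaitSec Then Exit Function      ' result stays False
    Loop

    TryWaitForLoad = True
End Function
```

    A caller could then write If Not TryWaitForLoad(ie, 10) Then Debug.Print "Page did not finish loading" and decide for itself whether to refresh or stop.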