
Selenium: collecting information from each item in a list

  • Ashley O  ·  7 years ago

    from selenium import webdriver
    import time
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.keys import Keys
    
    driver = webdriver.Chrome()
    
    #go to eat24, type in zip code 10007, choose pickup and click search
    
    driver.get("https://new-york.eat24hours.com/restaurants/index.php")
    search_area = driver.find_element_by_name("address_auto_complete")
    search_area.send_keys("10007")
    pickup_element = driver.find_element_by_xpath("//*[@id='search_form']/div/table/tbody/tr/td[2]")
    pickup_element.click()
    search_button = driver.find_element_by_xpath("//*[@id='search_form']/div/table/tbody/tr/td[3]/button")
    search_button.click()
    
    
    #scroll up and down on page to load more of 'infinity' list
    
    for i in range(0,3):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        driver.execute_script("window.scrollTo(0,0);")
        time.sleep(1)
    
    #find menu buttons
    
    menus_elements = driver.find_elements_by_xpath('//*[@title="View Menu"]')
    #menus_element = driver.find_element_by_xpath('//*[@title="View Menu"]')
    #menus_element.click()
    
    #Problem area: Trying to iterate over menu buttons and collect menu items + prices
    #from each. It goes to the first menu and pulls the prices/menu items, but then
    #when it goes back to first page it says 'stale element reference' and doesn't
    #click the next menu item
    
    
    for i in range(0, len(menus_elements)):
        if menus_elements[i].is_displayed():
            menus_elements[i].click()
        #find menu items
        menu_items = driver.find_elements_by_class_name("cpa")
        menus = [x.text for x in menu_items]
        #find menu prices
        menu_prices = driver.find_elements_by_class_name('item_price')
        menu_prices = [x.text for x in menu_prices]
        #pair menu items and prices
        for menu, menu_price in zip(menus, menu_prices):
            print(menu + ': ' + menu_price)
        driver.execute_script("window.history.go(-1)")
        driver.implicitly_wait(20)
    

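    The problem is at the end: it opens the first menu and gets the items/prices, but when it returns to the results page it does not select the second menu and do the same. Why? Thanks for any suggestions!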

    1 answer  |  latest 7 years ago
  •   Andersson    7 years ago

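    Instead of clicking each "View Menu" button, scraping the menu page, and going back to the results page, you can collect the list of links first and then scrape each menu page one by one: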

    menu_urls = [page.get_attribute('href') for page in driver.find_elements_by_xpath('//*[@title="View Menu"]')]
    for url in menu_urls:
        driver.get(url)
        menu_items = driver.find_elements_by_class_name("cpa")
        menus = [x.text for x in menu_items]
        ...
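
    The reason this avoids the 'stale element reference' error is that a WebElement is tied to the DOM of the page it was found on. Once the browser navigates away and back, the old references no longer point at live elements, whereas plain URL strings stay valid. A minimal sketch of the whole loop might look like the following. The function names (collect_menu_urls, scrape_menu, scrape_all_menus) are hypothetical, and it assumes each "View Menu" element carries an href and that menu pages still use the "cpa" and "item_price" classes from the question. It uses the generic find_elements(by, value) form because recent Selenium 4 releases no longer ship the find_elements_by_* helpers. With Selenium imported you would normally pass By.XPATH and By.CLASS_NAME instead of the raw strings.

```python
# Sketch: gather every menu URL first, then visit the pages one by one.
# URL strings cannot go stale, unlike WebElement references that are
# held across a page navigation.

# Locator strategy names; these are the string values behind
# Selenium's By.XPATH and By.CLASS_NAME constants.
XPATH = "xpath"
CLASS_NAME = "class name"


def collect_menu_urls(driver):
    """Return the href of every "View Menu" element on the current page."""
    links = driver.find_elements(XPATH, '//*[@title="View Menu"]')
    urls = [link.get_attribute("href") for link in links]
    # Assumption: elements without an href are not real links, so skip them.
    return [url for url in urls if url]


def scrape_menu(driver, url):
    """Open one menu page and return its (item, price) pairs."""
    driver.get(url)
    items = [e.text for e in driver.find_elements(CLASS_NAME, "cpa")]
    prices = [e.text for e in driver.find_elements(CLASS_NAME, "item_price")]
    # zip stops at the shorter list, so unmatched entries are dropped.
    return list(zip(items, prices))


def scrape_all_menus(driver):
    """Map each menu URL to its (item, price) pairs.

    The URL list is built completely before the first driver.get call,
    so no element from the results page is touched after navigation.
    """
    return {url: scrape_menu(driver, url) for url in collect_menu_urls(driver)}
```

    This would be called as scrape_all_menus(driver) once the search results have loaded and the page has been scrolled. Since the browser never goes back to the results page, the window.history.go(-1) step from the question is no longer needed.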