
I'm using Mechanize with Ruby and keep getting this exception:

C:/Ruby200/lib/ruby/2.0.0/net/protocol.rb:158:in `rescue in rbuf_fill': too many connection resets (due to Net::ReadTimeout - Net::ReadTimeout) after 0 requests on 37920120, last used 1457465950.371121 seconds ago (Net::HTTP::Persistent::Error)
    from C:/Ruby200/lib/ruby/2.0.0/net/protocol.rb:152:in `rbuf_fill'
    from C:/Ruby200/lib/ruby/2.0.0/net/protocol.rb:134:in `readuntil'
    from C:/Ruby200/lib/ruby/2.0.0/net/protocol.rb:144:in `readline'
    from C:/Ruby200/lib/ruby/2.0.0/net/http/response.rb:39:in `read_status_line'
    from C:/Ruby200/lib/ruby/2.0.0/net/http/response.rb:28:in `read_new'
    from C:/Ruby200/lib/ruby/2.0.0/net/http.rb:1406:in `block in transport_request'
    from C:/Ruby200/lib/ruby/2.0.0/net/http.rb:1403:in `catch'
    from C:/Ruby200/lib/ruby/2.0.0/net/http.rb:1403:in `transport_request'
    from C:/Ruby200/lib/ruby/2.0.0/net/http.rb:1376:in `request'
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/rest-client-1.6.7/lib/restclient/net_http_ext.rb:51:in `request'
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/net-http-persistent-2.9/lib/net/http/persistent.rb:986:in `request'
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:259:in `fetch'
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/mechanize-2.7.3/lib/mechanize.rb:1281:in `post_form'
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/mechanize-2.7.3/lib/mechanize.rb:548:in `submit'
    from C:/Users/Feshaq/workspace/ERISScrap/eca_sample/eca_on_scraper.rb:152:in `<main>'

Here is line 152:

#Click the form button
agent.page.forms[0].click_button

Alternatively, I tried the following snippet and still get the same exception:

# get the form
form = agent.page.form_with(:name => "AdvancedSearchForm")
# get the button you want from the form
button = form.button_with(:value => "Search")
# submit the form using that button
agent.submit(form, button)

Any help is appreciated.


I have run into this issue many times. I handle it by wrapping the scraping code in a begin/rescue block: on error, I shut down the connection, create a fresh agent, wait a few seconds, and retry from where I left off. This has worked 100% of the time for me. The example below is a scraper that iterates over a list of buildings and looks up pages for each one:

def begin_scraping_list
  Building.all.each do |building_info|
    begin
      next if convert_boroughs_for_form(building_info) == :no_good
      fill_in_first_page_address_form_and_submit(building_info)
      get_to_proper_second_page
      go_to_page_we_want_for_scraping
      scrape_the_table(building_info)
    rescue
      puts "error happened"
      # drop the stale connections and start over with a fresh agent
      # (a new agent already starts with default request headers)
      @agent.shutdown
      @agent = Mechanize.new { |a| a.user_agent_alias = 'Windows Chrome' }
      sleep(5)
      # run the current building again
      redo
    end
  end
end

So in your case, you would wrap the problem area you posted in a begin/rescue block:

  # remember the current page so it can be reloaded after a reset
  page_uri = agent.page.uri
  begin
    # get the form
    form = agent.page.form_with(:name => "AdvancedSearchForm")
    # get the button you want from the form
    button = form.button_with(:value => "Search")
    # submit the form using that button
    agent.submit(form, button)
  rescue
    # drop the stale connection and start over with a fresh agent
    agent.shutdown
    agent = Mechanize.new { |a| a.user_agent_alias = 'Windows Chrome' }
    sleep(2)
    # a new agent has no current page, so load it again before retrying
    agent.get(page_uri)
    # get the form
    form = agent.page.form_with(:name => "AdvancedSearchForm")
    # get the button you want from the form
    button = form.button_with(:value => "Search")
    # submit the form using that button
    agent.submit(form, button)
  end
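
One caveat with the approach above: `redo` (or a single retry inside `rescue`) either loops forever or gives up after one extra attempt if the site stays unreachable. A bounded retry is one way around that. The helper below is only a sketch in plain Ruby; `with_retries`, `max_attempts`, `delay`, and `on_failure` are names I made up for illustration, not part of Mechanize, and the failing block is simulated so the snippet runs on its own:

```ruby
# Hypothetical helper (not part of Mechanize): run the block, and on a
# StandardError call on_failure, wait, and try again, up to max_attempts.
def with_retries(max_attempts: 3, delay: 2, on_failure: nil)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError => e
    # out of attempts: re-raise the last error to the caller
    raise if attempts >= max_attempts
    # give the caller a chance to reset its state (e.g. rebuild the agent)
    on_failure.call(e) if on_failure
    sleep(delay)
    retry
  end
end

# Simulated flaky operation: fails twice, then succeeds.
calls = 0
result = with_retries(max_attempts: 3, delay: 0) do
  calls += 1
  raise IOError, "simulated connection reset" if calls < 3
  "ok after #{calls} attempts"
end
puts result
```

In a real scraper, the block would hold the form lookup and `agent.submit(form, button)`, and `on_failure` would be the place to call `agent.shutdown` and assign a new `Mechanize.new` agent, the same reset shown above.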
