curl does not return the same source as the browser's View Page Source

Question

I am trying to learn web scraping and chose https://www.betfair.com as an example. I have successfully fetched data from many pages, but when I request https://www.betfair.com/sport/horse-racing I do not get the full source. If I view the page source in the browser, the data is there, so it should be out of the question that the content is generated by JavaScript or similar. Here is my code:

$url ='https://www.betfair.com/sport/horse-racing';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$page = curl_exec($ch);
curl_close($ch);
echo $page;

When viewing the source in the browser, you can find this:

<a href="/sport/horse-racing?action=loadRacingSpecials&tab=SPECIALS&modules=multipick-horse-racing" class="ui-nav link ui-clickselect ui-ga-click" data-dimension3="sports-header" data-dimension4="Specials" data-dimension5="Horse Racing" data-gacategory="Interface" data-gaaction="Clicked Horse Racing Header" data-galabel="Specials"
data-loader=".multipick-content-container > div, .antepost-content-container > div, .future-racing-content-container > div, .bet-finder-content-container > div, .racing-specials-content-container > div, .future-racing-market-content-container > div"
>
Specials</a>

But curl is not getting these elements.

It is in the $page result. Save that to a file and you will see it: prntscr.com/edcdny — Faxsy, Feb 25, 2017 at 20:53
@Faxsy When I echo this on my local page and view the source, it is not there. Can you please tell me how it shows up for you? — Iftikhar1487, Feb 26, 2017 at 7:05

Sabuj Hassan · Accepted Answer · 2017-02-25 21:31:12Z

First of all, Betfair does not welcome spidering of its site (although people do it on a regular basis).

I am not an expert in JavaScript or HTML, but the content may have been generated by an ajax call. If you use the Firebug tool for Mozilla Firefox, you can see which requests the page makes to get the data.
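If the network panel does show a separate request, one way to test the idea is to replay that request from PHP. The following is only a sketch: `ajaxHeaders` and `fetchAsAjax` are hypothetical helper names, the `X-Requested-With` header is a common convention rather than something confirmed for Betfair, and the URL in the usage comment is assumed from the link's `href` in the question.

```php
<?php
// Hypothetical helper: headers that many sites use to recognise an ajax request.
// Whether Betfair checks them is an assumption, not something confirmed here.
function ajaxHeaders(): array
{
    return [
        'X-Requested-With: XMLHttpRequest',
        'Accept: text/html, */*; q=0.01',
    ];
}

// Hypothetical helper: fetch $url the way the page's own script might.
// Returns the response body as a string, or false on failure.
function fetchAsAjax(string $url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow redirects, as in the question's code
        CURLOPT_HTTPHEADER     => ajaxHeaders(),
    ]);
    return curl_exec($ch);
}

// Usage (not run here; replace the URL with whatever the network panel shows):
// $fragment = fetchAsAjax('https://www.betfair.com/sport/horse-racing?action=loadRacingSpecials&tab=SPECIALS&modules=multipick-horse-racing');
```

If the fragment comes back with the missing markup, the element was loaded by a second request and not delivered in the first response.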

Above all, my suggestion is to use the API they provide. That is legal, and there is a free version with some limitations. API link: https://developer.betfair.com/

Actually, when I view the page source on the website, the markup is written there, so it is not generated by an ajax call. — Iftikhar1487, Feb 26, 2017 at 7:07

Faxsy · Accepted Answer · 2017-02-26 17:01:09Z

Try saving the result to a file. You will notice that the code you are looking for is in there.

    $url ='https://www.betfair.com/sport/horse-racing';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3");
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    $page = curl_exec($ch);
    curl_close($ch);

    $file = fopen("1.txt","a");
    fwrite($file,$page);
    fclose($file);
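Searching the fetched string directly may be more reliable than opening the saved file and looking by eye. A minimal sketch, assuming a hypothetical helper `pageContains` and a stand-in `$page` value in place of the real `curl_exec()` result from the code above:

```php
<?php
// Hypothetical helper: report whether a fetched page contains a marker string.
function pageContains(string $html, string $marker): bool
{
    // stripos is case-insensitive and returns false when the marker is absent,
    // so the comparison must be strict (a match at offset 0 is still a match).
    return stripos($html, $marker) !== false;
}

// Stand-in for the $page returned by curl_exec(); not real Betfair output.
$page = '<a href="/sport/horse-racing?action=loadRacingSpecials&tab=SPECIALS">Specials</a>';

var_dump(pageContains($page, 'loadRacingSpecials')); // bool(true)
var_dump(pageContains($page, 'no-such-marker'));     // bool(false)
```

Running the same check against the real `$page` gives a yes-or-no answer about whether the server sent the element to curl at all.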

Sorry Faxsy, but I did not see the code even after writing to the file. — Iftikhar1487, Mar 2, 2017 at 8:16
