How To Scrape Web Page That Doesn't Show Its Data?
Solution 1:
The website https://charlotte.realforeclose.com loads its data with AJAX, so you need to do some reverse engineering to find out how it works.
Open Chrome, press F12 to open Developer Tools or choose the option from the menu.
Open the Network tab, choose the XHR filter, paste the URL https://charlotte.realforeclose.com/index.cfm?zaction=AUCTION&Zmethod=PREVIEW&AUCTIONDATE=07/16/2019 into the browser address bar and press Enter. Watch the XHRs logged on the Network tab while the page is loading. Inspect the XHRs with the largest response sizes first.
Click on a request in the list to check its details: the URL, headers and parameters of the request, and the response content.
Since the request method is GET, you can just paste the URLs into the address bar and retrieve the content. The URLs for me are:
https://charlotte.realforeclose.com/index.cfm?zaction=AUCTION&Zmethod=UPDATE&FNC=LOAD&AREA=W&PageDir=0&doR=1&tx=1563171184890&bypassPage=1&test=1&_=1563171184890
https://charlotte.realforeclose.com/index.cfm?zaction=AUCTION&Zmethod=UPDATE&FNC=LOAD&AREA=C&PageDir=0&doR=1&tx=1563171185129&bypassPage=0&test=1&_=1563171185129
After playing with them a bit, you can easily find that the parameter AREA=W is for the "Auctions Waiting" section, and AREA=C is for the "Auctions Closed or Canceled" section. It seems the parameters tx, bypassPage, test and _ are not necessary at all.
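The clean-up described above can be sketched in Python with the standard library. This is only an illustration: the helper name strip_optional_params and the set OPTIONAL_PARAMS are made up for this example, and the list of removable parameters is just the observation made above, not something the site documents.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters that appeared to be unnecessary when testing by hand.
OPTIONAL_PARAMS = {"tx", "bypassPage", "test", "_"}


def strip_optional_params(url: str) -> str:
    """Return the URL with the optional query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in OPTIONAL_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# One of the URLs captured on the Network tab.
captured = (
    "https://charlotte.realforeclose.com/index.cfm"
    "?zaction=AUCTION&Zmethod=UPDATE&FNC=LOAD&AREA=W&PageDir=0&doR=1"
    "&tx=1563171184890&bypassPage=1&test=1&_=1563171184890"
)
print(strip_optional_params(captured))
```

The order of the remaining parameters is preserved, so the result can be compared directly with the shortened URLs used by hand.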
Open the first page with PageDir=0 and doR=1, after that navigate to the next page with PageDir=1 and doR=0, and to the previous page with PageDir=-1 and doR=0.
The first page https://charlotte.realforeclose.com/index.cfm?zaction=AUCTION&Zmethod=UPDATE&FNC=LOAD&AREA=W&PageDir=0&doR=1
And the next page https://charlotte.realforeclose.com/index.cfm?zaction=AUCTION&Zmethod=UPDATE&FNC=LOAD&AREA=W&PageDir=1&doR=0
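The paging rule can be written down as a small URL builder. This is a minimal sketch, assuming the behaviour observed above holds for both sections; page_url and BASE_URL are names invented for the example.

```python
from urllib.parse import urlencode

BASE_URL = "https://charlotte.realforeclose.com/index.cfm"


def page_url(area: str, page_dir: int) -> str:
    """Build the XHR URL for one section and one paging direction.

    area: "W" (Auctions Waiting) or "C" (Auctions Closed or Canceled).
    page_dir: 0 = first page, 1 = next page, -1 = previous page.
    """
    if page_dir not in (-1, 0, 1):
        raise ValueError("page_dir must be -1, 0 or 1")
    params = {
        "zaction": "AUCTION",
        "Zmethod": "UPDATE",
        "FNC": "LOAD",
        "AREA": area,
        "PageDir": page_dir,
        # doR=1 only when opening the first page, doR=0 when paging.
        "doR": 1 if page_dir == 0 else 0,
    }
    return BASE_URL + "?" + urlencode(params)


print(page_url("W", 0))  # first page of "Auctions Waiting"
print(page_url("W", 1))  # next page
```

Note that the next and previous pages are relative to whatever page the server last returned for the session, so the requests have to be sent in order.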
Finally, you just need to reproduce those XHRs from your application and parse the responses. Depending on how your HTTP client is implemented, you may also need to add the necessary headers and handle cookies.
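As a rough sketch of that last step, here is one way to send the requests from Python using only the standard library, with cookies kept between calls. Several things here are assumptions rather than facts about the site: the header values, the idea that loading the preview page first sets the needed session cookies, and the helper names make_opener, make_request, fetch and main. Copy the real headers from the request details in Developer Tools if these do not work.

```python
import http.cookiejar
import urllib.request

PREVIEW_URL = (
    "https://charlotte.realforeclose.com/index.cfm"
    "?zaction=AUCTION&Zmethod=PREVIEW&AUCTIONDATE=07/16/2019"
)
FIRST_PAGE_URL = (
    "https://charlotte.realforeclose.com/index.cfm"
    "?zaction=AUCTION&Zmethod=UPDATE&FNC=LOAD&AREA=W&PageDir=0&doR=1"
)

# Assumed headers; replace with the ones shown in Developer Tools if needed.
HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "X-Requested-With": "XMLHttpRequest",
}


def make_opener() -> urllib.request.OpenerDirector:
    """Create an opener that stores cookies and resends them."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def make_request(url: str) -> urllib.request.Request:
    """Build a GET request carrying the headers above."""
    return urllib.request.Request(url, headers=HEADERS, method="GET")


def fetch(opener: urllib.request.OpenerDirector, url: str,
          timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    with opener.open(make_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def main() -> None:
    opener = make_opener()
    fetch(opener, PREVIEW_URL)  # assumed to set the session cookies
    body = fetch(opener, FIRST_PAGE_URL)
    print(body[:500])  # inspect the raw response before writing a parser
```

Calling main() performs both requests and prints the start of the response, which is enough to see what format needs to be parsed.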