GPA scrapes Heritage and eBay, plus other sites like Comic Connect (selectively), and also takes data from dealers who submit it (like Greg Reece, NewForceComics and a few others).
As for where else, you'll have to ask George.
GPA doesn't scrape anything - they have direct access to the eBay data through the eBay API, and receive all sales data directly from the various auction sites (Heritage, ComicConnect, Pedigree, Lone Star, etc.). As you mention, they also receive sales info from a bunch of the large dealers.
The main reason they don't include Comiclink data is that Comiclink wanted to report its sales selectively - as far as I know, that's not the case for ComicConnect.
There's a list of the sites that provide them with data on the GPA homepage:
http://comics.gpanalysis.com/
Sorry about the 'scraping'. I'm not a techie.
I believe they stopped accepting Pedigree data years ago because of Doug's reputation.
I also believe they take or receive selective data from Comic Connect. That's just from my own observation: I've seen some sales appear and others not. It could very well be that the sales I was looking for were never completed, though.
Pedigreecomics is still on the list, and GPA shows the books for sale/auction from the site, so I assume they are still getting data.