The Facebook scraper is written in Python, unlike most scripts published in the Hub. The reason is that Selenium webdrivers are currently considerably more stable when driven from Python.
The script runs a Selenium webdriver (currently Chromium) to access any user-defined public Facebook page. The webdriver loads the mobile version of the page rather than the desktop version, because the way the mobile frontend builds its content dynamically is much easier to scrape. On desktop, all attributes of the HTML elements are generated dynamically by following an access-generated attribute hierarchy. Consequently, the attributes are generated anew on every page access; only the hierarchy is preserved, which makes it practically impossible to rely on class names, ids, etc. On the mobile version, by contrast, certain containers are fixed objects that are then filled dynamically. This makes XPath-based DOM queries much shorter and easier to work out.
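As a rough illustration of the approach described above, the following sketch opens the mobile version of a page in a headless Chrome/Chromium session and collects the post containers with a single XPath query. It is not the actual scraper code: the helper names `to_mobile_url` and `fetch_post_containers`, the rewrite to the `m.facebook.com` host, the headless flag, and the `//article` XPath are all assumptions chosen for illustration, and the real container selector would have to be checked against the live mobile page.

```python
from typing import List
from urllib.parse import urlsplit, urlunsplit


def to_mobile_url(url: str) -> str:
    """Rewrite a desktop Facebook URL to the mobile host (assumed: m.facebook.com)."""
    parts = urlsplit(url)
    if parts.netloc.lower() in ("facebook.com", "www.facebook.com"):
        parts = parts._replace(netloc="m.facebook.com")
    return urlunsplit(parts)


def fetch_post_containers(page_url: str, container_xpath: str = "//article") -> List[str]:
    """Return the outer HTML of every element matching container_xpath.

    The default XPath is a hypothetical placeholder, not the selector
    used by the real scraper.
    """
    # Selenium is imported lazily so the URL helper works without it installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # assumption: no visible browser window needed
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(to_mobile_url(page_url))
        containers = driver.find_elements(By.XPATH, container_xpath)
        return [c.get_attribute("outerHTML") for c in containers]
    finally:
        # Always close the browser, even if the page fails to load.
        driver.quit()


# Example call (requires Selenium and a local Chrome/Chromium install):
# html_blocks = fetch_post_containers("https://www.facebook.com/<page-name>")
```

Because the query targets a fixed container instead of generated class names or ids, it should keep working across page loads as long as the mobile layout itself does not change.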
At the current stage, we extract the following information from the public Facebook pages of all the big political parties:

- Date of post (relative to current year)
- Plain text
- Number of:
  + comments
  + shares
  + likes
- The main link that is shared in a post
- Inner HTML of post
- Outer HTML
- Page URL
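One way to picture the extracted fields is as a single record per post. The sketch below is only an assumption about how such a record could be represented and exported; the class name `PostRecord`, the field names and types, the helper `records_to_csv`, and the choice of CSV as output format are hypothetical and do not describe the actual data format of the scraper.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class PostRecord:
    """One scraped post; field names are illustrative, not the real schema."""
    date: str                 # date of post, as resolved relative to the current year
    text: str                 # plain text of the post
    comments: int             # number of comments
    shares: int               # number of shares
    likes: int                # number of likes
    main_link: Optional[str]  # main link shared in the post, if any
    inner_html: str           # inner HTML of the post
    outer_html: str           # outer HTML of the post
    page_url: str             # URL of the scraped page


def records_to_csv(records: List[PostRecord]) -> str:
    """Serialize records to CSV text with one column per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(PostRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()
```

Keeping the inner and outer HTML next to the parsed values makes it possible to re-extract fields later without scraping the page again.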
We will soon apply the same scraper to the public Facebook pages of members of the Swiss parliament.
For more information, please contact us directly (see Access page).