Email: support@grablab.io
Skype: lorien.name
Github: lorien

Web scraping projects I've done

I have scraped WHOIS data for 8 million domains

July 19, 2016  —  8M pages scraped


I have scraped traffic data for 8 million domains from similarweb.com

July 19, 2016  —  8.3M pages scraped


Similarweb top 8 million websites

July 11, 2016  —  30K pages scraped


Collected information about 490 million VK.com users

Feb. 22, 2016  —  490M pages scraped


Scraping images of hotels and rooms from booking.com

Nov. 8, 2015  —  30M pages scraped


Scraping 4 million domains from similarsitesearch.com

Oct. 31, 2015  —  5M pages scraped


Scraping products from g2crowd.com

Oct. 27, 2015  —  14K pages scraped


Scraping 40k domains from similarweb.com

Oct. 26, 2015  —  40K pages scraped


Scraping IDs of all vk.com users

Oct. 25, 2015  —  3.3M pages scraped


Scraping 4 million queries from a number of search engines

Oct. 7, 2015  —  30M pages scraped


Scraping PDF documents from apuntes.rincondelvago.com

Oct. 3, 2015  —  90K pages scraped


Scraping PDF user manuals from samsung.com

Oct. 3, 2015  —  3M pages scraped


Development of a 24/7 website crawler (15M+ pages every day)

Oct. 1, 2015  —  100M pages scraped


Scraping hotels from booking.com

Sept. 12, 2015  —  1M pages scraped


Scraping 4 catalogs of Russian construction companies

Sept. 11, 2015  —  140K pages scraped


Scraping manualguru.com, about 600K PDF documents

Sept. 5, 2015  —  1.8M pages scraped


Scraping 10 catalogs of Russian construction companies

Sept. 3, 2015  —  10K pages scraped


Scraping tripadvisor.com, about 1M hotels

Aug. 20, 2015  —  1.1M pages scraped


Web scraping of tidyforms.com, 8 thousand PDF documents

July 20, 2015  —  8K pages scraped


Scraping 2gis.ru, a database of two million companies

June 4, 2015  —  2.1M pages scraped


Scraping 60 thousand articles from a website

May 31, 2015  —  60K pages scraped


Scraping the list of IDs of all vk.com users

May 25, 2015  —  3M pages scraped


Scraping 40k domains from similarweb.com

May 21, 2015  —  40K pages scraped


Scraping all companies from crunchbase.com

May 19, 2015  —  364K pages scraped


Research project: search for Elasticsearch servers with publicly available data

May 16, 2015  —  600K pages scraped


Scraping all companies from linkedin.com

May 16, 2015  —  7.3M pages scraped


Scraping 4 million queries from yahoo.com

May 12, 2015  —  4M pages scraped


Scraping 4 million queries from the rambler.ru search engine

May 10, 2015  —  4M pages scraped


Scraping 4 million queries from the baidu.com search engine

May 8, 2015  —  4M pages scraped


Search for people via vk.com API

May 1, 2015  —  100K pages scraped


Aggregator of MOOC courses

April 18, 2015  —  50K pages scraped


Scraping feed URLs from all news sources from alltop.com

Feb. 19, 2015  —  30K pages scraped