Data Extraction and Processing Analyst



IDC CEMA - Tracker Webscraping and Data Harvesting Team in Ostrava is GROWING!


As a member of the team that develops web scraper, bot, crawler, and machine language solutions, you will be a key part of our ongoing market research initiatives. You will use ParseHub, Mozenda, and other tools on a daily basis.


Your daily job will be setting up solutions for extracting data from major web sites and preparing the data for use by the market research analysts.


Your solutions will gather web pricing and product spec information for hundreds of technology products from thousands of web sites around the world. You will also clean the data and classify it into the IDC taxonomy for analyst review and analysis.

Responsibilities:


  • Figure out optimal solutions for various data sources
  • Keep learning the latest web scraping and crawling technology
  • Work with market analyst teams to understand their needs and the markets they analyze
  • Create and maintain thousands of agents, including scheduling, aggregating data sets, cleaning text and HTML data, and integrating output into other databases
  • Introduce as much automation as possible (data cleaning, use of the standard IDC taxonomy) to save analysts' time
  • Integrate with the IDC research application and cooperate with development teams

Requirements:


  • Good communication skills
  • Analytical mind
  • Happy to learn and explore new technologies and tools
  • Passion for the web and for data analysis and processing
  • Ability to communicate reasonably well in English
  • Relevant B.S. or Master's degree is a plus

Nice-to-have skills:

    • Experience with web scraping and/or web crawling tools
    • Programming in any language
    • SQL
    • Working with large data sets
    • Excel



IDC offers interesting work with a young international team, the opportunity to participate in a developing and innovative field, professional growth, and a competitive remuneration package.

