Web Data Extraction Specialist (i.e. Data Scrape Specialist)

 

Summary

Competitive Analytics is searching for talented data scientists who are passionate about data acquisition and have strong skills in and knowledge of web scraping, web services, file transfers, and everything data. This new role is a dedicated data scientist who will assist with designing and developing the tools, processes, and infrastructure to extract large volumes of structured and unstructured data from a variety of sources, focusing primarily on web data extraction, also known as data scraping. At Competitive Analytics we believe data scraping is both an art and a science, requiring innovative methodologies, deft programming skills, scientific ingenuity, and keen mathematical expertise. Our goal of providing clients with value-added data scraping also requires what we call creative analytics. Creative analytics requires the analyst to have the innate expertise and intuitive skill needed to transform raw data, decipher complex relationships, develop innovative algorithms, and design meaningful visualizations so that decision makers can make faster and better decisions that drive and sustain competitive advantage.

Required Submittals for Consideration

  1. Cover letter

  2. Resume

  3. Complete our quick 5-minute interview questionnaire

  4. Complete our Data Extraction Test (see bottom of page)

Primary Responsibilities

Gather and process both structured and unstructured data from external (scraping, APIs) and internal sources and prepare it for analyses

Design and develop a variety of tools and infrastructure to automate the extraction of publicly available and private information (writing web scrapers, calling third party APIs, creating SQL queries, etc.)

Create tools and processes to download data, parse it for relevant content, and store it in existing data management systems

Design and develop scalable, efficient, and robust internal data management systems

Gather and process raw data at scale (including writing scripts, web scraping, calling APIs, writing SQL queries, writing applications, etc.)

Process unstructured data into a form suitable for analysis, utilizing custom applications and modern ETL/ELT

Support business decisions with ad hoc analysis as needed

Work closely with economists, data scientists, and machine learning experts to support both client-facing and internal projects

Set up Linux server

Set up proxies

Set up data transfer to the CA server

Recruit offshore scraper talent

Manage offshore scraper talent

Education and Experience

B.S. degree in computer science, statistics, economics, or another quantitative field

Skills and Knowledge

Experience with SQL development, creating and administering databases, integrating multiple data sources, and performing ETL processes in tools such as MySQL, PostgreSQL, or Oracle

Knowledge of data mining, machine learning, natural language processing, or information retrieval

Understanding of distributed computing principles

Big Data experience with Hadoop (Hive/Pig/Impala/Spark) or Greenplum (PostgreSQL/MADlib) is not required but is a plus

Database and data warehousing experience, both in RDBMS and NoSQL environments

Knowledge of MS SQL Server, PostgreSQL, Redshift, or Couchbase is not required but is a plus

Experience with Alteryx and/or Tableau is not required but is a plus

Familiarity and/or Expertise in the Following is Preferred

Delphi

HTML

PHP

SQL

SQLite Programming

Crawler

Spyder

Perl

MySQL Administration

C#

C++

XML

AWS

Unix

Python

Java

Ajax

jQuery

R

SPSS

SAS

Alteryx

Tableau

Compensation

Competitive Analytics offers highly competitive compensation (full time, part time, or contract) based on experience, talent, skill, expertise, knowledge, proven capabilities, and potential capabilities.

Internship Opportunities

If you do not meet the experience and/or expertise requirements for this position yet are still highly motivated and passionate about it, Competitive Analytics offers internship opportunities on a case-by-case basis. If interested, please inquire about our paid and unpaid internship programs.

About Competitive Analytics

From Fortune 100 companies to SMBs spanning myriad industries, Competitive Analytics helps the world's most successful companies analyze their data and the competitive forces affecting them, with a customized business intelligence approach that addresses the challenges unique to each organization. Since our founding in January 2000, Competitive Analytics has literally delivered "competitive analytics". Competitive Analytics is an innovative, high-tech, and dynamic working environment where analysts work on a variety of advanced analytics, challenging client projects, and innovative business intelligence initiatives across all industries and spanning the entire business intelligence process. For more information, please visit www.competitiveanalytics.com.

Data Extraction Test

The following three exercises were developed to gauge your level of web scraping expertise. Please download the PDF that outlines the three web scraping exercises. Thank you in advance for your time; we look forward to hearing from you and reviewing your exercises.