• Full Time
  • Remote

Web Scraping Specialist

$75k – $125k • 0.06% – 3.0%

Wynd Network is an early-stage startup that enables access to public web data through decentralized data scraping tools, starting with our first product, Grass.

Grass is a network-sharing application that allows users to sell their unused internet bandwidth.

Your day-to-day:

  • Write, test, and refine code that extracts data from various online sources, ensuring reliability and efficiency.
  • Perform data retrieval tasks, handling complexities such as pagination and dynamic content loaded with AJAX.
  • Clean and format extracted data, ensuring it meets quality standards for further analysis or processing.
  • Store and manage the scraped data in appropriate databases, optimizing for access speed and data integrity.
  • Regularly monitor the scraping processes, and identify and resolve any issues to maintain continuous data flow.

What experience you’ll need:

  • Demonstrated ability to extract data from complex websites with minimal supervision, supported by a portfolio or examples of past projects.

Technical requirements:

  • Proficiency in languages such as Python or JavaScript, with strong skills in libraries and frameworks like BeautifulSoup, Scrapy, or Selenium.
  • Knowledge of asynchronous programming, multithreading, and distributed scraping.
  • In-depth knowledge of HTML, CSS, JavaScript, and the Document Object Model (DOM).
  • Experience with NoSQL databases (MongoDB, Cassandra), including designing efficient storage solutions and managing data integrity.

Bonus points:

  • Ability to apply machine learning algorithms for data cleaning, categorization, or predictive analysis.
  • Experience with cloud services (AWS, Google Cloud, Azure) for deploying and managing scraping jobs at scale.
  • Active participation in open-source projects related to web scraping, data processing, or similar fields.

Apply

Our hiring process prioritizes a paid work trial over a traditional interview process.

To apply for this job, please visit boards.greenhouse.io.