OnHires - Python Scraping Developer
Requirements
• Core Experience: Proven, hands-on professional experience in high-volume web scraping and data extraction using Python.
• Technical Depth: Solid understanding of HTML parsing, browser automation techniques, and asynchronous programming.
• Frameworks: Proficiency with leading web scraping frameworks (e.g., Playwright, Scrapy, or Selenium).
• Web Knowledge: Strong knowledge of REST APIs, HTTP protocols, and effective proxy management.
• Database Skills: Familiarity with both SQL and NoSQL databases for efficient data storage and processing.
• Infrastructure: Experience with Docker, Linux environments, and version control (Git).
• Communication: Fluent in English (written and spoken).
• Mindset: Self-driven, detail-oriented, and capable of taking full ownership of significant projects.
• Experience with advanced async libraries (e.g., asyncio).
• Understanding of data quality validation and pipeline monitoring tools.

What they offer
• Impact & Ownership: A high degree of freedom and the opportunity to have a meaningful, measurable impact on a growing scale-up business.
• Flexibility: A high degree of flexibility; our client is a remote-first company and actively supports remote work.
• Growth: A competitive compensation package and dedicated support for your personal & professional development (ongoing training & coaching).
• Team & Atmosphere: A great work atmosphere within a small, talented, and international team.
• Office (Optional): A modern office on the campus of Wildau Tech University, just outside Berlin and easily accessible by public transport.
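To give a flavor of the asynchronous programming and asyncio experience listed above, here is a minimal, illustrative sketch of a bounded-concurrency crawler pattern. The `fetch` coroutine is a stand-in (no real network call), and all names and URLs are hypothetical:

```python
import asyncio

# Hypothetical fetch: a real crawler would call Playwright or an HTTP client here.
async def fetch(url: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for network latency
    return f"<html>{url}</html>"

async def crawl(urls, concurrency: int = 5):
    """Fetch many URLs while capping the number of in-flight tasks."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded_fetch(url):
        async with semaphore:  # at most `concurrency` fetches run at once
            return await fetch(url)

    # gather() preserves input order in its results
    return await asyncio.gather(*(bounded_fetch(u) for u in urls))

pages = asyncio.run(crawl([f"https://example.com/{i}" for i in range(3)]))
```

The semaphore is the key piece: it lets thousands of URLs be scheduled at once while keeping the number of concurrent requests polite and predictable.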
Responsibilities
• Design & Development: Develop, test, and deploy robust web scraping scripts and crawlers using advanced Python tools (Playwright, Selenium, Requests, BeautifulSoup, etc.).
• Scalability: Architect and maintain asynchronous scraping systems capable of large-scale data extraction.
• Resilience: Implement, monitor, and optimize sophisticated anti-blocking strategies and proxy rotation to ensure high reliability and uptime.
• Integration: Manage and automate data ingestion pipelines and seamless integrations with external REST APIs.
• Operational Excellence: Debug, monitor, and continuously improve scraper performance, reliability, and data quality.
• Collaboration: Partner with other engineers to enhance our core scraping infrastructure, tooling, logging, and monitoring systems.
• DevOps Support: Assist with DevOps tasks, including Docker, CI/CD, and managing Linux environments.