12 Weeks ago

Hiring Data Analyst (Web Scraping using Python)

214318 - 1051540 INR - Annual

Pune, India

Information Technology

Full-Time

Texila Educare Healthcare and Technology Enterprise Pvt Ltd

Overview

Experience: 2 years

Key Responsibilities:

Develop and Maintain Web Scraping Scripts: Build efficient, scalable, and robust web scraping tools using Python and relevant libraries (e.g., BeautifulSoup, Scrapy, Selenium).
Data Extraction: Extract structured and unstructured data from websites and APIs, focusing on gathering high-quality and clean datasets.
Data Processing and Storage: Process, clean, and store extracted data in databases (SQL/NoSQL) or data warehouses, ensuring it's ready for analysis and reporting.
Website Parsing and HTML Manipulation: Parse complex HTML structures and interact with websites that require JavaScript rendering.
Error Handling and Logging: Develop error handling and logging mechanisms to ensure scripts run reliably and provide useful diagnostics when failures occur.
Automation and Scheduling: Automate scraping jobs to run on a regular basis using task schedulers (e.g., cron jobs) and ensure minimal downtime.
Ensure Compliance: Implement scraping systems that comply with website Terms of Service, robots.txt directives, and applicable laws (e.g., GDPR and copyright law).
Optimize Performance: Optimize scraping performance for speed and reliability. Handle rate limits, CAPTCHAs, and IP blocking mechanisms to ensure smooth operations.
Documentation and Reporting: Maintain clear documentation of scraping processes, data flows, and any issues encountered. Provide status updates and reports to stakeholders.
Collaboration: Work closely with data analysts, product teams, and engineers to ensure data quality and availability for decision-making processes.
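
As an illustration only, several of the responsibilities above (robots.txt compliance, error handling, logging, and retrying after transient failures) can be combined in a short script. The sketch below is not part of the employer's requirements; the names `load_robots`, `polite_fetch`, and `example-bot` are hypothetical, and the `fetch` callable stands in for whatever HTTP client a real project would use.

```python
import logging
import time
from urllib.robotparser import RobotFileParser

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("scraper_sketch")


def load_robots(robots_txt):
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_fetch(fetch, url, robots, agent="example-bot",
                 retries=3, base_delay=1.0, sleep=time.sleep):
    """Fetch url unless robots.txt disallows it; retry with exponential backoff.

    fetch is any callable taking a URL and returning the response body
    (hypothetical; supplied by the caller).
    """
    if not robots.can_fetch(agent, url):
        log.warning("skipping %s: disallowed by robots.txt", url)
        return None
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except OSError as exc:  # network-level errors derive from OSError
            log.error("attempt %d/%d for %s failed: %s",
                      attempt, retries, url, exc)
            if attempt == retries:
                raise  # give up and surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
```

In practice `fetch` might wrap `urllib.request.urlopen`, whose `URLError` is a subclass of `OSError`; passing `sleep` as a parameter keeps the backoff testable without real delays.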

Required Skills and Qualifications:

Proficiency in Python: Strong experience with Python, especially in libraries like BeautifulSoup, Scrapy, Requests, Selenium, and Pandas.
Web Scraping Frameworks: Experience with scraping tools such as Scrapy, Selenium, or Puppeteer.
HTML, CSS, JavaScript: Deep understanding of web technologies, including HTML, CSS, and JavaScript to navigate websites and handle dynamic content.
Data Manipulation and Storage: Experience with SQL and NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB) and data processing libraries (e.g., Pandas).
APIs: Experience working with RESTful APIs to extract or push data.
Data Formats: Knowledge of data formats like JSON, XML, CSV, and how to parse/handle them.
Error Handling and Debugging: Strong skills in troubleshooting, debugging, and optimizing web scraping operations.
Networking and HTTP Protocols: Familiarity with HTTP requests, headers, cookies, and web scraping proxies (e.g., rotating proxies, IP management, VPNs).
Version Control: Experience using version control systems like Git.
Problem Solving and Critical Thinking: Ability to handle complex scraping challenges like dynamic content, CAPTCHA, JavaScript rendering, etc.

Preferred Qualifications:

Experience with Cloud Technologies: Familiarity with cloud platforms such as AWS, Google Cloud, or Azure for scalable scraping and storage solutions.
Distributed Systems: Experience with managing distributed web scraping jobs using tools like Celery, RabbitMQ, or Kubernetes.
Data Quality and Validation: Experience in data validation, cleaning, and transforming data for downstream processes.
Knowledge of Machine Learning: Familiarity with applying machine learning techniques to parse and extract data from semi-structured or unstructured sources.

Job Type: Full-time

Pay: ₹214,318.07 - ₹1,051,539.21 per year

Schedule:

Day shift

Experience:

Total work: 2 years (preferred)

Work Location: In person

