GCP Data Engineer

CLOUDSUFI | India | ₹1,000,000 – ₹4,000,000
Adzuna IN

Job Description

We are seeking a highly skilled and motivated Data Engineer to join our Development POD for the Integration Project. The ideal candidate will be responsible for designing, building, and maintaining robust data pipelines to ingest, clean, transform, and integrate diverse public datasets into our knowledge graph. This role requires a strong understanding of Google Cloud Platform (GCP) services, data engineering best practices, and a commitment to data quality and scalability.

Location - Noida (Hybrid) / Remote

You can also share your resume at ayushi.dwivedi at the rate cloudsufi.com.

Qualifications and Skills

Education: Bachelor's or Master's degree in Computer Science, Data Engineering, Information Technology, or a related quantitative field.

Experience: 3+ years of proven experience as a Data Engineer, with a strong portfolio of successfully implemented data pipelines.

Programming Languages: Proficiency in Python for data manipulation, scripting, and pipeline development.

Cloud Platforms and Tools: Expertise in Google Cloud Platform (GCP) services, including Cloud Storage, Cloud SQL, Cloud Run, Dataflow, Pub/Sub, BigQuery, and Apigee. Proficiency with Git-based version control.

Core Competencies:
- Must Have - SQL, Python, BigQuery, GCP Dataflow / Apache Beam, Google Cloud Storage (GCS)
- Must Have - Proven ability in comprehensive data wrangling, cleaning, and transforming complex datasets from various formats (e.g., API, CSV, XLS, JSON)
- Secondary Skills - SPARQL, Schema.org, Apigee, CI/CD (Cloud Build), GCP, Cloud Data Fusion, Data Modeling
- Solid understanding of data modeling, schema design, and knowledge graph concepts (e.g., Schema.org, RDF, SPARQL, JSON-LD).
- Experience with data validation techniques and tools.
- Familiarity with CI/CD practices and the ability to work in an Agile framework.
- Strong problem-solving skills and keen attention to detail.

Preferred Qualifications:
- Experience with LLM-based tools or concepts for data automation (e.g., auto-schematization).
- Familiarity with similar large-scale public dataset integration initiatives.
- Experience with multilingual data integration.

Skills: Google BigQuery, Google Cloud Storage, Apache Airflow, Dataflow, SQL, and Python
