Advanced Engineer Software
Albertsons Companies India
Bangalore, Karnataka
Job Description
Role: Advanced Engineer Software - AI Ops Fullstack

About the company
Albertsons Companies is at the forefront of the revolution in retail. With a fixation on innovation and building culture, our team is rallying our company around a unique vision: forging a retail winner that is admired for national strength, deep roots in the communities we serve, and a team that has passion for food and delivering great service. Albertsons is one of the largest retail employers, providing approximately 300,000 jobs across 2,200 stores, 22 distribution centers, 20 food and beverage plants and various support offices. We operate in 34 states and the District of Columbia under the Albertsons banner, as well as Safeway, Tom Thumb, Jewel-Osco, Shaw's and many more recognizable names.

What you will be doing
This role is an individual contributor position responsible for building, fine-tuning, and supporting AIOps platform components for the Observability product. The candidate will work closely with the Lead Engineer, SRE teams, Platform team, Data Ingestion team, Platform DevOps team, Data Visualization team, and other portfolio teams. As part of the AIOps Platform team, you will be expected to design, develop, enhance, and maintain full-stack platform components built on the Grafana open-source stack, with custom Grafana plugin-based UI and backend services developed in Node.js and Python. This position will preferably be based out of India GCC, Bangalore.

Key Responsibilities:
- Lead offshore development and support for the custom-built AIOps platform built on the Grafana open-source stack, including Grafana plugin-based UI, Node.js and Python microservices, and data platforms such as Neo4j and Cosmos DB Mongo API.
- Own offshore delivery commitments, sprint execution, release readiness, and milestone tracking to ensure features, fixes, and enhancements are delivered on time and with quality.
- Provide technical guidance on scalable, resilient, and maintainable architecture across frontend plugins, backend microservices, APIs, integrations, and data storage layers.
- Design, develop, enhance, and maintain custom Grafana plugins to deliver intuitive, high-performing, and user-friendly observability and AIOps experiences.
- Oversee development and support of Node.js and Python microservices, ensuring coding standards, API consistency, logging, error handling, observability, and performance optimization.
- Guide implementation and optimization of Neo4j and Cosmos DB Mongo API data models, queries, and storage strategies to support platform scalability, performance, and reliability.
- Ensure seamless integration of the AIOps platform with monitoring, event management, alerting, CMDB, ticketing, and other enterprise systems.
- Lead offshore support for platform issues, triage production incidents, coordinate root cause analysis, and drive timely resolution to minimize business impact.
- Establish and enforce best practices for code reviews, secure coding, unit and integration testing, CI/CD pipelines, release governance, and technical documentation.
- Continuously improve platform health, plugin responsiveness, service reliability, database performance, and overall system efficiency.
- Coordinate offshore team activities, remove blockers, and ensure alignment across engineering, QA, DevOps, and support teams.
- Act as the primary offshore point of contact for technical updates, delivery status, risks, dependencies, and escalations with onsite and leadership teams.
- Partner with product owners, architects, and business stakeholders to translate requirements into technical designs, implementation plans, and delivery roadmaps.
- Support build, deployment, environment readiness, release validation, and post-release stabilization across development, test, and production environments.
- Identify technical risks, delivery challenges, and cross-team dependencies early, and proactively drive mitigation plans.
- Mentor offshore engineers on Grafana plugin development, microservices architecture, database design, troubleshooting, and platform best practices.
- Maintain technical documentation including architecture diagrams, API specifications, deployment procedures, operational runbooks, and troubleshooting guides.
- Drive modernization, automation, and continuous improvement initiatives to enhance the business value and operational efficiency of the AIOps platform.

We are searching for someone with the following skills:
- Strong experience in React UI design and development, or Grafana Scenes SDK, with hands-on expertise in building and supporting custom Grafana plugins.
- Solid development experience in Node.js and Python, with strong understanding of microservices-based architecture.
- Experience designing, developing, and maintaining scalable backend services, REST APIs, and platform integrations.
- Strong knowledge of frontend-backend interaction patterns for plugin-based or dashboard-driven platforms.
- Hands-on experience with Neo4j for graph-based data modeling and query optimization.
- Experience with Cosmos DB Mongo API or MongoDB, including schema design, query tuning, and performance optimization.
- Hands-on experience developing MCP clients using React, Node.js, or Python.
- Experience integrating web applications with RUM-based tools.
- Good understanding of observability, monitoring, event correlation, alert management, and AIOps / IT operations concepts.
- Experience with CI/CD pipelines, release management, version control systems, and DevOps practices.
- Strong knowledge of logging, monitoring, troubleshooting, and production incident management.
- Experience in performance tuning, scalability improvements, and reliability engineering for distributed systems.
- Ability to review architecture, guide technical design, and ensure alignment with enterprise standards and security requirements.
- Strong understanding of SDLC, Agile delivery models, sprint planning, and release execution.
- Proven ability to manage delivery timelines, technical risks, dependencies, and stakeholder expectations.
- Excellent problem-solving, analytical, and debugging skills across UI, middleware, APIs, and database layers.
- Strong communication and coordination skills to work effectively with onsite teams, architects, product owners, and cross-functional stakeholders.
- Experience mentoring developers, conducting code reviews, and driving engineering best practices across offshore teams.
- Hands-on experience instrumenting OpenTelemetry (OTEL), with a good understanding of Site Reliability Engineering concepts.
- Knowledge of monitoring tools such as Log Analytics, AppDynamics, Grafana, Prometheus, Splunk, and SiteScope.
- Experience working with ServiceNow or similar IT service management tools.
- Familiarity with cloud technologies across Azure, AWS, and Google Cloud.
- Experience with Docker and Kubernetes.
- Experience with high-performance and high-frequency data streaming, health confirmation techniques, and large-volume batch data processing using technologies such as Kafka is strongly preferred.
- In-depth knowledge of modern monitoring and observability tools.
- In-depth knowledge of at least one major cloud platform and container/service instance concepts.
- Strong knowledge of log querying, service inspection, and troubleshooting techniques.
- Strong understanding of software development lifecycle and Agile methodologies.
- Ability to understand client expectations and resolve issues impacting service quality.
- Strong mentoring, coaching, and training capability for engineering and support teams.
- Self-starter mindset with the ability to learn beyond formal training and consistently deliver high-quality solutions.
We believe the successful candidate has these qualifications and experience:
- 4-year degree (Computer Science, Information Systems, or related functional field) and/or equivalent combination of education or work experience.
- 6-9+ years of developer experience as a full-stack engineer is required.
- 3+ years of experience in integration engineering related to observability/monitoring frameworks with any open-source technologies, preferably Grafana, Mimir, Loki, Tempo, Fluent Bit, Vector, etc.
- Hands-on experience with these tools and technologies is preferred.
- Experience working with open-source platforms and OpenTelemetry libraries, e.g. Grafana, is preferred.
- Experience working with Grafana Scenes SDK for plugin development.

What is it like at Albertsons?
Albertsons Culture Principles
Compassion: We always treat each other with kindness and respect
Team: We always support and recognize each other
Inclusive: We always value everyone's perspective
Learning: We always strive to grow and develop ourselves and others
Competitive: We always act with integrity to win over the customer
Ownership: We always take actions to drive our success