Manager/AM – Data Engineer for Goods & Services Tax Network (GSTN)

4 to 9 years of experience
Delhi NCR

Job Description

Manager/AM – Data Engineer


Role: Data Engineer

Reporting to: Head, Business Intelligence and Fraud Analytics

Function: Technical

Experience: 4 – 8 Years

Location: Delhi


Role Description

Junior Data Engineers will support data extraction, transformation and processing operations for the Management Information Systems (MIS), Business Intelligence (BI) & Data Science workflows.

They will assist business users with various ad-hoc queries and MIS report preparation. They will also assist Data Scientists & Senior Analysts in building and maintaining data pipelines that feed transformed data into various analytical models & dashboards, and will assist Data Architects with data platform health reports, data lifecycle & governance workflows. As a Data Engineer, you will develop, maintain, evaluate and test big data solutions. You will be involved in the design of data solutions using Hadoop-based technologies along with Python & PySpark programming.
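By way of illustration, the batch transformations described above typically follow a filter → group → aggregate shape. The sketch below shows that shape in plain Python; the field names (`state`, `tax_paid`) are invented for illustration, and at scale the same logic would be expressed in PySpark as `df.filter(...).groupBy(...).agg(...)` on a Spark DataFrame:

```python
from collections import defaultdict

# Toy batch transform: the same filter -> group -> aggregate shape a
# PySpark job would implement. Field names are illustrative only.

def monthly_tax_by_state(records, min_amount=0):
    """Aggregate tax paid per state, skipping records below min_amount."""
    totals = defaultdict(float)
    for rec in records:
        if rec["tax_paid"] >= min_amount:
            totals[rec["state"]] += rec["tax_paid"]
    return dict(totals)

sample = [
    {"state": "DL", "tax_paid": 1200.0},
    {"state": "MH", "tax_paid": 800.0},
    {"state": "DL", "tax_paid": 300.0},
]
print(monthly_tax_by_state(sample))  # {'DL': 1500.0, 'MH': 800.0}
```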

Key Responsibilities

Data Extraction, Load & Transformation

Implement the extraction & load query architecture as per design

Write & maintain extraction queries – batch, real-time ingestion & ad-hoc

Write & maintain multi-source to multi-target extraction-to-load pipelines

Write data transformation queries as per business logic

Create and maintain Transformation pipelines/DAGs

Continuously optimize queries & pipelines for higher performance

Provide inputs to the Data Architect, Data Scientists, Data Analysts & Business users & resolve queries around data

Create & maintain APIs to expose data as a service

Maintain HDFS folders and Hive/HBase tables; maintain other data repositories & data version control tools

Design ETL flows using NiFi, Airflow & Oozie

Process large datasets using Python, PySpark and Scala

Design and develop distributed, high-volume, high-velocity multi-threaded event processing systems

Automate Data Testing & implement continuous testing as part of the DataOps process. Implement key principles of the DataOps manifesto
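The transformation pipelines/DAGs mentioned above share one core idea regardless of tool (Airflow, NiFi or Oozie): tasks run only after their upstream dependencies complete. A minimal sketch of that idea, using invented task names and a plain topological sort rather than any scheduler's actual API:

```python
# Minimal DAG-ordering sketch: resolve a run order in which every task
# follows its upstream dependencies. Assumes the graph is acyclic, as a
# valid pipeline DAG must be. Task names are illustrative only.

def topo_order(deps):
    """Return a run order where each task appears after its dependencies."""
    order, seen = [], set()

    def visit(task):
        if task in seen:
            return
        seen.add(task)
        for upstream in deps.get(task, []):
            visit(upstream)  # schedule dependencies first
        order.append(task)

    for task in deps:
        visit(task)
    return order

pipeline = {
    "extract": [],
    "transform": ["extract"],
    "load": ["transform"],
    "report": ["load", "transform"],
}
print(topo_order(pipeline))  # ['extract', 'transform', 'load', 'report']
```

In Airflow the same dependencies would be declared with operators and `>>` chaining; the scheduler then derives this ordering itself.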

Data Steward, Governance & Lifecycle Maintenance

Play the role of Data Steward for transformed datasets; maintain data dictionaries & metadata information

Work with Data Architect to implement Data Governance & Lifecycle reports

Create & maintain housekeeping scripts & maintain the sanctity of the reporting & analytical data warehouses

Be the custodian of reporting & analytics data warehouses

Interact & work with, and if required direct, the Managed Services Provider (MSP)

Monitor & track the activities of the MSP with regard to the day-to-day functioning of the data lake & normal upkeep of the data warehouses.
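A typical housekeeping task in this stewardship role is enforcing a retention window on warehouse partitions. The sketch below computes which daily partitions have expired; the `dt=YYYY-MM-DD` naming is an assumed convention for illustration, and in practice the actual drop would go through Hive (`ALTER TABLE ... DROP PARTITION`) or HDFS commands:

```python
from datetime import date, timedelta

# Housekeeping sketch: identify daily partitions older than a retention
# window. The dt=YYYY-MM-DD partition naming is an assumed convention.

def expired_partitions(partitions, retention_days, today):
    """Return the partitions whose date falls before the retention cutoff."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for part in partitions:
        part_date = date.fromisoformat(part.split("=", 1)[1])
        if part_date < cutoff:
            expired.append(part)
    return expired

parts = ["dt=2021-01-01", "dt=2021-03-01", "dt=2021-03-20"]
print(expired_partitions(parts, retention_days=30, today=date(2021, 3, 25)))
# ['dt=2021-01-01']
```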

Others

Learn new technologies & tools and suggest better ways of improving data operations

Design, discuss with Senior Management & own your personal & professional development plan

Create, discuss & work through a training plan in your area of interest which aligns with current & future responsibilities

Key Interfaces

External: External stakeholders (tax authorities, taxpayers, treasuries, banks, policy makers, process groups, etc.)

Internal: Internal departments, i.e. Technology, Strategy, MIS & Analysis, Services team and CISO

Key Attributes & Skills:

Basic Qualifications:

Graduate in any STEM field with 4 – 8 years of experience as a Big Data Engineer

Core Skills

  • Minimum 3 years of core experience in the design, maintenance & optimization of SQL queries
  • Minimum 3 years of experience working on the Hadoop ecosystem & allied big data technologies
  • Minimum 3 years of experience working with large datasets using Python, PySpark & Scala
  • Excellent knowledge of HDFS, Hive, HBase & Spark
  • Ability to work with various CI/CD tools

Desirable Skills

  • Working knowledge of big data visualization techniques
  • Working knowledge of Apache Kafka, Beam, Flink, Cassandra, Druid, Postgres, Atlas, Falcon, DVC, Elasticsearch, Redis & UNIX shell scripting
  • Experience in handling & manipulating unstructured data is an advantage
  • Experience with workflow tools like Apache Airflow or NiFi is a big plus
  • Experience/background in Java/C++ programming is desirable
  • Knowledge of Cloud solutions for big data
  • Knowledge of ML tools/techniques is an advantage
Salary: Not Disclosed by Recruiter

    Industry: IT-Software / Software Services

    Functional Area: IT Software - Application Programming, Maintenance

    Role Category: Programming & Design

    Role: Software Developer


    Desired Candidate Profile

    Please refer to the Job description above

    Education

    UG: B.Tech/B.E. - Any Specialization, B.Sc. - Maths, Computers

    Company Profile

    NISG (National Institute for Smart Government)

    www.nisg.org

    Contact Company: NISG (National Institute for Smart Government)