SM / AVP - Senior Data Scientist
Reporting to: Head, Business Intelligence and Fraud Analytics
Role: Senior Data Scientist
We are looking for a hands-on Data Scientist to handle challenging problems in the areas of Indirect Tax Fraud & Analysis using cutting-edge data science techniques. You have the engineering skills to work with structured and unstructured data. You have applied knowledge of statistics, machine learning and NLP systems. You will be a machine learning expert who will co-create the implementation playbook for our machine learning / deep learning / NLP / AI capabilities. You have sound experience in machine learning and excellent communication skills, and you constantly keep pace with industry trends and new technologies. You have a passion for conceptualizing, prototyping, debugging & developing ML-based products, applications & solutions.
- Leads discovery processes with stakeholders to identify the business requirements and the expected outcome.
- Collaborates with subject matter experts to select the relevant sources of information.
- Designs and builds analytic solutions for our customers in short development cycles.
- Designs experiments, tests hypotheses, and builds models. Conducts advanced data analysis and designs custom algorithms.
- Creates visualizations, dashboards and reports to explain insights to the stakeholders.
- Presents and explains the rationale of their findings in easy-to-understand terms for the business.
- Demonstrated data science experience with both structured and unstructured data.
- Knowledge of statistical techniques (ANOVA, chi-squared, correlation, etc.) to design, test and validate hypotheses.
- Experience working with classical machine learning models: Bayesian models, tree-based models, margin-based classifiers, boosting and bagging models, cluster analysis, and anomaly detection systems.
- Expert in feature engineering and evaluation.
- Hands-on experience in handling text data using NLP techniques (e.g. document classification and clustering, topic mining, linguistic analysis, entity recognition, semantic search, information retrieval).
- Knowledge of classical time series analysis techniques e.g. Holt-Winters, ARIMA.
- Some data visualization experience (Tableau, QlikView, D3.js, etc.).
- Experience working with new-generation ML algorithms (e.g. SVM, LDA, kernel-based models, decision trees, GBM, NN, rule mining, pattern discovery, etc.).
- Some experience with deep learning models and their applications for CV, NLP and time series forecasting.
- Experience with and interest in privacy-preserving techniques, hyperparameter optimization, model tuning and optimization.
- Experience in designing end-to-end ML workflows and developing the modeling code to agreed engineering specs.
- Knowledge of distributed data science techniques, methods and tools.
- Knowledge of model testing and post-production drift, bias and shift detection.
- Ability to write efficient, concise and maintainable code.
- Learn new technologies & tools and suggest better ways of improving data science operations.
- Design, discuss with Senior Management & own your personal & professional development plan.
- Create, discuss & work through a training plan in your area of interest which aligns with current & future responsibilities.
- External stakeholders (tax authorities, taxpayers, treasuries, banks, policy makers, process groups, etc.)
- Internal departments, i.e. Technology, Strategy, MIS & Analysis, Services team and CISO
Key Attributes & Skills:
Experience of 8+ years as a Data Scientist. Graduate in any STEM field.
Analytics software/programming technologies: Python, SparkML, TensorFlow, BigDL, Horovod, H2O, RapidMiner, R, CPLEX, LION
Database technologies: MySQL, MongoDB, Hadoop ecosystem, CouchDB, Redis, Neo4j
ETL technologies: Python, Spark
Languages: Python, PySpark, Scala, JavaScript, UNIX Shell, Django
Data Visualization: Bokeh, Dash, Seaborn, Tableau, D3.js
Statistical Methods: Probability distribution, ANOVA, Factor Analysis, Correlation & Regression, Conjoint, Time series, DoE, Sampling, PCA, Statistical Inference, MCMC
Machine Learning Methods/Algorithms: CHAID, Bayesian models, Linear & Logistic Regression, SVM, RVM, Neural Nets, K-means, Expectation Maximization, DBSCAN, BIRCH, Ensemble models, ICA, LDA, Graph models, Anomaly Detection models, XGBoost, LSA, Lasso, Ridge, Elastic Net, Constraint & Multi-objective programming
Cognitive & NLP: CoreNLP, NLTK, CNTK, Gensim, spaCy
Deep Learning: TensorFlow, H2O, PyTorch, Horovod, Caffe, BigDL
Specific Packages: scikit-learn, pandas, NumPy, PySpark, matplotlib, Bokeh, seaborn, R Shiny, ggplot2, Alchemy, Dask, BigDL, Horovod, Distributed ML
Big Data Tools: Spark, Hadoop, Hive, HBase, Hue, Drill, Presto
Additional Skills (Good to have)
Working knowledge of Spark & Hadoop ecosystem
Experience of Financial Crime modelling
Experience in K8s-based (Kubernetes) ML workflow tooling
Salary: Not Disclosed by Recruiter
Desired Candidate Profile
UG: BCA - Computers, B.Sc - Maths, Computers, B.Tech/B.E. - Any Specialization
NISG (National Institute for Smart Government)
Contact Company: NISG (National Institute for Smart Government)