• Experience in installing, configuring and using Apache Hadoop ecosystem components such as Hadoop Distributed File System (HDFS), MapReduce, YARN, Spark, NiFi, Pig, Hive, Flume, HBase, Oozie, ZooKeeper and Sqoop, with development in Scala.
• Hands-on experience in creating real-time data streaming solutions using Apache Spark Core, Spark SQL and DataFrames, Kafka, Spark Streaming and Apache Storm (see the sketch after this list).
• Excellent knowledge of Hadoop architecture and Hadoop cluster daemons, including the NameNode, DataNode, ResourceManager, NodeManager and JobHistory Server.
• Expertise in administering Hadoop clusters using distributions such as Apache Hadoop and Cloudera.
• Extensive development experience in IDEs such as Eclipse, NetBeans and Forte.
• Proficient in creating complex data ingestion pipelines, data transformations, data management, data governance and real-time streaming engines at an enterprise level.
• Proficient with NoSQL technologies such as HBase, Cassandra and MongoDB.
• Extensive experience working with ETL tools such as SSIS and Informatica, and with reporting tools such as SQL Server Reporting Services (SSRS).
• Experience in data warehousing concepts such as Star, Galaxy and Snowflake schemas, data marts and the Kimball methodology used in relational and multidimensional data modeling.
• Hands-on experience in coding MapReduce/YARN programs using Java, Scala and Python for analyzing big data.
• Good experience working with cloud environments such as Amazon Web Services (AWS), Microsoft Azure and GCP.
• Established connections to multiple Redshift clusters (Bank Prod, Card Prod, SBBDA Cluster) and provided access for pulling the information needed for analysis.
• Strong knowledge of extracting, transforming and loading data directly from source systems such as flat files, Excel, Hyperion, Oracle, SQL Server and Redshift.
• Worked on extensive migration of Hadoop and Spark clusters to GCP, AWS and Azure.
• Experience with Azure Cloud, Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, Azure Analysis Services, Azure Cosmos DB (NoSQL), Azure big data technologies (Hadoop and Apache Spark) and Databricks.
• Good knowledge of using Apache NiFi to automate data movement between different Hadoop systems.
• Experienced in migrating HiveQL to Impala to minimize query response time.
• Extensive experience implementing Continuous Integration (CI), Continuous Delivery and Continuous Deployment (CD) for various Java-based applications using Jenkins, TeamCity, Azure DevOps, Maven, Git, Nexus, Docker and Kubernetes.
• Expertise in Natural Language Processing (NLP), text mining, topic modeling and sentiment analysis.
• Proficient at using Spark APIs to explore, cleanse, aggregate, transform and store machine sensor data.
• Experience in creating DataFrames using PySpark and performing operations on them using Python.
• Developed ETL/Hadoop-related Java code, created RESTful APIs using the Spring Boot framework, developed web apps using Spring MVC and JavaScript, and developed coding frameworks.
• Experience in developing POCs using Scala, Spark SQL and MLlib libraries, then deploying them on YARN clusters.
• Excellent experience in job workflow scheduling and monitoring with Oozie and in cluster coordination with ZooKeeper.
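As a representative illustration of the real-time streaming work described above, below is a minimal PySpark Structured Streaming sketch that consumes JSON events from Kafka. The broker address, topic name and event schema are hypothetical placeholders, and the job assumes the spark-sql-kafka connector package is on the classpath.

```python
# Minimal sketch of a Spark Structured Streaming job reading from Kafka.
# Broker, topic and schema below are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = (SparkSession.builder
         .appName("kafka-stream-sketch")
         .getOrCreate())

# Assumed JSON payload schema for the incoming sensor events.
schema = StructType([
    StructField("sensor_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
       .option("subscribe", "sensor-events")              # placeholder topic
       .load())

# Kafka delivers bytes; cast the value column to a string and parse it
# as JSON into typed columns.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), schema).alias("e"))
          .select("e.*"))

# Console sink is used here for demonstration; a production job would
# write to a durable sink (e.g. Parquet, Kafka, a database) instead.
query = (events.writeStream
         .format("console")
         .outputMode("append")
         .start())
query.awaitTermination()
```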
• Imported and exported data between various data sources and HDFS using Sqoop, and performed transformations using Hive and MapReduce before loading the results into HDFS.
• Good understanding of and experience with machine learning algorithms and techniques such as classification, clustering, regression, decision trees, random forests, NLP, ANOVA, SVMs and artificial neural networks.
• Extensive experience in text analytics, developing statistical machine learning solutions to various business problems and generating data visualizations using Python and R.
• Expertise in SQL Server Analysis Services (SSAS) and SQL Server Reporting Services (SSRS), and in developing T-SQL and Oracle PL/SQL scripts, stored procedures and triggers for business-logic implementation.
• Experience in creating interactive dashboards and creative visualizations using tools such as Tableau and Power BI.
• Hands-on experience with Microsoft Azure components such as HDInsight, Data Factory, Data Lake Storage, Blob Storage and Cosmos DB.
• Experience with distributed computing architectures using AWS products (e.g. EC2, Redshift, EMR and Elasticsearch), migrating raw data to Amazon S3 and performing refined data processing (see the sketch after this list).
• Extensive skills with Linux and UNIX shell commands.
• Excellent working experience with Scrum/Agile frameworks and Waterfall project execution methodologies.
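As a representative illustration of the raw-to-refined S3 processing noted above, below is a minimal PySpark sketch that reads raw CSV files from an S3 landing zone, types and cleans the records, and writes a partitioned Parquet refined layer. The bucket paths and column names are hypothetical, and an S3A connector with configured credentials is assumed.

```python
# Minimal sketch of refining raw S3 data with PySpark; bucket paths and
# column names are illustrative placeholders, not from an actual project.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

spark = (SparkSession.builder
         .appName("s3-refine-sketch")
         .getOrCreate())

# Read raw landing-zone files (assumes the S3A connector and AWS
# credentials are already configured for this Spark session).
raw = (spark.read
       .option("header", "true")
       .csv("s3a://raw-bucket/landing/transactions/"))

# Example refinement: cast columns to proper types, derive a date
# partition key, and drop rows that fail the casts.
refined = (raw
           .withColumn("amount", col("amount").cast("double"))
           .withColumn("txn_date", to_date(col("txn_ts")))
           .dropna(subset=["amount", "txn_date"]))

# Persist the refined layer as date-partitioned Parquet for downstream
# analysis (e.g. Hive, Redshift Spectrum, or Athena-style querying).
(refined.write
 .mode("overwrite")
 .partitionBy("txn_date")
 .parquet("s3a://refined-bucket/transactions/"))
```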