Omkar

Azure Data Engineer

11+ Years of Experience

Technical Skills

1. Azure Data Factory
2. Azure Data Lake Storage
3. Azure Stream Analytics
4. Azure Data Lake Analytics
5. Azure Data Catalog
6. Azure Data Share
7. Azure Cosmos DB
8. Azure Event Hubs
9. Azure Logic Apps
10. Power BI
11. Java for Data Engineering
12. .NET for Data Engineering
13. Hadoop (including HDFS, Hive, Pig)
14. NoSQL databases (e.g., MongoDB, Cassandra)
15. Apache Kafka for Data Streaming
16. Git for Version Control
17. Jenkins for Continuous Integration/Continuous Deployment (CI/CD)
18. DevOps Practices for Data Engineering

Professional Summary

Responsibility

Projects

Real-time Data Processing and Analytics

1. Designed and developed Azure Data Factory pipelines to ingest data from Azure Event Hubs.
2. Configured Azure Stream Analytics jobs to process and transform incoming streaming data in real time.
3. Leveraged Azure Data Lake Storage for storing raw and processed data efficiently.
4. Utilized Azure Logic Apps to trigger notifications and alerts based on specific data conditions.
5. Designed Power BI dashboards for visualizing real-time data insights and trends.
6. Collaborated with business analysts to define key performance indicators (KPIs) for monitoring.
7. Integrated Git for version control to manage code changes and track pipeline modifications.
8. Conducted performance tuning of Stream Analytics queries for optimal data processing speed.
9. Implemented CI/CD pipelines using Jenkins to automate the deployment of Azure resources and Stream Analytics jobs.
10. Documented the end-to-end architecture, data flow, and monitoring procedures for knowledge sharing.

Multi-cloud Data Synchronization

1. Designed and deployed Azure Data Factory pipelines to extract, transform, and load (ETL) data from on-premises systems.
2. Implemented Azure Data Lake Storage as a central repository for synchronized data.
3. Used Java and .NET to develop custom connectors for data synchronization between Azure and non-Azure cloud environments.
4. Integrated Apache Kafka for data streaming to facilitate near-real-time data replication.
5. Configured Azure Data Share for secure data sharing with external cloud partners.
6. Implemented Git-based version control for managing code changes and pipeline configurations.
7. Developed monitoring and alerting using Azure Logic Apps for data synchronization workflows.
8. Set up CI/CD pipelines with Jenkins for continuous integration and deployment of data synchronization processes.
9. Documented architecture diagrams and data lineage for compliance and knowledge sharing.
10. Collaborated with cross-functional teams to ensure seamless data synchronization across cloud platforms.

Migration to Cloud-native Data Platform

1. Led the migration of on-premises data systems to an Azure cloud-native platform.
2. Utilized Azure Data Factory for seamless data migration from on-premises databases.
3. Implemented Azure SQL Managed Instance for relational data storage.
4. Transformed data using Azure Databricks and PySpark for migration and analysis.
5. Automated ETL processes with Azure Logic Apps and Azure Functions.
6. Ensured data consistency and security during migration and transformation.
7. Reduced infrastructure costs by 20% and improved data availability for analytics.

Education

BE in Computer Science, DBIT, Mumbai University

Certifications

Microsoft Certified: Azure Data Engineer Associate
Microsoft Certified: Azure Data Fundamentals
Databricks Certified Associate Developer for Apache Spark 3.0