Kaustubh S

Azure Databricks Engineer


Kaustubh S is an experienced Azure Databricks Engineer, recognized for his advanced skills in managing and optimizing big data workloads in cloud environments. His expertise in building scalable data pipelines, performing in-depth troubleshooting, and improving data processing efficiency has been crucial in driving data-driven insights and operational excellence. Kaustubh is available on an hourly, monthly, or quarterly basis to help you streamline your Azure Databricks infrastructure and tackle complex data engineering challenges.

Hire Now

Responsibilities

  • Currently working as an Azure Data Engineer; developed pipelines and code using Azure Data Factory and Synapse to ingest data from multiple sources and transformed it in Databricks according to end users' needs.

  • Experienced in creating Linked Services to fetch data from sources such as SQL Server and Oracle databases.

  • Experienced in fetching data from various APIs and loading it into Azure Data Lake Storage.

  • Designed and developed Delta Live Tables (ETL) pipelines in Databricks to ingest, store, and process data from multiple sources following the Medallion architecture (Bronze, Silver, Gold).

  • Performed data-quality analyses and applied business rules across all layers of the extract, transform, and load (ETL) process.

  • Created Azure DevOps CI/CD pipelines to deploy code to higher environments; proficient in code deployments across development, testing, and production environments.

  • Experienced with Azure Blob Storage and ADLS containers, Data Lake Storage, and Delta tables in Databricks, including loading data into containers and managing their folder structures.

  • Experienced in converting SQL code from source systems into Data Flow transformations in Azure Data Factory; used ADF Data Flows to build data-movement pipelines.

  • Experienced in Azure Databricks; handled semi-structured data from API-based source systems and relational (Oracle and SQL Server) source systems.

  • Experienced in PySpark programming; used Python and Spark to transform and integrate data in Azure Databricks.

  • Experienced in streaming data; worked with Databricks Auto Loader (cloudFiles) and designed the orchestration framework in Azure Databricks.

  • Created triggers, schedules, and job dependencies for pipelines; set up email notifications to alert on job or data-processing failures at any point, using Azure Logic Apps for alerting.

  • Implemented the ABC (Audit, Balance, Control) framework to ensure data consistency and accuracy between source and sink systems.

  • Designed a testing framework using PySpark test utilities to validate code functionality with assert-based unit tests in Databricks notebooks.

  • Tested data pipelines using both standard test templates and custom test cases (unit and functional).
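
The data-quality work described above can be sketched as a small rule applied between layers. This is a minimal illustration, not code from the profile: the record fields (`customer_id`, `amount`) and the rule itself are hypothetical, and in Databricks the same predicate would typically be applied to a DataFrame with `filter` rather than to Python dicts.

```python
# Minimal sketch of a data-quality business rule applied during ETL.
# Field names (customer_id, amount) are hypothetical examples.

def is_valid_record(rec: dict) -> bool:
    """Business rule: a record must carry a customer_id and a positive amount."""
    return rec.get("customer_id") is not None and rec.get("amount", 0) > 0

def split_by_quality(records):
    """Partition records into (valid, rejected) before loading the next layer."""
    valid = [r for r in records if is_valid_record(r)]
    rejected = [r for r in records if not is_valid_record(r)]
    return valid, rejected
```

Keeping the rejected records, rather than silently dropping them, makes it possible to report on rule failures in each layer of the pipeline.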
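
The ABC (Audit, Balance, Control) framework mentioned above typically reconciles row counts and column totals between source and sink. The sketch below assumes that pattern; the function name, keys, and the choice of a sum-based balance check are illustrative, not taken from the profile.

```python
# Sketch of an Audit-Balance-Control (ABC) reconciliation: compare row counts
# and a column total between source and sink extracts. Names are illustrative.

def balance_check(source_rows, sink_rows, amount_key="amount"):
    """Return a small audit record comparing a source and sink extract."""
    src_count, snk_count = len(source_rows), len(sink_rows)
    src_sum = round(sum(r.get(amount_key, 0) for r in source_rows), 2)
    snk_sum = round(sum(r.get(amount_key, 0) for r in sink_rows), 2)
    return {
        "source_count": src_count,
        "sink_count": snk_count,
        "counts_match": src_count == snk_count,
        "sums_match": src_sum == snk_sum,
    }
```

In practice the audit record would be written to a control table and a mismatch would trigger the alerting described above (e.g. via a Logic App notification).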
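
The assert-based unit-testing pattern described above can be shown with a self-contained sketch. In a Databricks notebook the inputs would be DataFrames and the assertions would run against PySpark results; plain dicts and a hypothetical `standardize_names` transform are used here so the example runs anywhere.

```python
# Sketch of an assert-based unit test for a transformation, mirroring the
# notebook testing pattern above. The transform and column are hypothetical.

def standardize_names(rows):
    """Transformation under test: trim and upper-case the name column."""
    return [{**r, "name": r["name"].strip().upper()} for r in rows]

def test_standardize_names():
    raw = [{"name": "  alice "}, {"name": "Bob"}]
    out = standardize_names(raw)
    assert [r["name"] for r in out] == ["ALICE", "BOB"]
    assert len(out) == len(raw)  # the transform must not drop rows

test_standardize_names()
```

The same shape works for both unit tests (one transform, fixed input) and functional tests (a whole pipeline stage, representative input), matching the standard-template and custom-case split mentioned above.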
