Kaustubh S

Azure Databricks Engineer


Kaustubh S is an experienced Azure Databricks Engineer, recognized for his advanced skills in managing and optimizing big data workloads in cloud environments. His expertise in building scalable data pipelines, performing in-depth troubleshooting, and improving data processing efficiency has been crucial in driving data-driven insights and operational excellence. Kaustubh is available on an hourly, monthly, or quarterly basis to help you streamline your Azure Databricks infrastructure and tackle complex data engineering challenges.

Hire Now

Responsibilities

  • Currently working as an Azure Data Engineer; developed pipelines and code using Azure Data Factory and Synapse to ingest data from multiple sources and transformed it in Databricks according to end users' needs.

  • Experienced in creating Linked Services to fetch data from sources such as SQL Server and Oracle databases.

  • Experienced in fetching data from various APIs and loading it into Azure Data Lake Storage.

  • Designed and developed Delta Live Tables (ETL) pipelines in Databricks to ingest, store, and process data from multiple sources following the Medallion architecture (Bronze, Silver, Gold).

  • Performed data-quality analyses and applied business rules across all layers of the extract, transform, and load (ETL) process.

  • Created Azure DevOps CI/CD pipelines to deploy code to higher environments; proficient in code deployments across development, testing, and production environments.

  • Experienced with Azure Blob Storage and ADLS containers, Data Lake Storage, and Delta tables in Databricks, including loading data into containers and managing their folder structures.

  • Experienced in converting SQL code from source systems into Data Flow transformations in Azure Data Factory; used ADF Data Flows to build data-movement pipelines.

  • Experienced in Azure Databricks; handled semi-structured data from API-based source systems and relational (Oracle and SQL Server) source systems.

  • Experienced in PySpark programming; used Python and Spark to transform and integrate data in Azure Databricks.

  • Experienced in streaming data; worked with Databricks Auto Loader (cloudFiles) and designed the orchestration framework in Azure Databricks.

  • Created triggers, schedules, and job dependencies for pipelines; set up email notifications to alert on job or data-processing failures at any point, using Azure Logic Apps for alerting.

  • Implemented the ABC (Audit, Balance, Control) framework to ensure data consistency and accuracy between source and sink systems.

  • Designed a testing framework using PySpark test utilities to validate code functionality with assert-based unit tests in Databricks notebooks.

  • Tested data pipelines using both standard test templates and custom test cases (unit and functional).
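
The data-quality work described above can be sketched as a small rule applied between layers. This is a minimal illustration, not code from the profile: the record fields (`customer_id`, `amount`) and the rule itself are hypothetical, and in Databricks the same predicate would typically be applied to a DataFrame with `filter` rather than to Python dicts.

```python
# Minimal sketch of a data-quality business rule applied during ETL.
# Field names (customer_id, amount) are hypothetical examples.

def is_valid_record(rec: dict) -> bool:
    """Business rule: a record must carry a customer_id and a positive amount."""
    return rec.get("customer_id") is not None and rec.get("amount", 0) > 0

def split_by_quality(records):
    """Partition records into (valid, rejected) before loading the next layer."""
    valid = [r for r in records if is_valid_record(r)]
    rejected = [r for r in records if not is_valid_record(r)]
    return valid, rejected
```

Keeping the rejected records, rather than silently dropping them, makes it possible to report on rule failures in each layer of the pipeline.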
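
The ABC (Audit, Balance, Control) framework mentioned above typically reconciles row counts and column totals between source and sink. The sketch below assumes that pattern; the function name, keys, and the choice of a sum-based balance check are illustrative, not taken from the profile.

```python
# Sketch of an Audit-Balance-Control (ABC) reconciliation: compare row counts
# and a column total between source and sink extracts. Names are illustrative.

def balance_check(source_rows, sink_rows, amount_key="amount"):
    """Return a small audit record comparing a source and sink extract."""
    src_count, snk_count = len(source_rows), len(sink_rows)
    src_sum = round(sum(r.get(amount_key, 0) for r in source_rows), 2)
    snk_sum = round(sum(r.get(amount_key, 0) for r in sink_rows), 2)
    return {
        "source_count": src_count,
        "sink_count": snk_count,
        "counts_match": src_count == snk_count,
        "sums_match": src_sum == snk_sum,
    }
```

In practice the audit record would be written to a control table and a mismatch would trigger the alerting described above (e.g. via a Logic App notification).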
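
The assert-based unit-testing pattern described above can be shown with a self-contained sketch. In a Databricks notebook the inputs would be DataFrames and the assertions would run against PySpark results; plain dicts and a hypothetical `standardize_names` transform are used here so the example runs anywhere.

```python
# Sketch of an assert-based unit test for a transformation, mirroring the
# notebook testing pattern above. The transform and column are hypothetical.

def standardize_names(rows):
    """Transformation under test: trim and upper-case the name column."""
    return [{**r, "name": r["name"].strip().upper()} for r in rows]

def test_standardize_names():
    raw = [{"name": "  alice "}, {"name": "Bob"}]
    out = standardize_names(raw)
    assert [r["name"] for r in out] == ["ALICE", "BOB"]
    assert len(out) == len(raw)  # the transform must not drop rows

test_standardize_names()
```

The same shape works for both unit tests (one transform, fixed input) and functional tests (a whole pipeline stage, representative input), matching the standard-template and custom-case split mentioned above.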
