Lavian Dsouza

Data Engineer

+971 54 752 6875
Abu Dhabi, UAE

Mathematical Problem Solver & Data Engineering Expert

Results-driven Data Engineer with 8+ years of experience designing and optimizing scalable data pipelines, ETL workflows, and analytical systems. Expert in applying graph theory, statistical inference, and operations research to deliver actionable insights, reduce processing times by up to 70%, and enhance system efficiency. Proficient in Python, SQL, Apache Spark, and cloud platforms (AWS, Azure, GCP, Oracle), with strong expertise in data governance, automation, and predictive analytics.

Professional Experience

Data Analyst (Data Engineering)
AD Ports Group
Sep 2024 – Present
  • Designed and implemented scalable data pipelines integrating Microsoft Dynamics 365 and Oracle Fusion ERP using Python, SQL, and Apache Airflow, reducing reporting time by 40% and overstocking by 20%.
  • Developed automated Power BI dashboards with Kafka streaming integration and Great Expectations for data quality validation, improving operational efficiency across 10+ departments by 15-20%.
  • Built predictive models of procurement trends using Prophet, Lasso, and Ridge regression, and optimized Trino/Presto queries and Delta Lake storage, achieving 15-20% efficiency gains in data processing.
  • Automated cross-department ETL workflows with Apache Airflow and DataHub for metadata management, reducing manual effort by 70% and enhancing data governance compliance.
Technical Data Analyst (ERP + CRM)
GlowTouch Technologies
Sep 2021 – May 2024
  • Built data extraction and ETL pipelines with SQL and Python, integrating HubSpot, Salesforce, and Microsoft Dynamics 365, cutting troubleshooting time by 35% and improving data delivery speed by 20%.
  • Automated log analysis and ETL scripts using Apache Spark, resulting in 30% process efficiency improvements and scalable infrastructure for high-volume data handling.
  • Developed internal compliance dashboards and fraud-monitoring tools using Power BI and Elasticsearch, ensuring data integrity and reducing response times by 25% through QA coordination and governance practices.
Data Associate (Data Handling & Automation)
Amazon
Aug 2020 – Feb 2021
  • Managed high-volume customer data with ETL workflows in Excel and SQL, reducing invoice dispute resolution time by 25% and enhancing audit efficiency by 15%.
  • Supported backend incident diagnostics with Python pipelines and MongoDB integration, incorporating basic ML for trend analysis to improve SLA adherence by 20%.
Data Representative – Technical Support
Concentrix
Nov 2017 – Oct 2019
  • Analyzed system logs and automated data processes with Python and batch scripts, reducing manual effort by 30% and improving issue resolution by 25%.
  • Managed SAP CRM data flows and ETL processes, and supported network security audits and infrastructure maintenance using Active Directory and Linux tools, maintaining 95% uptime.
Technical Support Engineer
Anmol Solution
Jun 2016 – Aug 2017
  • Managed server operations, SQL pipelines, VPN setups, and Active Directory configurations, achieving 95% system uptime and reducing downtime by 40%.
  • Installed and maintained hardware/software, optimizing data storage and retrieval through Python scripting, improving operational efficiency by 30% for small business clients.

Education

Master of Science in Mathematics
St. Aloysius College (Deemed to be University), Mangalore
2015
  • Algebra, Analysis, Topology
  • Differential Equations, Graph Theory
  • Operations Research, Functional Analysis
  • Cryptography, Numerical Analysis
  • Advanced problem-solving and theorem proving
  • Mathematical modeling with Python, MATLAB, LaTeX
Bachelor of Science
St. Aloysius College (Deemed to be University)
2013
  • Physics, Chemistry & Mathematics
  • Algebra, Analysis, Topology
  • Differential Equations, Graph Theory
  • Operations Research
  • Computational tools (Python, MATLAB, LaTeX)
  • Data analysis and statistical methods

Professional Certifications

IBM Data Engineering Professional Certificate
Python, SQL, ETL, Apache Spark, Data Pipelines
Snowflake Data Engineering Professional Certificate
Snowflake, SQL, Cloud Data Warehousing
Google Cloud Data Analytics Professional Certificate
BigQuery, Dataproc, Vertex AI, Data Pipelines (In Progress)
AWS Developer Specialization
AWS Lambda, EC2, S3, Cloud Development
Hadoop & Spark Fundamentals
Hadoop, Spark, MapReduce (In Progress)
Google Advanced Data Analytics Professional Certificate
Python, ML, PACE Framework, Data Visualization
Google IT Automation with Python Professional Certificate
Python, Git, Automation, Scripting
Tableau Business Intelligence Analyst Professional Certificate
Tableau, Dashboards, Data Visualization

Technical Skills

Programming & Languages

Python, SQL, R, DAX, Bash, PowerShell, Pandas, NumPy, Scikit-learn, Matplotlib, Plotly

Data Engineering & ETL

Apache Spark, Flink, Dask, Ray, dbt, Airbyte, Debezium, Airflow, Prefect, Dagster, Great Expectations

Databases & Storage

PostgreSQL, MySQL, CockroachDB, TiDB, Redis, MongoDB, Neo4j, Cassandra, Elasticsearch, InfluxDB, DuckDB, ClickHouse

Data Lakes & OLAP

Delta Lake, Iceberg, Hudi, Trino/Presto, Druid, Doris/StarRocks

Analytics & Visualization

Power BI, Tableau, Superset, Metabase, Redash, Dash, Streamlit

ML & AI Tools

MLflow, Kubeflow, DVC, Regression, Time Series, NLP, Prophet, Lasso, Ridge

Cloud Platforms

AWS, Azure, GCP, Oracle Cloud, Lambda, EC2, S3, BigQuery, Dataproc, Vertex AI

Tools & Platforms

Microsoft Dynamics 365, Oracle Fusion ERP, Salesforce, HubSpot, SAP CRM, Kafka, DataHub, Git, Docker, Kubernetes