Hi. I am Vishal Longani.

Hey! I'm an experienced Data Engineer with years of expertise in building fault-tolerant systems for high-volume data. My skillset spans ETL pipelines, streaming data, and the fascinating world of distributed computing.

☁️ A maestro in Azure and AWS, I orchestrate data symphonies with PySpark and Airflow. Fluent in Python, I navigate data seamlessly using Kafka's magic.

📊 I'm not just crunching numbers; I'm a master storyteller with SQL queries, shaping real-time insights. As a web scraping maestro, I extract gems from the digital landscape.

📈 Bringing data to life with dazzling dashboards using Plotly Dash, I transform the web into a captivating stage for insightful performances! 🎭✨

Projects

Freelance Trends

Freelance Trends provides ongoing and historical analysis of over 100,000 Upwork projects across 20+ categories, refreshed daily.

  • Python
  • Docker
  • Azure Databricks
  • Azure Table Storage
  • Apache Airflow

RealtimeClickStreamETL

A clickstream ETL (Extract, Transform, Load) pipeline that consumes real-time data from a Kafka and ZooKeeper cluster. Python handles the initial processing and enrichment before the raw clickstream data is stored in Apache Cassandra. PySpark on Databricks then applies more complex transformations to the raw data in Cassandra, and the processed output is stored in Elasticsearch for efficient querying and analysis.

  • Python
  • Docker
  • Azure Databricks (PySpark)
  • Apache Cassandra
  • Elasticsearch
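
The initial Python enrichment step described above might look roughly like the sketch below. It is only an illustration: the event schema (a "url" field and an epoch-seconds "ts" field) and the function name enrich_click_event are assumptions, not taken from the project, and the Kafka consumer and Cassandra writer are left out so the example runs on its own.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlparse


def enrich_click_event(raw: bytes) -> dict:
    """Parse one raw clickstream message and add derived fields.

    Assumed (hypothetical) schema: JSON with a "url" string and a
    "ts" field holding epoch seconds.
    """
    event = json.loads(raw)
    parsed = urlparse(event["url"])
    # Derived fields that make later querying easier.
    event["domain"] = parsed.netloc
    event["path"] = parsed.path or "/"
    event["event_time"] = datetime.fromtimestamp(
        event["ts"], tz=timezone.utc
    ).isoformat()
    return event


# Stand-in for a message value read from a Kafka topic.
sample = json.dumps(
    {"user_id": "u1", "url": "https://shop.example.com/cart?id=7", "ts": 0}
).encode()
enriched = enrich_click_event(sample)
```

In a real pipeline a function like this would be called on each consumed message before the row is written to the raw table.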

Tech Detector

A distributed system that detects and analyzes technologies used across 200M+ domains. Utilizes Playwright for HTML extraction through proxies, along with predefined patterns and regex matching to identify web technologies. Built with Python and RabbitMQ for distributed processing across multiple servers.

  • Python
  • Playwright
  • RabbitMQ
  • Elasticsearch
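
To illustrate the pattern-matching idea, here is a minimal sketch of regex-based detection over already-fetched HTML. The three signatures and the detect_technologies helper are hypothetical examples, not the project's actual pattern set, and fetching pages with Playwright and distributing work over RabbitMQ are out of scope here.

```python
import re

# Hypothetical fingerprints: each technology maps to a regex that is
# searched for anywhere in the page's HTML.
SIGNATURES = {
    "WordPress": re.compile(r"/wp-(?:content|includes)/", re.IGNORECASE),
    "jQuery": re.compile(r"jquery[\w.-]*\.js", re.IGNORECASE),
    "Google Analytics": re.compile(
        r"googletagmanager\.com/gtag/js|google-analytics\.com/analytics\.js",
        re.IGNORECASE,
    ),
}


def detect_technologies(html: str) -> list[str]:
    """Return the sorted names of all technologies whose pattern matches."""
    return sorted(
        name for name, pattern in SIGNATURES.items() if pattern.search(html)
    )


html = '<script src="/wp-content/themes/x/jquery-3.6.0.min.js"></script>'
found = detect_technologies(html)
```

Precompiling the patterns once keeps the per-page cost low, which matters when the same set is applied to a very large number of domains.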

Moving-Average-on-Streaming-Data-using-Kafka

Calculates a real-time moving average of risk scores in loan data.

  • Python
  • Kafka
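
A simple way to keep a moving average over a stream is a fixed-size window with a running total, sketched below. The class name MovingAverage and the window size are assumptions for illustration; in the project the scores would arrive from a Kafka topic rather than a Python list.

```python
from collections import deque


class MovingAverage:
    """Running mean over the most recent `window` values."""

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self._values = deque(maxlen=window)
        self._total = 0.0

    def update(self, value: float) -> float:
        # When the window is full, the oldest value is about to be
        # evicted by append(), so remove it from the running total first.
        if len(self._values) == self._values.maxlen:
            self._total -= self._values[0]
        self._values.append(value)
        self._total += value
        return self._total / len(self._values)


ma = MovingAverage(window=3)
# Stand-in for risk scores consumed one at a time from a stream.
averages = [ma.update(score) for score in [10, 20, 30, 40]]
```

Each update is constant time because only the evicted and the incoming value touch the running total.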

Ecommerce Website (Olyst) Data Engineering

Builds a data warehouse on Google BigQuery for ecommerce data, with SQL queries for data analysis.

  • Python
  • Google BigQuery
  • Data Modeling
  • SQL

Skills

Get in touch