Portfolio

On this page, I list some of the fundamental skills I have learned over the years. I also showcase some personal projects I am developing out of my self-driven passion for data.

Feel free to open a pull request if you spot anything to fix or improve. Thank you!


Core Competencies

  • Soft skills : Highly organized, proactive, meticulous, team-oriented, tech-savvy, resilient, solution-focused
  • Methodologies : Time Series Analysis, Statistical Methods, Probability, EDA, CRUD, Machine Learning, Web Scraping
  • Languages : Python, SQL, MATLAB
  • Tools
    • IDE : VS Code, Jupyter Notebook
    • Scripting : Shell, Bash
    • Version control : Git Bash, GitHub
    • DB management
      • SQL : SQL Server, PostgreSQL
      • NoSQL : MongoDB, Redis
    • Workflow orchestration : Mage, Airflow
    • Visualization : Excel, Tableau
    • Cloud resources
      • CDN : CloudFront
      • IaC : Terraform, CloudFormation
      • PaaS : Render, Beanstalk
      • Computing : Compute Engine, EC2, RDS
      • Storage : Cloud Storage, S3
      • Warehouse : BigQuery, Redshift
      • Analytics : Analytics/Looker Studio, Athena

Projects

    Data Analysis

    freeCodeCamp DA challenges

    These challenges gave me a solid, hands-on foundation in Data Analysis with Python. I loaded and cleaned flat files with pandas, performed mathematical operations and statistical analysis with NumPy, and created a variety of visualizations using Matplotlib and Seaborn. Note that instead of using Replit, I developed, debugged and validated the Python modules for each problem using VS Code on my local machine.

    SQL weekly case studies

    In this repository, I answer the questions from the case studies of the 8 Week SQL Challenge course by Danny Ma. Ranging from essential SQL syntax to more intricate structures built from CTEs, window functions and recursive queries, these challenges were a great way to consolidate my hands-on core knowledge of SQL.

    Web Scraping

    The purpose of this repository is to showcase some Web Scraping examples I created using Python with BeautifulSoup. I explain how to locate information in an HTML document with the browser's Inspector and parse it, so that it can be exported to CSV/TSV flat files or to more structured formats such as JSON. Some recommended practices are also described in this repo's Markdown file.

    Dashboard works

    In this repo, I showcase some of the dashboards I have built so far outside of work, mainly with Tableau. What these micro-projects have in common is the use of Data blending, LOD expressions, Calculated fields, Bin creation and Tooltip customization, among others. You can also go straight to my Tableau Public profile instead of browsing my GitHub repository.

    Data Engineering

    GCP Uber end-to-end pipeline

    In this project, I designed and documented the core stages of an end-to-end data pipeline using Google Cloud Platform as the main resource/service provider. This was an excellent opportunity to review key Data modeling concepts. The ETL process was orchestrated with Mage.

    Azure Tokyo Olympics end-to-end pipeline

    In this project, I built an end-to-end data pipeline using Microsoft Azure resources, from data ingestion with Data Factory to a few charts and graphs in Synapse Analytics. Apache Spark was used for data processing.

    DTC - Data Eng. Zoomcamp capstone project

    After brushing up on hands-on concepts in Containerization, Workflow orchestration, Data warehousing and Analytics engineering, in the final section of this bootcamp I am building an end-to-end data pipeline with OLAP integration, using Terraform for IaC and Cloud Run as the serverless compute platform, among other frameworks and resources.

    Data Science

    WorldQuant University Applied DS Lab projects

    In this repo, I summarize the syllabus of the hands-on Data Science courses I took at WQU. These consisted of 8 end-to-end DS micro-projects whose common themes were data ingestion from SQL/NoSQL databases and APIs, the design of ETL pipelines, the construction of supervised/unsupervised ML algorithms, and web app design, development and deployment.

    fCC Machine Learning challenges

    In this course, I used the TensorFlow framework to build several Neural Networks and explore more advanced techniques like Natural Language Processing and Reinforcement Learning. That helped me learn the principles behind how deep, recurrent, and convolutional Neural Networks operate. These micro-projects include a KNN Book Recommendation Engine and a Neural Network SMS Text Classifier, among others.

    End-to-End MLOps project

    In this project, I created a web app to predict the performance of students based on their academic background and past results. Once the source code was built, tested and debugged locally in a conda development environment, it was deployed to production on Beanstalk through CodePipeline.

    Statistical Methods and ML essays

    In this repository, I delve into some of the core concepts of Statistics and Probability I learned during my Bachelor of Science in Mechanical Engineering. Without that essential knowledge, it would have been harder to get up to speed with the most common methodologies and techniques used in Data Science, particularly in Machine Learning.
    Some of these subjects include:

Academic background