just a lil assortment of projects i've done over the years.
After conducting data mining on a Salto dataset, I created a system for detecting users who share their account with multiple people. The system will later be applied to a brand recommendation engine.
With Numerama's Cyberguerre, I processed and analysed a leaked dataset to explore the extent of a cyber attack that affected approximately 500,000 people.
This project aimed to take Google Analytics Customer Revenue data and clean it up by removing missing values and engineering new features. Visualisations were then created to explore the data.
This project used various classification models — logistic regression, decision trees — to predict user conversion rate.
Starting with a list of the 35 best cities to visit in France, this project used APIs and web scraping techniques to determine which cities and hotels would be the best to travel to in the next week based on forecasted weather conditions. The collected data was then stored in an AWS S3 bucket and loaded into an RDS database.
In this project, I used machine learning techniques (linear regression, decision tree classification, and gradient boosting) to detect fake news.
After taking YouTube playlog data from an S3 bucket or extracting it using the YouTube API, I loaded it into a PySpark DataFrame, cleaned the information, and then saved the processed DataFrame back to S3 and Redshift.
This project answers the questions presented in task one of the Kaggle project COVID-19 World Vaccination Progress.
Using unsupervised machine learning techniques and data provided by Uber, this project aimed to recommend to drivers which hot zones in major cities to wait in so that they can pick up riders within 5-7 minutes of a ride request.
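For a rough idea of what "finding hot zones" can look like, here's a minimal sketch. It assumes k-means clustering, which is just one unsupervised technique and not necessarily the one used in the project. The pickup coordinates are made up, and the `kmeans` helper is hypothetical.

```python
from math import dist

# Made-up (latitude, longitude) pickup points, for illustration only.
pickups = [
    (40.75, -73.99), (40.64, -73.78), (40.76, -73.98),
    (40.65, -73.79), (40.74, -73.99), (40.64, -73.77),
]

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (an assumption)."""
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

centroids, clusters = kmeans(pickups, k=2)
# Each centroid is a candidate hot zone; bigger clusters mean more demand.
for centre, members in zip(centroids, clusters):
    print(centre, len(members))
```

Straight-line distance on raw latitude and longitude is a simplification that is only reasonable within a single city. Real pickup data would also need a time-of-day split, since hot zones move during the day.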
This project uses various classifiers -- random forest, SVM -- to predict fraudulent bank activity.
In this project, I built a movie recommendation system from scratch that uses the past ratings made by a user to predict new titles that they will likely enjoy.
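To give a feel for the idea, here's a minimal sketch of one common way to do this: user-based collaborative filtering with cosine similarity. That choice is an assumption made for illustration, not a description of the project's actual model. The ratings are invented, and the names `cosine` and `recommend` are hypothetical.

```python
from math import sqrt

# Invented ratings on a 1-5 scale: user -> {title: rating}.
ratings = {
    "ana": {"Alien": 5, "Heat": 4, "Up": 1},
    "ben": {"Alien": 4, "Heat": 5, "Up": 2, "Ronin": 5},
    "cat": {"Alien": 1, "Up": 5, "Coco": 4},
}

def cosine(a, b):
    """Cosine similarity computed over the titles both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(a[t] ** 2 for t in shared))
    norm_b = sqrt(sum(b[t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=3):
    """Rank unseen titles by the similarity-weighted average of other users' ratings."""
    seen = ratings[user]
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, other_ratings)
        if sim <= 0:
            continue
        for title, score in other_ratings.items():
            if title in seen:
                continue  # only predict titles the user has not rated yet
            totals[title] = totals.get(title, 0.0) + sim * score
            weights[title] = weights.get(title, 0.0) + sim
    predicted = {t: totals[t] / weights[t] for t in totals}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

print(recommend("ana", ratings))
```

With only positive ratings, cosine similarity over shared titles is always above zero, so every other user contributes something. Subtracting each user's average rating first (mean-centring) is a common refinement.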
Test your knowledge about my pets with this interactive quiz!