25 (not boring) Data portfolio project ideas

I combed through Kaggle to find Datasets that would make for fun & interesting Data portfolio projects.

Here are 25 prompts for your self-driven Data projects.

Topic 1: Fast Food Nutrition

As much as we might try to avoid it, Fast Food is all around us. The question is: how bad is Fast Food for us? Are the “healthy” options at these famous Fast Food restaurants actually healthy?

  • Project idea 1: Develop a fast food recommendation system based on a user’s nutritional goals (e.g., high protein, low calorie).

    Skills: Feature Engineering, Machine Learning, Python

  • Project idea 2: Analyze and visualize nutritional trends across different fast food chains. Provide recommendations for items that are healthy and items to avoid.

    Skills: Exploratory Data Analysis (EDA), Data Visualization, Statistical Analysis, Python

  • Project idea 3: Analyze the relationship between different nutrients in fast food items, including correlations and trade-offs of nutrients.

    Skills: Exploratory Data Analysis (EDA), Statistical Analysis, Data Visualization, Python

  • Project idea 4: Create an interactive dashboard for consumers to explore and compare nutritional information across different fast food items and restaurants.

    Skills: Data Visualization, SQL, Python

  • Project idea 5: Conduct a clustering analysis to group similar fast food items based on their nutritional profiles, potentially uncovering hidden patterns in menu offerings.

    Skills: Exploratory Data Analysis, Unsupervised Machine Learning, Python
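
To make Project idea 1 more concrete, here is a minimal sketch of a goal-based recommender in plain Python. The menu rows, the field names (`calories`, `protein_g`), and the `recommend` helper are made-up placeholders, not values or columns from the Kaggle dataset. Swap in the real columns once you have loaded the data.

```python
# Hypothetical menu rows for illustration only; replace with the real dataset.
MENU = [
    {"item": "Grilled chicken wrap", "calories": 350, "protein_g": 28},
    {"item": "Double cheeseburger", "calories": 740, "protein_g": 40},
    {"item": "Garden salad", "calories": 120, "protein_g": 4},
    {"item": "Large fries", "calories": 510, "protein_g": 6},
]


def recommend(menu, max_calories, top_n=3):
    """Rank items under a calorie cap by grams of protein per 100 calories."""
    # Skip zero-calorie rows so the ratio below never divides by zero.
    eligible = [row for row in menu if 0 < row["calories"] <= max_calories]
    ranked = sorted(
        eligible,
        key=lambda row: row["protein_g"] / row["calories"] * 100,
        reverse=True,
    )
    return [row["item"] for row in ranked[:top_n]]


print(recommend(MENU, max_calories=600))
```

A simple ratio like this is only a baseline. A fuller project could add more goals (sodium, sugar, fat) and let the user weight them.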

Topic 2: Airbnb listings and reviews

Arguably the most common hobby among Millennials and Gen Zs is traveling. Let’s go deep into Airbnb’s listings and find out what makes a listing successful.

  • Project idea 6: Develop a price prediction model for Airbnb listings based on property features and host characteristics. Make a recommendation for optimal prices based on these features.

    Skills: Feature Engineering, Machine Learning, Python

  • Project idea 7: Analyze the impact of superhost status on listing performance, including prices, ratings, and booking frequency.

    Skills: Exploratory Data Analysis, Statistical Analysis, Data Visualization, SQL, Python

  • Project idea 8: Create a recommendation system for Airbnb users based on listing features, user preferences, and review scores.

    Skills: Machine Learning, Feature Engineering, SQL, Python

  • Project idea 9: Build a clustering model to segment Airbnb listings based on their amenity offerings, and analyze how these clusters relate to pricing and guest satisfaction.

    Skills: Unsupervised Machine Learning, Data Visualization, Python

  • Project idea 10: Analyze the geographical distribution of Airbnb listings and their characteristics, creating interactive maps to visualize pricing, ratings, and property types across different neighborhoods.

    Skills: Exploratory Data Analysis, Data Visualization, SQL, Python
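
For Project idea 7, a first pass could be as simple as comparing medians between superhosts and other hosts. The sketch below assumes each listing has a boolean `superhost` flag, a `price`, and a `rating`. Those field names and the sample rows are invented for illustration and may not match the dataset's columns.

```python
from statistics import median

# Hypothetical listings for illustration only; replace with the real dataset.
LISTINGS = [
    {"superhost": True, "price": 120.0, "rating": 4.9},
    {"superhost": True, "price": 95.0, "rating": 4.8},
    {"superhost": False, "price": 80.0, "rating": 4.5},
    {"superhost": False, "price": 110.0, "rating": 4.2},
    {"superhost": False, "price": 70.0, "rating": 4.6},
]


def summarize_by_superhost(listings):
    """Return listing count, median price, and median rating per group."""
    groups = {True: [], False: []}
    for row in listings:
        groups[row["superhost"]].append(row)

    summary = {}
    for flag, rows in groups.items():
        if not rows:  # skip a group with no listings
            continue
        summary[flag] = {
            "n": len(rows),
            "median_price": median(r["price"] for r in rows),
            "median_rating": median(r["rating"] for r in rows),
        }
    return summary


for flag, stats in summarize_by_superhost(LISTINGS).items():
    print("superhost" if flag else "regular host", stats)
```

A gap in medians does not show that superhost status causes it. Location and property type can differ between the groups, so a statistical test or a model that controls for those features is a natural next step.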

Topic 3: Summer Olympics

The Summer Olympics are upon us, and it seems like that’s all everyone is talking about. How about we ride that wave of excitement and build a project in this space?

  • Project idea 11: Build models to predict a country’s medal count in the current 2024 Olympics. Compare different ML algorithms to identify the most accurate predictive model.

    Skills: Feature Engineering, Machine Learning, Python

  • Project idea 12: Build an interactive dashboard to explore each country's performance over time. Identify up-and-coming countries in the Olympics and the sports they excel in.

    Skills: Time Series Analysis, Data Visualization, SQL, Python

  • Project idea 13: Analyze the evolution of gender representation in Olympic sports, highlighting trends in equality over time.

    Skills: Statistical Analysis, Time Series Analysis, SQL, Python

  • Project idea 14: Analyze athlete performance to identify patterns of dominance, versatility across events, and career longevity.

    Skills: Exploratory Data Analysis, Statistical Analysis, Data Visualization

  • Project idea 15: Enrich the Olympic dataset with external information like country demographics and recent performances.

    Skills: Data Integration, Web Scraping, Python
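
As one possible starting point for Project idea 13, the sketch below computes the share of women among athlete entries for each Games. It assumes the data can be reduced to `(year, sex)` pairs with `"F"` and `"M"` codes. That shape and the sample entries are assumptions for illustration, not the dataset's actual schema.

```python
from collections import defaultdict

# Hypothetical (year, sex) athlete entries for illustration only.
ENTRIES = [
    (1996, "F"), (1996, "M"), (1996, "M"),
    (2016, "F"), (2016, "F"), (2016, "M"), (2016, "M"),
]


def female_share_by_year(entries):
    """Return {year: fraction of entries marked "F"}, ordered by year."""
    totals = defaultdict(int)
    women = defaultdict(int)
    for year, sex in entries:
        totals[year] += 1
        if sex == "F":
            women[year] += 1
    return {year: women[year] / totals[year] for year in sorted(totals)}


for year, share in female_share_by_year(ENTRIES).items():
    print(year, f"{share:.0%}")
```

The same grouping can be repeated per sport to see where representation changed fastest.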

Topic 4: Movies

Who doesn’t love movies? They transport us to whole different worlds and universes. This comprehensive dataset on Movies gives us an opportunity to go deeper into the trends in the Movie industry.

  • Project idea 16: Analyze factors influencing movie revenue and create a predictive model for box office success.

    Skills: Exploratory Data Analysis, Data Visualization, Supervised Machine Learning, Feature Engineering, Python

  • Project idea 17: Develop a movie recommendation system using collaborative filtering on user ratings, and consider combining it with movie features in a hybrid approach.

    Skills: Unsupervised Machine Learning, Feature Engineering, Python, SQL

  • Project idea 18: Perform sentiment analysis on movie overviews and taglines to explore their relationship with audience ratings.

    Skills: Natural Language Processing, Data Visualization, Statistical Analysis, Python

  • Project idea 19: Investigate trends in movie genres, production countries, and languages over time to identify industry shifts.

    Skills: Time Series Analysis, Data Visualization, Exploratory Data Analysis, SQL

  • Project idea 20: Build a classification model to predict a movie's genre based on its features and evaluate its performance.

    Skills: Supervised Machine Learning, Feature Engineering, Model Evaluation and Validation, Python
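
To give a feel for Project idea 17, here is a small sketch of item-based collaborative filtering using cosine similarity between movies' rating vectors. The users, movie titles, and ratings are invented, and the helper names are hypothetical. Treating a missing rating as zero is a simplifying assumption that a real project would want to revisit.

```python
from math import sqrt

# Hypothetical user -> {movie: rating} data for illustration only.
RATINGS = {
    "ana": {"Movie A": 5, "Movie B": 4, "Movie C": 1},
    "ben": {"Movie A": 4, "Movie B": 5},
    "cy": {"Movie A": 1, "Movie C": 5},
}


def item_vectors(ratings):
    """Pivot user -> movie ratings into movie -> {user: rating}."""
    vectors = {}
    for user, user_ratings in ratings.items():
        for movie, score in user_ratings.items():
            vectors.setdefault(movie, {})[user] = score
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; missing entries count as 0."""
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(movie, ratings):
    """Return the other movie whose rating vector is closest to `movie`."""
    vectors = item_vectors(ratings)
    others = [m for m in vectors if m != movie]
    return max(others, key=lambda m: cosine(vectors[movie], vectors[m]))


print(most_similar("Movie A", RATINGS))
```

From here, recommending to a user means scoring the movies they have not rated by their similarity to the ones they rated highly.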

Topic 5: Mental health

Mental health is a topic I’m highly passionate about. Can we figure out trends in the mental health space and make recommendations to help alleviate this global health issue?

  • Project idea 21: Analyze global trends in mental health disorders and create interactive visualizations to showcase prevalence changes over time.

    Skills: Time Series Analysis, Data Visualization, Exploratory Data Analysis, Python

  • Project idea 22: Investigate gender disparities in mental health across countries and develop statistical models to identify significant factors.

    Skills: Statistical Analysis, Data Visualization, Feature Engineering, Python

  • Project idea 23: Build and compare multiple machine learning models to predict mental disorder prevalence based on demographic and socioeconomic factors.

    Skills: Supervised Machine Learning, Feature Engineering, Model Evaluation and Validation, Python

  • Project idea 24: Perform correlation analysis between mental disorder prevalence and Disability-Adjusted Life Years (DALYs) to quantify the impact of mental health on overall disease burden.

    Skills: Exploratory Data Analysis, Statistical Analysis, Data Visualization, Python

  • Project idea 25: Apply clustering algorithms to group countries based on their mental health profiles and visualize the results using dimensionality reduction techniques.

    Skills: Unsupervised Machine Learning, Feature Engineering, Data Visualization, Python
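
For Project idea 24, the core calculation is a Pearson correlation between two columns. The sketch below writes it out by hand so each step is visible. The variable names and the per-country numbers are placeholders invented for illustration, not figures from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-country values for illustration only.
prevalence_pct = [3.1, 4.0, 4.8, 5.5]
dalys_per_100k = [410.0, 520.0, 610.0, 700.0]

print(round(pearson(prevalence_pct, dalys_per_100k), 3))
```

In practice, pandas or SciPy can compute this in one call. A scatter plot alongside the coefficient helps catch non-linear relationships that a single number hides.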

Thanks for reading and for sharing your most valuable resource, your time, with me.

If you’re not already following me on LinkedIn, follow me here. I post every day about the Data career.