Alexandra's Data Science
Portfolio

Hi! My name is Alexandra Grecu and I'm a self-taught data science enthusiast who loves exploring data to find meaningful insights. I am constantly learning new things and my projects are all geared towards creating a positive impact on people's lives. Exciting work is always ongoing!

I have completed several projects across various Data Science fields, such as Machine Learning, Statistics and Natural Language Processing. I'm excited to share my work with you!

My journey in Data Science
Throughout the past 12 months and continuing to the present, I have been actively engaged in a variety of Data Science projects. I began my exploration of Data Science in January 2022 and have been dedicated to the field ever since.
After mastering each new concept, I have tackled new projects in order to solidify my understanding and build my skills. These projects have allowed me to put theory into practice, work through real-world scenarios, and make meaningful contributions towards solving problems that impact society.

Recipe Site Traffic

Category: Data Science

Tasty Bytes is a company founded in 2020, during the COVID-19 pandemic, as a recipe search engine to help people use up their limited supplies. It has since grown into a fully-fledged business. Currently, the team chooses their favorite recipe to display on the home page, which increases traffic to the rest of the website by as much as 40%. However, they don't know how to predict which recipes will be popular. The product manager has requested a solution that can predict high-traffic recipes with 80% accuracy.
Pens and Printers was founded in 1984 and provides high-quality office products to large organizations. Six weeks ago they launched a new line of office stationery. They have tested three different sales strategies for it: targeted email, phone calls, and a combination of the two. The business goal of this project is to identify the best sales strategy for the new product line.

Sales of products

Category: Data Analytics

National Road Safety Traffic

Category: Data Science

Road accidents represent a major problem in Romania, causing a very large number of injuries and deaths every year. The objective of this project is to study the road accidents that occurred in Romania during the year 2021 in order to better understand the factors that cause these accidents and what measures the authorities could take to minimize their number.
This project aims to utilize advanced tools and techniques in the natural language processing field to analyze 2000 Romanian reviews. The reviews were carefully examined and classified into positive and negative categories based on the sentiment expressed in the text. To accomplish this, I used a range of methods including sentiment analysis, machine learning algorithms, and other text analytics techniques to extract valuable insights from the reviews.

Sentiment analysis

Category: Data Science, NLP

Why are people leaving?

Category: Data Science

In this scenario, a corporation is grappling with a growing number of employees leaving, which has raised concerns among corporate management. To address this issue, the objective of this project is to identify the primary reasons why employees are leaving. The project will answer several key questions, such as which department experiences the most employee turnover and which variables have the greatest impact on turnover. Although this project does not involve Big Data encompassing all 5 Vs, it provides a valuable learning opportunity for working with PySpark.
The average global temperature has risen at a rate of 0.08 degrees Celsius per decade since 1880. In this project, I have been provided with a dataset containing average temperature values in Romania spanning from 1902 to 2020. My goal is to develop a time series model for this dataset and use it to forecast temperatures for the next 20 years.

Temperature prediction

Category: Data Science, Time Series
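To make the forecasting idea in the temperature project concrete, here is a minimal baseline sketch. It is not the model used in the project (the description does not say which one was chosen); it simply fits a straight-line trend by least squares and extends it 20 years ahead. The function names `fit_trend` and `forecast` are hypothetical, and the series is synthetic: a made-up 9.0 °C starting value rising by 0.008 °C per year, used only so the result can be checked.

```python
def fit_trend(years, temps):
    """Least-squares straight line temps ~ intercept + slope * year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(years, temps, horizon):
    """Extend the fitted trend `horizon` years past the last observation."""
    slope, intercept = fit_trend(years, temps)
    last = years[-1]
    return {last + h: intercept + slope * (last + h) for h in range(1, horizon + 1)}


# Synthetic, noise-free series (assumption, not the real Romanian data).
years = list(range(1902, 2021))
temps = [9.0 + 0.008 * (y - 1902) for y in years]

predictions = forecast(years, temps, 20)  # covers 2021 through 2040
```

A linear trend ignores seasonality and autocorrelation, so it is best read as a reference point against which a proper time series model can be compared.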

Fake News Detection

Category: Data Science, NLP

The project aims to develop a system to detect fake news using Natural Language Processing and Machine Learning techniques. The focus is on creating a model that can accurately identify misinformation in news content. The project includes advanced pre-processing techniques such as tokenization, stemming, and stop word removal. A variety of machine learning algorithms, including supervised classification, deep learning, and ensemble methods, will be explored and their performance compared to determine the best fit for the dataset.
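The three pre-processing steps named in the fake news project (tokenization, stop word removal, stemming) can be sketched in a few lines. This is an illustration under stated assumptions, not the project's actual pipeline: the stop word list is a tiny hand-picked sample, and `naive_stem` is a hypothetical suffix stripper rather than a real stemming algorithm such as Porter's.

```python
import re

# Tiny illustrative stop word list (assumption); real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "were", "of", "to",
              "in", "on", "and", "that", "this", "it", "for"}

# Suffixes tried in order; longer ones come before their shorter tails.
SUFFIXES = ("ing", "edly", "ed", "ly", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


tokens = preprocess("The senators were voting on the claims")
# tokens == ["senator", "vot", "claim"]
```

The crude stem "vot" shows why stemmers trade readability for consistency: "voting", "voted" and "votes" all collapse to the same token, which is what a downstream classifier needs.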
In this project, my aim is to analyze the weight of Olympic participants from six teams across six continents - Romania, United States, South Africa, Brazil, Australia, and Japan. My analysis will involve using a 2-way ANOVA method to consider both the sex and team of the participants. This will allow me to identify any significant differences in weight between male and female participants from different teams, providing valuable insights into the factors that contribute to weight variations among Olympic athletes.

2-Way ANOVA approach

Category: Data Science, Statistics
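For readers unfamiliar with the method behind the Olympic weight project, the sketch below shows how a two-way ANOVA splits variation into a sex effect, a team effect, their interaction, and within-group noise. It assumes a balanced design (the same number of athletes in every sex and team combination), which the real dataset may not satisfy. The function name `two_way_anova` and the weights are made up for illustration.

```python
from statistics import mean


def two_way_anova(cells):
    """Two-way ANOVA with interaction for a balanced, complete design.

    `cells` maps (level_a, level_b) -> list of observations.
    Returns {source: {"ss": ..., "df": ..., "F": ...}}.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    sizes = {len(v) for v in cells.values()}
    if len(cells) != len(levels_a) * len(levels_b) or len(sizes) != 1:
        raise ValueError("every combination needs the same number of observations")
    n = sizes.pop()
    if n < 2:
        raise ValueError("need at least two observations per cell")

    values = [x for v in cells.values() for x in v]
    grand = mean(values)
    mean_a = {a: mean([x for b in levels_b for x in cells[(a, b)]]) for a in levels_a}
    mean_b = {b: mean([x for a in levels_a for x in cells[(a, b)]]) for b in levels_b}

    # Sums of squares for each source of variation.
    ss_a = len(levels_b) * n * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = len(levels_a) * n * sum((m - grand) ** 2 for m in mean_b.values())
    ss_cells = n * sum((mean(v) - grand) ** 2 for v in cells.values())
    ss_within = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_within = len(cells) * (n - 1)
    ms_within = ss_within / df_within

    table = {"within": {"ss": ss_within, "df": df_within, "F": None}}
    sources = (("A", ss_a, df_a), ("B", ss_b, df_b),
               ("AxB", ss_cells - ss_a - ss_b, df_a * df_b))
    for name, ss, df in sources:
        table[name] = {"ss": ss, "df": df, "F": (ss / df) / ms_within}
    return table


# Made-up weights in kg: factor A is sex, factor B is team.
weights = {
    ("F", "Japan"): [55, 57, 59],
    ("F", "Romania"): [58, 60, 62],
    ("M", "Japan"): [68, 70, 72],
    ("M", "Romania"): [75, 77, 79],
}
table = two_way_anova(weights)
```

Each F statistic still has to be compared with an F distribution to obtain a p-value, for example with `scipy.stats.f.sf(F, df, df_within)` if SciPy is available.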

Ozone Concentration Prediction

Category: Data Science

In this project, I aim to examine the relationship between weather variables and ozone concentration in the atmosphere of a specific city in America. I will use Linear Regression, with ozone concentration as the dependent variable and pressure, temperature, and humidity as the independent variables. Additionally, I will use the one-way ANOVA method to compare the differences between the weather variables and ozone concentration. This analysis will provide valuable insights into the impact of weather variables on ozone levels in the atmosphere.
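As a rough illustration of the regression set-up in the ozone project, the sketch below fits ordinary least squares with ozone concentration as the response and temperature, humidity, and pressure as predictors. It is a plain-Python teaching version, not the project's code: `fit_ols` is a hypothetical helper, the weather readings are made up, and the ozone values are generated from a known linear rule so the recovered coefficients can be checked.

```python
def fit_ols(X, y):
    """Ordinary least squares; returns (intercept, slopes).

    Predictors are centered first so the normal equations stay
    well conditioned, then solved by Gauss-Jordan elimination.
    """
    n, k = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(k)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(k)] for row in X]
    yc = [t - y_mean for t in y]

    # Augmented matrix [Xc'Xc | Xc'yc].
    A = [[sum(r[i] * r[j] for r in Xc) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(Xc, yc))]
         for i in range(k)]

    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * p for x, p in zip(A[r], A[col])]

    slopes = [A[i][k] / A[i][i] for i in range(k)]
    intercept = y_mean - sum(s * m for s, m in zip(slopes, col_means))
    return intercept, slopes


# Made-up readings: (temperature in °C, humidity in %, pressure in hPa).
weather = [(20, 60, 1012), (25, 55, 1008), (30, 40, 1015),
           (22, 70, 1010), (28, 45, 1005), (35, 30, 1011)]
# Synthetic ozone values from a known rule (assumption), with no noise.
ozone = [5.0 + 1.5 * t - 0.2 * h + 0.01 * p for t, h, p in weather]

intercept, slopes = fit_ols(weather, ozone)
```

With real measurements the fit will not be exact, and the size and sign of each slope indicate how ozone tends to change per unit of that weather variable with the others held fixed.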
Road accidents are a major problem in Romania, causing a very large number of injuries and deaths every year. The objective of this project is to study the road accidents that occurred in Romania in 2021 in order to better understand the factors that cause them and what measures the authorities could take to reduce their number.

Romanian traffic
(Romanian version)

Category: Data Science, NLP

Why are people leaving? (Romanian version)

Category: Data Science, Big Data

In this scenario, a corporation faces a growing number of departing employees, which has raised concerns among corporate management. To address this issue, the objective of this project is to identify the main reasons why employees leave. The project will answer several key questions, such as which department has the highest employee turnover rate and which variables have the greatest impact.