Blog Post

DataFest 2025

This past weekend, I attended the first DataFest event that Kansas State University has ever hosted. DataFest is a celebration of data and statistics and a friendly competition between data professionals from all backgrounds. Given that I am studying Statistics as a minor, I thought I'd give it a go. And it was well worth it! My team won the award for best insight, which was what we were aiming for. I met a lot of crazy smart people at this event, and my team and mentor made the experience unforgettable.

The event kicked off with an introduction to the data donor and their challenge. I signed an NDA since the competitions are still occurring worldwide, but I can describe what I did and what our team presented. The first thing I did was gather the data and review it to determine what was needed and what wasn't. Entire sections of the data consisted of columns that were unrelated to our research question and full of NA values, so those columns were removed. Next, the name field that identified each observation was NA in some rows but not others. Those rows were removed, since we could not use an observation without some identifying name. Finally, we created indicator variables for the categorical variables. The entire removal/omission process was done in Python with pandas.
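Since the real data is under NDA, here is a minimal sketch of those three cleaning steps on a made-up table. Every column name, the toy values, and the `clean` helper are hypothetical and only for illustration; this is not our actual script.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the competition data (the real data is under NDA).
raw = pd.DataFrame({
    "name": ["A", None, "C", "D"],           # identifying name, sometimes missing
    "category": ["x", "y", "x", "z"],        # a categorical variable
    "value": [1.0, 2.0, 3.0, 4.0],
    "unused_notes": [np.nan, np.nan, np.nan, np.nan],  # unrelated, all NA
})

def clean(df, irrelevant_cols, name_col, categorical_cols):
    """Hypothetical helper mirroring the three steps described above."""
    # 1. Remove columns unrelated to the research question.
    out = df.drop(columns=irrelevant_cols)
    # 2. Remove rows that have no identifying name.
    out = out.dropna(subset=[name_col])
    # 3. Turn each categorical column into 0/1 indicator columns.
    out = pd.get_dummies(out, columns=categorical_cols, dtype=int)
    return out.reset_index(drop=True)

cleaned = clean(raw, ["unused_notes"], "name", ["category"])
print(cleaned)
```

Note that a category that only appears in a dropped row (here, "y") gets no indicator column, which is one reason to drop the unusable rows before encoding.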

After the data was cleaned and we had a solid sample size, I moved over to R for the regression analysis. I admit that I only learned about regression, ANOVA, and R programming over the course of this semester, so I was a bit unsure of how to go about this. I tried several linear regression models with different predictor variables and interaction terms until I found a reliable statistic that indicated a microtrend. I compared this with my teammates' analyses, and we all agreed that we had found a good microtrend. We then moved on to preparing our presentation, which was limited to 5 minutes and 3 slides.
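To give a feel for what "trying models with and without interaction terms" looks like, here is a rough sketch. I did the real work in R (where an interaction model is written like `lm(y ~ x1 * x2)`); the version below is a Python/NumPy analogue on synthetic data, and the `fit_ols` helper, the variable names, and the numbers are all made up for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot

# Synthetic data in which the effect of x1 depends on the 0/1 group x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.integers(0, 2, size=200).astype(float)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + 1.5 * x1 * x2 + rng.normal(scale=0.1, size=200)

# Model 1: main effects only.
coef_main, r2_main = fit_ols(np.column_stack([x1, x2]), y)
# Model 2: main effects plus the x1:x2 interaction term.
coef_int, r2_int = fit_ols(np.column_stack([x1, x2, x1 * x2]), y)

print(f"R^2 without interaction: {r2_main:.3f}")
print(f"R^2 with interaction:    {r2_int:.3f}")
print(f"estimated interaction coefficient: {coef_int[3]:.2f}")
```

R-squared can never go down when a term is added, so on real data a bump like this is not enough on its own; the interaction term's p-value or an ANOVA comparison of the two models is the better check.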

Overall, I am happy to have attended and will certainly attend again next year!