Alright, so today I’m gonna walk you through my little project: “the usos family”. Yeah, I know, sounds kinda random, but stick with me, it was a fun dive.
First things first, I wanted to mess around with some data. I scraped a bunch of info… well, let’s just say “gathered” from a few different spots online. Think wrestling stats, family trees, that kinda stuff. Nothing too crazy, just basic info on the Anoa’i family, focusing on The Usos.
Step one: Get the Data. I started by manually copying and pasting the data from a couple of websites. I know, I know, super old-school, but sometimes that’s just the quickest way to grab what you need. I dumped it all into a CSV file. Messy as hell, but we’ll clean it up later.
Next, I fired up Python and used Pandas. Because, duh, Pandas makes everything easier. I read that CSV into a DataFrame, and that’s when the real fun began. The data was a mess. Missing values everywhere, inconsistent formatting, you name it.
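Here’s roughly what that first look can be like. This is a minimal sketch, not my actual file: the column names and rows below are made up for illustration, and the CSV is inlined as a string so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the hand-built CSV. The columns and values are
# invented for illustration and are not real stats.
RAW_CSV = """wrestler,matches,wins,title_reigns
Jimmy Uso,120,70,8
jey uso,130,,9
Jonathan Fatu,15,9,
"""

# For a real file, pass its path to read_csv instead of the StringIO wrapper.
df = pd.read_csv(io.StringIO(RAW_CSV))

df.info()               # column dtypes and non-null counts
print(df.isna().sum())  # number of missing values per column
```

`info()` and `isna().sum()` are usually enough to spot which columns need attention before any cleaning starts.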
So, step two: Clean the Data. This took a while. I filled in some of the missing values with “Unknown,” standardized the names to make sure everything was consistent (Jimmy Uso vs. Jonathan Fatu, Jey Uso vs. Joshua Fatu, ya know?), and converted some of the columns to the right data types. Basically, a lot of `fillna()`, `replace()`, and `astype()` calls.
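To give a feel for it, here’s a small sketch of those three calls working together. The `source` and `wins` columns and all the values are assumptions for the example, not my real data.

```python
import pandas as pd

# Made-up sample rows; the column names are assumptions for this sketch.
df = pd.DataFrame({
    "wrestler": ["Jimmy Uso", " jonathan fatu", "Jey Uso", "Joshua Fatu"],
    "source": ["site_a", None, "site_b", None],
    "wins": ["70", "9", "75", "12"],
})

# Tidy whitespace and casing first, then map real names onto ring names.
df["wrestler"] = df["wrestler"].str.strip().str.title()
df["wrestler"] = df["wrestler"].replace({
    "Jonathan Fatu": "Jimmy Uso",
    "Joshua Fatu": "Jey Uso",
})

# Fill gaps in a text column with a placeholder.
df["source"] = df["source"].fillna("Unknown")

# Numbers that came in as text need a real numeric dtype.
df["wins"] = df["wins"].astype(int)

print(df)
```

Normalizing case and whitespace before `replace()` matters, since the mapping only matches exact values.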
Once I had the data cleaned up, step three: Exploratory Data Analysis (EDA). This is where I started to actually see some interesting stuff. I looked at things like win/loss records, tag team partners, championship reigns, all that jazz. Made a few basic visualizations with Matplotlib and Seaborn. Nothing fancy, just bar charts and histograms to get a feel for the data.
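A bare-bones version of those charts might look like this. It’s a sketch with invented numbers and placeholder wrestler names, and it sticks to Matplotlib so there’s one less dependency; Seaborn’s `barplot` and `histplot` would get you similar pictures.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; drop this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Invented numbers, for illustration only.
df = pd.DataFrame({
    "wrestler": ["Wrestler A", "Wrestler B", "Wrestler C"],
    "wins": [70, 75, 40],
    "losses": [50, 55, 45],
})

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: wins and losses side by side for each wrestler.
df.set_index("wrestler")[["wins", "losses"]].plot.bar(ax=ax_bar, rot=0)
ax_bar.set_title("Win/loss record")

# Histogram: how the win counts are distributed.
ax_hist.hist(df["wins"], bins=5)
ax_hist.set_title("Distribution of wins")

fig.tight_layout()
```

From here, `fig.savefig("some_name.png")` writes the figure to disk, or `plt.show()` displays it in an interactive session.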
Then, step four: Feature Engineering. I figured, let’s try to create some new features that might be interesting. I calculated things like total championship reigns, average match length, and win percentage. It was a bit of trial and error, seeing what actually provided some insights.
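Here’s a sketch of how metrics like these can be derived, assuming a table with one row per match. The `result` and `minutes` columns are hypothetical names and the rows are made up.

```python
import pandas as pd

# One row per match; invented data with assumed column names.
matches = pd.DataFrame({
    "wrestler": ["A", "A", "A", "B", "B"],
    "result": ["win", "loss", "win", "win", "loss"],
    "minutes": [12.0, 8.0, 10.0, 15.0, 5.0],
})

features = (
    matches.assign(won=matches["result"].eq("win"))  # True/False per match
    .groupby("wrestler")
    .agg(
        matches=("result", "size"),            # total matches
        wins=("won", "sum"),                   # True counts as 1
        avg_match_minutes=("minutes", "mean"),
    )
)

# Win percentage comes from the two counts above.
features["win_pct"] = 100 * features["wins"] / features["matches"]

print(features)
```

Total championship reigns could be counted the same way, by grouping a one-row-per-reign table and taking its `size`.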
Finally, step five: Simple Analysis. I didn’t go too deep into machine learning or anything like that. I just wanted to see if I could find any correlations between the different features. Did having more championship reigns correlate with a higher win percentage? Did tag teams with longer tenures have better records? Stuff like that.
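Checking those questions can be as simple as a correlation matrix. A minimal sketch, assuming a feature table with the hypothetical columns below; the numbers are invented, so don’t read anything into them.

```python
import pandas as pd

# Invented feature table; the column names are assumptions.
features = pd.DataFrame({
    "title_reigns": [1, 2, 3, 4, 5],
    "win_pct": [45.0, 50.0, 48.0, 60.0, 62.0],
    "tenure_years": [2, 3, 5, 8, 10],
})

# Pairwise Pearson correlation between every numeric column.
corr = features.corr(method="pearson")
print(corr.round(2))

# A single pair, if that's all you care about.
r = features["title_reigns"].corr(features["win_pct"])
print(f"title_reigns vs. win_pct: r = {r:.2f}")
```

With only a handful of rows, a high `r` is a hint at best, and correlation says nothing about which way the cause runs.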
Quick recap of the steps:
- Data Gathering – Manual copy/paste into CSV
- Data Cleaning – Pandas to the rescue (fillna, replace, astype)
- EDA – Basic visualizations (Matplotlib, Seaborn)
- Feature Engineering – Creating new metrics from existing data
- Simple Analysis – Looking for correlations
The results? Well, nothing earth-shattering, but it was a cool little project. I learned a lot about data manipulation and analysis, and I got to geek out about wrestling for a few hours. Win-win!