A data-inclined explorer’s travel blog
Last week I had the great treat of spending several days at the long-standing institute known to its regulars as “The Ranch”. Founded by the legendary Deborah Szekely and her husband in 1940, the fitness spa has attracted thousands of people to its beautiful, sprawling campus over the years. Indeed, many of these guests have gone on to visit the Ranch on a yearly basis.
While the magic of the Ranch is hard to quantify, it is easily felt. The campus, which includes 4,000 acres of land, is lush with vegetation. The winding…
Extracting text from PowerPoint files for Natural Language Processing Analysis
From business organizations to school presentations, information is delivered and shared via PowerPoint slide decks. Rich with language, these decks can play a vital role in any natural language processing analysis. Slide deck analysis can be crucial in understanding the topics, keywords and themes that the creator of the presentation finds important or wants to emphasize to the presentation’s intended audience.
The pptx library has a lot of functionality for extracting text from pptx files. There may be use cases for extracting specific elements from a presentation for analysis…
With over 500 hours of video uploaded to YouTube every minute worldwide, these videos account for a huge portion of the data available in our world. This guide will walk you through how to collect a video’s transcript and clean the text so that it is ready for you to analyze.
Creators on YouTube are constantly updating, changing and deleting their content. This is important for anyone working with video data sources to recognize, as the video you analyze today may not be available to view tomorrow. It is important to always preserve your raw data — in this case the…
I was recently in an interview for a data analyst role when the interviewer began talking about a package I had never worked with before, Alteryx, and like any nervous job hunter, I panicked. I convinced myself that the key to landing this role (spoiler alert: I did not) and all future success in the data world hinged on my ability to get off the Zoom call and become an expert in Alteryx.
When you’re on the data job hunt front, job boards are filled with 101 packages, platforms, and technologies companies want to see experience with. No two companies seem…
Data visualizations at their best tell a story with the data that is both compelling to look at and easy to digest. The Scattertext tool created by Jason Kessler (check out the original documentation here) does just that. This guide will show you how to implement Scattertext with your data and bring the visual WOW to your work.
Psst! If you don’t need help formatting your data, skip down to “Turning your text to a Scattertext Corpus”!
Scattertext allows you to visualize unique terms in your corpora and how their frequency differs from one category to another. In the…
Bar graphs are a ubiquitous data visualization tool, but in their ‘raw’ form they can leave a lot to be desired aesthetically. A clean visual can set a tone for the presentation, hold your audience’s attention, reflect professionalism and create harmony within the information you are trying to deliver. This guide will walk you through how to turn a basic Matplotlib bar chart into this
Let’s start with the basics. I’ve taken data from this Kaggle dataset and graphed the number of dinosaurs in each continent.
An end-to-end guide to building a random forest classifier, identifying top features, and plotting in Plotly.
There is nothing more satisfying after hours of cleaning and modeling your data than producing a beautiful graph to show off your hard work. I often scroll the web for #plotgoals images, which is how I came across the beautiful image you see above.
I created this guide to walk you through step by step exactly how to make a graph like this of your own and upgrade your visuals game.