This is not an article. Rather, it is a more detailed description of my background and credentials as a data scientist, with some opinions sprinkled in. You can think of it as a narrative curriculum vitae.
Am I a Data Scientist? I would say yes, though I have no degree in Data Science. Of course I don’t — those have only just begun. Like the early days of computer science when most CS faculty had math or engineering degrees, we are in a phase where we must define the field based on capabilities and background, not certificates or degrees. I believe that I have solid Data Science credentials, so here they are in a bit longer form. …
As a faculty member in a data science program, I interact a lot with students who want to build their skills, learn the profession, and succeed. To that end, they are great at building their professional networks and reaching out to data science professionals to better understand all the different ways in which someone can be a data scientist (a post for another day).
I also teach a course called Survey of Data Science Applications, and one of my goals for this class is to expose students to as many varied applications of data science to real world problems as possible. …
In case you didn’t know, December is grad school application season. Many application deadlines are in January so hopefully you have already started your applications if you plan on applying. Thus, error #1:
This is not the most egregious error, but I see lots and lots of obvious cut-and-paste errors. Sometimes, I even see another university's name in an essay.
There are two issues here: A. Detail orientation and B. Interest level.
There is a (probably apocryphal) story about a band who listed in their touring concert contract that they wanted a jar full of M&Ms with all the blue ones taken out (or something like this) in their dressing room. It was buried way down in the middle of the contract. They didn’t care about M&Ms; what they cared about was venues that were detail-oriented enough to read everything and pay attention to everything. If they walked in and saw no blue M&Ms, they knew that they had a solid venue that was paying attention. …
I am currently a faculty member at Vanderbilt University’s Owen Graduate School of Business and affiliated faculty at the Data Science Institute. I just wrote about why you should consider a Master’s degree, so now I’m writing the companion piece.
A bit more about me, then: before my Ph.D. and faculty job, I worked at Accenture for five years and TIAA-CREF for three, essentially doing software development at both. I had never taken a programming class in college (still haven’t), so all my tech work in both places was self-taught or learned through corporate training, mostly in Java. So I am well-acquainted with the mid-twenties search for “what do I want to be when I grow up” and then “how do I get there?” …
I’ve seen quite a few posts on Medium (and other platforms) giving a perspective on how to gain the necessary skills to be successful in Data Science. The typical contrast is a university degree vs. a bespoke collection of online credentials (via MOOCs, boot camps, etc.). As a faculty member at Vanderbilt’s Data Science Institute, I think I have a perspective on this decision that is less often heard online.
My background, in brief: In addition to three university degrees, I’ve also taken several online courses and have several credentials from a popular online coding school. I also now have about 20 years of (diverse) work experience, and so I know a fair amount about getting jobs, getting promoted, and being successful. Importantly, I’ve not been a lifetime academic cloistered in an ivory tower — I’ve had deliverables and hired and fired people. …
Principal Component Analysis (PCA) is a vital tool in the toolbox of the Data Scientist. There are many posts about how to implement PCA, and the documentation of how the methods work is often straightforward. What is less obvious is what PCA is doing at an intuitive level. My goal in this post is to provide an intuitive, non-mathematical explanation of what PCA does so you can better understand when and why you want to use it.
The basic description of PCA is that it is an orthogonal projection of your data. Helpful, right? You can also find out that it is used for “dimensional reduction,” which is accurate but only helpful on a very basic level. So, you can use it to transform your data set with 100 variables down to 20 or fewer. Sounds nice, but what do those 20 variables mean now? How do I use them? …
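To make the "100 variables down to 20" idea concrete, here is a minimal sketch in Python using only NumPy. It is not taken from the post itself: the `pca_project` helper, the synthetic data, and the choice of 5 hidden factors are all assumptions made for illustration. It centers the data, finds orthonormal directions with a singular value decomposition, and projects the data onto the first 20 of them.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal directions.

    Hypothetical helper for illustration; not from the original post.
    """
    # Center each variable so directions are measured from the data's mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions, ordered by variance captured.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # The orthogonal projection: coordinates of each row along each direction.
    scores = X_centered @ components.T
    # Share of total variance carried by each kept direction.
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, components, explained[:n_components]


# Assumed synthetic data: 500 rows and 100 variables that are really
# driven by 5 hidden factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 100))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 100))

scores, components, explained = pca_project(X, 20)
print(scores.shape)      # 500 rows, now described by 20 new variables
print(explained.sum())   # fraction of total variance the 20 variables keep
```

Each of the 20 new variables is a weighted mix of the original 100, with the weights stored in the matching row of `components`. Reading those weights is one way to start answering what a new variable "means".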