My earlier piece on how I learned data science has been extremely popular, and I wanted to expand on it by talking about how I would learn data science if I had to start again from absolute ground zero. Things have definitely changed since the start of my data science journey, so I wanted to provide a current roadmap for this learning process. Hopefully this article will give you a clear path to shape your own data science journey.
I also give you a few of my best tips for actually sticking with this plan, so stay tuned for that. I want to caveat this article by saying that learning is a little bit different for everyone. My word is definitely not gospel, and there's a good chance you might find something that works a little bit better for you. However, I hope that this article is a good foundation for you to build on, and that it instills in you the big-picture priorities you should have when learning this field.
Data science is a huge field, and it makes the most sense to break it down into its components. The first thing you need to be familiar with is programming. Most people do this in either Python or R. You're free to start with either one, but I generally prefer Python, and it's also more commonly asked for in job descriptions. As I said, I believe Python is a bit more versatile, but R is still very common in academic and sports circles. You'll also want to have a very basic understanding of statistics to start. I would like to keep this article agnostic of whether you're in school or not, so I'll leave aside my advice on what I would study if I were in college. Over time, I've had a change of opinion about how much foundational knowledge you need.
After experiencing many different types of learning myself, I have found that doing real-world projects is the most effective way to learn data science. I think you should learn just enough programming and statistics to be able to start exploring your own projects. In general, I think you can get to this level of knowledge through very introductory online courses. A platform like Kaggle offers not only courses of this kind but also a tremendous number of datasets for you to explore. The great thing about it is that it's a public forum where people submit their analyses of shared datasets. You can go in and see the code of established data scientists, and from this you can see what packages they used, the way they explored the data, and the different ways they optimized the algorithms they used. Follow along with a few of the more advanced notebooks, and then I would recommend you start on your own basic projects.
I wrote a separate article about the three beginner projects that I recommend. At this point, I would split my time about 50/50 between working on my own projects and exploring other people's code and work. I would try to apply many of the new things I saw in the more advanced notebooks to the code and projects I'm working on at the time.
As you go along learning this way, you'll see when different people use different algorithms and packages. I recommend compiling a list of everything you see. You should go through the source code of these algorithms and packages and try to grasp how they're constructed. Frankly, if you can understand the source code for an algorithm, you functionally understand the math behind it. It's still good to supplement this with some actual theory from Wikipedia or a math textbook, but the source code will give you the general gist. I find that if I understand how an algorithm is built, it's way easier for me to understand the math associated with it. This is fairly common practice in math circles: they try examples first, then see if they can fit a theory to what they're seeing. This helps to build intuition around the data science skill set.
If you're feeling particularly ambitious, it's extremely valuable to be able to build some of these algorithms from scratch. You should try building a linear regression or a k-means clustering algorithm from just basic Python components.
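
To give a sense of what building an algorithm from basic Python components might look like, here is a minimal sketch of one-feature linear regression using the standard least-squares formulas and no external packages. The function names and the toy data are my own illustrative assumptions, not taken from any particular library, and a real project would need more care (multiple features, input validation, numerical stability).

```python
# From-scratch sketch of simple (one-feature) linear regression.
# Uses only built-in Python; all names and data are illustrative assumptions.

def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")

    x_bar, y_bar = mean(xs), mean(ys)

    # Unnormalized covariance of x and y, and unnormalized variance of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Predict y for a single x using the fitted line."""
    return slope * x + intercept


if __name__ == "__main__":
    # Toy data that lies exactly on the line y = 2x + 1.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [3.0, 5.0, 7.0, 9.0, 11.0]
    slope, intercept = fit_simple_linear_regression(xs, ys)
    print("slope:", slope, "intercept:", intercept)
    print("prediction at x=6:", predict(slope, intercept, 6.0))
```

Once something like this works, comparing its output against an established package on the same data is a good way to check your understanding.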
At this point, you should already be starting to delve into more advanced projects. You do this to stretch your skill set. Advanced projects are ones where you strive to find unique insights. This can be fairly intimidating, but if you collect your own data it can also be relatively easy. You can also do this by asking questions of existing datasets that other people haven't thought of yet. Again, this can seem quite difficult, but if you're spending a significant amount of time doing projects and building intuition, these ideas eventually come relatively naturally at this stage.
I would also recommend exploring some deep learning, NLP, and computer vision concepts. I personally enjoyed the fast.ai course, and I've definitely borrowed a bit from their learning philosophy. It's important to push yourself to get feedback on your work as well. I highly recommend making your analysis public on Kaggle, GitHub, your blog, or Tableau Public. Putting work on Kaggle is definitely something that I personally need to improve on as well. After you've reached this level, you've built almost all of the foundational knowledge that you need. From then on, it's about learning new packages or concepts and applying them to your work or to more projects. The data science journey is never over.
Honestly, you'll constantly be learning and applying new things, but I personally think that's what makes the profession really fun. Now, there are a few other details that I think are important on this learning journey. The first is how much time you need to spend learning. This can vary greatly from person to person, based on how quickly you actually want to consume this material.
I think that working around an hour per day for a year would be sufficient to learn the foundations of data science. If I could go back, I would schedule fixed blocks of study time rather than studying ad hoc, a concept I learned from the book Ultralearning. If you spend too little time in a session, you spend too much of it catching up on what you did the previous time, and if you spend too much time, I think there's a pretty high chance of burnout. Next, I'm a huge believer in setting goals.
For whatever I want to accomplish, a good goal should be three things: measurable, something you have complete control over, and time-constrained. An example of a good goal is to complete two exploratory analyses in which you apply principal component analysis (PCA) over the next two weeks. An inferior example would be saying that your goal is to learn PCA. The reason one of these goals is superior to the other is that you can be held accountable to it. I think accountability is extremely important, and you can hold yourself accountable in a couple of different ways: one is by writing things down, and another is by actually telling other people or a community.
I'm perfectly fine with either of these approaches. You can tell one of your friends, you can have an accountability partner, or you can use social groups to maintain your accountability. I would ask you to write some of your goals in the comments section below, so that the community here can help you stay accountable. Hopefully this article gives you a clear path for navigating this field. It took me five years to really understand these data science and life concepts, and I'm still learning every day.