What does a data scientist do in a day?
I get asked this question all the time, so I thought I’d offer some insight on what I do in an average day and why I love my job!
My day begins with diving into huge amounts of data. That includes pulling data with tools like SQL, Hadoop, and Spark. It is often said that your online activity is constantly being collected as data, and I can verify that because I have seen it… Don’t worry, it’s anonymized!
After data diving, I work on gathering and analyzing data. Data exploration is a big part of a data scientist’s journey, and the first step is thinking about statistical ways to extract information and insights from the data.
My next challenge is becoming familiar with the data and understanding it. Wow, what a feat! I find it remarkable that I can work on medical data even though I am not a doctor, and on gaming data even though I am not a gamer… The domains of application are vast, so we have to collaborate with the people who know and understand the data, whether they’re business experts, clinicians, data analysts, etc.
The subsequent task is to translate business problems into data problems. Do we want to make predictions? If so, is it classification or regression? Is it supervised or unsupervised? Is it an optimization problem? What kind of data are we using?
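To make those questions concrete, here is a minimal sketch of how they might be written down as a decision helper. The function name `frame_problem` and its arguments are hypothetical, chosen only for illustration, and real projects rarely reduce to two questions.

```python
def frame_problem(has_labels, target_type=None):
    """Map two framing questions to a broad family of data problem.

    has_labels:  do we have a known outcome to learn from?
    target_type: "category" or "number" (only used when has_labels is True).
    Both argument names are illustrative assumptions, not a standard API.
    """
    if not has_labels:
        # No known outcome to predict: look for structure in the data instead.
        return "unsupervised"
    if target_type == "category":
        # Predicting a label, such as churn vs. no churn.
        return "supervised classification"
    if target_type == "number":
        # Predicting a quantity, such as next month's revenue.
        return "supervised regression"
    raise ValueError("target_type must be 'category' or 'number'")


print(frame_problem(has_labels=True, target_type="category"))
print(frame_problem(has_labels=True, target_type="number"))
print(frame_problem(has_labels=False))
```

Writing the questions down like this is mostly useful as a checklist to go through with the business side before any modeling starts.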
My advice is to always stay updated and learn constantly. Do quality research, read state-of-the-art academic papers, go over blogs and open-source code, learn how to use new technologies and tools… If you never grow tired of learning new things, this job is for you.
Another big component of my job is developing models and optimizing algorithms. Creating a model is only the first part; fine-tuning it and finding the best parameters and configurations is just as big a part. Going from 95.2% to 98.5% accuracy is really satisfying…
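As a rough illustration of what "finding the best parameters" can look like, here is a minimal grid search sketch in plain Python. The scoring function is a made-up stand-in for "train a model and measure validation accuracy", and the parameter names `learning_rate` and `depth` are assumptions for the example only.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every parameter combination and return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(learning_rate, depth):
    # Hypothetical stand-in for training a model and scoring it on held-out data.
    # By construction it peaks at learning_rate=0.1 and depth=5.
    return 1.0 - abs(learning_rate - 0.1) - 0.01 * abs(depth - 5)


grid = {"learning_rate": [0.01, 0.1, 0.5], "depth": [3, 5, 7]}
best_params, best_score = grid_search(grid, toy_validation_score)
print(best_params, best_score)
```

In practice the scoring step is the expensive part, which is why libraries offer smarter strategies than trying every combination, but the idea stays the same: search, score, keep the best.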
I also debug and review code. My best friend since I became a data scientist has been Stack Overflow, and I am sure it has a lot of other friends. We have a great, huge community of data scientists, and we help each other with code issues and questions.
Some aspects of my work can be a bit frustrating. You can build a great model with 98% accuracy, but you may face constraints. Maybe it demands too many resources, the model is too heavy, or it has to run on a mobile device... There are some things that you aren’t in control of as a data scientist. Still, you have to work within them and find the best tradeoff between performance and limitations.
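One simple way to think about that tradeoff is to drop every model that breaks a constraint and then keep the most accurate of what is left. The sketch below assumes hypothetical candidates with invented accuracy, size, and latency figures; none of these numbers come from a real project.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float    # validation accuracy, between 0 and 1
    size_mb: float     # size of the model on disk
    latency_ms: float  # time to produce one prediction


def pick_model(candidates, max_size_mb, max_latency_ms):
    """Return the most accurate candidate that respects both limits, or None."""
    feasible = [
        c for c in candidates
        if c.size_mb <= max_size_mb and c.latency_ms <= max_latency_ms
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.accuracy)


# Invented figures, for illustration only.
candidates = [
    Candidate("large", accuracy=0.98, size_mb=900.0, latency_ms=400.0),
    Candidate("medium", accuracy=0.96, size_mb=120.0, latency_ms=80.0),
    Candidate("small", accuracy=0.93, size_mb=15.0, latency_ms=20.0),
]

# A mobile-style budget rules out the two heavier models.
choice = pick_model(candidates, max_size_mb=50.0, max_latency_ms=50.0)
print(choice.name if choice else "no model fits the constraints")
```

With a looser budget the same function would return a heavier, more accurate model, which is exactly the conversation to have with whoever owns the constraint.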
Maybe one of the toughest parts of the job, and one that is often underestimated, is explaining my work to others. Sharing your achievements and progress with people is enjoyable, but it is not as easy as it seems. You often have to explain your results to people who are not as familiar with algorithms and data science concepts as you are. Visualizing data, writing reports, and presenting your conclusions in the most understandable way are all part of the job. Your results have to be tied to the business goals and easy to understand.
Finally, I work with developers and software engineers to help them integrate and deploy my models. The real victory is seeing that my work is helpful and is being used by real users.
I hope you found this article insightful and that it helped you to get a better understanding of what a data scientist’s day can look like! While you’re here, check out some of start-up.ai’s other blog posts which cover hot topics and issues in AI >> https://www.start-up.ai/blog