Just 8 Techniques for Pretending to Be a Data Scientist

Pretending to be a data scientist

As of 2019, there must be plenty of people out there who are dying to pass themselves off as data scientists. But the more you want to pretend, the less obvious it is how to go about it. So I set aside the grubby, unglamorous side of data science entirely and thought about how to fake the look. The conclusions can be put into practice starting tomorrow. If you want to look like a data scientist, give them a try.

1. Apple all the way

Sophisticated products are essential for a pretend data scientist. Carry your MacBook everywhere so that you can put on a smug face at Starbucks at any time. A Pro gives a more professional feel, so make it a Pro if you can. Take your Pro with you.

2. The editor is VS Code

<img src="https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/204712/e8d86648-b700-a30e-7d49-7f427a5325fe.png" width="170"> When you open your MacBook, VS Code should be what appears. When asked "What do you like about VS Code?", answer like this: "Hmm, first of all it's lightweight, and it has plenty of extensions, but the most attractive thing is remote debugging." An editor is always expected to be lightweight. And your colleagues and friends are bound to be impressed by fashionable-sounding words like "extensions" and "remote debugging".

3. Love Python

If you want to look like a data scientist, loving Python is the shortcut. Install not only the Python extension for VS Code but also PyCharm, for no particular reason. It is a token of your love for Python. And never put down R. Even if you don't understand R at all, act as if you know it has its good points.

4. Visualize as naturally as you breathe

Visualization is one of the showpieces of data science. Once you have the data, visualize it right away, before anything else. And when a colleague draws graphs with Matplotlib, tell them: "These days I recommend visualizing with Plotly. After all, being able to explore the data interactively is the most convenient."

5. Be particular about Pythonic style

Since you are using Python, keep Pythonic programming in mind. Write fashionable, functional-looking code using list comprehensions and even the walrus operator, which was added in the latest release, Python 3.8.
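As a minimal sketch of the style described here, the snippet below combines a list comprehension with the walrus operator (`:=`), which requires Python 3.8 or later. The `raw_scores` data and the `normalize` helper are hypothetical, made up purely for illustration.

```python
# Hypothetical sample data: raw readings, some of them missing (None).
raw_scores = [12, None, 7, 30, None, 18]


def normalize(value, scale=30):
    """Hypothetical helper: map a raw reading to the 0..1 range, or None if missing."""
    if value is None:
        return None
    return value / scale


# List comprehension: build a filtered list in a single expression.
present = [s for s in raw_scores if s is not None]

# Walrus operator (Python 3.8+): bind the result of normalize() inside the
# condition so it is computed once per element and reused as the output value.
high = [n for s in raw_scores if (n := normalize(s)) is not None and n >= 0.5]

print(present)
print(high)
```

Without the walrus operator, `normalize(s)` would have to be called twice per element, or the comprehension would have to be unrolled into an explicit loop.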

6. Don't forget C / C++

If Python is your only love, the hardcore crowd may pick on you about speed. So every now and then, drop a line like "In the end, I might have to write it in C." People around you will admire someone whose knowledge spans from data science to product release.

7. Say "cloud" once a day

<img src="https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/204712/7544d416-64da-7718-2868-dc0a431fc1b1.png" width="200"><img src="https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/204712/a08b87f1-c054-d830-6f2a-657beddf5217.jpeg" width="200"> If you are not using the cloud, it is not data science. Bring up AWS and GCP in conversation. And fire off the keyword "scaling". It would be even better if you could also drop terms such as S3 and IAM. Show off the breadth of your range, covering both on-premises and the cloud.

8. Act like you know Kaggle

Kaggle Grandmaster: the dream of every data scientist. Naturally, bring up topics like "Recently, in a Kaggle competition ..." to give the impression that you are always checking Kaggle. Looking like someone who is always aiming higher should make you an object of admiration.

Why I wrote this article

This article is about "pretending" to be a data scientist, which is said to be one of the most glamorous jobs as of the end of 2019. The trigger was what I saw around me when I attended a conference held by a very famous IT company. It was fascinating because everyone looked the same. I wrote it somewhat playfully, but I did intend the content to be reasonably accurate. Let me talk a little more seriously about each point and give some useful links and comments.

1. Apple all the way

Personally, I think Windows is fine, but I do feel that the Mac is excellent in terms of setting up an environment and compatibility with Linux. Many people recommend the Mac. Of course, there are Apple devotees too.

Related reading: Think about the question of which is better, Windows or Mac for development; What I did before I became a data scientist

2. The editor is VS Code

Personally, I think this is the only choice. I no longer want to write Python anywhere but VS Code, and the same goes for Markdown. I am writing the draft of this article in VS Code as well. Personally, I don't really see a reason to choose another editor now.

Related reading: Somehow VScode is the strongest for beginners, isn't it? 3 reasons to think; 24 Recommended Extensions for VS Code (and Some Tips)

3. Love Python

If you do data science, Python is pretty much indispensable now. The major machine learning frameworks all provide Python APIs, and Python works very well with the cloud.

Related reading: Recommended programming languages for 2019

Also, with something like Flask you can easily write a small web application, and it is easy to build all sorts of other applications. In a job like data science, where trial and error never stops, I think the speed at which you can try things out matters, and that is why I think Python is excellent.

4. Visualize as naturally as you breathe

I think visualization is one of the most important skills for anyone doing data science. I wrote about it playfully above, but Matplotlib is a given, and these days I highly recommend Plotly and Dash. Presenting data in a form that humans can take in is so important that one could say whoever masters visualization masters the data. (Personal view)

Related reading: Visualization tool Dash tutorial --Part1: Installation-Drawing-; Create a web application that can be easily visualized with Plotly Dash

5. Be particular about Pythonic style

This one is a bit niche, but by mastering list comprehensions, map, and lambda, you can do what you want in short, clean code. It can also help with speed. Some people say such code is not readable, but I think it is largely a matter of getting used to it.

Related reading: The Hitchhiker's Guide to Python; What I did when I wanted to make Python faster; Utilization and misuse of list comprehension; Introduction to super "practical" Python one-liner starting with list comprehension
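To make the comparison concrete, here is a small sketch that writes the same filter-and-transform step three ways: an explicit loop, a list comprehension, and `map`/`filter` with `lambda`. The `numbers` list is arbitrary sample data chosen for illustration.

```python
# Arbitrary sample data for illustration.
numbers = [1, 2, 3, 4, 5, 6]

# Explicit loop: square only the even numbers.
loop_result = []
for n in numbers:
    if n % 2 == 0:
        loop_result.append(n * n)

# The same logic as a list comprehension.
comprehension_result = [n * n for n in numbers if n % 2 == 0]

# The same logic with filter/map and lambda.
map_result = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

print(loop_result, comprehension_result, map_result)
```

All three produce the same list. Which one reads best is partly a matter of taste, though the comprehension is the form most Python style guides tend to favor for simple cases like this.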

6. Don't forget C / C++

In the end, if you want to create a new library, work on something advanced, or simply go faster, you need C++. If you want to write something close to the hardware, you may need C. Interpreted languages have their limits, so languages like C++ cannot be dismissed. That goes without saying.

Related reading: Why is python so slow?; Comparison of speeds in Python, Java, C ++
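One rough way to get a feel for interpreter overhead without leaving Python: in CPython, the built-in `sum` is implemented in C, so timing it against an equivalent hand-written loop hints at the gap. This is only a sketch; the `python_sum` function and the data size are arbitrary choices for illustration, and the timings depend on the machine, so no numbers are shown.

```python
import timeit


def python_sum(values):
    """Add up the values with an explicit loop executed by the interpreter."""
    total = 0
    for v in values:
        total += v
    return total


# Arbitrary data size, chosen only so the timings are measurable.
data = list(range(100_000))

# Both approaches must give the same answer before timing them.
assert python_sum(data) == sum(data)

# Time 20 repetitions of each approach.
loop_time = timeit.timeit(lambda: python_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)

print(f"pure Python loop: {loop_time:.4f} s")
print(f"built-in sum:     {builtin_time:.4f} s")
```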

7. Say "cloud" once a day

The cloud is so mainstream that, in this day and age, you can hardly say in a job interview that you don't use it, so naturally you need to keep up. Even if you have just started data science, it is convenient to be able to spin up environments such as Elasticsearch, Tableau, and Jupyter quickly and to use the many features of SageMaker. Data science can be started in one day.

Related reading: Introduction to Python Data Science with Amazon SageMaker Part 1; Machine Learning: Data Scientist

8. Act like you know Kaggle

I don't think you need to take part in Kaggle competitions, but the visualization methods and feature engineering ideas exchanged there are full of useful references, so it is not a bad idea to keep an eye on the competitions that interest you.

Especially recently, Kaggle's kernels have become easier to use, so you can casually play with the data a little.

Related reading: Dive into Kaggle with a powered-up kernel

In conclusion

It's the end of the year, so I wrote a playful article. I would appreciate it if you took it as a bit of fun. That's all.
