**R** and **Python** are the most important languages for *Data Science*. You need to learn at least one of them well. In this post, I’ll tell you which one to choose based on your experience and interests.

**Most people say that:**

If you have some programming experience, Python might be the language for you. Python’s syntax is more similar to other languages than R’s syntax is.

**But, But**…

### If you want to be a **Data Scientist**, most people working in this field will tell you to choose R instead of Python, because in real life a data scientist has to clean data, visualize it, and do some clustering for the company. Here R plays a great role. As I said earlier, R is known for its ability to solve statistical problems, so it will be easier for you to analyze the data and work with it.

## You have to keep yourself **up to date** with the current books and software on **Data Science**, because you never know how or when a problem will arise, and you will have to solve it. Data scientists also study a lot.

**Let’s talk about the Data Science Packages:**

**Python’s Packages –**

##### NumPy

**NumPy** *introduces objects for multidimensional arrays and matrices, as well as routines that allow developers to perform advanced mathematical and statistical functions on those arrays with as little code as possible.*
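To make the "as little code as possible" point concrete, here is a minimal sketch. The sales figures and variable names are made up for illustration; only the NumPy calls themselves are real.

```python
import numpy as np

# Hypothetical data: daily sales for 3 stores (rows) over 4 days (columns).
sales = np.array([
    [12.0, 15.0, 11.0, 14.0],
    [8.0, 9.0, 10.0, 7.0],
    [20.0, 18.0, 22.0, 24.0],
])

# Statistics along an axis need one call each, with no explicit loops.
store_means = sales.mean(axis=1)  # average per store (one value per row)
day_totals = sales.sum(axis=0)    # total per day (one value per column)

# Broadcasting: subtract each store's mean from every value in its row.
centered = sales - store_means[:, np.newaxis]

print(store_means)
print(day_totals)
```

Each statistic is a single line, which is roughly what people mean when they say NumPy keeps numerical code short.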

##### SciPy

**SciPy** *builds on NumPy by adding a collection of algorithms and high-level commands for manipulating and visualizing data. This package includes functions for computing integrals numerically, solving differential equations, optimization, and more.*
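As a rough illustration of the numerical integration and optimization mentioned above, the sketch below uses `scipy.integrate.quad` and `scipy.optimize.minimize_scalar`. The two functions being integrated and minimized are arbitrary examples chosen for this post.

```python
from scipy import integrate, optimize

# Numerically integrate x^2 from 0 to 3 (the exact answer is 9).
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 3.0)

# Find the x that minimizes (x - 2)^2 + 1 (the exact minimum is at x = 2).
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

print(area)
print(result.x)
```

`quad` returns both the estimate and an error bound, so you can check how much to trust the number.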

##### Pandas

**Pandas** *adds data structures and tools that are designed for practical data analysis in finance, statistics, social sciences, and engineering. Pandas works well with incomplete, messy, and unlabeled data, and provides tools for shaping, merging, reshaping, and slicing datasets.*
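Here is a small sketch of what working with incomplete data and merging datasets can look like. The `orders` and `customers` tables are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical order data with some missing amounts.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "amount": [100.0, None, 50.0, 75.0, None],
})

# Hypothetical lookup table mapping customers to regions.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["north", "south", "north"],
})

# Drop rows where the amount is missing.
clean = orders.dropna(subset=["amount"])

# Attach the region to each remaining order, then total the amounts per region.
merged = clean.merge(customers, on="customer_id", how="left")
totals = merged.groupby("region")["amount"].sum()

print(totals)
```

Dropping the incomplete rows is only one option. Depending on the problem, filling them with `fillna` may be a better choice.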

##### IPython

**IPython** *extends the functionality of Python’s interactive interpreter with a souped-up interactive shell that adds introspection, rich media, shell syntax, tab completion, and command history retrieval.*

##### Matplotlib

**Matplotlib** *is the standard Python library for creating 2D plots and graphs. It’s pretty low-level, meaning it requires more commands to generate nice-looking graphs and figures than some more advanced libraries do.*
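The "more commands" point is easier to see in code. In this minimal sketch the data, labels, and title are made up, and the `Agg` backend is selected only so the example runs without a display. Even a basic labeled line chart takes several explicit calls.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [value ** 2 for value in x]

# Each visual element is set with its own call.
fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple line plot")
ax.legend()

# Render the figure as PNG bytes in memory instead of writing a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```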

##### Scrapy

**Scrapy** *is an aptly named library for creating spider bots to systematically crawl the web and extract structured data like prices, contact info, and URLs. Originally designed for web scraping, Scrapy can also extract data from APIs.*

##### NLTK

**NLTK** *is a set of libraries designed for Natural Language Processing (NLP). NLTK’s basic functions allow you to tag text, identify named entities, and display parse trees, which are like sentence diagrams that reveal parts of speech and dependencies.*

##### Pattern

**Pattern** *combines the functionality of Scrapy and NLTK in a massive library designed to serve as an out-of-the-box solution for web mining, NLP, machine learning, and network analysis.*

##### Seaborn

**Seaborn** *is a popular visualization library that builds on matplotlib’s foundation. The first thing you’ll notice about Seaborn is that its default styles are much more sophisticated than matplotlib’s.*

##### Bokeh

Bokeh makes interactive, zoomable plots in modern web browsers using JavaScript widgets. Another nice feature of Bokeh is that it comes with three levels of interface, from high-level abstractions that allow you to quickly generate complex plots, to a low-level view that offers maximum flexibility to app developers.

##### Basemap

**Basemap** *adds support for simple maps to matplotlib by taking matplotlib’s coordinates and applying them to more than 25 different projections.*

##### NetworkX

**NetworkX** *allows you to create and analyze graphs and networks. It’s designed to work with both standard and nonstandard data formats, which makes it especially efficient and scalable.*
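As a quick sketch of creating and analyzing a graph, the example below builds a tiny, made-up friendship network and asks two simple questions about it.

```python
import networkx as nx

# Hypothetical friendship network: each pair is an undirected edge.
graph = nx.Graph()
graph.add_edges_from([
    ("Ann", "Bob"),
    ("Bob", "Cara"),
    ("Cara", "Dan"),
    ("Bob", "Dan"),
])

# Shortest chain of friends connecting Ann to Dan.
path = nx.shortest_path(graph, "Ann", "Dan")

# Number of direct connections each person has.
degrees = dict(graph.degree())

print(path)
print(degrees)
```

Nodes can be any hashable Python object, so plain strings are enough for a small example like this one.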

**Read More –** *Python and R – Best one for Machine Learning?*

**R’s Packages –**

##### sqldf

**sqldf** *is used to select from data frames using SQL.*

##### forecast

**forecast** *is used for easy forecasting of time series.*

##### plyr

**plyr** *is used for data aggregation.*

##### stringr

**stringr** *is used for string manipulation.*

##### RPostgreSQL, RMySQL, RMongo, RODBC, RSQLite

*These are database connection packages.*

##### lubridate

**lubridate** *is used for time and date manipulation.*

##### ggplot2

**ggplot2** *is used for data visualization.*

##### qcc

**qcc** *is used for statistical quality control and QC charts.*

##### reshape2

**reshape2** *is used for data restructuring.*

##### randomForest

**randomForest** *is used to build random forest predictive models.*