Data Science Interview Questions and Answers for Graduates Part-1
1. What do you mean by the term Data Science?
Data Science is the extraction of knowledge from large volumes of structured or unstructured data. It is a continuation of the fields of data mining and predictive analytics, and is also known as knowledge discovery and data mining.
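2. Explain the term botnet.
A botnet is a network of compromised computers (bots) that an attacker controls remotely. The machines are typically infected through malware such as a Trojan, and the bots have traditionally received their commands over an IRC network.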
3. What is Data Visualization?
Data visualization is a common term that describes any effort to help people understand the significance of data by placing it in a visual context.
4. How can you define data cleaning as a critical part of the process?
Cleaning up data to the point where you can work with it is a huge amount of work. If we’re trying to reconcile many sources of data that we don’t control, such as flight data, it can take 80% of our time.
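As a rough illustration of why reconciliation is laborious, the sketch below merges records from two sources that format the same fields differently. The field names, sample rows, and date formats are all assumptions made up for this example; real feeds would need many more rules.

```python
from datetime import datetime

# Two hypothetical sources describing the same flights in different formats.
source_a = [
    {"flight": "AA 100", "date": "2024-01-05", "price": "$320.00"},
    {"flight": "BA 212", "date": "2024-01-06", "price": "$540.50"},
]
source_b = [
    {"flight": "aa100", "date": "05/01/2024", "price": "320"},
    {"flight": "dl-45", "date": "07/01/2024", "price": ""},  # missing price
]

def clean(record, date_format):
    """Normalize one record to a common shape; None marks a missing price."""
    flight = "".join(ch for ch in record["flight"] if ch.isalnum()).upper()
    date = datetime.strptime(record["date"], date_format).date().isoformat()
    raw_price = record["price"].replace("$", "").strip()
    price = float(raw_price) if raw_price else None
    return {"flight": flight, "date": date, "price": price}

def reconcile(*cleaned_sources):
    """Merge cleaned records, keeping the first row seen per (flight, date)."""
    merged = {}
    for rows in cleaned_sources:
        for row in rows:
            merged.setdefault((row["flight"], row["date"]), row)
    return list(merged.values())

cleaned_a = [clean(r, "%Y-%m-%d") for r in source_a]
cleaned_b = [clean(r, "%d/%m/%Y") for r in source_b]  # assumed day/month/year
flights = reconcile(cleaned_a, cleaned_b)

for row in flights:
    print(row)
```

Even in this toy case, each source needs its own parsing rule, and the duplicate flight is detected only after both records are normalized to the same key.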
5. Point out 7 ways in which Data Scientists use Statistics.
1. Design and interpret experiments to inform product decisions.
2. Build models that predict signal, not noise.
3. Turn big data into the big picture.
4. Understand user retention, engagement, conversion, and leads.
5. Give your users what they want.
6. Estimate intelligently.
7. Tell the story with the data.
6. Differentiate between Data modeling and Database design.
Data Modeling – Data modeling (or modeling) in software engineering is the process of creating a data model for an information system by applying formal data modeling techniques.
Database Design – Database design is the process of producing a detailed data model of a database. The term database design can be used to describe many different parts of the design of an overall database system.
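To make the distinction concrete, the following sketch pairs a small conceptual data model with one possible database design for it. The `Customer` and `Order` entities, the table and column names, and the choice of SQLite are all assumptions chosen for illustration.

```python
import sqlite3
from dataclasses import dataclass

# Data modeling: describe the entities and how they relate,
# independent of any particular database product.
@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer_id: int  # each order belongs to exactly one customer
    total: float

# Database design: turn the model into concrete tables, column types,
# keys, constraints, and indexes for a specific system (SQLite here).
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL CHECK (total >= 0)
);
CREATE INDEX idx_order_customer ON customer_order(customer_id);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

customer = Customer(1, "Ada")
order = Order(10, customer.customer_id, 99.5)
conn.execute("INSERT INTO customer VALUES (?, ?)",
             (customer.customer_id, customer.name))
conn.execute("INSERT INTO customer_order VALUES (?, ?, ?)",
             (order.order_id, order.customer_id, order.total))

rows = conn.execute(
    "SELECT c.name, o.total FROM customer c "
    "JOIN customer_order o ON o.customer_id = c.customer_id"
).fetchall()
print(rows)
```

The dataclasses capture what the data means, while the schema adds the storage decisions (types, keys, constraints, an index) that belong to database design.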
7. Describe in brief the Data Science process flowchart.
1. Data is collected from sensors in the environment.
2. Data is “cleaned” or otherwise processed to produce a data set (typically a data table) usable for further processing.
3. Exploratory data analysis and statistical modeling may be performed.
4. A data product is built. This is a program, such as the ones retailers use to inform new purchases based on purchase history. It may also create data and feed it back into the environment.
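The four steps above can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: the purchase records, the function names, and the "most popular item not yet bought" rule are hypothetical and stand in for whatever a real retailer would use.

```python
from collections import Counter
from statistics import mean

def collect():
    """Step 1: raw data arrives from the environment (hard-coded here)."""
    return [
        {"user": "u1", "item": "tea", "price": "4.50"},
        {"user": "u1", "item": "mug", "price": "9.00"},
        {"user": "u2", "item": "tea", "price": "4.50"},
        {"user": "u2", "item": "mug", "price": None},  # unusable row
        {"user": "u3", "item": "tea", "price": "4.50"},
    ]

def clean(raw):
    """Step 2: drop unusable rows and convert types to get a data table."""
    return [{**r, "price": float(r["price"])}
            for r in raw if r["price"] is not None]

def explore(table):
    """Step 3: simple exploratory summaries of the cleaned table."""
    return {
        "rows": len(table),
        "mean_price": mean(r["price"] for r in table),
        "item_counts": Counter(r["item"] for r in table),
    }

def recommend(table, user):
    """Step 4: a toy data product that suggests the most popular item
    the user has not bought yet (None if there is nothing to suggest)."""
    owned = {r["item"] for r in table if r["user"] == user}
    popularity = Counter(r["item"] for r in table if r["item"] not in owned)
    return popularity.most_common(1)[0][0] if popularity else None

table = clean(collect())
summary = explore(table)
suggestion = recommend(table, "u3")
print(summary["rows"], summary["mean_price"], suggestion)
```

A purchase made in response to the suggestion would become new raw data, which is the feedback loop into the environment that the flowchart describes.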
8. What do you understand by the term hash table collisions?
A hash table (hash map) is a data structure used to implement an associative array, a structure that can map keys to values. Ideally, the hash function assigns each key to a unique bucket, but sometimes two different keys generate an identical hash, causing both keys to point to the same bucket. This is known as a hash collision.
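A minimal sketch of a collision, and of one common way to handle it (separate chaining, where each bucket holds a list of entries), is shown below. The class name and the deliberately weak hash function are assumptions for demonstration only; Python's built-in `dict` uses a different scheme.

```python
class ChainedHashTable:
    """Tiny hash table that resolves collisions by separate chaining."""

    def __init__(self, num_buckets=4):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # Deliberately weak hash (sum of character codes) so that
        # collisions are easy to trigger in a demonstration.
        return sum(ord(ch) for ch in key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))  # new key, possibly a collision

    def get(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)

table = ChainedHashTable()
table.put("ab", 1)
table.put("ba", 2)  # same characters, so the same character-code sum as "ab"

same_bucket = table._index("ab") == table._index("ba")
print(same_bucket, table.get("ab"), table.get("ba"))
```

Both keys land in the same bucket, yet each lookup still returns the right value because the bucket stores the full key alongside the value and compares keys on retrieval.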
9. Compare and contrast R and SAS.
SAS is commercial software, whereas R is free, open-source software that can be downloaded by anyone.
SAS is easy to learn and provides an easy option for people who already know SQL, whereas R requires more hands-on programming, so simple procedures can take longer code.
10. What do you understand by the letter ‘R’?
R is a language and environment for statistical computing and graphics. It is a GNU project similar to the S language and environment, which was developed at Bell Laboratories.