# NPTEL Data Science for Engineers Assignment 7 Answers 2023

#### By Brokenprogrammers

Mar 9, 2023

Hello Learners, in this post you will find the NPTEL Data Science for Engineers Assignment 7 (Week 7) answers for 2023. The answers are provided below as a reference for students; don’t look at the solutions straight away.


Note: First try to solve the questions by yourself. If you find any difficulty, then look for the solutions.

## NPTEL Data Science for Engineers Assignment 7 Answers 2023:

#### Q.1. Which among the following is not a type of cross-validation technique?

• LOOCV
• k-fold cross-validation
• Validation set approach
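
To illustrate how the techniques named in Q.1 relate to one another, here is a minimal base-R sketch of k-fold cross-validation. The dataset (`mtcars`), the model (`mpg ~ wt`), the seed, and the choice `k = 5` are assumptions made for illustration; none of them come from the assignment.

```r
# Minimal k-fold cross-validation sketch using base R only.
# Dataset, model, seed and k are illustrative assumptions.
set.seed(42)
data(mtcars)

k <- 5
n <- nrow(mtcars)

# Randomly assign every row to one of k folds of near-equal size.
folds <- sample(rep(1:k, length.out = n))

fold_mse <- numeric(k)
for (i in 1:k) {
  train <- mtcars[folds != i, ]   # fit on the other k - 1 folds
  test  <- mtcars[folds == i, ]   # evaluate on the held-out fold
  fit   <- lm(mpg ~ wt, data = train)
  pred  <- predict(fit, newdata = test)
  fold_mse[i] <- mean((test$mpg - pred)^2)
}

# The cross-validated error is the average over the k held-out folds.
cv_mse <- mean(fold_mse)
print(cv_mse)
```

Setting `k` equal to the number of rows turns this loop into LOOCV, while a single random train/test split corresponds to the validation set approach.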

#### Q.2. Which among the following is a classification problem?

• Predicting the average rainfall in a given month.
• Predicting whether a patient is diagnosed with a disease or not.
• Predicting the price of a house.
• Predicting whether it will rain or not tomorrow.

Consider the following confusion matrix for the classification of Hatchback and SUV:

• 0.95
• 0.55
• 0.45
• 0.88

• 0.95
• 0.55
• 1
• 0.88
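
The confusion matrix itself is not reproduced above, so the sketch below uses made-up counts purely to show how metrics such as accuracy, sensitivity, and specificity are commonly computed from a 2×2 confusion matrix in R. The counts, the row/column orientation, and the choice of "Hatchback" as the positive class are all assumptions.

```r
# Hypothetical counts (NOT the assignment's matrix).
# Rows = actual class, columns = predicted class.
cm <- matrix(c(50, 10,
                5, 35),
             nrow = 2, byrow = TRUE,
             dimnames = list(actual    = c("Hatchback", "SUV"),
                             predicted = c("Hatchback", "SUV")))

# Accuracy: correctly classified cases over all cases.
accuracy <- sum(diag(cm)) / sum(cm)

# Treating "Hatchback" as the positive class (an assumption):
# sensitivity = true positives / all actual positives
sensitivity <- cm["Hatchback", "Hatchback"] / sum(cm["Hatchback", ])
# specificity = true negatives / all actual negatives
specificity <- cm["SUV", "SUV"] / sum(cm["SUV", ])

print(c(accuracy = accuracy,
        sensitivity = sensitivity,
        specificity = specificity))
```

Substituting the counts from the actual matrix in the assignment gives the values the questions ask for.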

#### Q.5. Under the ‘family’ parameter of the glm() function, which one of the following distributions corresponds to logistic regression for a variable with binary output?

• Binomial
• Gaussian
• Gamma
• Poisson
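
As a small illustration of the `family` argument discussed in Q.5, the sketch below fits a logistic regression with `glm()`. The dataset (`mtcars`), the binary response `am`, the predictor `wt`, and the 0.5 cut-off are assumptions chosen for the example.

```r
# Logistic regression sketch with glm(); data and variables are
# illustrative assumptions, not part of the assignment.
data(mtcars)

# am is coded 0/1, so it can serve as a binary response.
# family = binomial uses the logit link by default.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probabilities that am == 1.
probs <- predict(fit, type = "response")

# Convert probabilities to class labels with a 0.5 threshold
# (the threshold is a choice, not a fixed rule).
pred_class <- ifelse(probs > 0.5, 1, 0)

# Confusion matrix of actual vs. predicted classes.
print(table(actual = mtcars$am, predicted = pred_class))
```

The other families listed (Gaussian, Gamma, Poisson) are also valid `family` values in `glm()`, but they model continuous or count responses rather than a binary outcome.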

Use the following information to answer Q6, Q7, Q8, Q9, and Q10:

Load the dataset iris.csv as a dataframe irisdata, with the first column as index headers, first row as column headers, dependent variable as factor variable, and answer the following questions.

The iris dataset contains four sepal and petal features (Sepal Length, Sepal Width, Petal Length, Petal Width, all in cm) for 50 samples from each of 3 species of the iris flower (Setosa, Versicolor, and Virginica).

#### Q.6. What is the dimension of the dataframe?

• (150, 5)
• (150, 4)
• (50, 5)
• None of the above
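
The loading step described above can be sketched as follows. Because the course's `iris.csv` is not available here, the example first writes R's built-in `iris` data to a temporary CSV file as a stand-in; that stand-in and its column names (for example `Petal.Length` rather than `PetalLength`) are assumptions and may differ from the course file.

```r
# Stand-in for the course's iris.csv (assumption): write the built-in
# iris data to a temporary file, with row names as the first column.
tmp <- tempfile(fileext = ".csv")
write.csv(iris, tmp)

# First row as column headers, first column as row names,
# and character columns (the species) read in as factors.
irisdata <- read.csv(tmp, header = TRUE, row.names = 1,
                     stringsAsFactors = TRUE)

# Dimension of the dataframe: number of rows, number of columns.
print(dim(irisdata))

# Quick look at the column types.
str(irisdata)
```

With the real file, replace `tmp` with the path to `iris.csv`.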

#### Q.7. What can you comment on the distribution of the independent variables in the dataframe?

• The variables Sepal Length and Sepal Width are not normally distributed
• All the variables are normally distributed
• The variable Petal Length alone is normally distributed
• None of the above
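
One possible way to examine a question like Q.7 is a Shapiro-Wilk normality test on each numeric column. This is only a sketch: the assignment does not say which method it expects, and the built-in `iris` data is used here as a stand-in for `irisdata`.

```r
# Shapiro-Wilk test per numeric column (one common normality check;
# the choice of test is an assumption, not the assignment's method).
pvals <- sapply(iris[, 1:4], function(x) shapiro.test(x)$p.value)

# Small p-values (e.g. below 0.05) suggest departure from normality.
print(round(pvals, 4))
```

Histograms or normal Q-Q plots (`hist()`, `qqnorm()`) are a useful visual complement to the test.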

• 10
• 5
• 25
• 0

#### Q.9. Which of the following code blocks can be used to summarize the data (finding the mean of the columns PetalLength and PetalWidth), similar to the one given below.

• lapply(irisdata[, 3:4], mean)
• sapply(irisdata[, 3:4], 2, mean)
• apply(irisdata[, 3:4], 2, mean)
• apply(irisdata[, 3:4], 1, mean)
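
To see how the calls in Q.9 differ, the sketch below runs each of them on columns 3 and 4 of the built-in `iris` data, used here as a stand-in for `irisdata` (an assumption; column names in the course file may differ).

```r
cols <- iris[, 3:4]   # Petal.Length and Petal.Width in the built-in data

# lapply: applies mean to each column, returns a list.
list_means <- lapply(cols, mean)

# apply with MARGIN = 2: applies mean over columns, returns a named vector.
col_means <- apply(cols, 2, mean)

# apply with MARGIN = 1: applies mean over rows, one value per flower.
row_means <- apply(cols, 1, mean)

# sapply has no MARGIN argument, so the 2 is taken as FUN and the
# call fails; tryCatch captures the error message instead of stopping.
bad_call <- tryCatch(sapply(cols, 2, mean),
                     error = function(e) conditionMessage(e))

print(list_means)
print(col_means)
print(length(row_means))
print(bad_call)
```

`colMeans(cols)` is a shorter equivalent of `apply(cols, 2, mean)` for numeric columns.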

#### Q.10. What can be interpreted from the plot shown below?

• Sepal widths of Versicolor flowers are less than 3 cm.
• Sepal lengths of Setosa flowers are less than 6 cm.
• Sepal lengths of Virginica flowers are greater than 6 cm.
• Sepals of Setosa flowers are relatively wider than those of Versicolor flowers.
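
The plot for Q.10 is not reproduced here. As a rough way to check statements of this kind numerically, the sketch below computes the per-species range of sepal length and sepal width from the built-in `iris` data, which is assumed to match the course file.

```r
# Minimum and maximum per species (rows: min, max; columns: species).
sepal_length_range <- sapply(split(iris$Sepal.Length, iris$Species), range)
sepal_width_range  <- sapply(split(iris$Sepal.Width,  iris$Species), range)

print(sepal_length_range)
print(sepal_width_range)

# A scatter plot coloured by species gives a comparable visual summary.
plot(iris$Sepal.Length, iris$Sepal.Width,
     col = as.integer(iris$Species),
     xlab = "Sepal length (cm)", ylab = "Sepal width (cm)")
legend("topright", legend = levels(iris$Species),
       col = 1:3, pch = 1)
```
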

Disclaimer: These answers are provided by us for discussion purposes only; if any answer turns out to be wrong, please don’t blame us. If you have any doubts or suggestions regarding a question, kindly comment. The solutions are provided by Brokenprogrammers. This tutorial is for discussion and learning purposes only.

#### About NPTEL Data Science for Engineers Course:

Learning Objectives:

1. Introduce R as a programming language
2. Introduce the mathematical foundations required for data science
3. Introduce the first level data science algorithms
4. Introduce a data analytics problem solving framework
5. Introduce a practical capstone case study

Learning Outcomes:

1. Describe a flow process for data science problems (Remembering)
2. Classify data science problems into standard typology (Comprehension)
3. Develop R codes for data science solutions (Application)
4. Correlate results to the solution approach followed (Analysis)
5. Assess the solution approach (Evaluation)
6. Construct use cases to validate approach and identify modifications required (Creating)

##### Course Layout:

• Week 1: Course philosophy and introduction to R
• Week 2: Linear algebra for data science
    1. Algebraic view – vectors, matrices, product of matrix & vector, rank, null space, solution of over-determined set of equations and pseudo-inverse
    2. Geometric view – vectors, distance, projections, eigenvalue decomposition
• Week 3: Statistics (descriptive statistics, notion of probability, distributions, mean, variance, covariance, covariance matrix, understanding univariate and multivariate normal distributions, introduction to hypothesis testing, confidence interval for estimates)
• Week 4: Optimization
• Week 5:
    1. Optimization
    2. Typology of data science problems and a solution framework
• Week 6:
    1. Simple linear regression and verifying assumptions used in linear regression
    2. Multivariate linear regression, model assessment, assessing importance of different variables, subset selection
• Week 7: Classification using logistic regression
• Week 8: Classification using kNN and k-means clustering

###### CRITERIA TO GET A CERTIFICATE:

Average assignment score = 25% of average of best 8 assignments out of the total 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >=10/25 AND EXAM SCORE >= 30/75. If one of the 2 criteria is not met, you will not get the certificate even if the Final score >= 40/100.
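
The scoring rule above can be sketched as a small R function. The function name `certificate_result`, its arguments, and the assumption that every score is a percentage out of 100 with at least 8 assignment scores supplied are illustrative and not part of any official NPTEL tool.

```r
# Hypothetical helper: applies the certificate criteria stated above.
# Assumes scores are out of 100 and at least 8 assignment scores exist.
certificate_result <- function(assignment_scores, exam_score) {
  # Average of the best 8 assignments, scaled to 25 marks.
  best8 <- sort(assignment_scores, decreasing = TRUE)[1:8]
  assignment_part <- 0.25 * mean(best8)

  # Proctored exam score, scaled to 75 marks.
  exam_part <- 0.75 * exam_score

  final_score <- assignment_part + exam_part

  # Both thresholds must be met, regardless of the final score.
  eligible <- assignment_part >= 10 && exam_part >= 30

  list(assignment_part = assignment_part,
       exam_part = exam_part,
       final_score = final_score,
       eligible = eligible)
}

# Example 1: strong assignments, exam score of 50.
r1 <- certificate_result(c(80, 90, 70, 60, 100, 85, 75, 95, 50, 40, 0, 65), 50)
print(r1)

# Example 2: final score is above 40, but the assignment part is below 10.
r2 <- certificate_result(rep(30, 12), 60)
print(r2)
```

The second example shows the case described above: a final score of at least 40 still does not earn the certificate when one of the two criteria is missed.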

If you have not registered for the exam, kindly register through https://examform.nptel.ac.in/