410250 Machine Learning Questions and Answers - Pune University End Semester Examination, Previous Year Questions with Answers
Exam | B.E. Degree Semester Examinations
Academic Year | April 2019
Subject Code | 410250
Subject Name | Machine Learning
Branch | Computer Engineering
Semester | Semester II
Regulation | 2008
B.E DEGREE SEMESTER EXAMINATIONS, APR 2019
Computer Engineering
Semester II
410250 – Machine Learning
(Pattern 2015)
Time: 2½ Hours | Answer ALL Questions | Max. Marks: 70
Q1) a) With reference to machine learning, explain the concept of adaptive machines. [6]
b) Explain the role of machine learning algorithms in the following applications. [6]
a) Spam filtering.
b) Natural Language Processing.
c) Explain the role of machine learning in the following common unsupervised learning problems: [8]
a) Object segmentation
b) Similarity detection
OR
Q2) a) Explain data formats for a supervised learning problem with an example. [6]
b) What is categorical data? What is its significance in classification problems? [6]
Answer:
Categorical variables represent types of data that may be divided into groups; examples include race, sex, age group, and educational level. Categorical data matters in classification problems for two reasons: the target of a classification problem is itself a categorical variable (the class label), and categorical input features must be encoded as numbers (e.g., one-hot or label encoding) before most learning algorithms can use them.
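A minimal sketch of how categorical features are typically encoded before classification, using scikit-learn (the "color" column and its values are made up for this example):

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Illustrative data: 'color' is a categorical feature.
df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# One-hot encoding turns each category into its own binary column.
# (On scikit-learn < 1.2, use sparse=False instead of sparse_output=False.)
encoder = OneHotEncoder(sparse_output=False)
encoded = encoder.fit_transform(df[["color"]])

print(encoder.get_feature_names_out())  # ['color_blue' 'color_green' 'color_red']
print(encoded)
```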
c) Explain the Lasso and ElasticNet types of regression. [8]
Answer:
Lasso regression is a type of linear regression that uses shrinkage. Shrinkage is where data values are shrunk towards a central point, such as the mean. The lasso procedure encourages simple, sparse models (i.e., models with fewer parameters). This type of regression is well suited to models showing high levels of multicollinearity, or when you want to automate parts of model selection, such as variable selection/parameter elimination. The acronym "LASSO" stands for Least Absolute Shrinkage and Selection Operator.
ElasticNet combines the L1 penalty of the Lasso with the L2 penalty of ridge regression, making it useful when several features are correlated: the Lasso alone tends to pick just one feature from a correlated group, while ElasticNet shares weight among them.
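A minimal sketch of both regressors in scikit-learn on synthetic data (the alpha and l1_ratio values are arbitrary choices for illustration):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic regression data for illustration.
X, y = make_regression(n_samples=100, n_features=10, noise=0.5, random_state=0)

# Lasso: pure L1 penalty; shrinks some coefficients exactly to zero.
lasso = Lasso(alpha=0.1).fit(X, y)

# ElasticNet: mixes L1 and L2 penalties; l1_ratio sets the balance.
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

print("Lasso zeroed coefficients:", (lasso.coef_ == 0).sum())
print("ElasticNet zeroed coefficients:", (enet.coef_ == 0).sum())
```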
Q3) a) What problems are faced by SVM when used with real datasets? [3]
Answer:
SVM does not perform well when the dataset is noisy, especially when the target classes overlap, because a clean separating margin is then hard to find. Training an SVM can also be slow on large real-world datasets, since training time grows quickly with the number of samples.
b) Explain the non-linear SVM with example. [5]
c) Write short notes on: [9]
i) Bernoulli naive Bayes.
Answer:
Bernoulli Naive Bayes is a variant of Naive Bayes used for discrete data; it models each feature with a Bernoulli distribution. Its defining characteristic is that it accepts features only as binary values, such as true or false, yes or no, success or failure, 0 or 1, and so on. So when the feature values are binary, we know we should use the Bernoulli Naive Bayes classifier.
ii) Multinomial naive Bayes.
Answer:
It is mostly used for document classification problems, i.e., whether a document belongs to the category of sports, politics, technology, etc. The features/predictors used by the classifier are the frequencies of the words present in the document.
iii) Gaussian naive Bayes.
Answer:
In Gaussian Naive Bayes, continuous values associated with each feature are assumed to be distributed according to a Gaussian (normal) distribution. When plotted, this gives a bell-shaped curve that is symmetric about the mean of the feature values.
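A minimal sketch contrasting the three variants with scikit-learn (the toy arrays are illustrative: binary features for Bernoulli, counts for Multinomial, continuous values for Gaussian):

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

y = np.array([0, 0, 1, 1])  # two classes

# Bernoulli NB: binary features (e.g., word present / absent).
X_bin = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 1]])
print(BernoulliNB().fit(X_bin, y).predict([[1, 0, 0]]))

# Multinomial NB: count features (e.g., word frequencies in a document).
X_cnt = np.array([[3, 0, 1], [2, 1, 0], [0, 4, 2], [1, 2, 3]])
print(MultinomialNB().fit(X_cnt, y).predict([[0, 3, 1]]))

# Gaussian NB: continuous features assumed normal within each class.
X_real = np.array([[1.0, 2.1], [0.9, 1.8], [3.2, 4.0], [3.0, 4.2]])
print(GaussianNB().fit(X_real, y).predict([[3.1, 4.1]]))
```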
OR
Q4) a) Define Bayes' theorem. Elaborate on the working of the Naive Bayes classifier with an example. [8]
b) What are Linear support vector machines? Explain with example. [4]
c) Explain with an example the Support Vector Regression variant of SVM. [5]
Q5) a) Explain the structure of binary decision tree for a sequential decision process. [8]
b) With reference to clustering, explain the issue of "optimization of clusters". [5]
c) Explain Evaluation methods for clustering algorithms. [4]
Clustering results are commonly evaluated with internal metrics, which use only the data itself (e.g., the silhouette coefficient), or with external metrics, which compare cluster assignments against ground-truth labels (e.g., the adjusted Rand index); see the sketch after this list. When labels are available, general evaluation metrics can also be applied:
- Classification Accuracy
- Logarithmic Loss
- Confusion Matrix
- Area under Curve
- F1 Score
- Mean Absolute Error
- Mean Squared Error
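A minimal sketch of evaluating a clustering result with one internal and one external metric in scikit-learn (the blob data is synthetic):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Synthetic data with known ground-truth labels.
X, y_true = make_blobs(n_samples=300, centers=3, random_state=0)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Internal metric: needs no ground truth.
print("Silhouette coefficient:", silhouette_score(X, labels))

# External metric: compares cluster assignments against true labels.
print("Adjusted Rand index:", adjusted_rand_score(y_true, labels))
```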
OR
Q6) a) With reference to Meta Classifiers, explain the concepts of Weak and eager learner. [8]
Answer:
Weak learner: A weak learner is a learner that, no matter what the distribution over the training data is, will always do better than chance when it tries to label the data. Doing better than chance means the error rate is always less than 1/2.
Eager learner: Eager learners construct a classification model from the given training data before receiving new data to classify. An eager learner must commit to a single hypothesis that covers the entire instance space. Because of this model construction, eager learners take a long time to train but little time to predict.
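For contrast, a minimal sketch of an eager learner (a decision tree, which builds its model during fit) versus a lazy learner (k-nearest neighbours, which defers almost all work to prediction time); the dataset is synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Eager learner: constructs the full model (the tree) in fit().
eager = DecisionTreeClassifier().fit(X, y)

# Lazy learner: fit() essentially just stores the training data;
# the real work happens when predict() is called.
lazy = KNeighborsClassifier(n_neighbors=5).fit(X, y)

print("Eager:", eager.predict(X[:3]))
print("Lazy: ", lazy.predict(X[:3]))
```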
b) Write short notes on: [9]
a) AdaBoost.
Answer:
AdaBoost is an ensemble method that trains and deploys trees in series. AdaBoost implements boosting, wherein a set of weak classifiers is connected in series such that each weak classifier tries to improve the classification of samples that were misclassified by the previous weak classifier. In doing so, boosting combines weak classifiers in series to create a strong classifier. The decision trees used in boosting methods are called "stumps" because each tree tends to be a shallow model that does not overfit but can be biased. An individual tree is trained to pay specific attention to the weaknesses of the previous tree only: the weights of samples misclassified by the previous tree are boosted so that the subsequent tree focuses on classifying them correctly. Classification accuracy increases as more weak classifiers are added in series; however, too many may lead to severe overfitting and a drop in generalization capability. AdaBoost is suited to imbalanced datasets but underperforms in the presence of noise.
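A minimal sketch of AdaBoost in scikit-learn; by default each weak learner is a depth-1 decision tree (a "stump"), and the number of estimators here is an arbitrary illustrative choice:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 100 stumps trained in series; each reweights the samples the
# previous one misclassified.
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("Test accuracy:", clf.score(X_te, y_te))
```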
c) Voting Classifier.
Q7) a) With reference to Hierarchical Clustering, explain the issue of connectivity constraints. [8]
b) What are the building blocks of deep networks? Elaborate. [8]
Answer:
Here are some of the basic building blocks of deep networks:
- Parameters: the weights and biases learned during training.
- Layers: fully connected, convolutional, recurrent, etc.
- Activation functions: linear, ReLU, sigmoid, tanh, etc.
- A loss function and an optimizer (e.g., gradient descent) that drive learning.
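A minimal sketch of the listed activation functions, written directly from their definitions in NumPy:

```python
import numpy as np

def relu(x):
    # max(0, x), applied element-wise
    return np.maximum(0, x)

def sigmoid(x):
    # squashes inputs into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # squashes inputs into (-1, 1), symmetric about 0
    return np.tanh(x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("ReLU:   ", relu(x))
print("Sigmoid:", sigmoid(x))
print("Tanh:   ", tanh(x))
```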
OR
Q8) a) With reference to deep learning, explain the concept of deep architectures. [8]
b) Justify with elaboration the following statement: [8]
The k-means algorithm is based on a strong initial condition: the number of clusters is decided through the assignment of 'k' initial centroids or means.
Answer:
The reason k-means gives different clustering results with different initial seeds is that k-means tries to optimize (minimize) its cost function. A bad initial choice may cause it to get trapped in a local minimum, so different initial choices may end up in different local minima, yielding different clustering results. This makes the clustering hard to reproduce. Studies have shown that clustering results improve when the initial cluster centers are close to the final cluster centers.
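A minimal sketch demonstrating this seed sensitivity with scikit-learn's KMeans (synthetic data; inertia is the cost function k-means minimizes):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# A single random initialization can land in a poor local minimum,
# so different seeds can end with different final inertia (cost).
for seed in range(3):
    km = KMeans(n_clusters=4, init="random", n_init=1, random_state=seed).fit(X)
    print(f"seed={seed}  inertia={km.inertia_:.1f}")

# k-means++ seeding plus multiple restarts mitigates the problem.
best = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=0).fit(X)
print(f"k-means++ inertia: {best.inertia_:.1f}")
```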
***********