Showing posts with label Pune University.

Monday, 3 May 2021

410253DC Big Data and Data Analytics Pune University question answers April 2019

410253DC Big Data and Data Analytics Pune University question answers April 2019, Big data and data analytics course exam question paper with answers

 

Exam: B.E DEGREE SEMESTER EXAMINATIONS

Academic Year: April 2019

Subject Code: 410253DC

Subject Name: Big Data and Data Analytics

Branch: Computer Engineering

Semester: Semester II

Regulation: 2015

 

B.E DEGREE SEMESTER EXAMINATIONS, APR 2019

Computer Engineering

Semester II

410253DC – Big Data and Data Analytics

(Pattern 2015)

Time: 2 and a half hours                  Answer ALL Questions                Max. Marks: 70

 

Q1) a) Explain, with the given dataset, how a Decision Support System will help a laptop shop predict whether a customer will buy a laptop or not. [5]



b) Differentiate Operational data and Informational data. [6]

c) Explain following phases of data Analytics lifecycle with example. [6]

i) Data Discovery

ii) Model Building

OR

Q2) a) Explain Hadoop Eco system with diagram. [8]

b) Smooth the following data set using binning: 3, 12, 1, 7, 8, 5. [6]

c) Justify Snow-Flake schema is better than Star schema. [6]

 

Q3) a) What is linear regression? Explain with Example. [8]

b) What is the significance of Support Vector Machine Classifier Model with example. [5]

c) Differentiate between supervised and unsupervised learning. [4]

Answer:

The main distinction between the two approaches is the use of labeled datasets. To put it simply, supervised learning uses labeled input and output data, while an unsupervised learning algorithm does not.

In supervised learning, the algorithm “learns” from the training dataset by iteratively making predictions on the data and adjusting for the correct answer. While supervised learning models tend to be more accurate than unsupervised learning models, they require upfront human intervention to label the data appropriately. For example, a supervised learning model can predict how long your commute will be based on the time of day, weather conditions and so on. But first, you’ll have to train it to know that rainy weather extends the driving time.

Unsupervised learning models, in contrast, work on their own to discover the inherent structure of unlabeled data. Note that they still require some human intervention for validating output variables. For example, an unsupervised learning model can identify that online shoppers often purchase groups of products at the same time. However, a data analyst would need to validate that it makes sense for a recommendation engine to group baby clothes with an order of diapers, applesauce and sippy cups.
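The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rainfall and commute numbers and the shopping baskets are made up, a one-feature least-squares line stands in for a supervised model, and simple pair counting stands in for unsupervised pattern discovery.

```python
from collections import Counter
from itertools import combinations

# --- Supervised: labeled pairs (rainfall in mm -> commute minutes). ---
# Hypothetical, made-up numbers chosen only to illustrate the idea.
rain_mm = [0.0, 1.0, 2.0, 3.0]
commute_min = [20.0, 25.0, 30.0, 35.0]  # the labels (known correct answers)


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(rain_mm, commute_min)
predicted = slope * 4.0 + intercept  # commute estimate for unseen rainfall

# --- Unsupervised: no labels, only raw baskets (also hypothetical). ---
baskets = [
    {"diapers", "applesauce", "sippy cups"},
    {"diapers", "baby clothes"},
    {"diapers", "baby clothes", "applesauce"},
    {"baby clothes", "diapers"},
    {"coffee", "filters"},
]


def pair_counts(transactions):
    """Count how often each pair of products appears in the same basket."""
    counts = Counter()
    for basket in transactions:
        counts.update(combinations(sorted(basket), 2))
    return counts


top_pair, top_count = pair_counts(baskets).most_common(1)[0]
print(predicted, top_pair, top_count)
```

The supervised half needs the `commute_min` labels before it can learn anything. The unsupervised half finds the most frequent product pair from the baskets alone, and an analyst would still have to judge whether that grouping is useful.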

 

OR

Q4) a) What is logistic regression? Explain with example. [8]

Answer:

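Logistic regression is a predictive algorithm that, like linear regression, uses independent variables to predict a dependent variable. The difference is that the dependent variable must be categorical, for example pass/fail or buy/not buy. The model outputs a probability between 0 and 1, which is compared with a threshold (commonly 0.5) to choose the class.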
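As an example, the following minimal sketch fits a one-feature logistic regression by gradient descent. The hours-studied data, the learning rate, the epoch count, and the function names are all assumptions made for illustration, not part of the exam paper.

```python
import math


def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: hours studied -> 1 (pass) or 0 (fail).
hours = [1, 2, 3, 4, 5, 6]
passed = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(hours, passed)


def predict_pass(h, threshold=0.5):
    """Return True when the predicted pass probability reaches the threshold."""
    return sigmoid(w * h + b) >= threshold


print(predict_pass(1), predict_pass(6))
```

The categorical outcome is what separates this from linear regression: the fitted line `w * h + b` is passed through the sigmoid so that the output can be read as a probability.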

 

b) Explain with suitable example to predict whether a student will pass or not using Support vector machine. [5]

c) What is time series analysis? Give an example. [4]

 

Q5) a) A database has 6 transactions. Let minimum support = 60% and minimum confidence = 70%. Find all frequent itemsets and association rules using the Apriori algorithm. [8]

Transaction ID    Toys Bought
T1                {A, B, C, E, F}
T2                {A, C, D, E}
T3                {B, C, E, F}
T4                {A, C, D, E}
T5                {C, D, E, F}
T6                {A, D, E}
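One way to check a hand-worked answer to Q5 a) is a small Apriori sketch in Python. The function names are illustrative assumptions. The transactions and thresholds are the ones given in the question.

```python
from itertools import combinations

# Transactions exactly as listed in the question.
transactions = {
    "T1": {"A", "B", "C", "E", "F"},
    "T2": {"A", "C", "D", "E"},
    "T3": {"B", "C", "E", "F"},
    "T4": {"A", "C", "D", "E"},
    "T5": {"C", "D", "E", "F"},
    "T6": {"A", "D", "E"},
}
MIN_SUPPORT = 0.60
MIN_CONFIDENCE = 0.70


def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def apriori():
    """Return {frozenset: support} for every frequent itemset."""
    items = sorted(set().union(*transactions.values()))
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        survivors = {c: support(c) for c in level if support(c) >= MIN_SUPPORT}
        frequent.update(survivors)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in survivors for b in survivors if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        level = [
            c for c in candidates
            if all(frozenset(s) in survivors for s in combinations(c, k - 1))
        ]
    return frequent


def rules(frequent):
    """Return (antecedent, consequent, confidence) for rules meeting MIN_CONFIDENCE."""
    found = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = sup / frequent[lhs]  # support(X and Y) / support(X)
                if conf >= MIN_CONFIDENCE:
                    found.append((lhs, itemset - lhs, conf))
    return found


freq = apriori()
for lhs, rhs, conf in rules(freq):
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

With 6 transactions, 60% support means an itemset must occur in at least 4 of them. Under that threshold the sketch keeps {A}, {C}, {D}, {E}, {A, E}, {C, E} and {D, E}, and the rules that reach 70% confidence are A -> E, C -> E, D -> E and E -> C.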

 

b) What is agglomerative clustering? Give an example. [5]

c) Explain the role of Bayes theorem in decision making. [4]

 

Q6) a) What is Bayesian Classifier? Elaborate the training process of a Bayesian classifier with suitable example. [8]

b) Explain with example following terms: [4]

i) Lexicographic order

ii) Confidence

c) Differentiate between single link and complete link methods used in Hierarchical Clustering. [5]

 

Q7) a) Write and explain R code for Naive bayes classification. [8]

b) Differentiate between Data Frames and data lists. [4]

Answer:

A data frame is a list with the following characteristics:

  • The elements of the list are vectors and/or factors.

  • Those vectors and factors are the columns of the data frame.

  • The vectors and factors must all have the same length; in other words, all columns must have the same height.

  • The equal-height columns give a rectangular shape to the data frame.

  • The columns must have names.

A list has the following characteristics:

  • Lists are heterogeneous.

  • Lists can be indexed by position.

  • You can extract sublists from lists.

 

c) What is the role of R in machine learning? [4]

OR

Q8) a) Explain data processing with R. [8]

b) How is data exported from R? [4]

c) Write short notes on Handling Data in R Workspace. [4]

 

***********

 

 

 

Saturday, 1 May 2021

Advanced Databases - Pune University MCA Question Paper - MAY 2013

Pune University MCA Question Papers / Previous year question papers of Pune University / MCA Advanced Databases Question Paper




Total No of Questions: [12]                                                            SEAT NO. :
[Total No. of Pages : 02]
[4366]- 503
TYMCA (Engg. Faculty)
ADVANCED DATABASES
(Semester - V) (2008 Pattern) (710903)
MAY 2013 EXAMINATIONS
[Time: 3 Hours]                                                                 [Max. Marks : 70]
Instructions to the candidates:
1) Answers to the two sections should be written in separate books.
2) Neat diagrams must be drawn wherever necessary.
3) Assume Suitable data if necessary.

SECTION I
OR

Q3) a) Explain Transaction Server Process Structure. [6]
OR
b) Explain centralized and client server database architecture [6]

Q5) a) Explain object identity and reference type? [6]
OR
b) Explain persistent C++ system. [6]

SECTION II
Q7) a) While analyzing the data, it was found that many tuples have no recorded values for several attributes. How this problem of missing values can be solved? [6]
OR
Q8) a) Explain in brief OLAP. What are the possible operations on cube? [6]

Q9) a) Form clusters using clustering K-Means algorithm. Use appropriate distance formula. [8]
RID    Age    Years of Service
1      30     5
2      50     25
3      50     15
4      25     5
5      30     10
6      55     25
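The K-Means procedure for Q9 a) can be sketched as follows. The question fixes neither the number of clusters nor the starting centroids, so k = 2 and seeding with the first two records are assumptions made here for illustration. Euclidean distance is used as the distance formula.

```python
import math

# (Age, Years of Service) for RID 1..6, as given in the question.
points = [(30, 5), (50, 25), (50, 15), (25, 5), (30, 10), (55, 25)]


def k_means(data, centroids, max_iter=100):
    """Lloyd's algorithm with Euclidean distance; returns (labels, centroids)."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Assumption: k = 2, seeded with records 1 and 2.
labels, centroids = k_means(points, [points[0], points[1]])
print(labels, centroids)
```

With these assumed seeds, records 1, 4 and 5 fall into one cluster and records 2, 3 and 6 into the other. A different k or different seeds can give a different grouping.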

b) Explain outlier analysis [4]
OR
Q10) a) Find the frequent itemsets using the Apriori algorithm. [8]
TID    Items
100    1, 3, 4
200    2, 3, 5
300    1, 2, 3, 5
400    2, 5

b) Explain descriptive & predictive data mining. [4]

Answer:
Descriptive data mining - It uses the data to identify relationships, finding human-interpretable patterns that describe the data. Clustering, association rule mining and sequential pattern discovery are some of the descriptive approaches.
Predictive data mining - It uses the data to make a prediction: some variables are used to predict unknown or future values of other variables. Classification and regression are some of the predictive approaches.

b) Define the following terms. [3]
1) Hub 2) Authority 3) Web crawler
OR
Q12) a) Describe the popularity ranking. [8]
b) Define the following terms- [3]
1) Ontology 2) Search engine spamming 3) False positive
Answer:
False positive: A false positive occurs when a test returns a positive result although the true result is negative. It is sometimes called a “false alarm” or “false positive error.” The term is common in the medical field, but it also applies to other areas, such as software testing.
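A short Python sketch shows how false positives are counted next to the other three outcomes. The spam-filter labels below are hypothetical, and `confusion_counts` is an illustrative helper name rather than a library function.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for boolean labels (True = positive)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, guess in zip(actual, predicted):
        if guess and truth:
            counts["TP"] += 1
        elif guess and not truth:
            counts["FP"] += 1  # flagged positive, but actually negative
        elif not guess and not truth:
            counts["TN"] += 1
        else:
            counts["FN"] += 1  # missed a real positive
    return counts


# Hypothetical example: which pages really are spam vs. which were flagged.
is_spam = [True, False, False, True, False]
flagged = [True, True, False, False, False]

counts = confusion_counts(is_spam, flagged)
# False positive rate: share of true negatives that were wrongly flagged.
false_positive_rate = counts["FP"] / (counts["FP"] + counts["TN"])
print(counts, false_positive_rate)
```

Here the second page is the false alarm: it was flagged although it is not spam.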

************************




Database Management Systems Anna University Exam Questions and Answers

Database management systems university question papers with answers, Anna university DBMS exam questions, Solved university exam questions f...