Pune University BE (Computer Engineering) Advanced Databases Question Papers

B.E. (Computer Engineering) ADVANCED DATABASES (2008 Pattern) (Sem. – II) (Elective-III)

[Time: 3 Hours]                                                                           [Max. Marks: 100]

Instructions to the candidates :-

1)            Answers to the two sections should be written in separate books.

2)             Neat diagrams must be drawn wherever necessary.

3)             Figures to the right indicate full marks.

4)            Assume suitable data, if necessary.


Q1) a) Explain speedup and scaleup in parallel databases with suitable diagram.       [5]

b)            Explain range partitioning sort in parallel database along with its suitability. [5]

c)            Explain partitioning techniques in parallel database along with examples. [6]


Q2) a) Explain fragment and replicate join schemes.                                  [8]

b)            Describe the benefits and drawbacks of pipelined parallelism. [4]

c)             Histograms are used for constructing load-balanced range partitions. Suppose you have a histogram where values are between 1 and 100, and are partitioned into 10 ranges, 1-10, 11-20, ..., 91-100, with frequencies 15, 5, 20, 10, 10, 5, 5, 20, 5 and 5, respectively. Give a load-balanced range partitioning function to divide the values into 5 partitions. [4]
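Worked sketch for the histogram question above. It assumes values are spread uniformly inside each histogram range (the question does not say so), and the function name `balanced_cut_points` is hypothetical. The idea is to walk the cumulative frequencies and place a cut wherever the running total reaches a multiple of total/5.

```python
def balanced_cut_points(bucket_uppers, freqs, num_partitions, first_lower=1):
    """Return the upper bounds of all partitions except the last.

    Assumption: values are uniformly distributed within each histogram
    range, so a cut may fall part-way through a range.
    """
    total = sum(freqs)
    target = total / num_partitions      # ideal load per partition
    cuts = []
    cumulative = 0                       # frequency seen before this range
    lower = first_lower - 1              # exclusive lower end of this range
    k = 1                                # index of the next cut to place
    for upper, f in zip(bucket_uppers, freqs):
        # Place every cut whose target load is reached inside this range.
        while k < num_partitions and cumulative + f >= k * target:
            needed = k * target - cumulative
            cuts.append(lower + (needed / f) * (upper - lower))
            k += 1
        cumulative += f
        lower = upper
    return cuts


uppers = list(range(10, 101, 10))        # 10, 20, ..., 100
freqs = [15, 5, 20, 10, 10, 5, 5, 20, 5, 5]
cuts = balanced_cut_points(uppers, freqs, 5)
print(cuts)  # [20.0, 30.0, 50.0, 75.0]
```

Under the uniformity assumption this gives the partitions 1-20, 21-30, 31-50, 51-75 and 76-100, each holding 20 of the 100 values.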


Q3) a) Consider the relations:

employee (name, address, salary, plant-number)

machine (machine-number, type, plant-number)

Assume that the employee relation is fragmented horizontally by plant-number, and that each fragment is stored locally at its corresponding plant site. Assume that the machine relation is stored in its entirety at the Armonk site. Describe a good strategy for processing each of the following queries.

i)              Find all employees at the plant that contains machine number 1130.

ii)           Find all machines at the Almaden plant.

iii)         Find employee ⋈ machine.                                                                       [6]

b) Explain the two-phase commit protocol. How does the three-phase commit protocol overcome the disadvantages of the two-phase commit protocol? [6]

c) Explain distributed transaction management.                                               [6]


Q4) a) Explain the following concurrency control schemes in distributed databases, along with their advantages and disadvantages.

i)              Distributed lock manager.

ii)           Majority protocol.                                                                                      [8]

b) List the differences between a directory and a database. Also explain LDAP. [6]

c) When is it useful to have replication or fragmentation? Explain your answer.      [4]


Q5) a) What is N-tier architecture? Explain its advantages with example. [8]

b) Explain the components of an XML document with suitable example. [8]


Q6) a) What are the different parsers for XML? Explain them in brief.                    [6]

b) How will you define simple and complex types using XML schemas? Explain with example.  [6]

c) Explain the following with respect to web architecture.

i)              Web server

ii)           Common gateway interface.                                                                   [4]



Q7) a) Explain the architecture of Data warehouse.                                                       [8]

b)            Differentiate between OLTP and OLAP systems.                                        [4]

c)             Suppose that a data warehouse for Big-University consists of the following four dimensions: student, course, semester and instructor, and two measures: count and average-grade, where the average-grade measure stores the actual course grade of the student. Draw a snowflake schema diagram for the data warehouse.      [4]


Q8) a) What is noisy data? Explain the data cleaning process. How are missing values handled?    [8]

b)            Explain the following operations of OLAP on multidimensional data with example.

i)              Roll up and drill down.

ii)           Slice and dice.                                                                                           [4]

c)            Write a note on data marts.                                                                     [4]

Q9) a) A database has five transactions. Let min-sup = 20% and min-conf = 75%.

X, Y, Z

X, Z, W,

Y, W

U, V, W

V, Y, Z

U, X, Z

Find all frequent itemsets using Apriori Algorithm.

List all strong association rules.



State and explain the algorithm for inducing a decision tree from training





Differentiate between classification and clustering.




Explain the architecture of typical data mining system.                              [6]

Q10) a) Suppose that the data mining task is to cluster points (with (x,y) representing location) into three clusters, where the points are A1 (2,10), A2 (2,5), A3 (8,4), A4 (5,8), A5 (7,5), A6 (6,4), A7 (1,2), A8 (4,9). The distance function is Euclidean distance. Suppose initially we assign A1, A4 and A7 as the center of each cluster respectively. Use the K-means algorithm to show final three clusters.      [8]
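Worked sketch for the K-means question above. It is a plain-Python illustration, not a prescribed solution. The function name `k_means` and the iteration cap are assumptions. Each round assigns every point to its nearest center by Euclidean distance, then recomputes each center as the mean of its members, stopping when assignments no longer change.

```python
from math import dist  # Euclidean distance, Python 3.8+

POINTS = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}


def k_means(points, initial_centers, max_iters=100):
    centers = list(initial_centers)
    assignment = None
    for _ in range(max_iters):
        # Assignment step: index of the nearest center for every point.
        new_assignment = {
            name: min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break                        # converged: nothing moved
        assignment = new_assignment
        # Update step: each center becomes the mean of its members.
        for i in range(len(centers)):
            members = [points[n] for n, c in assignment.items() if c == i]
            if members:
                centers[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    clusters = [
        sorted(n for n, c in assignment.items() if c == i)
        for i in range(len(centers))
    ]
    return clusters, centers


clusters, centers = k_means(
    POINTS, [POINTS["A1"], POINTS["A4"], POINTS["A7"]]
)
print(clusters)  # [['A1', 'A4', 'A8'], ['A3', 'A5', 'A6'], ['A2', 'A7']]
```

With these initial centers the assignments stabilise at {A1, A4, A8}, {A3, A5, A6} and {A2, A7}, with centers near (3.67, 9), (7, 4.33) and (1.5, 3.5).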


Explain the following terms with example.

i)              Closed frequent itemset

ii)           Maximal frequent itemset                                                                       [4]





Q11) a) Explain the typical architecture of an information retrieval system.                     [8]

b) Write short notes on –

i)             Vector-space model

ii)           TF-IDF method of ranking                                                                      [8]


Q12) a) Explain the PageRank algorithm with example.                                                  [6]

b) Write a short note on web crawler.                                                                 [4]

c) Explain the terms.                                                                                               [6]

i)             Inverted index.

ii)           Ontology.

iii)        Homonyms.
