VIT Syllabus Computer Science Engineering 8th Semester

Data Warehousing and Mining

Objectives:

1. To study the methodology of engineering legacy databases for data warehousing and data
mining to derive business rules for decision support systems.
2. To analyze the data, identify the problems, and choose the relevant models and
algorithms to apply.

Outcomes:

1. Enable students to understand and implement classical algorithms in data mining
and data warehousing; students will be able to assess the strengths and weaknesses of the
algorithms, identify the application area of algorithms, and apply them.
2. Students will learn data mining techniques as well as methods for integrating
and interpreting data sets and for improving the effectiveness, efficiency and quality of data
analysis.

Unit-1

Introduction to Data Warehousing – 1.1 The Need for Data Warehousing; Increasing Demand for Strategic Information; Inability of Past Decision Support Systems; Operational vs. Decision Support Systems; Data Warehouse Defined; Benefits of Data Warehousing; Features of a Data Warehouse; The Information Flow Mechanism; Role of Metadata; Classification of Metadata; Data Warehouse Architecture; Different Types of Architecture; Data Warehouse and Data Marts; Data Warehousing Design Strategies.

Unit-2

Dimensional Modeling – 2.1 Data Warehouse Modeling Vs Operational Database Modeling; Dimensional Model Vs ER Model; Features of a Good Dimensional Model; The Star Schema; How Does a Query Execute? The Snowflake Schema; Fact Tables and Dimension Tables; The Factless Fact Table; Updates To Dimension Tables: Slowly Changing Dimensions, Type 1 Changes, Type 2 Changes, Type 3 Changes, Large Dimension Tables, Rapidly Changing or Large Slowly Changing Dimensions, Junk Dimensions, Keys in the Data Warehouse Schema, Primary Keys, Surrogate Keys & Foreign Keys; Aggregate Tables; Fact Constellation Schema or Families of Star.

Unit-3

ETL Process – 3.1 Challenges in ETL Functions; Data Extraction; Identification of Data Sources; Extracting Data: Immediate Data Extraction, Deferred Data Extraction; Data Transformation: Tasks Involved in Data Transformation; Data Loading: Techniques of Data Loading, Loading the Fact Tables and Dimension Tables; Data Quality; Issues in Data Cleansing.

Unit-4

Online Analytical Processing (OLAP) – Need for Online Analytical Processing; OLTP vs. OLAP; OLAP and Multidimensional Analysis; Hypercubes; OLAP Operations in Multidimensional Data Model; OLAP Models: MOLAP, ROLAP, HOLAP, DOLAP.

Unit-5

Introduction to Data Mining – 5.1 What is Data Mining; Knowledge Discovery in Databases (KDD); What Kinds of Data Can Be Mined; Related Concept to Data

Unit-6

Data Exploration – 6.1 Types of Attributes; Statistical Description of Data; Data Visualization; Measuring similarity and dissimilarity.

Unit-7

Data Preprocessing – 7.1 Why Preprocessing? Data Cleaning; Data Integration; Data Reduction:
Attribute subset selection, Histograms, Clustering and Sampling; Data Transformation & Data Discretization: Normalization, Binning, Histogram Analysis and Concept hierarchy generation.

Unit-8

Classification – 8.1 Basic Concepts; Classification methods:

1. Decision Tree Induction: Attribute Selection Measures, Tree pruning.
2. Bayesian Classification: Naïve Bayes’ Classifier.

8.2 Prediction: Structure of regression models; Simple linear regression, Multiple linear regression.
8.3 Model Evaluation & Selection: Accuracy and Error measures, Holdout, Random Sampling, Cross Validation, Bootstrap; Comparing Classifier performance using ROC Curves.
8.4 Combining Classifiers: Bagging, Boosting, Random Forests.

Unit-9

Clustering – 9.1 What is clustering? Types of data; Partitioning Methods (K-Means, K-Medoids);
Hierarchical Methods (Agglomerative, Divisive, BIRCH); Density-Based Methods (DBSCAN, OPTICS).

Unit-10

Mining Frequent Patterns and Association Rules – 10.1 Market Basket Analysis, Frequent Itemsets, Closed Itemsets, and Association Rules; Frequent Pattern Mining, Efficient and Scalable Frequent Itemset Mining Methods, The Apriori Algorithm for finding Frequent Itemsets Using Candidate Generation, Generating Association Rules from Frequent Itemsets, Improving the Efficiency of Apriori, A pattern growth approach for mining Frequent Itemsets; Mining Frequent itemsets using vertical data formats; Mining closed and maximal patterns; Introduction to Mining Multilevel Association Rules and Multidimensional Association Rules; From Association Mining to Correlation Analysis, Pattern Evaluation Measures; Introduction to Constraint-Based Association Mining.

Text Books:

1) Han, Kamber, “Data Mining Concepts and Techniques”, Morgan Kaufmann, 3rd Edition
2) Paulraj Ponniah, “Data Warehousing: Fundamentals for IT Professionals”, Wiley India
3) Reema Thareja, “Data Warehousing”, Oxford University Press.
4) M.H. Dunham, “Data Mining Introductory and Advanced Topics”, Pearson Education

Reference Books:

1) Randall Matignon, “Data Mining Using SAS Enterprise Miner”, Wiley Student edition.
2) Alex Berson, S. J. Smith, “Data Warehousing, Data Mining & OLAP”, McGraw Hill.
3) Vikram Pudi & Radha Krishna, “Data Mining”, Oxford Higher Education.
4) Daniel Larose, “Data Mining Methods and Models”, Wiley India.

Human Machine Interaction

Objectives:

1. To stress the importance of a good interface design.
2. To understand the importance of human psychology in designing good interfaces.
3. To motivate students to apply HMI in their day-to-day activities.
4. To bring out the creativity in each student – build innovative applications that are user-friendly.
5. To encourage students to engage in research in Machine Interface Design.

Outcomes:

1. To design user-centric interfaces.
2. To design innovative and user-friendly interfaces.
3. To apply HMI in their day-to-day activities.
4. To critique existing interface designs and improve them.
5. To design applications for social and technical tasks.

Unit-1

Introduction – 1.1 Introduction to Human Machine Interface, Hardware, software and operating environment to use HMI in various fields.

1.2 The psychopathology of everyday things – complexity of modern devices; human-centred design; fundamental principles of interaction; Psychology of everyday actions – how people do things; the seven stages of action and three levels of processing; human error.

Unit-2

Understanding goal directed design – 2.1 Goal directed design; Implementation models and mental models; Beginners, experts and intermediates – designing for different experience levels; Understanding users; Modeling users – personas and goals.

Unit-3

GUI – 3.1 benefits of a good UI; popularity of graphics; concept of direct manipulation; advantages and disadvantages; characteristics of GUI; characteristics of Web UI; General design principles.

Unit-4

Design guidelines – 4.1 perception, Gestalt principles, visual structure, reading is unnatural, color,
vision, memory, six behavioral patterns, recognition and recall, learning, factors affecting learning, time.

Unit-5

Interaction styles – 5.1 menus; windows; device-based controls; screen-based controls.

Unit-6

Communication – 6.1 text messages; feedback and guidance; graphics, icons and images;
colours.

Text Books:

1. Alan Dix, J. E. Finlay, G. D. Abowd, R. Beale, “Human Computer Interaction”, Prentice Hall.
2. Wilbert O. Galitz, “The Essential Guide to User Interface Design”, Wiley publication.
3. Alan Cooper, Robert Reimann, David Cronin, “About Face 3: The Essentials of Interaction Design”, Wiley publication.
4. Jeff Johnson, “Designing with the Mind in Mind”, Morgan Kaufmann Publication.
5. Donald A. Norman, “The Design of Everyday Things”, Basic Books; Reprint edition 2002.

Reference Books:

1. Donald A. Norman, “The design of everyday things”, Basic books.
2. Rogers, Sharp, Preece, “Interaction Design: Beyond Human Computer Interaction”, Wiley.
3. Guy A. Boy, “The Handbook of Human Machine Interaction”, Ashgate Publishing Ltd.

Parallel and Distributed Systems

Objectives:

1. To provide students with contemporary knowledge in parallel and distributed systems.
2. To equip students with skills to analyze and design parallel and distributed applications.
3. To build the skills needed to measure the performance of parallel and distributed
algorithms.

Outcomes: Learner will be able to…

1. Apply the principles and concepts in analyzing and designing parallel and distributed systems.
2. Reason about ways to parallelize problems.
3. Gain an appreciation of the challenges and opportunities faced by parallel and distributed systems.
4. Understand the middleware technologies that support distributed applications such as RPC, RMI and object based middleware.
5. Improve the performance and reliability of distributed and parallel programs.

Unit-1

Introduction – 1.1 Parallel Computing, Parallel Architecture, Architectural Classification Scheme, Performance of Parallel Computers, Performance Metrics for Processors, Parallel Programming Models, Parallel Algorithms.

Unit-2

Pipeline Processing – 2.1 Introduction, Pipeline Performance, Arithmetic Pipelines, Pipelined
Instruction Processing, Pipeline Stage Design, Hazards, Dynamic Instruction Scheduling.

Unit-3

Synchronous Parallel Processing – 3.1 Introduction, Example-SIMD Architecture and Programming Principles, SIMD Parallel Algorithms, Data Mapping and memory in array processors, Case studies of SIMD parallel Processor.

Unit-4

Introduction to Distributed Systems – 4.1 Definition, Issues, Goals, Types of distributed systems, Distributed System Models, Hardware concepts, Software Concept, Models of Middleware, Services offered by middleware, Client-Server model.

Unit-5

Communication – 5.1 Layered Protocols, Remote Procedure Call, Remote Object Invocation,
Message Oriented Communication, Stream Oriented Communication.

Unit-6

Resource and Process Management – 6.1 Desirable Features of global Scheduling algorithm, Task assignment approach, Load balancing approach, Load sharing approach, Introduction
to process management, Process migration, Threads, Virtualization, Clients, Servers, Code Migration.

Unit-7

Synchronization – 7.1 Clock Synchronization, Logical Clocks, Election Algorithms, Mutual Exclusion, Distributed Mutual Exclusion – Classification of Mutual Exclusion Algorithms, Requirements of Mutual Exclusion Algorithms, Performance measure, Non-Token Based Algorithms: Lamport’s Algorithm, Ricart–Agrawala’s Algorithm, Maekawa’s Algorithm.
7.2 Token Based Algorithms: Suzuki–Kasami’s Broadcast Algorithm, Singhal’s Heuristic Algorithm, Raymond’s Tree Based Algorithm,
Comparative Performance Analysis.

Unit-8

Consistency and Replication – 8.1 Introduction, Data-Centric and Client-Centric Consistency Models, Replica Management.
8.2 Distributed File Systems: Introduction, good features of DFS, File models, File Accessing models, File-Caching Schemes, File Replication, Network File System (NFS), Andrew File System (AFS), Hadoop Distributed File System and MapReduce.

Text Books:

1. M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers 2009.
2. Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007.

Reference Books:

1. George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and
Design” (4th Edition), Addison Wesley/Pearson Education.
2. Pradeep K Sinha, “Distributed Operating Systems : Concepts and design”, IEEE
computer society press

Elective-III Machine Learning

Objectives:

1. To introduce students to the basic concepts and techniques of Machine Learning.
2. To become familiar with regression methods, classification methods, clustering methods.
3. To become familiar with support vector machine and Dimensionality reduction Techniques.

Outcomes:

1. Ability to analyze and appreciate the applications which can use Machine Learning Techniques.
2. Ability to understand regression, classification, clustering methods.
3. Ability to understand the difference between supervised and unsupervised learning methods.
4. Ability to appreciate Dimensionality reduction techniques.
5. Students would understand the working of Reinforcement learning.

Unit-1

Introduction to Machine Learning – 1.1 What is Machine Learning?, Key Terminology, Types of Machine Learning, Issues in Machine Learning, Application of Machine Learning, How to choose the right algorithm, Steps in developing a Machine Learning Application.

Unit-2

Learning with Regression – 2.1 Linear Regression, Logistic Regression.

Unit-3

Learning with trees – 3.1 Using Decision Trees, Constructing Decision Trees, Classification and
Regression Trees (CART).

Unit-4

Support Vector Machines(SVM) – 4.1 Maximum Margin Linear Separators, Quadratic Programming solution to finding maximum margin separators, Kernels for learning non-linear
functions.

Unit-5

Learning with Classification – 5.1 Rule based classification, classification by backpropagation, Bayesian Belief networks, Hidden Markov Models.

Unit-6

Dimensionality Reduction – 6.1 Dimensionality Reduction Techniques, Principal Component Analysis, Independent Component Analysis.

Unit-7 

Learning with Clustering – 7.1 K-means clustering, Hierarchical clustering, Expectation Maximization Algorithm, Supervised learning after clustering, Radial Basis functions.

Unit-8

Reinforcement Learning – 8.1 Introduction, Elements of Reinforcement Learning, Model-based learning, Temporal Difference Learning, Generalization, Partially Observable States.

Text Books:

1. Peter Harrington, “Machine Learning in Action”, DreamTech Press
2. Ethem Alpaydın, “Introduction to Machine Learning”, MIT Press
3. Tom M. Mitchell, “Machine Learning”, McGraw Hill
4. Stephen Marsland, “Machine Learning: An Algorithmic Perspective”, CRC Press

Reference Books:

1. William W. Hsieh, “Machine Learning Methods in the Environmental Sciences”, Cambridge
2. Han, Kamber, “Data Mining Concepts and Techniques”, Morgan Kaufmann Publishers
3. Margaret H. Dunham, “Data Mining Introductory and Advanced Topics”, Pearson Education

Elective-III Embedded Systems

Objectives:

1. Develop, among students, an understanding of the technologies behind the embedded computing systems; and to differentiate between such technologies.
2. Make aware of the capabilities and limitations of the various hardware or software components.
3. Evaluate design tradeoffs between different technology choices.
4. Complete or partial design of such embedded systems

Unit-1

Introduction to computational technologies – 1.1 Review of computation technologies (ARM, RISC, CISC, PLD, SOC), architecture, event managers, hardware multipliers, pipelining. Hardware / Software co-design. Embedded systems architecture and design process.

Unit-2

Program Design and Analysis – 2.1 Integrated Development Environment (IDE), assembler, linking and loading. Program-level performance analysis and optimization, energy and power analysis and program size optimization, program validation and testing. Embedded Linux, kernel architecture, GNU cross platform tool chain. Programming with Linux environment.

Unit-3

Process Models and Product development life cycle management – 3.1 State machine models: finite-state machines (FSM), finite-state machines with data-path model (FSMD), hierarchical / concurrent state machine model (HCFSM), program-state machine model (PSM), concurrent process model. Unified Modeling Language (UML), applications of UML in embedded systems. IP-cores, design process model. Hardware software co-design, embedded product development lifecycle management.

Unit-4

High Performance 32-bit RISC Architecture – 4.1 ARM processor family, ARM architecture, instruction set, addressing modes, operating modes, interrupt structure, and internal peripherals.
ARM coprocessors, ARM Cortex-M3.

Unit-5 

Processes and Operating Systems – 5.1 Introduction to Embedded Operating System, multiple tasks and multiple processes. Multi rate systems, preemptive real-time operating systems, priority-based scheduling, inter-process communication mechanisms. Operating system performance and optimization strategies. Examples of real-time operating systems.

Unit-6

Real-time Digital Signal Processing (DSP) – 6.1 Introduction to Real-time simulation, numerical solution of the mathematical model of physical system. DSP on ARM, SIMD techniques. Correlation, Convolution, DFT, FIR filter and IIR Filter implementation on ARM. Open
Multimedia Applications Platform (OMAP).

Text Books:

1. Embedded Systems an Integrated Approach – Lyla B Das, Pearson
2. Computers as Components – Marilyn Wolf, Third Edition Elsevier
3. Embedded Systems Design: A Unified Hardware/Software Introduction – Frank Vahid and Tony Givargis, John Wiley & Sons
4. An Embedded Software Primer – David E. Simon – Pearson Education South Asia
5. ARM System Developer’s Guide: Designing and Optimizing System Software –
Andrew N. Sloss, Dominic Symes and Chris Wright – Elsevier Inc.

Reference Books:

1. Embedded Systems, Architecture, Programming and Design Raj Kamal Tata McGraw Hill
2. Embedded Linux – Hollabaugh, Pearson Education

Elective-III Adhoc Wireless Networks

Unit-1

Introduction – 1.1 Introduction to wireless Networks. Characteristics of Wireless channel, Issues in Ad hoc wireless networks, Adhoc Mobility Models:- Indoor and outdoor models.
1.2 Adhoc Networks: Introduction to adhoc networks – definition, characteristic features, applications.

Unit-2

MAC Layer – 2.1 MAC Protocols for Ad hoc wireless Networks: Introduction, Issues in designing a MAC protocol for Ad hoc wireless Networks, Design goals and Classification of a MAC protocol, Contention-based protocols with reservation mechanisms.
2.2 Scheduling algorithms, protocols using directional antennas. IEEE standards: 802.11a, 802.11b, 802.11g, 802.15, 802.16, HIPERLAN.

Unit-3

Network Layer – 3.1 Routing protocols for Ad hoc wireless Networks: Introduction, Issues in
designing a routing protocol for Ad hoc Wireless Networks, Classification of routing protocols, Table-driven routing protocol, On-demand routing protocol.
3.2 Proactive Vs reactive routing, Unicast routing algorithms, Multicast routing algorithms, hybrid routing algorithm, Energy-aware routing algorithm, Hierarchical Routing, QoS aware routing.

Unit-4

Transport Layer – 4.1 Transport layer protocols for Ad hoc wireless Networks: Introduction,
Issues in designing a transport layer protocol for Ad hoc wireless Networks, Design goals of a transport layer protocol for Ad hoc wireless Networks, Classification of transport layer solutions, TCP over Ad hoc wireless Networks, Other transport layer protocols for Ad hoc wireless Networks.

Unit-5

Security – 5.1 Security: Security in Ad hoc wireless Networks, Network security requirements, Issues & challenges in security provisioning, Network security attacks, Key management, Secure routing in Ad hoc wireless Networks.

Unit-6

QoS – 6.1 Quality of service in Ad hoc wireless Networks: Introduction, Issues and challenges in providing QoS in Ad hoc wireless Networks, Classification of QoS solutions, MAC layer solutions, network layer solutions.

Text Books:

1. Siva Ram Murthy and B.S.Manoj, “Ad hoc Wireless Networks Architectures and protocols”,
2nd edition, Pearson Education, 2007
2. Charles E. Perkins, “Adhoc Networking”, Addison-Wesley, 2000
3. C. K. Toh, “Adhoc Mobile Wireless Networks”, Pearson Education, 2002

Reference Books:

1. Matthew Gast, “802.11 Wireless Networks: The Definitive Guide”, 2nd Edition, O’Reilly Media, April 2005.
2. Stefano Basagni, Marco Conti, Silvia Giordano and Ivan Stojmenovic, “Mobile Adhoc Networking”, Wiley-IEEE Press, 2004.
3. Mohammad Ilyas, “The handbook of Adhoc Wireless Networks”, CRC Press, 2002
