MPŠ

Mednarodna
podiplomska šola
Jožefa Stefana

Jamova 39
SI-1000 Ljubljana
Slovenia

Tel: (01) 477 31 00
Fax: (01) 477 31 10
E-mail: info@mps.si

Course description (the part of 4 ECTS that is covered by prof. dr. Bojan Cestnik)

Data and Text Mining (ICT2)
Data Mining and Knowledge Discovery (ICT3)

- Under Construction

Program:

Information and Communication Technologies, second-level study programme

Lecturers:

prof. dr. Nada Lavrač
prof. dr. Bojan Cestnik
dr. Petra Kralj Novak
prof. dr. Dunja Mladenić
doc. dr. Martin Žnidaršič

More course materials can be found here:
http://kt.ijs.si/PetraKralj/IPSKnowledgeDiscovery1112.html

Goals and contents

Knowledge discovery in databases is the process of discovering patterns and models, described by rules or other human-understandable representation formalisms. The most important step in this process is data mining, which uses methods, techniques and tools for the automated discovery of patterns and the construction of models from data. The course objectives are to:

  • introduce the basics of data mining, the process of knowledge discovery in databases, the CRISP-DM methodology and the basics of knowledge management
  • present standard data formats and train students to manipulate tabular data, databases and data warehouses, as well as text, web and multimedia data
  • present selected methods and techniques for mining tabular data
  • present selected methods and techniques for text, web and multimedia mining
  • train students in the practical use of selected data mining techniques and evaluation methods.

In this part of the course we will deal with data representation and manipulation, in particular with presentation of standard data formats, creation and manipulation of tabular data, databases and data warehouses, as well as handling of text, web and multimedia data.

Course materials IKT2:

IKT2 Course I (October 24, 2017, 15:00-18:00): IKT2 DM & KD I

IKT2 Course II (December 12, 2017, 17:00-19:00): IKT2 DM & KD II

Course materials IKT3:

IKT3 Course I (October 25, 2016, 15:00-17:00): IKT3 DM & KD I

Questions and Answers activity: QTvity
Course: MPS DMKD

Data analysis in R:

Instacart Market Basket Analysis: Market.zip; the password is the same as for QTvity
Link to Kaggle

Points for QTvity collaboration during the course lectures in 2017/18:

                  24.10.2017      Total
No.  Student      Ans.  Pts       Σ ans.  Σ pts
1    Ana          7     5.00      7       5.0
2    Ilin         7     5.00      7       5.0
3    Jasmin       7     5.00      7       5.0
4    Luka         7     5.00      7       5.0

Seminar assignment:

ICT2 students are kindly asked to send me a half-page proposal describing their seminar problem. It should contain the title, a description of the data set, the data preprocessing steps, and the potential benefits of the proposed activities.

Once I have approved the proposed problem, students are expected to complete their work and write a 15-20 page document using the following template:

Important dates:

  • November 13, 2017, 12:00 Send me a half-page seminar problem description,
  • December 11, 2017, 12:00 Send me completed seminar reports (.doc file) and presentations (.ppt file),
  • December 12, 2017, 18:00 Present seminars in front of the class (15 minute presentation, 15 minutes questions and discussion).

Literature:

• Gordon S. Linoff. Data Analysis Using SQL and Excel. Wiley, 2008.
• Dorian Pyle. Data Preparation for Data Mining. Morgan Kaufmann, 1999.
• Gerhard Widmer et al. In Search of the Horowitz Factor. AI Magazine, 2003.
• Ian Witten, Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, 2000.
• Dunja Mladenić, Nada Lavrač, Marko Bohanec, Steve Moyle (eds.). Data Mining and Decision Support: Integration and Collaboration. Kluwer, 2003.
• Igor Kononenko, Matjaž Kukar. Machine Learning and Data Mining. Horwood Publishing, 2007.
• Tom Mitchell. Machine Learning. McGraw Hill, 1997.
• Michael Berthold, David J. Hand (eds.). Intelligent Data Analysis: An Introduction. Springer, Berlin-Heidelberg, 1999.
• Sašo Džeroski, Nada Lavrač (eds.). Relational Data Mining. Springer, 2001.
• S. Chakrabarti. Mining the Web: Analysis of Hypertext and Semi Structured Data. Morgan Kaufmann, 2002.
• U. Fayyad, G. G. Grinstein, A. Wierse (eds.). Information Visualization in Data Mining and Knowledge Discovery. Morgan Kaufmann, 2001.

Info for students

9.11.2009: Creation of this site

6.1.2010: Updated course materials and seminar template

20.1.2010: Updated seminar requirements and important dates

22.11.2010: Updated site for 2010/2011

15.11.2011: Updated site for 2011/2012

26.11.2014: Updated site for 2014/2015

23.11.2015: Updated site for 2015/2016

25.11.2016: Updated site for 2016/2017

24.10.2017: Updated site for 2017/2018