
The aim of this course is to provide students with an advanced level of study in classical and web information retrieval, including web search and the related areas of text classification and text clustering. It gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents and of methods for evaluating such systems, along with an introduction to the use of machine learning methods on text collections. Designed as a primary course in information retrieval for graduate or advanced undergraduate students, it assumes that students have completed introductory courses in data structures and algorithms, linear algebra, and probability theory.
Topics that may be covered in this course include:

  1. Boolean retrieval
  2. The term vocabulary and postings lists
  3. Dictionaries and tolerant retrieval
  4. Index construction
  5. Index compression
  6. Scoring, term weighting, and the vector space model
  7. Computing scores in a complete search system
  8. Evaluation in information retrieval
  9. Relevance feedback and query expansion
  10. XML retrieval
  11. Probabilistic information retrieval
  12. Language models for information retrieval
  13. Text classification
  14. Vector space classification
  15. Support vector machines on documents
  16. Flat clustering
  17. Hierarchical clustering
  18. Web search basics
  19. Web crawling and indexes
  20. Link analysis
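
To give a flavor of the first topic, Boolean retrieval, the following is a minimal sketch of an inverted index with an AND query. It is an illustration only, not course material: the function names (`build_index`, `intersect`, `boolean_and`), the whitespace tokenizer, and the three sample documents are all assumptions made for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to a sorted list of IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenization (assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def intersect(p1, p2):
    """Merge two sorted postings lists, keeping only IDs present in both."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def boolean_and(index, *terms):
    """Return IDs of documents that contain every query term."""
    # Process the shortest postings lists first to keep intermediate results small.
    lists = sorted((index.get(t.lower(), []) for t in terms), key=len)
    if not lists:
        return []
    result = lists[0]
    for postings in lists[1:]:
        result = intersect(result, postings)
    return result


# Hypothetical sample collection, invented for this sketch.
docs = {
    1: "web search engines crawl the web",
    2: "text classification and clustering",
    3: "web crawling and link analysis",
}
index = build_index(docs)
print(boolean_and(index, "web", "and"))
```

Keeping each postings list sorted is what allows the intersection to run as a single linear merge; a real system would add proper tokenization, normalization, and the compression techniques listed in the topics above.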