Machine Learning & Support Vector Machines (SVM) Lecture 9 CSCI 494

March 7th, 2012 admin No comments

Last lecture we discussed the major topics in machine learning and an important classification algorithm, the Support Vector Machine (SVM). Modern search engines use a combination of these fundamental techniques to estimate the relevance of documents with respect to a given search query.
Papers to read: Standard SVM [Cortes and Vapnik, 1995]

We also worked through a two-dimensional problem and showed how to extend a machine learning problem to infinite dimensions.
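
As a rough illustration of moving a two-dimensional problem into a higher-dimensional space, the sketch below assumes the extension is done with kernel functions, which is the usual route for SVMs but is not confirmed by the lecture notes. The names `phi`, `poly2_kernel`, and `rbf_kernel` are hypothetical. The degree-2 polynomial kernel matches an explicit three-dimensional feature map, while the RBF kernel corresponds to an infinite-dimensional one that is never built explicitly.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit degree-2 feature map for a 2-D point (hypothetical helper)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: (x . y)^2."""
    return dot(x, y) ** 2


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; gamma=1.0 is an arbitrary illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


x, y = (1.0, 2.0), (3.0, -1.0)

# The kernel value equals the dot product in the explicit feature space.
print(poly2_kernel(x, y), dot(phi(x), phi(y)))

# The RBF kernel is computed directly from the 2-D inputs.
print(rbf_kernel(x, y))
```

The point of the comparison is that the kernel gives the same number as the explicit mapping without ever constructing the higher-dimensional vectors.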

Lecture Notes

Categories: Lectures, Machine Learning Tags:

Probabilistic Retrieval Models - Lect. 8 CSCI 494

March 1st, 2012 admin No comments

Lecture 8 covered probabilistic retrieval models and a review of basic probability theory.
We also covered some of the early retrieval models: vector space models and cosine similarity.
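
A minimal sketch of cosine similarity in a vector space model, assuming raw term counts as the vector weights and whitespace tokenization (the lecture may have used a different weighting such as tf-idf). The function name `cosine_similarity` is illustrative.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between two term-count vectors."""
    a = Counter(doc_a.lower().split())
    b = Counter(doc_b.lower().split())
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


query = "support vector machines"
document = "support vector machines are used for classification"
print(cosine_similarity(query, document))
```

Because the score depends only on the angle between the vectors, a long document is not favored over a short one merely for containing more words.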

Slides for Lect 8.

Lecture 7 - Text Statistics & Document Parsing

February 22nd, 2012 admin No comments

Today we discussed the fundamentals behind text statistics and how to calculate probabilities of n-grams appearing in a document. Here are the slides for Lect. 7.
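
As a small sketch of the n-gram calculation, the code below assumes maximum-likelihood estimates from raw counts with no smoothing, which may be simpler than what the slides present. The helper names and the toy sentence are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(tokens, gram):
    """Fraction of the document's n-grams that equal `gram`."""
    grams = ngrams(tokens, len(gram))
    if not grams:
        return 0.0
    return Counter(grams)[tuple(gram)] / len(grams)


def bigram_probability(tokens, prev, word):
    """Unsmoothed estimate of P(word | prev) from bigram counts."""
    counts = Counter(ngrams(tokens, 2))
    context = sum(c for (first, _), c in counts.items() if first == prev)
    if context == 0:
        return 0.0  # `prev` never starts a bigram in this document
    return counts[(prev, word)] / context


tokens = "the cat sat on the mat the cat ran".split()
print(relative_frequency(tokens, ("the", "cat")))
print(bigram_probability(tokens, "the", "cat"))
```

Unsmoothed estimates assign zero probability to any n-gram that does not occur in the document, which is one reason large collections such as the n-gram data linked below are useful.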

Some useful links.
Google’s n-grams data
Text REtrieval Conference (TREC)
A corpus of Twitter data: Tweets2011

Categories: Uncategorized Tags:

Information Retrieval, Indexing, BigTable Lect 6 CSCI 494

February 15th, 2012 admin No comments

Lecture 6 covered crawling issues, indexing, Google’s BigTable, and the detection of duplicate and near-duplicate content.
The next paper was also handed out:
R. Song et al., “Learning Block Importance Models for Web Pages”

Here are the slides from lecture 6
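
For the near-duplicate detection mentioned above, one common approach is to compare sets of word shingles using Jaccard similarity. The sketch below assumes that approach, which may not be the method presented in the slides; the shingle size `k=3` and the function names are illustrative choices.

```python
def shingles(text, k=3):
    """Set of k-word shingles (contiguous word windows) in a text."""
    tokens = text.lower().split()
    if len(tokens) < k:
        # Short texts become a single shingle; empty texts have none.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0  # two empty documents are treated as identical
    return len(a & b) / len(a | b)


doc1 = "the quick brown fox jumps over the lazy dog"
doc2 = "the quick brown fox jumps over the lazy cat"
similarity = jaccard(shingles(doc1), shingles(doc2))
print(similarity)
```

Exact duplicates score 1.0, and a crawler could flag pairs above some chosen threshold as near duplicates. Comparing every pair of pages this way does not scale to the web, so production systems typically rely on hashing-based approximations.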

Categories: Uncategorized Tags:

Deriving The Google Matrix

February 1st, 2012 admin No comments

Today we went through Google’s original PageRank algorithm and then discussed important modifications. We showed how to derive the “Google Matrix” and how it relates to Markov chains. Here are the slides from lecture 4.
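
A minimal sketch of the construction, assuming the commonly cited form G = alpha * S + (1 - alpha) / n, where S is the row-stochastic link matrix with dangling pages replaced by uniform rows. The damping value 0.85, the four-page toy graph, and the function names are assumptions for illustration, not taken from the slides.

```python
def google_matrix(links, alpha=0.85):
    """Build a row-stochastic Google matrix.

    links[i] lists the pages that page i links to.
    """
    n = len(links)
    matrix = []
    for outlinks in links:
        if outlinks:
            row = [0.0] * n
            for j in outlinks:
                row[j] += 1.0 / len(outlinks)
        else:
            # Dangling page: assume the surfer jumps to any page uniformly.
            row = [1.0 / n] * n
        matrix.append([alpha * p + (1.0 - alpha) / n for p in row])
    return matrix


def pagerank(matrix, iterations=100):
    """Power iteration: repeatedly apply rank <- rank * G."""
    n = len(matrix)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [sum(rank[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return rank


links = [[1, 2], [2], [0], []]  # toy graph; page 3 has no outlinks
G = google_matrix(links)
ranks = pagerank(G)
print(ranks)
```

Every row of G sums to 1 and every entry is positive, so G is the transition matrix of a Markov chain with a unique stationary distribution. The power iteration approximates that distribution, which is the PageRank vector.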

Categories: Lectures Tags: