Language Modeling for Information Retrieval

Bruce Croft, John Lafferty
Springer Science & Business Media, May 31, 2003 - 246 pages
A statistical language model, or more simply a language model, is a probabilistic mechanism for generating text. Such a definition is general enough to include an endless variety of schemes. However, a distinction should be made between generative models, which can in principle be used to synthesize artificial text, and discriminative techniques to classify text into predefined categories. The first statistical language modeler was Claude Shannon. In exploring the application of his newly founded theory of information to human language, Shannon considered language as a statistical source, and measured how well simple n-gram models predicted or, equivalently, compressed natural text. To do this, he estimated the entropy of English through experiments with human subjects, and also estimated the cross-entropy of the n-gram models on natural text. The ability of language models to be quantitatively evaluated in this way is one of their important virtues. Of course, estimating the true entropy of language is an elusive goal, aiming at many moving targets, since language is so varied and evolves so quickly. Yet fifty years after Shannon's study, language models remain, by all measures, far from the Shannon entropy limit in terms of their predictive power. However, this has not kept them from being useful for a variety of text processing tasks, and moreover can be viewed as encouragement that there is still great room for improvement in statistical language modeling.
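The evaluation described above, scoring a model by its cross-entropy on held-out text, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the book: the toy corpus, the add-one smoothing, the restriction to a unigram (n = 1) model, and the function names `train_unigram` and `cross_entropy_bits`.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab):
    """Estimate add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)  # add-one smoothing adds one count per vocabulary word
    return {w: (counts[w] + 1) / total for w in vocab}


def cross_entropy_bits(model, tokens):
    """Average negative log2 probability the model assigns to each token."""
    return -sum(math.log2(model[w]) for w in tokens) / len(tokens)


# Hypothetical toy data; a real experiment would use a large text corpus.
train = "the cat sat on the mat".split()
test = "the cat sat".split()
vocab = set(train) | set(test)

model = train_unigram(train, vocab)
h = cross_entropy_bits(model, test)

# A model that spreads probability uniformly over the vocabulary serves as a baseline.
uniform_bits = math.log2(len(vocab))
print(f"cross-entropy: {h:.3f} bits/token, uniform baseline: {uniform_bits:.3f} bits/token")
```

A lower cross-entropy means the model predicts the held-out text better, which is the same as saying it would compress that text into fewer bits per token.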
 
