LANGUAGE IDENTIFICATION USING G-LDA

Tags

Breeding, Fitness, Genetic Algorithm, Gibbs Sampling, Language Identification, Latent Dirichlet Allocation, Roulette Wheel, Topic Modeling

Language identification plays an important role in natural language processing applications as one of the pre-processing steps.
Various mechanisms are in use today that achieve this task with high recognition rates.

Recent years have seen rapid growth in international communication, which has led to a need for systems capable of
correctly identifying the languages of documents. Possible applications of language identification include information retrieval, web
crawlers, text mining and email filtering.

The paper uses a process called G-LDA [1], which combines concepts from Latent Dirichlet Allocation (LDA) with genetic evolution
techniques. This involves forming a set of words that have a high frequency of occurrence in a given document. The method was tested
on the Leipzig Corpora. The phrases evolved over successive generations showed significant improvement.
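
The abstract does not spell out the evolutionary step, so the following is only an illustrative sketch of how word phrases might be evolved with roulette-wheel selection and breeding. It is not the G-LDA procedure itself: the fitness function (summed document frequency of the distinct words in a phrase), the single-point crossover, the mutation rate, and all function names are assumptions made for this example, and the LDA and Gibbs sampling components are omitted entirely.

```python
import random
from collections import Counter


def fitness(phrase, counts):
    """Assumed fitness: total document frequency of the distinct words in a phrase."""
    return sum(counts[w] for w in set(phrase))


def roulette_select(population, scores, rng):
    """Pick one individual with probability proportional to its fitness score."""
    total = sum(scores)
    if total == 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, score in zip(population, scores):
        running += score
        if running > pick:
            return individual
    return population[-1]


def breed(a, b, vocab, rng, mutation_rate=0.1):
    """Single-point crossover of two parents, then random word mutation."""
    cut = rng.randrange(1, len(a))
    child = a[:cut] + b[cut:]
    return [rng.choice(vocab) if rng.random() < mutation_rate else w for w in child]


def evolve(text, phrase_len=3, pop_size=20, generations=30, seed=0):
    """Evolve fixed-length phrases (phrase_len >= 2) toward high-frequency words."""
    rng = random.Random(seed)
    counts = Counter(text.lower().split())
    vocab = sorted(counts)
    population = [[rng.choice(vocab) for _ in range(phrase_len)]
                  for _ in range(pop_size)]
    best = max(population, key=lambda p: fitness(p, counts))
    for _ in range(generations):
        scores = [fitness(p, counts) for p in population]
        population = [
            breed(roulette_select(population, scores, rng),
                  roulette_select(population, scores, rng), vocab, rng)
            for _ in range(pop_size)
        ]
        candidate = max(population, key=lambda p: fitness(p, counts))
        if fitness(candidate, counts) > fitness(best, counts):
            best = candidate  # keep the fittest phrase seen so far
    return best, fitness(best, counts)


if __name__ == "__main__":
    sample = "the cat and the dog and the bird saw the fish"
    print(evolve(sample))
```

In a language identification setting, one such evolved phrase per language could presumably serve as a compact signature to compare against an unseen document, but that use is a guess and not something the abstract states.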

International Journal of Research in Engineering and Technology

~ IJRET.org
