New Mexico Supercomputing Challenge

Computer Linguistics and Sentence Synthesis

Team: 51

School: Los Alamos High

Area of Science: Artificial Intelligence


Interim: Problem Definition:
For our Supercomputing Challenge project, we are writing a program in the Python programming language that learns new English words and applies them in a way we can test. The program will learn new words in a way similar to how the human brain does. The human brain is a very complex and powerful system, so we work from a simplified model of how it learns new words and implement that model in Python. At first, the program will take a single word that you give it. We hope to expand it so that it can pull an unknown word out of a sentence and attempt to learn it then, possibly using the word's position and context in the sentence, among other things.
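As a rough illustration of the sentence step described above, the sketch below pulls words the program does not yet know out of a sentence and collects the words around them as context. The function names, the vocabulary set, and the two-word context window are assumptions made for this example, not the team's actual code.

```python
import re

def tokenize(sentence):
    """Split a sentence into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z']+", sentence.lower())

def find_unknown_words(sentence, known_words):
    """Return the words in the sentence that are not in known_words, in order."""
    unknown = []
    for token in tokenize(sentence):
        if token not in known_words and token not in unknown:
            unknown.append(token)
    return unknown

def context_of(sentence, word, window=2):
    """Return up to `window` words before and after the first use of `word`."""
    tokens = tokenize(sentence)
    index = tokens.index(word)  # raises ValueError if the word is absent
    before = tokens[max(0, index - window):index]
    after = tokens[index + 1:index + 1 + window]
    return before, after

# Hypothetical vocabulary and sentence, used only for demonstration.
known = {"the", "dog", "chased", "a", "up", "tree"}
sentence = "The dog chased a squirrel up the tree."
print(find_unknown_words(sentence, known))  # ['squirrel']
print(context_of(sentence, "squirrel"))     # (['chased', 'a'], ['up', 'the'])
```

The neighboring words could later serve as clues about what kind of word the unknown one is.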

Problem Solution:
We plan to do this by writing a program that first takes a word you provide and tries to relate it to other words it already knows. It will then attempt to classify and define that word. If it can’t, we will help it by providing defining terms for the word, such as “mammal” for the word “dog”. The program will then write the definition to an external text file, to be read later. This is also similar to how a person learns: someone learning their first language asks for help when they run into difficulty with certain words. Once we have ensured that our program has learned many words, we hope to program it to recognize how two words are related. For example, if it knows the word “mountain”, it should be able to recognize the word “mountaineering”. We could also help the program define new words by defining suffixes, such as “-less”, “-ment”, “-ist”, and more. This way, it could use the “-eering” ending to classify “mountaineering” as a word for an activity. In the future, this part of the program could be expanded so that the program learns these suffixes on its own, as well as whole words.
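The steps in this plan could be sketched roughly as follows: relate a new word to a known word, guess a category from its suffix, and store definitions in a text file. Everything here is an assumption for illustration, including the function names, the tab-separated file layout, and the part of speech attached to each suffix.

```python
# Suffix hints are illustrative assumptions, not a complete grammar.
SUFFIX_HINTS = {"less": "adjective", "ment": "noun", "ist": "noun"}

def find_related(word, known_words):
    """Return the longest known word that `word` starts with, or None."""
    matches = [k for k in known_words if k and k != word and word.startswith(k)]
    return max(matches, key=len) if matches else None

def guess_category(word):
    """Guess a part of speech from the word's suffix, or return None."""
    for suffix in sorted(SUFFIX_HINTS, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return SUFFIX_HINTS[suffix]
    return None

def save_definition(path, word, definition):
    """Append one 'word<TAB>definition' line to the text file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{word}\t{definition}\n")

def load_definitions(path):
    """Read the text file back into a dict; a missing file means no words yet."""
    definitions = {}
    try:
        with open(path, encoding="utf-8") as f:
            for line in f:
                word, _, definition = line.rstrip("\n").partition("\t")
                if word:
                    definitions[word] = definition
    except FileNotFoundError:
        pass
    return definitions
```

For example, `find_related("mountaineering", {"mountain", "dog"})` returns `"mountain"`, and `guess_category("fearless")` returns `"adjective"`. When both helpers return `None`, the program would fall back to asking the user for defining terms.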

Progress:
So far, we have begun writing several experimental programs and functions to help determine the best way to construct our program, and we have several different ideas for how to proceed. We have talked to our mentor, and we are setting up a calendar of when we want to be done with each part of the project.

Expected Result:
We hope to eventually have a program that can pull a word it doesn’t know out of a sentence and then learn that word over time, through several uses of the word. It will also apply its knowledge of words in some way: by writing sentences using the word, showing words it thinks are similar, or creating a rough definition of the word.
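One possible way to show words the program thinks are similar is to compare spellings. The sketch below uses `difflib.get_close_matches` from the Python standard library. Spelling similarity is a technique substituted here for illustration and is only a stand-in for similarity of meaning; the function name, limit, and cutoff are assumptions.

```python
import difflib

def similar_words(word, known_words, limit=3, cutoff=0.6):
    """Return up to `limit` known words spelled most like `word`, best first."""
    # Sorting gives get_close_matches a stable list to search.
    return difflib.get_close_matches(word, sorted(known_words), n=limit, cutoff=cutoff)

# Hypothetical vocabulary, used only for demonstration.
vocabulary = {"mountain", "fountain", "dog"}
print(similar_words("mountains", vocabulary))
```

Raising the cutoff makes the matches stricter, while lowering it returns looser matches.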

Citations:
1. http://ling.umd.edu/~ellenlau/courses/ling646/Friederici_2012.pdf
2. http://www.hms.harvard.edu/hmni/On_The_Brain/Volume04/Number4/F95Lang.html
3. http://web.mit.edu/newsoffice/2011/language-from-games-0712.html
4. http://www.dailyrx.com/reading-trouble-may-start-brain-having-trouble-processing-how-words-are-taught-student
5. http://howthebrainlearns.wordpress.com/2012/03/05/how-the-brain-acquires-language/
6. http://delivery.acm.org/10.1145/1180000/1174526/coli.2006.32.3.443.pdf?ip=75.161.124.52&acc=OPEN&CFID=155466490&CFTOKEN=24017142&__acm__=1355191755_529cba8694edb1388bc9b00eb664c61b
7. http://www.gelbukh.com/clbook/Computational-Linguistics.htm#_Toc86751621


Team Members:

  Connor Bailey
  Hayden Walker

Sponsoring Teacher: Lee Goodwin
