
Thursday, December 31, 2009

XOBDO, the community dictionary with a mission!

By Bikram M. Baruah, Abu Dhabi (UAE), Coordinator, XOBDO.ORG

How many of you whose mother tongue is Assamese have a basic knowledge of the languages of the neighbouring states, like Khasi, Ao, Mizo and Meeteilon? Let alone these languages, how many of you know even the basics of a few languages of Assam itself, like Bodo, Mising, Rabha and Karbi? Probably very few, so few that they can be counted on the fingertips. The reverse is also true: there is hardly any genuine effort to learn Assamese by non-native speakers residing in the state, let alone those in the neighbouring seven-sister states.

This man-made linguistic barrier is probably one of the main causes of the misunderstanding and the related unrest arising today among the different ethno-linguistic groups of the north-east.

Language      Word Count
Assamese      24702
English       13099
Dimasa        2870
Karbi         1878
Meeteilon     1230
Tai           920
Bodo          804
Mising        634
Hmar          632
Khasi         405

Table 1: Word count of major languages in XOBDO.ORG as of 10-March-2010.

Can we break this barrier and create a harmonious society of mutual understanding and respect in the entire north-east?

This is what we are trying to do in XOBDO in a small way! It is an informal gathering of people living across the globe to collectively do something good for the region. Love for their mother tongues has drawn in more than 1300 selfless volunteers who are working day and night from different parts of the world to create this unique project: a multi-directional, multi-lingual, multimedia-embedded online dictionary of the languages of North-East India. Apart from passing 24,000 Assamese words, it is already galloping ahead with a large corpus of Karbi, Dimasa, Mising and Meeteilon words. Efforts are on to attract volunteers who can steadily add words in the other 16 languages adopted in the project.

XOBDO is an effort of the community. It is also a descriptive dictionary: it does not prescribe the spelling and meanings of words, but rather describes how people use them. Therefore, community magazines across the globe like Posoowa, Luitor Pora Mississipi, Prabaxi Bihuwan and Jetuka, along with the regular newspapers and magazines, have a very important role to play in this effort. What they print, along with the writings of renowned writers and journalists, will dictate what is included in XOBDO. To help in this effort, we would like to request that local-language newspapers and magazines with an online presence publish in Unicode where possible, so that XOBDO can analyze their text and easily pick up new words from it.
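
Why Unicode matters here can be shown with a small sketch. The Python below is only an illustration under stated assumptions, not XOBDO's actual analysis code: the function name candidate_new_words and the set of known words are hypothetical. It relies on the fact that Assamese text in Unicode uses the Bengali block (U+0980 to U+09FF), which makes words easy to pick out of a page and compare against what the dictionary already holds.

```python
import re
import unicodedata

# Assamese is written with code points from the Unicode Bengali block
# (U+0980 to U+09FF). The zero-width non-joiner and joiner (U+200C, U+200D)
# can occur inside words, so they are allowed in a token as well.
WORD_RE = re.compile(r"[\u0980-\u09FF\u200c\u200d]+")


def candidate_new_words(text, known_words):
    """Return words found in `text` that are not in `known_words`, sorted."""
    # Normalize so that equivalent code point sequences compare equal.
    normalized = unicodedata.normalize("NFC", text)
    known = {unicodedata.normalize("NFC", w) for w in known_words}
    # Skip tokens that consist only of digits.
    found = {w for w in WORD_RE.findall(normalized) if not w.isdigit()}
    return sorted(found - known)


# Hypothetical usage: one word is already known, the others are candidates.
sample = "অসম শব্দ। নতুন শব্দ"
print(candidate_new_words(sample, {"শব্দ"}))
```

Text published in a legacy, font-specific encoding would not match this range at all, which is why such pages cannot be analyzed this simply.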

Also, do not forget to register at www.xobdo.org to add words, point out errors, challenge existing words and their meanings, discuss them with other members to resolve differences, and much more.

Tuesday, September 15, 2009

Making of XOBDO!!

-Bikram M. Baruah, Coordinator, XOBDO.

Sometime in early 2006, in the middle of the Arabian Desert, I was enjoying the life of a forced single. Read this as a life with plenty of extra time to spare for hobbies!! I used that extra time to write some short stories, articles and so on. In the process, I often had to look through bulky Assamese dictionaries, which was quite frustrating. Out of desperation, I googled “Online Assamese Dictionary”. Not surprisingly, nothing useful came up. Then I googled “Online Bengali Dictionary” and at least 5 results popped up. The same was true of Hindi, Punjabi, Malayalam, Tamil and other Indian languages.

This prompted me to try out some small experiments with my laptop and a website I already had, www.baruah.in. After a few sleepless nights, the experiment clicked!! Far away from its homeland in Assam, the Poob-Nagari or Eastern Nagari script appeared in cyberspace. That in itself was nothing extraordinary; there were already thousands of websites in Eastern Nagari script. What was unique about this experiment was that the text came from an underlying structured database and was presented dynamically to the user based on his/her queries!! Another important aspect of this experiment was that the script was based on Unicode, an international standard for encoding the writing systems of the world. Thus the online Assamese dictionary came into being and the journey began. On March 10, 2006, an email was sent out to AssamNet announcing the dictionary.

The Growth

I am neither a professional programmer nor a database administrator, nor even a linguist. To be successful, the online dictionary needed expertise from all of these fields. Fortunately, a ‘hidden force’ drew in all the kinds of expertise such a project needs. Priyankoo Sharma, Pallav Saikia, PKD and Ujjwal Saikia were the first to join the gang. Gradually more and more people joined the endeavor. Many people left (or became silent); however, the saga continued. With relentless efforts from Priyankoo, the group kept its much-needed vitality. He has also been instrumental in the publicity of XOBDO, an important area that I always shied away from. Working behind the scenes, Pallav Saikia provided, and still provides, the technical know-how to keep the “heart” of XOBDO, the database, beating. Ujjwal Saikia, an engineering student, assisted in developing some of the initial parts of the website. In the initial stages of XOBDO, Dipankar M. Barua, Dwaipayan Bora, Apurba Mili from Duliajan, Rajib Kr. Dutta of Jorhat, Nilotpal Borpujari from Moscow (now in Dubai), Kishor Kumar Barman and Rubut Maout of TIFR, Mumbai, Hasinus Sultan of Nagaon, Rupam Kumar Sharma from S. Korea, Reshmi Rekha Dutta from Guwahati, Sudipta Gogoi of NIT, Warangal, Anis-Uz-Zaman, CIC, Agomani, and others contributed and gave an impetus to the development of XOBDO.

A number of people from IIT-Guwahati also got involved at different points of time: Buljit Buragohain, Rituraj Saikia, Swapnita Kakati, Archana Rajbonshi, Sanjib Sarma and Dr (Mrs) Krishna Barua, to name a few. With their efforts, a large number of good words were added to XOBDO. A successful meeting was also held on the IIT-Guwahati campus on 17-Jan-2007 with a good number of attendees. This was the first formal meeting dedicated to XOBDO.

We were always short of editors! For a long time, I alone painstakingly carried out the job. Soon Priyankoo took the initiative to learn the process and became a very active editor. For a while, the two of us kept editing and approving the submissions. As word contributions came in from various quarters, the workload increased. Two of the contributors were asked to become editors: Rupankar Mahanta and Rupkamal Takuldar came on board and did a tremendous job as both contributors and editors.

The Bridal Make-Up!

“An open dictionary being created by the people, for the people, of the people”, built on Microsoft’s technology! Why should we be dependent on a corporate giant? It makes more sense for this dictionary to be built on software that is also developed by “the people”. So, in the later part of 2007, a decision was made to rebuild XOBDO on Open Source technologies.

Switching the database from Microsoft’s MSSQL to Open Source MySQL was not easy. Between August and November 2007, a lot of planning and synchronized work was carried out between Pallav Saikia in Hyderabad and me in Abu Dhabi, with constant feedback from a number of people, primarily Priyankoo (Florida), Partha (Bhopal), Rajib (Jorhat) and Arup (JNU). As we were developing the database from scratch, we took the opportunity to incorporate a number of new features: encyclopedic entries, word varieties, subject contexts, related words and so on. Pallav did a fantastic job of transferring the existing data from MSSQL to the newly developed MySQL database.

We needed somebody to redevelop the existing frame-based webpages, which used ASP to access the database, as DIV- and stylesheet-based webpages that would use PHP to access the new MySQL database. Nobody in the group knew these technologies at that time, and we were looking for somebody to help us. I started to study them from the resources available on the internet. Priyankoo caught hold of Sakib Rahman Saikia, who was studying Computer Science at the University of Florida, where Priyankoo, too, was pursuing his PhD. Sakib sprang into action, working day and night for around seven days. Sometime in October he developed a few sample pages in PHP accessing the MySQL database. Finally, I got a break. Following in his footsteps, I started to copy the scripts to develop the other pages. While Sakib and I busied ourselves writing the scripts, Priyankoo helped maintain the layout and the theme of the pages using his skill with stylesheets. As usual, we received critical review and beta-testing of our work from members like Rupankar (Delhi) and Rupkamal (Mysore).

Thus, our beloved princess XOBDO got a new look, almost like a bridal make-up! We wanted to have a grand inauguration event to show off our beloved ‘bride’. We initiated an effort to have a stall and a release event at the Guwahati Book Fair. The members based in and around Guwahati were contacted. Almost twenty volunteers came forward to form a group called “GHY-Team” to organize the event. Initially Neelotpal and then Kuldeep took the lead to go to Oxom Prokaxon Porixod and find out the details. A few of them met in Nehru Park, Panbazar to discuss the details. However, we soon realized that we would have to spend a significant amount of money and effort on it. Looking at the situation, a few members suggested that it was not worth spending so much money. If the purpose was to get publicity, we could do it in a much better way through the web and other means. Therefore, in spite of the enthusiasm of the members of our “GHY-Team”, we had to abandon the plan. However, Buljit kept some leaflets in the stall of Kiron Prakashan, a Dhemaji-based publishing house he is closely associated with.

Very quietly, all the traffic was diverted to the new website sometime in late November 2007. People gradually started to pour in to visit the ‘bride’.

FASS (Friends of Assam and Seven Sisters) organized their annual meet in January 2008 and gave us an opportunity to show our ‘bride’ to the public. We took full advantage of the event. The news media covered the event. As usual, many papers published a lot of incorrect information. However, some did publish very accurate information. Buljit did a great job of tracking down the wrong information and spreading the correct information.

Achieving the 20K goal!

The year 2008 was dedicated to increasing the Assamese word count. We had around 10,200 words at the beginning of the year. We set a target of 20,000 words for the year. It was not an easy task. We were not just adding words to the database. For each word, we had to do a semantic analysis, properly write down the meaning, associate appropriate English words, make sure neither the word nor the meaning was duplicated, and so on. While the honorary volunteers worked hard, an automated target calculator kept track of how many words needed to be added each day. After the daily targets were successfully maintained through the painstaking, well-coordinated work of a large number of people, we accomplished the final goal of 20,000 words 5 days ahead of the deadline. It was Christmas Day and a jubilant moment for XOBDO. The people who worked to reach this goal were Biraj Kumar Kakati, Anjal Borah, Anjali Sonowal, Prasanta Borah, Partha P Sarmah, Prabin Kakat, Prasenjit Khanikar, Priyankoo, Pankaj Bora, Mousumi Hazarika, Rupankar Mahanta, and Abdul Wahab & Papori Gogoi.
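
The arithmetic behind such a target calculator can be sketched in a few lines. This is only a minimal Python illustration and not the actual XOBDO code: the function name daily_target and the 31 December 2008 deadline are assumptions, while the counts of 10,200 and 20,000 come from the figures above.

```python
import math
from datetime import date


def daily_target(current_count, goal, today, deadline):
    """Words that must be added per day, from `today` through `deadline`."""
    remaining_words = max(goal - current_count, 0)
    # Count both today and the deadline day as working days.
    remaining_days = (deadline - today).days + 1
    if remaining_days <= 0:
        raise ValueError("the deadline has already passed")
    # Round up so that meeting the target every day always reaches the goal.
    return math.ceil(remaining_words / remaining_days)


# Start of 2008: around 10,200 words, goal of 20,000 by an assumed 31 December.
print(daily_target(10200, 20000, date(2008, 1, 1), date(2008, 12, 31)))
```

Run again each day with the latest word count, the same function raises the daily target after slow days and lowers it after productive ones.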

Meanwhile, during the year, we added a few long-pending modules: handling of Fokora-Jujonas (Assamese proverbs/idioms), the ability for contributors to upload images, an integrated discussion forum, a voting system to measure the popularity of word usages, and extensive use of AJAX to make the pages quicker and more interactive.

Earlier, all search operations could be done only in English or Assamese. Other languages were categorized as "Other Languages"! We felt that no language should be called an "Other Language"; all the languages of the North-East should be given equal status, at least in XOBDO. The code was modified to search from any language to any other language with a seamless interface.
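
One way to picture any-to-any search is to link every word to a shared meaning record, so that no language acts as the pivot. The Python sketch below assumes that design; the real XOBDO schema is not described here, and the names ENTRIES, build_indexes and translate are hypothetical. The Karbi entry is a placeholder, not real vocabulary.

```python
from collections import defaultdict

# Hypothetical entries: (language, word, meaning_id).
# The Karbi entry is a placeholder string, not a real word.
ENTRIES = [
    ("English", "water", 1),
    ("Assamese", "পানী", 1),
    ("Karbi", "<karbi-word-for-water>", 1),
    ("English", "fire", 2),
    ("Assamese", "জুই", 2),
]


def build_indexes(entries):
    """Index entries by (language, word) and by (meaning_id, language)."""
    by_word = defaultdict(set)
    by_meaning = defaultdict(set)
    for language, word, meaning_id in entries:
        by_word[(language, word)].add(meaning_id)
        by_meaning[(meaning_id, language)].add(word)
    return by_word, by_meaning


def translate(word, source, target, by_word, by_meaning):
    """Look up `word` in the source language and return target-language words."""
    results = set()
    for meaning_id in by_word.get((source, word), ()):
        results |= by_meaning.get((meaning_id, target), set())
    return sorted(results)


by_word, by_meaning = build_indexes(ENTRIES)
print(translate("পানী", "Assamese", "English", by_word, by_meaning))
```

Because the lookup goes through the meaning rather than through English or Assamese, adding a new language only means adding rows, not new search code.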

All these were really big milestones for XOBDO.

The silent year

The year 2009 has been relatively quiet for XOBDO. Many key members moved from one part of the globe to another. Many got busy with their personal endeavors.

Nevertheless, we pushed hard to attract speakers of the North-Eastern languages, especially Bodo, Karbi, Mising, Dimasa and Meeteilon. We achieved some success, but not as much as expected. We did have some very active members contributing to these languages: Anjali Sonowal in the Tai and Mising languages; Lalremthang Hmar in Hmar; Banlam Warjr in Khasi; Mohen Naorem in Meeteilon; Kulendra Daulagupu, Anuj Phonglosa, Arnab Phonglosa & Uttam Bathati in Dimasa; Pranab Doley in Mising; Nava Boro and Nwgwt Brahma in Bodo; Morningkeey Phangcho and Dipak Tumung in Karbi. However, the number of members is still very low.

2009 has not been entirely idle. We have entered around 2500 new words (so far), spent time checking and refining the existing entries through active discussions, had the pronunciations of all the Assamese words written in Roman script and, most importantly, planned and prepared for a "big way" in 2010 and beyond!!