INForum.in - Home of the Indian Domain Name Industry
  #1 (permalink)  
Old 04-01-2009, 08:02 AM
Ross Ross is offline
Junior Member
 
Join Date: Feb 2009
Posts: 13
iTrader: (0)
Thanks: 0
Thanked 1 Time in 1 Post
Ross is on a distinguished road
Post Statistics for .IN zone file

I am developing an algorithm for keyword detection in domain names. I obtained a copy of the .IN zone file and applied the algorithm to it.

Here are some statistics (for .IN only):
Total number of domains: 315K
I have split 175K domains. Domains that are purely numeric are not included.
The dictionary used contains 100K words, English only.

Total number of keywords detected: 366K, of which 29K are unique.
Some domains, such as "name-keyword.in" where the name is an Indian proper name, are split and only the keyword is included in the final statistics.
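
A minimal sketch of how such a split could work, assuming a simple dynamic-programming search that covers each hyphen-separated part of a name with the fewest dictionary words. The post does not describe the actual algorithm, so `split_label`, `keyword_counts` and the tiny `DICTIONARY` are hypothetical stand-ins, not the code behind these statistics.

```python
from collections import Counter

# Tiny stand-in for the 100K-word English dictionary (hypothetical sample).
DICTIONARY = {"india", "travel", "guide", "hotel", "hotels",
              "web", "jobs", "on", "line", "online"}


def split_label(label, words):
    """Cover `label` with dictionary words, using as few words as possible.

    Returns the list of words, or None if the label cannot be fully covered.
    """
    n = len(label)
    best = [None] * (n + 1)  # best[i] = shortest word list covering label[:i]
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            piece = label[start:end]
            if best[start] is not None and piece in words:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]


def keyword_counts(domains, words):
    """Count dictionary keywords across a list of second-level .in names."""
    counts = Counter()
    for domain in domains:
        label = domain.lower()
        if label.endswith(".in"):
            label = label[:-3]
        # Purely numeric names are excluded, as in the post.
        if label.replace("-", "").isdigit():
            continue
        # Each hyphen-separated part is split on its own; parts that
        # cannot be covered (e.g. an unknown proper name) are dropped.
        for part in label.split("-"):
            pieces = split_label(part, words)
            if pieces:
                counts.update(pieces)
    return counts


sample = ["indiatravel.in", "hotelguide.in", "12345.in",
          "ramesh-hotels.in", "online.in"]
print(keyword_counts(sample, DICTIONARY).most_common())
```

Preferring the fewest words keeps "online" whole instead of splitting it into "on" and "line". That choice is an assumption, and a different tie-breaking rule would give different counts.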

The following is a list of frequencies for the first 300 keywords:


Code:
i             7200  em            404   up            259
india         6495  test          402   ads           258
on            3008  star          401   play          254
line          2504  or            401   micro         253
the           2447  your          401   finance       252
my            2348  mail          400   film          251
in            2149  love          399   wedding       251
tech          1786  digital       395   college       248
web           1495  ate           393   center        246
indian        1470  education     393   site          246
group         1447  us            391   way           246
world         1379  market        390   creative      246
net           1264  guide         386   people        244
it            1163  shop          382   inc           243
an            1102  technologies  382   security      242
travel        1077  loan          380   tour          241
info          1056  service       377   click         239
and           1038  blue          373   pay           239
free          1015  times         366   team          238
go            969   card          365   casino        237
jobs          958   chennai       361   today         236
solutions     956   network       358   lab           236
of            944   poker         357   good          236
en            925   hosting       353   directory     235
home          906   sun           352   dream         234
media         903   game          350   vision        233
health        892   food          345   future        233
global        862   ur            344   company       233
to            858   first         344   sky           233
city          853   biz           343   reliance      232
am            851   plus          339   royal         232
me            823   phone         337   san           231
tv            798   cheap         336   call          230
life          792   bio           335   baby          229
sex           774   data          331   products      228
ad            758   books         330   photo         227
design        755   red           330   planet        227
business      734   get           328   cars          227
news          730   os            328   simply        226
hotel         720   realest       327   movies        225
hotels        711   pc            325   corp          225
care          693   zone          323   cash          223
car           692   holidays      320   print         222
at            687   win           320   mall          222
mobile        685   tar           319   deals         220
art           670   travels       319   law           220
as            663   tours         318   mind          219
club          656   ind           317   girls         216
hop           641   max           316   tourism       215
services      639   med           316   video         215
hi            632   eco           314   corporate     215
pro           630   gold          312   academy       213
one           627   soft          311   consultants   213
air           608   capital       310   foundation    213
green         604   sms           309   solar         211
new           602   consulting    305   source        210
best          595   just          304   bazaar        209
no            589   ms            303   fun           209
all           588   marketing     302   tex           209
insurance     577   cricket       302   fly           208
is            576   internet      301   pages         208
for           565   properties    301   now           207
property      563   sports        301   centre        207
power         536   point         300   cards         207
music         536   raj           300   loans         206
job           532   direct        300   kids          206
delhi         529   time          300   dr            205
live          529   porn          300   techno        205
search        524   asia          296   dating        205
ala           524   energy        295   talk          204
do            520   homes         292   he            203
you           514   im            292   open          203
school        514   we            291   log           203
bank          509   space         290   work          201
money         494   box           288   radio         201
man           493   land          287   euro          201
smart         486   bangalore     287   help          200
auto          485   career        287   sale          200
com           475   studio        282   store         198
international 473   real          282   pace          196
systems       473   tel           282   ticket        196
games         469   management    281   by            195
buy           466   host          277   shopping      195
big           453   forum         275   retail        194
credit        448   fashion       274   solution      192
de            445   stock         274   technology    191
mart          437   movie         273   golf          190
book          430   find          273   day           189
trade         426   park          271   family        189
pr            426   tore          270   pal           189
ker           426   tax           269   liberty       189
oft           425   be            268   holiday       189
top           424   computer      267   mob           189
domain        421   office        267   plan          188
hot           418   super         265   yoga          188
easy          412   medical       261   realty        188
guru          412   water         260   trip          187
software      411   con           259   eye           187
house         407   express       259   labs          187
link          405   goa           259   tickets       186
As expected, some of the words at the top of the list are "skip words" such as "on", "the" and "in".
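
A short sketch of how skip words could be filtered out before ranking, assuming the counts are held in a `collections.Counter`. Only "on", "the" and "in" are named in the post. The other entries in `SKIP_WORDS` and the helper `drop_skip_words` are illustrative assumptions. The sample counts are taken from the list above.

```python
from collections import Counter

# "on", "the" and "in" come from the post; the rest are assumed examples.
SKIP_WORDS = {"on", "the", "in", "of", "and", "to"}


def drop_skip_words(counts, skip_words=SKIP_WORDS):
    """Return a new Counter without the skip words."""
    return Counter({word: n for word, n in counts.items()
                    if word not in skip_words})


counts = Counter({"india": 6495, "on": 3008, "line": 2504,
                  "the": 2447, "in": 2149, "tech": 1786})
filtered = drop_skip_words(counts)
print(filtered.most_common(3))
```

A blunt filter like this also hides real uses of a skip word, so whether to drop "in" on a .IN list is a judgment call.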

There are more zone file charts and stats on my blog.
  #2 (permalink)  
Old 04-01-2009, 01:35 PM
Jeff Jeff is online now
Administrator
 
Join Date: Mar 2008
Posts: 2,899
iTrader: (12)
Thanks: 1,631
Thanked 392 Times in 240 Posts
Jeff has much to be proud of
Default Re: Statistics for .IN zone file

This is really interesting stuff - thanks so much for putting it together!

Interesting that India and Indian are so common when that's already implied by the .in.

Bangalore does well for a city that's no longer officially called by that name
  #3 (permalink)  
Old 04-01-2009, 01:46 PM
Ross Ross is offline
Junior Member
 
Join Date: Feb 2009
Posts: 13
iTrader: (0)
Thanks: 0
Thanked 1 Time in 1 Post
Ross is on a distinguished road
Default Re: Statistics for .IN zone file

Yes, I have the same observations.
Does anybody have a list of Indian proper names and geo names (in ASCII)? I can try to add them to the dictionary. What is the new name of Bangalore: Bengaluru? I will check whether it is in the dictionary. Note that some proper names do not appear at all because my dictionary is not specifically prepared for Indian names; it is just a standard aspell dictionary with some tuning :-D
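
A small sketch of how such a list could be merged into the dictionary, assuming the extra names arrive as plain text with one entry per line. `extend_dictionary` and the sample entries are hypothetical, and nothing here reflects how the aspell word list was actually tuned.

```python
def extend_dictionary(base_words, extra_lines):
    """Return a new set: the base words plus cleaned entries from extra_lines."""
    merged = set(base_words)
    for line in extra_lines:
        word = line.strip().lower()
        # Keep single ASCII-letter tokens only; entries with spaces,
        # digits or non-ASCII characters are skipped in this sketch.
        if word and word.isascii() and word.isalpha():
            merged.add(word)
    return merged


base = {"india", "travel"}
extra = ["Bengaluru\n", "Chennai\n", "  Raj \n", "new delhi\n", "\n"]
words = extend_dictionary(base, extra)
print(sorted(words))
```

Multi-word entries such as "new delhi" are skipped here. Joining them into one token would be another option.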
  #4 (permalink)  
Old 04-01-2009, 02:57 PM
Ceres Ceres is offline
Senior Member
 
Join Date: Mar 2008
Location: Canada
Posts: 2,206
iTrader: (0)
Thanks: 544
Thanked 572 Times in 345 Posts
Ceres is a splendid one to behold
Default Re: Statistics for .IN zone file

Ross, thanks for sharing the information. I'm surprised quite a few people include the word "company" in their domain name. I wonder if "company" is part of their business name?
  #5 (permalink)  
Old 04-01-2009, 06:02 PM
skyshipper skyshipper is offline
Senior Member
 
Join Date: Jan 2009
Location: under the sun
Posts: 418
iTrader: (3)
Thanks: 84
Thanked 72 Times in 49 Posts
skyshipper is on a distinguished road
Wink Re: Statistics for .IN zone file

Quote:
Originally Posted by Ross View Post
I am developing an algorithm for keyword detection in domain names. Thanks to Jeff, I obtained a copy of the .IN zone file and applied the algorithm to it.

Here are some statistics (for .IN only):
Total number of domains: 315K
I have split 175K domains. Domains that are purely numeric are not included.
The dictionary used contains 100K words, English only.

Total number of keywords detected: 366K, of which 29K are unique.
Some domains, such as "name-keyword.in" where the name is an Indian proper name, are split and only the keyword is included in the final statistics.

The following is a list of frequencies for the first 300 keywords:

As expected, some of the words at the top of the list are "skip words" such as "on", "the" and "in".

There are more zone file charts and stats on my blog.

Great work! Great work on providing the .in zone file too. Congrats to Jeff.

Quote:
Bangalore does well for a city that's no longer officially called by that name

It would be interesting to run the analysis from when the name change occurred [around Nov 06].
If Bangalore still scores over Bengaluru, it shows the mindset! .in is going to have a hard time overcoming the .co.in mindset [which most domainers are trying to overturn].

Quote:
I'm surprised quite a few people include the word "company" in their domain name

Actually, it is there in some company registrations, used in the same way as "limited" or "corporation". "Limited" and "corporation" are also quite popular. OIL: Oil India Limited. LIC: Life Insurance Corporation.
ATC: Assam Tea Company.
  #6 (permalink)  
Old 04-01-2009, 06:30 PM
Ceres Ceres is offline
Senior Member
 
Join Date: Mar 2008
Location: Canada
Posts: 2,206
iTrader: (0)
Thanks: 544
Thanked 572 Times in 345 Posts
Ceres is a splendid one to behold
Default Re: Statistics for .IN zone file

Quote:
Originally Posted by kaustavk666 View Post
Actually, it is there in some company registrations, used in the same way as "limited" or "corporation". "Limited" and "corporation" are also quite popular. OIL: Oil India Limited. LIC: Life Insurance Corporation.
ATC: Assam Tea Company.

kaustavk666, do you think it's better to include 'company,' 'corporation' and 'limited' in the domain name when they are part of the business name? Personally, I'd prefer ATC.in, AssamTea.in or AssamTea.co.in rather than AssamTeaCompany.in as they are shorter and easier to remember. Which one do you think works best for Assam Tea Company?
  #7 (permalink)  
Old 04-01-2009, 06:39 PM
skyshipper skyshipper is offline
Senior Member
 
Join Date: Jan 2009
Location: under the sun
Posts: 418
iTrader: (3)
Thanks: 84
Thanked 72 Times in 49 Posts
skyshipper is on a distinguished road
Wink Re: Statistics for .IN zone file

atc.in and atc.co.in would seem repetitive. assamtea.in is too generic, since there are various tea companies in Assam.
assamteacompany.in is OK, but then the .com will be available for such a long name.

So.
[By the way, atc.in redirects to atc.co.in! Another one for .co.in.]
  #8 (permalink)  
Old 04-01-2009, 06:57 PM
Ceres Ceres is offline
Senior Member
 
Join Date: Mar 2008
Location: Canada
Posts: 2,206
iTrader: (0)
Thanks: 544
Thanked 572 Times in 345 Posts
Ceres is a splendid one to behold
Default Re: Statistics for .IN zone file

Yes, I also prefer ATC.in of them all. Too bad it's not owned by Assam Tea Company - they opted for AssamTeaCompany.com instead.
  #9 (permalink)  
Old 04-06-2009, 02:13 PM
RaghavK RaghavK is offline
Senior Member
 
Join Date: Sep 2008
Posts: 413
iTrader: (0)
Thanks: 0
Thanked 3 Times in 3 Posts
RaghavK is on a distinguished road
Default Re: Statistics for .IN zone file

Yes, ATC was so much better. I think no one wants to spend a fortune on acquiring the right name.